Sample records for background protein structures

  1. Highly Resolved Sub-Terahertz Vibrational Spectroscopy of Biological Macromolecules and Bacteria Cells

    DTIC Science & Technology

    2016-07-01

    between average background spectrum and chicken egg-white lysozyme protein spectrum...spectroscopic signatures were conducted using human insulin protein and chicken egg-white lysozyme protein. Proteins with different structures...the comparison between the average background THz spectrum (black line in Figure 13) and the chicken egg-white lysozyme THz spectrum (blue line

  2. Course 12: Proteins: Structural, Thermodynamic and Kinetic Aspects

    NASA Astrophysics Data System (ADS)

    Finkelstein, A. V.

    1 Introduction 2 Overview of protein architectures and discussion of physical background of their natural selection 2.1 Protein structures 2.2 Physical selection of protein structures 3 Thermodynamic aspects of protein folding 3.1 Reversible denaturation of protein structures 3.2 What do denatured proteins look like? 3.3 Why denaturation of a globular protein is the first-order phase transition 3.4 "Gap" in energy spectrum: The main characteristic that distinguishes protein chains from random polymers 4 Kinetic aspects of protein folding 4.1 Protein folding in vivo 4.2 Protein folding in vitro (in the test-tube) 4.3 Theory of protein folding rates and solution of the Levinthal paradox

  3. Molecular and ultrastructural analysis of forisome subunits reveals the principles of forisome assembly

    PubMed Central

    Müller, Boje; Groscurth, Sira; Menzel, Matthias; Rüping, Boris A.; Twyman, Richard M.; Prüfer, Dirk; Noll, Gundula A.

    2014-01-01

    Background and Aims Forisomes are specialized structural phloem proteins that mediate sieve element occlusion after wounding exclusively in papilionoid legumes, but most studies of forisome structure and function have focused on the Old World clade rather than the early lineages. A comprehensive phylogenetic, molecular, structural and functional analysis of forisomes from species covering a broad spectrum of the papilionoid legumes was therefore carried out, including the first analysis of Dipteryx panamensis forisomes, representing the earliest branch of the Papilionoideae lineage. The aim was to study the molecular, structural and functional conservation among forisomes from different tribes and to establish the roles of individual forisome subunits. Methods Sequence analysis and bioinformatics were combined with structural and functional analysis of native forisomes and artificial forisome-like protein bodies, the latter produced by expressing forisome genes from different legumes in a heterologous background. The structure of these bodies was analysed using a combination of confocal laser scanning microscopy (CLSM), scanning electron microscopy (SEM) and transmission electron microscopy (TEM), and the function of individual subunits was examined by combinatorial expression, micromanipulation and light microscopy. Key Results Dipteryx panamensis native forisomes and homomeric protein bodies assembled from the single sieve element occlusion by forisome (SEO-F) subunit identified in this species were structurally and functionally similar to forisomes from the Old World clade. In contrast, homomeric protein bodies assembled from individual SEO-F subunits from Old World species yielded artificial forisomes differing in proportion to their native counterparts, suggesting that multiple SEO-F proteins are required for forisome assembly in these plants. Structural differences between Medicago truncatula native forisomes, homomeric protein bodies and heteromeric bodies containing all possible subunit combinations suggested that combinations of SEO-F proteins may fine-tune the geometric proportions and reactivity of forisomes. Conclusions It is concluded that forisome structure and function have been strongly conserved during evolution and that species-dependent subsets of SEO-F proteins may have evolved to fine-tune the structure of native forisomes. PMID:24694827

  4. Representing and comparing protein structures as paths in three-dimensional space

    PubMed Central

    Zhi, Degui; Krishna, S Sri; Cao, Haibo; Pevzner, Pavel; Godzik, Adam

    2006-01-01

    Background Most existing formulations of protein structure comparison are based on detailed atomic level descriptions of protein structures and bypass potential insights that arise from a higher-level abstraction. Results We propose a structure comparison approach based on a simplified representation of a protein that describes its three-dimensional path by local curvature along the generalized backbone of the polypeptide. We have implemented a dynamic programming procedure that aligns curvatures of proteins by optimizing a defined sum turning angle deviation measure. Conclusion Although our procedure does not directly optimize global structural similarity as measured by RMSD, our benchmarking results indicate that it can recover surprisingly well the structural similarity defined by structure classification databases and traditional structure alignment programs. In addition, our program can recognize similarities between structures with extensive conformation changes that are beyond the ability of traditional structure alignment programs. We demonstrate the application of our procedure to several contexts of structure comparison. An implementation of our procedure, CURVE, is available as a public webserver. PMID:17052359
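
    The dynamic programming idea in this abstract can be illustrated with a minimal sketch. It is not the CURVE implementation: the function name `align_curvatures`, the linear gap penalty and its default value are assumptions made for illustration, and the cost is simply the summed absolute difference between paired turning angles.

```python
def align_curvatures(a, b, gap=30.0):
    """Globally align two turning-angle profiles (in degrees).

    Hypothetical sketch: minimizes the sum of absolute angle deviations
    over aligned positions plus a linear gap penalty (an assumption).
    Returns (total_cost, list of aligned index pairs).
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cost of aligning a[:i] with b[:j]
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * gap
    for j in range(1, m + 1):
        cost[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + abs(a[i - 1] - b[j - 1]),  # pair i with j
                cost[i - 1][j] + gap,                           # skip a[i-1]
                cost[i][j - 1] + gap,                           # skip b[j-1]
            )
    # Trace back from the corner to recover which positions were paired.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if cost[i][j] == cost[i - 1][j - 1] + abs(a[i - 1] - b[j - 1]):
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif cost[i][j] == cost[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return cost[n][m], pairs
```

    Because the comparison uses only a one-dimensional profile along the chain, two conformations of the same protein that differ by a hinge motion can still align well, which is consistent with the abstract's remark about extensive conformation changes.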

  5. Direct demodulation method for heavy atom position determination in protein crystallography

    NASA Astrophysics Data System (ADS)

    Zhou, Liang; Liu, Zhong-Chuan; Liu, Peng; Dong, Yu-Hui

    2013-01-01

    The first step of phasing in any de novo protein structure determination using isomorphous replacement (IR) or anomalous scattering (AD) experiments is to find heavy atom positions. Traditionally, heavy atom positions can be solved by inspecting the difference Patterson maps. Due to the weak signals in isomorphous or anomalous differences and the noisy background in the Patterson map, the search for heavy atoms may become difficult. Here, the direct demodulation (DD) method is applied to the difference Patterson maps to reduce the noisy backgrounds and sharpen the signal peaks. The real space Patterson search by using these optimized maps can locate the heavy atom positions more accurately. It is anticipated that the direct demodulation method can assist in heavy atom position determination and facilitate the de novo structure determination of proteins.

  6. Structural alphabets derived from attractors in conformational space

    PubMed Central

    2010-01-01

    Background The hierarchical and partially redundant nature of protein structures justifies the definition of frequently occurring conformations of short fragments as 'states'. Collections of selected representatives for these states define Structural Alphabets, describing the most typical local conformations within protein structures. These alphabets form a bridge between the string-oriented methods of sequence analysis and the coordinate-oriented methods of protein structure analysis. Results A Structural Alphabet has been derived by clustering all four-residue fragments of a high-resolution subset of the protein data bank and extracting the high-density states as representative conformational states. Each fragment is uniquely defined by a set of three independent angles corresponding to its degrees of freedom, capturing in simple and intuitive terms the properties of the conformational space. The fragments of the Structural Alphabet are equivalent to the conformational attractors and therefore yield a most informative encoding of proteins. Proteins can be reconstructed within the experimental uncertainty in structure determination and ensembles of structures can be encoded with accuracy and robustness. Conclusions The density-based Structural Alphabet provides a novel tool to describe local conformations and it is specifically suitable for application in studies of protein dynamics. PMID:20170534
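
    As a rough illustration of describing a four-residue fragment by three angles, the sketch below computes two pseudo bond angles and one pseudo torsion angle from four consecutive C-alpha positions. The choice of these particular angles is an assumption, since the abstract does not name them, and the helper names (`bond_angle`, `torsion`, `fragment_angles`) are hypothetical.

```python
import math

def _sub(p, q):
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])

def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def bond_angle(a, b, c):
    """Angle at point b (degrees) formed by points a, b, c."""
    u, v = _sub(a, b), _sub(c, b)
    cosang = _dot(u, v) / math.sqrt(_dot(u, u) * _dot(v, v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cosang))))

def torsion(a, b, c, d):
    """Signed torsion angle (degrees) about the b-c axis."""
    b1, b2, b3 = _sub(b, a), _sub(c, b), _sub(d, c)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    x = _dot(n1, n2)
    y = _dot(b2, _cross(n1, n2)) / math.sqrt(_dot(b2, b2))
    return math.degrees(math.atan2(y, x))

def fragment_angles(ca):
    """Return (angle1, angle2, torsion) for four C-alpha coordinates."""
    if len(ca) != 4:
        raise ValueError("expected exactly four points")
    p0, p1, p2, p3 = ca
    return (bond_angle(p0, p1, p2),
            bond_angle(p1, p2, p3),
            torsion(p0, p1, p2, p3))
```

    Clustering such angle triples over many fragments and keeping the densest regions would then yield candidate alphabet states; that step is not shown here.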

  7. An approach to large scale identification of non-obvious structural similarities between proteins

    PubMed Central

    Cherkasov, Artem; Jones, Steven JM

    2004-01-01

    Background A new sequence independent bioinformatics approach allowing genome-wide search for proteins with similar three dimensional structures has been developed. By utilizing the numerical output of the sequence threading it establishes putative non-obvious structural similarities between proteins. When applied to the testing set of proteins with known three dimensional structures the developed approach was able to recognize structurally similar proteins with high accuracy. Results The method has been developed to identify pathogenic proteins with low sequence identity and high structural similarity to host analogues. Such protein structure relationships would be hypothesized to arise through convergent evolution or through ancient horizontal gene transfer events, now undetectable using current sequence alignment techniques. The pathogen proteins, which could mimic or interfere with host activities, would represent candidate virulence factors. The developed approach utilizes the numerical outputs from the sequence-structure threading. It identifies the potential structural similarity between a pair of proteins by correlating the threading scores of the corresponding two primary sequences against the library of the standard folds. This approach allowed up to 64% sensitivity and 99.9% specificity in distinguishing protein pairs with high structural similarity. Conclusion Preliminary results obtained by comparison of the genomes of Homo sapiens and several strains of Chlamydia trachomatis have demonstrated the potential usefulness of the method in the identification of bacterial proteins with known or potential roles in virulence. PMID:15147578
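
    The core step, correlating the threading scores of two sequences against a common fold library, can be sketched as follows. The abstract does not say which correlation measure was used, so Pearson correlation is an assumption here, and the score vectors in the usage note are invented for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length score vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)

def structural_similarity(scores_a, scores_b):
    """Hypothetical wrapper: each argument holds one protein's threading
    scores, listed in the same fold-library order for both proteins."""
    return pearson(scores_a, scores_b)
```

    For example, `structural_similarity([2.1, 0.3, 5.0], [1.9, 0.5, 4.7])` gives a value close to 1, which under this sketch would flag the pair as a candidate for structural similarity even if their sequences do not align.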

  8. Structural deformation upon protein-protein interaction: A structural alphabet approach

    PubMed Central

    Martin, Juliette; Regad, Leslie; Lecornet, Hélène; Camproux, Anne-Claude

    2008-01-01

    Background In a number of protein-protein complexes, the 3D structures of bound and unbound partners significantly differ, supporting the induced fit hypothesis for protein-protein binding. Results In this study, we explore the induced fit modifications on a set of 124 proteins available in both bound and unbound forms, in terms of local structure. The local structure is described using a structural alphabet of 27 structural letters that allows a detailed description of the backbone. Using a control set to distinguish induced fit from experimental error and natural protein flexibility, we show that the fraction of structural letters modified upon binding is significantly greater than in the control set (36% versus 28%). This proportion is even greater in the interface regions (41%). Interface regions preferentially involve coils. Our analysis further reveals that some structural letters in coil are not favored in the interface. We show that certain structural letters in coil are particularly subject to modifications at the interface, and that the severity of structural change also varies. This information is used to derive a structural letter substitution matrix that summarizes the local structural changes observed in our data set. We also illustrate the usefulness of our approach to identify common binding motifs in unrelated proteins. Conclusion Our study provides qualitative information about induced fit. These results could be of help for flexible docking. PMID:18307769
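
    The quantity reported above, the fraction of structural letters modified upon binding, can be illustrated with a short sketch. It assumes the bound and unbound forms have already been encoded as equal-length strings over the structural alphabet; the function names and the letters in the usage note are hypothetical.

```python
from collections import Counter

def fraction_modified(unbound, bound):
    """Fraction of positions whose structural letter differs."""
    if len(unbound) != len(bound) or not unbound:
        raise ValueError("need two non-empty strings of equal length")
    changed = sum(1 for u, b in zip(unbound, bound) if u != b)
    return changed / len(unbound)

def substitution_counts(unbound, bound):
    """Count (unbound_letter, bound_letter) pairs at changed positions.

    Counts like these, pooled over many proteins, are the raw material
    for a letter substitution matrix (normalization is not shown).
    """
    if len(unbound) != len(bound):
        raise ValueError("strings must have equal length")
    return Counter((u, b) for u, b in zip(unbound, bound) if u != b)
```

    With `fraction_modified("AABBC", "AABCC")` one of five letters changes, giving 0.2; comparing such fractions against a control set of repeated structures of the same unbound protein is what separates induced fit from experimental noise in the study's design.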

  9. Influence of production process design on inclusion bodies protein: the case of an Antarctic flavohemoglobin

    PubMed Central

    2010-01-01

    Background Protein over-production in Escherichia coli often results in formation of inclusion bodies (IBs). Some recent reports have shown that the aggregation into IBs does not necessarily mean that the target protein is inactivated and that IBs may contain a high proportion of correctly folded protein. This proportion is variable depending on the protein itself, the genetic background of the producing cells and the expression temperature. In this paper we have evaluated the influence of other production process parameters on the quality of an inclusion bodies protein. Results The present paper describes the recombinant production in Escherichia coli of the flavohemoglobin from the Antarctic bacterium Pseudoalteromonas haloplanktis TAC125. Flavohemoglobins are multidomain proteins requiring FAD and heme cofactors. The production was carried out in several different experimental setups differing in bioreactor geometry, oxygen supply and the presence of a nitrosating compound. In all production processes, the recombinant protein accumulates in IBs, from which it was solubilized in non-denaturing conditions. Comparing structural properties of the solubilized flavohemoglobins, i.e. deriving from the different process designs, our data demonstrated that the protein preparations differ significantly in the presence of cofactors (heme and FAD) and as far as their secondary and tertiary structure content is concerned. Conclusions Data reported in this paper demonstrate that other production process parameters, besides growth temperature, can influence the structure of a recombinant product that accumulates in IBs. To the best of our knowledge, this is the first reported example in which the structural properties of a protein solubilized from inclusion bodies have been correlated to the production process design. PMID:20334669

  10. Classification of protein quaternary structure by functional domain composition

    PubMed Central

    Yu, Xiaojing; Wang, Chuan; Li, Yixue

    2006-01-01

    Background The number and the arrangement of subunits that form a protein are referred to as quaternary structure. Quaternary structure is an important protein attribute that is closely related to its function. Proteins with quaternary structure are called oligomeric proteins. Oligomeric proteins are involved in various biological processes, such as metabolism, signal transduction, and chromosome replication. Thus, it is highly desirable to develop computational methods to automatically classify the quaternary structure of proteins from their sequences. Results To explore this problem, we adopted an approach based on the functional domain composition of proteins. Every protein was represented by a vector calculated from the domains in the PFAM database. The nearest neighbor algorithm (NNA) was used for classifying the quaternary structure of proteins from this information. The jackknife cross-validation test was performed on the non-redundant protein dataset in which the sequence identity was less than 25%. The overall success rate obtained is 75.17%. Additionally, to demonstrate the effectiveness of this method, we predicted the proteins in an independent dataset and achieved an overall success rate of 84.11%. Conclusion Compared with the amino acid composition method and Blast, the results indicate that the domain composition approach may be a more effective and promising high-throughput method in dealing with this complicated problem in bioinformatics. PMID:16584572
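
    A minimal sketch of nearest-neighbor classification with a jackknife (leave-one-out) test is given below. It assumes each protein is a binary vector marking which domains are present and uses squared Euclidean distance; the distance measure, the function names and the toy data are assumptions, not details taken from the paper.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def nearest_label(query, training):
    """Label of the training vector closest to the query.

    training is a list of (vector, label) pairs.
    """
    return min(training, key=lambda item: sq_dist(query, item[0]))[1]

def jackknife_accuracy(dataset):
    """Leave each protein out in turn and predict it from the rest."""
    correct = 0
    for i, (vec, label) in enumerate(dataset):
        rest = dataset[:i] + dataset[i + 1:]
        if nearest_label(vec, rest) == label:
            correct += 1
    return correct / len(dataset)

# Toy data: three hypothetical domains, two quaternary classes.
toy = [
    ((1, 0, 0), "monomer"),
    ((1, 0, 1), "monomer"),
    ((0, 1, 0), "dimer"),
    ((0, 1, 1), "dimer"),
]
```

    On the toy set every held-out protein is closest to a member of its own class, so `jackknife_accuracy(toy)` is 1.0; on real data the same loop would produce success rates of the kind quoted in the abstract.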

  11. Projections for fast protein structure retrieval

    PubMed Central

    Bhattacharya, Sourangshu; Bhattacharyya, Chiranjib; Chandra, Nagasuma R

    2006-01-01

    Background In recent times, there has been an exponential rise in the number of protein structures in databases e.g. PDB. So, design of fast algorithms capable of querying such databases is becoming an increasingly important research issue. This paper reports an algorithm, motivated from spectral graph matching techniques, for retrieving protein structures similar to a query structure from a large protein structure database. Each protein structure is specified by the 3D coordinates of residues of the protein. The algorithm is based on a novel characterization of the residues, called projections, leading to a similarity measure between the residues of the two proteins. This measure is exploited to efficiently compute the optimal equivalences. Results Experimental results show that, the current algorithm outperforms the state of the art on benchmark datasets in terms of speed without losing accuracy. Search results on SCOP 95% nonredundant database, for fold similarity with 5 proteins from different SCOP classes show that the current method performs competitively with the standard algorithm CE. The algorithm is also capable of detecting non-topological similarities between two proteins which is not possible with most of the state of the art tools like Dali. PMID:17254310

  12. Exploring representations of protein structure for automated remote homology detection and mapping of protein structure space

    PubMed Central

    2014-01-01

    Background Due to rapid sequencing of genomes, there are now millions of deposited protein sequences with no known function. Fast sequence-based comparisons allow detecting close homologs for a protein of interest to transfer functional information from the homologs to the given protein. Sequence-based comparison cannot detect remote homologs, in which evolution has adjusted the sequence while largely preserving structure. Structure-based comparisons can detect remote homologs but most methods for doing so are too expensive to apply at a large scale over structural databases of proteins. Recently, fragment-based structural representations have been proposed that allow fast detection of remote homologs with reasonable accuracy. These representations have also been used to obtain linearly-reducible maps of protein structure space. It has been shown, as additionally supported by analysis in this paper, that such maps preserve functional co-localization of the protein structure space. Methods Inspired by a recent application of the Latent Dirichlet Allocation (LDA) model for conducting structural comparisons of proteins, we propose higher-order LDA-obtained topic-based representations of protein structures to provide an alternative route for remote homology detection and organization of the protein structure space in few dimensions. Various techniques based on natural language processing are proposed and employed to aid the analysis of topics in the protein structure domain. Results We show that a topic-based representation is just as effective as a fragment-based one at automated detection of remote homologs and organization of protein structure space. We conduct a detailed analysis of the information content in the topic-based representation, showing that topics have semantic meaning. The fragment-based and topic-based representations are also shown to allow prediction of superfamily membership. Conclusions This work opens exciting venues in designing novel representations to extract information about protein structures, as well as organizing and mining protein structure space with mature text mining tools. PMID:25080993

  13. HDAPD: a web tool for searching the disease-associated protein structures

    PubMed Central

    2010-01-01

    Background The structures of disease-associated proteins are important for structure-based drug design against a particular disease. Up until now, protein structures have usually been searched through a PDB id or some sequence information. However, in the HDAPD database presented here the structure of a disease-associated protein can be searched directly by keying in the associated disease name. Description A search in HDAPD can be easily initiated by keying in some key words of a disease, protein name, protein type, or PDB id. The protein sequence can be presented in FASTA format and directly copied for a BLAST search. HDAPD is also interfaced with Jmol so that users can observe and manipulate a protein structure with Jmol. Gene ontology data such as cellular components, molecular functions, and biological processes are provided once a hyperlink to Gene Ontology (GO) is clicked. Further, HDAPD provides a link to the KEGG map, from which the position of the protein in a metabolic pathway and its relationship with other proteins can be found. The latest literature for the protein searched from PubMed (titles, journals, authors, and abstracts) is also presented as a list of controllable length. Conclusions Since the HDAPD data content can be routinely updated through a purpose-built PHP-MySQL web page, the new database presented is useful for searching the structures of disease-associated proteins that may play important roles in disease development, in support of structure-based drug design against those diseases. PMID:20158919

  14. Structure prediction of polyglutamine disease proteins: comparison of methods

    PubMed Central

    2014-01-01

    Background The expansion of polyglutamine (poly-Q) repeats in several unrelated proteins is associated with at least ten neurodegenerative diseases. The length of the poly-Q regions plays an important role in the progression of the diseases. The number of glutamines (Q) is inversely related to the onset age of these polyglutamine diseases, and the expansion of poly-Q repeats has been associated with protein misfolding. However, very little is known about the structural changes induced by the expansion of the repeats. Computational methods can provide an alternative to determine the structure of these poly-Q proteins, but it is important to evaluate their performance before large scale prediction work is done. Results In this paper, two popular protein structure prediction programs, I-TASSER and Rosetta, have been used to predict the structure of the N-terminal fragment of a protein associated with Huntington's disease with 17 glutamines. Results show that both programs have the ability to find the native structures, but I-TASSER performs better for the overall task. Conclusions Both I-TASSER and Rosetta can be used for structure prediction of proteins with poly-Q repeats. Knowledge of poly-Q structure may significantly contribute to development of therapeutic strategies for poly-Q diseases. PMID:25080018

  15. Towards comprehensive structural motif mining for better fold annotation in the "twilight zone" of sequence dissimilarity

    PubMed Central

    Jia, Yi; Huan, Jun; Buhr, Vincent; Zhang, Jintao; Carayannopoulos, Leonidas N

    2009-01-01

    Background Automatic identification of structure fingerprints from a group of diverse protein structures is challenging, especially for proteins whose divergent amino acid sequences may fall into the "twilight-" or "midnight-" zones where pair-wise sequence identities to known sequences fall below 25% and sequence-based functional annotations often fail. Results Here we report a novel graph database mining method and demonstrate its application to protein structure pattern identification and structure classification. The biologic motivation of our study is to recognize common structure patterns in "immunoevasins", proteins mediating virus evasion of host immune defense. Our experimental study, using both viral and non-viral proteins, demonstrates the efficiency and efficacy of the proposed method. Conclusion We present a theoretic framework, offer a practical software implementation for incorporating prior domain knowledge, such as substitution matrices as studied here, and devise an efficient algorithm to identify approximate matched frequent subgraphs. By doing so, we significantly expanded the analytical power of sophisticated data mining algorithms in dealing with large volume of complicated and noisy protein structure data. And without loss of generality, choice of appropriate compatibility matrices allows our method to be easily employed in domains where subgraph labels have some uncertainty. PMID:19208148

  16. Raster-scanning serial protein crystallography using micro- and nano-focused synchrotron beams

    PubMed Central

    Coquelle, Nicolas; Brewster, Aaron S.; Kapp, Ulrike; Shilova, Anastasya; Weinhausen, Britta; Burghammer, Manfred; Colletier, Jacques-Philippe

    2015-01-01

    High-resolution structural information was obtained from lysozyme microcrystals (20 µm in the largest dimension) using raster-scanning serial protein crystallography on micro- and nano-focused beamlines at the ESRF. Data were collected at room temperature (RT) from crystals sandwiched between two silicon nitride wafers, thereby preventing their drying, while limiting background scattering and sample consumption. In order to identify crystal hits, new multi-processing and GUI-driven Python-based pre-analysis software was developed, named NanoPeakCell, that was able to read data from a variety of crystallographic image formats. Further data processing was carried out using CrystFEL, and the resultant structures were refined to 1.7 Å resolution. The data demonstrate the feasibility of RT raster-scanning serial micro- and nano-protein crystallography at synchrotrons and validate it as an alternative approach for the collection of high-resolution structural data from micro-sized crystals. Advantages of the proposed approach are its thriftiness, its handling-free nature, the reduced amount of sample required, the adjustable hit rate, the high indexing rate and the minimization of background scattering. PMID:25945583

  17. Raster-scanning serial protein crystallography using micro- and nano-focused synchrotron beams

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Coquelle, Nicolas; Brewster, Aaron S.; Kapp, Ulrike

    High-resolution structural information was obtained from lysozyme microcrystals (20 µm in the largest dimension) using raster-scanning serial protein crystallography on micro- and nano-focused beamlines at the ESRF. Data were collected at room temperature (RT) from crystals sandwiched between two silicon nitride wafers, thereby preventing their drying, while limiting background scattering and sample consumption. In order to identify crystal hits, new multi-processing and GUI-driven Python-based pre-analysis software was developed, named NanoPeakCell, that was able to read data from a variety of crystallographic image formats. Further data processing was carried out using CrystFEL, and the resultant structures were refined to 1.7 Å resolution. The data demonstrate the feasibility of RT raster-scanning serial micro- and nano-protein crystallography at synchrotrons and validate it as an alternative approach for the collection of high-resolution structural data from micro-sized crystals. Advantages of the proposed approach are its thriftiness, its handling-free nature, the reduced amount of sample required, the adjustable hit rate, the high indexing rate and the minimization of background scattering.

  18. Raster-scanning serial protein crystallography using micro- and nano-focused synchrotron beams.

    PubMed

    Coquelle, Nicolas; Brewster, Aaron S; Kapp, Ulrike; Shilova, Anastasya; Weinhausen, Britta; Burghammer, Manfred; Colletier, Jacques Philippe

    2015-05-01

    High-resolution structural information was obtained from lysozyme microcrystals (20 µm in the largest dimension) using raster-scanning serial protein crystallography on micro- and nano-focused beamlines at the ESRF. Data were collected at room temperature (RT) from crystals sandwiched between two silicon nitride wafers, thereby preventing their drying, while limiting background scattering and sample consumption. In order to identify crystal hits, new multi-processing and GUI-driven Python-based pre-analysis software was developed, named NanoPeakCell, that was able to read data from a variety of crystallographic image formats. Further data processing was carried out using CrystFEL, and the resultant structures were refined to 1.7 Å resolution. The data demonstrate the feasibility of RT raster-scanning serial micro- and nano-protein crystallography at synchrotrons and validate it as an alternative approach for the collection of high-resolution structural data from micro-sized crystals. Advantages of the proposed approach are its thriftiness, its handling-free nature, the reduced amount of sample required, the adjustable hit rate, the high indexing rate and the minimization of background scattering.

  19. Raster-scanning serial protein crystallography using micro- and nano-focused synchrotron beams

    DOE PAGES

    Coquelle, Nicolas; Brewster, Aaron S.; Kapp, Ulrike; ...

    2015-04-25

    High-resolution structural information was obtained from lysozyme microcrystals (20 µm in the largest dimension) using raster-scanning serial protein crystallography on micro- and nano-focused beamlines at the ESRF. Data were collected at room temperature (RT) from crystals sandwiched between two silicon nitride wafers, thereby preventing their drying, while limiting background scattering and sample consumption. In order to identify crystal hits, new multi-processing and GUI-driven Python-based pre-analysis software was developed, named NanoPeakCell, that was able to read data from a variety of crystallographic image formats. Further data processing was carried out using CrystFEL, and the resultant structures were refined to 1.7 Å resolution. The data demonstrate the feasibility of RT raster-scanning serial micro- and nano-protein crystallography at synchrotrons and validate it as an alternative approach for the collection of high-resolution structural data from micro-sized crystals. Advantages of the proposed approach are its thriftiness, its handling-free nature, the reduced amount of sample required, the adjustable hit rate, the high indexing rate and the minimization of background scattering.

  20. Restricted N-glycan conformational space in the PDB and its implication in glycan structure modeling.

    PubMed

    Jo, Sunhwan; Lee, Hui Sun; Skolnick, Jeffrey; Im, Wonpil

    2013-01-01

    Understanding glycan structure and dynamics is central to understanding protein-carbohydrate recognition and its role in protein-protein interactions. Given the difficulties in obtaining the glycan's crystal structure in glycoconjugates due to its flexibility and heterogeneity, computational modeling could play an important role in providing glycosylated protein structure models. To address if glycan structures available in the PDB can be used as templates or fragments for glycan modeling, we present a survey of the N-glycan structures of 35 different sequences in the PDB. Our statistical analysis shows that the N-glycan structures found on homologous glycoproteins are significantly conserved compared to the random background, suggesting that N-glycan chains can be confidently modeled with template glycan structures whose parent glycoproteins share sequence similarity. On the other hand, N-glycan structures found on non-homologous glycoproteins do not show significant global structural similarity. Nonetheless, the internal substructures of these N-glycans, particularly, the substructures that are closer to the protein, show significantly similar structures, suggesting that such substructures can be used as fragments in glycan modeling. Increased interactions with protein might be responsible for the restricted conformational space of N-glycan chains. Our results suggest that structure prediction/modeling of N-glycans of glycoconjugates using structure database could be effective and different modeling approaches would be needed depending on the availability of template structures.

  1. Restricted N-glycan Conformational Space in the PDB and Its Implication in Glycan Structure Modeling

    PubMed Central

    Jo, Sunhwan; Lee, Hui Sun; Skolnick, Jeffrey; Im, Wonpil

    2013-01-01

    Understanding glycan structure and dynamics is central to understanding protein-carbohydrate recognition and its role in protein-protein interactions. Given the difficulties in obtaining the glycan's crystal structure in glycoconjugates due to its flexibility and heterogeneity, computational modeling could play an important role in providing glycosylated protein structure models. To address whether glycan structures available in the PDB can be used as templates or fragments for glycan modeling, we present a survey of the N-glycan structures of 35 different sequences in the PDB. Our statistical analysis shows that the N-glycan structures found on homologous glycoproteins are significantly conserved compared to the random background, suggesting that N-glycan chains can be confidently modeled with template glycan structures whose parent glycoproteins share sequence similarity. On the other hand, N-glycan structures found on non-homologous glycoproteins do not show significant global structural similarity. Nonetheless, the internal substructures of these N-glycans, particularly the substructures that are closer to the protein, show significantly similar structures, suggesting that such substructures can be used as fragments in glycan modeling. Increased interactions with the protein might be responsible for the restricted conformational space of N-glycan chains. Our results suggest that structure prediction/modeling of N-glycans of glycoconjugates using a structure database could be effective, and that different modeling approaches would be needed depending on the availability of template structures. PMID:23516343

  2. DWARF – a data warehouse system for analyzing protein families

    PubMed Central

    Fischer, Markus; Thai, Quan K; Grieb, Melanie; Pleiss, Jürgen

    2006-01-01

    Background The emerging field of integrative bioinformatics provides the tools to organize and systematically analyze vast amounts of highly diverse biological data and thus makes it possible to gain a novel understanding of complex biological systems. The data warehouse DWARF applies integrative bioinformatics approaches to the analysis of large protein families. Description The data warehouse system DWARF integrates data on sequence, structure, and functional annotation for protein fold families. The underlying relational data model consists of three major sections representing entities related to the protein (biochemical function, source organism, classification to homologous families and superfamilies), the protein sequence (position-specific annotation, mutant information), and the protein structure (secondary structure information, superimposed tertiary structure). Tools for extracting, transforming and loading data from publicly available resources (ExPDB, GenBank, DSSP) are provided to populate the database. The data can be accessed by an interface for searching and browsing, and by analysis tools that operate on annotation, sequence, or structure. We applied DWARF to the family of α/β-hydrolases to host the Lipase Engineering database. Release 2.3 contains 6138 sequences and 167 experimentally determined protein structures, which are assigned to 37 superfamilies and 103 homologous families. Conclusion DWARF has been designed for constructing databases of large structurally related protein families and for evaluating their sequence-structure-function relationships by a systematic analysis of sequence, structure and functional annotation. It has been applied to predict biochemical properties from sequence, and serves as a valuable tool for protein engineering. PMID:17094801

  3. Designing and benchmarking the MULTICOM protein structure prediction system

    PubMed Central

    2013-01-01

    Background Predicting protein structure from sequence is one of the most significant and challenging problems in bioinformatics. Numerous bioinformatics techniques and tools have been developed to tackle almost every aspect of protein structure prediction ranging from structural feature prediction, template identification and query-template alignment to structure sampling, model quality assessment, and model refinement. How to synergistically select, integrate and improve the strengths of the complementary techniques at each prediction stage and build a high-performance system is becoming a critical issue for constructing a successful, competitive protein structure predictor. Results Over the past several years, we have constructed a standalone protein structure prediction system MULTICOM that combines multiple sources of information and complementary methods at all five stages of the protein structure prediction process including template identification, template combination, model generation, model assessment, and model refinement. The system was blindly tested during the ninth Critical Assessment of Techniques for Protein Structure Prediction (CASP9) in 2010 and yielded very good performance. In addition to studying the overall performance on the CASP9 benchmark, we thoroughly investigated the performance and contributions of each component at each stage of prediction. Conclusions Our comprehensive and comparative study not only provides useful and practical insights about how to select, improve, and integrate complementary methods to build a cutting-edge protein structure prediction system but also identifies a few new sources of information that may help improve the design of a protein structure prediction system. Several components used in the MULTICOM system are available at: http://sysbio.rnet.missouri.edu/multicom_toolbox/. PMID:23442819

  4. Deciphering the shape and deformation of secondary structures through local conformation analysis

    PubMed Central

    2011-01-01

    Background Protein deformation has been extensively analysed through global methods based on RMSD, torsion angles and Principal Components Analysis calculations. Here we use a local approach, able to distinguish among the different backbone conformations within loops, α-helices and β-strands, to address the question of secondary structures' shape variation within proteins and deformation at interface upon complexation. Results Using a structural alphabet, we translated the 3D structures of large sets of protein-protein complexes into sequences of structural letters. The shape of the secondary structures can be assessed by the structural letters that modeled them in the structural sequences. The distribution analysis of the structural letters in the three protein compartments (surface, core and interface) reveals that secondary structures tend to adopt preferential conformations that differ among the compartments. The local description of secondary structures highlights that curved conformations are preferred on the surface while straight ones are preferred in the core. Interfaces display a mixture of local conformations either preferred in core or surface. The analysis of the structural letters transition occurring between protein-bound and unbound conformations shows that the deformation of secondary structure is tightly linked to the compartment preference of the local conformations. Conclusion The conformation of secondary structures can be further analysed and detailed thanks to a structural alphabet which allows a better description of protein surface, core and interface in terms of secondary structures' shape and deformation. Induced-fit modification tendencies described here should be valuable information to identify and characterize regions under strong structural constraints for functional reasons. PMID:21284872

  5. Raster-scanning serial protein crystallography using micro- and nano-focused synchrotron beams

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Coquelle, Nicolas; CNRS, IBS, 38044 Grenoble; CEA, IBS, 38044 Grenoble

    A raster-scanning serial protein crystallography approach is presented that consumes as little as ∼200–700 nl of sedimented crystals. New serial data pre-analysis software, NanoPeakCell, is introduced. High-resolution structural information was obtained from lysozyme microcrystals (20 µm in the largest dimension) using raster-scanning serial protein crystallography on micro- and nano-focused beamlines at the ESRF. Data were collected at room temperature (RT) from crystals sandwiched between two silicon nitride wafers, thereby preventing their drying, while limiting background scattering and sample consumption. In order to identify crystal hits, new multi-processing and GUI-driven Python-based pre-analysis software was developed, named NanoPeakCell, that was able to read data from a variety of crystallographic image formats. Further data processing was carried out using CrystFEL, and the resultant structures were refined to 1.7 Å resolution. The data demonstrate the feasibility of RT raster-scanning serial micro- and nano-protein crystallography at synchrotrons and validate it as an alternative approach for the collection of high-resolution structural data from micro-sized crystals. Advantages of the proposed approach are its thriftiness, its handling-free nature, the reduced amount of sample required, the adjustable hit rate, the high indexing rate and the minimization of background scattering.

  6. Reciprocating free-flow isoelectric focusing device for preparative separation of proteins.

    PubMed

    Kong, Fan-Zhi; Yang, Ying; Wang, Yi; Li, Guo-Qing; Li, Shan; Xiao, Hua; Fan, Liu-Yin; Liu, Shao-Rong; Cao, Cheng-Xi

    2015-11-27

    The traditional recycling free-flow isoelectric focusing (RFFIEF) suffered from complex structure, tedious operations and poor extensibility as well as high cost. To address these issues, a novel reciprocating free-flow isoelectric focusing device (ReFFIEF) was developed for pre-fractionation of proteins or peptides. In the new device, a reciprocating background flow was for the first time introduced into a free-flow electrophoresis (FFE) system. The gas cushion injector (GCI) used in the previous continuous free-flow electrophoresis (CFFE) was redesigned for the reciprocating background flow. With the GCI, the reciprocating background flow could be achieved between the GCI, separation chamber and transient self-balance collector (tSBC). In a run, process fluid flowed to and fro, forming a stable reciprocating fluid flow in the separation chamber. A pH gradient was created within the separation chamber, and at the same time proteins were focused repeatedly when passing through the chamber under a perpendicular electric field. The ReFFIEF procedure was optimized for fractionations of three model proteins, and the optimized method was further used for pre-fractionation of model human serum samples. As compared with the traditional RFFIEF devices developed about 25 years ago, the new ReFFIEF system showed several merits, such as simple design and structure, user-friendly operation and easy extension as well as low cost. Copyright © 2015 Elsevier B.V. All rights reserved.

  7. Dissecting protein loops with a statistical scalpel suggests a functional implication of some structural motifs

    PubMed Central

    2011-01-01

    Background One of the strategies for protein function annotation is to search particular structural motifs that are known to be shared by proteins with a given function. Results Here, we present a systematic extraction of structural motifs of seven residues from protein loops and we explore their correspondence with functional sites. Our approach is based on the structural alphabet HMM-SA (Hidden Markov Model - Structural Alphabet), which allows simplification of protein structures into uni-dimensional sequences, and advanced pattern statistics adapted to short sequences. Structural motifs of interest are selected by looking for structural motifs significantly over-represented in SCOP superfamilies in protein loops. We discovered two types of structural motifs significantly over-represented in SCOP superfamilies: (i) ubiquitous motifs, shared by several superfamilies and (ii) superfamily-specific motifs, over-represented in few superfamilies. A comparison of ubiquitous words with known small structural motifs shows that they contain well-described motifs as turn, niche or nest motifs. A comparison between superfamily-specific motifs and biological annotations of Swiss-Prot reveals that some of them actually correspond to functional sites involved in the binding sites of small ligands, such as ATP/GTP, NAD(P) and SAH/SAM. Conclusions Our findings show that statistical over-representation in SCOP superfamilies is linked to functional features. The detection of over-represented motifs within structures simplified by HMM-SA is therefore a promising approach for prediction of functional sites and annotation of uncharacterized proteins. PMID:21689388
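
    The idea of looking for seven-letter words that are over-represented in one group of structural-alphabet sequences can be illustrated with a toy word counter. This is only a rough sketch under stated assumptions: the function names are hypothetical, the input strings stand in for HMM-SA encoded loops, and the pseudocounted frequency ratio is a crude stand-in for the pattern statistics the abstract refers to, which are not described here.

```python
from collections import Counter


def count_words(sequences, k=7):
    """Count every overlapping word of length k in a list of letter strings."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def enrichment(word, family_seqs, background_seqs, k=7):
    """Frequency of `word` in the family divided by its frequency in the background.

    A pseudocount of 1 keeps the ratio finite for unseen words. This is an
    illustrative ratio only, not a significance test.
    """
    fam = count_words(family_seqs, k)
    bg = count_words(background_seqs, k)
    fam_freq = (fam[word] + 1) / (sum(fam.values()) + 1)
    bg_freq = (bg[word] + 1) / (sum(bg.values()) + 1)
    return fam_freq / bg_freq
```

    A ratio well above 1 would flag a word as a candidate for closer inspection; deciding whether it is statistically significant needs a proper model of word occurrence in short sequences.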

  8. Accelerating large-scale protein structure alignments with graphics processing units

    PubMed Central

    2012-01-01

    Background Large-scale protein structure alignment, an indispensable tool to structural bioinformatics, poses a tremendous challenge on computational resources. To ensure structure alignment accuracy and efficiency, efforts have been made to parallelize traditional alignment algorithms in grid environments. However, these solutions are costly and of limited accessibility. Others trade alignment quality for speedup by using high-level characteristics of structure fragments for structure comparisons. Findings We present ppsAlign, a parallel protein structure Alignment framework designed and optimized to exploit the parallelism of Graphics Processing Units (GPUs). As a general-purpose GPU platform, ppsAlign could take many concurrent methods, such as TM-align and Fr-TM-align, into the parallelized algorithm design. We evaluated ppsAlign on an NVIDIA Tesla C2050 GPU card, and compared it with existing software solutions running on an AMD dual-core CPU. We observed a 36-fold speedup over TM-align, a 65-fold speedup over Fr-TM-align, and a 40-fold speedup over MAMMOTH. Conclusions ppsAlign is a high-performance protein structure alignment tool designed to tackle the computational complexity issues from protein structural data. The solution presented in this paper allows large-scale structure comparisons to be performed using massive parallel computing power of GPU. PMID:22357132

  9. Extraction, integration and analysis of alternative splicing and protein structure distributed information

    PubMed Central

    D'Antonio, Matteo; Masseroli, Marco

    2009-01-01

    Background Alternative splicing has been demonstrated to affect most human genes; different isoforms from the same gene encode proteins that differ in a limited number of residues, thus yielding similar structures. This suggests possible correlations between alternative splicing and protein structure. In order to support the investigation of such relationships, we have developed the Alternative Splicing and Protein Structure Scrutinizer (PASS), a Web application to automatically extract, integrate and analyze human alternative splicing and protein structure data sparsely available in the Alternative Splicing Database, Ensembl databank and Protein Data Bank. Primary data from these databases have been integrated and analyzed using the Protein Identifier Cross-Reference, BLAST, CLUSTALW and FeatureMap3D software tools. Results A database has been developed to store the considered primary data and the results from their analysis; a system of Perl scripts has been implemented to automatically create and update the database and analyze the integrated data; a Web interface has been implemented to make the analyses easily accessible; a database has been created to manage user accesses to the PASS Web application and store users' data and searches. Conclusion PASS automatically integrates data from the Alternative Splicing Database with protein structure data from the Protein Data Bank. Additionally, it comprehensively analyzes the integrated data with publicly available well-known bioinformatics tools in order to generate structural information of isoform pairs. Further analysis of such valuable information might reveal interesting relationships between alternative splicing and protein structure differences, which may be significantly associated with different functions. PMID:19828075

  10. Hydrogen atoms in protein structures: high-resolution X-ray diffraction structure of the DFPase

    PubMed Central

    2013-01-01

    Background Hydrogen atoms represent about half of the total number of atoms in proteins and are often involved in substrate recognition and catalysis. Unfortunately, X-ray protein crystallography at usual resolution fails to access their positions directly, mainly because light atoms contribute only weakly to diffraction. However, sub-Ångstrom diffraction data, careful modeling and a proper refinement strategy can allow the positioning of a significant fraction of the hydrogen atoms. Results A comprehensive study on the X-ray structure of the diisopropyl-fluorophosphatase (DFPase) was performed, and the hydrogen atoms were modeled, including those of solvent molecules. This model was compared to the available neutron structure of DFPase, and differences in the protein and the active site solvation were noticed. Conclusions A further examination of the DFPase X-ray structure provides substantial evidence for the presence of an activated water molecule, which may constitute an interesting piece of information with regard to the enzymatic hydrolysis mechanism. PMID:23915572

  11. Computational prediction of hinge axes in proteins

    PubMed Central

    2014-01-01

    Background A protein's function is determined by the wide range of motions exhibited by its 3D structure. However, current experimental techniques are not able to reliably provide the level of detail required for elucidating the exact mechanisms of protein motion essential for effective drug screening and design. Computational tools are instrumental in the study of the underlying structure-function relationship. We focus on a special type of proteins called "hinge proteins" which exhibit a motion that can be interpreted as a rotation of one domain relative to another. Results This work proposes a computational approach that uses the geometric structure of a single conformation to predict the feasible motions of the protein and is founded in recent work from rigidity theory, an area of mathematics that studies flexibility properties of general structures. Given a single conformational state, our analysis predicts a relative axis of motion between two specified domains. We analyze a dataset of 19 structures known to exhibit this hinge-like behavior. For 15, the predicted axis is consistent with a motion to a second, known conformation. We present a detailed case study for three proteins whose dynamics have been well-studied in the literature: calmodulin, the LAO binding protein and the Bence-Jones protein. Conclusions Our results show that incorporating rigidity-theoretic analyses can lead to effective computational methods for understanding hinge motions in macromolecules. This initial investigation is the first step towards a new tool for probing the structure-dynamics relationship in proteins. PMID:25080829

  12. An amelogenin mutation leads to disruption of the odontogenic apparatus and aberrant expression of Notch I

    PubMed Central

    Chen, Xu; Li, Yong; Alawi, Faizan; Bouchard, Jessica R.; Kulkarni, Ashok B.; Gibson, Carolyn W.

    2012-01-01

    BACKGROUND Amelogenins are highly conserved proteins secreted by ameloblasts in the dental organ of developing teeth. These proteins regulate dental enamel thickness and structure in humans and mice. Mice that express an amelogenin transgene with a P70T mutation (TgP70T) develop abnormal epithelial proliferation in an amelogenin null (KO) background. Some of these cellular masses have the appearance of proliferating stratum intermedium, which is the layer adjacent to the ameloblasts in unerupted teeth. As Notch proteins are thought to constitute the developmental switch that separates ameloblasts from stratum intermedium, these signaling proteins were evaluated in normal and proliferating tissues. METHODS Mandibles were dissected for histology and immunohistochemistry using Notch I antibodies. Molar teeth were dissected for western blotting and RT-PCR for evaluation of Notch levels through imaging and statistical analyses. RESULTS Notch I was immunolocalized to ameloblasts of TgP70TKO mice, KO ameloblasts stained, but less strongly, and wild-type teeth had minimal staining. Cells within the proliferating epithelial cell masses were positive for Notch I and had an appearance reminiscent of calcifying epithelial odontogenic tumor with amyloid-like deposits. Notch I protein and mRNA were elevated in molar teeth from TgP70TKO mice. CONCLUSION Expression of TgP70T leads to abnormal structures in mandibles and maxillae of mice with the KO genetic background and these mice have elevated levels of Notch I in developing molars. As cells within the masses also express transgenic amelogenins, development of the abnormal proliferations suggests communication between amelogenin producing cells and the proliferating cells, dependent on the presence of the mutated amelogenin protein. PMID:20923441

  13. In situ data collection and structure refinement from microcapillary protein crystallization

    PubMed Central

    Yadav, Maneesh K.; Gerdts, Cory J.; Sanishvili, Ruslan; Smith, Ward W.; Roach, L. Spencer; Ismagilov, Rustem F.; Kuhn, Peter; Stevens, Raymond C.

    2007-01-01

    In situ X-ray data collection has the potential to eliminate the challenging task of mounting and cryocooling often fragile protein crystals, reducing a major bottleneck in the structure determination process. An apparatus used to grow protein crystals in capillaries and to compare the background X-ray scattering of the components, including thin-walled glass capillaries against Teflon, and various fluorocarbon oils against each other, is described. Using thaumatin as a test case at 1.8 Å resolution, this study demonstrates that high-resolution electron density maps and refined models can be obtained from in situ diffraction of crystals grown in microcapillaries. PMID:17468785

  14. Three-dimensional (3D) structure prediction of the American and African oil-palms β-ketoacyl-[ACP] synthase-II protein by comparative modelling

    PubMed Central

    Wang, Edina; Chinni, Suresh; Bhore, Subhash Janardhan

    2014-01-01

    Background: The fatty-acid profile of a vegetable oil determines its properties and nutritional value. Palm-oil obtained from the African oil-palm [Elaeis guineensis Jacq. (Tenera)] contains 44% palmitic acid (C16:0), but palm-oil obtained from the American oil-palm [Elaeis oleifera] contains only 25% C16:0. In part, the β-ketoacyl-[ACP] synthase II (KASII) [EC: 2.3.1.179] protein is responsible for the high level of C16:0 in palm-oil derived from the African oil-palm. To understand more about the E. guineensis KASII (EgKASII) and E. oleifera KASII (EoKASII) proteins, it is essential to know their structures. Hence, this study was undertaken. Objective: The objective of this study was to predict the three-dimensional (3D) structures of the EgKASII and EoKASII proteins using molecular modelling tools. Materials and Methods: The amino-acid sequences for the KASII proteins were retrieved from the protein database of the National Center for Biotechnology Information (NCBI), USA. The 3D structures were predicted for both proteins using the homology modelling and ab-initio approaches to protein structure prediction. A molecular dynamics (MD) simulation was performed to refine the predicted structures. The predicted structure models were evaluated, and root mean square deviation (RMSD) and root mean square fluctuation (RMSF) values were calculated. Results: The homology modelling showed that the EgKASII and EoKASII proteins are 78% and 74% similar to Streptococcus pneumoniae KASII and Brucella melitensis KASII, respectively. The EgKASII and EoKASII structures predicted using the ab-initio approach show 6% and 9% deviation, respectively, from the structures predicted by homology modelling. The structure refinement and validation confirmed that the predicted structures are accurate. Conclusion: The 3D structures for the EgKASII and EoKASII proteins were predicted. However, further research is essential to understand the interaction of the EgKASII and EoKASII proteins with their substrates. PMID:24748752
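
    The RMSD and RMSF values mentioned in the abstract follow standard definitions, which can be sketched as below. This is a minimal illustration, assuming the structures are already superimposed and given as plain lists of (x, y, z) tuples; the function names are hypothetical and nothing here reflects the software the authors actually used.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equal-length lists of (x, y, z) points.

    Assumes the two structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def rmsf(trajectory):
    """Per-atom root mean square fluctuation over a list of frames.

    Each frame is a list of (x, y, z) points with the same atom order.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        # Mean position of atom i over all frames.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        # Mean squared displacement of atom i from that mean position.
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3)) for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msd))
    return result
```

    RMSD summarizes how far one whole model sits from another, while RMSF reports, atom by atom, how much each position moves during a simulation.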

  15. Protein Denaturation on p-T Axes--Thermodynamics and Analysis.

    PubMed

    Smeller, László

    2015-01-01

    Proteins are essential players in the vast majority of molecular-level life processes. Since their structure is in most cases essential for their correct function, the study of their structural changes has attracted great interest in the past decades. The three-dimensional structure of proteins is influenced by several factors including temperature, pH, presence of chaotropic and cosmotropic agents, or presence of denaturants. Although pressure is as important a thermodynamic parameter as temperature, pressure studies are considerably less frequent in the literature, probably due to the technical difficulties associated with them. Although the first steps in high-pressure protein study were taken 100 years ago with Bridgman's ground-breaking work, the field was silent until modern spectroscopic techniques allowed the characterization of protein structural changes while the protein was under pressure. Recently a number of proteins were studied under pressure, and complete pressure-temperature phase diagrams were determined for several of them. This review summarizes the thermodynamic background of the typical elliptic p-T phase diagram, its limitations and the possible reasons for deviations of the experimental diagrams from the theoretical one. Finally we show some examples of experimentally determined pressure-temperature phase diagrams.

  16. Sites Inferred by Metabolic Background Assertion Labeling (SIMBAL): adapting the Partial Phylogenetic Profiling algorithm to scan sequences for signatures that predict protein function

    PubMed Central

    2010-01-01

    Background Comparative genomics methods such as phylogenetic profiling can mine powerful inferences from inherently noisy biological data sets. We introduce Sites Inferred by Metabolic Background Assertion Labeling (SIMBAL), a method that applies the Partial Phylogenetic Profiling (PPP) approach locally within a protein sequence to discover short sequence signatures associated with functional sites. The approach is based on the basic scoring mechanism employed by PPP, namely the use of binomial distribution statistics to optimize sequence similarity cutoffs during searches of partitioned training sets. Results Here we illustrate and validate the ability of the SIMBAL method to find functionally relevant short sequence signatures by application to two well-characterized protein families. In the first example, we partitioned a family of ABC permeases using a metabolic background property (urea utilization). Thus, the TRUE set for this family comprised members whose genome of origin encoded a urea utilization system. By moving a sliding window across the sequence of a permease, and searching each subsequence in turn against the full set of partitioned proteins, the method found which local sequence signatures best correlated with the urea utilization trait. Mapping of SIMBAL "hot spots" onto crystal structures of homologous permeases reveals that the significant sites are gating determinants on the cytosolic face rather than, say, docking sites for the substrate-binding protein on the extracellular face. In the second example, we partitioned a protein methyltransferase family using gene proximity as a criterion. In this case, the TRUE set comprised those methyltransferases encoded near the gene for the substrate RF-1. SIMBAL identifies sequence regions that map onto the substrate-binding interface while ignoring regions involved in the methyltransferase reaction mechanism in general. Neither method for training set construction requires any prior experimental characterization. Conclusions SIMBAL shows that, in functionally divergent protein families, selected short sequences often significantly outperform their full-length parent sequence for making functional predictions by sequence similarity, suggesting avenues for improved functional classifiers. When combined with structural data, SIMBAL affords the ability to localize and model functional sites. PMID:20102603
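
    The two ingredients named in the abstract, a sliding window over the sequence and a binomial score used to pick a similarity cutoff, can be sketched as follows. This is a simplified illustration and not the SIMBAL implementation: the function names are hypothetical, the similarity search itself is left out, and the input is assumed to be a ranked hit list already labeled as TRUE-set or not.

```python
from math import comb


def binomial_tail(k, n, p):
    """Probability of at least k successes in n trials with success probability p."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


def best_cutoff_score(ranked_labels, p_true):
    """Scan every cutoff depth in a ranked hit list and keep the most significant one.

    ranked_labels: booleans, best hit first, True if the hit is in the TRUE set.
    p_true: fraction of TRUE-set members in the whole partitioned training set.
    Returns (smallest tail probability, depth at which it occurs).
    """
    best = (1.0, 0)
    hits = 0
    for depth, is_true in enumerate(ranked_labels, start=1):
        hits += is_true
        pval = binomial_tail(hits, depth, p_true)
        if pval < best[0]:
            best = (pval, depth)
    return best


def sliding_windows(sequence, width):
    """Yield (start, subsequence) for every window of the given width."""
    for start in range(len(sequence) - width + 1):
        yield start, sequence[start:start + width]
```

    In a full pipeline each window would be searched against the partitioned protein set and scored with `best_cutoff_score`; windows with the smallest probabilities would correspond to the "hot spots" described above.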

  17. Protein structural similarity search by Ramachandran codes

    PubMed Central

    Lo, Wei-Cheng; Huang, Po-Jung; Chang, Chih-Hung; Lyu, Ping-Chiang

    2007-01-01

    Background Protein structural data has increased exponentially, such that fast and accurate tools are necessary for structure similarity search. To improve the search speed, several methods have been designed to reduce three-dimensional protein structures to one-dimensional text strings that are then analyzed by traditional sequence alignment methods; however, the accuracy is usually sacrificed and the speed is still unable to match sequence similarity search tools. Here, we aimed to improve the linear encoding methodology and develop efficient search tools that can rapidly retrieve structural homologs from large protein databases. Results We propose a new linear encoding method, SARST (Structural similarity search Aided by Ramachandran Sequential Transformation). SARST transforms protein structures into text strings through a Ramachandran map organized by nearest-neighbor clustering and uses a regenerative approach to produce substitution matrices. Then, classical sequence similarity search methods can be applied to the structural similarity search. Its accuracy is similar to that of Combinatorial Extension (CE), and it works over 243,000 times faster, searching 34,000 proteins in 0.34 sec with a 3.2-GHz CPU. SARST provides statistically meaningful expectation values to assess the retrieved information. It has been implemented as a web service and a stand-alone Java program that is able to run on many different platforms. Conclusion As a database search method, SARST can rapidly distinguish high from low similarities and efficiently retrieve homologous structures. It demonstrates that the easily accessible linear encoding methodology has the potential to serve as a foundation for efficient protein structural similarity search tools. These search tools are expected to be applicable to automated and high-throughput functional annotations or predictions for the ever-increasing number of published protein structures in this post-genomic era. PMID:17716377
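
    The general idea of linear encoding, turning backbone dihedral angles into a text string that sequence tools can search, can be sketched with a toy encoder. Note the assumptions: SARST organizes the Ramachandran map by nearest-neighbor clustering, whereas this sketch swaps in a uniform 90-degree grid purely for illustration, and the function names and letter assignments are hypothetical.

```python
import string


def ramachandran_code(phi, psi, bin_size=90):
    """Map a (phi, psi) pair in degrees to one letter using a uniform grid.

    A uniform grid is an illustrative simplification, not SARST's clustered map.
    With bin_size=90 the map has 4 x 4 = 16 cells, labeled 'A' through 'P'.
    """
    bins_per_axis = 360 // bin_size
    if bins_per_axis * bins_per_axis > len(string.ascii_uppercase):
        raise ValueError("bin_size too small for a 26-letter alphabet")
    col = int((phi + 180) % 360 // bin_size)
    row = int((psi + 180) % 360 // bin_size)
    return string.ascii_uppercase[row * bins_per_axis + col]


def encode_structure(angles):
    """Encode a list of (phi, psi) pairs as a one-dimensional string."""
    return "".join(ramachandran_code(phi, psi) for phi, psi in angles)
```

    Once structures are strings, any sequence search method can compare them, provided a substitution matrix suited to the structural letters is available.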

  18. Proteins with Novel Structure, Function and Dynamics

    NASA Technical Reports Server (NTRS)

    Pohorille, Andrew

    2014-01-01

    Recently, a small enzyme that ligates two RNA fragments at a rate of 10^6 above background was evolved in vitro (Seelig and Szostak, Nature 448:828-831, 2007). This enzyme does not resemble any contemporary protein (Chao et al., Nature Chem. Biol. 9:81-83, 2013). It consists of a dynamic, catalytic loop, a small, rigid core containing two zinc ions coordinated by neighboring amino acids, and two highly flexible tails that might be unimportant for protein function. In contrast to other proteins, this enzyme does not contain ordered secondary structure elements, such as alpha-helix or beta-sheet. The loop is kept together by just two interactions of a charged residue and a histidine with a zinc ion, which they coordinate on the opposite side of the loop. Such a structure appears to be very fragile. Surprisingly, computer simulations indicate otherwise. When the coordinating charged residue is mutated to alanine, another nearby charged residue takes its place, thus keeping the structure nearly intact. If this residue is also substituted by alanine, a salt bridge involving two other charged residues on opposite sides of the loop keeps the loop in place. These adjustments are facilitated by the high flexibility of the protein. Computational predictions have been confirmed experimentally, as both mutants retain full activity and overall structure. These results challenge our notions about what is required for protein activity and about the relationship between protein dynamics, stability and robustness. We hypothesize that small, highly dynamic proteins could be both active and fault tolerant in ways that many other proteins are not, i.e. they can adjust to retain their structure and activity even if subjected to mutations in structurally critical regions. This opens the door to designing proteins with novel functions, structures and dynamics that have not yet been considered.

  19. Characterizing protein domain associations by Small-molecule ligand binding

    PubMed Central

    Li, Qingliang; Cheng, Tiejun; Wang, Yanli; Bryant, Stephen H.

    2012-01-01

    Background Protein domains are evolutionarily conserved building blocks for protein structure and function, which are conventionally identified based on protein sequence or structure similarity. Small molecule binding domains are of great importance for the recognition of small molecules in biological systems and drug development. Many small molecules, including drugs, have been increasingly identified to bind to multiple targets, leading to promiscuous interactions with protein domains. Thus, a large scale characterization of the protein domains and their associations with respect to small-molecule binding is of particular interest to system biology research, drug target identification, as well as drug repurposing. Methods We compiled a collection of 13,822 physical interactions of small molecules and protein domains derived from the Protein Data Bank (PDB) structures. Based on the chemical similarity of these small molecules, we characterized pairwise associations of the protein domains and further investigated their global associations from a network point of view. Results We found that protein domains, despite lack of similarity in sequence and structure, were comprehensively associated through binding the same or similar small-molecule ligands. Moreover, we identified modules in the domain network that consisted of closely related protein domains by sharing similar biochemical mechanisms, being involved in relevant biological pathways, or being regulated by the same cognate cofactors. Conclusions A novel protein domain relationship was identified in the context of small-molecule binding, which is complementary to those identified by traditional sequence-based or structure-based approaches. The protein domain network constructed in the present study provides a novel perspective for chemogenomic study and network pharmacology, as well as target identification for drug repurposing. PMID:23745168

  20. Identify High-Quality Protein Structural Models by Enhanced K-Means.

    PubMed

    Wu, Hongjie; Li, Haiou; Jiang, Min; Chen, Cheng; Lv, Qiang; Wu, Chuang

    2017-01-01

    Background. One critical issue in protein three-dimensional structure prediction using either ab initio or comparative modeling involves identification of high-quality protein structural models from generated decoys. Currently, clustering algorithms are widely used to identify near-native models; however, their performance is dependent upon different conformational decoys, and, for some algorithms, the accuracy declines when the decoy population increases. Results. Here, we proposed two enhanced K-means clustering algorithms capable of robustly identifying high-quality protein structural models. The first one employs the clustering algorithm SPICKER to determine the initial centroids for basic K-means clustering (SK-means), whereas the other employs squared distance to optimize the initial centroids (K-means++). Our results showed that SK-means and K-means++ were more robust as compared with SPICKER alone, detecting 33 (59%) and 42 (75%) of 56 targets, respectively, with template modeling scores better than or equal to those of SPICKER. Conclusions. We observed that the classic K-means algorithm showed a similar performance to that of SPICKER, which is a widely used algorithm for protein-structure identification. Both SK-means and K-means++ demonstrated substantial improvements relative to results from SPICKER and classical K-means.

  1. Identify High-Quality Protein Structural Models by Enhanced K-Means

    PubMed Central

    Li, Haiou; Chen, Cheng; Lv, Qiang; Wu, Chuang

    2017-01-01

    Background. One critical issue in protein three-dimensional structure prediction using either ab initio or comparative modeling involves identification of high-quality protein structural models from generated decoys. Currently, clustering algorithms are widely used to identify near-native models; however, their performance is dependent upon different conformational decoys, and, for some algorithms, the accuracy declines when the decoy population increases. Results. Here, we proposed two enhanced K-means clustering algorithms capable of robustly identifying high-quality protein structural models. The first one employs the clustering algorithm SPICKER to determine the initial centroids for basic K-means clustering (SK-means), whereas the other employs squared distance to optimize the initial centroids (K-means++). Our results showed that SK-means and K-means++ were more robust as compared with SPICKER alone, detecting 33 (59%) and 42 (75%) of 56 targets, respectively, with template modeling scores better than or equal to those of SPICKER. Conclusions. We observed that the classic K-means algorithm showed a similar performance to that of SPICKER, which is a widely used algorithm for protein-structure identification. Both SK-means and K-means++ demonstrated substantial improvements relative to results from SPICKER and classical K-means. PMID:28421198
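
    The squared-distance seeding mentioned for K-means++ can be illustrated with a minimal sketch. It assumes each decoy has already been reduced to a numeric feature vector and uses plain Euclidean distance; the function names are hypothetical, and the paper's actual distance measure and implementation are not given in the abstract.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, rng=None):
    """Choose k initial centroids by squared-distance-weighted sampling.

    Assumption: `points` is a list of numeric tuples standing in for decoys.
    """
    rng = rng or random.Random(0)
    # The first centroid is drawn uniformly at random.
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        # Weight each point by its squared distance to the nearest centroid,
        # so points far from every existing centroid are favoured.
        weights = [min(squared_distance(p, c) for c in centroids) for p in points]
        if sum(weights) == 0:
            # Every point coincides with a centroid; fall back to uniform choice.
            centroids.append(rng.choice(points))
            continue
        centroids.append(rng.choices(points, weights=weights, k=1)[0])
    return centroids


if __name__ == "__main__":
    decoys = [(0.0, 0.0), (0.1, 0.0), (10.0, 10.0), (10.1, 10.0)]
    print(kmeans_pp_init(decoys, 2))
```

    The returned centroids would then be passed to an ordinary K-means loop in place of uniformly random starting points.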

  2. A novel inert crystal delivery medium for serial femtosecond crystallography

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Conrad, Chelsie E.; Basu, Shibom; James, Daniel

    Serial femtosecond crystallography (SFX) has opened a new era in crystallography by permitting nearly damage-free, room-temperature structure determination of challenging proteins such as membrane proteins. In SFX, femtosecond X-ray free-electron laser pulses produce diffraction snapshots from nanocrystals and microcrystals delivered in a liquid jet, which leads to high protein consumption. A slow-moving stream of agarose has been developed as a new crystal delivery medium for SFX. It has low background scattering, is compatible with both soluble and membrane proteins, and can deliver the protein crystals at a wide range of temperatures down to 4°C. Using this crystal-laden agarose stream, the structure of a multi-subunit complex, phycocyanin, was solved to 2.5 Å resolution using 300 µg of microcrystals embedded into the agarose medium post-crystallization. The agarose delivery method reduces protein consumption by at least 100-fold and has the potential to be used for a diverse population of proteins, including membrane protein complexes.

  3. A novel inert crystal delivery medium for serial femtosecond crystallography

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Conrad, Chelsie E.; Basu, Shibom; James, Daniel

    Serial femtosecond crystallography (SFX) has opened a new era in crystallography by permitting nearly damage-free, room-temperature structure determination of challenging proteins such as membrane proteins. In SFX, femtosecond X-ray free-electron laser pulses produce diffraction snapshots from nanocrystals and microcrystals delivered in a liquid jet, which leads to high protein consumption. A slow-moving stream of agarose has been developed as a new crystal delivery medium for SFX. It has low background scattering, is compatible with both soluble and membrane proteins, and can deliver the protein crystals at a wide range of temperatures down to 4°C. Using this crystal-laden agarose stream, the structure of a multi-subunit complex, phycocyanin, was solved to 2.5 Å resolution using 300 µg of microcrystals embedded into the agarose medium post-crystallization. The agarose delivery method reduces protein consumption by at least 100-fold and has the potential to be used for a diverse population of proteins, including membrane protein complexes.

  4. A novel inert crystal delivery medium for serial femtosecond crystallography

    DOE PAGES

    Conrad, Chelsie E.; Basu, Shibom; James, Daniel; ...

    2015-06-30

    Serial femtosecond crystallography (SFX) has opened a new era in crystallography by permitting nearly damage-free, room-temperature structure determination of challenging proteins such as membrane proteins. In SFX, femtosecond X-ray free-electron laser pulses produce diffraction snapshots from nanocrystals and microcrystals delivered in a liquid jet, which leads to high protein consumption. A slow-moving stream of agarose has been developed as a new crystal delivery medium for SFX. It has low background scattering, is compatible with both soluble and membrane proteins, and can deliver the protein crystals at a wide range of temperatures down to 4°C. Using this crystal-laden agarose stream, the structure of a multi-subunit complex, phycocyanin, was solved to 2.5 Å resolution using 300 µg of microcrystals embedded into the agarose medium post-crystallization. The agarose delivery method reduces protein consumption by at least 100-fold and has the potential to be used for a diverse population of proteins, including membrane protein complexes.

  5. Objective identification of residue ranges for the superposition of protein structures

    PubMed Central

    2011-01-01

    Background The automation of objectively selecting amino acid residue ranges for structure superpositions is important for meaningful and consistent protein structure analyses. So far there is no widely-used standard for choosing these residue ranges for experimentally determined protein structures, where the manual selection of residue ranges or the use of suboptimal criteria remain commonplace. Results We present an automated and objective method for finding amino acid residue ranges for the superposition and analysis of protein structures, in particular for structure bundles resulting from NMR structure calculations. The method is implemented in an algorithm, CYRANGE, that yields, without protein-specific parameter adjustment, appropriate residue ranges in most commonly occurring situations, including low-precision structure bundles, multi-domain proteins, symmetric multimers, and protein complexes. Residue ranges are chosen to comprise as many residues of a protein domain as possible, such that increasing their number further would lead to a steep rise in the RMSD value. Residue ranges are determined by first clustering residues into domains based on the distance variance matrix, and then refining for each domain the initial choice of residues by excluding residues one by one until the relative decrease of the RMSD value becomes insignificant. A penalty for the opening of gaps favours contiguous residue ranges in order to obtain a result that is as simple as possible, but not simpler. Results are given for a set of 37 proteins and compared with those of commonly used protein structure validation packages. We also provide residue ranges for 6351 NMR structures in the Protein Data Bank. Conclusions The CYRANGE method is capable of automatically determining residue ranges for the superposition of protein structure bundles for a large variety of protein structures. The method correctly identifies ordered regions. Global structure superpositions based on the CYRANGE residue ranges allow a clear presentation of the structure, and unnecessary small gaps within the selected ranges are absent. In the majority of cases, the residue ranges from CYRANGE contain fewer gaps and cover considerably larger parts of the sequence than those from other methods without significantly increasing the RMSD values. CYRANGE thus provides an objective and automatic method for standardizing the choice of residue ranges for the superposition of protein structures. PMID:21592348

  6. Optimizing physical energy functions for protein folding.

    PubMed

    Fujitsuka, Yoshimi; Takada, Shoji; Luthey-Schulten, Zaida A; Wolynes, Peter G

    2004-01-01

    We optimize a physical energy function for proteins with the use of the available structural database and perform three benchmark tests of the performance: (1) recognition of native structures in the background of predefined decoy sets of Levitt, (2) de novo structure prediction using fragment assembly sampling, and (3) molecular dynamics simulations. The energy parameter optimization is based on the energy landscape theory and uses a Monte Carlo search to find a set of parameters that seeks the largest ratio δE_s/ΔE for all proteins in a training set simultaneously. Here, δE_s is the stability gap between the native and the average in the denatured states and ΔE is the energy fluctuation among these states. Some of the energy parameters optimized are found to show significant correlation with experimentally observed quantities: (1) In the recognition test, the optimized function assigns the lowest energy to either the native or a near-native structure among many decoy structures for all the proteins studied. (2) Structure prediction with the fragment assembly sampling gives structure models with root mean square deviation less than 6 Å in one of the top five cluster centers for five of six proteins studied. (3) Structure prediction using molecular dynamics simulation gives poorer performance, implying the importance of having a more precise description of local structures. The physical energy function solely inferred from a structural database neither utilizes sequence information from the family of the target nor the outcome of the secondary structure prediction but can produce the correct native fold for many small proteins. Copyright 2003 Wiley-Liss, Inc.
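
    The quantity being maximized, the ratio of the stability gap to the energy fluctuation, can be written out for a single protein as a minimal sketch. It assumes the gap is measured as the mean decoy energy minus the native energy and the fluctuation as the population standard deviation of the decoy energies; both conventions and the function name are assumptions, since the abstract does not state them.

```python
from statistics import mean, pstdev


def gap_to_fluctuation_ratio(native_energy, decoy_energies):
    """Return (stability gap) / (energy fluctuation) for one protein.

    Assumptions: gap = mean(decoys) - native, so it is positive when the
    native state lies below the decoy average; fluctuation = population
    standard deviation of the decoy energies.
    """
    gap = mean(decoy_energies) - native_energy
    fluctuation = pstdev(decoy_energies)
    if fluctuation == 0:
        raise ValueError("decoy energies show no fluctuation")
    return gap / fluctuation


if __name__ == "__main__":
    # Hypothetical energies in arbitrary units.
    print(round(gap_to_fluctuation_ratio(-10.0, [-2.0, 0.0, 2.0]), 4))
```

    A parameter search of the kind described would evaluate this ratio for every protein in the training set under each candidate parameter set and keep the parameters that raise it across the set.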

  7. Automatic classification of protein structures relying on similarities between alignments

    PubMed Central

    2012-01-01

    Background Identification of protein structural cores requires isolation of sets of proteins that all share the same subset of structural motifs. In the context of an ever growing number of available 3D protein structures, standard and automatic clustering algorithms require adaptations so as to allow for efficient identification of such sets of proteins. Results When considering a pair of 3D structures, they are stated as similar or not according to the local similarities of their matching substructures in a structural alignment. This binary relation can be represented in a graph of similarities where a node represents a 3D protein structure and an edge states that two 3D protein structures are similar. Therefore, classifying proteins into structural families can be viewed as a graph clustering task. Unfortunately, because such a graph encodes only pairwise similarity information, clustering algorithms may include in the same cluster a subset of 3D structures that do not share a common substructure. In order to overcome this drawback we first define a ternary similarity on a triple of 3D structures as a constraint to be satisfied by the graph of similarities. Such a ternary constraint takes into account similarities between pairwise alignments, so as to ensure that the three involved protein structures do have some common substructure. We propose a modification algorithm that eliminates edges from the original graph of similarities and gives a reduced graph in which no ternary constraints are violated. Our approach is first to build a graph of similarities, then to reduce the graph according to the modification algorithm, and finally to apply a standard graph clustering algorithm to the reduced graph. This method was used for classifying ASTRAL-40 non-redundant protein domains, identifying significant pairwise similarities with Yakusa, a program devised for rapid 3D structure alignments. Conclusions We show that filtering similarities prior to a standard graph-based clustering process by applying ternary similarity constraints i) improves the separation of proteins of different classes and consequently ii) improves the classification quality of standard graph-based clustering algorithms according to the reference classification SCOP. PMID:22974051

  8. Query3d: a new method for high-throughput analysis of functional residues in protein structures

    PubMed Central

    Ausiello, Gabriele; Via, Allegra; Helmer-Citterich, Manuela

    2005-01-01

    Background The identification of local similarities between two protein structures can provide clues of a common function. Many different methods exist for searching for similar subsets of residues in proteins of known structure. However, the lack of functional and structural information on single residues, together with the low level of integration of this information in comparison methods, is a limitation that prevents these methods from being fully exploited in high-throughput analyses. Results Here we describe Query3d, a program that is both a structural DBMS (Database Management System) and a local comparison method. The method conserves a copy of all the residues of the Protein Data Bank annotated with a variety of functional and structural information. New annotations can be easily added from a variety of methods and known databases. The algorithm makes it possible to create complex queries based on the residues' function and then to compare only subsets of the selected residues. Functional information is also essential to speed up the comparison and the analysis of the results. Conclusion With Query3d, users can easily obtain statistics on how many and which residues share certain properties in all proteins of known structure. At the same time, the method also finds their structural neighbours in the whole PDB. Programs and data can be accessed through the PdbFun web interface. PMID:16351754

  9. Protein structure prediction with local adjust tabu search algorithm

    PubMed Central

    2014-01-01

    Background Protein folding structure prediction is one of the most challenging problems in the bioinformatics domain. Because of the complexity of realistic protein structures, a simplified structure model and a computational method should be adopted. The AB off-lattice model is one such simplified model; it considers only two classes of amino acids, hydrophobic (A) residues and hydrophilic (B) residues. Results The main work of this paper is to discuss how to find the lowest-energy configurations in the 2D and 3D off-lattice models using Fibonacci sequences and real protein sequences. To avoid becoming trapped in local minima and to converge faster to the global minimum, we introduce a novel method (SATS) to the protein structure problem, which combines the simulated annealing algorithm and the tabu search algorithm. Various strategies, such as a new encoding strategy, an adaptive neighborhood generation strategy and a local adjustment strategy, are adopted for high-speed searching of the optimal conformation, which corresponds to the lowest energy of the protein sequence. Experimental results show that some of the results obtained by the improved SATS are better than those reported in the previous literature, and we can be sure that the lowest-energy folding states for short Fibonacci sequences have been found. Conclusions Although the off-lattice models are not very realistic, they can reflect some important characteristics of real proteins. The 3D off-lattice model resembles the native folding structure of a real protein more closely than the 2D off-lattice model does. In addition, compared with some previous studies, the proposed hybrid algorithm can search the spatial folding structure of a protein chain more effectively and more quickly. PMID:25474708
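
    The Fibonacci test sequences used with the AB model can be generated with a short sketch. It assumes the recursion commonly used for this benchmark, S_0 = A, S_1 = B, S_(i+1) = S_(i-1) followed by S_i; the abstract itself does not spell out the recursion, and the function name is hypothetical.

```python
def fibonacci_ab_sequence(n):
    """Return the n-th Fibonacci sequence over the two residue classes A and B.

    Assumed recursion: S_0 = "A", S_1 = "B", S_(i+1) = S_(i-1) + S_i
    (string concatenation), so the lengths follow the Fibonacci numbers.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    prev, curr = "A", "B"
    if n == 0:
        return prev
    for _ in range(n - 1):
        # Concatenate the two most recent sequences to build the next one.
        prev, curr = curr, prev + curr
    return curr


if __name__ == "__main__":
    for i in range(2, 8):
        seq = fibonacci_ab_sequence(i)
        print(i, len(seq), seq)
```

    Under this recursion the sequence lengths are 2, 3, 5, 8, 13, 21 and so on, which gives a family of progressively harder test chains for an energy-minimization search.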

  10. Enlazin, a Natural Fusion of Two Classes of Canonical Cytoskeletal Proteins, Contributes to Cytokinesis Dynamics

    PubMed Central

    Octtaviani, Edelyn; Effler, Janet C.

    2006-01-01

    Cytokinesis requires a complex network of equatorial and global proteins to regulate cell shape changes. Here, using interaction genetics, we report the first characterization of a novel protein, enlazin. Enlazin is a natural fusion of two canonical classes of actin-associated proteins, the ezrin-radixin-moesin family and fimbrin, and it is localized to actin-rich structures. A fragment of enlazin, enl-tr, was isolated as a genetic suppressor of the cytokinesis defect of cortexillin-I mutants. Expression of enl-tr disrupts expression of endogenous enlazin, indicating that enl-tr functions as a dominant-negative lesion. Enlazin is distributed globally during cytokinesis and is required for cortical tension and cell adhesion. Consistent with a role in cell mechanics, inhibition of enlazin in a cortexillin-I background restores cytokinesis furrowing dynamics and suppresses the growth-in-suspension defect. However, as expected for a role in cell adhesion, inhibiting enlazin in a myosin-II background induces a synthetic cytokinesis phenotype, frequently arresting furrow ingression at the dumbbell shape and/or causing recession of the furrow. Thus, enlazin has roles in cell mechanics and adhesion, and these roles seem to be differentially significant for cytokinesis, depending on the genetic background. PMID:17050732

  11. Quality assessment of protein model-structures based on structural and functional similarities

    PubMed Central

    2012-01-01

    Background Experimental determination of protein 3D structures is expensive, time consuming and sometimes impossible. The gap between the number of protein structures deposited in the Worldwide Protein Data Bank and the number of sequenced proteins is constantly widening. Computational modeling is deemed to be one of the ways to deal with the problem. Although protein 3D structure prediction is a difficult task, many tools are available. These tools can model a structure from a sequence or from partial structural information, e.g. contact maps. Consequently, biologists have the ability to automatically generate a putative 3D structure model of any protein. However, the main issue becomes evaluation of the model quality, which is one of the most important challenges of structural biology. Results GOBA - Gene Ontology-Based Assessment is a novel Protein Model Quality Assessment Program. It estimates the compatibility between a model-structure and its expected function. GOBA is based on the assumption that a high quality model is expected to be structurally similar to proteins functionally similar to the prediction target. Whereas DALI is used to measure structure similarity, protein functional similarity is quantified using the standardized and hierarchical description of proteins provided by Gene Ontology combined with Wang's algorithm for calculating semantic similarity. Two approaches are proposed to express the quality of protein model-structures. One is a single model quality assessment method, the other is its modification, which provides a relative measure of model quality. Exhaustive evaluation is performed on data sets of model-structures submitted to the CASP8 and CASP9 contests. Conclusions The validation shows that the method is able to discriminate between good and bad model-structures. The best of the tested GOBA scores achieved 0.74 and 0.8 as a mean Pearson correlation to the observed quality of models in our CASP8 and CASP9-based validation sets. GOBA also obtained the best result for two targets of CASP8, and one of CASP9, compared to the contest participants. Consequently, GOBA offers a novel single model quality assessment program that addresses the practical needs of biologists. In conjunction with other Model Quality Assessment Programs (MQAPs), it would prove useful for the evaluation of single protein models. PMID:22998498

  12. Structures of Rotavirus Reassortants Demonstrate Correlation of Altered Conformation of the VP4 Spike and Expression of Unexpected VP4-Associated Phenotypes

    PubMed Central

    Pesavento, Joseph B.; Billingsley, Angela M.; Roberts, Ed J.; Ramig, Robert F.; Prasad, B. V. Venkataram

    2003-01-01

    Numerous prior studies have indicated that viable rotavirus reassortants containing structural proteins of heterologous parental origin may express unexpected phenotypes, such as changes in infectivity and immunogenicity. To provide a structural basis for alterations in phenotypic expression, a three-dimensional structural analysis of these reassortants was conducted. The structures of the reassortants show that while VP4 generally maintains the parental structure when moved to a heterologous protein background, in certain reassortants, there are subtle alterations in the conformation of VP4. The alterations in VP4 conformation correlated with expression of unexpected VP4-associated phenotypes. Interactions between heterologous VP4 and VP7 in reassortants expressing unexpected phenotypes appeared to induce the conformational alterations seen in VP4. PMID:12584352

  13. Structural genomics analysis of uncharacterized protein families overrepresented in human gut bacteria identifies a novel glycoside hydrolase

    PubMed Central

    2014-01-01

    Background Bacteroides spp. form a significant part of our gut microbiome and are well known for optimized metabolism of diverse polysaccharides. Initial analysis of the archetypal Bacteroides thetaiotaomicron genome identified 172 glycosyl hydrolases and a large number of uncharacterized proteins associated with polysaccharide metabolism. Results BT_1012 from Bacteroides thetaiotaomicron VPI-5482 is a protein of unknown function and a member of a large protein family consisting entirely of uncharacterized proteins. Initial sequence analysis predicted that this protein has two domains, one on the N- and one on the C-terminal. A PSI-BLAST search found over 150 full length and over 90 half size homologs consisting only of the N-terminal domain. The experimentally determined three-dimensional structure of the BT_1012 protein confirms its two-domain architecture and structural analysis of both domains suggests their specific functions. The N-terminal domain is a putative catalytic domain with significant similarity to known glycoside hydrolases, the C-terminal domain has a beta-sandwich fold typically found in C-terminal domains of other glycosyl hydrolases, however these domains are typically involved in substrate binding. We describe the structure of the BT_1012 protein and discuss its sequence-structure relationship and their possible functional implications. Conclusions Structural and sequence analyses of the BT_1012 protein identifies it as a glycosyl hydrolase, expanding an already impressive catalog of enzymes involved in polysaccharide metabolism in Bacteroides spp. Based on this we have renamed the Pfam families representing the two domains found in the BT_1012 protein, PF13204 and PF12904, as putative glycoside hydrolase and glycoside hydrolase-associated C-terminal domain respectively. PMID:24742328

  14. Improved protein surface comparison and application to low-resolution protein structure data

    PubMed Central

    2010-01-01

    Background Recent advancements of experimental techniques for determining protein tertiary structures raise significant challenges for protein bioinformatics. With the number of known structures of unknown function expanding at a rapid pace, an urgent task is to provide reliable clues to their biological function on a large scale. Conventional approaches for structure comparison are not suitable for a real-time database search due to their slow speed. Moreover, a new challenge has arisen from recent techniques such as electron microscopy (EM), which provide low-resolution structure data. Previously, we have introduced a method for protein surface shape representation using the 3D Zernike descriptors (3DZDs). The 3DZD enables fast structure database searches, taking advantage of its rotation invariance and compact representation. The search results for protein surfaces represented with the 3DZD have shown good agreement with the existing structure classifications, but some discrepancies were also observed. Results Three surface representations are examined: one based on backbone atoms, the originally devised all-atom surface representation, and the combination of the all-atom surface with the backbone representation. All representations are encoded with the 3DZD. Also, we have investigated the applicability of the 3DZD for searching protein EM density maps of varying resolutions. The surface representations are evaluated on structure retrieval using two existing classifications, SCOP and the CE-based classification. Conclusions Overall, the 3DZDs representing backbone atoms show better retrieval performance than the original all-atom surface representation. The performance further improved when the two representations are combined. Moreover, we observed that the 3DZD is also powerful in comparing low-resolution structures obtained by electron microscopy. PMID:21172052

  15. Motivated Proteins: A web application for studying small three-dimensional protein motifs

    PubMed Central

    Leader, David P; Milner-White, E James

    2009-01-01

    Background Small loop-shaped motifs are common constituents of the three-dimensional structure of proteins. Typically they comprise between three and seven amino acid residues, and are defined by a combination of dihedral angles and hydrogen bonding partners. The most abundant of these are αβ-motifs, asx-motifs, asx-turns, β-bulges, β-bulge loops, β-turns, nests, niches, Schellmann loops, ST-motifs, ST-staples and ST-turns. We have constructed a database of such motifs from a range of high-quality protein structures and built a web application as a visual interface to this. Description The web application, Motivated Proteins, provides access to these 12 motifs (with 48 sub-categories) in a database of over 400 representative proteins. Queries can be made for specific categories or sub-categories of motif, motifs in the vicinity of ligands, motifs which include part of an enzyme active site, overlapping motifs, or motifs which include a particular amino acid sequence. Individual proteins can be specified, or, where appropriate, motifs for all proteins listed. The results of queries are presented in textual form as an (X)HTML table, and may be saved as parsable plain text or XML. Motifs can be viewed and manipulated either individually or in the context of the protein in the Jmol applet structural viewer. Cartoons of the motifs imposed on a linear representation of protein secondary structure are also provided. Summary information for the motifs is available, as are histograms of amino acid distribution, and graphs of dihedral angles at individual positions in the motifs. Conclusion Motivated Proteins is a publicly and freely accessible web application that enables protein scientists to study small three-dimensional motifs without requiring knowledge of either Structured Query Language or the underlying database schema. PMID:19210785

  16. FPGA accelerator for protein secondary structure prediction based on the GOR algorithm

    PubMed Central

    2011-01-01

    Background Protein is an important molecule that performs a wide range of functions in biological systems. Recently, protein folding has attracted much more attention, since the function of a protein can generally be derived from its molecular structure. The GOR algorithm is one of the most successful computational methods and has been widely used as an efficient analysis tool to predict secondary structure from protein sequence. However, the execution time is still intolerable given the steep growth of protein databases. Recently, FPGA chips have emerged as a promising application accelerator for bioinformatics algorithms by exploiting fine-grained custom design. Results In this paper, we propose a complete fine-grained parallel hardware implementation on FPGA to accelerate the GOR-IV package for 2D protein structure prediction. To improve computing efficiency, we partition the parameter table into small segments and access them in parallel. We aggressively exploit data reuse schemes to minimize the need for loading data from external memory. The whole computation structure is carefully pipelined to overlap the sequence loading, computing and back-writing operations as much as possible. We implemented a complete GOR desktop system based on an FPGA chip XC5VLX330. Conclusions The experimental results show a speedup factor of more than 430x over the original GOR-IV version and 110x speedup over the optimized version with multi-thread SIMD implementation running on a PC platform with AMD Phenom 9650 Quad CPU for 2D protein structure prediction. However, the power consumption is only about 30% of that of current general-purpose CPUs. PMID:21342582

  17. Channel crossing: how are proteins shipped across the bacterial plasma membrane?

    PubMed

    Collinson, Ian; Corey, Robin A; Allen, William J

    2015-10-05

    The structure of the first protein-conducting channel was determined more than a decade ago. Today, we are still puzzled by the outstanding problem of protein translocation--the dynamic mechanism underlying the consignment of proteins across and into membranes. This review is an attempt to summarize and understand the energy transducing capabilities of protein-translocating machines, with emphasis on bacterial systems: how polypeptides make headway against the lipid bilayer and how the process is coupled to the free energy associated with ATP hydrolysis and the transmembrane proton motive force. In order to explore how cargo is driven across the membrane, the known structures of the protein-translocation machines are set out against the background of the historic literature, and in the light of experiments conducted in their wake. The paper will focus on the bacterial general secretory (Sec) pathway (SecY-complex), and its eukaryotic counterpart (Sec61-complex), which ferry proteins across the membrane in an unfolded state, as well as the unrelated Tat system that assembles bespoke channels for the export of folded proteins. © 2015 The Authors.

  18. Statistical discovery of site inter-dependencies in sub-molecular hierarchical protein structuring

    PubMed Central

    2012-01-01

    Background Much progress has been made in understanding the 3D structure of proteins using methods such as NMR and X-ray crystallography. The resulting 3D structures are extremely informative, but do not always reveal which sites and residues within the structure are of special importance. Recently, there are indications that multiple-residue, sub-domain structural relationships within the larger 3D consensus structure of a protein can be inferred from the analysis of the multiple sequence alignment data of a protein family. These intra-dependent clusters of associated sites are used to indicate hierarchical inter-residue relationships within the 3D structure. To reveal the patterns of associations among individual amino acids or sub-domain components within the structure, we apply a k-modes attribute (aligned site) clustering algorithm to the ubiquitin and transthyretin families in order to discover associations among groups of sites within the multiple sequence alignment. We then observe what these associations imply within the 3D structure of these two protein families. Results The k-modes site clustering algorithm we developed maximizes the intra-group interdependencies based on a normalized mutual information measure. The clusters formed correspond to sub-structural components or binding and interface locations. Applying this data-directed method to the ubiquitin and transthyretin protein family multiple sequence alignments as a test bed, we located numerous interesting associations of interdependent sites. These clusters were then arranged into cluster tree diagrams which revealed four structural sub-domains within the single domain structure of ubiquitin and a single large sub-domain within transthyretin associated with the interface among transthyretin monomers. In addition, several clusters of mutually interdependent sites were discovered for each protein family, each of which appear to play an important role in the molecular structure and/or function. 
Conclusions Our results demonstrate that the method we present here using a k-modes site clustering algorithm based on interdependency evaluation among sites obtained from a sequence alignment of homologous proteins can provide significant insights into the complex, hierarchical inter-residue structural relationships within the 3D structure of a protein family. PMID:22793672
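
The interdependency measure mentioned in the abstract above can be illustrated with a small sketch. It assumes that "normalized mutual information" means the mutual information of two aligned columns divided by their joint entropy; the abstract does not give the exact formula, so this normalization, the function names and the toy alignment are all illustrative assumptions rather than the authors' implementation.

```python
from collections import Counter
from math import log2


def entropy(symbols):
    """Shannon entropy (in bits) of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())


def normalized_mutual_information(col_a, col_b):
    """Mutual information of two aligned columns divided by their joint entropy.

    Assumed normalization: I(A;B) / H(A,B). Returns 0.0 when both columns
    are fully conserved, since there is no co-variation to measure.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must come from the same alignment")
    h_ab = entropy(list(zip(col_a, col_b)))
    if h_ab == 0:
        return 0.0
    return (entropy(col_a) + entropy(col_b) - h_ab) / h_ab


# Toy alignment (made-up data): rows are sequences, columns are aligned sites.
alignment = ["AKLD", "AKLE", "GRLD", "GRLE"]
columns = list(zip(*alignment))

# Sites 0 and 1 always change together; sites 0 and 3 vary independently.
coupled = normalized_mutual_information(columns[0], columns[1])
independent = normalized_mutual_information(columns[0], columns[3])
```

A site clustering step such as k-modes could then group columns whose pairwise values are high; that step is not shown here.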

  19. TAP score: torsion angle propensity normalization applied to local protein structure evaluation

    PubMed Central

    Tosatto, Silvio CE; Battistutta, Roberto

    2007-01-01

    Background Experimentally determined protein structures may contain errors and require validation. Conformational criteria based on the Ramachandran plot are mainly used to distinguish between distorted and adequately refined models. While the readily available criteria are sufficient to detect totally wrong structures, establishing the more subtle differences between plausible structures remains more challenging. Results A new criterion, called TAP score, measuring local sequence to structure fitness based on torsion angle propensities normalized against the global minimum and maximum is introduced. It is shown to be more accurate than previous methods at estimating the validity of a protein model in terms of commonly used experimental quality parameters on two test sets representing the full PDB database and a subset of obsolete PDB structures. Highly selective TAP thresholds are derived to recognize over 90% of the top experimental structures in the absence of experimental information. Both a web server and an executable version of the TAP score are available at . Conclusion A novel procedure for energy normalization (TAP) has significantly improved the possibility to recognize the best experimental structures. It will allow the user to more reliably isolate problematic structures in the context of automated experimental structure determination. PMID:17504537
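
The idea of normalizing a propensity sum against its global minimum and maximum, as described in the abstract above, can be sketched as follows. This is only a min-max rescaling under stated assumptions: the per-residue propensity values, the function name and the numbers are hypothetical, and the published TAP score may differ in its details.

```python
def normalized_propensity_score(observed, lowest, highest):
    """Rescale a summed propensity to the range [0, 1].

    observed: propensity of each residue for its observed torsion angles
    lowest / highest: worst and best propensity attainable for each residue
    (all three lists are hypothetical inputs for illustration)
    """
    total, low, high = sum(observed), sum(lowest), sum(highest)
    if high == low:
        raise ValueError("minimum and maximum coincide; score is undefined")
    return (total - low) / (high - low)


# Made-up values for a three-residue fragment.
observed = [0.5, -0.2, 1.0]
lowest = [-1.0, -1.0, -1.0]
highest = [1.0, 1.0, 1.5]
score = normalized_propensity_score(observed, lowest, highest)
```

Because the sum is rescaled per sequence, scores from proteins of different length and composition become comparable, which is the apparent motivation for the normalization.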

  20. Antibody-protein interactions: benchmark datasets and prediction tools evaluation

    PubMed Central

    Ponomarenko, Julia V; Bourne, Philip E

    2007-01-01

    Background The ability to predict antibody binding sites (aka antigenic determinants or B-cell epitopes) for a given protein is a precursor to new vaccine design and diagnostics. Among the various methods of B-cell epitope identification X-ray crystallography is one of the most reliable methods. Using these experimental data computational methods exist for B-cell epitope prediction. As the number of structures of antibody-protein complexes grows, further interest in prediction methods using 3D structure is anticipated. This work aims to establish a benchmark for 3D structure-based epitope prediction methods. Results Two B-cell epitope benchmark datasets inferred from the 3D structures of antibody-protein complexes were defined. The first is a dataset of 62 representative 3D structures of protein antigens with inferred structural epitopes. The second is a dataset of 82 structures of antibody-protein complexes containing different structural epitopes. Using these datasets, eight web-servers developed for antibody and protein binding sites prediction have been evaluated. In no method did performance exceed a 40% precision and 46% recall. The values of the area under the receiver operating characteristic curve for the evaluated methods were about 0.6 for ConSurf, DiscoTope, and PPI-PRED methods and above 0.65 but not exceeding 0.70 for protein-protein docking methods when the best of the top ten models for the bound docking were considered; the remaining methods performed close to random. The benchmark datasets are included as a supplement to this paper. Conclusion It may be possible to improve epitope prediction methods through training on datasets which include only immune epitopes and through utilizing more features characterizing epitopes, for example, the evolutionary conservation score. 
Notwithstanding, overall poor performance may reflect the generality of antigenicity and hence the inability to decipher B-cell epitopes as an intrinsic feature of the protein. It is an open question as to whether ultimately discriminatory features can be found. PMID:17910770
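
The precision and recall figures quoted in the abstract above are residue-level measures. A minimal sketch of how they could be computed for one antigen is given below; the residue numbers are invented, and the benchmark's exact definition of an epitope residue is not stated in the abstract.

```python
def precision_recall(predicted, actual):
    """Precision and recall of a predicted residue set against a reference set."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical residue numbers for a single antigen.
predicted_epitope = {12, 13, 14, 15, 40}
structural_epitope = {13, 14, 15, 16, 17, 18}
precision, recall = precision_recall(predicted_epitope, structural_epitope)
```

Here three of the five predicted residues are correct (precision 0.6) and three of the six epitope residues are recovered (recall 0.5).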

  1. qPIPSA: Relating enzymatic kinetic parameters and interaction fields

    PubMed Central

    Gabdoulline, Razif R; Stein, Matthias; Wade, Rebecca C

    2007-01-01

    Background The simulation of metabolic networks in quantitative systems biology requires the assignment of enzymatic kinetic parameters. Experimentally determined values are often not available and therefore computational methods to estimate these parameters are needed. It is possible to use the three-dimensional structure of an enzyme to perform simulations of a reaction and derive kinetic parameters. However, this is computationally demanding and requires detailed knowledge of the enzyme mechanism. We have therefore sought to develop a general, simple and computationally efficient procedure to relate protein structural information to enzymatic kinetic parameters that allows consistency between the kinetic and structural information to be checked and estimation of kinetic constants for structurally and mechanistically similar enzymes. Results We describe qPIPSA: quantitative Protein Interaction Property Similarity Analysis. In this analysis, molecular interaction fields, for example, electrostatic potentials, are computed from the enzyme structures. Differences in molecular interaction fields between enzymes are then related to the ratios of their kinetic parameters. This procedure can be used to estimate unknown kinetic parameters when enzyme structural information is available and kinetic parameters have been measured for related enzymes or were obtained under different conditions. The detailed interaction of the enzyme with substrate or cofactors is not modeled and is assumed to be similar for all the proteins compared. The protein structure modeling protocol employed ensures that differences between models reflect genuine differences between the protein sequences, rather than random fluctuations in protein structure. 
Conclusion Provided that the experimental conditions and the protein structural models refer to the same protein state or conformation, correlations between interaction fields and kinetic parameters can be established for sets of related enzymes. Outliers may arise due to variation in the importance of different contributions to the kinetic parameters, such as protein stability and conformational changes. The qPIPSA approach can assist in the validation as well as estimation of kinetic parameters, and provide insights into enzyme mechanism. PMID:17919319

  2. Structural domains and main-chain flexibility in prion proteins.

    PubMed

    Blinov, N; Berjanskii, M; Wishart, D S; Stepanova, M

    2009-02-24

    In this study we describe a novel approach to define structural domains and to characterize the local flexibility in both human and chicken prion proteins. The approach we use is based on a comprehensive theory of collective dynamics in proteins that was recently developed. This method determines the essential collective coordinates, which can be found from molecular dynamics trajectories via principal component analysis. Under this particular framework, we are able to identify the domains where atoms move coherently while at the same time determining the local main-chain flexibility for each residue. We have verified this approach by comparing our results for the predicted dynamic domain systems with the computed main-chain flexibility profiles and the NMR-derived random coil indexes for human and chicken prion proteins. The three sets of data show excellent agreement. Additionally, we demonstrate that the dynamic domains calculated in this fashion provide a highly sensitive measure of protein collective structure and dynamics. Furthermore, such an analysis is capable of revealing structural and dynamic properties of proteins that are inaccessible to the conventional assessment of secondary structure. Using the collective dynamic simulation approach described here along with high-temperature simulations of the unfolding of human prion protein, we have explored whether locations of relatively low stability could be identified where the unfolding process could potentially be facilitated. According to our analysis, the locations of relatively low stability may be associated with the beta-sheet formed by strands S1 and S2 and the adjacent loops, whereas helix HC appears to be a relatively stable part of the protein. We suggest that this kind of structural analysis may provide a useful background for a more quantitative assessment of potential routes of spontaneous misfolding in prion proteins.

  3. Protein structure analysis of mutations causing inheritable diseases. An e-Science approach with life scientist friendly interfaces.

    PubMed

    Venselaar, Hanka; Te Beek, Tim A H; Kuipers, Remko K P; Hekkelman, Maarten L; Vriend, Gert

    2010-11-08

    Many newly detected point mutations are located in protein-coding regions of the human genome. Knowledge of their effects on the protein's 3D structure provides insight into the protein's mechanism, can aid the design of further experiments, and eventually can lead to the development of new medicines and diagnostic tools. In this article we describe HOPE, a fully automatic program that analyzes the structural and functional effects of point mutations. HOPE collects information from a wide range of information sources including calculations on the 3D coordinates of the protein by using WHAT IF Web services, sequence annotations from the UniProt database, and predictions by DAS services. Homology models are built with YASARA. Data is stored in a database and used in a decision scheme to identify the effects of a mutation on the protein's 3D structure and function. HOPE builds a report with text, figures, and animations that is easy to use and understandable for (bio)medical researchers. We tested HOPE by comparing its output to the results of manually performed projects. In all straightforward cases HOPE performed similarly to a trained bioinformatician. The use of 3D structures helps optimize the results in terms of reliability and details. HOPE's results are easy to understand and are presented in a way that is attractive for researchers without an extensive bioinformatics background.

  4. Recent research in flaxseed (oil seed) on molecular structure and metabolic characteristics of protein, heat processing-induced effect and nutrition with advanced synchrotron-based molecular techniques.

    PubMed

    Doiron, Kevin J; Yu, Peiqiang

    2017-01-02

    Advanced synchrotron radiation-based infrared microspectroscopy is able to reveal feed and food structural features at cellular and molecular levels and simultaneously provides information on composition, structure, environment, and chemistry within intact tissue. However, to date, this advanced synchrotron-based technique is still little known to food and feed scientists. This article aims to provide detailed background for flaxseed (oil seed) protein research and then review recent progress and development in flaxseed research in ruminant nutrition in the areas of (1) dietary inclusion of flaxseed in rations; (2) heat processing effect; (3) assessing dietary protein; (4) synchrotron-based Fourier transform infrared microspectroscopy as a tool of nutritive evaluation within cellular and subcellular dimensions; (5) recent synchrotron applications in flaxseed research on a molecular basis. The information described in this paper gives better insight into progress and updates in flaxseed research.

  5. Chromophore Structure of Photochromic Fluorescent Protein Dronpa: Acid-Base Equilibrium of Two Cis Configurations.

    PubMed

    Higashino, Asuka; Mizuno, Misao; Mizutani, Yasuhisa

    2016-04-07

    Dronpa is a novel photochromic fluorescent protein that exhibits fast response to light. The present article is the first report of the resonance and preresonance Raman spectra of Dronpa. We used the intensity and frequency of Raman bands to determine the structure of the Dronpa chromophore in two thermally stable photochromic states. The acid-base equilibrium in one photochromic state was observed by spectroscopic pH titration. The Raman spectra revealed that the chromophore in this state shows a protonation/deprotonation transition with a pKa of 5.2 ± 0.3 and maintains the cis configuration. The observed resonance Raman bands showed that the other photochromic state of the chromophore is in a trans configuration. The results demonstrate that Raman bands selectively enhanced for the chromophore yield valuable information on the molecular structure of the chromophore in photochromic fluorescent proteins after careful elimination of the fluorescence background.

  6. Space-related pharma-motifs for fast search of protein binding motifs and polypharmacological targets

    PubMed Central

    2012-01-01

    Background Discovering a compound that inhibits multiple proteins (i.e. polypharmacological targets) is a new paradigm for complex diseases (e.g. cancers and diabetes). In general, polypharmacological proteins often share similar local binding environments and motifs. With the exponential growth in the number of protein structures, finding similar structural binding motifs (pharma-motifs) is an urgent task for drug discovery (e.g. side effects and new uses for old drugs) and for understanding protein functions. Results We have developed a Space-Related Pharmamotifs (called SRPmotif) method to recognize binding motifs by searching against a protein structure database. SRPmotif is able to recognize conserved binding environments containing spatially discontinuous pharma-motifs, which are often short conserved peptides with specific physico-chemical properties for protein functions. Among 356 pharma-motifs, 56.5% of interacting residues are highly conserved. Experimental results indicate that 81.1% and 92.7% of the polypharmacological targets of each protein-ligand complex are annotated with the same biological process (BP) and molecular function (MF) terms, respectively, based on Gene Ontology (GO). Our experimental results show that the identified pharma-motifs often consist of key residues in functional (active) sites and play key roles in protein functions. The SRPmotif is available at http://gemdock.life.nctu.edu.tw/SRP/. Conclusions SRPmotif is able to identify similar pharma-interfaces and pharma-motifs sharing similar binding environments for polypharmacological targets by rapidly searching against the protein structure database. Pharma-motifs describe the conservation of binding environments for drug discovery and protein functions. Additionally, these pharma-motifs provide clues for discovering new sequence-based motifs to predict protein functions from protein sequence databases. 
We believe that SRPmotif is useful for elucidating protein functions and drug discovery. PMID:23281852

  7. Multiple graph regularized protein domain ranking

    PubMed Central

    2012-01-01

    Background Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods. Results To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods. Conclusion The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications. PMID:23157331

  8. Combining Functional and Structural Genomics to Sample the Essential Burkholderia Structome

    PubMed Central

    Baugh, Loren; Gallagher, Larry A.; Patrapuvich, Rapatbhorn; Clifton, Matthew C.; Gardberg, Anna S.; Edwards, Thomas E.; Armour, Brianna; Begley, Darren W.; Dieterich, Shellie H.; Dranow, David M.; Abendroth, Jan; Fairman, James W.; Fox, David; Staker, Bart L.; Phan, Isabelle; Gillespie, Angela; Choi, Ryan; Nakazawa-Hewitt, Steve; Nguyen, Mary Trang; Napuli, Alberto; Barrett, Lynn; Buchko, Garry W.; Stacy, Robin; Myler, Peter J.; Stewart, Lance J.; Manoil, Colin; Van Voorhis, Wesley C.

    2013-01-01

    Background The genus Burkholderia includes pathogenic gram-negative bacteria that cause melioidosis, glanders, and pulmonary infections of patients with cancer and cystic fibrosis. Drug resistance has made development of new antimicrobials critical. Many approaches to discovering new antimicrobials, such as structure-based drug design and whole cell phenotypic screens followed by lead refinement, require high-resolution structures of proteins essential to the parasite. Methodology/Principal Findings We experimentally identified 406 putative essential genes in B. thailandensis, a low-virulence species phylogenetically similar to B. pseudomallei, the causative agent of melioidosis, using saturation-level transposon mutagenesis and next-generation sequencing (Tn-seq). We selected 315 protein products of these genes based on structure-determination criteria, such as excluding very large and/or integral membrane proteins, and entered them into the Seattle Structural Genomics Center for Infectious Disease (SSGCID) structure determination pipeline. To maximize structural coverage of these targets, we applied an “ortholog rescue” strategy for those producing insoluble or difficult to crystallize proteins, resulting in the addition of 387 orthologs (or paralogs) from seven other Burkholderia species into the SSGCID pipeline. This structural genomics approach yielded structures from 31 putative essential targets from B. thailandensis, and 25 orthologs from other Burkholderia species, yielding an overall structural coverage for 49 of the 406 essential gene families, with a total of 88 depositions into the Protein Data Bank. Of these, 25 proteins have properties of a potential antimicrobial drug target, i.e., no close human homolog, part of an essential metabolic pathway, and a deep binding pocket. We describe the structures of several potential drug targets in detail. 
Conclusions/Significance This collection of structures, solubility and experimental essentiality data provides a resource for development of drugs against infections and diseases caused by Burkholderia. All expression clones and proteins created in this study are freely available by request. PMID:23382856

  9. The glassy state of crambin and the THz time scale protein-solvent fluctuations possibly related to protein function

    PubMed Central

    2014-01-01

    Background THz experiments have been used to characterize the picosecond time scale fluctuations taking place in the model globular protein crambin. Results Using both hydration and temperature as experimental parameters, we have identified collective fluctuations (≤ 200 cm⁻¹) in the protein. Observation of the protein dynamics in the THz spectrum from both below and above the glass transition temperature (Tg) has provided unique insight into the microscopic interactions and modes that permit the solvent to effectively couple to the protein thermal fluctuations. Conclusions Our findings suggest that the solvent dynamics on the picosecond time scale not only contribute to protein flexibility but may also delineate the types of fluctuations that are able to form within the protein structure. PMID:25184036

  10. DNA packaging intermediates of bacteriophage ΦX174

    PubMed Central

    Music, Cynthia L; Cheng, R Holland; Bowen, Zorina; McKenna, Robert; Rossmann, Michael G; Baker, Timothy S; Incardona, Nino L

    2014-01-01

    Background Like many viruses, bacteriophage ΦX174 packages its DNA genome into a procapsid that is assembled from structural intermediates and scaffolding proteins. The procapsid contains the structural proteins F, G and H, as well as the scaffolding proteins B and D. Provirions are formed by packaging of DNA together with the small internal J proteins, while losing at least some of the B scaffolding proteins. Eventually, loss of the D scaffolding proteins and the remaining B proteins leads to the formation of mature virions. Results ΦX174 108S 'procapsids' have been purified in milligram quantities by removing 114S (mature virion) and 70S (abortive capsid) particles from crude lysates by differential precipitation with polyethylene glycol. 132S 'provirions' were purified on sucrose gradients in the presence of EDTA. Cryo-electron microscopy (cryo-EM) was used to obtain reconstructions of procapsids and provirions. Although these are very similar to each other, their structures differ greatly from that of the virion. The F and G proteins, whose atomic structures in virions were previously determined from X-ray crystallography, were fitted into the cryo-EM reconstructions. This showed that the pentamer of G proteins on each five-fold vertex changes its conformation only slightly during DNA packaging and maturation, whereas major tertiary and quaternary structural changes occur in the F protein. The procapsids and provirions were found to contain 120 copies of the D protein arranged as tetramers on the twofold axes. DNA might enter procapsids through one of the 30 Å diameter holes on the icosahedral three-fold axes. Conclusions Combining cryo-EM image reconstruction and X-ray crystallography has revealed the major conformational changes that can occur in viral assembly. 
The function of the scaffolding proteins may be, in part, to support weak interactions between the structural proteins in the procapsids and to cover surfaces that are subsequently required for subunit–subunit interaction in the virion. The structures presented here are, therefore, analogous to chaperone proteins complexed with folding intermediates of a substrate. PMID:7613866

  11. Developmental Exposure to A Commercial PBDE Mixture: Effects on Protein Networks in the Cerebellum and Hippocampus of Rats

    EPA Science Inventory

    BACKGROUND: Polybrominated diphenyl ethers (PBDEs) are structurally similar to polychlorinated biphenyls (PCBs) and have both central (learning and memory deficits) and peripheral (motor dysfunction) neurotoxic effects at concentrations/doses similar to those of PCBs. The cellular...

  12. Designing and evaluating the MULTICOM protein local and global model quality prediction methods in the CASP10 experiment

    PubMed Central

    2014-01-01

    Background Protein model quality assessment is an essential component of generating and using protein structural models. During the Tenth Critical Assessment of Techniques for Protein Structure Prediction (CASP10), we developed and tested four automated methods (MULTICOM-REFINE, MULTICOM-CLUSTER, MULTICOM-NOVEL, and MULTICOM-CONSTRUCT) that predicted both local and global quality of protein structural models. Results MULTICOM-REFINE was a clustering approach that used the average pairwise structural similarity between models to measure the global quality and the average Euclidean distance between a model and several top ranked models to measure the local quality. MULTICOM-CLUSTER and MULTICOM-NOVEL were two new support vector machine-based methods of predicting both the local and global quality of a single protein model. MULTICOM-CONSTRUCT was a new weighted pairwise model comparison (clustering) method that used the weighted average similarity between models in a pool to measure the global model quality. Our experiments showed that the pairwise model assessment methods worked better when a large portion of models in the pool were of good quality, whereas single-model quality assessment methods performed better on some hard targets when only a small portion of models in the pool were of reasonable quality. Conclusions Since digging out a few good models from a large pool of low-quality models is a major challenge in protein structure prediction, single model quality assessment methods appear to be poised to make important contributions to protein structure modeling. The other interesting finding was that single-model quality assessment scores could be used to weight the models by the consensus pairwise model comparison method to improve its accuracy. PMID:24731387
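
The consensus (pairwise comparison) idea described in the abstract above, including the option of weighting models by a single-model quality score, can be sketched roughly as follows. The similarity matrix, the weights and the function name are hypothetical; the MULTICOM methods use specific structural similarity measures and procedures that the abstract does not spell out.

```python
def consensus_scores(similarity, weights=None):
    """Global quality of each model as its (weighted) mean similarity to the others.

    similarity: symmetric matrix of pairwise structural similarities in [0, 1]
    weights: optional per-model weights, e.g. single-model quality estimates
    """
    n = len(similarity)
    if weights is None:
        weights = [1.0] * n  # unweighted consensus
    scores = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total_weight = sum(weights[j] for j in others)
        if total_weight == 0:
            scores.append(0.0)  # no reference models to compare against
            continue
        weighted_sum = sum(weights[j] * similarity[i][j] for j in others)
        scores.append(weighted_sum / total_weight)
    return scores


# Made-up pool of three models: models 0 and 1 agree, model 2 is an outlier.
similarity = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.3],
    [0.2, 0.3, 1.0],
]
plain = consensus_scores(similarity)
weighted = consensus_scores(similarity, weights=[0.9, 0.9, 0.1])
```

Down-weighting the outlier raises the scores of the two mutually consistent models, which illustrates why weighting by single-model scores could help when most of the pool is of low quality.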

  13. Buried chloride stereochemistry in the Protein Data Bank

    PubMed Central

    2014-01-01

    Background Although the chloride anion is involved in fundamental biological processes, its interactions with proteins are little known. In particular, we lack a systematic survey of its coordination spheres. Results The analysis of a non-redundant set (pairwise sequence identity?

  14. Bhageerath-H: A homology/ab initio hybrid server for predicting tertiary structures of monomeric soluble proteins

    PubMed Central

    2014-01-01

    Background The advent of the human genome sequencing project has led to a spurt in the number of protein sequences in the databanks. Success of structure based drug discovery severely hinges on the availability of structures. Despite significant progress in the area of experimental protein structure determination, the sequence-structure gap is continually widening. Data driven homology based computational methods have proved successful in predicting tertiary structures for sequences sharing medium to high sequence similarities. With dwindling similarities of query sequences, advanced homology/ab initio hybrid approaches are being explored to solve the structure prediction problem. Here we describe Bhageerath-H, a homology/ab initio hybrid software/server for predicting protein tertiary structures with advancing drug design attempts as one of the goals. Results Bhageerath-H web-server was validated on 75 CASP10 targets which showed TM-scores ≥0.5 in 91% of the cases and Cα RMSDs ≤5Å from the native in 58% of the targets, which is well above the CASP10 water mark. Comparison with some leading servers demonstrated the uniqueness of the hybrid methodology in effectively sampling conformational space, scoring best decoys and refining low resolution models to high and medium resolution. Conclusion Bhageerath-H methodology is web enabled for the scientific community as a freely accessible web server. The methodology is fielded in the on-going CASP11 experiment. PMID:25521245

  15. A global optimization algorithm for protein surface alignment

    PubMed Central

    2010-01-01

    Background A relevant problem in drug design is the comparison and recognition of protein binding sites. Binding site recognition is generally based on geometry, often combined with physico-chemical properties of the site, since the conformation, size and chemical composition of the protein surface are all relevant for the interaction with a specific ligand. Several matching strategies have been designed for the recognition of protein-ligand binding sites and of protein-protein interfaces, but the problem cannot be considered solved. Results In this paper we propose a new method for local structural alignment of protein surfaces based on continuous global optimization techniques. Given the three-dimensional structures of two proteins, the method finds the isometric transformation (rotation plus translation) that best superimposes active regions of two structures. We draw our inspiration from the well-known Iterative Closest Point (ICP) method for three-dimensional (3D) shape registration. Our main contribution is in the adoption of a controlled random search as a more efficient global optimization approach along with a new dissimilarity measure. The reported computational experience and comparison show viability of the proposed approach. Conclusions Our method performs well to detect similarity in binding sites when this in fact exists. In the future we plan to do a more comprehensive evaluation of the method by considering large datasets of non-redundant proteins and applying a clustering technique to the results of all comparisons to classify binding sites. PMID:20920230

  16. A Score of the Ability of a Three-Dimensional Protein Model to Retrieve Its Own Sequence as a Quantitative Measure of Its Quality and Appropriateness

    PubMed Central

    Martínez-Castilla, León P.; Rodríguez-Sotres, Rogelio

    2010-01-01

    Background Despite the remarkable progress of bioinformatics, how the primary structure of a protein leads to a three-dimensional fold, and in turn determines its function, remains an elusive question. Alignments of sequences with known function can be used to identify proteins with the same or similar function with high success. However, identification of function-related and structure-related amino acid positions is only possible after a detailed study of every protein. Folding pattern diversity seems to be much narrower than sequence diversity, and the amino acid sequences of natural proteins have evolved under a selective pressure comprising structural and functional requirements acting in parallel. Principal Findings The approach described in this work begins by generating a large number of amino acid sequences using ROSETTA [Dantas G et al. (2003) J Mol Biol 332:449–460], a program with notable robustness in the assignment of amino acids to a known three-dimensional structure. The resulting sequence sets showed no conservation of amino acids at active sites or protein-protein interfaces. Hidden Markov models built from the resulting sequence sets were used to search sequence databases. Surprisingly, the sequences that the models retrieved from the databases belonged to proteins with the same or a very similar function. Given an appropriate cutoff, the rate of false positives was zero. According to our results, this protocol, here referred to as Rd.HMM, detects fine structural details of the folding patterns that seem to be tightly linked to the fitness of a structural framework for a specific biological function. Conclusion Because the sequence of the native protein used to create the Rd.HMM model was always amongst the top hits, the procedure is a reliable tool to score, very accurately, the quality and appropriateness of computer-modeled 3D structures, without the need for spectroscopy data. However, Rd.HMM is very sensitive to the conformational features of the models' backbone. PMID:20830209

  17. Universal partitioning of the hierarchical fold network of 50-residue segments in proteins

    PubMed Central

    Ito, Jun-ichi; Sonobe, Yuki; Ikeda, Kazuyoshi; Tomii, Kentaro; Higo, Junichi

    2009-01-01

    Background Several studies have demonstrated that protein fold space is structured hierarchically and that power-law statistics hold for the relation between the numbers of protein families and protein folds (or superfamilies). We examined the internal structure and statistics of the fold space of 50-amino-acid-residue segments taken from various protein folds. We used inter-residue contact patterns to measure the tertiary structural similarity among segments. Using this similarity measure, the segments were classified into a number (Kc) of clusters. We examined various Kc values for the clustering. The resolution with which segment tertiary structures are differentiated increases with increasing Kc. Furthermore, we constructed networks by linking structurally similar clusters. Results The network was partitioned persistently into four regions for Kc ≥ 1000. This main partitioning is consistent with results of earlier studies, where similar partitioning was reported in classifying protein domain structures. Furthermore, the network was partitioned naturally into several dozen sub-networks (i.e., communities). That is, clusters within a sub-network were mutually connected by numerous links, whereas clusters in different sub-networks were connected by only a few. For Kc ≥ 1000, there were about 40 major sub-networks, and the contents of the major sub-networks were conserved. This sub-partitioning is a novel finding, suggesting that the network is structured hierarchically: segments construct a cluster, clusters form a sub-network, and sub-networks constitute a region. Additionally, the network was characterized by non-power-law statistics, which is also a novel finding. Conclusion The main findings are: (1) The universe of 50-residue segments found here was characterized by non-power-law statistics. Therefore, the universe differs from those previously reported for protein domains. (2) The 50-residue segments were partitioned persistently and universally into some dozens (ca. 40) of major sub-networks, irrespective of the number of clusters. (3) These major sub-networks encompassed 90% of all segments. Consequently, the protein tertiary structure is constructed from some dozens of elements (sub-networks). PMID:19454039
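    One way to compare segments by their inter-residue contact patterns is sketched below. The abstract does not specify the similarity measure, so the 8 Å Cα cutoff, the minimum sequence separation of three residues, and the Jaccard overlap used here are all assumptions chosen for illustration.

```python
import math


def contact_set(ca_coords, cutoff=8.0, min_separation=3):
    """Index pairs (i, j) whose C-alpha atoms lie within `cutoff` of each other."""
    contacts = set()
    n = len(ca_coords)
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                contacts.add((i, j))
    return contacts


def contact_similarity(seg_a, seg_b, cutoff=8.0):
    """Jaccard overlap of the contact sets of two equal-length segments."""
    a, b = contact_set(seg_a, cutoff), contact_set(seg_b, cutoff)
    union = a | b
    return len(a & b) / len(union) if union else 1.0
```

    A pairwise similarity of this kind could then feed any standard clustering procedure to obtain the Kc clusters, and clusters whose representatives are similar could be linked to form the network.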

  18. Molecular docking.

    PubMed

    Morris, Garrett M; Lim-Wilby, Marguerita

    2008-01-01

    Molecular docking is a key tool in structural molecular biology and computer-assisted drug design. The goal of ligand-protein docking is to predict the predominant binding mode(s) of a ligand with a protein of known three-dimensional structure. Successful docking methods search high-dimensional spaces effectively and use a scoring function that correctly ranks candidate dockings. Docking can be used to perform virtual screening on large libraries of compounds, rank the results, and propose structural hypotheses of how the ligands inhibit the target, which is invaluable in lead optimization. The setting up of the input structures for the docking is just as important as the docking itself, and analyzing the results of stochastic search methods can sometimes be unclear. This chapter discusses the background and theory of molecular docking software, and covers the usage of some of the most-cited docking software.

  19. SeqRate: sequence-based protein folding type classification and rates prediction

    PubMed Central

    2010-01-01

    Background Protein folding rate is an important property of a protein. Predicting protein folding rate is useful for understanding the protein folding process and guiding protein design. Most previous methods of predicting protein folding rate require the tertiary structure of a protein as an input. Moreover, most methods do not distinguish between the different kinetic types (two-state folding or multi-state folding) of proteins. Here we developed a method, SeqRate, to predict both protein folding kinetic type (two-state versus multi-state) and real-value folding rate using sequence length, amino acid composition, contact order, contact number, and secondary structure information predicted from only protein sequence with support vector machines. Results We systematically studied the contributions of individual features to folding rate prediction. On a standard benchmark dataset, the accuracy of folding kinetic type classification is 80%. The Pearson correlation coefficient and the mean absolute difference between predicted and experimental folding rates (s⁻¹) in the base-10 logarithmic scale are 0.81 and 0.79 for two-state protein folders, and 0.80 and 0.68 for three-state protein folders. SeqRate is the first sequence-based method for protein folding type classification, and its accuracy of folding rate prediction is improved over previous sequence-based methods. Its performance can be further enhanced with additional information, such as structure-based geometric contacts, as inputs. Conclusions Both the web server and the software for predicting folding rate are publicly available at http://casp.rnet.missouri.edu/fold_rate/index.html. PMID:20438647
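    The two evaluation figures reported for SeqRate, the Pearson correlation coefficient and the mean absolute difference on a base-10 logarithmic scale, can be computed as in the sketch below. The function names are hypothetical and the code is not taken from the SeqRate software.

```python
import math


def log10_rates(rates_per_second):
    """Convert folding rates (per second) to the base-10 logarithmic scale."""
    return [math.log10(r) for r in rates_per_second]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def mean_abs_diff(xs, ys):
    """Mean absolute difference between paired values."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)
```

    Both statistics would be applied to the log-transformed predicted and experimental rates, so a mean absolute difference of 0.79 corresponds to an average error of a little under one order of magnitude.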

  20. Integrated proteomic and transcriptomic analysis of the Aedes aegypti eggshell

    PubMed Central

    2014-01-01

    Background Mosquito eggshells show remarkable diversity in physical properties and structure consistent with adaptations to the wide variety of environments exploited by these insects. We applied proteomic, transcriptomic, and hybridization in situ techniques to identify gene products and pathways that participate in the assembly of the Aedes aegypti eggshell. Aedes aegypti population density is low during cold and dry seasons and increases immediately after rainfall. The survival of embryos through unfavorable periods is a key factor in the persistence of their populations. The work described here supports integrated vector control approaches that target eggshell formation and result in Ae. aegypti drought-intolerant phenotypes for public health initiatives directed to reduce mosquito-borne diseases. Results A total of 130 proteins were identified from the combined mass spectrometric analyses of eggshell preparations. Conclusions Classification of proteins according to their known and putative functions revealed the complexity of the eggshell structure. Three novel Ae. aegypti vitelline membrane proteins were discovered. Odorant-binding and cysteine-rich proteins that may be structural components of the eggshell were identified. Enzymes with peroxidase, laccase and phenoloxidase activities also were identified, and their likely involvements in cross-linking reactions that stabilize the eggshell structure are discussed. PMID:24707823

  1. Modular architecture of protein structures and allosteric communications: potential implications for signaling proteins and regulatory linkages

    PubMed Central

    del Sol, Antonio; Araúzo-Bravo, Marcos J; Amoros, Dolors; Nussinov, Ruth

    2007-01-01

    Background Allosteric communications are vital for cellular signaling. Here we explore a relationship between protein architectural organization and shortcuts in signaling pathways. Results We show that protein domains consist of modules interconnected by residues that mediate signaling through the shortest pathways. These mediating residues tend to be located at the inter-modular boundaries, which are more rigid and display a larger number of long-range interactions than intra-modular regions. The inter-modular boundaries contain most of the residues centrally conserved in the protein fold, which may be crucial for information transfer between amino acids. Our approach to modular decomposition relies on a representation of protein structures as residue-interacting networks, and removal of the most central residue contacts, which are assumed to be crucial for allosteric communications. The modular decomposition of 100 multi-domain protein structures indicates that modules constitute the building blocks of domains. The analysis of 13 allosteric proteins revealed that modules characterize experimentally identified functional regions. Based on the study of an additional functionally annotated dataset of 115 proteins, we propose that high-modularity modules include functional sites and are the basic functional units. We provide examples (the Gαs subunit and P450 cytochromes) to illustrate that the modular architecture of active sites is linked to their functional specialization. Conclusion Our method decomposes protein structures into modules, allowing the study of signal transmission between functional sites. A modular configuration might be advantageous: it allows signaling proteins to expand their regulatory linkages and may elicit a broader range of control mechanisms either via modular combinations or through modulation of inter-modular linkages. PMID:17531094
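    The idea of decomposing a residue-interaction network by removing its most central contacts can be illustrated with a deliberately crude sketch. It counts how many shortest paths (one breadth-first path per residue pair) cross each contact, removes the most used contact, and reports the resulting connected components. The centrality definition, the single-path simplification, and all names are assumptions of this sketch, not the authors' procedure.

```python
from collections import deque
from itertools import combinations


def shortest_path(adj, src, dst):
    """One breadth-first shortest path from src to dst (empty if unreachable)."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in sorted(adj[node]):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    if dst not in prev:
        return []
    path, node = [], dst
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]


def edge_usage(adj):
    """Count how many of the chosen shortest paths cross each contact."""
    usage = {}
    for a, b in combinations(sorted(adj), 2):
        path = shortest_path(adj, a, b)
        for u, v in zip(path, path[1:]):
            key = (min(u, v), max(u, v))
            usage[key] = usage.get(key, 0) + 1
    return usage


def components(adj):
    """Connected components of the network, as a list of node sets."""
    seen, comps = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in comp:
                comp.add(node)
                stack.extend(adj[node] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def split_once(adj):
    """Remove the most used contact and return it with the resulting modules."""
    usage = edge_usage(adj)
    u, v = max(usage, key=usage.get)
    pruned = {node: set(nbrs) for node, nbrs in adj.items()}
    pruned[u].discard(v)
    pruned[v].discard(u)
    return (u, v), components(pruned)
```

    On a toy network of two tightly connected triangles joined by a single contact, the joining contact carries every inter-triangle path and is removed first, which leaves the two triangles as modules.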

  2. Plasmonic nanostructures for bioanalytical applications of SERS

    NASA Astrophysics Data System (ADS)

    Kahraman, Mehmet; Wachsmann-Hogiu, Sebastian

    2016-03-01

    Surface-enhanced Raman scattering (SERS) is a potential analytical technique for the detection and identification of chemicals and biological molecules and structures in the close vicinity of metallic nanostructures. We present a novel method to fabricate tunable plasmonic nanostructures and perform a comprehensive structural and optical characterization of the structures. Spherical latex particles are uniformly deposited on glass slides and used as templates to obtain nanovoid structures on polydimethylsiloxane surfaces. The diameter and depth of the nanovoids are controlled by the size of the latex particles. The nanovoids are coated with a thin Ag layer for fabrication of uniform plasmonic nanostructures. Structural characterization of the surfaces is performed by scanning electron microscopy (SEM) and atomic force microscopy (AFM). Optical properties of these plasmonic nanostructures are evaluated via UV/Vis spectroscopy and SERS. The sample preparation step is key to obtaining strong and reproducible SERS spectra from biological structures. When a colloidal suspension is used as a SERS substrate for protein detection, the electrostatic interaction of the proteins with the nanoparticles is governed by their charge status, which influences aggregation properties such as the size and shape of the aggregates, which are critical for the SERS experiment. However, when solid SERS substrates are fabricated, the SERS signals of the proteins are background-free and independent of the protein charge. Pros and cons of using plasmonic nanocolloids and nanostructures as SERS substrates will be discussed for label-free detection of proteins using SERS.

  3. MovieMaker: a web server for rapid rendering of protein motions and interactions

    PubMed Central

    Maiti, Rajarshi; Van Domselaar, Gary H.; Wishart, David S.

    2005-01-01

    MovieMaker is a web server that allows short (∼10 s), downloadable movies of protein motions to be generated. It accepts PDB files or PDB accession numbers as input and automatically calculates, renders and merges the necessary image files to create colourful animations covering a wide range of protein motions and other dynamic processes. Users have the option of animating (i) simple rotation, (ii) morphing between two end-state conformers, (iii) short-scale, picosecond vibrations, (iv) ligand docking, (v) protein oligomerization, (vi) mid-scale nanosecond (ensemble) motions and (vii) protein folding/unfolding. MovieMaker does not perform molecular dynamics calculations. Instead it is an animation tool that uses a sophisticated superpositioning algorithm in conjunction with Cartesian coordinate interpolation to rapidly and automatically calculate the intermediate structures needed for many of its animations. Users have extensive control over the rendering style, structure colour, animation quality, background and other image features. MovieMaker is intended to be a general-purpose server that allows both experts and non-experts to easily generate useful, informative protein animations for educational and illustrative purposes. MovieMaker is accessible at . PMID:15980488
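    The Cartesian coordinate interpolation mentioned in this record can be sketched as plain linear interpolation between two end-state conformers. The sketch assumes the two conformers have the same atoms in the same order and have already been superimposed; it does not reproduce MovieMaker's superpositioning algorithm, and the function name is hypothetical.

```python
def interpolate_frames(start, end, n_frames):
    """Linearly interpolate atom coordinates from `start` to `end`.

    Returns `n_frames` coordinate sets, including both end states.
    """
    if len(start) != len(end):
        raise ValueError("conformers must contain the same number of atoms")
    if n_frames < 2:
        raise ValueError("need at least two frames")
    frames = []
    for k in range(n_frames):
        t = k / (n_frames - 1)  # 0.0 at the first frame, 1.0 at the last
        frames.append([
            tuple(a + t * (b - a) for a, b in zip(p, q))
            for p, q in zip(start, end)
        ])
    return frames
```

    Each intermediate frame would then be rendered as one image of the animation. Straight-line interpolation can pass through physically unrealistic geometries, which is consistent with the abstract's point that no molecular dynamics is performed.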

  4. Molecular dynamics simulations of the Nip7 proteins from the marine deep- and shallow-water Pyrococcus species

    PubMed Central

    2014-01-01

    Background The identification of the mechanisms of adaptation of protein structures to extreme environmental conditions is a challenging task of structural biology. We performed molecular dynamics (MD) simulations of the Nip7 protein involved in RNA processing from the shallow-water (P. furiosus) and the deep-water (P. abyssi) marine hyperthermophilic archaea at different temperatures (300 and 373 K) and pressures (0.1, 50 and 100 MPa). The aim was to disclose similarities and differences between the deep- and shallow-sea protein models at different temperatures and pressures. Results The current results demonstrate that the 3D models of the two proteins at all the examined values of pressure and temperature are compact, stable and similar to the known crystal structure of the P. abyssi Nip7. The structural deviations and fluctuations in the polypeptide chain during the MD simulations were most pronounced in the loop regions, their magnitude being larger for the C-terminal domain in both proteins. A number of highly mobile segments of the protein globule presumably involved in protein-protein interactions were identified. Regions of the polypeptide chain with significant differences in conformational dynamics between the deep- and shallow-water proteins were identified. Conclusions The results of our analysis demonstrated that, in the examined ranges of temperature and pressure, an increase in temperature has a stronger effect on the dynamic properties of the protein globule than an increase in pressure. The conformational changes of both the deep- and shallow-sea protein models under increasing temperature and pressure are non-uniform. Our current results indicate that amino acid substitutions between shallow- and deep-water proteins only slightly affect the overall stability of the two proteins. Rather, they may affect the interactions of the Nip7 protein with its protein or RNA partners. PMID:25315147

  5. Structure and Stability of the Spinach Aquaporin SoPIP2;1 in Detergent Micelles and Lipid Membranes

    PubMed Central

    Plasencia, Inés; Survery, Sabeen; Ibragimova, Sania; Hansen, Jesper S.; Kjellbom, Per; Helix-Nielsen, Claus; Johanson, Urban; Mouritsen, Ole G.

    2011-01-01

    Background SoPIP2;1 constitutes one of the major integral proteins in spinach leaf plasma membranes and belongs to the aquaporin family. SoPIP2;1 is a highly permeable and selective water channel that has been successfully overexpressed and purified with high yields. In order to optimize reconstitution of the purified protein into biomimetic systems, we have here for the first time characterized the structural stability of SoPIP2;1. Methodology/Principal Finding We have characterized the protein structural stability after purification and after reconstitution into detergent micelles and proteoliposomes using circular dichroism and fluorescence spectroscopy techniques. The structure of SoPIP2;1 was analyzed either with the protein solubilized with octyl-β-D-glucopyranoside (OG) or reconstituted into lipid membranes formed by E. coli lipids, diphytanoylphosphatidylcholine (DPhPC), or reconstituted into lipid membranes formed from mixtures of 1-palmitoyl-2-oleoyl-phosphatidylcholine (POPC), 1-palmitoyl-2-oleoyl-phosphatidylethanolamine (POPE), 1-palmitoyl-2-oleoyl-phosphatidylserine (POPS), and ergosterol. Generally, SoPIP2;1 secondary structure was found to be predominantly α-helical, in accordance with crystallographic data. The protein has a high thermal structural stability in detergent solutions, with an irreversible thermal unfolding occurring at a melting temperature of 58°C. Incorporation of the protein into lipid membranes increases the structural stability, as evidenced by an increased melting temperature of up to 70°C. Conclusion/Significance The results of this study provide insights into SoPIP2;1 stability in various host membranes and suggest suitable choices of detergent and lipid composition for reconstitution of SoPIP2;1 into biomimetic membranes for biotechnological applications. PMID:21339815

  6. Hekate: Software Suite for the Mass Spectrometric Analysis and Three-Dimensional Visualization of Cross-Linked Protein Samples

    PubMed Central

    2013-01-01

    Chemical cross-linking of proteins combined with mass spectrometry provides an attractive and novel method for the analysis of native protein structures and protein complexes. Analysis of the data, however, is complex. Only a small number of cross-linked peptides are produced during sample preparation, and these must be identified against a background of more abundant native peptides. To facilitate the search and identification of cross-linked peptides, we have developed a novel software suite, named Hekate. Hekate is a suite of tools that addresses the challenges involved in analyzing protein cross-linking experiments when combined with mass spectrometry. The software is an integrated pipeline for the automation of the data analysis workflow and provides a novel scoring system based on principles of linear peptide analysis. In addition, it provides a tool for the visualization of identified cross-links using three-dimensional models, which is particularly useful when combining chemical cross-linking with other structural techniques. Hekate was validated by the comparative analysis of cytochrome c (bovine heart) against previously reported data. Further validation was carried out on known structural elements of DNA polymerase III, the catalytic α-subunit of the Escherichia coli DNA replisome, along with new insight into the previously uncharacterized C-terminal domain of the protein. PMID:24010795

  7. WS-SNPs&GO: a web server for predicting the deleterious effect of human protein variants using functional annotation

    PubMed Central

    2013-01-01

    Background SNPs&GO is a method for the prediction of deleterious Single Amino acid Polymorphisms (SAPs) using protein functional annotation. In this work, we present the web server implementation of SNPs&GO (WS-SNPs&GO). The server is based on Support Vector Machines (SVM) and for a given protein, its input comprises: the sequence and/or its three-dimensional structure (when available), a set of target variations and its functional Gene Ontology (GO) terms. The output of the server provides, for each protein variation, the probabilities to be associated to human diseases. Results The server consists of two main components, including updated versions of the sequence-based SNPs&GO (recently scored as one of the best algorithms for predicting deleterious SAPs) and of the structure-based SNPs&GO3d programs. Sequence and structure based algorithms are extensively tested on a large set of annotated variations extracted from the SwissVar database. Selecting a balanced dataset with more than 38,000 SAPs, the sequence-based approach achieves 81% overall accuracy, 0.61 correlation coefficient and an Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve of 0.88. For the subset of ~6,600 variations mapped on protein structures available at the Protein Data Bank (PDB), the structure-based method scores with 84% overall accuracy, 0.68 correlation coefficient, and 0.91 AUC. When tested on a new blind set of variations, the results of the server are 79% and 83% overall accuracy for the sequence-based and structure-based inputs, respectively. Conclusions WS-SNPs&GO is a valuable tool that includes in a unique framework information derived from protein sequence, structure, evolutionary profile, and protein function. WS-SNPs&GO is freely available at http://snps.biofold.org/snps-and-go. PMID:23819482
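    The three performance figures quoted for the server (overall accuracy, correlation coefficient, and ROC AUC) can be computed from a classifier's outputs as sketched below. The sketch assumes that the "correlation coefficient" is the Matthews correlation coefficient, which is common for binary predictors but is not stated in the abstract; the function names are hypothetical.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of variants classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def roc_auc(pos_scores, neg_scores):
    """AUC as the probability that a positive outscores a negative (ties count half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))
```

    The pairwise AUC computation is quadratic in the number of variants, which is fine for illustration; a rank-based formulation would be preferable for a dataset of tens of thousands of variants.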

  8. Four signature motifs define the first class of structurally related large coiled-coil proteins in plants.

    PubMed Central

    Gindullis, Frank; Rose, Annkatrin; Patel, Shalaka; Meier, Iris

    2002-01-01

    Background Animal and yeast proteins containing long coiled-coil domains are involved in attaching other proteins to the large, solid-state components of the cell. One subgroup of long coiled-coil proteins are the nuclear lamins, which are involved in attaching chromatin to the nuclear envelope and have recently been implicated in inherited human diseases. In contrast to other eukaryotes, long coiled-coil proteins have been barely investigated in plants. Results We have searched the completed Arabidopsis genome and have identified a family of structurally related long coiled-coil proteins. Filament-like plant proteins (FPP) were identified by sequence similarity to a tomato cDNA that encodes a coiled-coil protein which interacts with the nuclear envelope-associated protein, MAF1. The FPP family is defined by four novel unique sequence motifs and by two clusters of long coiled-coil domains separated by a non-coiled-coil linker. All family members are expressed in a variety of Arabidopsis tissues. A homolog sharing the structural features was identified in the monocot rice, indicating conservation among angiosperms. Conclusion Except for myosins, this is the first characterization of a family of long coiled-coil proteins in plants. The tomato homolog of the FPP family binds in a yeast two-hybrid assay to a nuclear envelope-associated protein. This might suggest that FPP family members function in nuclear envelope biology. Because the full Arabidopsis genome does not appear to contain genes for lamins, it is of interest to investigate other long coiled-coil proteins, which might functionally replace lamins in the plant kingdom. PMID:11972898

  9. Insight into the structure of photosynthetic LH2 aggregate from spectroscopy simulations.

    PubMed

    Rancova, Olga; Sulskus, Juozas; Abramavicius, Darius

    2012-07-12

    Using the electrostatic model of intermolecular interactions, we obtain the Frenkel exciton Hamiltonian parameters for the chlorophyll Qy band of the photosynthetic peripheral light-harvesting complex LH2 of the purple bacterium Rhodopseudomonas acidophila from structural data. The intermolecular couplings are mostly determined by the relative positions of the chlorophylls, whereas the molecular transition energies are determined by the background charge distribution of the whole complex. The protonation pattern of titratable residues is used as a tunable parameter. By studying several protonation state scenarios for distinct protein groups and comparing the simulated absorption and circular dichroism spectra to experiment, we determine the most probable configuration of the protonation states of various side groups of the protein.

  10. Patterns of Protein Food Intake Are Associated with Nutrient Adequacy in the General French Adult Population.

    PubMed

    Gavelle, Erwan de; Huneau, Jean-François; Mariotti, François

    2018-02-17

    Protein food intake appears to partially structure dietary patterns, as most current emergent diets (e.g., vegetarian and flexitarian) can be described according to their levels of specific protein sources. However, few data are available on dietary protein patterns in the general population and their association with nutrient adequacy. Based on protein food intake data concerning 1678 adults from a representative French national dietary survey, and non-negative-matrix factorization followed by cluster analysis, we were able to identify distinctive dietary protein patterns and compare their nutrient adequacy (using PANDiet probabilistic scoring). The findings revealed eight patterns that clearly discriminate protein intakes and were characterized by the intakes of one or more specific protein foods: 'Processed meat', 'Poultry', 'Pork', 'Traditional', 'Milk', 'Take-away', 'Beef' and 'Fish'. 'Fish eaters' and 'Milk drinkers' had the highest overall nutrient adequacy, whereas that of 'Pork' and 'Take-away eaters' was the lowest. Nutrient adequacy could often be accounted for by the characteristics of the food contributing to protein intake: 'Meat eaters' had high probability of adequacy for iron and zinc, for example. We concluded that protein patterns constitute strong elements in the background structure of the dietary intake and are associated with the nutrient profile that they convey.

  11. Quantifying the relationship between sequence and three-dimensional structure conservation in RNA

    PubMed Central

    2010-01-01

    Background In recent years, the number of available RNA structures has rapidly grown reflecting the increased interest on RNA biology. Similarly to the studies carried out two decades ago for proteins, which gave the fundamental grounds for developing comparative protein structure prediction methods, we are now able to quantify the relationship between sequence and structure conservation in RNA. Results Here we introduce an all-against-all sequence- and three-dimensional (3D) structure-based comparison of a representative set of RNA structures, which have allowed us to quantitatively confirm that: (i) there is a measurable relationship between sequence and structure conservation that weakens for alignments resulting in below 60% sequence identity, (ii) evolution tends to conserve more RNA structure than sequence, and (iii) there is a twilight zone for RNA homology detection. Discussion The computational analysis here presented quantitatively describes the relationship between sequence and structure for RNA molecules and defines a twilight zone region for detecting RNA homology. Our work could represent the theoretical basis and limitations for future developments in comparative RNA 3D structure prediction. PMID:20550657

  12. Knowledge Discovery in Variant Databases Using Inductive Logic Programming

    PubMed Central

    Nguyen, Hoan; Luu, Tien-Dao; Poch, Olivier; Thompson, Julie D.

    2013-01-01

    Understanding the effects of genetic variation on the phenotype of an individual is a major goal of biomedical research, especially for the development of diagnostics and effective therapeutic solutions. In this work, we describe the use of a recent knowledge discovery from database (KDD) approach using inductive logic programming (ILP) to automatically extract knowledge about human monogenic diseases. We extracted background knowledge from MSV3d, a database of all human missense variants mapped to 3D protein structure. In this study, we identified 8,117 mutations in 805 proteins with known three-dimensional structures that were known to be involved in human monogenic disease. Our results help to improve our understanding of the relationships between structural, functional or evolutionary features and deleterious mutations. Our inferred rules can also be applied to predict the impact of any single amino acid replacement on the function of a protein. The interpretable rules are available at http://decrypthon.igbmc.fr/kd4v/. PMID:23589683

  13. Knowledge discovery in variant databases using inductive logic programming.

    PubMed

    Nguyen, Hoan; Luu, Tien-Dao; Poch, Olivier; Thompson, Julie D

    2013-01-01

    Understanding the effects of genetic variation on the phenotype of an individual is a major goal of biomedical research, especially for the development of diagnostics and effective therapeutic solutions. In this work, we describe the use of a recent knowledge discovery from database (KDD) approach using inductive logic programming (ILP) to automatically extract knowledge about human monogenic diseases. We extracted background knowledge from MSV3d, a database of all human missense variants mapped to 3D protein structure. In this study, we identified 8,117 mutations in 805 proteins with known three-dimensional structures that were known to be involved in human monogenic disease. Our results help to improve our understanding of the relationships between structural, functional or evolutionary features and deleterious mutations. Our inferred rules can also be applied to predict the impact of any single amino acid replacement on the function of a protein. The interpretable rules are available at http://decrypthon.igbmc.fr/kd4v/.

  14. Mining protein loops using a structural alphabet and statistical exceptionality

    PubMed Central

    2010-01-01

    Background Protein loops encompass 50% of protein residues in available three-dimensional structures. These regions are often involved in protein functions, e.g. as binding sites or catalytic pockets. However, the description of protein loops with conventional tools is a difficult task. Regular secondary structures, helices and strands, have been widely studied, whereas loops, because they are highly variable in terms of sequence and structure, are difficult to analyze. Due to data sparsity, long loops have rarely been systematically studied. Results We developed a simple and accurate method that allows the description and analysis of the structures of short and long loops using structural motifs without restriction on loop length. This method is based on the structural alphabet HMM-SA. HMM-SA allows the simplification of a three-dimensional protein structure into a one-dimensional string of states, where each state is a four-residue prototype fragment, called a structural letter. The difficult task of the structural grouping of huge data sets is thus easily accomplished by handling structural letter strings as in conventional protein sequence analysis. We systematically extracted all seven-residue fragments in a bank of 93000 protein loops and grouped them according to the structural-letter sequence, named a structural word. This approach permits a systematic analysis of loops of all sizes, since we consider the structural motifs of seven residues rather than complete loops. We focused the analysis on highly recurrent words of loops (observed more than 30 times). Our study reveals that 73% of loop lengths are covered by only 3310 highly recurrent structural words (out of 28274 observed words). These structural words have low structural variability (mean RMSD of 0.85 Å). As expected, half of these motifs display a flanking-region preference but, interestingly, two thirds are shared by short (less than 12 residues) and long loops. 
Moreover, half of the recurrent motifs exhibit a significant level of amino-acid conservation with at least four significant positions, and 87% of long loops contain at least one such word. We complement our analysis with the detection of statistically over-represented patterns of structural letters as in conventional DNA sequence analysis. About 30% (930) of structural words are over-represented, and cover about 40% of loop lengths. Interestingly, these words exhibit lower structural variability and higher sequential specificity, suggesting structural or functional constraints. Conclusions We developed a method to systematically decompose and study protein loops using recurrent structural motifs. This method is based on the structural alphabet HMM-SA and not on structural alignment and geometrical parameters. We extracted meaningful structural motifs that are found in both short and long loops. To our knowledge, this is the first time that pattern mining has helped to increase the signal-to-noise ratio in protein loops. This finding helps to better describe protein loops and might reduce the complexity of long-loop analysis. Detailed results are available at http://www.mti.univ-paris-diderot.fr/publication/supplementary/2009/ACCLoop/. PMID:20132552
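
    The word-counting step described in this abstract can be illustrated with a minimal sketch. It assumes that consecutive four-residue structural letters overlap by three residues, so that a seven-residue fragment corresponds to a word of four letters; that overlap, the function names, and the toy loop strings are illustrative assumptions and are not taken from the record.

```python
from collections import Counter


def count_structural_words(letter_strings, word_len=4):
    """Count every overlapping word of `word_len` structural letters.

    Assumption: each loop is already encoded as a string of structural
    letters (one character per letter), e.g. by an HMM-SA encoder.
    """
    counts = Counter()
    for letters in letter_strings:
        for start in range(len(letters) - word_len + 1):
            counts[letters[start:start + word_len]] += 1
    return counts


def highly_recurrent(counts, min_count=30):
    """Keep words observed more than `min_count` times."""
    return {word: n for word, n in counts.items() if n > min_count}


# Toy data (hypothetical letter strings, not real HMM-SA output).
toy_loops = ["ABCDE", "ABCDA"]
toy_counts = count_structural_words(toy_loops)
toy_recurrent = highly_recurrent(toy_counts, min_count=1)
```

    Because the counting works on fixed-length words rather than whole loops, loops of any length contribute to the same table, which is the property the abstract relies on for long loops.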

  15. Functional correlation of bacterial LuxS with their quaternary associations: interface analysis of the structure networks

    PubMed Central

    Bhattacharyya, Moitrayee; Vishveshwara, Saraswathi

    2009-01-01

    Background The genome of a wide variety of prokaryotes contains the luxS gene homologue, which encodes the protein S-ribosylhomocysteine lyase (LuxS). This protein is responsible for the production of the quorum sensing molecule AI-2 and has been implicated in a variety of functions such as flagellar motility, metabolic regulation, toxin production and even pathogenicity. A high structural similarity is present in the LuxS structures determined from a few species. In this study, we have modelled the structures from several other species and have investigated their dimer interfaces. We have attempted to correlate the interface features of LuxS with the phenotypic nature of the organisms. Results The protein structure networks (PSN) are constructed and graph theoretical analysis is performed on the structures obtained from X-ray crystallography and on the modelled ones. The interfaces, which are known to contain the active site, are characterized from the PSNs of these homodimeric proteins. The key features presented by the protein interfaces are investigated for the classification of the proteins in relation to their function. From our analysis, structural interface motifs are identified for each class in our dataset, which showed distinctly different patterns at the interface of LuxS for the probiotics and some extremophiles. Our analysis also reveals potential sites of mutation and geometric patterns at the interface that were not evident from conventional sequence alignment studies. Conclusion The structure network approach employed in this study for the analysis of dimeric interfaces in LuxS has brought out certain structural details at the side-chain interaction level, which were elusive from the conventional structure comparison methods. The results from this study provide a better understanding of the relation between the luxS gene and its functional role in the prokaryotes. 
This study also makes it possible to explore the potential direction towards the design of inhibitors of LuxS and thus towards a wide range of antimicrobials. PMID:19243584

  16. Focal switching of photochromic fluorescent proteins enables multiphoton microscopy with superior image contrast.

    PubMed

    Kao, Ya-Ting; Zhu, Xinxin; Xu, Fang; Min, Wei

    2012-08-01

    Probing biological structures and functions deep inside live organisms with light is highly desirable. Among the current optical imaging modalities, multiphoton fluorescence microscopy exhibits the best contrast for imaging scattering samples by employing a spatially confined nonlinear excitation. However, as the incident laser power drops exponentially with imaging depth into the sample due to the scattering loss, the out-of-focus background eventually overwhelms the in-focus signal, which defines a fundamental imaging-depth limit. Herein we significantly improve the image contrast for deep scattering samples by harnessing reversibly switchable fluorescent proteins (RSFPs) which can be cycled between bright and dark states upon light illumination. Two distinct techniques, multiphoton deactivation and imaging (MPDI) and multiphoton activation and imaging (MPAI), are demonstrated on tissue phantoms labeled with Dronpa protein. Such a focal switch approach can generate pseudo background-free images. Conceptually different from wave-based approaches that try to reduce light scattering in turbid samples, our work represents a molecule-based strategy that focused on imaging probes.

  17. Focal switching of photochromic fluorescent proteins enables multiphoton microscopy with superior image contrast

    PubMed Central

    Kao, Ya-Ting; Zhu, Xinxin; Xu, Fang; Min, Wei

    2012-01-01

    Probing biological structures and functions deep inside live organisms with light is highly desirable. Among the current optical imaging modalities, multiphoton fluorescence microscopy exhibits the best contrast for imaging scattering samples by employing a spatially confined nonlinear excitation. However, as the incident laser power drops exponentially with imaging depth into the sample due to the scattering loss, the out-of-focus background eventually overwhelms the in-focus signal, which defines a fundamental imaging-depth limit. Herein we significantly improve the image contrast for deep scattering samples by harnessing reversibly switchable fluorescent proteins (RSFPs) which can be cycled between bright and dark states upon light illumination. Two distinct techniques, multiphoton deactivation and imaging (MPDI) and multiphoton activation and imaging (MPAI), are demonstrated on tissue phantoms labeled with Dronpa protein. Such a focal switch approach can generate pseudo background-free images. Conceptually different from wave-based approaches that try to reduce light scattering in turbid samples, our work represents a molecule-based strategy that focused on imaging probes. PMID:22876358

  18. A method of searching for related literature on protein structure analysis by considering a user's intention

    PubMed Central

    2015-01-01

    Background In recent years, with advances in techniques for protein structure analysis, the knowledge about protein structure and function has been published in a vast number of articles. A method to search for specific publications from such a large pool of articles is needed. In this paper, we propose a method to search for related articles on protein structure analysis by using an article itself as a query. Results Each article is represented as a set of concepts in the proposed method. Then, by using similarities among concepts formulated from databases such as Gene Ontology, similarities between articles are evaluated. In this framework, the desired search results vary depending on the user's search intention because a variety of information is included in a single article. Therefore, the proposed method provides not only one input article (primary article) but also additional articles related to it as an input query to determine the search intention of the user, based on the relationship between two query articles. In other words, based on the concepts contained in the input article and additional articles, we actualize a relevant literature search that considers user intention by varying the degree of attention given to each concept and modifying the concept hierarchy graph. Conclusions We performed an experiment to retrieve relevant papers from articles on protein structure analysis registered in the Protein Data Bank by using three query datasets. The experimental results yielded search results with better accuracy than when user intention was not considered, confirming the effectiveness of the proposed method. PMID:25952498

  19. BIOPS Interactive: An e-Learning Platform Focused on Protein Structure and DNA

    ERIC Educational Resources Information Center

    Pontelli, Enrico; Pinto, Jorge; Qin, Xiaoxiao; He, Jing; Bevan, David; MacCuish, Norah; MacCuish, John; Chapman, Mitch; Moreland, David

    2009-01-01

    One of the difficulties in teaching basic molecular biology concepts to students with little biological background is the lack of hands-on exercises that combine the challenges of the concepts with visualization and immediate feedback. BIOPS Interactive is a web-based interactive learning environment for molecular biology that complements…

  20. Structural modelling and comparative analysis of homologous, analogous and specific proteins from Trypanosoma cruzi versus Homo sapiens: putative drug targets for chagas' disease treatment

    PubMed Central

    2010-01-01

    Background Trypanosoma cruzi is the etiological agent of Chagas' disease, an endemic infection that causes thousands of deaths every year in Latin America. Therapeutic options remain inefficient, demanding the search for new drugs and/or new molecular targets. Such efforts can focus on proteins that are specific to the parasite, but analogous enzymes and enzymes with a three-dimensional (3D) structure sufficiently different from the corresponding host proteins may represent equally interesting targets. In order to find these targets we used the workflows MHOLline and AnEnΠ obtaining 3D models from homologous, analogous and specific proteins of Trypanosoma cruzi versus Homo sapiens. Results We applied genome wide comparative modelling techniques to obtain 3D models for 3,286 predicted proteins of T. cruzi. In combination with comparative genome analysis to Homo sapiens, we were able to identify a subset of 397 enzyme sequences, of which 356 are homologous, 3 analogous and 38 specific to the parasite. Conclusions In this work, we present a set of 397 enzyme models of T. cruzi that can constitute potential structure-based drug targets to be investigated for the development of new strategies to fight Chagas' disease. The strategies presented here support the concept of structural analysis in conjunction with protein functional analysis as an interesting computational methodology to detect potential targets for structure-based rational drug design. For example, 2,4-dienoyl-CoA reductase (EC 1.3.1.34) and triacylglycerol lipase (EC 3.1.1.3), classified as analogous proteins in relation to H. sapiens enzymes, were identified as new potential molecular targets. PMID:21034488

  1. Structural adaptation of extreme halophilic proteins through decrease of conserved hydrophobic contact surface

    PubMed Central

    2011-01-01

    Background Halophiles are extremophilic microorganisms growing optimally at high salt concentrations. There are two strategies used by halophiles to maintain proper osmotic pressure in their cytoplasm: accumulation of molar concentrations of potassium and chloride with extensive adaptation of the intracellular macromolecules ("salt-in" strategy) or biosynthesis and/or accumulation of organic osmotic solutes ("osmolyte" strategy). Our work was aimed at contributing to the understanding of the shared molecular mechanisms of protein haloadaptation through a detailed and systematic comparison of a sample of several three-dimensional structures of halophilic and non-halophilic proteins. Structural differences observed between the "salt-in" and the mesophilic homologous proteins were contrasted to those observed between the "osmolyte" and mesophilic pairs. Results The results suggest that the haloadaptation strategy in the presence of molar salt concentration, but not of osmolytes, necessitates a weakening of the hydrophobic interactions, in particular at the level of conserved hydrophobic contacts. Weakening of these interactions counterbalances their strengthening by the presence of salts in solution and may help the structure prevent aggregation and/or loss of function in hypersaline environments. Conclusions Considering the significant increase of biotechnology applications of halophiles, the understanding of halophilicity can provide the theoretical basis for the engineering of proteins of great interest because they are stable at concentrations of salts that cause the denaturation or aggregation of the majority of macromolecules. PMID:22192175

  2. Protein disorder in the human diseasome: unfoldomics of human genetic diseases

    PubMed Central

    Midic, Uros; Oldfield, Christopher J; Dunker, A Keith; Obradovic, Zoran; Uversky, Vladimir N

    2009-01-01

    Background Intrinsically disordered proteins lack stable structure under physiological conditions, yet carry out many crucial biological functions, especially functions associated with regulation, recognition, signaling and control. Recently, human genetic diseases and related genes were organized into a bipartite graph (Goh KI, Cusick ME, Valle D, Childs B, Vidal M, et al. (2007) The human disease network. Proc Natl Acad Sci U S A 104: 8685–8690). This diseasome network revealed several significant features such as the common genetic origin of many diseases. Methods and findings We analyzed the abundance of intrinsic disorder in these diseasome network proteins by means of several prediction algorithms, and we analyzed the functional repertoires of these proteins based on prior studies relating disorder to function. Our analyses revealed that (i) Intrinsic disorder is common in proteins associated with many human genetic diseases; (ii) Different disease classes vary in the IDP contents of their associated proteins; (iii) Molecular recognition features, which are relatively short loosely structured protein regions within mostly disordered sequences and which gain structure upon binding to partners, are common in the diseasome, and their abundance correlates with the intrinsic disorder level; (iv) Some disease classes have a significant fraction of genes affected by alternative splicing, and the alternatively spliced regions in the corresponding proteins are predicted to be highly disordered; and (v) Correlations were found among the various diseasome graph-related properties and intrinsic disorder. Conclusion These observations provide the basis for the construction of the human-genetic-disease-associated unfoldome. PMID:19594871

  3. SMOQ: a tool for predicting the absolute residue-specific quality of a single protein model with support vector machines

    PubMed Central

    2014-01-01

    Background It is important to predict the quality of a protein structural model before its native structure is known. Methods that can predict the absolute local quality of individual residues in a single protein model are rare, yet particularly needed for using, ranking and refining protein models. Results We developed a machine learning tool (SMOQ) that can predict the distance deviation of each residue in a single protein model. SMOQ uses support vector machines (SVM) with protein sequence and structural features (i.e. basic feature set), including amino acid sequence, secondary structures, solvent accessibilities, and residue-residue contacts to make predictions. We also trained an SVM model with two new additional features (profiles and SOV scores) on 20 CASP8 targets and found that including them can only improve the performance when real deviations between native and model are higher than 5Å. The SMOQ tool finally released uses the basic feature set trained on 85 CASP8 targets. Moreover, SMOQ implemented a way to convert predicted local quality scores into a global quality score. SMOQ was tested on the 84 CASP9 single-domain targets. The average difference between the residue-specific distance deviation predicted by our method and the actual distance deviation on the test data is 2.637Å. The global quality prediction accuracy of the tool is comparable to other good tools on the same benchmark. Conclusion SMOQ is a useful tool for protein single model quality assessment. Its source code and executable are available at: http://sysbio.rnet.missouri.edu/multicom_toolbox/. PMID:24776231
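
    The evaluation figure quoted above (an average difference between predicted and actual per-residue distance deviation) can be read as a mean absolute error. The sketch below shows that calculation under this reading; the interpretation, the function name, and the toy values are assumptions and do not come from the SMOQ source code.

```python
def mean_absolute_difference(predicted, actual):
    """Average absolute difference between per-residue deviations (in Å).

    Assumption: both sequences list one distance deviation per residue,
    in the same residue order.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one residue is required")
    total = sum(abs(p - a) for p, a in zip(predicted, actual))
    return total / len(predicted)


# Toy per-residue deviations in Å (hypothetical values).
toy_predicted = [1.0, 2.0, 4.0]
toy_actual = [1.5, 2.0, 2.0]
toy_error = mean_absolute_difference(toy_predicted, toy_actual)
```

    The record does not state how SMOQ converts local scores into a global score, so that step is deliberately left out of the sketch.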

  4. The Evolutionary History of Protein Domains Viewed by Species Phylogeny

    PubMed Central

    Yang, Song; Bourne, Philip E.

    2009-01-01

    Background Protein structural domains are evolutionary units whose relationships can be detected over long evolutionary distances. The evolutionary history of protein domains, including the origin of protein domains, the identification of domain loss, transfer, duplication and combination with other domains to form new proteins, and the formation of the entire protein domain repertoire, are of great interest. Methodology/Principal Findings A methodology is presented for providing a parsimonious domain history based on gain, loss, vertical and horizontal transfer derived from the complete genomic domain assignments of 1015 organisms across the tree of life. When mapped to species trees the evolutionary history of domains and domain combinations is revealed, and the general evolutionary trend of domain and combination is analyzed. Conclusions/Significance We show that this approach provides a powerful tool to study how new proteins and functions emerged and to study such processes as horizontal gene transfer among more distant species. PMID:20041107

  5. Effects of autoclaving and high pressure on allergenicity of hazelnut proteins

    PubMed Central

    2012-01-01

    Background Hazelnut is reported as a causative agent of allergic reactions. However it is also an edible nut with health benefits. The allergenic characteristics of hazelnut-samples after autoclaving (AC) and high-pressure (HHP) processing have been studied and are also presented here. Previous studies demonstrated that AC treatments were responsible for structural transformation of protein structure motifs. Thus, structural analyses of allergen proteins from hazelnut were carried out to observe what is occurring in relation to the specific-IgE recognition of the related allergenic proteins. The aims of this work are to evaluate the effect of AC and HHP processing on hazelnut in vitro allergenicity using human-sera and to analyse the complexity of hazelnut allergen-protein structures. Methods Hazelnut-samples were subjected to AC and HHP processing. The specific IgE-reactivity was studied in 15 allergic clinic-patients via western blotting analyses. A series of homology-based-bioinformatics 3D-models (Cora 1, Cora 8, Cora 9 and Cora 11) were generated for the antigens included in the study to analyse the complexity of their protein structure. This study is supported by the Declaration of Helsinki and subsequent ethical guidelines. Results A severe reduction in vitro in allergenicity to hazelnut after AC processing was observed in the allergic clinic-patients studied. The specific-IgE binding of some of the described immunoreactive hazelnut protein-bands: Cora 1 ~18KDa, Cora 8 ~9KDa, Cora 9 ~35-40KDa and Cora 11 ~47-48 KDa decreases. Furthermore a relevant glycosylation was assigned and visualized via structural analysis of proteins (3D-modelling) for the first time in the protein-allergen Cora 11 showing a new role which could open a new door for allergenicity-unravellings. Conclusion Hazelnut allergenicity-studies in vivo via Prick-Prick and other means using AC processing are crucial to verify the data we observed via in vitro analyses. 
Glycosylation studies provided us with clues to elucidate, in the near future, mechanisms of the structures that contribute to hazelnut allergenicity, which thus, in turn, help alleviate food allergens. PMID:22616776

  6. Optimal contact definition for reconstruction of Contact Maps

    PubMed Central

    2010-01-01

    Background Contact maps have been extensively used as a simplified representation of protein structures. They capture most important features of a protein's fold, being preferred by a number of researchers for the description and study of protein structures. Inspired by the model's simplicity many groups have dedicated a considerable amount of effort towards contact prediction as a proxy for protein structure prediction. However a contact map's biological interest is subject to the availability of reliable methods for the 3-dimensional reconstruction of the structure. Results We use an implementation of the well-known distance geometry protocol to build realistic protein 3-dimensional models from contact maps, performing an extensive exploration of many of the parameters involved in the reconstruction process. We try to address the questions: a) to what accuracy does a contact map represent its corresponding 3D structure, b) what is the best contact map representation with regard to reconstructability and c) what is the effect of partial or inaccurate contact information on the 3D structure recovery. Our results suggest that contact maps derived from the application of a distance cutoff of 9 to 11Å around the Cβ atoms constitute the most accurate representation of the 3D structure. The reconstruction process does not provide a single solution to the problem but rather an ensemble of conformations that are within 2Å RMSD of the crystal structure and with lower values for the pairwise average ensemble RMSD. Interestingly it is still possible to recover a structure with partial contact information, although wrong contacts can lead to dramatic loss in reconstruction fidelity. Conclusions Thus contact maps represent a valid approximation to the structures with an accuracy comparable to that of experimental methods. 
The optimal contact definitions constitute key guidelines for methods based on contact maps such as structure prediction through contacts and structural alignments based on maximum contact map overlap. PMID:20507547
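
    As an illustration of the contact definition discussed in this abstract, the following minimal sketch builds a binary contact map from Cβ coordinates using a distance cutoff. The 10 Å default is simply a value inside the 9 to 11 Å range reported above; the function name, the exclusion of self-contacts, and the toy coordinates are assumptions made for the example.

```python
import math


def contact_map(cb_coords, cutoff=10.0):
    """Return a symmetric boolean matrix of residue-residue contacts.

    Assumptions: `cb_coords` holds one (x, y, z) Cβ position per residue
    in Å, and two distinct residues are in contact when their Cβ atoms
    lie within `cutoff` Å of each other. Self-contacts are left False.
    """
    n = len(cb_coords)
    cmap = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(cb_coords[i], cb_coords[j]) <= cutoff:
                cmap[i][j] = True
                cmap[j][i] = True
    return cmap


# Toy coordinates in Å (hypothetical, three residues on a line).
toy_coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
toy_map = contact_map(toy_coords)
```

    Going the other way, from a contact map back to three-dimensional coordinates, is the harder reconstruction problem that the record addresses with distance geometry and is not attempted here.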

  7. SECRET domain of variola virus CrmB protein can be a member of poxviral type II chemokine-binding proteins family

    PubMed Central

    2010-01-01

    Background Variola virus (VARV), the causative agent of smallpox, eradicated in 1980, has a wide spectrum of immunomodulatory proteins to evade host immunity. Recently, an additional biological activity was discovered for the VARV CrmB protein, known to bind and inhibit tumour necrosis factor (TNF) through its N-terminal domain homologous to cellular TNF receptors. Besides binding TNF, this protein was also shown to bind with high affinity several chemokines which recruit B- and T-lymphocytes and dendritic cells to sites of viral entry and replication. The ability to bind chemokines was shown to be associated with the unique C-terminal domain of the CrmB protein. This domain, named SECRET (Smallpox virus-Encoded Chemokine Receptor), is unrelated to the host proteins and lacks significant homology with other known viral chemokine-binding proteins or any other known protein. Findings De novo modelling of the VARV-CrmB SECRET domain spatial structure revealed its apparent structural homology with cowpox virus CC-chemokine binding protein (vCCI) and vaccinia virus A41 protein, despite low sequence identity between these three proteins. The potential ligand-binding surface of the modelled VARV-CrmB SECRET domain was also predicted to bear a prominent electronegative charge, which is characteristic of known orthopoxviral chemokine-binding proteins. Conclusions Our results suggest that SECRET should be included in the family of poxviral type II chemokine-binding proteins and that it might have evolved from a vCCI-like predecessor protein. PMID:20979600

  8. Systems biology of the structural proteome.

    PubMed

    Brunk, Elizabeth; Mih, Nathan; Monk, Jonathan; Zhang, Zhen; O'Brien, Edward J; Bliven, Spencer E; Chen, Ke; Chang, Roger L; Bourne, Philip E; Palsson, Bernhard O

    2016-03-11

    The success of genome-scale models (GEMs) can be attributed to the high-quality, bottom-up reconstructions of metabolic, protein synthesis, and transcriptional regulatory networks on an organism-specific basis. Such reconstructions are biochemically, genetically, and genomically structured knowledge bases that can be converted into a mathematical format to enable a myriad of computational biological studies. In recent years, genome-scale reconstructions have been extended to include protein structural information, which has opened up new vistas in systems biology research and empowered applications in structural systems biology and systems pharmacology. Here, we present the generation, application, and dissemination of genome-scale models with protein structures (GEM-PRO) for Escherichia coli and Thermotoga maritima. We show the utility of integrating molecular scale analyses with systems biology approaches by discussing several comparative analyses on the temperature dependence of growth, the distribution of protein fold families, substrate specificity, and characteristic features of whole cell proteomes. Finally, to aid in the grand challenge of big data to knowledge, we provide several explicit tutorials of how protein-related information can be linked to genome-scale models in a public GitHub repository (https://github.com/SBRG/GEMPro/tree/master/GEMPro_recon/). Translating genome-scale, protein-related information to structured data in the format of a GEM provides a direct mapping of gene to gene-product to protein structure to biochemical reaction to network states to phenotypic function. Integration of molecular-level details of individual proteins, such as their physical, chemical, and structural properties, further expands the description of biochemical network-level properties, and can ultimately influence how to model and predict whole cell phenotypes as well as perform comparative systems biology approaches to study differences between organisms. 
GEM-PRO offers insight into the physical embodiment of an organism's genotype, and its use in this comparative framework enables exploration of adaptive strategies for these organisms, opening the door to many new lines of research. With these provided tools, tutorials, and background, the reader will be in a position to run GEM-PRO for their own purposes.

  9. SCPRED: Accurate prediction of protein structural class for sequences of twilight-zone similarity with predicting sequences

    PubMed Central

    Kurgan, Lukasz; Cios, Krzysztof; Chen, Ke

    2008-01-01

    Background Protein structure prediction methods provide accurate results when a homologous protein is predicted, while poorer predictions are obtained in the absence of homologous templates. However, some protein chains that share twilight-zone pairwise identity can form similar folds and thus determining structural similarity without the sequence similarity would be desirable for the structure prediction. The folding type of a protein or its domain is defined as the structural class. Current structural class prediction methods that predict the four structural classes defined in SCOP provide up to 63% accuracy for the datasets in which sequence identity of any pair of sequences belongs to the twilight-zone. We propose SCPRED method that improves prediction accuracy for sequences that share twilight-zone pairwise similarity with sequences used for the prediction. Results SCPRED uses a support vector machine classifier that takes several custom-designed features as its input to predict the structural classes. Based on extensive design that considers over 2300 index-, composition- and physicochemical properties-based features along with features based on the predicted secondary structure and content, the classifier's input includes 8 features based on information extracted from the secondary structure predicted with PSI-PRED and one feature computed from the sequence. Tests performed with datasets of 1673 protein chains, in which any pair of sequences shares twilight-zone similarity, show that SCPRED obtains 80.3% accuracy when predicting the four SCOP-defined structural classes, which is superior when compared with over a dozen recent competing methods that are based on support vector machine, logistic regression, and ensemble of classifiers predictors. Conclusion The SCPRED can accurately find similar structures for sequences that share low identity with sequence used for the prediction. 
The high predictive accuracy achieved by SCPRED is attributed to the design of the features, which are capable of separating the structural classes in spite of their low dimensionality. We also demonstrate that the SCPRED's predictions can be successfully used as a post-processing filter to improve performance of modern fold classification methods. PMID:18452616

  10. Interfacing cellular networks of S. cerevisiae and E. coli: Connecting dynamic and genetic information

    PubMed Central

    2013-01-01

    Background In recent years, various types of cellular networks have penetrated biology and are nowadays used omnipresently for studying eukaryote and prokaryote organisms. Still, the relation and the biological overlap among phenomenological and inferential gene networks, e.g., between the protein interaction network and the gene regulatory network inferred from large-scale transcriptomic data, is largely unexplored. Results We provide in this study an in-depth analysis of the structural, functional and chromosomal relationship between a protein-protein network, a transcriptional regulatory network and an inferred gene regulatory network, for S. cerevisiae and E. coli. Further, we study global and local aspects of these networks and their biological information overlap by comparing, e.g., the functional co-occurrence of Gene Ontology terms by exploiting the available interaction structure among the genes. Conclusions Although the individual networks represent different levels of cellular interactions with global structural and functional dissimilarities, we observe crucial functions of their network interfaces for the assembly of protein complexes, proteolysis, transcription, translation, metabolic and regulatory interactions. Overall, our results shed light on the integrability of these networks and their interfacing biological processes. PMID:23663484

  11. Blocking Protein kinase C signaling pathway: mechanistic insights into the anti-leishmanial activity of prospective herbal drugs from Withania somnifera

    PubMed Central

    2012-01-01

    Background Leishmaniasis is caused by several species of Leishmania protozoa and is one of the major vector-borne diseases after malaria and sleeping sickness. Toxicity of available drugs and the development of drug resistance by the protozoa in recent years have made curing Leishmaniasis difficult and challenging. This creates an urgent need to discover new antileishmanial drug targets and to develop new antileishmanial drugs. Results The tertiary structure of leishmanial protein kinase C was predicted and found stable with an RMSD of 5.8Å during MD simulations. The natural compound withaferin A inhibited the predicted protein at its active site with -28.47 kcal/mol binding free energy. Withanone was also found to inhibit LPKC with a good binding affinity of -22.57 kcal/mol. Both withaferin A and withanone were found stable within the binding pocket of the predicted protein when MD simulations of ligand-bound protein complexes were carried out to examine the consistency of interactions between the two. Conclusions Leishmanial protein kinase C (LPKC) has been identified as a potential target to develop drugs against Leishmaniasis. We modelled and refined the tertiary structure of LPKC using computational methods such as homology modelling and molecular dynamics simulations. This structure of LPKC was used to reveal the mode of inhibition of two previously experimentally reported natural compounds from Withania somnifera - withaferin A and withanone. PMID:23281834

  12. Data processing in neutron protein crystallography using position-sensitive detectors

    NASA Astrophysics Data System (ADS)

    Schoenborn, B. P.

    Neutrons provide a unique probe for localizing hydrogen atoms and for distinguishing hydrogen from deuterons. Hydrogen atoms largely determine the three dimensional structure of proteins and are responsible for many catalytic reactions. The study of hydrogen bonding and hydrogen exchange will therefore give insight into reaction mechanisms and conformational fluctuations. In addition, neutrons provide the ability to distinguish N from C and O and to allow correct orientation of groups such as histidine and glutamine. To take advantage of these unique features of neutron crystallography, one needs accurate Fourier maps depicting atomic structure to a high precision. Special attention is given to subtraction of the high background associated with hydrogen containing molecules, which produces a disproportionately large statistical error.

  13. Accounting for epistatic interactions improves the functional analysis of protein structures.

    PubMed

    Wilkins, Angela D; Venner, Eric; Marciano, David C; Erdin, Serkan; Atri, Benu; Lua, Rhonald C; Lichtarge, Olivier

    2013-11-01

    The constraints under which sequence, structure and function coevolve are not fully understood. Bringing this mutual relationship to light can reveal the molecular basis of binding, catalysis and allostery, thereby identifying function and rationally guiding protein redesign. Underlying these relationships are the epistatic interactions that occur when the consequences of a mutation to a protein are determined by the genetic background in which it occurs. Based on prior data, we hypothesize that epistatic forces operate most strongly between residues nearby in the structure, resulting in smooth evolutionary importance across the structure. We find that when residue scores of evolutionary importance are distributed smoothly between nearby residues, functional site prediction accuracy improves. Accordingly, we designed a novel measure of evolutionary importance that focuses on the interaction between pairs of structurally neighboring residues. This measure that we term pair-interaction Evolutionary Trace yields greater functional site overlap and better structure-based proteome-wide functional predictions. Our data show that the structural smoothness of evolutionary importance is a fundamental feature of the coevolution of sequence, structure and function. Mutations operate on individual residues, but selective pressure depends in part on the extent to which a mutation perturbs interactions with neighboring residues. In practice, this principle led us to redefine the importance of a residue in terms of the importance of its epistatic interactions with neighbors, yielding better annotation of functional residues, motivating experimental validation of a novel functional site in LexA and refining protein function prediction. Contact: lichtarge@bcm.edu. Supplementary data are available at Bioinformatics online.

  14. Accounting for epistatic interactions improves the functional analysis of protein structures

    PubMed Central

    Wilkins, Angela D.; Venner, Eric; Marciano, David C.; Erdin, Serkan; Atri, Benu; Lua, Rhonald C.; Lichtarge, Olivier

    2013-01-01

    Motivation: The constraints under which sequence, structure and function coevolve are not fully understood. Bringing this mutual relationship to light can reveal the molecular basis of binding, catalysis and allostery, thereby identifying function and rationally guiding protein redesign. Underlying these relationships are the epistatic interactions that occur when the consequences of a mutation to a protein are determined by the genetic background in which it occurs. Based on prior data, we hypothesize that epistatic forces operate most strongly between residues nearby in the structure, resulting in smooth evolutionary importance across the structure. Methods and Results: We find that when residue scores of evolutionary importance are distributed smoothly between nearby residues, functional site prediction accuracy improves. Accordingly, we designed a novel measure of evolutionary importance that focuses on the interaction between pairs of structurally neighboring residues. This measure that we term pair-interaction Evolutionary Trace yields greater functional site overlap and better structure-based proteome-wide functional predictions. Conclusions: Our data show that the structural smoothness of evolutionary importance is a fundamental feature of the coevolution of sequence, structure and function. Mutations operate on individual residues, but selective pressure depends in part on the extent to which a mutation perturbs interactions with neighboring residues. In practice, this principle led us to redefine the importance of a residue in terms of the importance of its epistatic interactions with neighbors, yielding better annotation of functional residues, motivating experimental validation of a novel functional site in LexA and refining protein function prediction. Contact: lichtarge@bcm.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24021383

  15. Improving consensus structure by eliminating averaging artifacts

    PubMed Central

    KC, Dukka B

    2009-01-01

    Background Common structural biology methods (i.e., NMR and molecular dynamics) often produce ensembles of molecular structures. Consequently, averaging of 3D coordinates of molecular structures (proteins and RNA) is a frequent approach to obtain a consensus structure that is representative of the ensemble. However, when the structures are averaged, artifacts can result in unrealistic local geometries, including unphysical bond lengths and angles. Results Herein, we describe a method to derive representative structures while limiting the number of artifacts. Our approach is based on a Monte Carlo simulation technique that drives a starting structure (an extended or a 'close-by' structure) towards the 'averaged structure' using a harmonic pseudo energy function. To assess the performance of the algorithm, we applied our approach to Cα models of 1364 proteins generated by the TASSER structure prediction algorithm. The average RMSD of the refined model from the native structure for the set becomes worse by a mere 0.08 Å compared to the average RMSD of the averaged structures from the native structure (3.28 Å for refined structures and 3.36 Å for the averaged structures). However, the percentage of atoms involved in clashes is greatly reduced (from 63% to 1%); in fact, the majority of the refined proteins had zero clashes. Moreover, a small number (38) of refined structures resulted in lower RMSD to the native protein versus the averaged structure. Finally, compared to PULCHRA [1], our approach produces representative structure of similar RMSD quality, but with much fewer clashes. Conclusion The benchmarking results demonstrate that our approach for removing averaging artifacts can be very beneficial for the structural biology community. Furthermore, the same approach can be applied to almost any problem where averaging of 3D coordinates is performed. Namely, structure averaging is also commonly performed in RNA secondary prediction [2], which could also benefit from our approach. PMID:19267905
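
    The averaging artifact described in this record can be illustrated with a small sketch. It is not the authors' Monte Carlo refinement; it only shows, on a deliberately extreme toy ensemble, how atom-by-atom coordinate averaging can collapse a bond. The function names, the toy coordinates and the 3.8 Å spacing are assumptions made for illustration, and the RMSD here is computed without superposition.

```python
import math


def average_structure(ensemble):
    """Average coordinates atom by atom across equal-length models."""
    n_models = len(ensemble)
    n_atoms = len(ensemble[0])
    return [
        tuple(sum(model[i][k] for model in ensemble) / n_models for k in range(3))
        for i in range(n_atoms)
    ]


def bond_lengths(coords):
    """Distances between consecutive atoms of a trace."""
    return [math.dist(a, b) for a, b in zip(coords, coords[1:])]


def rmsd(a, b):
    """RMSD without superposition (assumes the models are already aligned)."""
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))


# Hypothetical three-atom traces with 3.8 A spacing; the last bond points
# in opposite directions in the two models.
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
model_b = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, -3.8, 0.0)]

mean_model = average_structure([model_a, model_b])
print(bond_lengths(model_a))     # both bonds keep the 3.8 A spacing
print(bond_lengths(mean_model))  # the second bond collapses in the average
print(rmsd(model_a, mean_model))
```

    Each input model has physically sensible spacing, yet the averaged trace places the last two atoms on top of each other, which is the kind of unphysical local geometry a refinement step would need to repair.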

  16. MovieMaker: a web server for rapid rendering of protein motions and interactions.

    PubMed

    Maiti, Rajarshi; Van Domselaar, Gary H; Wishart, David S

    2005-07-01

    MovieMaker is a web server that allows short (approximately 10 s), downloadable movies of protein motions to be generated. It accepts PDB files or PDB accession numbers as input and automatically calculates, renders and merges the necessary image files to create colourful animations covering a wide range of protein motions and other dynamic processes. Users have the option of animating (i) simple rotation, (ii) morphing between two end-state conformers, (iii) short-scale, picosecond vibrations, (iv) ligand docking, (v) protein oligomerization, (vi) mid-scale nanosecond (ensemble) motions and (vii) protein folding/unfolding. MovieMaker does not perform molecular dynamics calculations. Instead it is an animation tool that uses a sophisticated superpositioning algorithm in conjunction with Cartesian coordinate interpolation to rapidly and automatically calculate the intermediate structures needed for many of its animations. Users have extensive control over the rendering style, structure colour, animation quality, background and other image features. MovieMaker is intended to be a general-purpose server that allows both experts and non-experts to easily generate useful, informative protein animations for educational and illustrative purposes. MovieMaker is accessible at http://wishart.biology.ualberta.ca/moviemaker.
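
    The Cartesian coordinate interpolation mentioned in this record can be sketched as plain linear interpolation between two end-state conformers. This is a minimal illustration, not MovieMaker's implementation: it assumes the two conformers are already superposed and list the same atoms in the same order, and the function name and toy coordinates are hypothetical.

```python
def interpolate_frames(start, end, n_frames):
    """Return n_frames coordinate sets moving linearly from start to end.

    start and end are lists of (x, y, z) tuples for the same atoms in the
    same order; the first and last frames reproduce the two end states.
    """
    if n_frames < 2:
        raise ValueError("need at least two frames (the two end states)")
    if len(start) != len(end):
        raise ValueError("conformers must contain the same number of atoms")
    frames = []
    for f in range(n_frames):
        t = f / (n_frames - 1)  # interpolation parameter running from 0 to 1
        frames.append([
            tuple(s + t * (e - s) for s, e in zip(p, q))
            for p, q in zip(start, end)
        ])
    return frames


# Hypothetical two-atom end states.
conformer_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
conformer_b = [(0.0, 2.0, 0.0), (1.0, 4.0, 0.0)]
frames = interpolate_frames(conformer_a, conformer_b, 3)
for frame in frames:
    print(frame)
```

    Straight-line interpolation is cheap, which fits the record's emphasis on speed, but intermediate frames are not guaranteed to have realistic bond geometry.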

  17. Unbiased, scalable sampling of protein loop conformations from probabilistic priors

    PubMed Central

    2013-01-01

    Background Protein loops are flexible structures that are intimately tied to function, but understanding loop motion and generating loop conformation ensembles remain significant computational challenges. Discrete search techniques scale poorly to large loops, optimization and molecular dynamics techniques are prone to local minima, and inverse kinematics techniques can only incorporate structural preferences in ad hoc fashion. This paper presents Sub-Loop Inverse Kinematics Monte Carlo (SLIKMC), a new Markov chain Monte Carlo algorithm for generating conformations of closed loops according to experimentally available, heterogeneous structural preferences. Results Our simulation experiments demonstrate that the method computes high-scoring conformations of large loops (>10 residues) orders of magnitude faster than standard Monte Carlo and discrete search techniques. Two new developments contribute to the scalability of the new method. First, structural preferences are specified via a probabilistic graphical model (PGM) that links conformation variables, spatial variables (e.g., atom positions), constraints and prior information in a unified framework. The method uses a sparse PGM that exploits locality of interactions between atoms and residues. Second, a novel method for sampling sub-loops is developed to generate statistically unbiased samples of probability densities restricted by loop-closure constraints. Conclusion Numerical experiments confirm that SLIKMC generates conformation ensembles that are statistically consistent with specified structural preferences. Protein conformations with 100+ residues are sampled on standard PC hardware in seconds. Application to proteins involved in ion-binding demonstrates its potential as a tool for loop ensemble generation and missing structure completion. PMID:24565175

  18. Bound Water at Protein-Protein Interfaces: Partners, Roles and Hydrophobic Bubbles as a Conserved Motif

    PubMed Central

    Ahmed, Mostafa H.; Spyrakis, Francesca; Cozzini, Pietro; Tripathi, Parijat K.; Mozzarelli, Andrea; Scarsdale, J. Neel; Safo, Martin A.; Kellogg, Glen E.

    2011-01-01

    Background There is a great interest in understanding and exploiting protein-protein associations as new routes for treating human disease. However, these associations are difficult to structurally characterize or model although the number of X-ray structures for protein-protein complexes is expanding. One feature of these complexes that has received little attention is the role of water molecules in the interfacial region. Methodology A data set of 4741 water molecules abstracted from 179 high-resolution (≤ 2.30 Å) X-ray crystal structures of protein-protein complexes was analyzed with a suite of modeling tools based on the HINT forcefield and hydrogen-bonding geometry. A metric termed Relevance was used to classify the general roles of the water molecules. Results The water molecules were found to be involved in: a) (bridging) interactions with both proteins (21%), b) favorable interactions with only one protein (53%), and c) no interactions with either protein (26%). This trend is shown to be independent of the crystallographic resolution. Interactions with residue backbones are consistent for all classes and account for 21.5% of all interactions. Interactions with polar residues are significantly more common for the first group and interactions with non-polar residues dominate the last group. Waters interacting with both proteins stabilize on average the proteins' interaction (−0.46 kcal mol⁻¹), but the overall average contribution of a single water to the protein-protein interaction energy is unfavorable (+0.03 kcal mol⁻¹). Analysis of the waters without favorable interactions with either protein suggests that this is a conserved phenomenon: 42% of these waters have SASA ≤ 10 Å² and are thus largely buried, and 69% of these are within predominantly hydrophobic environments or “hydrophobic bubbles”. Such water molecules may have an important biological purpose in mediating protein-protein interactions. PMID:21961043

  19. Recombinant Expression Screening of P. aeruginosa Bacterial Inner Membrane Proteins

    PubMed Central

    2010-01-01

    Background Transmembrane proteins (TM proteins) make up 25% of all proteins and play key roles in many diseases and normal physiological processes. However, much less is known about their structures and molecular mechanisms than for soluble proteins. Problems in expression, solubilization, purification, and crystallization cause bottlenecks in the characterization of TM proteins. This project addressed the need for improved methods for obtaining sufficient amounts of TM proteins for determining their structures and molecular mechanisms. Results Plasmid clones were obtained that encode eighty-seven transmembrane proteins with varying physical characteristics, for example, the number of predicted transmembrane helices, molecular weight, and grand average hydrophobicity (GRAVY). All the target proteins were from P. aeruginosa, a Gram-negative bacterial opportunistic pathogen that causes serious lung infections in people with cystic fibrosis. The relative expression levels of the transmembrane proteins were measured under several culture growth conditions. The use of E. coli strains, a T7 promoter, and a 6-histidine C-terminal affinity tag resulted in the expression of 61 out of 87 test proteins (70%). In this study, proteins with a higher grand average hydrophobicity and more transmembrane helices were expressed less well than less hydrophobic proteins with fewer transmembrane helices. Conclusions In this study, factors related to overall hydrophobicity and the number of predicted transmembrane helices correlated with the relative expression levels of the target proteins. Identifying physical characteristics that correlate with protein expression might aid in selecting the "low hanging fruit", or proteins that can be expressed to sufficient levels using an E. coli expression system. The use of other expression strategies or host species might be needed for sufficient levels of expression of transmembrane proteins with other physical characteristics. Surveys like this one could aid in overcoming the technical bottlenecks in working with TM proteins and could potentially aid in increasing the rate of structure determination. PMID:21114855
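
    The GRAVY score used in this record to rank targets can be sketched as follows. The abstract does not say which hydropathy scale was used; this sketch assumes the conventional definition, the mean Kyte-Doolittle hydropathy over all residues, and the function name and example sequences are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids
# (assumed scale; the record itself does not name one).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: mean per-residue hydropathy value."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    unknown = set(sequence) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


# Toy sequences: a hydrophobic, helix-like stretch versus a charged one.
print(gravy("LLIVALLAVIL"))
print(gravy("KDEKRDEEKRK"))
```

    On this scale a more positive score means a more hydrophobic protein, which is the direction the record associates with poorer expression in E. coli.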

  20. Computer analysis of protein functional sites projection on exon structure of genes in Metazoa

    PubMed Central

    2015-01-01

    Background Study of the relationship between the structural and functional organization of proteins and their coding genes is necessary for an understanding of the evolution of molecular systems and can provide new knowledge for many applications for designing proteins with improved medical and biological properties. It is well known that the functional properties of proteins are determined by their functional sites. Functional sites are usually represented by a small number of amino acid residues that are distantly located from each other in the amino acid sequence. They are highly conserved within their functional group and vary significantly in structure between such groups. Given these facts, analysis of the general properties of the structural organization of functional sites, both at the protein level and at the level of the exon-intron structure of the coding gene, remains a relevant problem. Results One approach to this analysis is the projection of amino acid residue positions of the functional sites along with the exon boundaries to the gene structure. In this paper, we examined the discontinuity of the functional sites in the exon-intron structure of genes and the distribution of lengths and phases of the functional site encoding exons in vertebrate genes. We have shown that the DNA fragments coding the functional sites were in the same exons, or in close exons. We observed a tendency for the exons that code functional sites to cluster, and such clusters could be considered as the unit of protein evolution. We studied the characteristics of the structure of the exon boundaries that code, and do not code, functional sites in 11 Metazoa species. This is accompanied by a reduced frequency of intercodon gaps (phase 0) in exons encoding the amino acid residues of functional sites, which may be evidence of the existence of evolutionary limitations to the exon shuffling. Conclusions These results characterize the features of the coding exon-intron structure that affect the functionality of the encoded protein and allow a better understanding of the emergence of biological diversity. PMID:26693737

  1. 3D Protein structure prediction with genetic tabu search algorithm

    PubMed Central

    2010-01-01

    Background Protein structure prediction (PSP) has important applications in different fields, such as drug design, disease prediction, and so on. In protein structure prediction, there are two important issues. The first is the design of the structure model and the second is the design of the optimization technology. Because of the complexity of realistic protein structures, the structure model adopted in this paper is a simplified model, called the off-lattice AB model. Once the structure model is chosen, optimization technology is needed to search for the best conformation of a protein sequence under that model. However, PSP is an NP-hard problem even if the simplest model is assumed. Thus, many algorithms have been developed to solve the global optimization problem. In this paper, a hybrid algorithm, which combines a genetic algorithm (GA) and a tabu search (TS) algorithm, is developed to complete this task. Results In order to develop an efficient optimization algorithm, several improved strategies are developed for the proposed genetic tabu search algorithm. The combined use of these strategies can improve the efficiency of the algorithm. In these strategies, tabu search introduced into the crossover and mutation operators can improve the local search capability, the adoption of a variable population size strategy can maintain the diversity of the population, and the ranking selection strategy can improve the possibility of an individual with a low energy value entering the next generation. Experiments are performed with Fibonacci sequences and real protein sequences. Experimental results show that the lowest energy obtained by the proposed GATS algorithm is lower than that obtained by previous methods. Conclusions The hybrid algorithm has the advantages of both the genetic algorithm and the tabu search algorithm. It makes use of the advantage of multiple search points in the genetic algorithm, and can overcome the poor hill-climbing capability of the conventional genetic algorithm by using the flexible memory functions of TS. Compared with some previous algorithms, the GATS algorithm has better performance in global optimization and can predict 3D protein structure more effectively. PMID:20522256

  2. Sequence/structural analysis of xylem proteome emphasizes pathogenesis-related proteins, chitinases and β-1, 3-glucanases as key players in grapevine defense against Xylella fastidiosa.

    PubMed

    Chakraborty, Sandeep; Nascimento, Rafael; Zaini, Paulo A; Gouran, Hossein; Rao, Basuthkar J; Goulart, Luiz R; Dandekar, Abhaya M

    2016-01-01

    Background. Xylella fastidiosa, the causative agent of various plant diseases including Pierce's disease in the US, and Citrus Variegated Chlorosis in Brazil, remains a continual source of concern and economic losses, especially since almost all commercial varieties are sensitive to this Gammaproteobacteria. Differential expression of proteins in infected tissue is an established methodology to identify key elements involved in plant defense pathways. Methods. In the current work, we developed a methodology named CHURNER that emphasizes relevant protein functions from proteomic data, based on identification of proteins with similar structures that do not necessarily have sequence homology. Such clustering emphasizes protein functions which have multiple copies that are up/down-regulated, and highlights similar proteins which are differentially regulated. As a working example we present proteomic data enumerating differentially expressed proteins in xylem sap from grapevines that were infected with X. fastidiosa. Results. Analysis of this data by CHURNER highlighted pathogenesis related PR-1 proteins, reinforcing this as the foremost protein function in xylem sap involved in the grapevine defense response to X. fastidiosa. β-1, 3-glucanase, which has both anti-microbial and anti-fungal activities, is also up-regulated. Simultaneously, chitinases are found to be both up and down-regulated by CHURNER, and thus the net gain of this protein function loses its significance in the defense response. Discussion. We demonstrate how structural data can be incorporated in the pipeline of proteomic data analysis prior to making inferences on the importance of individual proteins to plant defense mechanisms. We expect CHURNER to be applicable to any proteomic data set.

  3. Formulation and in vitro characterization of protein-loaded liposomes

    NASA Astrophysics Data System (ADS)

    Kuzimski, Lauren

    Background/Objective: Protein-based drugs are increasingly used to treat a variety of conditions including cancer and cardio-vascular disease. Due to the immune system's innate ability to degrade the foreign particles quickly, protein-based treatments are generally short-lived. To address this limitation, the objectives of the study were to: 1) develop protein-loaded liposomes; 2) characterize size, stability, encapsulation efficiency and rate of protein release; 3) determine intracellular uptake and distribution; and 4) assess protein structural changes. Method: Liposomes were loaded with a fluorescent-albumin using freeze-thaw (F/T) methodology. Albumin encapsulation and release were quantified by fluorescence spectroscopic techniques. Flow cytometry was used to determine liposome uptake by macrophages. Epifluorescence microscopy was used to determine cellular distribution of liposomes. Stability was determined using dynamic light scattering by measuring liposome size over a one-month period. Protein structure was determined using circular dichroism (CD). Result: Encapsulation of albumin in liposomes was ~90% and was dependent on F/T rates, with fifteen cycles yielding the highest encapsulation efficiency (p < 0.05). Albumin-loaded liposomes demonstrated consistent size (<300 nm). Release of encapsulated albumin in physiological buffer at 25°C was ~60% in 72 h. Fluorescence imaging suggested an endosomal route of cellular entry for the FITC-albumin liposome with maximum uptake rates in immune cells (30% at 2-hour incubation). CD suggested protein structure is minimally impacted by freeze-thaw methodology. Conclusion: Using F/T as a loading method, we were able to successfully achieve a protein-loaded liposome that was under 300 nm and had an encapsulation of ~90%. Synthesized liposomes demonstrated a burst release of encapsulated protein (60%) at 72 hours. Cellular trafficking confirmed endosomal uptake, and minimal protein damage was noticed in CD.

  4. Rational Design of Protein Stability: Effect of (2S,4R)-4-Fluoroproline on the Stability and Folding Pathway of Ubiquitin

    PubMed Central

    Crespo, Maria D.; Rubini, Marina

    2011-01-01

    Background Many strategies have been employed to increase the conformational stability of proteins. The use of 4-substituted proline analogs capable of inducing pre-organization in target proteins is an attractive tool to deliver additional conformational stability without perturbing the overall protein structure. Both peptides and proteins containing 4-fluorinated proline derivatives can be stabilized by forcing the pyrrolidine ring into its favored puckering conformation. The fluorinated pyrrolidine rings of proline can preferentially stabilize either a Cγ-exo or a Cγ-endo ring pucker, depending on proline chirality (4R/4S), in a complex protein structure. To examine whether this rational strategy can be generally used for protein stabilization, we have chosen human ubiquitin as a model protein which contains three proline residues displaying Cγ-exo puckering. Methodology/Principal Findings While (2S,4R)-4-fluoroproline ((4R)-FPro) containing ubiquitin can be expressed in a related auxotrophic Escherichia coli strain, all attempts to incorporate (2S,4S)-4-fluoroproline ((4S)-FPro) failed. Our results indicate that (4R)-FPro favors the Cγ-exo conformation present in the wild type structure and stabilizes the protein structure due to a pre-organization effect. This was confirmed by thermal and guanidinium chloride-induced denaturation profile analyses, where we observed an increase in stability of −4.71 kJ·mol⁻¹ in the case of (4R)-FPro containing ubiquitin ((4R)-FPro-ub) compared to wild type ubiquitin (wt-ub). As expected, activity assays revealed that (4R)-FPro-ub retained the full biological activity compared to wt-ub. Conclusions/Significance The results fully confirm the general applicability of incorporating fluoroproline derivatives for improving protein stability. In general, a rational design strategy that enforces the naturally occurring proline puckering conformation can be used to stabilize the desired target protein. PMID:21625626

  5. Gene Composer: database software for protein construct design, codon engineering, and gene synthesis

    PubMed Central

    Lorimer, Don; Raymond, Amy; Walchli, John; Mixon, Mark; Barrow, Adrienne; Wallace, Ellen; Grice, Rena; Burgin, Alex; Stewart, Lance

    2009-01-01

    Background To improve efficiency in high throughput protein structure determination, we have developed a database software package, Gene Composer, which facilitates the information-rich design of protein constructs and their codon engineered synthetic gene sequences. With its modular workflow design and numerous graphical user interfaces, Gene Composer enables researchers to perform all common bio-informatics steps used in modern structure guided protein engineering and synthetic gene engineering. Results An interactive Alignment Viewer allows the researcher to simultaneously visualize sequence conservation in the context of known protein secondary structure, ligand contacts, water contacts, crystal contacts, B-factors, solvent accessible area, residue property type and several other useful property views. The Construct Design Module enables the facile design of novel protein constructs with altered N- and C-termini, internal insertions or deletions, point mutations, and desired affinity tags. The modifications can be combined and permuted into multiple protein constructs, and then virtually cloned in silico into defined expression vectors. The Gene Design Module uses a protein-to-gene algorithm that automates the back-translation of a protein amino acid sequence into a codon engineered nucleic acid gene sequence according to a selected codon usage table with minimal codon usage threshold, defined G:C% content, and desired sequence features achieved through synonymous codon selection that is optimized for the intended expression system. The gene-to-oligo algorithm of the Gene Design Module plans out all of the required overlapping oligonucleotides and mutagenic primers needed to synthesize the desired gene constructs by PCR, and for physically cloning them into selected vectors by the most popular subcloning strategies. Conclusion We present a complete description of Gene Composer functionality, and an efficient PCR-based synthetic gene assembly procedure with mis-match specific endonuclease error correction in combination with PIPE cloning. In a sister manuscript we present data on how Gene Composer designed genes and protein constructs can result in improved protein production for structural studies. PMID:19383142

  6. Triangle network motifs predict complexes by complementing high-error interactomes with structural information

    PubMed Central

    Andreopoulos, Bill; Winter, Christof; Labudde, Dirk; Schroeder, Michael

    2009-01-01

    Background A lot of high-throughput studies produce protein-protein interaction networks (PPINs) with many errors and missing information. Even for genome-wide approaches, there is often a low overlap between PPINs produced by different studies. Second-level neighbors separated by two protein-protein interactions (PPIs) were previously used for predicting protein function and finding complexes in high-error PPINs. We retrieve second level neighbors in PPINs, and complement these with structural domain-domain interactions (SDDIs) representing binding evidence on proteins, forming PPI-SDDI-PPI triangles. Results We find low overlap between PPINs, SDDIs and known complexes, all well below 10%. We evaluate the overlap of PPI-SDDI-PPI triangles with known complexes from Munich Information center for Protein Sequences (MIPS). PPI-SDDI-PPI triangles have ~20 times higher overlap with MIPS complexes than using second-level neighbors in PPINs without SDDIs. The biological interpretation for triangles is that a SDDI causes two proteins to be observed with common interaction partners in high-throughput experiments. The relatively few SDDIs overlapping with PPINs are part of highly connected SDDI components, and are more likely to be detected in experimental studies. We demonstrate the utility of PPI-SDDI-PPI triangles by reconstructing myosin-actin processes in the nucleus, cytoplasm, and cytoskeleton, which were not obvious in the original PPIN. Using other complementary datatypes in place of SDDIs to form triangles, such as PubMed co-occurrences or threading information, results in a similar ability to find protein complexes. Conclusion Given high-error PPINs with missing information, triangles of mixed datatypes are a promising direction for finding protein complexes. Integrating PPINs with SDDIs improves finding complexes. Structural SDDIs partially explain the high functional similarity of second-level neighbors in PPINs. We estimate that relatively little structural information would be sufficient for finding complexes involving most of the proteins and interactions in a typical PPIN. PMID:19558694
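
    One way to read the PPI-SDDI-PPI triangle construction in this record is: take two proteins that share an interaction partner in the PPIN (second-level neighbors) and keep the pair only if a structural domain-domain interaction also links them. The sketch below follows that reading; it is an assumption about the construction rather than the authors' code, and the function name and protein identifiers are hypothetical.

```python
from itertools import combinations


def ppi_sddi_ppi_triangles(ppis, sddis):
    """Find (a, b, hub) triples where a-hub and b-hub are PPIs and a-b is an SDDI."""
    # Adjacency sets for the undirected PPI network.
    neighbors = {}
    for a, b in ppis:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Unordered SDDI pairs for fast membership tests.
    sddi_pairs = {frozenset(pair) for pair in sddis}

    triangles = set()
    for hub, partners in neighbors.items():
        # Any two partners of the same hub are second-level neighbors.
        for a, b in combinations(sorted(partners), 2):
            if frozenset((a, b)) in sddi_pairs:
                triangles.add((a, b, hub))
    return triangles


# Hypothetical toy data.
ppis = [("P1", "P3"), ("P2", "P3"), ("P2", "P4")]
sddis = [("P1", "P2"), ("P3", "P4"), ("P1", "P4")]
triangles = ppi_sddi_ppi_triangles(ppis, sddis)
print(sorted(triangles))
```

    In the toy data, the SDDI between P1 and P4 is discarded because the two proteins share no interaction partner, while the other two SDDIs each close a triangle.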

  7. Understanding Marine Mussel Adhesion

    PubMed Central

    Roberto, Francisco F.

    2007-01-01

    In addition to identifying the proteins that have a role in underwater adhesion by marine mussels, research efforts have focused on identifying the genes responsible for the adhesive proteins, environmental factors that may influence protein production, and strategies for producing natural adhesives similar to the native mussel adhesive proteins. The production-scale availability of recombinant mussel adhesive proteins will enable researchers to formulate adhesives that are water-impervious and ecologically safe and can bind materials ranging from glass, plastics, metals, and wood to materials, such as bone or teeth, biological organisms, and other chemicals or molecules. Unfortunately, as of yet scientists have been unable to duplicate the processes that marine mussels use to create adhesive structures. This study provides a background on adhesive proteins identified in the blue mussel, Mytilus edulis, and introduces our research interests and discusses the future for continued research related to mussel adhesion. PMID:17990038

  8. A generalized analysis of hydrophobic and loop clusters within globular protein sequences

    PubMed Central

    Eudes, Richard; Le Tuan, Khanh; Delettré, Jean; Mornon, Jean-Paul; Callebaut, Isabelle

    2007-01-01

    Background Hydrophobic Cluster Analysis (HCA) is an efficient way to compare highly divergent sequences through the implicit secondary structure information directly derived from hydrophobic clusters. However, its efficiency and application are currently limited by the need of user expertise. In order to help the analysis of HCA plots, we report here the structural preferences of hydrophobic cluster species, which are frequently encountered in globular domains of proteins. These species are characterized only by their hydrophobic/non-hydrophobic dichotomy. This analysis has been extended to loop-forming clusters, using an appropriate loop alphabet. Results The structural behavior of hydrophobic cluster species, which are typical of protein globular domains, was investigated within banks of experimental structures, considered at different levels of sequence redundancy. The 294 more frequent hydrophobic cluster species were analyzed with regard to their association with the different secondary structures (frequencies of association with secondary structures and secondary structure propensities). Hydrophobic cluster species are predominantly associated with regular secondary structures, and a large part (60 %) reveals preferences for α-helices or β-strands. Moreover, the analysis of the hydrophobic cluster amino acid composition generally allows for finer prediction of the regular secondary structure associated with the considered cluster within a cluster species. We also investigated the behavior of loop forming clusters, using a "PGDNS" alphabet. These loop clusters do not overlap with hydrophobic clusters and are highly associated with coils. Finally, the structural information contained in the hydrophobic structural words, as deduced from experimental structures, was compared to the PSI-PRED predictions, revealing that β-strands and especially α-helices are generally over-predicted within the limits of typical β and α hydrophobic clusters. 
Conclusion The dictionary of hydrophobic clusters described here can help the HCA user to interpret and compare the HCA plots of globular protein sequences, as well as provides an original fundamental insight into the structural bricks of protein folds. Moreover, the novel loop cluster analysis brings additional information for secondary structure prediction on the whole sequence through a generalized cluster analysis (GCA), and not only on regular secondary structures. Such information lays the foundations for developing a new and original tool for secondary structure prediction. PMID:17210072
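
    The hydrophobic/non-hydrophobic dichotomy mentioned in this abstract can be sketched as a binary encoding of the sequence. The sketch below assumes a commonly described HCA convention (hydrophobic alphabet V, I, L, F, M, Y, W; clusters broken by a proline or by a run of at least four non-hydrophobic residues); these rules and the function name are assumptions for illustration, not details stated in the abstract.

```python
# Assumed hydrophobic alphabet; the abstract does not list it.
HYDROPHOBIC = set("VILFMYW")


def hydrophobic_clusters(sequence, min_gap=4):
    """Return (start_index, binary_word) for each hydrophobic cluster.

    A cluster runs from its first to its last hydrophobic residue.
    It is closed by a proline or by `min_gap` or more consecutive
    non-hydrophobic residues (both rules are assumptions).
    """
    seq = sequence.upper()
    spans = []
    start = None  # first hydrophobic residue of the open cluster
    last = None   # most recent hydrophobic residue seen
    for i, aa in enumerate(seq):
        if aa in HYDROPHOBIC:
            # Close the open cluster if the gap since `last` is too long.
            if start is not None and i - last - 1 >= min_gap:
                spans.append((start, last))
                start = None
            if start is None:
                start = i
            last = i
        elif aa == "P" and start is not None:
            # Proline acts as a cluster breaker.
            spans.append((start, last))
            start = None
    if start is not None:
        spans.append((start, last))

    # Encode each cluster as 1 (hydrophobic) / 0 (other).
    return [
        (s, "".join("1" if aa in HYDROPHOBIC else "0" for aa in seq[s:e + 1]))
        for s, e in spans
    ]


print(hydrophobic_clusters("AVLKDIPGSSSSWEEF"))
```

    Each binary word identifies a cluster species in the sense used above; the amino acid composition behind a word is deliberately discarded by this encoding.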

  9. Proteomic Mapping of Dental Enamel Matrix from Inbred Mouse Strains: Unraveling Potential New Players in Enamel.

    PubMed

    Lima Leite, Aline; Silva Fernandes, Mileni; Charone, Senda; Whitford, Gary Milton; Everett, Eric T; Buzalaf, Marília Afonso Rabelo

    2018-01-01

    Enamel formation is a complex 2-step process by which proteins are secreted to form an extracellular matrix, followed by massive protein degradation and subsequent mineralization. Excessive systemic exposure to fluoride can disrupt this process and lead to a condition known as dental fluorosis. The genetic background influences the responses of mineralized tissues to fluoride, such as dental fluorosis, observed in A/J and 129P3/J mice. The aim of the present study was to map the protein profile of enamel matrix from A/J and 129P3/J strains. Enamel matrix samples were obtained from A/J and 129P3/J mice and analyzed by 2-dimensional electrophoresis and liquid chromatography coupled with mass spectrometry. A total of 120 proteins were identified, and 7 of them were classified as putative uncharacterized proteins and analyzed in silico for structural and functional characterization. An interesting finding was the possibility of the uncharacterized sequence Q8BIS2 being an enzyme involved in the degradation of matrix proteins. Thus, the results provide a comprehensive view of the structure and function for putative uncharacterized proteins found in the enamel matrix that could help to elucidate the mechanisms involved in enamel biomineralization and genetic susceptibility to dental fluorosis. © 2018 S. Karger AG, Basel.

  10. Chicken genome analysis reveals novel genes encoding biotin-binding proteins related to avidin family

    PubMed Central

    Niskanen, Einari A; Hytönen, Vesa P; Grapputo, Alessandro; Nordlund, Henri R; Kulomaa, Markku S; Laitinen, Olli H

    2005-01-01

    Background A chicken egg contains several biotin-binding proteins (BBPs), whose complete DNA and amino acid sequences are not known. In order to identify and characterise these genes and proteins we studied chicken cDNAs and genes available in the NCBI database and chicken genome database using the reported N-terminal amino acid sequences of chicken egg-yolk BBPs as search strings. Results Two separate hits showing significant homology for these N-terminal sequences were discovered. For one of these hits, the chromosomal location in the immediate proximity of the avidin gene family was found. Both of these hits encode proteins having high sequence similarity with avidin suggesting that chicken BBPs are paralogous to avidin family. In particular, almost all residues corresponding to biotin binding in avidin are conserved in these putative BBP proteins. One of the found DNA sequences, however, seems to encode a carboxy-terminal extension not present in avidin. Conclusion We describe here the predicted properties of the putative BBP genes and proteins. Our present observations link BBP genes together with avidin gene family and shed more light on the genetic arrangement and variability of this family. In addition, comparative modelling revealed the potential structural elements important for the functional and structural properties of the putative BBP proteins. PMID:15777476

  11. Predicting β-turns and their types using predicted backbone dihedral angles and secondary structures

    PubMed Central

    2010-01-01

    Background β-turns are secondary structure elements usually classified as coil. Their prediction is important, because of their role in protein folding and their frequent occurrence in protein chains. Results We have developed a novel method that predicts β-turns and their types using information from multiple sequence alignments, predicted secondary structures and, for the first time, predicted dihedral angles. Our method uses support vector machines, a supervised classification technique, and is trained and tested on three established datasets of 426, 547 and 823 protein chains. We achieve a Matthews correlation coefficient of up to 0.49, when predicting the location of β-turns, the highest reported value to date. Moreover, the additional dihedral information improves the prediction of β-turn types I, II, IV, VIII and "non-specific", achieving correlation coefficients up to 0.39, 0.33, 0.27, 0.14 and 0.38, respectively. Our results are more accurate than other methods. Conclusions We have created an accurate predictor of β-turns and their types. Our method, called DEBT, is available online at http://comp.chem.nottingham.ac.uk/debt/. PMID:20673368
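
    The Matthews correlation coefficient quoted in this abstract is computed from the confusion matrix of a binary prediction, MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The sketch below shows the calculation on made-up per-residue labels (1 = β-turn, 0 = not a β-turn); the labels are illustrative and unrelated to the DEBT results.

```python
import math


def confusion_counts(true_labels, predicted_labels):
    """Count true/false positives and negatives for binary labels."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth and pred:
            tp += 1
        elif truth and not pred:
            fn += 1
        elif pred:
            fp += 1
        else:
            tn += 1
    return tp, tn, fp, fn


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Made-up per-residue labels, for illustration only.
truth = [1, 1, 0, 0, 0, 1, 0, 0]
pred = [1, 0, 0, 0, 1, 1, 0, 0]
print(matthews_corrcoef(*confusion_counts(truth, pred)))
```

    Because β-turn residues are a minority class, MCC is more informative here than raw accuracy: it is 1 for perfect prediction, 0 for chance-level prediction and -1 for complete disagreement.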

  12. Domain motions of Argonaute, the catalytic engine of RNA interference

    PubMed Central

    Ming, Dengming; Wall, Michael E; Sanbonmatsu, Kevin Y

    2007-01-01

    Background The Argonaute protein is the core component of the RNA-induced silencing complex, playing the central role of cleaving the mRNA target. Visual inspection of static crystal structures already has enabled researchers to suggest conformational changes of Argonaute that might occur during RNA interference. We have taken the next step by performing an all-atom normal mode analysis of the Pyrococcus furiosus and Aquifex aeolicus Argonaute crystal structures, allowing us to quantitatively assess the feasibility of these conformational changes. To perform the analysis, we begin with the energy-minimized X-ray structures. Normal modes are then calculated using an all-atom molecular mechanics force field. Results The analysis reveals low-frequency vibrations that facilitate the accommodation of RNA duplexes – an essential step in target recognition. The Pyrococcus furiosus and Aquifex aeolicus Argonaute proteins both exhibit low-frequency torsion and hinge motions; however, differences in the overall architecture of the proteins cause the detailed dynamics to be significantly different. Conclusion Overall, low-frequency vibrations of Argonaute are consistent with mechanisms within the current reaction cycle model for RNA interference. PMID:18053142

  13. Rastering strategy for screening and centring of microcrystal samples of human membrane proteins with a sub-10 µm size X-ray synchrotron beam

    PubMed Central

    Cherezov, Vadim; Hanson, Michael A.; Griffith, Mark T.; Hilgart, Mark C.; Sanishvili, Ruslan; Nagarajan, Venugopalan; Stepanov, Sergey; Fischetti, Robert F.; Kuhn, Peter; Stevens, Raymond C.

    2009-01-01

    Crystallization of human membrane proteins in lipidic cubic phase often results in very small but highly ordered crystals. Advent of the sub-10 µm minibeam at the APS GM/CA CAT has enabled the collection of high quality diffraction data from such microcrystals. Herein we describe the challenges and solutions related to growing, manipulating and collecting data from optically invisible microcrystals embedded in an opaque frozen in meso material. Of critical importance is the use of the intense and small synchrotron beam to raster through and locate the crystal sample in an efficient and reliable manner. The resulting diffraction patterns have a significant reduction in background, with strong intensity and improvement in diffraction resolution compared with larger beam sizes. Three high-resolution structures of human G protein-coupled receptors serve as evidence of the utility of these techniques that will likely be useful for future structural determination efforts. We anticipate that further innovations of the technologies applied to microcrystallography will enable the solving of structures of ever more challenging targets. PMID:19535414

  14. Comparable contributions of structural-functional constraints and expression level to the rate of protein sequence evolution

    PubMed Central

    Wolf, Maxim Y; Wolf, Yuri I; Koonin, Eugene V

    2008-01-01

    Background Proteins show a broad range of evolutionary rates. Understanding the factors that are responsible for the characteristic rate of evolution of a given protein arguably is one of the major goals of evolutionary biology. A long-standing general assumption used to be that the evolution rate is, primarily, determined by the specific functional constraints that affect the given protein. These constraints were traditionally thought to depend both on the specific features of the protein's structure and its biological role. The advent of systems biology brought about new types of data, such as expression level and protein-protein interactions, and unexpectedly, a variety of correlations between protein evolution rate and these variables have been observed. The strongest connections by far were repeatedly seen between protein sequence evolution rate and the expression level of the respective gene. It has been hypothesized that this link is due to the selection for the robustness of the protein structure to mistranslation-induced misfolding that is particularly important for highly expressed proteins and is the dominant determinant of the sequence evolution rate. Results This work is an attempt to assess the relative contributions of protein domain structure and function, on the one hand, and expression level on the other hand, to the rate of sequence evolution. To this end, we performed a genome-wide analysis of the effect of the fusion of a pair of domains in multidomain proteins on the difference in the domain-specific evolutionary rates. The mistranslation-induced misfolding hypothesis would predict that, within multidomain proteins, fused domains, on average, should evolve at substantially closer rates than the same domains in different proteins because, within a multidomain protein, all domains are translated at the same rate. 
We performed a comprehensive comparison of the evolutionary rates of mammalian and plant protein domains that are either joined in multidomain proteins or contained in distinct proteins. Substantial homogenization of evolutionary rates in multidomain proteins was, indeed, observed in both animals and plants, although highly significant differences between domain-specific rates remained. The contributions of the translation rate, as determined by the effect of the fusion of a pair of domains within a multidomain protein, and intrinsic, domain-specific structural-functional constraints appear to be comparable in magnitude. Conclusion Fusion of domains in a multidomain protein results in substantial homogenization of the domain-specific evolutionary rates but significant differences between domain-specific evolution rates remain. Thus, the rate of translation and intrinsic structural-functional constraints both exert sizable and comparable effects on sequence evolution. Reviewers This article was reviewed by Sergei Maslov, Dennis Vitkup, Claus Wilke (nominated by Orly Alter), and Allan Drummond (nominated by Joel Bader). For the full reviews, please go to the Reviewers' Reports section. PMID:18840284

  15. Domain selection combined with improved cloning strategy for high throughput expression of higher eukaryotic proteins

    PubMed Central

    Chen, Yunjia; Qiu, Shihong; Luan, Chi-Hao; Luo, Ming

    2007-01-01

    Background Expression of higher eukaryotic genes as soluble, stable recombinant proteins is still a bottleneck step in biochemical and structural studies of novel proteins today. Correct identification of stable domains/fragments within the open reading frame (ORF), combined with proper cloning strategies, can greatly enhance the success rate when higher eukaryotic proteins are expressed as these domains/fragments. Furthermore, a HTP cloning pipeline incorporated with bioinformatics domain/fragment selection methods will be beneficial to studies of structure and function genomics/proteomics. Results With bioinformatics tools, we developed a domain/domain boundary prediction (DDBP) method, which was trained by available experimental data. Combined with an improved cloning strategy, DDBP had been applied to 57 proteins from C. elegans. Expression and purification results showed there was a 10-fold increase in terms of obtaining purified proteins. Based on the DDBP method, the improved GATEWAY cloning strategy and a robotic platform, we constructed a high throughput (HTP) cloning pipeline, including PCR primer design, PCR, BP reaction, transformation, plating, colony picking and entry clones extraction, which have been successfully applied to 90 C. elegans genes, 88 Brucella genes, and 188 human genes. More than 97% of the targeted genes were obtained as entry clones. This pipeline has a modular design and can adopt different operations for a variety of cloning/expression strategies. Conclusion The DDBP method and improved cloning strategy were satisfactory. The cloning pipeline, combined with our recombinant protein HTP expression pipeline and the crystal screening robots, constitutes a complete platform for structure genomics/proteomics. This platform will increase the success rate of purification and crystallization dramatically and promote the further advancement of structure genomics/proteomics. PMID:17663785

  16. Evolutionary-inspired probabilistic search for enhancing sampling of local minima in the protein energy surface

    PubMed Central

    2012-01-01

    Background Despite computational challenges, elucidating conformations that a protein system assumes under physiologic conditions for the purpose of biological activity is a central problem in computational structural biology. While these conformations are associated with low energies in the energy surface that underlies the protein conformational space, few existing conformational search algorithms focus on explicitly sampling low-energy local minima in the protein energy surface. Methods This work proposes a novel probabilistic search framework, PLOW, that explicitly samples low-energy local minima in the protein energy surface. The framework combines algorithmic ingredients from evolutionary computation and computational structural biology to effectively explore the subspace of local minima. A greedy local search maps a conformation sampled in conformational space to a nearby local minimum. A perturbation move jumps out of a local minimum to obtain a new starting conformation for the greedy local search. The process repeats in an iterative fashion, resulting in a trajectory-based exploration of the subspace of local minima. Results and conclusions The analysis of PLOW's performance shows that, by navigating only the subspace of local minima, PLOW is able to sample conformations near a protein's native structure, either more effectively or as well as state-of-the-art methods that focus on reproducing the native structure for a protein system. Analysis of the actual subspace of local minima shows that PLOW samples this subspace more effectively than a naive sampling approach. Additional theoretical analysis reveals that the perturbation function employed by PLOW is key to its ability to sample a diverse set of low-energy conformations. This analysis also suggests directions for further research and novel applications for the proposed framework. PMID:22759582
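
    The loop described in the Methods (greedy local search to a nearby minimum, then a perturbation move to a new starting point, repeated) can be sketched on a toy problem. The example below replaces the protein energy surface with a one-dimensional function and always moves to the newly found minimum; the energy function, step sizes and the absence of an acceptance criterion are assumptions made for illustration and are not details of PLOW.

```python
import math
import random


def toy_energy(x):
    """Rugged 1-D stand-in for an energy surface (hypothetical)."""
    return 0.1 * x * x + math.sin(3.0 * x)


def greedy_local_search(x, energy, step=0.01, max_iters=10000):
    """Move to the lower neighbor until neither neighbor improves."""
    e = energy(x)
    for _ in range(max_iters):
        best_x, best_e = x, e
        for cand in (x - step, x + step):
            cand_e = energy(cand)
            if cand_e < best_e:
                best_x, best_e = cand, cand_e
        if best_x == x:  # no improving neighbor: local minimum on the grid
            break
        x, e = best_x, best_e
    return x, e


def sample_local_minima(energy, start, n_iters=50, jump=1.5, seed=0):
    """Trajectory of local minima: perturb, then minimize, repeatedly."""
    rng = random.Random(seed)
    x, e = greedy_local_search(start, energy)
    trajectory = [(x, e)]
    for _ in range(n_iters):
        # Perturbation move: jump out of the current minimum.
        perturbed = x + rng.uniform(-jump, jump)
        # Greedy local search maps the new start to a nearby minimum.
        x, e = greedy_local_search(perturbed, energy)
        trajectory.append((x, e))
    return trajectory


trajectory = sample_local_minima(toy_energy, start=4.0)
print(min(e for _, e in trajectory))
```

    The size of the perturbation (`jump` here) controls whether the search stays in one basin or reaches new ones, which mirrors the abstract's remark that the perturbation function is key to sampling a diverse set of minima.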

  17. Exploiting CELLULOSE SYNTHASE (CESA) Class Specificity to Probe Cellulose Microfibril Biosynthesis.

    PubMed

    Kumar, Manoj; Mishra, Laxmi; Carr, Paul; Pilling, Michael; Gardner, Peter; Mansfield, Shawn D; Turner, Simon

    2018-05-01

    Cellulose microfibrils are the basic units of cellulose in plants. The structure of these microfibrils is at least partly determined by the structure of the cellulose synthase complex. In higher plants, this complex is composed of 18 to 24 catalytic subunits known as CELLULOSE SYNTHASE A (CESA) proteins. Three different classes of CESA proteins are required for cellulose synthesis, and for secondary cell wall cellulose biosynthesis these classes are represented by CESA4, CESA7, and CESA8. To probe the relationship between CESA proteins and microfibril structure, we created mutant cesa proteins that lack catalytic activity but retain sufficient structural integrity to allow assembly of the cellulose synthase complex. Using a series of Arabidopsis (Arabidopsis thaliana) mutants and genetic backgrounds, we found consistent differences in the ability of these mutant cesa proteins to complement the cellulose-deficient phenotype of the cesa null mutants. The best complementation was observed with catalytically inactive cesa4, while the equivalent mutation in cesa8 exhibited significantly lower levels of complementation. Using a variety of biophysical techniques, including solid-state nuclear magnetic resonance and Fourier transform infrared microscopy, to study these mutant plants, we found evidence for changes in cellulose microfibril structure, but these changes largely correlated with cellulose content and reflected differences in the relative proportions of primary and secondary cell walls. Our results suggest that individual CESA classes have similar roles in determining cellulose microfibril structure, and it is likely that the different effects of mutating members of different CESA classes are the consequence of their different catalytic activity and their influence on the overall rate of cellulose synthesis. © 2018 American Society of Plant Biologists. All Rights Reserved.

  18. Exploiting CELLULOSE SYNTHASE (CESA) Class Specificity to Probe Cellulose Microfibril Biosynthesis

    PubMed Central

    Mishra, Laxmi; Carr, Paul; Gardner, Peter

    2018-01-01

    Cellulose microfibrils are the basic units of cellulose in plants. The structure of these microfibrils is at least partly determined by the structure of the cellulose synthase complex. In higher plants, this complex is composed of 18 to 24 catalytic subunits known as CELLULOSE SYNTHASE A (CESA) proteins. Three different classes of CESA proteins are required for cellulose synthesis and for secondary cell wall cellulose biosynthesis these classes are represented by CESA4, CESA7, and CESA8. To probe the relationship between CESA proteins and microfibril structure, we created mutant cesa proteins that lack catalytic activity but retain sufficient structural integrity to allow assembly of the cellulose synthase complex. Using a series of Arabidopsis (Arabidopsis thaliana) mutants and genetic backgrounds, we found consistent differences in the ability of these mutant cesa proteins to complement the cellulose-deficient phenotype of the cesa null mutants. The best complementation was observed with catalytically inactive cesa4, while the equivalent mutation in cesa8 exhibited significantly lower levels of complementation. Using a variety of biophysical techniques, including solid-state nuclear magnetic resonance and Fourier transform infrared microscopy, to study these mutant plants, we found evidence for changes in cellulose microfibril structure, but these changes largely correlated with cellulose content and reflected differences in the relative proportions of primary and secondary cell walls. Our results suggest that individual CESA classes have similar roles in determining cellulose microfibril structure, and it is likely that the different effects of mutating members of different CESA classes are the consequence of their different catalytic activity and their influence on the overall rate of cellulose synthesis. PMID:29523715

  19. Mirrors in the PDB: left-handed α-turns guide design with D-amino acids

    PubMed Central

    Annavarapu, Srinivas; Nanda, Vikas

    2009-01-01

    Background Incorporating variable amino acid stereochemistry in molecular design has the potential to improve existing protein stability and create new topologies inaccessible to homochiral molecules. The Protein Data Bank has been a reliable, rich source of information on molecular interactions and their role in protein stability and structure. D-amino acids rarely occur naturally, making it difficult to infer general rules for how they would be tolerated in proteins through an analysis of existing protein structures. However, protein elements containing short left-handed turns and helices turn out to contain useful information. Molecular mechanisms used in proteins to stabilize left-handed elements by L-amino acids are structurally enantiomeric to potential synthetic strategies for stabilizing right-handed elements with D-amino acids. Results Propensities for amino acids to occur in contiguous αL helices correlate with published thermodynamic scales for incorporation of D-amino acids into αR helices. Two backbone rules for terminating a left-handed helix are found: an αR conformation is disfavored at the amino terminus, and a βR conformation is disfavored at the carboxy terminus. Helix capping sidechain-backbone interactions are found which are unique to αL helices including an elevated propensity for L-Asn, and L-Thr at the amino terminus and L-Gln, L-Thr and L-Ser at the carboxy terminus. Conclusion By examining left-handed α-turns containing L-amino acids, new interaction motifs for incorporating D-amino acids into right-handed α-helices are identified. These will provide a basis for de novo design of novel heterochiral protein folds. PMID:19772623

  20. SVM-Fold: a tool for discriminative multi-class protein fold and superfamily recognition

    PubMed Central

    Melvin, Iain; Ie, Eugene; Kuang, Rui; Weston, Jason; Stafford, William Noble; Leslie, Christina

    2007-01-01

    Background Predicting a protein's structural class from its amino acid sequence is a fundamental problem in computational biology. Much recent work has focused on developing new representations for protein sequences, called string kernels, for use with support vector machine (SVM) classifiers. However, while some of these approaches exhibit state-of-the-art performance at the binary protein classification problem, i.e. discriminating between a particular protein class and all other classes, few of these studies have addressed the real problem of multi-class superfamily or fold recognition. Moreover, there are only limited software tools and systems for SVM-based protein classification available to the bioinformatics community. Results We present a new multi-class SVM-based protein fold and superfamily recognition system and web server called SVM-Fold, which can be found at . Our system uses an efficient implementation of a state-of-the-art string kernel for sequence profiles, called the profile kernel, where the underlying feature representation is a histogram of inexact matching k-mer frequencies. We also employ a novel machine learning approach to solve the difficult multi-class problem of classifying a sequence of amino acids into one of many known protein structural classes. Binary one-vs-the-rest SVM classifiers that are trained to recognize individual structural classes yield prediction scores that are not comparable, so that standard "one-vs-all" classification fails to perform well. Moreover, SVMs for classes at different levels of the protein structural hierarchy may make useful predictions, but one-vs-all does not try to combine these multiple predictions. To deal with these problems, our method learns relative weights between one-vs-the-rest classifiers and encodes information about the protein structural hierarchy for multi-class prediction. 
In large-scale benchmark results based on the SCOP database, our code weighting approach significantly improves on the standard one-vs-all method for both the superfamily and fold prediction in the remote homology setting and on the fold recognition problem. Moreover, our code weight learning algorithm strongly outperforms nearest-neighbor methods based on PSI-BLAST in terms of prediction accuracy on every structure classification problem we consider. Conclusion By combining state-of-the-art SVM kernel methods with a novel multi-class algorithm, the SVM-Fold system delivers efficient and accurate protein fold and superfamily recognition. PMID:17570145

  1. Predicting beta-turns in proteins using support vector machines with fractional polynomials

    PubMed Central

    2013-01-01

    Background β-turns are a secondary structure type that has an essential role in molecular recognition, protein folding, and stability. They are found to be the most common type of non-repetitive structure, since 25% of amino acids in protein structures are situated on them. Their prediction is considered to be one of the crucial problems in bioinformatics and molecular biology, and it can provide valuable insights and inputs for fold recognition and drug design. Results We propose an approach that combines support vector machines (SVMs) and logistic regression (LR) in a hybrid prediction method, which we call H-SVM-LR, to predict β-turns in proteins. Fractional polynomials are used for LR modeling. We utilize position specific scoring matrices (PSSMs) and predicted secondary structure (PSS) as features. Our simulation studies show that H-SVM-LR achieves a Qtotal of 82.87%, 82.84%, and 82.32% on the BT426, BT547, and BT823 datasets, respectively. These values are the highest among β-turn prediction methods that are based on PSSMs and secondary structure information. H-SVM-LR also achieves favorable performance in predicting β-turns as measured by the Matthews correlation coefficient (MCC) on these datasets. Furthermore, H-SVM-LR shows good performance when considering shape strings as additional features. Conclusions In this paper, we present a comprehensive approach for β-turn prediction. Experiments show that our proposed approach achieves better performance compared to other competing prediction methods. PMID:24565438

  2. Sweetening the pot: adding glycosylation to the biomarker discovery equation

    PubMed Central

    Drake, Penelope M.; Cho, Wonryeon; Li, Bensheng; Prakobphol, Akraporn; Johansen, Eric; Anderson, N. Leigh; Regnier, Fred E.; Gibson, Bradford W.; Fisher, Susan J.

    2010-01-01

    Background Cancer has profound effects on gene expression, including a cell’s glycosylation machinery. Thus, tumors produce glycoproteins that carry oligosaccharides with structures that are markedly different from the same protein produced by a normal cell. A single protein can have many glycosylation sites that greatly amplify the signals they generate as compared to their protein backbones. Content We survey clinical tests that target carbohydrate modifications for diagnosing and treating cancer. Next, we present the biological relevance of glycosylation to disease progression by highlighting the role these structures play in adhesion, signaling and metastasis, and then address current methodological approaches to biomarker discovery that capitalize on selectively capturing tumor-associated glycoforms to enrich and identify disease-related candidate analytes. Finally, we discuss emerging technologies (multiple reaction monitoring and lectin-antibody arrays) as potential tools for biomarker validation studies in pursuit of clinically useful tests. Summary The future of carbohydrate-based biomarker studies has arrived. At all stages, from discovery through verification and deployment into clinics, glycosylation should be considered a primary readout or a way of increasing the sensitivity and specificity of protein-based analyses. PMID:19959616

  3. Rapid functional diversification in the structurally conserved ELAV family of neuronal RNA binding proteins

    PubMed Central

    Samson, Marie-Laure

    2008-01-01

    Background The Drosophila gene embryonic lethal abnormal visual system (elav) is the prototype of a gene family present in all metazoans. Its members encode structurally conserved neuronal proteins with three RNA Recognition Motifs (RRM) but they paradoxically act at diverse levels of post-transcriptional regulation. In an attempt to understand the history of this family, we searched for orthologs in eleven completely sequenced genomes, including those of humans, D. melanogaster and C. elegans, for which cDNAs are available. Results We analyzed 23 orthologs/paralogs of elav, and found evidence of gain/loss of gene copy number. For one set of genes, including elav itself, the coding sequences are free of introns and their products most resemble ELAV. The remaining genes show remarkable conservation of their exon organization, and their products most resemble FNE and RBP9, proteins encoded by the two elav paralogs of Drosophila. Remarkably, three of the conserved exon junctions are both close to structural elements, involved respectively in protein-RNA interactions and in the regulation of sub-cellular localization, and in the vicinity of diverse sequence variations. Conclusion The data indicate that the essential elav gene of Drosophila is newly emerged, restricted to dipterans and of retrotransposed origin. We propose that the conserved exon junctions constitute potential sites for sequence/function modifications, and that RRM binding proteins, whose function relies upon plastic RNA-protein interactions, may have played an important role in brain evolution. PMID:18715504

  4. Structural and evolutionary adaptation of rhoptry kinases and pseudokinases, a family of coccidian virulence factors

    PubMed Central

    2013-01-01

    Background The widespread protozoan parasite Toxoplasma gondii interferes with host cell functions by exporting the contents of a unique apical organelle, the rhoptry. Among the mix of secreted proteins are an expanded, lineage-specific family of protein kinases termed rhoptry kinases (ROPKs), several of which have been shown to be key virulence factors, including the pseudokinase ROP5. The extent and details of the diversification of this protein family are poorly understood. Results In this study, we comprehensively catalogued the ROPK family in the genomes of Toxoplasma gondii, Neospora caninum and Eimeria tenella, as well as portions of the unfinished genome of Sarcocystis neurona, and classified the identified genes into 42 distinct subfamilies. We systematically compared the rhoptry kinase protein sequences and structures to each other and to the broader superfamily of eukaryotic protein kinases to study the patterns of diversification and neofunctionalization in the ROPK family and its subfamilies. We identified three ROPK sub-clades of particular interest: those bearing a structurally conserved N-terminal extension to the kinase domain (NTE), an E. tenella-specific expansion, and a basal cluster including ROP35 and BPK1 that we term ROPKL. Structural analysis in light of the solved structures ROP2, ROP5, ROP8 and in comparison to typical eukaryotic protein kinases revealed ROPK-specific conservation patterns in two key regions of the kinase domain, surrounding a ROPK-conserved insert in the kinase hinge region and a disulfide bridge in the kinase substrate-binding lobe. We also examined conservation patterns specific to the NTE-bearing clade. We discuss the possible functional consequences of each. Conclusions Our work sheds light on several important but previously unrecognized features shared among rhoptry kinases, as well as the essential differences between active and degenerate protein kinases. 
We identify the most distinctive ROPK-specific features conserved across both active kinases and pseudokinases, and discuss these in terms of sequence motifs, evolutionary context, structural impact and potential functional relevance. By characterizing the proteins that enable these parasites to invade the host cell and co-opt its signaling mechanisms, we provide guidance on potential therapeutic targets for the diseases caused by coccidian parasites. PMID:23742205

  5. MolTalk – a programming library for protein structures and structure analysis

    PubMed Central

    Diemand, Alexander V; Scheib, Holger

    2004-01-01

    Background Two of the mostly unsolved but increasingly urgent problems for modern biologists are a) to quickly and easily analyse protein structures and b) to comprehensively mine the wealth of information, which is distributed along with the 3D co-ordinates by the Protein Data Bank (PDB). Tools which address this issue need to be highly flexible and powerful but at the same time must be freely available and easy to learn. Results We present MolTalk, an elaborate programming language, which consists of the programming library libmoltalk implemented in Objective-C and the Smalltalk-based interpreter MolTalk. MolTalk combines the advantages of an easy to learn and programmable procedural scripting with the flexibility and power of a full programming language. An overview of currently available applications of MolTalk is given and with PDBChainSaw one such application is described in more detail. PDBChainSaw is a MolTalk-based parser and information extraction utility of PDB files. Weekly updates of the PDB are synchronised with PDBChainSaw and are available for free download from the MolTalk project page following the link to PDBChainSaw. For each chain in a protein structure, PDBChainSaw extracts the sequence from its co-ordinates and provides additional information from the PDB-file header section, such as scientific organism, compound name, and EC code. Conclusion MolTalk provides a rich set of methods to analyse and even modify experimentally determined or modelled protein structures. These methods vary in complexity and are thus suitable for beginners and advanced programmers alike. We envision MolTalk to be most valuable in the following applications: 1) To analyse protein structures repetitively in large-scale, i.e. to benchmark protein structure prediction methods or to evaluate structural models. The quality of the resulting 3D-models can be assessed by e.g. calculating a Ramachandran-Sasisekharan plot. 
2) To quickly retrieve information for (a limited number of) macro-molecular structures, i.e. H-bonds, salt bridges, contacts between amino acids and ligands or at the interface between two chains. 3) To programme more complex structural bioinformatics software and to implement demanding algorithms through its portability to Objective-C, e.g. iMolTalk. 4) To be used as a front end to databases, e.g. PDBChainSaw. PMID:15096277
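The abstract above describes PDBChainSaw as extracting, for each chain, the sequence from its co-ordinates. The following is a minimal Python sketch of that idea only; it is not MolTalk code (MolTalk itself is Smalltalk/Objective-C), and the function names `chain_sequences` and `atom_line` are hypothetical. It assumes the standard fixed-column layout of PDB ATOM records and reads one residue per C-alpha atom.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def chain_sequences(pdb_text):
    """Return {chain_id: one-letter sequence} read from C-alpha ATOM records."""
    residues = {}  # chain_id -> list of one-letter codes, in file order
    seen = set()   # (chain, residue number + insertion code) already counted
    for line in pdb_text.splitlines():
        # Columns 13-16 hold the atom name; keep only C-alpha atoms.
        if not line.startswith("ATOM") or line[12:16].strip() != "CA":
            continue
        chain = line[21]
        key = (chain, line[22:27])
        if key in seen:  # skip alternate locations of the same residue
            continue
        seen.add(key)
        code = THREE_TO_ONE.get(line[17:20], "X")  # "X" for non-standard residues
        residues.setdefault(chain, []).append(code)
    return {chain: "".join(codes) for chain, codes in residues.items()}


def atom_line(serial, name, resname, chain, resseq):
    """Build a fixed-column ATOM record for the demo (coordinates are dummy zeros)."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:>3s} {chain}{resseq:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}")


# Hypothetical two-chain toy structure, not taken from any PDB entry.
demo = "\n".join([
    atom_line(1, " N  ", "MET", "A", 1),
    atom_line(2, " CA ", "MET", "A", 1),
    atom_line(3, " CA ", "LYS", "A", 2),
    atom_line(4, " CA ", "GLY", "B", 1),
])
print(chain_sequences(demo))
```

Reading the sequence from ATOM records rather than from the SEQRES header means that only residues with resolved co-ordinates appear in the result.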

  6. Phylogenetic and Complementation Analysis of a Single-Stranded DNA Binding Protein Family from Lactococcal Phages Indicates a Non-Bacterial Origin

    PubMed Central

    Mariadassou, Mahendra; Bardowski, Jacek K.; Bidnenko, Elena

    2011-01-01

    Background The single-stranded-nucleic acid binding (SSB) protein superfamily includes proteins encoded by different organisms, from Bacteria and their phages to Eukaryotes. SSB proteins share common structural characteristics and have been suggested to descend from an ancestor polypeptide. However, like other proteins involved in DNA replication, bacterial SSB proteins are clearly different from those found in Archaea and Eukaryotes. It was proposed that the corresponding genes in the phage genomes were transferred from the bacterial hosts. Recently, new SSB proteins encoded by the virulent lactococcal bacteriophages (Orf14bIL67-like proteins) have been identified and characterized structurally and biochemically. Methodology/Principal Findings This study focused on the determination of phylogenetic relationships between Orf14bIL67-like proteins and other SSBs. We have performed a large scale phylogenetic analysis and pairwise sequence comparisons of SSB proteins from different phyla. The results show that, in remarkable contrast to other phage SSBs, the Orf14bIL67-like proteins form a distinct, self-contained and well supported phylogenetic group connected to the archaeal SSBs. Functional studies demonstrated that, despite the structural and amino acid sequence differences from bacterial SSBs, Orf14bIL67 protein complements the conditional lethal ssb-1 mutation of Escherichia coli. Conclusions/Significance Here we identified for the first time a group of phage-encoded SSBs which are clearly distinct from their bacterial counterparts. All methods supported the recognition of these phage proteins as a new family within the SSB superfamily. Our findings suggest that, unlike other phages, the virulent lactococcal phages carry ssb genes that were not acquired from their hosts, but transferred from an archaeal genome. This represents a unique example of a horizontal gene transfer between Archaea and bacterial phages. PMID:22073223

  7. Plasma proteins predict conversion to dementia from prodromal disease

    PubMed Central

    Hye, Abdul; Riddoch-Contreras, Joanna; Baird, Alison L.; Ashton, Nicholas J.; Bazenet, Chantal; Leung, Rufina; Westman, Eric; Simmons, Andrew; Dobson, Richard; Sattlecker, Martina; Lupton, Michelle; Lunnon, Katie; Keohane, Aoife; Ward, Malcolm; Pike, Ian; Zucht, Hans Dieter; Pepin, Danielle; Zheng, Wei; Tunnicliffe, Alan; Richardson, Jill; Gauthier, Serge; Soininen, Hilkka; Kłoszewska, Iwona; Mecocci, Patrizia; Tsolaki, Magda; Vellas, Bruno; Lovestone, Simon

    2014-01-01

    Background The study aimed to validate previously discovered plasma biomarkers associated with AD, using a design based on imaging measures as surrogate for disease severity and assess their prognostic value in predicting conversion to dementia. Methods Three multicenter cohorts of cognitively healthy elderly, mild cognitive impairment (MCI), and AD participants with standardized clinical assessments and structural neuroimaging measures were used. Twenty-six candidate proteins were quantified in 1148 subjects using multiplex (xMAP) assays. Results Sixteen proteins correlated with disease severity and cognitive decline. Strongest associations were in the MCI group with a panel of 10 proteins predicting progression to AD (accuracy 87%, sensitivity 85%, and specificity 88%). Conclusions We have identified 10 plasma proteins strongly associated with disease severity and disease progression. Such markers may be useful for patient selection for clinical trials and assessment of patients with predisease subjective memory complaints. PMID:25012867
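The accuracy, sensitivity and specificity quoted above are standard ratios over a confusion matrix. As a small worked illustration of those definitions, the sketch below computes them in Python. The counts are invented for the example and are not the study's data; the function name `classification_metrics` is likewise hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: converters correctly / incorrectly classified,
    tn/fp: non-converters correctly / incorrectly classified.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,    # fraction of all calls that are right
        "sensitivity": tp / (tp + fn),    # fraction of true converters detected
        "specificity": tn / (tn + fp),    # fraction of non-converters cleared
    }


# Invented counts, for illustration only.
metrics = classification_metrics(tp=40, fn=10, tn=45, fp=5)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because accuracy pools both classes, it can look high even when one of the two other figures is poor, which is why all three are usually reported together.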

  8. Effect of T- and C-loop mutations on the Herbaspirillum seropedicae GlnB protein in nitrogen signalling.

    PubMed

    Bonatto, Ana C; Souza, Emanuel M; Pedrosa, Fábio O; Yates, M Geoffrey; Benelli, Elaine M

    2005-01-01

    Proteins of the PII family are found in species of all kingdoms. Although these proteins usually share high identity, their functions are specific to the different organisms. Comparison of structural data from Escherichia coli GlnB and GlnK and Herbaspirillum seropedicae GlnB showed that the T-loop and C-terminus were variable regions. To evaluate the role of these regions in signal transduction by the H. seropedicae GlnB protein, four mutants were constructed: Y51F, G108A/P109A, G108W and Q3R/T5A. The activities of the native and mutated proteins were assayed in an E. coli background constitutively expressing the Klebsiella pneumoniae nifLA operon. The results suggested that the T-loop and C-terminus regions of H. seropedicae GlnB are involved in nitrogen signal transduction.

  9. A phylogenetic analysis of normal modes evolution in enzymes and its relationship to enzyme function

    PubMed Central

    Lai, Jason; Jin, Jing; Kubelka, Jan; Liberles, David A.

    2012-01-01

    Since the dynamic nature of protein structures is essential for enzymatic function, it is expected that the functional evolution can be inferred from the changes in the protein dynamics. However, dynamics can also diverge neutrally with sequence substitution between enzymes without changes of function. In this study, a phylogenetic approach is implemented to explore the relationship between enzyme dynamics and function through evolutionary history. Protein dynamics are described by normal mode analysis based on a simplified harmonic potential force field applied to the reduced Cα representation of the protein structure while enzymatic function is described by Enzyme Commission (EC) numbers. Similarity of the binding pocket dynamics at each branch of the protein family’s phylogeny was analyzed in two ways: 1) explicitly by quantifying the normal mode overlap calculated for the reconstructed ancestral proteins at each end and 2) implicitly using a diffusion model to obtain the reconstructed lineage-specific changes in the normal modes. Both explicit and implicit ancestral reconstruction identified generally faster rates of change in dynamics compared with the expected change from neutral evolution at the branches of potential functional divergences for the alpha-amylase, D-isomer specific 2-hydroxyacid dehydrogenase, and copper-containing amine oxidase protein families. Normal modes analysis added additional information over just comparing the RMSD of static structures. However, the branch-specific changes were not statistically significant compared to background function-independent neutral rates of change of dynamic properties and blind application of the analysis would not enable prediction of changes in enzyme specificity. PMID:22651983
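The abstract above describes normal mode analysis with a simplified harmonic potential on a Cα-only representation, and compares structures by normal mode overlap. The sketch below shows one common way to do this, an anisotropic elastic network model in NumPy. The specific form of the potential, the 15 Å cutoff, the uniform spring constant and all function names are assumptions made for illustration; the abstract does not state the authors' exact parameters, and the restriction to binding pocket residues is not modelled here.

```python
import numpy as np


def anm_hessian(coords, cutoff=15.0, gamma=1.0):
    """Build the 3N x 3N Hessian of an elastic network over C-alpha positions."""
    n = len(coords)
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            dist2 = float(d @ d)
            if dist2 > cutoff ** 2:
                continue  # no spring between distant residues
            block = -gamma * np.outer(d, d) / dist2
            hessian[3 * i:3 * i + 3, 3 * j:3 * j + 3] = block
            hessian[3 * j:3 * j + 3, 3 * i:3 * i + 3] = block
            # Diagonal blocks are minus the sum of the off-diagonal blocks.
            hessian[3 * i:3 * i + 3, 3 * i:3 * i + 3] -= block
            hessian[3 * j:3 * j + 3, 3 * j:3 * j + 3] -= block
    return hessian


def normal_modes(coords, n_rigid=6, **kwargs):
    """Return (eigenvalues, modes) with the six rigid-body modes dropped."""
    values, vectors = np.linalg.eigh(anm_hessian(coords, **kwargs))
    return values[n_rigid:], vectors[:, n_rigid:]


def mode_overlap(u, v):
    """Absolute cosine between two mode vectors (1 = same direction of motion)."""
    return abs(float(u @ v)) / (np.linalg.norm(u) * np.linalg.norm(v))


# Toy C-alpha trace: ten points on an idealized helix (not a real protein).
idx = np.arange(10)
angles = np.deg2rad(100.0 * idx)
coords = np.column_stack([2.3 * np.cos(angles), 2.3 * np.sin(angles), 1.5 * idx])

# A slightly perturbed copy stands in for a related structure in the same frame.
rng = np.random.default_rng(0)
variant = coords + rng.normal(scale=0.3, size=coords.shape)

_, modes_a = normal_modes(coords)
_, modes_b = normal_modes(variant)
print("overlap of lowest modes:", mode_overlap(modes_a[:, 0], modes_b[:, 0]))
```

In a real comparison the two structures would first be superimposed and restricted to equivalent residues, since the overlap is only meaningful when both mode vectors are expressed over the same atoms in the same frame.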

  10. A phylogenetic analysis of normal modes evolution in enzymes and its relationship to enzyme function.

    PubMed

    Lai, Jason; Jin, Jing; Kubelka, Jan; Liberles, David A

    2012-09-21

    Since the dynamic nature of protein structures is essential for enzymatic function, it is expected that functional evolution can be inferred from the changes in protein dynamics. However, dynamics can also diverge neutrally with sequence substitution between enzymes without changes of function. In this study, a phylogenetic approach is implemented to explore the relationship between enzyme dynamics and function through evolutionary history. Protein dynamics are described by normal mode analysis based on a simplified harmonic potential force field applied to the reduced C(α) representation of the protein structure while enzymatic function is described by Enzyme Commission numbers. Similarity of the binding pocket dynamics at each branch of the protein family's phylogeny was analyzed in two ways: (1) explicitly by quantifying the normal mode overlap calculated for the reconstructed ancestral proteins at each end and (2) implicitly using a diffusion model to obtain the reconstructed lineage-specific changes in the normal modes. Both explicit and implicit ancestral reconstruction identified generally faster rates of change in dynamics compared with the expected change from neutral evolution at the branches of potential functional divergences for the α-amylase, D-isomer-specific 2-hydroxyacid dehydrogenase, and copper-containing amine oxidase protein families. Normal mode analysis added additional information over just comparing the RMSD of static structures. However, the branch-specific changes were not statistically significant compared to background function-independent neutral rates of change of dynamic properties and blind application of the analysis would not enable prediction of changes in enzyme specificity. Copyright © 2012 Elsevier Ltd. All rights reserved.

  11. OST-HTH: a novel predicted RNA-binding domain

    PubMed Central

    2010-01-01

    Background The mechanism by which the arthropod Oskar and vertebrate TDRD5/TDRD7 proteins nucleate or organize structurally related ribonucleoprotein (RNP) complexes, the polar granule and nuage, is poorly understood. Using sequence profile searches we identify a novel domain in these proteins that is widely conserved across eukaryotes and bacteria. Results Using contextual information from domain architectures, sequence-structure superpositions and available functional information we predict that this domain is likely to adopt the winged helix-turn-helix fold and bind RNA with a potential specificity for dsRNA. We show that in eukaryotes this domain is often combined in the same polypeptide with protein-protein- or lipid- interaction domains that might play a role in anchoring these proteins to specific cytoskeletal structures. Conclusions Thus, proteins with this domain might have a key role in the recognition and localization of dsRNA, including miRNAs, rasiRNAs and piRNAs hybridized to their targets. In other cases, this domain is fused to ubiquitin-binding, E3 ligase and ubiquitin-like domains indicating a previously under-appreciated role for ubiquitination in regulating the assembly and stability of nuage-like RNP complexes. Both bacteria and eukaryotes encode a conserved family of proteins that combines this predicted RNA-binding domain with a previously uncharacterized domain (DUF88). We present evidence that it is an RNAse belonging to the superfamily that includes the 5'->3' nucleases, PIN and NYN domains and might be recruited to degrade certain RNAs. Reviewers This article was reviewed by Sandor Pongor and Arcady Mushegian. PMID:20302647

  12. Comparison of molecular dynamics and superfamily spaces of protein domain deformation

    PubMed Central

    Velázquez-Muriel, Javier A; Rueda, Manuel; Cuesta, Isabel; Pascual-Montano, Alberto; Orozco, Modesto; Carazo, José-María

    2009-01-01

    Background The strong relationship between protein structure and flexibility, on the one hand, and biological protein function, on the other, is well known. Technically, protein flexibility exploration is an essential task in many applications, such as protein structure prediction and modeling. In this contribution we have compared two different approaches to explore the flexibility space of protein domains: i) molecular dynamics (MD-space), and ii) the study of the structural changes within a superfamily (SF-space). Results Our analysis indicates that the MD-space and the SF-space display a significant overlap, but are still different enough to be considered as complementary. The SF-space is wider but less complex than the MD-space, irrespective of the number of members in the superfamily. Also, the SF-space does not sample all possibilities offered by the MD-space, but often introduces very large changes along just a few deformation modes, whose number tends to a plateau as the number of related folds in the superfamily increases. Conclusion Theoretically, we obtained two conclusions. First, function restricts evolution's access to some flexibility patterns, as we observe that when a superfamily member changes to become another, the path does not completely overlap with the physical deformability. Second, conformational changes from variation in a superfamily are larger and much simpler than those allowed by physical deformability. Methodologically, the conclusion is that both spaces studied are complementary, and have different size and complexity. We expect this fact to have application in fields such as 3D-EM/X-ray hybrid models or ab initio protein folding. PMID:19220918

  13. Trichomonas vaginalis vast BspA-like gene family: evidence for functional diversity from structural organisation and transcriptomics

    PubMed Central

    2010-01-01

    Background Trichomonas vaginalis is the most common non-viral human sexually transmitted pathogen and, importantly, contributes to facilitating the spread of HIV. Yet very little is known about its surface and secreted proteins mediating interactions with, and permitting the invasion and colonisation of, the host mucosa. Initial annotations of the T. vaginalis genome identified a plethora of candidate extracellular proteins. Results Data mining of the T. vaginalis genome identified 911 BspA-like entries (TvBspA) sharing TpLRR-like leucine-rich repeats, which represent the largest gene family encoding potential extracellular proteins for the pathogen. A broad range of microorganisms encoding BspA-like proteins was identified and these are mainly known to live on mucosal surfaces; among these, T. vaginalis is endowed with the largest gene family. Over 190 TvBspA proteins with inferred transmembrane domains were characterised by a considerable structural diversity between their TpLRR and other types of repetitive sequences and two subfamilies possessed distinct classic sorting signal motifs for endocytosis. One TvBspA subfamily also shared a glycine-rich protein domain with proteins from Clostridium difficile pathogenic strains and C. difficile phages. Consistent with the hypothesis that TvBspA protein structural diversity implies diverse roles, we demonstrated for several TvBspA genes differential expression at the transcript level in different growth conditions. Identified variants of repetitive segments between several TvBspA paralogues and orthologues from two clinical isolates were also consistent with TpLRR and other repetitive sequences being functionally important. For one TvBspA protein, cell surface expression and antibody responses by both female and male T. vaginalis infected patients were also demonstrated.
Conclusions The biased mucosal habitat for microbial species encoding BspA-like proteins, the characterisation of a vast structural diversity for the TvBspA proteins, differential expression of a subset of TvBspA genes and the cellular localisation and immunological data for one TvBspA; all point to the importance of the TvBspA proteins to various aspects of T. vaginalis pathobiology at the host-pathogen interface. PMID:20144183

  14. ST proteins, a new family of plant tandem repeat proteins with a DUF2775 domain mainly found in Fabaceae and Asteraceae

    PubMed Central

    2012-01-01

    Background Many proteins with tandem repeats in their sequence have been described and classified according to the length of the repeats: I) Repeats of short oligopeptides (from 2 to 20 amino acids), including structural cell wall proteins and arabinogalactan proteins. II) Repeats that range in length from 20 to 40 residues, including proteins with a well-established three-dimensional structure often involved in mediating protein-protein interactions. III) Longer repeats in the order of 100 amino acids that constitute structurally and functionally independent units. Here we analyse ShooT specific (ST) proteins, a family of proteins with tandem repeats of unknown function that were first found in Leguminosae, and their possible similarities to other proteins with tandem repeats. Results ST protein sequences were only found in dicotyledonous plants, limited to several plant families, mainly the Fabaceae and the Asteraceae. ST mRNAs accumulate mainly in the roots and under biotic interactions. Most ST proteins have one or several Domain(s) of Unknown Function 2775 (DUF2775). All deduced ST proteins have a signal peptide, indicating that these proteins enter the secretory pathway, and the mature proteins have tandem repeat oligopeptides that share a hexapeptide (E/D)FEPRP followed by 4 partially conserved amino acids, which could determine a putative N-glycosylation signal, and a fully conserved tyrosine. In a phylogenetic tree, the sequences cluster according to taxonomic group. A possible involvement in symbiosis and abiotic stress as well as in plant cell elongation is suggested, although different STs could play different roles in plant development. Conclusions We describe a new family of proteins called ST whose presence is limited to the plant kingdom, specifically to a few families of dicotyledonous plants.
They present 20 to 40 amino acid tandem repeat sequences with different characteristics (signal peptide, DUF2775 domain, conservative repeat regions) from the described group of 20 to 40 amino acid tandem repeat proteins and also from known cell wall proteins with repeat sequences. Several putative roles in plant physiology can be inferred from the characteristics found. PMID:23134664

  15. Insulator function and topological domain border strength scale with architectural protein occupancy

    PubMed Central

    2014-01-01

    Background Chromosome conformation capture studies suggest that eukaryotic genomes are organized into structures called topologically associating domains. The borders of these domains are highly enriched for architectural proteins with characterized roles in insulator function. However, a majority of architectural protein binding sites localize within topological domains, suggesting sites associated with domain borders represent a functionally different subclass of these regulatory elements. How topologically associating domains are established and what differentiates border-associated from non-border architectural protein binding sites remain unanswered questions. Results By mapping the genome-wide target sites for several Drosophila architectural proteins, including previously uncharacterized profiles for TFIIIC and SMC-containing condensin complexes, we uncover an extensive pattern of colocalization in which architectural proteins establish dense clusters at the borders of topological domains. Reporter-based enhancer-blocking insulator activity as well as endogenous domain border strength scale with the occupancy level of architectural protein binding sites, suggesting co-binding by architectural proteins underlies the functional potential of these loci. Analyses in mouse and human stem cells suggest that clustering of architectural proteins is a general feature of genome organization, and conserved architectural protein binding sites may underlie the tissue-invariant nature of topologically associating domains observed in mammals. Conclusions We identify a spectrum of architectural protein occupancy that scales with the topological structure of chromosomes and the regulatory potential of these elements. Whereas high occupancy architectural protein binding sites associate with robust partitioning of topologically associating domains and robust insulator function, low occupancy sites appear reserved for gene-specific regulation within topological domains. 
PMID:24981874

  16. Pattern similarity study of functional sites in protein sequences: lysozymes and cystatins

    PubMed Central

    Nakai, Shuryo; Li-Chan, Eunice CY; Dou, Jinglie

    2005-01-01

    Background Although it is generally agreed that topography is more conserved than sequences, proteins sharing the same fold can have different functions, while there are protein families with low sequence similarity. An alternative method for profile analysis of characteristic conserved positions of the motifs within the 3D structures may be needed for functional annotation of protein sequences. Using the approach of quantitative structure-activity relationships (QSAR), we have proposed a new algorithm for postulating functional mechanisms on the basis of pattern similarity and average of property values of side-chains in segments within sequences. This approach was used to search for functional sites of proteins belonging to the lysozyme and cystatin families. Results Hydrophobicity and β-turn propensity of reference segments with 3–7 residues were used for the homology similarity search (HSS) for active sites. Hydrogen bonding was used as the side-chain property for searching the binding sites of lysozymes. The profiles of similarity constants and average values of these parameters as functions of their positions in the sequences could identify both active and substrate binding sites of the lysozyme of Streptomyces coelicolor, which has been reported as a new fold enzyme (Cellosyl). The same approach was successfully applied to cystatins, especially for postulating the mechanisms of amyloidosis of human cystatin C as well as human lysozyme. Conclusion Pattern similarity and average index values of structure-related properties of side chains in short segments of three residues or longer were, for the first time, successfully applied for predicting functional sites in sequences. This new approach may be applicable to studying functional sites in un-annotated proteins, for which complete 3D structures are not yet available. PMID:15904486
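The abstract above scans sequences with short reference segments, recording both a pattern similarity and the average of a side-chain property at each position. The sketch below illustrates that kind of sliding-window scan in Python. It is not the authors' HSS algorithm: the Pearson correlation is used here as a stand-in for their similarity constant, the Kyte-Doolittle hydropathy scale stands in for their hydrophobicity index, and the sequences and function names are invented for the example.

```python
import math

# Kyte-Doolittle hydropathy values, used here as an assumed hydrophobicity scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def pearson(a, b):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def scan(sequence, reference, scale=KYTE_DOOLITTLE):
    """Slide the reference segment along the sequence.

    Returns a list of (start, pattern similarity, mean property value) tuples,
    one per window of the same length as the reference.
    """
    ref_profile = [scale[res] for res in reference]
    width = len(ref_profile)
    rows = []
    for start in range(len(sequence) - width + 1):
        window = [scale[res] for res in sequence[start:start + width]]
        rows.append((start, pearson(ref_profile, window), sum(window) / width))
    return rows


# Invented query and 5-residue reference segment, for illustration only.
rows = scan("GSAKDEVLIWGNT", "EVLIW")
best = max(rows, key=lambda row: row[1])
print("best window starts at", best[0], "similarity", round(best[1], 3))
```

Plotting the similarity and the window mean against position gives two profiles per property, and candidate sites are the positions where both resemble the reference.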

  17. Color transitions in coral's fluorescent proteins by site-directed mutagenesis

    PubMed Central

    Gurskaya, Nadya G; Savitsky, Alexander P; Yanushevich, Yurii G; Lukyanov, Sergey A; Lukyanov, Konstantin A

    2001-01-01

    Background Green Fluorescent Protein (GFP) cloned from jellyfish Aequorea victoria and its homologs from corals Anthozoa have a great practical significance as in vivo markers of gene expression. Also, they are an interesting puzzle of protein science due to an unusual mechanism of chromophore formation and diversity of fluorescent colors. Fluorescent proteins can be subdivided into cyan (~ 485 nm), green (~ 505 nm), yellow (~ 540 nm), and red (>580 nm) emitters. Results Here we applied site-directed mutagenesis in order to investigate the structural background of color variety and possibility of shifting between different types of fluorescence. First, a blue-shifted mutant of cyan amFP486 was generated. Second, it was established that cyan and green emitters can be modified so as to produce an intermediate spectrum of fluorescence. Third, the relationship between green and yellow fluorescence was inspected on closely homologous green zFP506 and yellow zFP538 proteins. The following transitions of colors were performed: yellow to green; yellow to dual color (green and yellow); and green to yellow. Fourth, we generated a mutant of cyan emitter dsFP483 that demonstrated dual color (cyan and red) fluorescence. Conclusions Several amino acid substitutions were found to strongly affect fluorescence maxima. Some positions primarily found by sequence comparison were proved to be crucial for fluorescence of particular color. These results are the first step towards predicting the color of natural GFP-like proteins corresponding to newly identified cDNAs from corals. PMID:11459517

  18. Quantitative evaluation of protein conformation in pharmaceuticals using cross-linking reactions coupled with LC-MS/MS analysis.

    PubMed

    Yamaguchi, Hideto; Hirakura, Yutaka; Shirai, Hiroki; Mimura, Hisashi; Toyo'oka, Toshimasa

    2011-06-01

    The need for a simple and high-throughput method for identifying the tertiary structure of protein pharmaceuticals has increased. In this study, a simple method for mapping the protein fold is proposed for use as a complementary quality test. This method is based on cross-linking a protein using bis(sulfosuccinimidyl)suberate (BS(3)), followed by peptide mapping by LC-MS. Consensus interferon (CIFN) was used as the model protein. The tryptic map obtained via liquid chromatography tandem mass spectrometry (LC-MS/MS) and the mass mapping obtained via matrix-assisted laser desorption/ionization time-of-flight mass spectrometry were used to identify cross-linked peptides. While LC-MS/MS analyses found that BS(3) formed cross-links in the loop region of the protein, which was regarded as the biologically active site, sodium dodecyl-sulfate polyacrylamide gel electrophoresis demonstrated that cross-linking occurred within a protein molecule, but not between protein molecules. The occurrence of cross-links at the active site depends greatly on the conformation of the protein, which is determined by the denaturing conditions. Quantitative evaluation of the tertiary structure of CIFN was thus possible by monitoring the amounts of cross-linked peptides generated. Assuming that background information is available at the development stage, this method may be applicable to process development as a complementary test for quality control. Copyright © 2011 Elsevier B.V. All rights reserved.

  19. Investigating the Structural Impacts of I64T and P311S Mutations in APE1-DNA Complex: A Molecular Dynamics Approach

    PubMed Central

    Doss, C. George Priya; NagaSundaram, N.

    2012-01-01

    Background Elucidating the molecular dynamic behavior of a protein-DNA complex upon mutation is crucial in current genomics. The molecular dynamics approach reveals the changes, on incorporation of variants, that dictate the structure and function of protein-DNA complexes. Deleterious mutations in the APE1 protein modify the physicochemical properties of amino acids that affect protein stability and dynamic behavior. Further, these mutations disrupt the binding sites and prevent the protein from forming complexes with its interacting DNA. Principal Findings In this study, we developed a rapid and cost-effective method to analyze variants in the APE1 gene that are associated with disease susceptibility and evaluated their impacts on the dynamic behavior of the APE1-DNA complex. Initially, two different in silico approaches were used to identify deleterious variants in the APE1 gene. Variants scored as deleterious by both approaches were considered, and on that basis two nsSNPs with IDs rs61730854 (I64T) and rs1803120 (P311S) were taken further for structural analysis. Significance Different parameters such as RMSD, RMSF, salt bridges, H-bonds and SASA applied in the molecular dynamics study reveal that the predicted deleterious variants I64T and P311S alter the structure and affect the stability of APE1-DNA interactions. This study presents such new methods for validating functional polymorphisms of human APE1, which is critically involved in causing a deficit in repair capacity, which in turn leads to genetic instability and carcinogenesis. PMID:22384055
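Two of the trajectory measures named above, RMSD and RMSF, have simple definitions that a short Python sketch can make concrete. The code below assumes the frames are already superimposed on a common reference, a step that real molecular dynamics analyses perform first and that is omitted here. The two-atom "trajectory" is invented, and the function names are hypothetical.

```python
import math


def rmsd(a, b):
    """Root-mean-square deviation between two equally sized lists of (x, y, z)."""
    squared = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(squared / len(a))


def rmsf(frames):
    """Per-atom root-mean-square fluctuation about the mean position."""
    n_frames = len(frames)
    fluctuations = []
    for i in range(len(frames[0])):
        # Time-averaged position of atom i.
        mean = [sum(frame[i][k] for frame in frames) / n_frames for k in range(3)]
        # Mean squared displacement of atom i from that average.
        msf = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3)) for frame in frames
        ) / n_frames
        fluctuations.append(math.sqrt(msf))
    return fluctuations


# Invented two-frame, two-atom trajectory: atom 0 is fixed, atom 1 moves along x.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
print("RMSD between frames:", rmsd(frames[0], frames[1]))
print("RMSF per atom:", rmsf(frames))
```

RMSD summarizes how far a whole structure has moved from a reference, while RMSF localizes the motion to individual residues, which is why the two are usually reported side by side.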

  20. Cancer Associated E17K Mutation Causes Rapid Conformational Drift in AKT1 Pleckstrin Homology (PH) Domain

    PubMed Central

    Kumar, Ambuj; Purohit, Rituraj

    2013-01-01

    Background AKT1 (v-akt murine thymoma viral oncogene homologue 1) kinase belongs to one of the most frequently activated proliferation and survival pathways in cancer. Recently it has been shown that the E17K mutation in the Pleckstrin Homology (PH) domain of the AKT1 protein leads to cancer by amplifying the phosphorylation and membrane localization of the protein. The mutant has shown resistance to the AKT1/2 inhibitor VIII drug molecule. In this study we have demonstrated the detailed structural and molecular consequences associated with the activity regulation of the mutant protein. Methods The docking score exhibited significant loss in the interaction affinity to the AKT1/2 inhibitor VIII drug molecule. Furthermore, the molecular dynamics simulation studies presented evidence of rapid conformational drift in the mutant structure. Results There was no stability loss in the mutant as compared to the native structure, and the major cation–π interactions were also shown to be retained. Moreover, the active residues involved in membrane localization of the protein exhibited a significant rise in NHbond formation in the mutant. The rise in NHbond formation in active residues accounts for the 4-fold increase in the membrane localization potential of the protein. Conclusion The overall result suggested that, although the mutation did not induce any stability loss in the structure, the associated pathological consequences might have occurred due to the rapid conformational drifts observed in the mutant AKT1 PH domain. General Significance The methodology implemented and the results obtained in this work will facilitate determining the core molecular mechanisms of cancer-associated mutations and designing their potential drug inhibitors. PMID:23741320

  1. Predicting PDZ domain mediated protein interactions from structure

    PubMed Central

    2013-01-01

    Background PDZ domains are structural protein domains that recognize simple linear amino acid motifs, often at protein C-termini, and mediate protein-protein interactions (PPIs) in important biological processes, such as ion channel regulation, cell polarity and neural development. PDZ domain-peptide interaction predictors have been developed based on domain and peptide sequence information. Since domain structure is known to influence binding specificity, we hypothesized that structural information could be used to predict new interactions compared to sequence-based predictors. Results We developed a novel computational predictor of PDZ domain and C-terminal peptide interactions using a support vector machine trained with PDZ domain structure and peptide sequence information. Performance was estimated using extensive cross validation testing. We used the structure-based predictor to scan the human proteome for ligands of 218 PDZ domains and show that the predictions correspond to known PDZ domain-peptide interactions and PPIs in curated databases. The structure-based predictor is complementary to the sequence-based predictor, finding unique known and novel PPIs, and is less dependent on training–testing domain sequence similarity. We used a functional enrichment analysis of our hits to create a predicted map of PDZ domain biology. This map highlights PDZ domain involvement in diverse biological processes, some only found by the structure-based predictor. Based on this analysis, we predict novel PDZ domain involvement in xenobiotic metabolism and suggest new interactions for other processes including wound healing and Wnt signalling. Conclusions We built a structure-based predictor of PDZ domain-peptide interactions, which can be used to scan C-terminal proteomes for PDZ interactions. 
We also show that the structure-based predictor finds many known PDZ mediated PPIs in human that were not found by our previous sequence-based predictor and is less dependent on training–testing domain sequence similarity. Using both predictors, we defined a functional map of human PDZ domain biology and predict novel PDZ domain function. Users may access our structure-based and previous sequence-based predictors at http://webservice.baderlab.org/domains/POW. PMID:23336252

  2. Single-Domain Parvulins Constitute a Specific Marker for Recently Proposed Deep-Branching Archaeal Subgroups

    PubMed Central

    Lederer, Christoph; Heider, Dominik; van den Boom, Johannes; Hoffmann, Daniel; Mueller, Jonathan W.; Bayer, Peter

    2011-01-01

    Peptidyl-prolyl cis/trans isomerases (PPIases) are enzymes assisting protein folding and protein quality control in organisms of all kingdoms of life. In contrast to the other sub-classes of PPIases, the cyclophilins and the FK-506 binding proteins, little was formerly known about the parvulin type of PPIase in Archaea. Recently, the first solution structure of an archaeal parvulin, the PinA protein from Cenarchaeum symbiosum, was reported. Investigation of occurrence and frequency of PPIase sequences in numerous archaeal genomes now revealed a strong tendency for thermophilic microorganisms to reduce the number of PPIases. Single-domain parvulins were mostly found in the genomes of recently proposed deep-branching archaeal subgroups, the Thaumarchaeota and the ARMANs (archaeal Richmond Mine acidophilic nanoorganisms). Hence, we used the parvulin sequence to reclassify available archaeal metagenomic contigs, thereby, adding new members to these subgroups. A combination of genomic background analysis and phylogenetic approaches of parvulin sequences suggested that the assigned sequences belong to at least two distinct groups of Thaumarchaeota. Finally, machine learning approaches were applied to identify amino acid residues that separate archaeal and bacterial parvulin proteins from each other. When mapped onto the recent PinA solution structure, most of these positions form a cluster at one site of the protein possibly indicating a different functionality of the two groups of parvulin proteins. PMID:22065628

  3. Protein structure based prediction of catalytic residues

    PubMed Central

    2013-01-01

    Background Worldwide structural genomics projects continue to release new protein structures at an unprecedented pace, so far nearly 6000, but only about 60% of these proteins have any sort of functional annotation. Results We explored a range of features that can be used for the prediction of functional residues given a known three-dimensional structure. These features include various centrality measures of nodes in graphs of interacting residues: closeness, betweenness and page-rank centrality. We also analyzed the distance of functional amino acids to the general center of mass (GCM) of the structure, relative solvent accessibility (RSA), and the use of relative entropy as a measure of sequence conservation. From the selected features, neural networks were trained to identify catalytic residues. We found that using distance to the GCM together with amino acid type provides a good discriminant function, when combined independently with sequence conservation. Using an independent test set of 29 annotated protein structures, the method returned 411 of the initial 9262 residues as the most likely to be involved in function. The output 411 residues contain 70 of the annotated 111 catalytic residues. This represents an approximately 14-fold enrichment of catalytic residues on the entire input set (corresponding to a sensitivity of 63% and a precision of 17%), a performance competitive with that of other state-of-the-art methods. Conclusions We found that several of the graph based measures utilize the same underlying feature of protein structures, which can be simply and more effectively captured with the distance to GCM definition. This also has the added advantage of simplicity and easy implementation. Meanwhile sequence conservation remains by far the most influential feature in identifying functional residues. 
We also found that due to the rapid changes in size and composition of sequence databases, conservation calculations must be recalibrated for specific reference databases. PMID:23433045
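
    The enrichment, sensitivity and precision figures quoted in this abstract follow directly from the four counts it reports, as the short sketch below shows. The second function illustrates a distance-to-center feature; it assumes the "general center of mass" can be approximated by the unweighted mean of the residue coordinates, which is an assumption made here and not the authors' stated definition. All function names are hypothetical.

```python
import math

def enrichment_stats(n_total, n_catalytic, n_predicted, n_hits):
    """Return (sensitivity, precision, fold enrichment) from raw counts."""
    sensitivity = n_hits / n_catalytic        # fraction of catalytic residues recovered
    precision = n_hits / n_predicted          # fraction of predictions that are catalytic
    background_rate = n_catalytic / n_total   # catalytic fraction of the whole input set
    return sensitivity, precision, precision / background_rate

def distance_to_center(coords):
    """Distance of each residue to the unweighted centroid (assumed stand-in for the GCM)."""
    n = len(coords)
    center = tuple(sum(c[k] for c in coords) / n for k in range(3))
    return [math.dist(c, center) for c in coords]

# Counts taken from the abstract: 9262 residues, 111 catalytic, 411 returned, 70 correct
sens, prec, fold = enrichment_stats(9262, 111, 411, 70)
print(f"sensitivity={sens:.2f} precision={prec:.2f} enrichment={fold:.1f}-fold")
# prints: sensitivity=0.63 precision=0.17 enrichment=14.2-fold
```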

  4. Quantification of the impact of PSI:Biology according to the annotations of the determined structures

    PubMed Central

    2013-01-01

    Background Protein Structure Initiative:Biology (PSI:Biology) is the third phase of PSI where protein structures are determined in high-throughput to characterize their biological functions. The transition to the third phase entailed the formation of PSI:Biology Partnerships which are composed of structural genomics centers and biomedical science laboratories. We present a method to examine the impact of protein structures determined under the auspices of PSI:Biology by measuring their rates of annotations. The mean numbers of annotations per structure and per residue are examined. These are designed to provide measures of the amount of structure to function connections that can be leveraged from each structure. Results One result is that PSI:Biology structures are found to have a higher rate of annotations than structures determined during the first two phases of PSI. A second result is that the subset of PSI:Biology structures determined through PSI:Biology Partnerships have a higher rate of annotations than those determined exclusive of those partnerships. Both results hold when the annotation rates are examined either at the level of the entire protein or for annotations that are known to fall at specific residues within the portion of the protein that has a determined structure. Conclusions We conclude that PSI:Biology determines structures that are estimated to have a higher degree of biomedical interest than those determined during the first two phases of PSI based on a broad array of biomedical annotations. For the PSI:Biology Partnerships, we see that there is an associated added value that represents part of the progress toward the goals of PSI:Biology. We interpret the added value to mean that team-based structural biology projects that utilize the expertise and technologies of structural genomics centers together with biological laboratories in the community are conducted in a synergistic manner. 
We show that the annotation rates can be used in conjunction with established metrics, i.e. the numbers of structures and impact of publication records, to monitor the progress of PSI:Biology towards its goals of examining structure to function connections of high biomedical relevance. The metric provides an objective means to quantify the overall impact of PSI:Biology as it uses biomedical annotations from external sources. PMID:24139526

  5. Maximizing the quantitative accuracy and reproducibility of Förster resonance energy transfer measurement for screening by high throughput widefield microscopy

    PubMed Central

    Schaufele, Fred

    2013-01-01

    Förster resonance energy transfer (FRET) between fluorescent proteins (FPs) provides insights into the proximities and orientations of FPs as surrogates of the biochemical interactions and structures of the factors to which the FPs are genetically fused. As powerful as FRET methods are, technical issues have impeded their broad adoption in the biologic sciences. One hurdle to accurate and reproducible FRET microscopy measurement stems from variable fluorescence backgrounds both within a field and between different fields. Those variations introduce errors into the precise quantification of fluorescence levels on which the quantitative accuracy of FRET measurement is highly dependent. This measurement error is particularly problematic for screening campaigns since minimal well-to-well variation is necessary to faithfully identify wells with altered values. High content screening depends also upon maximizing the numbers of cells imaged, which is best achieved by low magnification high throughput microscopy. But, low magnification introduces flat-field correction issues that degrade the accuracy of background correction to cause poor reproducibility in FRET measurement. For live cell imaging, fluorescence of cell culture media in the fluorescence collection channels for the FPs commonly used for FRET analysis is a high source of background error. These signal-to-noise problems are compounded by the desire to express proteins at biologically meaningful levels that may only be marginally above the strong fluorescence background. Here, techniques are presented that correct for background fluctuations. Accurate calculation of FRET is realized even from images in which a non-flat background is 10-fold higher than the signal. PMID:23927839

  6. Recombinant probes for visualizing endogenous synaptic proteins in living neurons

    PubMed Central

    Gross, Garrett G.; Junge, Jason A.; Mora, Rudy J.; Kwon, Hyung-Bae; Olson, C. Anders; Takahashi, Terry T.; Liman, Emily R.; Ellis-Davies, Graham C.R.; McGee, Aaron W.; Sabatini, Bernardo L.; Roberts, Richard W.; Arnold, Don B.

    2013-01-01

    Summary The ability to visualize endogenous proteins in living neurons provides a powerful means to interrogate neuronal structure and function. Here we generate recombinant antibody-like proteins, termed FingRs (Fibronectin intrabodies generated with mRNA display), that bind endogenous neuronal proteins PSD-95 and Gephyrin with high affinity and which, when fused to GFP, allow excitatory and inhibitory synapses to be visualized in living neurons. Design of the FingR incorporates a novel transcriptional regulation system that ties FingR expression to the level of the target and reduces background fluorescence. In dissociated neurons and brain slices FingRs generated against PSD-95 and Gephyrin did not affect the expression patterns of their endogenous target proteins or the number or strength of synapses. Together, our data indicate that PSD-95 and Gephyrin FingRs can report the localization and amount of endogenous synaptic proteins in living neurons and thus may be used to study changes in synaptic strength in vivo. PMID:23791193

  7. Unfoldomics of human diseases: linking protein intrinsic disorder with diseases

    PubMed Central

    Uversky, Vladimir N; Oldfield, Christopher J; Midic, Uros; Xie, Hongbo; Xue, Bin; Vucetic, Slobodan; Iakoucheva, Lilia M; Obradovic, Zoran; Dunker, A Keith

    2009-01-01

    Background Intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs) lack stable tertiary and/or secondary structure yet fulfill key biological functions. The recent recognition of IDPs and IDRs is leading to an entire field aimed at their systematic structural characterization and at determination of their mechanisms of action. Bioinformatics studies showed that IDPs and IDRs are highly abundant in different proteomes and carry out mostly regulatory functions related to molecular recognition and signal transduction. These activities complement the functions of structured proteins. IDPs and IDRs were shown to participate in both one-to-many and many-to-one signaling. Alternative splicing and posttranslational modifications are frequently used to tune the IDP functionality. Several individual IDPs were shown to be associated with human diseases, such as cancer, cardiovascular disease, amyloidoses, diabetes, neurodegenerative diseases, and others. This raises questions regarding the involvement of IDPs and IDRs in various diseases. Results IDPs and IDRs were shown to be highly abundant in proteins associated with various human maladies. As the number of IDPs related to various diseases was found to be very large, the concepts of the disease-related unfoldome and unfoldomics were introduced. Novel bioinformatics tools were proposed to populate and characterize the disease-associated unfoldome. Structural characterization of the members of the disease-related unfoldome requires specialized experimental approaches. IDPs possess a number of unique structural and functional features that determine their broad involvement in the pathogenesis of various diseases. Conclusion Proteins associated with various human diseases are enriched in intrinsic disorder. These disease-associated IDPs and IDRs are real, abundant, diversified, vital, and dynamic. 
These proteins and regions comprise the disease-related unfoldome, which covers a significant part of the human proteome. Profound association between intrinsic disorder and various human diseases is determined by a set of unique structural and functional characteristics of IDPs and IDRs. Unfoldomics of human diseases utilizes unrivaled bioinformatics and experimental techniques, paves the road for better understanding of human diseases, their pathogenesis and molecular mechanisms, and helps develop new strategies for the analysis of disease-related proteins. PMID:19594884

  8. Crystal Structure of a Four-Layer Aggregate of Engineered TMV CP Implies the Importance of Terminal Residues for Oligomer Assembly

    PubMed Central

    Li, Xiangyang; Song, Baoan; Chen, Xi; Wang, Zhenchao; Zeng, Mengjiao; Yu, Dandan; Hu, Deyu; Chen, Zhuo; Jin, Linhong; Yang, Song; Yang, Caiguang; Chen, Baoen

    2013-01-01

    Background Crystal structures of the tobacco mosaic virus (TMV) coat protein (CP) in its helical and disk conformations have previously been determined at the atomic level. For the helical structure, interactions of proteins and nucleic acids in the main chains were clearly observed; however, the conformation of residues at the C-terminus was flexible and disordered. For the four-layer aggregate disk structure, interactions of the main chain residues could only be observed through water–mediated hydrogen bonding with protein residues. In this study, the effects of the C-terminal peptides on the interactions of TMV CP were investigated by crystal structure determination. Methodology/Principal Findings The crystal structure of a genetically engineered TMV CP was resolved at 3.06 Å. For the genetically engineered TMV CP, a six-histidine (His) tag was introduced at the N-terminus, and the C-terminal residues 155 to 158 were truncated (N-His-TMV CP19). Overall, N-His-TMV CP19 protein self-assembled into the four-layer aggregate form. The conformations of residues Gln36, Thr59, Asp115 and Arg134 were carefully analyzed in the high radius and low radius regions of N-His-TMV CP19, which were found to be significantly different from those observed previously for the helical and four-layer aggregate forms. In addition, the aggregation of the N-His-TMV CP19 layers was found to primarily be mediated through direct hydrogen-bonding. Notably, this engineered protein also can package RNA effectively and assemble into an infectious virus particle. Conclusion The terminal sequence of amino acids influences the conformation and interactions of the four-layer aggregate. Direct protein–protein interactions are observed in the major overlap region when residues Gly155 to Thr158 at the C-terminus are truncated. This engineered TMV CP is reassembled by direct protein–protein interaction and maintains the normal function of the four-layer aggregate of TMV CP in the presence of RNA. 
PMID:24223721

  9. Solution structure of the Legionella pneumophila Mip-rapamycin complex

    PubMed Central

    Ceymann, Andreas; Horstmann, Martin; Ehses, Philipp; Schweimer, Kristian; Paschke, Anne-Katrin; Steinert, Michael; Faber, Cornelius

    2008-01-01

    Background Legionella pneumophila is the causative agent of Legionnaires' disease. A major virulence factor of the pathogen is the homodimeric surface protein Mip. It shows peptidyl-prolyl cis/trans isomerase activity and is a receptor of FK506 and rapamycin, which both inhibit its enzymatic function. Insight into the binding process may be used for the design of novel Mip inhibitors as potential drugs against Legionnaires' disease. Results We have solved the solution structure of free Mip77–213 and the Mip77–213-rapamycin complex by NMR spectroscopy. Mip77–213 showed the typical FKBP-fold and only minor rearrangements upon binding of rapamycin. Apart from the configuration of a flexible hairpin loop, which is partly stabilized upon binding, the solution structure confirms the crystal structure. Comparisons to the structures of free FKBP12 and the FKBP12-rapamycin complex suggested an identical binding mode for both proteins. Conclusion The structural similarity of the Mip-rapamycin and FKBP12-rapamycin complexes suggests that FKBP12 ligands may be promising starting points for the design of novel Mip inhibitors. The search for a novel drug against Legionnaires' disease may therefore benefit from the large variety of known FKBP12 inhibitors. PMID:18366641

  10. Mechanical dynamics in live cells and fluorescence-based force/tension sensors

    PubMed Central

    Yang, Chao; Zhang, Xiaohan; Guo, Yichen; Meng, Fanjie; Sachs, Frederick; Guo, Jun

    2016-01-01

    Three signaling systems play fundamental roles in modulating cell activities: chemical, electrical, and mechanical. While the former two are well studied, the mechanical signaling system is still elusive because of the lack of methods to measure structural forces in real time at cellular and subcellular levels. Indeed, almost all biological processes are responsive to modulation by mechanical forces that trigger dispersive downstream electrical and biochemical pathways. Communication among the three systems is essential to make cells and tissues receptive to environmental changes. Cells have evolved many sophisticated mechanisms for the generation, perception and transduction of mechanical forces, including motor proteins and mechanosensors. In this review, we introduce some background information about mechanical dynamics in live cells, including the ubiquitous mechanical activity, various types of mechanical stimuli exerted on cells and the different mechanosensors. We also summarize recent results obtained using genetically encoded FRET (fluorescence resonance energy transfer)-based force/tension sensors, a new technique used to measure mechanical forces in structural proteins. The sensors have been incorporated into many specific structural proteins and have measured the force gradients in real time within live cells, tissues, and animals. PMID:25958335

  11. The Regulatory Subunit of PKA-I Remains Partially Structured and Undergoes β-Aggregation upon Thermal Denaturation

    PubMed Central

    Dao, Khanh K.; Pey, Angel L.; Gjerde, Anja Underhaug; Teigen, Knut; Byeon, In-Ja L.; Døskeland, Stein O.; Gronenborn, Angela M.; Martinez, Aurora

    2011-01-01

    Background The regulatory subunit (R) of cAMP-dependent protein kinase (PKA) is a modular flexible protein that responds with large conformational changes to the binding of the effector cAMP. Considering its highly dynamic nature, the protein is rather stable. We studied the thermal denaturation of full-length RIα and a truncated RIα(92-381) that contains the tandem cyclic nucleotide binding (CNB) domains A and B. Methodology/Principal Findings As revealed by circular dichroism (CD) and differential scanning calorimetry, both RIα proteins contain significant residual structure in the heat-denatured state. As evidenced by CD, the predominantly α-helical spectrum at 25°C with double negative peaks at 209 and 222 nm changes to a spectrum with a single negative peak at 212–216 nm, characteristic of β-structure. A similar α→β transition occurs at higher temperature in the presence of cAMP. Thioflavin T fluorescence and atomic force microscopy studies support the notion that the structural transition is associated with cross-β-intermolecular aggregation and formation of non-fibrillar oligomers. Conclusions/Significance Thermal denaturation of RIα leads to partial loss of native packing with exposure of aggregation-prone motifs, such as the B' helices in the phosphate-binding cassettes of both CNB domains. The topology of the β-sandwiches in these domains favors inter-molecular β-aggregation, which is suppressed in the ligand-bound states of RIα under physiological conditions. Moreover, our results reveal that the CNB domains persist as structural cores through heat-denaturation. PMID:21394209

  12. An improved method to detect correct protein folds using partial clustering

    PubMed Central

    2013-01-01

    Background Structure-based clustering is commonly used to identify correct protein folds among candidate folds (also called decoys) generated by protein structure prediction programs. However, traditional clustering methods exhibit poor runtime performance on large decoy sets. We hypothesized that a more efficient “partial” clustering approach in combination with an improved scoring scheme could significantly improve both the speed and performance of existing candidate selection methods. Results We propose a new scheme that performs rapid but incomplete clustering on protein decoys. Our method detects structurally similar decoys (measured using either Cα RMSD or GDT-TS score) and extracts representatives from them without assigning every decoy to a cluster. We integrated our new clustering strategy with several different scoring functions to assess both the performance and speed in identifying correct or near-correct folds. Experimental results on 35 Rosetta decoy sets and 40 I-TASSER decoy sets show that our method can improve the correct fold detection rate as assessed by two different quality criteria. This improvement is significantly better than two recently published clustering methods, Durandal and Calibur-lite. Speed and efficiency testing shows that our method can handle much larger decoy sets and is up to 22 times faster than Durandal and Calibur-lite. Conclusions The new method, named HS-Forest, avoids the computationally expensive task of clustering every decoy, yet still allows superior correct-fold selection. Its improved speed, efficiency and decoy-selection performance should enable structure prediction researchers to work with larger decoy sets and significantly improve their ab initio structure prediction performance. PMID:23323835

  13. Molecular properties of muscarinic acetylcholine receptors

    PubMed Central

    HAGA, Tatsuya

    2013-01-01

    Muscarinic acetylcholine receptors, which comprise five subtypes (M1-M5 receptors), are expressed in both the CNS and PNS (particularly the target organs of parasympathetic neurons). M1-M5 receptors are integral membrane proteins with seven transmembrane segments, bind with acetylcholine (ACh) in the extracellular phase, and thereafter interact with and activate GTP-binding regulatory proteins (G proteins) in the intracellular phase: M1, M3, and M5 receptors interact with Gq-type G proteins, and M2 and M4 receptors with Gi/Go-type G proteins. Activated G proteins initiate a number of intracellular signal transduction systems. Agonist-bound muscarinic receptors are phosphorylated by G protein-coupled receptor kinases, which initiate their desensitization through uncoupling from G proteins, receptor internalization, and receptor breakdown (down regulation). Recently the crystal structures of M2 and M3 receptors were determined and are expected to contribute to the development of drugs targeted to muscarinic receptors. This paper summarizes the molecular properties of muscarinic receptors with reference to the historical background and bias to studies performed in our laboratories. PMID:23759942

  14. Immunopurification of adenomatous polyposis coli (APC) proteins

    PubMed Central

    2013-01-01

    Background The adenomatous polyposis coli (APC) tumour suppressor gene encodes a 2843 residue (310 kDa) protein. APC is a multifunctional protein involved in the regulation of β-catenin/Wnt signalling, cytoskeletal dynamics and cell adhesion. APC mutations occur in most colorectal cancers and typically result in truncation of the C-terminal half of the protein. Results In order to investigate the biophysical properties of APC, we have generated a set of monoclonal antibodies which enable purification of recombinant forms of APC. Here we describe the characterisation of these anti-APC monoclonal antibodies (APC-NT) that specifically recognise endogenous APC both in solution and in fixed cells. Full-length APC(1–2843) and cancer-associated, truncated APC proteins, APC(1–1638) and APC(1–1311) were produced in Sf9 insect cells. Conclusions Recombinant APC proteins were purified using a two-step affinity approach using our APC-NT antibodies. The purification of APC proteins provides the basis for detailed structure/function analyses of full-length, cancer-truncated and endogenous forms of the protein. PMID:24156781

  15. A simple and fast heuristic for protein structure comparison

    PubMed Central

    Pelta, David A; González, Juan R; Moreno Vega, Marcos

    2008-01-01

    Background Protein structure comparison is a key problem in bioinformatics. Several methods exist for comparing proteins, one of which is to solve the Maximum Contact Map Overlap problem (MAX-CMO). Although this problem may be solved using exact algorithms, researchers require approximate algorithms that obtain good-quality solutions using fewer computational resources than the former. Results We propose a variable neighborhood search metaheuristic for solving MAX-CMO. We analyze this strategy in two aspects: 1) from an optimization point of view, the strategy is tested on two different datasets, obtaining an error of 3.5% (over 2702 pairs) and 1.7% (over 161 pairs) with respect to optimal values, thus leading to highly accurate solutions in a simpler and less expensive way than exact algorithms; 2) in terms of protein structure classification, we conduct experiments on three datasets and show that it is feasible to detect structural similarities at SCOP's family and CATH's architecture levels using normalized overlap values. Some limitations and the role of normalization are outlined for doing classification at SCOP's fold level. Conclusion We designed, implemented and tested a new tool for solving MAX-CMO, based on a well-known metaheuristic technique. The good balance between solution quality and computational effort makes it a valuable tool. Moreover, to the best of our knowledge, this is the first time the MAX-CMO measure is tested at SCOP's fold and CATH's architecture levels with encouraging results. Software is available for download at . PMID:18366735
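
    To make the quantity being maximized in MAX-CMO concrete, the sketch below builds contact maps from Cα coordinates and counts the contacts shared under one given alignment. It does not search for the best alignment, so it is not the variable neighborhood search described in the abstract. The 7.5 Å cutoff, the minimum sequence separation, the normalization formula and all function names are assumptions chosen for illustration.

```python
import math
from itertools import combinations

def contact_map(coords, cutoff=7.5, min_sep=2):
    """Set of residue index pairs (i, j), i < j, closer than `cutoff` (assumed value, in Å)."""
    contacts = set()
    for i, j in combinations(range(len(coords)), 2):
        if j - i >= min_sep and math.dist(coords[i], coords[j]) <= cutoff:
            contacts.add((i, j))
    return contacts

def overlap(contacts_a, contacts_b, alignment):
    """Count contacts of A whose aligned residue pair is also a contact in B.

    `alignment` maps residue indices of A to residue indices of B and is
    assumed to be order-preserving, as required by the MAX-CMO formulation.
    """
    shared = 0
    for i, j in contacts_a:
        if i in alignment and j in alignment:
            a, b = alignment[i], alignment[j]
            if (min(a, b), max(a, b)) in contacts_b:
                shared += 1
    return shared

def normalized_overlap(contacts_a, contacts_b, alignment):
    """One possible normalization (assumed): 2 * overlap / (|A| + |B|)."""
    total = len(contacts_a) + len(contacts_b)
    return 2 * overlap(contacts_a, contacts_b, alignment) / total if total else 0.0

# Toy four-residue structures (illustrative coordinates only)
struct_a = [(0, 0, 0), (4, 0, 0), (4, 4, 0), (0, 4, 0)]
struct_b = [(0, 0, 0), (4, 0, 0), (4, 4, 0), (0, 20, 0)]
cm_a, cm_b = contact_map(struct_a), contact_map(struct_b)
identity = {i: i for i in range(4)}
print(overlap(cm_a, cm_b, identity), normalized_overlap(cm_a, cm_b, identity))
```

    A heuristic solver would repeatedly modify the alignment and keep the changes that increase this overlap count.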

  16. Crimean-Congo Hemorrhagic Fever Virus Gn Bioinformatic Analysis and Construction of a Recombinant Bacmid in Order to Express Gn by Baculovirus Expression System

    PubMed Central

    Rahpeyma, Mehdi; Fotouhi, Fatemeh; Makvandi, Manouchehr; Ghadiri, Ata; Samarbaf-Zadeh, Alireza

    2015-01-01

    Background Crimean-Congo hemorrhagic fever virus (CCHFV) is a member of the genus Nairovirus in the family Bunyaviridae and causes a life-threatening disease in humans. Currently, there is no vaccine against CCHFV, and the detailed structure of CCHFV proteins remains undefined. The CCHFV M RNA segment encodes two viral surface glycoproteins known as Gn and Gc. Viral glycoproteins can be considered key targets for vaccine development. Objectives The current study aimed to perform a structural bioinformatic analysis of the CCHFV Gn protein and to design a construct for generating a recombinant bacmid for expression in the baculovirus system. Materials and Methods The aim was to express the Gn protein in insect cells for use as an antigen in animal-model vaccine studies. Bioinformatic analysis of the CCHFV Gn protein was performed, and a construct was designed and cloned into the pFastBacHTb vector; a recombinant Gn-bacmid was then generated with the Bac-to-Bac system. Results The primary, secondary, and 3D structures of CCHFV Gn were obtained, and PCR with M13 forward and reverse primers confirmed the generation of recombinant bacmid DNA harboring the Gn coding region under the polyhedrin promoter. Conclusions Characterization of the detailed structure of CCHFV Gn by bioinformatics software provides the basis for development of new experiments and construction of a recombinant bacmid harboring CCHFV Gn, which is valuable for designing a recombinant vaccine against deadly pathogens like CCHFV. PMID:26862379

  17. Elucidating the ensemble of functionally-relevant transitions in protein systems with a robotics-inspired method

    PubMed Central

    2013-01-01

    Background Many proteins tune their biological function by transitioning between different functional states, effectively acting as dynamic molecular machines. Detailed structural characterization of transition trajectories is central to understanding the relationship between protein dynamics and function. Computational approaches that build on the Molecular Dynamics framework are in principle able to model transition trajectories in great detail but also at considerable computational cost. Methods that delay consideration of dynamics and focus instead on elucidating energetically-credible conformational paths connecting two functionally-relevant structures provide a complementary approach. Effective sampling-based path planning methods originating in robotics have been recently proposed to produce conformational paths. These methods largely model short peptides or address large proteins by simplifying conformational space. Methods We propose a robotics-inspired method that connects two given structures of a protein by sampling conformational paths. The method focuses on small- to medium-size proteins, efficiently modeling structural deformations through the use of the molecular fragment replacement technique. In particular, the method grows a tree in conformational space rooted at the start structure, steering the tree to a goal region defined around the goal structure. We investigate various bias schemes over a progress coordinate for balance between coverage of conformational space and progress towards the goal. A geometric projection layer promotes path diversity. A reactive temperature scheme allows sampling of rare paths that cross energy barriers. Results and conclusions Experiments are conducted on small- to medium-size proteins of length up to 214 amino acids and with multiple known functionally-relevant states, some of which are more than 13 Å apart from each other. 
Analysis reveals that the method effectively obtains conformational paths connecting structural states that are significantly different. A detailed analysis on the depth and breadth of the tree suggests that a soft global bias over the progress coordinate enhances sampling and results in higher path diversity. The explicit geometric projection layer that biases the exploration away from over-sampled regions further increases coverage, often improving proximity to the goal by forcing the exploration to find new paths. The reactive temperature scheme is shown effective in increasing path diversity, particularly in difficult structural transitions with known high-energy barriers. PMID:24565158

  18. The C-Terminal Sequence of RhoB Directs Protein Degradation through an Endo-Lysosomal Pathway

    PubMed Central

    Ramos, Irene; Herrera, Mónica; Stamatakis, Konstantinos

    2009-01-01

    Background Protein degradation is essential for cell homeostasis. Targeting of proteins for degradation is often achieved by specific protein sequences or posttranslational modifications such as ubiquitination. Methodology/Principal Findings By using biochemical and genetic tools we have monitored the localization and degradation of endogenous and chimeric proteins in live primary cells by confocal microscopy and ultra-structural analysis. Here we identify an eight amino acid sequence from the C-terminus of the short-lived GTPase RhoB that directs the rapid degradation of both RhoB and chimeric proteins bearing this sequence through a lysosomal pathway. Elucidation of the RhoB degradation pathway unveils a mechanism dependent on protein isoprenylation and palmitoylation that involves sorting of the protein into multivesicular bodies, mediated by the ESCRT machinery. Moreover, RhoB sorting is regulated by late endosome specific lipid dynamics and is altered in human genetic lipid traffic disease. Conclusions/Significance Our findings characterize a short-lived cytosolic protein that is degraded through a lysosomal pathway. In addition, we define a novel motif for protein sorting and rapid degradation, which allows controlling protein levels by means of clinically used drugs. PMID:19956591

  19. Visualisation and graph-theoretic analysis of a large-scale protein structural interactome

    PubMed Central

    Bolser, Dan; Dafas, Panos; Harrington, Richard; Park, Jong; Schroeder, Michael

    2003-01-01

    Background Large-scale protein interaction maps provide a new, global perspective with which to analyse protein function. PSIMAP, the Protein Structural Interactome Map, is a database of all the structurally observed interactions between superfamilies of protein domains with known three-dimensional structure in the PDB. PSIMAP incorporates both functional and evolutionary information into a single network. Results We present a global analysis of PSIMAP using several distinct network measures relating to centrality, interactivity, fault-tolerance, and taxonomic diversity. We found the following results: Centrality: we show that the center and barycenter of PSIMAP do not coincide, and that the superfamilies forming the barycenter relate to very general functions, while those constituting the center relate to enzymatic activity. Interactivity: we identify the P-loop and immunoglobulin superfamilies as the most highly interactive. We successfully use connectivity and cluster index, which characterise the connectivity of a superfamily's neighbourhood, to discover superfamilies of complex I and II. This is particularly significant as the structure of complex I is not yet solved. Taxonomic diversity: we found that highly interactive superfamilies are in general taxonomically very diverse and are thus amongst the oldest. Fault-tolerance: we found that the network is very robust as for the majority of superfamilies removal from the network will not break up the network. Conclusions Overall, we can single out the P-loop containing nucleotide triphosphate hydrolases superfamily as it is the most highly connected and has the highest taxonomic diversity. In addition, this superfamily has the highest interaction rank, is the barycenter of the network (it has the shortest average path to every other superfamily in the network), and is an articulation vertex, whose removal will disconnect the network. 
More generally, we conclude that the graph-theoretic and taxonomic analysis of PSIMAP is an important step towards the understanding of protein function and could be an important tool for tracing the evolution of life at the molecular level. PMID:14531933
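
    The network measures named in this abstract (center, barycenter, articulation vertex) can be computed on any small undirected graph. The sketch below uses a made-up toy graph, not PSIMAP data, and assumes the graph is connected; node names are hypothetical. The articulation check is a deliberately simple brute-force version (remove each vertex and test connectivity) rather than the usual linear-time depth-first algorithm.

```python
from collections import deque


def bfs_distances(graph, source, removed=None):
    """Shortest hop counts from `source`, optionally ignoring one vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb != removed and nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def center_and_barycenter(graph):
    """Center: minimal eccentricity. Barycenter: minimal total distance."""
    ecc, total = {}, {}
    for node in graph:
        dist = bfs_distances(graph, node)
        ecc[node] = max(dist.values())
        total[node] = sum(dist.values())
    min_ecc, min_total = min(ecc.values()), min(total.values())
    center = {n for n in graph if ecc[n] == min_ecc}
    barycenter = {n for n in graph if total[n] == min_total}
    return center, barycenter


def articulation_vertices(graph):
    """Vertices whose removal disconnects the remaining graph (brute force)."""
    result = set()
    for v in graph:
        rest = [n for n in graph if n != v]
        if rest and len(bfs_distances(graph, rest[0], removed=v)) < len(rest):
            result.add(v)
    return result


# Toy graph (hypothetical): a hub with four leaves attached to a path of four nodes.
toy = {
    "hub": ["l1", "l2", "l3", "l4", "p1"],
    "l1": ["hub"], "l2": ["hub"], "l3": ["hub"], "l4": ["hub"],
    "p1": ["hub", "p2"], "p2": ["p1", "p3"], "p3": ["p2", "p4"], "p4": ["p3"],
}
```

    On this toy graph the center and the barycenter are different vertex sets, which is the kind of non-coincidence the abstract reports for PSIMAP.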

  20. Evolutionary divergence of chloroplast FAD synthetase proteins

    PubMed Central

    2010-01-01

    Background Flavin adenine dinucleotide synthetases (FADSs) - a group of bifunctional enzymes that carry out the dual functions of riboflavin phosphorylation to produce flavin mononucleotide (FMN) and its subsequent adenylation to generate FAD in most prokaryotes - were studied in plants in terms of sequence, structure and evolutionary history. Results Using a variety of bioinformatics methods we have found that FADS enzymes localized to the chloroplasts, which we term as plant-like FADS proteins, are distributed across a variety of green plant lineages and constitute a divergent protein family clearly of cyanobacterial origin. The C-terminal module of these enzymes does not contain the typical riboflavin kinase active site sequence, while the N-terminal module is broadly conserved. These results agree with a previous work reported by Sandoval et al. in 2008. Furthermore, our observations and preliminary experimental results indicate that the C-terminus of plant-like FADS proteins may contain a catalytic activity, but different to that of their prokaryotic counterparts. In fact, homology models predict that plant-specific conserved residues constitute a distinct active site in the C-terminus. Conclusions A structure-based sequence alignment and an in-depth evolutionary survey of FADS proteins, thought to be crucial in plant metabolism, are reported, which will be essential for the correct annotation of plant genomes and further structural and functional studies. This work is a contribution to our understanding of the evolutionary history of plant-like FADS enzymes, which constitute a new family of FADS proteins whose C-terminal module might be involved in a distinct catalytic activity. PMID:20955574

  1. Cytoskeletal and cellular adhesion proteins in zebrafish (Danio rerio) myogenesis.

    PubMed

    Costa, M L; Escaleira, R; Manasfi, M; de Souza, L F; Mermelstein, C S

    2003-08-01

    The current myogenesis and myofibrillogenesis model has been based mostly on in vitro cell culture studies, and, to a lesser extent, on in situ studies in avian and mammalian embryos. While the more isolated artificial conditions of cells in culture permitted careful structural analysis, the actual in situ cellular structures have not been described in detail because the embryos are more difficult to section and manipulate. To overcome these difficulties, we used the optically clear and easy to handle embryos of the zebrafish Danio rerio. We monitored the expression of cytoskeletal and cell-adhesion proteins (actin, myosin, desmin, alpha-actinin, troponin, titin, vimentin and vinculin) using immunofluorescence microscopy and video-enhanced, background-subtracted, differential interference contrast of 24- to 48-h zebrafish embryos. In the mature myotome, the mononucleated myoblasts displayed periodic striations for all sarcomeric proteins tested. The changes in desmin distribution from aggregates to perinuclear and striated forms, although following the same sequence, occurred much faster than in other models. All desmin-positive cells were also positive for myofibrillar proteins and striated, in contrast to that which occurs in cell cultures. Vimentin appeared to be striated in mature cells, while it is developmentally down-regulated in vitro. The whole connective tissue septum between the somites was positive for adhesion proteins such as vinculin, instead of the isolated adhesion plaques observed in cell cultures. The differences in the myogenesis of zebrafish in situ and in cell culture in vitro suggest that some of the previously observed structures and protein distributions in cultures could be methodological artifacts.

  2. Evidence for alternative quaternary structure in a bacterial Type III secretion system chaperone

    PubMed Central

    2010-01-01

    Background Type III secretion systems are a common virulence mechanism in many Gram-negative bacterial pathogens. These systems use a nanomachine resembling a molecular needle and syringe to provide an energized conduit for the translocation of effector proteins from the bacterial cytoplasm to the host cell cytoplasm for the benefit of the pathogen. Prior to translocation, specialized chaperones maintain proper effector protein conformation. The class II chaperone, Invasion plasmid gene (Ipg) C, stabilizes two pore forming translocator proteins. IpgC exists as a functional dimer to facilitate the mutually exclusive binding of both translocators. Results In this study, we present the 3.3 Å crystal structure of an amino-terminally truncated form (residues 10-155, denoted IpgC10-155) of the class II chaperone IpgC from Shigella flexneri. Our structure demonstrates an alternative quaternary arrangement to that previously described for a carboxy-terminally truncated variant of IpgC (IpgC1-151). Specifically, we observe a rotationally-symmetric "head-to-head" dimerization interface that is far more similar to that previously described for SycD from Yersinia enterocolitica than to IpgC1-151. The IpgC structure presented here displays major differences in the amino terminal region, where extended coil-like structures are seen, as opposed to the short, ordered alpha helices and asymmetric dimerization interface seen within IpgC1-151. Despite these differences, however, both modes of dimerization support chaperone activity, as judged by a copurification assay with a recombinant form of the translocator protein, IpaB. Conclusions From primary to quaternary structure, the results presented here suggest that a symmetric dimerization interface is conserved across bacterial class II chaperones. 
In light of previous data which have described the structure and function of asymmetric dimerization, our results raise the possibility that class II chaperones may transition between asymmetric and symmetric dimers in response to changes in either biochemical modifications (e.g. proteolytic cleavage) or other biological cues. Such transitions may contribute to the broad range of protein-protein interactions and functions attributed to class II chaperones. PMID:20633281

  3. Synthetic Biology of Proteins: Tuning GFPs Folding and Stability with Fluoroproline

    PubMed Central

    Steiner, Thomas; Hess, Petra; Bae, Jae Hyun; Wiltschi, Birgit; Moroder, Luis; Budisa, Nediljko

    2008-01-01

    Background Proline residues affect protein folding and stability via cis/trans isomerization of peptide bonds and by the Cγ-exo or -endo puckering of their pyrrolidine rings. Peptide bond conformation as well as puckering propensity can be manipulated by proper choice of ring substituents, e.g. Cγ-fluorination. Synthetic chemistry has routinely exploited ring-substituted proline analogs in order to change, modulate or control folding and stability of peptides. Methodology/Principal Findings In order to transmit this synthetic strategy to complex proteins, the ten proline residues of enhanced green fluorescent protein (EGFP) were globally replaced by (4R)- and (4S)-fluoroprolines (FPro). By this approach, we expected to affect the cis/trans peptidyl-proline bond isomerization and pyrrolidine ring puckering, which are responsible for the slow folding of this protein. Expression of both protein variants occurred at levels comparable to the parent protein, but the (4R)-FPro-EGFP resulted in irreversibly unfolded inclusion bodies, whereas the (4S)-FPro-EGFP led to a soluble fluorescent protein. Upon thermal denaturation, refolding of this variant occurs at significantly higher rates than the parent EGFP. Comparative inspection of the X-ray structures of EGFP and (4S)-FPro-EGFP allowed to correlate the significantly improved refolding with the Cγ-endo puckering of the pyrrolidine rings, which is favored by 4S-fluorination, and to lesser extents with the cis/trans isomerization of the prolines. Conclusions/Significance We discovered that the folding rates and stability of GFP are affected to a lesser extent by cis/trans isomerization of the proline bonds than by the puckering of pyrrolidine rings. In the Cγ-endo conformation the fluorine atoms are positioned in the structural context of the GFP such that a network of favorable local interactions is established. 
From these results the combined use of synthetic amino acids along with detailed structural knowledge and existing protein engineering methods can be envisioned as a promising strategy for the design of complex tailor-made proteins and even cellular structures of superior properties compared to the native forms. PMID:18301757

  4. Challenging Residual Contamination of Instruments for Robotic Surgery in Japan.

    PubMed

    Saito, Yuhei; Yasuhara, Hiroshi; Murakoshi, Satoshi; Komatsu, Takami; Fukatsu, Kazuhiko; Uetera, Yushi

    2017-02-01

    BACKGROUND Recently, robotic surgery has been introduced in many hospitals. The structure of robotic instruments is so complex that updating their cleaning methods is a challenge for healthcare professionals. However, there is limited information on the effectiveness of cleaning for instruments for robotic surgery. OBJECTIVE To determine the level of residual contamination of instruments for robotic surgery and to develop a method to evaluate the cleaning efficacy for complex surgical devices. METHODS Surgical instruments were collected immediately after operations and/or after in-house cleaning, and the level of residual protein was measured. Three serial measurements were performed on instruments after cleaning to determine the changes in the level of contamination and the total amount of residual protein. The study took place from September 1, 2013, through June 30, 2015, in Japan. RESULTS The amount of protein released from robotic instruments declined exponentially. The amount after in-house cleaning was 650, 550, and 530 µg/instrument in the 3 serial measurements. The overall level of residual protein in each measurement was much higher for robotic instruments than for ordinary instruments (P<.0001). CONCLUSIONS Our data demonstrated that complete removal of residual protein from surgical instruments is virtually impossible. The pattern of decline differed depending on the instrument type, which reflected the complex structure of the instruments. It might be necessary to establish a new standard for cleaning using a novel classification according to the structural complexity of instruments, especially for those for robotic surgery. Infect Control Hosp Epidemiol 2017;38:143-146.

  5. [Quantitative changes of main components of erythrocyte membranes which define architectonics of cells under pttg gene knockout].

    PubMed

    Kaniuka, O P; Filiak, Ie Z; Kulachkovs'kyĭ, O R; Osyp, Iu L; Sybirna, N O

    2014-01-01

    A pttg gene knockout affects the functional state of the erythron in mice, which could be associated with changes in the structure of erythrocyte membranes. The pttg gene knockout causes a significant modification of the fatty acid composition of erythrocyte membrane lipids, reducing the content of palmitic acid and increasing the amount of polyunsaturated fatty acids by 18%. Analysis of the erythrocyte surface architectonics of mice under pttg gene knockout showed that, against the background of a reduced population of functionally complete biconcave discs, the number of transformed cells at different stages of degeneration increased. In mice with a pttg gene knockout, the cytoskeletal protein beta-spectrin was reduced by 17.03% compared with a control group of animals. Membrane protein band 3 was also reduced by 33.04%, while the content of anion transport protein band 4.5 increased by 35.2% and that of protein band 4.2 by 32.1%. Lectin blot analysis revealed changes in the structure of the carbohydrate determinants of erythrocyte membrane glycoproteins under conditions of directed pttg gene inactivation, accompanied by changes in the type of linkage that joins the terminal residue in the carbohydrate determinant of glycoproteins. Thus, under pttg gene knockout, a significant redistribution of protein and fatty acid contents in erythrocyte membranes is observed, which manifests as an increase in deformed red blood cells.

  6. The Drosophila muscle LIM protein, Mlp84B, cooperates with D-titin to maintain muscle structural integrity.

    PubMed

    Clark, Kathleen A; Bland, Jennifer M; Beckerle, Mary C

    2007-06-15

    Muscle LIM protein (MLP) is a cytoskeletal LIM-only protein expressed in striated muscle. Mutations in human MLP are associated with cardiomyopathy; however, the molecular mechanism by which MLP functions is not established. A Drosophila MLP homolog, mlp84B, displays many of the same features as the vertebrate protein, illustrating the utility of the fly for the study of MLP function. Animals lacking Mlp84B develop into larvae with a morphologically intact musculature, but the mutants arrest during pupation with impaired muscle function. Mlp84B displays muscle-specific expression and is a component of the Z-disc and nucleus. Preventing nuclear retention of Mlp84B does not affect its function, indicating that Mlp84B site of action is likely to be at the Z-disc. Within the Z-disc, Mlp84B is colocalized with the N-terminus of D-titin, a protein crucial for sarcomere organization and stretch mechanics. The mlp84B mutants phenotypically resemble weak D-titin mutants. Furthermore, reducing D-titin activity in the mlp84B background leads to pronounced enhancement of the mlp84B muscle defects and loss of muscle structural integrity. The genetic interactions between mlp84B and D-titin reveal a role for Mlp84B in maintaining muscle structural integrity that was not obvious from analysis of the mlp84B mutants themselves, and suggest Mlp84B and D-titin cooperate to stabilize muscle sarcomeres.

  7. A variant of green fluorescent protein exclusively deposited to active intracellular inclusion bodies

    PubMed Central

    2014-01-01

    Background Inclusion bodies (IBs) were generally considered to be inactive protein deposits of little value for biotechnological applications. Recently, some IBs of recombinant proteins were confirmed to retain functional properties such as enzyme activity and fluorescence. Such biologically active IBs are not commonly formed, but they have great potential in the fields of biocatalysis, material science and nanotechnology. Results In this study, we characterized the IBs of DL4, a deletion variant of green fluorescent protein which forms active intracellular aggregates. The DL4 proteins expressed in Escherichia coli were exclusively deposited to IBs, and the IBs were estimated to be mostly composed of active proteins. The spectral properties and quantum yield of the DL4 variant in the active IBs were almost the same as those of its native protein. Refolding and stability studies revealed that the deletion mutation in DL4 did not affect the folding efficiency of the protein, but destabilized its structure. Analyses specific for amyloid-like structures indicated that the inner architecture of DL4 IBs might be amorphous rather than well-organized. The diameter of fluorescent DL4 IBs could be decreased to 100–200 nm by reducing the expression time of the protein in vivo. Conclusions To our knowledge, DL4 is the first GFP variant that folds correctly but aggregates exclusively in vivo without any self-aggregating/assembling tags. The fluorescent DL4 IBs have the potential to be used as fluorescent biomaterials. This study also suggests that biologically active IBs can be achieved through engineering a target protein itself. PMID:24885571

  8. Predictors of natively unfolded proteins: unanimous consensus score to detect a twilight zone between order and disorder in generic datasets

    PubMed Central

    2010-01-01

    Background Natively unfolded proteins lack a well defined three dimensional structure but have important biological functions, suggesting a re-assignment of the structure-function paradigm. To assess that a given protein is natively unfolded requires laborious experimental investigations, then reliable sequence-only methods for predicting whether a sequence corresponds to a folded or to an unfolded protein are of interest in fundamental and applicative studies. Many proteins have amino acidic compositions compatible both with the folded and unfolded status, and belong to a twilight zone between order and disorder. This makes difficult a dichotomic classification of protein sequences into folded and natively unfolded ones. In this work we propose an operational method to identify proteins belonging to the twilight zone by combining into a consensus score good performing single predictors of folding. Results In this methodological paper dichotomic folding indexes are considered: hydrophobicity-charge, mean packing, mean pairwise energy, Poodle-W and a new global index, that is called here gVSL2, based on the local disorder predictor VSL2. The performance of these indexes is evaluated on different datasets, in particular on a new dataset composed by 2369 folded and 81 natively unfolded proteins. Poodle-W, gVSL2 and mean pairwise energy have good performance and stability in all the datasets considered and are combined into a strictly unanimous combination score SSU, that leaves proteins unclassified when the consensus of all combined indexes is not reached. The unclassified proteins: i) belong to an overlap region in the vector space of amino acidic compositions occupied by both folded and unfolded proteins; ii) are composed by approximately the same number of order-promoting and disorder-promoting amino acids; iii) have a mean flexibility intermediate between that of folded and that of unfolded proteins. 
Conclusions Our results show that proteins unclassified by SSU belong to a twilight zone. Proteins left unclassified by the consensus score SSU have physical properties intermediate between those of folded and those of natively unfolded proteins, and their structural properties and evolutionary history are worth investigating. PMID:20409339
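
    The strictly unanimous rule described above can be sketched in a few lines. This is only an illustration of the combination logic: the predictor names come from the abstract, but the label strings and the function name are assumptions, and how each individual index is turned into a folded/unfolded call is not shown.

```python
def unanimous_consensus(calls):
    """Combine binary calls from several predictors.

    `calls` maps a predictor name to either "folded" or "unfolded".
    A protein is classified only when every predictor agrees; otherwise it
    is left "unclassified" (the twilight zone in the abstract's terms).
    """
    labels = set(calls.values())
    if len(labels) == 1:
        return labels.pop()
    return "unclassified"


# Hypothetical calls for two proteins.
agree = {"Poodle-W": "unfolded", "gVSL2": "unfolded", "mean pairwise energy": "unfolded"}
split = {"Poodle-W": "folded", "gVSL2": "unfolded", "mean pairwise energy": "folded"}
```

    With these inputs, `unanimous_consensus(agree)` yields a class label, while `unanimous_consensus(split)` leaves the protein unclassified because one predictor dissents.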

  9. HomPPI: a class of sequence homology based protein-protein interface prediction methods

    PubMed Central

    2011-01-01

    Background Although homology-based methods are among the most widely used methods for predicting the structure and function of proteins, the question as to whether interface sequence conservation can be effectively exploited in predicting protein-protein interfaces has been a subject of debate. Results We studied more than 300,000 pair-wise alignments of protein sequences from structurally characterized protein complexes, including both obligate and transient complexes. We identified sequence similarity criteria required for accurate homology-based inference of interface residues in a query protein sequence. Based on these analyses, we developed HomPPI, a class of sequence homology-based methods for predicting protein-protein interface residues. We present two variants of HomPPI: (i) NPS-HomPPI (Non partner-specific HomPPI), which can be used to predict interface residues of a query protein in the absence of knowledge of the interaction partner; and (ii) PS-HomPPI (Partner-specific HomPPI), which can be used to predict the interface residues of a query protein with a specific target protein. Our experiments on a benchmark dataset of obligate homodimeric complexes show that NPS-HomPPI can reliably predict protein-protein interface residues in a given protein, with an average correlation coefficient (CC) of 0.76, sensitivity of 0.83, and specificity of 0.78, when sequence homologs of the query protein can be reliably identified. NPS-HomPPI also reliably predicts the interface residues of intrinsically disordered proteins. Our experiments suggest that NPS-HomPPI is competitive with several state-of-the-art interface prediction servers including those that exploit the structure of the query proteins. 
The partner-specific classifier, PS-HomPPI can, on a large dataset of transient complexes, predict the interface residues of a query protein with a specific target, with a CC of 0.65, sensitivity of 0.69, and specificity of 0.70, when homologs of both the query and the target can be reliably identified. The HomPPI web server is available at http://homppi.cs.iastate.edu/. Conclusions Sequence homology-based methods offer a class of computationally efficient and reliable approaches for predicting the protein-protein interface residues that participate in either obligate or transient interactions. For query proteins involved in transient interactions, the reliability of interface residue prediction can be improved by exploiting knowledge of putative interaction partners. PMID:21682895

  10. SCOWLP classification: Structural comparison and analysis of protein binding regions

    PubMed Central

    Teyra, Joan; Paszkowski-Rogacz, Maciej; Anders, Gerd; Pisabarro, M Teresa

    2008-01-01

    Background Detailed information about protein interactions is critical for our understanding of the principles governing protein recognition mechanisms. The structures of many proteins have been experimentally determined in complex with different ligands bound either in the same or different binding regions. Thus, the structural interactome requires the development of tools to classify protein binding regions. A proper classification may provide a general view of the regions that a protein uses to bind others and also facilitate a detailed comparative analysis of the interaction information for specific protein binding regions at atomic level. Such classification might be of potential use for deciphering protein interaction networks, understanding protein function, rational engineering and design. Description Protein binding regions (PBRs) might be ideally described as well-defined, separate regions that share no interacting residues with one another. However, PBRs are often irregular and discontinuous, and can share a wide range of interacting residues among them. The criteria to define an individual binding region can often be arbitrary and may differ from other binding regions within a protein family. Therefore, the rationale behind protein interface classification should aim to fulfil the requirements of the analysis to be performed. We extract detailed interaction information of protein domains, peptides and interfacial solvent from the SCOWLP database and we classify the PBRs of each domain family. For this purpose, we define a similarity index based on the overlap of interacting residues mapped in pair-wise structural alignments. We perform our classification with agglomerative hierarchical clustering using the complete-linkage method. Our classification is calculated at different similarity cut-offs to allow flexibility in the analysis of PBRs, a feature that is especially interesting for protein families with conflicting binding regions. 
The hierarchical classification of PBRs is implemented into the SCOWLP database and extends the SCOP classification with three additional family sub-levels: Binding Region, Interface and Contacting Domains. SCOWLP contains 9,334 binding regions distributed within 2,561 families. In 65% of the cases we observe families containing more than one binding region. In addition, 22% of the regions form complexes with more than one protein family. Conclusion The current SCOWLP classification and its web application represent a framework for the study of protein interfaces and comparative analysis of protein family binding regions. This comparison can be performed at atomic level and allows the user to study interactome conservation and variability. The new SCOWLP classification may be of great utility for reconstruction of protein complexes, understanding protein networks and ligand design. SCOWLP will be updated with every SCOP release. The web application is available at . PMID:18182098
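
    The clustering step described in this record can be sketched as follows. The exact similarity index used by SCOWLP is not given in the abstract, so a Jaccard-style overlap of interacting residue positions is assumed here; the region names and residue sets are hypothetical. Complete linkage means that two clusters are merged only if their least similar pair of members still meets the cut-off.

```python
def similarity(a, b):
    """Assumed index: shared interacting positions over all positions (Jaccard)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def complete_linkage(regions, cutoff):
    """Agglomerative clustering of binding regions with complete linkage.

    `regions` maps a region name to the set of aligned residue positions that
    take part in the interaction. Merging stops when no pair of clusters has a
    complete-linkage similarity of at least `cutoff`.
    """
    clusters = [[name] for name in regions]
    while len(clusters) > 1:
        best, best_pair = -1.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: the weakest pairwise similarity decides.
                link = min(similarity(regions[x], regions[y])
                           for x in clusters[i] for y in clusters[j])
                if link > best:
                    best, best_pair = link, (i, j)
        if best < cutoff:
            break
        i, j = best_pair
        merged = clusters.pop(j)  # j > i, so index i stays valid
        clusters[i].extend(merged)
    return clusters


# Hypothetical binding regions of one domain family.
example = {"r1": {1, 2, 3, 4}, "r2": {2, 3, 4, 5}, "r3": {10, 11}}
```

    Running the function at several cut-offs, as the abstract describes, gives coarser or finer groupings of the same regions.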

  11. High performance transcription factor-DNA docking with GPU computing

    PubMed Central

    2012-01-01

    Background Protein-DNA docking is a very challenging problem in structural bioinformatics and has important implications in a number of applications, such as structure-based prediction of transcription factor binding sites and rational drug design. Protein-DNA docking is computationally very demanding due to the high cost of energy calculation and the statistical nature of conformational sampling algorithms. More importantly, experiments show that the docking quality depends on the coverage of the conformational sampling space. It is therefore desirable to accelerate the computation of the docking algorithm, not only to reduce computing time, but also to improve docking quality. Methods In an attempt to accelerate the sampling process and to improve the docking performance, we developed a graphics processing unit (GPU)-based protein-DNA docking algorithm. The algorithm employs a potential-based energy function to describe the binding affinity of a protein-DNA pair, and integrates Monte-Carlo simulation and a simulated annealing method to search through the conformational space. Algorithmic techniques were developed to improve the computation efficiency and scalability on GPU-based high performance computing systems. Results The effectiveness of our approach is tested on a non-redundant set of 75 TF-DNA complexes and a newly developed TF-DNA docking benchmark. We demonstrated that the GPU-based docking algorithm can significantly accelerate the simulation process and thereby improve the chance of finding near-native TF-DNA complex structures. This study also suggests that further improvement in protein-DNA docking research would require efforts from two integral aspects: improvement in computation efficiency and energy function design. Conclusions We present a high performance computing approach for improving the prediction accuracy of protein-DNA docking. 
The GPU-based docking algorithm accelerates the search of the conformational space and thus increases the chance of finding more near-native structures. To the best of our knowledge, this is the first ad hoc effort of applying GPU or GPU clusters to the protein-DNA docking problem. PMID:22759575
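
    The combination of Monte Carlo sampling and simulated annealing named in this abstract can be illustrated with a minimal CPU-only sketch. It is not the authors' GPU algorithm: the one-dimensional `toy_energy` stands in for the potential-based docking score, and `toy_move`, the geometric cooling schedule and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def simulated_annealing(energy, propose, x0, t_start=1.0, t_end=0.01,
                        steps=2000, seed=0):
    """Minimise `energy` with Metropolis moves under geometric cooling."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    # Per-step factor that takes the temperature from t_start to t_end.
    cooling = (t_end / t_start) ** (1.0 / steps)
    t = t_start
    for _ in range(steps):
        cand = propose(x, rng)
        e_cand = energy(cand)
        # Metropolis criterion: always accept downhill moves, and accept
        # uphill moves with probability exp(-dE / T).
        if e_cand <= e or rng.random() < math.exp((e - e_cand) / t):
            x, e = cand, e_cand
            if e < best_e:
                best_x, best_e = x, e
        t *= cooling
    return best_x, best_e


def toy_energy(x):
    """Hypothetical rugged score over a single pose coordinate."""
    return (x - 2.0) ** 2 + math.sin(5.0 * x)


def toy_move(x, rng):
    """Hypothetical random perturbation of the pose coordinate."""
    return x + rng.uniform(-0.5, 0.5)


best_pose, best_energy = simulated_annealing(toy_energy, toy_move, x0=-3.0)
print(best_pose, best_energy)
```

    In a real docking run the state would be a rigid-body pose (translation plus rotation) and many independent chains would be sampled in parallel, which is the part a GPU can accelerate.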

  12. Quantification of DNA-associated proteins inside eukaryotic cells using single-molecule localization microscopy

    PubMed Central

    Etheridge, Thomas J.; Boulineau, Rémi L.; Herbert, Alex; Watson, Adam T.; Daigaku, Yasukazu; Tucker, Jem; George, Sophie; Jönsson, Peter; Palayret, Matthieu; Lando, David; Laue, Ernest; Osborne, Mark A.; Klenerman, David; Lee, Steven F.; Carr, Antony M.

    2014-01-01

    Development of single-molecule localization microscopy techniques has allowed nanometre scale localization accuracy inside cells, permitting the resolution of ultra-fine cell structure and the elucidation of crucial molecular mechanisms. Application of these methodologies to understanding processes underlying DNA replication and repair has been limited to defined in vitro biochemical analysis and prokaryotic cells. In order to expand these techniques to eukaryotic systems, we have further developed a photo-activated localization microscopy-based method to directly visualize DNA-associated proteins in unfixed eukaryotic cells. We demonstrate that motion blurring of fluorescence due to protein diffusivity can be used to selectively image the DNA-bound population of proteins. We designed and tested a simple methodology and show that it can be used to detect changes in DNA binding of a replicative helicase subunit, Mcm4, and the replication sliding clamp, PCNA, between different stages of the cell cycle and between distinct genetic backgrounds. PMID:25106872

  13. Biological role of bacterial inclusion bodies: a model for amyloid aggregation.

    PubMed

    García-Fruitós, Elena; Sabate, Raimon; de Groot, Natalia S; Villaverde, Antonio; Ventura, Salvador

    2011-07-01

    Inclusion bodies are insoluble protein aggregates usually found in recombinant bacteria when they are forced to produce heterologous protein species. These particles are formed by polypeptides that cross-interact through stereospecific contacts and that are steadily deposited in either the cell's cytoplasm or the periplasm. An important fraction of eukaryotic proteins form inclusion bodies in bacteria, which has posed major problems in the development of the biotechnology industry. Over the last decade, the fine dissection of the quality control system in bacteria and the recognition of the amyloid-like architecture of inclusion bodies have provided dramatic insights into the dynamic biology of these aggregates. We discuss here the relevant aspects, in the interface between cell physiology and structural biology, which make inclusion bodies unique models for the study of protein aggregation, amyloid formation and prion biology in a physiologically relevant background. © 2011 The Authors Journal compilation © 2011 FEBS.

  14. Asymmetric interactions in the adenosine-binding pockets of the MS2 coat protein dimer

    PubMed Central

    Powell, Amy J; Peabody, David S

    2001-01-01

    Background The X-ray structure of the MS2 coat protein-operator RNA complex reveals the existence of quasi-symmetric interactions of adenosines -4 and -10 in pockets formed on different subunits of the coat protein dimer. Both pockets utilize the same five amino acid residues, namely Val29, Thr45, Ser47, Thr59, and Lys61. We call these sites the adenosine-binding pockets. Results We present here a heterodimer complementation analysis of the contributions of individual A-pocket amino acids to the binding of A-4 and A-10 in different halves of the dimer. Various substitutions of A-pocket residues were introduced into one half of single-chain coat protein heterodimers where they were tested for their abilities to complement Y85H or T91I substitutions (defects in the A-4 and A-10 half-sites, respectively) present in the other dimer half. Conclusions These experiments provide functional tests of interactions predicted from structural analyses, demonstrating the importance of certain amino acid-nucleotide contacts observed in the crystal structure, and showing that others make little or no contribution to the stability of the complex. In summary, Val29 and Lys61 form important stabilizing interactions with both A-4 and A-10. Meanwhile, Ser47 and Thr59 interact primarily with A-10. The important interactions with Thr45 are restricted to A-4. PMID:11504563

  15. Automated identification of protein-ligand interaction features using Inductive Logic Programming: a hexose binding case study

    PubMed Central

    2012-01-01

    Background There is a need for automated methods to learn general features of the interactions of a ligand class with its diverse set of protein receptors. An appropriate machine learning approach is Inductive Logic Programming (ILP), which automatically generates comprehensible rules in addition to prediction. The development of ILP systems which can learn rules of the complexity required for studies on protein structure remains a challenge. In this work we use a new ILP system, ProGolem, and demonstrate its performance on learning features of hexose-protein interactions. Results The rules induced by ProGolem detect interactions mediated by aromatics and by planar-polar residues, in addition to less common features such as the aromatic sandwich. The rules also reveal a previously unreported dependency for residues cys and leu. They also specify interactions involving aromatic and hydrogen bonding residues. This paper shows that Inductive Logic Programming implemented in ProGolem can derive rules giving structural features of protein/ligand interactions. Several of these rules are consistent with descriptions in the literature. Conclusions In addition to confirming literature results, ProGolem’s model has a 10-fold cross-validated predictive accuracy that is superior, at the 95% confidence level, to another ILP system previously used to study protein/hexose interactions and is comparable with state-of-the-art statistical learners. PMID:22783946

  16. A comprehensive biophysical description of pairwise epistasis throughout an entire protein domain

    PubMed Central

    Olson, C. Anders; Wu, Nicholas C.; Sun, Ren

    2014-01-01

    SUMMARY Background Non-additivity in fitness effects from two or more mutations, termed epistasis, can result in compensation of deleterious mutations or negation of beneficial mutations. Recent evidence shows the importance of epistasis in individual evolutionary pathways. However, an unresolved question in molecular evolution is how often and how significantly fitness effects change in alternative genetic backgrounds. Results To answer this question we quantified the effects of all single mutations and double mutations between all positions in the IgG-binding domain of protein G (GB1). By observing the first two steps of all possible evolutionary pathways, this fitness profile enabled the characterization of the extent and magnitude of pairwise epistasis throughout an entire protein molecule. Furthermore, we developed a novel approach to quantitatively determine the effects of single mutations on structural stability (ΔΔGU). This enabled determination of the importance of stability effects in functional epistasis. Conclusions Our results illustrate common biophysical mechanisms for occurrences of positive and negative epistasis. Our results show pervasive positive epistasis within a conformationally dynamic network of residues. The stability analysis shows that significant negative epistasis, which is more common than positive epistasis, mostly occurs between combinations of destabilizing mutations. Furthermore, we show that although significant positive epistasis is rare, many deleterious mutations are beneficial in at least one alternative mutational background. The distribution of conditionally beneficial mutations throughout the domain demonstrates that the functional portion of sequence space can be significantly expanded by epistasis. PMID:25455030
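
    The notion of pairwise epistasis as non-additivity of fitness effects can be made concrete with a short sketch. The abstract does not give the formula used in the study; the log-additive (multiplicative) null model below is one common convention and is an assumption here, as are the function names and the example fitness values.

```python
import math


def pairwise_epistasis(w_a, w_b, w_ab, w_wt=1.0):
    """Deviation of the double mutant from the log-additive expectation.

    All arguments are fitness values; they are normalised by the
    wild-type fitness `w_wt` before taking logarithms.
    """
    def log_rel(w):
        return math.log(w / w_wt)

    return log_rel(w_ab) - (log_rel(w_a) + log_rel(w_b))


def classify(eps, tol=1e-9):
    """Label the sign of an epistasis value, treating |eps| <= tol as none."""
    if eps > tol:
        return "positive"
    if eps < -tol:
        return "negative"
    return "none"


# Hypothetical example: two single mutants that each halve fitness.
print(classify(pairwise_epistasis(0.5, 0.5, 0.50)))  # double mutant fitter than expected
print(classify(pairwise_epistasis(0.5, 0.5, 0.25)))  # exactly multiplicative
print(classify(pairwise_epistasis(0.5, 0.5, 0.10)))  # double mutant less fit than expected
```

    Under this convention a deleterious mutation that becomes beneficial in another mutational background shows up as a large positive value.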

  17. Scanning light-sheet microscopy in the whole mouse brain with HiLo background rejection.

    PubMed

    Mertz, Jerome; Kim, Jinhyun

    2010-01-01

    It is well known that light-sheet illumination can enable optically sectioned wide-field imaging of macroscopic samples. However, the optical sectioning capacity of a light-sheet macroscope is undermined by sample-induced scattering or aberrations that broaden the thickness of the sheet illumination. We present a technique to enhance the optical sectioning capacity of a scanning light-sheet microscope by out-of-focus background rejection. The technique, called HiLo microscopy, makes use of two images sequentially acquired with uniform and structured sheet illumination. An optically sectioned image is then synthesized by fusing high and low spatial frequency information from both images. The benefits of combining light-sheet macroscopy and HiLo background rejection are demonstrated in optically cleared whole mouse brain samples, using both green fluorescent protein (GFP)-fluorescence and dark-field scattered light contrast.

  18. Scanning light-sheet microscopy in the whole mouse brain with HiLo background rejection

    NASA Astrophysics Data System (ADS)

    Mertz, Jerome; Kim, Jinhyun

    2010-01-01

    It is well known that light-sheet illumination can enable optically sectioned wide-field imaging of macroscopic samples. However, the optical sectioning capacity of a light-sheet macroscope is undermined by sample-induced scattering or aberrations that broaden the thickness of the sheet illumination. We present a technique to enhance the optical sectioning capacity of a scanning light-sheet microscope by out-of-focus background rejection. The technique, called HiLo microscopy, makes use of two images sequentially acquired with uniform and structured sheet illumination. An optically sectioned image is then synthesized by fusing high and low spatial frequency information from both images. The benefits of combining light-sheet macroscopy and HiLo background rejection are demonstrated in optically cleared whole mouse brain samples, using both green fluorescent protein (GFP)-fluorescence and dark-field scattered light contrast.

  19. Defining the pathogenesis of the human Atp12p W94R mutation using a Saccharomyces cerevisiae yeast model.

    PubMed

    Meulemans, Ann; Seneca, Sara; Pribyl, Thomas; Smet, Joel; Alderweirldt, Valerie; Waeytens, Anouk; Lissens, Willy; Van Coster, Rudy; De Meirleir, Linda; di Rago, Jean-Paul; Gatti, Domenico L; Ackerman, Sharon H

    2010-02-05

    Studies in yeast have shown that a deficiency in Atp12p prevents assembly of the extrinsic domain (F1) of complex V and renders cells unable to make ATP through oxidative phosphorylation. De Meirleir et al. (De Meirleir, L., Seneca, S., Lissens, W., De Clercq, I., Eyskens, F., Gerlo, E., Smet, J., and Van Coster, R. (2004) J. Med. Genet. 41, 120-124) have reported that a homozygous missense mutation in the gene for human Atp12p (HuAtp12p), which replaces Trp-94 with Arg, was linked to the death of a 14-month-old patient. We have investigated the impact of the pathogenic W94R mutation on Atp12p structure/function. Plasmid-borne wild type human Atp12p rescues the respiratory defect of a yeast ATP12 deletion mutant (Δatp12). The W94R mutation alters the protein at the most highly conserved position in the Pfam sequence and renders HuAtp12p insoluble in the background of Δatp12. In contrast, the yeast protein harboring the corresponding mutation, ScAtp12p(W103R), is soluble in the background of Δatp12 but not in the background of Δatp12Δfmc1, a strain that also lacks Fmc1p. Fmc1p is a yeast mitochondrial protein not found in higher eukaryotes. Tryptophan 94 (human) or 103 (yeast) is located in a positively charged region of Atp12p, and hence its mutation to arginine does not alter significantly the electrostatic properties of the protein. Instead, we provide evidence that the primary effect of the substitution is on the dynamic properties of Atp12p.

  20. Heterologous Expression of Membrane Proteins: Choosing the Appropriate Host

    PubMed Central

    Pochon, Nathalie; Dementin, Sébastien; Hivin, Patrick; Boutigny, Sylvain; Rioux, Jean-Baptiste; Salvi, Daniel; Seigneurin-Berny, Daphné; Richaud, Pierre; Joyard, Jacques; Pignol, David; Sabaty, Monique; Desnos, Thierry; Pebay-Peyroula, Eva; Darrouzet, Elisabeth; Vernet, Thierry; Rolland, Norbert

    2011-01-01

    Background Membrane proteins are the targets of 50% of drugs, although they only represent 1% of total cellular proteins. The first major bottleneck on the route to their functional and structural characterisation is their overexpression; and simply choosing the right system can involve many months of trial and error. This work is intended as a guide to where to start when faced with heterologous expression of a membrane protein. Methodology/Principal Findings The expression of 20 membrane proteins, both peripheral and integral, in three prokaryotic (E. coli, L. lactis, R. sphaeroides) and three eukaryotic (A. thaliana, N. benthamiana, Sf9 insect cells) hosts was tested. The proteins tested were of various origins (bacteria, plants and mammals), functions (transporters, receptors, enzymes) and topologies (between 0 and 13 transmembrane segments). The Gateway system was used to clone all 20 genes into appropriate vectors for the hosts to be tested. Culture conditions were optimised for each host, and specific strategies were tested, such as the use of Mistic fusions in E. coli. 17 of the 20 proteins were produced at adequate yields for functional and, in some cases, structural studies. We have formulated general recommendations to assist with choosing an appropriate system based on our observations of protein behaviour in the different hosts. Conclusions/Significance Most of the methods presented here can be quite easily implemented in other laboratories. The results highlight certain factors that should be considered when selecting an expression host. The decision aide provided should help both newcomers and old-hands to select the best system for their favourite membrane protein. PMID:22216205

  1. The effect of amino acid deletions and substitutions in the longest loop of GFP

    PubMed Central

    Flores-Ramírez, Gabriela; Rivera, Manuel; Morales-Pablos, Alfredo; Osuna, Joel; Soberón, Xavier; Gaytán, Paul

    2007-01-01

    Background The effect of single and multiple amino acid substitutions in the green fluorescent protein (GFP) from Aequorea victoria has been extensively explored, yielding several proteins of diverse spectral properties. However, the role of amino acid deletions in this protein, as with most proteins, is still unknown, due to the technical difficulties involved in generating combinatorial in-phase amino acid deletions on a target region. Results In this study, the region I129-L142 of superglo GFP (sgGFP), corresponding to the longest loop of the protein and located far away from the central chromophore, was subjected to a random amino acid deletion approach, employing an in-house recently developed mutagenesis method termed Codon-Based Random Deletion (COBARDE). Only two mutants out of 16384 possible variant proteins retained fluorescence: sgGFP-ΔI129 and sgGFP-ΔD130. Interestingly, both mutants were thermosensitive and at 30°C sgGFP-ΔD130 was more fluorescent than the parent protein. In contrast with deletions, substitutions of single amino acids from residues F131 to L142 were well tolerated. The substitution analysis revealed a particular importance of residues F131, G135, I137, L138, H140 and L142 for the stability of the protein. Conclusion The behavior of GFP variants with both amino acid deletions and substitutions demonstrates that this loop plays an important structural role in GFP folding. Some of the amino acids which tolerated any substitution but no deletion are simply acting as "spacers" to localize important residues in the protein structure. PMID:17594481

  2. Biological Small Angle Scattering: Techniques, Strategies and Tips

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chaudhuri, Barnali; Muñoz, Inés G.; Urban, Volker S.

    This book provides a clear, comprehensible and up-to-date description of how Small Angle Scattering (SAS) can help structural biology researchers. SAS is an efficient technique that offers structural information on how biological macromolecules behave in solution. SAS provides distinct and complementary data for integrative structural biology approaches in combination with other widely used probes, such as X-ray crystallography, nuclear magnetic resonance, mass spectrometry and cryo-electron microscopy. The development of brilliant synchrotron small-angle X-ray scattering (SAXS) beam lines has increased the number of researchers interested in solution scattering. SAS is especially useful for studying conformational changes in proteins, highly flexible proteins, and intrinsically disordered proteins. Small-angle neutron scattering (SANS) with neutron contrast variation is ideally suited for studying multi-component assemblies as well as membrane proteins that are stabilized in surfactant micelles or vesicles. SAS is also used for studying dynamic processes of protein fibrillation in amyloid diseases, and pharmaceutical drug delivery. The combination with size-exclusion chromatography further increases the range of SAS applications. The book is written by leading experts in solution SAS methodologies. The principles and theoretical background of various SAS techniques are included, along with practical aspects that range from sample preparation to data presentation for publication. Topics covered include techniques for improving data quality and analysis, as well as different scientific applications of SAS. With abundant illustrations and practical tips, we hope the clear explanations of the principles and the reviews on the latest progress will serve as a guide through all aspects of biological solution SAS. The scope of this book is particularly relevant for structural biology researchers who are new to SAS. 
Advanced users of the technique will find it helpful for exploring the diversity of solution SAS methods and applications.

  3. Soluble expression, purification and characterization of the full length IS2 Transposase

    PubMed Central

    2011-01-01

    Background The two-step transposition pathway of insertion sequences of the IS3 family, and several other families, involves first the formation of a branched figure-of-eight (F-8) structure by an asymmetric single strand cleavage at one optional donor end and joining to the flanking host DNA near the target end. Its conversion to a double stranded minicircle precedes the second insertional step, where both ends function as donors. In IS2, the left end which lacks donor function in Step I acquires it in Step II. The assembly of two intrinsically different protein-DNA complexes in these F-8 generating elements has been intuitively proposed, but a barrier to testing this hypothesis has been the difficulty of isolating a full length, soluble and active transposase that creates fully formed synaptic complexes in vitro with protein bound to both binding and catalytic domains of the ends. We address here a solution to expressing, purifying and structurally analyzing such a protein. Results A soluble and active IS2 transposase derivative with GFP fused to its C-terminus functions as efficiently as the native protein in in vivo transposition assays. In vitro electrophoretic mobility shift assay data show that the partially purified protein prepared under native conditions binds very efficiently to cognate DNA, utilizing both N- and C-terminal residues. As a precursor to biophysical analyses of these complexes, a fluorescence-based random mutagenesis protocol was developed that enabled a structure-function analysis of the protein with good resolution at the secondary structure level. The results extend previous structure-function work on IS3 family transposases, identifying the binding domain as a three helix H + HTH bundle and explaining the function of an atypical leucine zipper-like motif in IS2. 
In addition gain- and loss-of-function mutations in the catalytic active site define its role in regional and global binding and identify functional signatures that are common to the three dimensional catalytic core motif of the retroviral integrase superfamily. Conclusions Intractably insoluble transposases, such as the IS2 transposase, prepared by solubilization protocols are often refractory to whole protein structure-function studies. The results described here have validated the use of GFP-tagging and fluorescence-based random mutagenesis in overcoming this limitation at the secondary structure level. PMID:22032517

  4. Simulations of single-particle imaging of hydrated proteins with x-ray free-electron lasers

    NASA Astrophysics Data System (ADS)

    Fortmann-Grote, C.; Bielecki, J.; Jurek, Z.; Santra, R.; Ziaja-Motyka, B.; Mancuso, A. P.

    2017-08-01

    We employ start-to-end simulations to model coherent diffractive imaging of single biomolecules using x-ray free electron lasers. This technique is expected to yield new structural information about biologically relevant macromolecules thanks to the ability to study the isolated sample in its natural environment as opposed to crystallized or cryogenic samples. The effect of the solvent on the diffraction pattern and interpretability of the data is an open question. We present first results of calculations where the solvent is taken into account explicitly. They were performed with a molecular dynamics scheme for a sample consisting of a protein and a hydration layer of varying thickness. Through R-factor analysis of the simulated diffraction patterns from hydrated samples, we show that the scattering background from realistic hydration layers of up to 3 Å thickness presents no obstacle for the resolution of molecular structures at the sub-nm level.
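
    The R-factor comparison mentioned in this abstract can be sketched as follows. The abstract does not state which R-factor definition was used; the crystallography-style form R = Σ| |F_ref| - |F_test| | / Σ|F_ref| below is an assumption, and the amplitude lists are made-up illustrative values rather than simulated diffraction data.

```python
def r_factor(reference, test):
    """Summed absolute amplitude difference, normalised by the reference sum.

    `reference` and `test` are equal-length sequences of non-negative
    amplitudes sampled at the same detector pixels (or scattering vectors).
    """
    if len(reference) != len(test):
        raise ValueError("patterns must have the same number of samples")
    denom = sum(abs(r) for r in reference)
    if denom == 0:
        raise ValueError("reference pattern is empty or all zero")
    return sum(abs(abs(r) - abs(t)) for r, t in zip(reference, test)) / denom


# Hypothetical amplitudes: a bare protein versus the same protein with a
# thin hydration layer that perturbs one sample.
bare = [1.0, 2.0, 3.0]
hydrated = [1.0, 2.0, 4.0]
print(r_factor(bare, bare))      # identical patterns
print(r_factor(bare, hydrated))  # small deviation from the reference
```

    A value near zero means the hydrated-sample pattern is close to the reference, which is the sense in which a low R-factor indicates that the water background does not hinder structure determination.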

  5. Active Fragments from Pro- and Antiapoptotic BCL-2 Proteins Have Distinct Membrane Behavior Reflecting Their Functional Divergence

    PubMed Central

    Guillemin, Yannis; Lopez, Jonathan; Gimenez, Diana; Fuertes, Gustavo; Valero, Juan Garcia; Blum, Loïc; Gonzalo, Philippe; Salgado, Jesùs; Girard-Egrot, Agnès; Aouacheria, Abdel

    2010-01-01

    Background The BCL-2 family of proteins includes pro- and antiapoptotic members acting by controlling the permeabilization of mitochondria. Although the association of these proteins with the outer mitochondrial membrane is crucial for their function, little is known about the characteristics of this interaction. Methodology/Principal Findings Here, we followed a reductionist approach to clarify to what extent membrane-active regions of homologous BCL-2 family proteins contribute to their functional divergence. Using isolated mitochondria as well as model lipid Langmuir monolayers coupled with Brewster Angle Microscopy, we explored systematically and comparatively the membrane activity and membrane-peptide interactions of fragments derived from the central helical hairpin of BAX, BCL-xL and BID. The results show a connection between the differing abilities of the assayed peptide fragments to contact, insert, destabilize and porate membranes and the activity of their cognate proteins in programmed cell death. Conclusion/Significance BCL-2 family-derived pore-forming helices thus represent structurally analogous, but functionally dissimilar membrane domains. PMID:20140092

  6. A Common Suite of Coagulation Proteins Function in Drosophila Muscle Attachment.

    PubMed

    Green, Nicole; Odell, Nadia; Zych, Molly; Clark, Cheryl; Wang, Zong-Heng; Biersmith, Bridget; Bajzek, Clara; Cook, Kevin R; Dushay, Mitchell S; Geisbrecht, Erika R

    2016-11-01

    The organization and stability of higher order structures that form in the extracellular matrix (ECM) to mediate the attachment of muscles are poorly understood. We have made the surprising discovery that a subset of clotting factor proteins are also essential for muscle attachment in the model organism Drosophila melanogaster. One such coagulation protein, Fondue (Fon), was identified as a novel muscle mutant in a pupal lethal genetic screen. Fon accumulates at muscle attachment sites and removal of this protein results in decreased locomotor behavior and detached larval muscles. A sensitized genetic background assay reveals that fon functions with the known muscle attachment genes Thrombospondin (Tsp) and Tiggrin (Tig). Interestingly, Tig is also a component of the hemolymph clot. We further demonstrate that an additional clotting protein, Larval serum protein 1γ (Lsp1γ), is also required for muscle attachment stability and accumulates where muscles attach to tendons. While the local biomechanical and organizational properties of the ECM vary greatly depending on the tissue microenvironment, we propose that shared extracellular protein-protein interactions influence the strength and elasticity of ECM proteins in both coagulation and muscle attachment. Copyright © 2016 by the Genetics Society of America.

  7. The Puf family of RNA-binding proteins in plants: phylogeny, structural modeling, activity and subcellular localization

    PubMed Central

    2010-01-01

    Background Puf proteins have important roles in controlling gene expression at the post-transcriptional level by promoting RNA decay and repressing translation. The Pumilio homology domain (PUM-HD) is a conserved region within Puf proteins that binds to RNA with sequence specificity. Although Puf proteins have been well characterized in animal and fungal systems, little is known about the structural and functional characteristics of Puf-like proteins in plants. Results The Arabidopsis and rice genomes code for 26 and 19 Puf-like proteins, respectively, each possessing eight or fewer Puf repeats in their PUM-HD. Key amino acids in the PUM-HD of several of these proteins are conserved with those of animal and fungal homologs, whereas other plant Puf proteins demonstrate extensive variability in these amino acids. Three-dimensional modeling revealed that the predicted structure of this domain in plant Puf proteins provides a suitable surface for binding RNA. Electrophoretic gel mobility shift experiments showed that the Arabidopsis AtPum2 PUM-HD binds with high affinity to BoxB of the Drosophila Nanos Response Element I (NRE1) RNA, whereas a point mutation in the core of the NRE1 resulted in a significant reduction in binding affinity. Transient expression of several of the Arabidopsis Puf proteins as fluorescent protein fusions revealed a dynamic, punctate cytoplasmic pattern of localization for most of these proteins. The presence of predicted nuclear export signals and accumulation of AtPuf proteins in the nucleus after treatment of cells with leptomycin B demonstrated that shuttling of these proteins between the cytosol and nucleus is common among these proteins. In addition to the cytoplasmically enriched AtPum proteins, two AtPum proteins showed nuclear targeting with enrichment in the nucleolus. Conclusions The Puf family of RNA-binding proteins in plants consists of a greater number of members than any other model species studied to date. 
This, along with the amino acid variability observed within their PUM-HDs, suggests that these proteins may be involved in a wide range of post-transcriptional regulatory events that are important in providing plants with the ability to respond rapidly to changes in environmental conditions and throughout development. PMID:20214804

  8. Proteus: a random forest classifier to predict disorder-to-order transitioning binding regions in intrinsically disordered proteins

    NASA Astrophysics Data System (ADS)

    Basu, Sankar; Söderquist, Fredrik; Wallner, Björn

    2017-05-01

    The focus of the computational structural biology community has taken a dramatic shift over the past one-and-a-half decades from the classical protein structure prediction problem to the possible understanding of intrinsically disordered proteins (IDP) or proteins containing regions of disorder (IDPR). The current interest lies in the unraveling of a disorder-to-order transitioning code embedded in the amino acid sequences of IDPs/IDPRs. Disordered proteins are characterized by an enormous amount of structural plasticity which makes them promiscuous in binding to different partners, multi-functional in cellular activity and atypical in folding energy landscapes resembling partially folded molten globules. Also, their involvement in several deadly human diseases (e.g. cancer, cardiovascular and neurodegenerative diseases) makes them attractive drug targets, and important for a biochemical understanding of the disease(s). The study of the structural ensemble of IDPs is rather difficult, in particular for transient interactions. When bound to a structured partner, an IDPR adopts an ordered conformation in the complex. The residues that undergo this disorder-to-order transition are called protean residues, generally found in short contiguous stretches, and the first step in understanding the modus operandi of an IDP/IDPR would be to predict these residues. There are a few available methods which predict these protean segments from their amino acid sequences; however, their performance reported in the literature leaves clear room for improvement. With this background, the current study presents 'Proteus', a random forest classifier that predicts the likelihood of a residue undergoing a disorder-to-order transition upon binding to a potential partner protein. The prediction is based on features that can be calculated using the amino acid sequence alone. 
Proteus compares favorably with existing methods, predicting twice as many true positives as the second best method (55 vs. 27%) with a much higher precision on an independent data set. The current study also sheds some light on a possible 'disorder-to-order' transitioning consensus, untangled, yet embedded in the amino acid sequence of IDPs. Some guidelines have also been suggested for proceeding with a real-life structural modeling involving an IDPR using Proteus.
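
    The two figures of merit quoted in this abstract, the fraction of true positives recovered and the precision, can be computed from per-residue labels as sketched below. This is a generic illustration, not the Proteus evaluation code; the function name and the eight-residue example are hypothetical.

```python
def precision_recall(true_labels, predicted_labels):
    """Return (precision, recall) for binary per-residue labels.

    A label of 1 marks a protean (disorder-to-order) residue, 0 any other.
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly predicted
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # over-predicted
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical eight-residue stretch: four protean residues, two of them found.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
pred = [1, 1, 0, 0, 1, 0, 0, 0]
precision, recall = precision_recall(truth, pred)
print(precision, recall)
```

    Recall here is the share of known protean residues that the predictor recovers, while precision is the share of predicted residues that are actually protean.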

  9. A clathrin coat assembly role for the muniscin protein central linker revealed by TALEN-mediated gene editing

    PubMed Central

    Umasankar, Perunthottathu K; Ma, Li; Thieman, James R; Jha, Anupma; Doray, Balraj; Watkins, Simon C; Traub, Linton M

    2014-01-01

    Clathrin-mediated endocytosis is an evolutionarily ancient membrane transport system regulating cellular receptivity and responsiveness. Plasmalemma clathrin-coated structures range from unitary domed assemblies to expansive planar constructions with internal or flanking invaginated buds. Precisely how these morphologically-distinct coats are formed, and whether all are functionally equivalent for selective cargo internalization is still disputed. We have disrupted the genes encoding a set of early arriving clathrin-coat constituents, FCHO1 and FCHO2, in HeLa cells. Endocytic coats do not disappear in this genetic background; rather clustered planar lattices predominate and endocytosis slows, but does not cease. The central linker of FCHO proteins acts as an allosteric regulator of the prime endocytic adaptor, AP-2. By loading AP-2 onto the plasma membrane, FCHO proteins provide a parallel pathway for AP-2 activation and clathrin-coat fabrication. Further, the steady-state morphology of clathrin-coated structures appears to be a manifestation of the availability of the muniscin linker during lattice polymerization. DOI: http://dx.doi.org/10.7554/eLife.04137.001 PMID:25303365

  10. Comparative NMR Analysis of an 80-Residue G Protein-Coupled Receptor Fragment in Two Membrane Mimetic Environments

    PubMed Central

    Cohen, LS; Arshava, B; Neumoin, A; Becker, JM; Güntert, P; Zerbe, O; Naider, F

    2011-01-01

    Fragments of integral membrane proteins have been used to study the physical chemical properties of regions of transporters and receptors. Ste2p(G31-T110) is an 80-residue polypeptide which contains a portion of the N-terminal domain, transmembrane domain 1 (TM1), intracellular loop 1, TM2 and part of extracellular loop 2 of the α-factor receptor (Ste2p) from Saccharomyces cerevisiae. The structure of this peptide was previously determined to form a helical hairpin in lyso-palmitoylphosphatidyl-glycerol micelles (LPPG)[1]. Herein, we perform a systematic comparison of the structure of this protein fragment in micelles and trifluoroethanol(TFE):water in order to understand whether spectra recorded in organic:aqueous medium can facilitate the structure determination in a micellar environment. Using uniformly labeled peptide and peptide selectively protonated on Ile, Val and Leu methyl groups in a perdeuterated background and a broad set of 3D NMR experiments we assigned 89% of the observable atoms. NOEs and chemical shift analysis were used to define the helical regions of the fragment. Together with constraints from paramagnetic spin labeling, NOEs were used to calculate a transiently folded helical hairpin structure for this peptide in TFE:water. Correlation of chemical shifts was insufficient to transfer assignments from TFE:water to LPPG spectra in the absence of further information. PMID:21791199

  11. External reflection FTIR of peptide monolayer films in situ at the air/water interface: experimental design, spectra-structure correlations, and effects of hydrogen-deuterium exchange.

    PubMed Central

    Flach, C R; Brauner, J W; Taylor, J W; Baldwin, R C; Mendelsohn, R

    1994-01-01

    A Fourier transform infrared spectrometer has been interfaced with a surface balance and a new external reflection infrared sampling accessory, which permits the acquisition of spectra from protein monolayers in situ at the air/water interface. The accessory, a sample shuttle that permits the collection of spectra in alternating fashion from sample and background troughs, reduces interference from water vapor rotation-vibration bands in the amide I and amide II regions of protein spectra (1520-1690 cm-1) by nearly an order of magnitude. Residual interference from water vapor absorbance ranges from 50 to 200 microabsorbance units. The performance of the device is demonstrated through spectra of synthetic peptides designed to adopt alpha-helical, antiparallel beta-sheet, mixed beta-sheet/beta-turn, and unordered conformations at the air/water interface. The extent of exchange on the surface can be monitored from the relative intensities of the amide II and amide I modes. Hydrogen-deuterium exchange may lower the amide I frequency by as much as 11-12 cm-1 for helical secondary structures. This shifts the vibrational mode into a region normally associated with unordered structures and leads to uncertainties in the application of algorithms commonly used for determination of secondary structure from amide I contours of proteins in D2O solution. PMID:7919013

  12. Prediction of TF target sites based on atomistic models of protein-DNA complexes

    PubMed Central

    Angarica, Vladimir Espinosa; Pérez, Abel González; Vasconcelos, Ana T; Collado-Vides, Julio; Contreras-Moreira, Bruno

    2008-01-01

    Background The specific recognition of genomic cis-regulatory elements by transcription factors (TFs) plays an essential role in the regulation of coordinated gene expression. Studying the mechanisms determining binding specificity in protein-DNA interactions is thus an important goal. Most current approaches for modeling TF specific recognition rely on the knowledge of large sets of cognate target sites and consider only the information contained in their primary sequence. Results Here we describe a structure-based methodology for predicting sequence motifs starting from the coordinates of a TF-DNA complex. Our algorithm combines information regarding the direct and indirect readout of DNA into an atomistic statistical model, which is used to estimate the interaction potential. We first measure the ability of our method to correctly estimate the binding specificities of eight prokaryotic and eukaryotic TFs that belong to different structural superfamilies. Secondly, the method is applied to two homology models, finding that sampling of interface side-chain rotamers remarkably improves the results. Thirdly, the algorithm is compared with a reference structural method based on contact counts, obtaining comparable predictions for the experimental complexes and more accurate sequence motifs for the homology models. Conclusion Our results demonstrate that atomic-detail structural information can be feasibly used to predict TF binding sites. The computational method presented here is universal and might be applied to other systems involving protein-DNA recognition. PMID:18922190

  13. Exploration of structural stability in deleterious nsSNPs of the XPA gene: A molecular dynamics approach

    PubMed Central

    NagaSundaram, N; Priya Doss, C George

    2011-01-01

    Background: Distinguishing the deleterious from the massive number of non-functional nsSNPs that occur within a single genome is a considerable challenge in mutation research. In this approach, we have used the existing in silico methods to explore the mutation-structure-function relationship in the XPA gene. Materials and Methods: We used the Sorting Intolerant From Tolerant (SIFT), Polymorphism Phenotyping (PolyPhen), I-Mutant 2.0, and the Protein Analysis THrough Evolutionary Relationships methods to predict the effects of deleterious nsSNPs on protein function and evaluated the impact of mutation on protein stability by molecular dynamics simulations. Results: By comparing the scores of all four in silico methods, the nsSNP with ID rs104894131 at position C108F was predicted to be highly deleterious. We extended our molecular dynamics approach to gain insight into the impact of this non-synonymous polymorphism on structural changes that may affect the activity of the XPA gene. Conclusion: Based on the in silico methods score, potential energy, root-mean-square deviation, and root-mean-square fluctuation, we predict that the deleterious nsSNP at position C108F would play a significant role in causing disease by the XPA gene. Our approach would present the application of in silico tools in understanding the functional variation from the perspective of structure, evolution, and phenotype. PMID:22190868
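
    Two of the criteria named in this abstract, root-mean-square deviation (RMSD) and root-mean-square fluctuation (RMSF), follow standard definitions that a short sketch can make explicit. This is not the authors' simulation pipeline: it assumes coordinates are given as (x, y, z) tuples and that all frames are already superimposed on a common reference. The function names are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally sized, pre-aligned coordinate lists."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def rmsf(trajectory):
    """Per-atom RMSF; trajectory[frame][atom] is an (x, y, z) tuple."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        # Mean squared displacement of atom i from its average position.
        msf = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msf))
    return result
```

    RMSD summarizes how far a whole structure drifts from a reference, whereas RMSF is reported per atom or residue and highlights locally flexible regions, which is why the two are usually read together when comparing native and mutant trajectories.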

  14. In vitro modelling of familial amyloidotic polyneuropathy allows quantitative detection of transthyretin amyloid fibril-like structures in hepatic derivatives of patient-specific induced pluripotent stem cells.

    PubMed

    Hoepfner, Jeannine; Kleinsorge, Mandy; Papp, Oliver; Alfken, Susanne; Heiringhoff, Robin; Pich, Andreas; Sauer, Vanessa; Zibert, Andree; Göhring, Gudrun; Schmidt, Hartmut; Sgodda, Malte; Cantz, Tobias

    2017-07-26

    The transthyretin protein is thermodynamically destabilised by mutations in the transthyretin gene, promoting the formation of amyloid fibrils in various tissues. Consequently, impaired autonomic organ function is observed in patients suffering from transthyretin-related familial amyloidotic polyneuropathy (FAP). The influence of individual genetic backgrounds on fibril formation as a potential cause of genotype-phenotype variations needs to be investigated in order to ensure efficient patient-specific therapies. We reprogrammed FAP patient fibroblasts to induced pluripotent stem (iPS) cells and differentiated these cells into transthyretin-expressing hepatocyte-like cells (HLCs). HLCs differentiated from FAP iPS cells and healthy control iPS cells secreted the transthyretin protein in similar concentrations. Mass spectrometry revealed the presence of mutant transthyretin protein in FAP HLC supernatants. In comparison to healthy control iPS cells, we demonstrated the formation of transthyretin amyloid fibril-like structures in FAP HLC supernatants using the amyloid-specific dyes Congo red and thioflavin T. These dyes were also applicable for the quantitative determination of in vitro formed transthyretin fibril-like structures. Moreover, we confirmed the inhibition of fibril formation by the TTR kinetic stabiliser diclofenac. Thioflavin T fluorescence intensity measurements even allowed the quantification of amyloid fibril-like structures in 96-well plate formats as a prerequisite for patient-specific drug screening approaches.

  15. Modular prediction of protein structural classes from sequences of twilight-zone identity with predicting sequences

    PubMed Central

    2009-01-01

    Background Knowledge of structural class is used by numerous methods for identification of structural/functional characteristics of proteins and could be used for the detection of remote homologues, particularly for chains that share twilight-zone similarity. In contrast to existing sequence-based structural class predictors, which target four major classes and which are designed for high identity sequences, we predict seven classes from sequences that share twilight-zone identity with the training sequences. Results The proposed MODular Approach to Structural class prediction (MODAS) method is unique as it allows for selection of any subset of the classes. MODAS is also the first to utilize a novel, custom-built feature-based sequence representation that combines evolutionary profiles and predicted secondary structure. The features quantify information relevant to the definition of the classes including conservation of residues and arrangement and number of helix/strand segments. Our comprehensive design considers 8 feature selection methods and 4 classifiers to develop Support Vector Machine-based classifiers that are tailored for each of the seven classes. Tests on 5 twilight-zone and 1 high-similarity benchmark datasets and comparison with over two dozen modern competing predictors show that MODAS provides the best overall accuracy that ranges between 80% and 96.7% (83.5% for the twilight-zone datasets), depending on the dataset. This translates into 19% and 8% error rate reduction when compared against the best performing competing method on the two largest datasets. The proposed predictor provides accurate predictions at 58% accuracy for the membrane protein class, which is not considered by the majority of existing methods, even though this class accounts for only 2% of the data. Our predictive model is analyzed to demonstrate how and why the input features are associated with the corresponding classes. 
Conclusions The improved predictions stem from the novel features that express collocation of the secondary structure segments in the protein sequence and that combine evolutionary and secondary structure information. Our work demonstrates that conservation and arrangement of the secondary structure segments predicted along the protein chain can successfully predict structural classes which are defined based on the spatial arrangement of the secondary structures. A web server is available at http://biomine.ece.ualberta.ca/MODAS/. PMID:20003388

  16. FRASS: the web-server for RNA structural comparison

    PubMed Central

    2010-01-01

    Background The impressive increase of novel RNA structures during the past few years demands automated methods for structure comparison. While many algorithms handle only small motifs, a few techniques developed in recent years (ARTS, DIAL, SARA, SARSA, and LaJolla) are available for the structural comparison of large and intact RNA molecules. Results The FRASS web-server represents an RNA chain with its Gauss integrals and allows one to compare structures of RNA chains and to find similar entries in a database derived from the Protein Data Bank. We observed that FRASS scores correlate well with the ARTS and LaJolla similarity scores. Moreover, the web-server can also reproduce satisfactorily the DARTS classification of RNA 3D structures and the classification of the SCOR functions that was obtained by the SARA method. Conclusions The FRASS web-server can be easily used to detect relationships among RNA molecules and to scan efficiently the rapidly enlarging structural databases. PMID:20553602

  17. Why do proteins aggregate? “Intrinsically insoluble proteins” and “dark mediators” revealed by studies on “insoluble proteins” solubilized in pure water

    PubMed Central

    Song, Jianxing

    2013-01-01

    In 2008, I reviewed and proposed a model for our discovery in 2005 that unrefoldable and insoluble proteins could in fact be solubilized in unsalted water. Since then, this discovery has offered us and other groups a powerful tool to characterize insoluble proteins, and we have further addressed several fundamental and disease-relevant issues associated with this discovery. Here I review these results, which are conceptualized into several novel scenarios. 1) Unlike 'misfolded proteins', which still retain the capacity to fold into well-defined structures but are misled to 'off-pathway' aggregation, unrefoldable and insoluble proteins completely lack this ability and will unavoidably aggregate in vivo with ~150 mM ions, thus designated as 'intrinsically insoluble proteins (IIPs)' here. IIPs may largely account for the 'wastefully synthesized' DRiPs identified in human cells. 2) The fact that IIPs including membrane proteins are all soluble in unsalted water, but get aggregated upon being exposed to ions, logically suggests that ions existing in the background play a central role in mediating protein aggregation, thus acting as 'dark mediators'. Our study with 14 salts confirms that IIPs lack the capacity to fold into any well-defined structures. We uncover that salts modulate protein dynamics and anions bind proteins with high selectivity and affinity, which is surprisingly masked by pre-existing ions. Accordingly, I modified my previous model. 3) Insoluble proteins interact with lipids to different degrees. Remarkably, an ALS-causing P56S mutation transforms the β-sandwich MSP domain into a helical integral membrane protein. Consequently, the number of membrane-interacting proteins might be much larger than currently recognized. To attack biological membranes may represent a common mechanism by which aggregated proteins initiate human diseases. 
4) Our discovery also implies a solution to the 'chicken-and-egg paradox' for the origin of primitive membranes embedded with integral membrane proteins, if proteins originally emerged in unsalted prebiotic media. PMID:24555050

  18. Mapping the distribution of packing topologies within protein interiors shows predominant preference for specific packing motifs

    PubMed Central

    2011-01-01

    Background Mapping protein primary sequences to their three dimensional folds referred to as the 'second genetic code' remains an unsolved scientific problem. A crucial part of the problem concerns the geometrical specificity in side chain association leading to densely packed protein cores, a hallmark of correctly folded native structures. Thus, any model of packing within proteins should constitute an indispensable component of protein folding and design. Results In this study an attempt has been made to find, characterize and classify recurring patterns in the packing of side chain atoms within a protein which sustains its native fold. The interaction of side chain atoms within the protein core has been represented as a contact network based on the surface complementarity and overlap between associating side chain surfaces. Some network topologies definitely appear to be preferred and they have been termed 'packing motifs', analogous to super secondary structures in proteins. Study of the distribution of these motifs reveals the ubiquitous presence of typical smaller graphs, which appear to get linked or coalesce to give larger graphs, reminiscent of the nucleation-condensation model in protein folding. One such frequently occurring motif, also envisaged as the unit of clustering, the three residue clique was invariably found in regions of dense packing. Finally, topological measures based on surface contact networks appeared to be effective in discriminating sequences native to a specific fold amongst a set of decoys. Conclusions Out of innumerable topological possibilities, only a finite number of specific packing motifs are actually realized in proteins. This small number of motifs could serve as a basis set in the construction of larger networks. Of these, the triplet clique exhibits distinct preference both in terms of composition and geometry. PMID:21605466
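
    The "three residue clique" motif described in this abstract corresponds, in graph terms, to a triangle in the side-chain contact network. The sketch below enumerates such triangles from a list of contacts. It is an illustration only: the study derives contacts from surface complementarity and overlap, whereas here the contact list is simply assumed as input, and the function name and residue numbers are hypothetical.

```python
from itertools import combinations


def three_residue_cliques(contacts):
    """List all triangles (3-cliques) in an undirected contact network.

    contacts: iterable of (residue_a, residue_b) pairs; residues must be
    mutually comparable (e.g. integer residue numbers).
    Returns a sorted list of (a, b, c) tuples with a < b < c.
    """
    adjacency = {}
    for a, b in contacts:
        if a == b:
            continue  # ignore self-contacts
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    cliques = []
    for u in sorted(adjacency):
        # Only look at higher-numbered neighbours so each triangle is found once.
        higher = sorted(n for n in adjacency[u] if n > u)
        for v, w in combinations(higher, 2):
            if w in adjacency[v]:
                cliques.append((u, v, w))
    return cliques


# Toy network: residues 10, 25 and 40 are mutually in contact; 60 touches only 40.
toy_contacts = [(10, 25), (25, 40), (10, 40), (40, 60)]
toy_cliques = three_residue_cliques(toy_contacts)
```

    Larger packing motifs could be examined in the same spirit by looking for triangles that share an edge or a vertex, which mirrors the abstract's observation that smaller graphs appear to link or coalesce into larger ones.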

  19. The ancient history of the structure of ribonuclease P and the early origins of Archaea

    PubMed Central

    2010-01-01

    Background Ribonuclease P is an ancient endonuclease that cleaves precursor tRNA and generally consists of a catalytic RNA subunit (RPR) and one or more proteins (RPPs). It represents an important macromolecular complex and model system that is universally distributed in life. Its putative origins have inspired fundamental hypotheses, including the proposal of an ancient RNA world. Results To study the evolution of this complex, we constructed rooted phylogenetic trees of RPR molecules and substructures and estimated RPP age using a cladistic method that embeds structure directly into phylogenetic analysis. The general approach was used previously to study the evolution of tRNA, SINE RNA and 5S rRNA, the origins of metabolism, and the evolution and complexity of the protein world, and revealed here remarkable evolutionary patterns. Trees of molecules uncovered the tripartite nature of life and the early origin of archaeal RPRs. Trees of substructures showed molecules originated in stem P12 and were accessorized with a catalytic P1-P4 core structure before the first substructure was lost in Archaea. This core currently interacts with RPPs and ancient segments of the tRNA molecule. Finally, a census of protein domain structure in hundreds of genomes established RPPs appeared after the rise of metabolic enzymes at the onset of the protein world. Conclusions The study provides a detailed account of the history and early diversification of a fundamental ribonucleoprotein and offers further evidence in support of the existence of a tripartite organismal world that originated by the segregation of archaeal lineages from an ancient community of primordial organisms. PMID:20334683

  20. Characterization of the SAM domain of the PKD-related protein ANKS6 and its interaction with ANKS3

    PubMed Central

    2014-01-01

    Background Autosomal dominant polycystic kidney disease (ADPKD) is the most common genetic disorder leading to end-stage renal failure in humans. In the PKD/Mhm(cy/+) rat model of ADPKD, the point mutation R823W in the sterile alpha motif (SAM) domain of the protein ANKS6 is responsible for disease. SAM domains are known protein-protein interaction domains, capable of binding each other to form polymers and heterodimers. Despite its physiological importance, little is known about the function of ANKS6 and how the R823W point mutation leads to PKD. Recent work has revealed that ANKS6 interacts with a related protein called ANKS3. Both ANKS6 and ANKS3 have a similar domain structure, with ankyrin repeats at the N-terminus and a SAM domain at the C-terminus. Results The SAM domain of ANKS3 is identified as a direct binding partner of the ANKS6 SAM domain. We find that ANKS3-SAM polymerizes and ANKS6-SAM can bind to one end of the polymer. We present crystal structures of both the ANKS3-SAM polymer and the ANKS3-SAM/ANKS6-SAM complex, revealing the molecular details of their association. We also learn how the R823W mutation disrupts ANKS6 function by dramatically destabilizing the SAM domain such that the interaction with ANKS3-SAM is lost. Conclusions ANKS3 is a direct interacting partner of ANKS6. By structurally and biochemically characterizing the interaction between the ANKS3 and ANKS6 SAM domains, our work provides a basis for future investigation of how the interaction between these proteins mediates kidney function. PMID:24998259

  1. In silico characterization of a novel pathogenic deletion mutation identified in XPA gene in a Pakistani family with severe xeroderma pigmentosum

    PubMed Central

    2013-01-01

    Background Xeroderma Pigmentosum (XP) is a rare skin disorder characterized by skin hypersensitivity to sunlight and abnormal pigmentation. The aim of this study was to investigate the genetic cause of a severe XP phenotype in a consanguineous Pakistani family and to characterize in silico any identified disease-associated mutation. Results The XP complementation group was assigned by genotyping the family for known XP loci. Genotyping data mapped the family to the complementation group A locus, involving the XPA gene. Mutation analysis of the candidate XP gene by DNA sequencing revealed a novel deletion mutation (c.654del A) in exon 5 of the XPA gene. The c.654del A mutation causes a frameshift, which prematurely terminates the protein and results in a truncated product of 222 amino acid (aa) residues instead of 273 (p.Lys218AsnfsX5). In silico tools were applied to study the likelihood of changes in structural motifs and thus in the interaction of the mutated protein with binding partners. In silico analysis of the mutant protein sequence predicted that the mutation affects the aa residue which attains a coiled-coil structure. The coiled-coil structure has an important role in key cellular interactions, especially with DNA damage-binding protein 2 (DDB2), which has an important role in the DDB-mediated nucleotide excision repair (NER) system. Conclusions Our findings support the genetic and clinical heterogeneity of XP. The study also predicts a critical role for the DDB2 binding region of the XPA protein in the NER pathway and opens an avenue for further research to study the functional role of the mutated protein domain. PMID:24063568

  2. Characterization of a Gene Family Encoding SEA (Sea-urchin Sperm Protein, Enterokinase and Agrin)-Domain Proteins with Lectin-Like and Heme-Binding Properties from Schistosoma japonicum

    PubMed Central

    Mbanefo, Evaristus Chibunna; Kikuchi, Mihoko; Huy, Nguyen Tien; Shuaibu, Mohammed Nasir; Cherif, Mahamoud Sama; Yu, Chuanxin; Wakao, Masahiro; Suda, Yasuo; Hirayama, Kenji

    2014-01-01

    Background We previously identified a novel gene family dispersed in the genome of Schistosoma japonicum by retrotransposon-mediated gene duplication mechanism. Although many transcripts were identified, no homolog was readily identifiable from sequence information. Methodology/Principal Findings Here, we utilized structural homology modeling and biochemical methods to identify remote homologs, and characterized the gene products as SEA (sea-urchin sperm protein, enterokinase and agrin)-domain containing proteins. A common extracellular domain in this family was structurally similar to SEA-domain. SEA-domain is primarily a structural domain, known to assist or regulate binding to glycans. Recombinant proteins from three members of this gene family specifically interacted with glycosaminoglycans with high affinity, with potential implication in ligand acquisition and immune evasion. Similar approach was used to identify a heme-binding site on the SEA-domain. The heme-binding mode showed heme molecule inserted into a hydrophobic pocket, with heme iron putatively coordinated to two histidine axial ligands. Heme-binding properties were confirmed using biochemical assays and UV-visible absorption spectroscopy, which showed high affinity heme-binding (K_D = 1.605 × 10^-6 M) and cognate spectroscopic attributes of hexa-coordinated heme iron. The native proteins were oligomers, antigenic, and are localized on adult worm teguments and gastrodermis; major host-parasite interfaces and site for heme detoxification and acquisition. Conclusions The results suggest potential role, at least in the nucleation step of heme crystallization (hemozoin formation), and as receptors for heme uptake. Survival strategies exploited by parasites, including heme homeostasis mechanism in hemoparasites, are paramount for successful parasitism. Thus, assessing prospects for application in disease intervention is warranted. PMID:24416467

  3. Biophysical and structural considerations for protein sequence evolution

    PubMed Central

    2011-01-01

    Background Protein sequence evolution is constrained by the biophysics of folding and function, causing interdependence between interacting sites in the sequence. However, current site-independent models of sequence evolutions do not take this into account. Recent attempts to integrate the influence of structure and biophysics into phylogenetic models via statistical/informational approaches have not resulted in expected improvements in model performance. This suggests that further innovations are needed for progress in this field. Results Here we develop a coarse-grained physics-based model of protein folding and binding function, and compare it to a popular informational model. We find that both models violate the assumption of the native sequence being close to a thermodynamic optimum, causing directional selection away from the native state. Sampling and simulation show that the physics-based model is more specific for fold-defining interactions that vary less among residue type. The informational model diffuses further in sequence space with fewer barriers and tends to provide less support for an invariant sites model, although amino acid substitutions are generally conservative. Both approaches produce sequences with natural features like dN/dS < 1 and gamma-distributed rates across sites. Conclusions Simple coarse-grained models of protein folding can describe some natural features of evolving proteins but are currently not accurate enough to use in evolutionary inference. This is partly due to improper packing of the hydrophobic core. We suggest possible improvements on the representation of structure, folding energy, and binding function, as regards both native and non-native conformations, and describe a large number of possible applications for such a model. PMID:22171550

  4. A Versatile System for High-Throughput In Situ X-ray Screening and Data Collection of Soluble and Membrane-Protein Crystals

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Broecker, Jana; Klingel, Viviane; Ou, Wei-Lin

    In recent years, in situ data collection has been a major focus of progress in protein crystallography. Here, we introduce the Mylar in situ method using Mylar-based sandwich plates that are inexpensive, easy to make and handle, and show significantly less background scattering than other setups. A variety of cognate holders for patches of Mylar in situ sandwich films corresponding to one or more wells makes the method robust and versatile, allows for storage and shipping of entire wells, and enables automated crystal imaging, screening, and goniometer-based X-ray diffraction data-collection at room temperature and under cryogenic conditions for soluble and membrane-protein crystals grown in or transferred to these plates. We validated the Mylar in situ method using crystals of the water-soluble proteins hen egg-white lysozyme and sperm whale myoglobin as well as the 7-transmembrane protein bacteriorhodopsin from Haloquadratum walsbyi. In conjunction with current developments at synchrotrons, this approach promises high-resolution structural studies of membrane proteins to become faster and more routine.

  5. A Common Suite of Coagulation Proteins Function in Drosophila Muscle Attachment

    PubMed Central

    Green, Nicole; Odell, Nadia; Zych, Molly; Clark, Cheryl; Wang, Zong-Heng; Biersmith, Bridget; Bajzek, Clara; Cook, Kevin R.; Dushay, Mitchell S.; Geisbrecht, Erika R.

    2016-01-01

    The organization and stability of higher order structures that form in the extracellular matrix (ECM) to mediate the attachment of muscles are poorly understood. We have made the surprising discovery that a subset of clotting factor proteins are also essential for muscle attachment in the model organism Drosophila melanogaster. One such coagulation protein, Fondue (Fon), was identified as a novel muscle mutant in a pupal lethal genetic screen. Fon accumulates at muscle attachment sites and removal of this protein results in decreased locomotor behavior and detached larval muscles. A sensitized genetic background assay reveals that fon functions with the known muscle attachment genes Thrombospondin (Tsp) and Tiggrin (Tig). Interestingly, Tig is also a component of the hemolymph clot. We further demonstrate that an additional clotting protein, Larval serum protein 1γ (Lsp1γ), is also required for muscle attachment stability and accumulates where muscles attach to tendons. While the local biomechanical and organizational properties of the ECM vary greatly depending on the tissue microenvironment, we propose that shared extracellular protein–protein interactions influence the strength and elasticity of ECM proteins in both coagulation and muscle attachment. PMID:27585844

  6. New statistical potential for quality assessment of protein models and a survey of energy functions

    PubMed Central

    2010-01-01

    Background Scoring functions, such as molecular mechanics force fields and statistical potentials, are fundamentally important tools in protein structure modeling and quality assessment. Results The performances of a number of publicly available scoring functions are compared with statistical rigor, with an emphasis on knowledge-based potentials. We explored the effect on accuracy of alternative choices for representing interaction center types and other features of scoring functions, such as using information on solvent accessibility, on torsion angles, accounting for secondary structure preferences and side chain orientation. Partially based on the observations made, we present a novel residue based statistical potential, which employs a shuffled reference state definition and takes into account the mutual orientation of residue side chains. Atom- and residue-level statistical potentials and Linux executables to calculate the energy of a given protein proposed in this work can be downloaded from http://www.fiserlab.org/potentials. Conclusions Among the most influential terms we observed a critical role of a proper reference state definition and the benefits of including information about the microenvironment of interaction centers. Molecular mechanical potentials were also tested and found to be over-sensitive to small local imperfections in a structure, requiring unfeasibly long energy relaxation before energy scores started to correlate with model quality. PMID:20226048

  7. Optical switch probes and optical lock-in detection (OLID) imaging microscopy: high-contrast fluorescence imaging within living systems.

    PubMed

    Yan, Yuling; Marriott, M Emma; Petchprayoon, Chutima; Marriott, Gerard

    2011-02-01

    Few to single molecule imaging of fluorescent probe molecules can provide information on the distribution, dynamics, interactions and activity of specific fluorescently tagged proteins during cellular processes. Unfortunately, these imaging studies are made challenging in living cells because of fluorescence signals from endogenous cofactors. Moreover, related background signals within multi-cell systems and intact tissue are even higher and reduce signal contrast even for ensemble populations of probe molecules. High-contrast optical imaging within high-background environments will therefore require new ideas on the design of fluorescence probes, and the way their fluorescence signals are generated and analysed to form an image. To this end, in the present review we describe recent studies on a new family of fluorescent probe called optical switches, with descriptions of the mechanisms that underlie their ability to undergo rapid and reversible transitions between two distinct states. Optical manipulation of the fluorescent and non-fluorescent states of an optical switch probe generates a modulated fluorescence signal that can be isolated from a larger unmodulated background by using OLID (optical lock-in detection) techniques. The present review concludes with a discussion on select applications of synthetic and genetically encoded optical switch probes and OLID microscopy for high-contrast imaging of specific proteins and membrane structures within living systems.

  8. Mapping small molecule binding data to structural domains

    PubMed Central

    2012-01-01

    Background Large-scale bioactivity/SAR Open Data has recently become available, and this has allowed new analyses and approaches to be developed to help address the productivity and translational gaps of current drug discovery. One of the current limitations of these data is the relative sparsity of reported interactions per protein target, and complexities in establishing clear relationships between bioactivity and targets using bioinformatics tools. We detail in this paper the indexing of targets by the structural domains that bind (or are likely to bind) the ligand within a full-length protein. Specifically, we present a simple heuristic to map small molecule binding to Pfam domains. This profiling can be applied to all proteins within a genome to give some indications of the potential pharmacological modulation and regulation of all proteins. Results In this implementation of our heuristic, ligand binding to protein targets from the ChEMBL database was mapped to structural domains as defined by profiles contained within the Pfam-A database. Our mapping suggests that the majority of assay targets within the current version of the ChEMBL database bind ligands through a small number of highly prevalent domains, and conversely the majority of Pfam domains sampled by our data play no currently established role in ligand binding. Validation studies, carried out firstly against Uniprot entries with expert binding-site annotation and secondly against entries in the wwPDB repository of crystallographic protein structures, demonstrate that our simple heuristic maps ligand binding to the correct domain in about 90 percent of all assessed cases. Using the mappings obtained with our heuristic, we have assembled ligand sets associated with each Pfam domain. Conclusions Small molecule binding has been mapped to Pfam-A domains of protein targets in the ChEMBL bioactivity database. 
    The result of this mapping is an enriched annotation of small molecule bioactivity data and a grouping of activity classes following the Pfam-A specifications of protein domains. This is valuable for data-focused approaches in drug discovery, for example when extrapolating potential targets of a small molecule with known activity against one or few targets, or in the assessment of a potential target for drug discovery or screening studies. PMID:23282026

  9. KEGG orthology-based annotation of the predicted proteome of Acropora digitifera: ZoophyteBase - an open access and searchable database of a coral genome

    PubMed Central

    2013-01-01

    Background Contemporary coral reef research has firmly established that a genomic approach is urgently needed to better understand the effects of anthropogenic environmental stress and global climate change on coral holobiont interactions. Here we present KEGG orthology-based annotation of the complete genome sequence of the scleractinian coral Acropora digitifera and provide the first comprehensive view of the genome of a reef-building coral by applying advanced bioinformatics. Description Sequences from the KEGG database of protein function were used to construct hidden Markov models. These models were used to search the predicted proteome of A. digitifera to establish complete genomic annotation. The annotated dataset is published in ZoophyteBase, an open access format with different options for searching the data. A particularly useful feature is the ability to use a Google-like search engine that links query words to protein attributes. We present features of the annotation that underpin the molecular structure of key processes of coral physiology that include (1) regulatory proteins of symbiosis, (2) planula and early developmental proteins, (3) neural messengers, receptors and sensory proteins, (4) calcification and Ca2+-signalling proteins, (5) plant-derived proteins, (6) proteins of nitrogen metabolism, (7) DNA repair proteins, (8) stress response proteins, (9) antioxidant and redox-protective proteins, (10) proteins of cellular apoptosis, (11) microbial symbioses and pathogenicity proteins, (12) proteins of viral pathogenicity, (13) toxins and venom, (14) proteins of the chemical defensome and (15) coral epigenetics. 
    Conclusions We advocate that providing annotation in an open-access searchable database available to the public domain will give an unprecedented foundation to interrogate the fundamental molecular structure and interactions of coral symbiosis and allow critical questions to be addressed at the genomic level based on combined aspects of evolutionary, developmental, metabolic, and environmental perspectives. PMID:23889801

  10. Deep Illumina-Based Shotgun Sequencing Reveals Dietary Effects on the Structure and Function of the Fecal Microbiome of Growing Kittens

    PubMed Central

    Deusch, Oliver; O’Flynn, Ciaran; Colyer, Alison; Morris, Penelope; Allaway, David; Jones, Paul G.; Swanson, Kelly S.

    2014-01-01

    Background Previously, we demonstrated that the dietary protein:carbohydrate ratio dramatically affects the fecal microbial taxonomic structure of kittens using targeted 16S gene sequencing. The present study, using the same fecal samples, applied deep Illumina shotgun sequencing to identify the diet-associated functional potential and analyze taxonomic changes of the feline fecal microbiome. Methodology & Principal Findings Fecal samples from kittens fed one of two diets differing in protein and carbohydrate content (high-protein, low-carbohydrate, HPLC; and moderate-protein, moderate-carbohydrate, MPMC) were collected at 8, 12 and 16 weeks of age (n = 6 per group). A total of 345.3 gigabases of sequence were generated from 36 samples, with 99.75% of annotated sequences identified as bacterial. At the genus level, 26% and 39% of reads were annotated for HPLC- and MPMC-fed kittens, with HPLC-fed cats showing greater species richness and microbial diversity. Two phyla, ten families and fifteen genera were responsible for more than 80% of the sequences at each taxonomic level for both diet groups, consistent with the previous taxonomic study. Significantly different abundances between diet groups were observed for 324 genera (56% of all genera identified), demonstrating widespread diet-induced changes in microbial taxonomic structure. Diversity was not affected over time. Functional analysis identified 2,013 putative enzyme function groups that differed (p<0.000007) between the two dietary groups and were associated with 194 pathways, which formed five discrete clusters based on average relative abundance. Of those, ten contained more (p<0.022) enzyme functions with significant diet effects than expected by chance. Six pathways were related to amino acid biosynthesis and metabolism, linking changes in dietary protein with functional differences of the gut microbiome. Conclusions These data indicate that feline feces-derived microbiomes have large structural and functional differences relating to the dietary protein:carbohydrate ratio and highlight the impact of diet early in life. PMID:25010839

  11. Scanning light-sheet microscopy in the whole mouse brain with HiLo background rejection

    PubMed Central

    Mertz, Jerome; Kim, Jinhyun

    2010-01-01

    It is well known that light-sheet illumination can enable optically sectioned wide-field imaging of macroscopic samples. However, the optical sectioning capacity of a light-sheet macroscope is undermined by sample-induced scattering or aberrations that broaden the thickness of the sheet illumination. We present a technique to enhance the optical sectioning capacity of a scanning light-sheet microscope by out-of-focus background rejection. The technique, called HiLo microscopy, makes use of two images sequentially acquired with uniform and structured sheet illumination. An optically sectioned image is then synthesized by fusing high and low spatial frequency information from both images. The benefits of combining light-sheet macroscopy and HiLo background rejection are demonstrated in optically cleared whole mouse brain samples, using both green fluorescent protein (GFP)-fluorescence and dark-field scattered light contrast. PMID:20210471
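
    The fusion step described above can be sketched in one dimension. The Python code below is a simplified illustration under stated assumptions, not the authors' processing pipeline: it uses a box filter as the low-pass filter, takes the local standard deviation of the structured/uniform ratio as the in-focus weight, and combines the two bands as eta * Lo + Hi. All function names, the filter radius and the toy signals are hypothetical.

```python
import math

def box_filter(signal, radius):
    """Moving-average low-pass filter with windows clipped at the edges."""
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - radius): i + radius + 1]
        out.append(sum(window) / len(window))
    return out

def local_contrast(uniform, structured, radius):
    """Local std of the structured/uniform ratio; nonzero only where the pattern survives."""
    ratio = [s / u if u else 0.0 for s, u in zip(structured, uniform)]
    mean = box_filter(ratio, radius)
    mean_sq = box_filter([r * r for r in ratio], radius)
    return [math.sqrt(max(0.0, m2 - m * m)) for m, m2 in zip(mean, mean_sq)]

def hilo(uniform, structured, radius=4, eta=1.0):
    """Fuse low frequencies (contrast-weighted) with high frequencies of the uniform image."""
    contrast = local_contrast(uniform, structured, radius)
    lo = box_filter([c * u for c, u in zip(contrast, uniform)], radius)
    hi = [u - l for u, l in zip(uniform, box_filter(uniform, radius))]
    return [eta * l + h for l, h in zip(lo, hi)]

# Toy data: the illumination pattern is visible (in focus) in the left half only;
# the right half is featureless out-of-focus background of the same brightness.
n = 64
uniform = [10.0] * n
structured = [10.0 * (1 + 0.5 * math.cos(math.pi * i / 2)) if i < n // 2 else 10.0
              for i in range(n)]
sectioned = hilo(uniform, structured)
```

    In the toy output the left half keeps a clearly positive signal, while the right half is rejected even though it is just as bright in the uniform image.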

  12. The minimizing of fluorescence background in Raman optical activity and Raman spectra of human blood plasma.

    PubMed

    Tatarkovič, Michal; Synytsya, Alla; Šťovíčková, Lucie; Bunganič, Bohuš; Miškovičová, Michaela; Petruželka, Luboš; Setnička, Vladimír

    2015-02-01

    Raman optical activity (ROA) is inherently sensitive to the secondary structure of biomolecules, which makes it a method of interest for finding new approaches to clinical applications based on blood plasma analysis, for instance the diagnostics of several protein-misfolding diseases. Unfortunately, real blood plasma exhibits strong background fluorescence when excited at 532 nm; hence, measuring the ROA spectra appears to be impossible. Therefore, we established a suitable method using a combination of kinetic quenchers, filtering, photobleaching, and a mathematical correction of residual fluorescence. Our method reduced the background fluorescence by approximately 90%, which sped up each measurement by an average of 50%. In addition, the signal-to-noise ratio was significantly increased, while the baseline distortion remained low. We assume that our method is suitable for the investigation of human blood plasma by ROA and may lead to the development of a new tool for clinical diagnostics.

  13. Automated hierarchical classification of protein domain subfamilies based on functionally-divergent residue signatures

    PubMed Central

    2012-01-01

    Background The NCBI Conserved Domain Database (CDD) consists of a collection of multiple sequence alignments of protein domains that are at various stages of being manually curated into evolutionary hierarchies based on conserved and divergent sequence and structural features. These domain models are annotated to provide insights into the relationships between sequence, structure and function via web-based BLAST searches. Results Here we automate the generation of conserved domain (CD) hierarchies using a combination of heuristic and Markov chain Monte Carlo (MCMC) sampling procedures and starting from a (typically very large) multiple sequence alignment. This procedure relies on statistical criteria to define each hierarchy based on the conserved and divergent sequence patterns associated with protein functional-specialization. At the same time this facilitates the sequence and structural annotation of residues that are functionally important. These statistical criteria also provide a means to objectively assess the quality of CD hierarchies, a non-trivial task considering that the protein subgroups are often very distantly related—a situation in which standard phylogenetic methods can be unreliable. Our aim here is to automatically generate (typically sub-optimal) hierarchies that, based on statistical criteria and visual comparisons, are comparable to manually curated hierarchies; this serves as the first step toward the ultimate goal of obtaining optimal hierarchical classifications. A plot of runtimes for the most time-intensive (non-parallelizable) part of the algorithm indicates a nearly linear time complexity so that, even for the extremely large Rossmann fold protein class, results were obtained in about a day. Conclusions This approach automates the rapid creation of protein domain hierarchies and thus will eliminate one of the most time consuming aspects of conserved domain database curation. 
    At the same time, it also facilitates protein domain annotation by identifying those pattern residues that most distinguish each protein domain subgroup from other related subgroups. PMID:22726767

  14. The crystal structure of Haloferax volcanii proliferating cell nuclear antigen reveals unique surface charge characteristics due to halophilic adaptation

    PubMed Central

    Winter, Jody A; Christofi, Panayiotis; Morroll, Shaun; Bunting, Karen A

    2009-01-01

    Background The high intracellular salt concentration required to maintain a halophilic lifestyle poses challenges to haloarchaeal proteins that must stay soluble, stable and functional in this extreme environment. Proliferating cell nuclear antigen (PCNA) is a fundamental protein involved in maintaining genome integrity, with roles in both DNA replication and repair. To investigate the halophilic adaptation of such a key protein we have crystallised and solved the structure of Haloferax volcanii PCNA (HvPCNA) to a resolution of 2.0 Å. Results The overall architecture of HvPCNA is very similar to other known PCNAs, which are highly structurally conserved. Three commonly observed adaptations in halophilic proteins are higher surface acidity, bound ions and increased numbers of intermolecular ion pairs (in oligomeric proteins). HvPCNA possesses the former two adaptations but not the latter, despite functioning as a homotrimer. Strikingly, the positive surface charge considered key to PCNA's role as a sliding clamp is dramatically reduced in the halophilic protein. Instead, bound cations within the solvation shell of HvPCNA may permit sliding along negatively charged DNA by reducing electrostatic repulsion effects. Conclusion The extent to which individual proteins adapt to halophilic conditions varies, presumably due to their diverse characteristics and roles within the cell. The number of ion pairs observed in the HvPCNA monomer-monomer interface was unexpectedly low. This may reflect the fact that the trimer is intrinsically stable over a wide range of salt concentrations and therefore additional modifications for trimer maintenance in high salt conditions are not required. Halophilic proteins frequently bind anions and cations and in HvPCNA cation binding may compensate for the remarkable reduction in positive charge in the pore region, to facilitate functional interactions with DNA. 
    In this way, HvPCNA may harness its environment as opposed to simply surviving in extreme halophilic conditions. PMID:19698123

  15. RNase-assisted RNA chromatography

    PubMed Central

    Michlewski, Gracjan; Cáceres, Javier F.

    2010-01-01

    RNA chromatography combined with mass spectrometry represents a widely used experimental approach to identify RNA-binding proteins that recognize specific RNA targets. An important drawback of most of these protocols is the high background due to direct or indirect nonspecific binding of cellular proteins to the beads. In many cases this can hamper the detection of individual proteins due to their low levels and/or comigration with contaminating proteins. Increasing the salt concentration during washing steps can reduce background, but at the cost of using less physiological salt concentrations and the likely loss of important RNA-binding proteins that are less stringently bound to a given RNA, as well as the disassembly of protein or ribonucleoprotein complexes. Here, we describe an improved RNA chromatography method that relies on the use of a cocktail of RNases in the elution step. This results in the release of proteins specifically associated with the RNA ligand and almost complete elimination of background noise, allowing a more sensitive and thorough detection of RNA-binding proteins recognizing a specific RNA transcript. PMID:20571124

  16. Molecular evolution of cyclin proteins in animals and fungi

    PubMed Central

    2011-01-01

    Background The passage through the cell cycle is controlled by complexes of cyclins, the regulatory units, with cyclin-dependent kinases, the catalytic units. It is also known that cyclins form several families, which differ considerably in primary structure from one eukaryotic organism to another. Despite these lines of evidence, the relationship between the evolution of cyclins and their function is an open issue. Here we present the results of our study on the molecular evolution of A-, B-, D-, E-type cyclin proteins in animals and fungi. Results We constructed phylogenetic trees for these proteins, their ancestral sequences and analyzed patterns of amino acid replacements. The analysis of infrequently fixed atypical amino acid replacements in cyclins evidenced that accelerated evolution proceeded predominantly during paralog duplication or after it in animals and fungi and that it was related to aromorphic changes in animals. It was shown also that evolutionary flexibility of cyclin function may be provided by consequential reorganization of regions on protein surface remote from CDK binding sites in animal and fungal cyclins and by functional differentiation of paralogous cyclins formed in animal evolution. Conclusions The results suggested that changes in the number and/or nature of cyclin-binding proteins may underlie the evolutionary role of the alterations in the molecular structure of cyclins and their involvement in diverse molecular-genetic events. PMID:21798004

  17. Characterization of rubella-specific humoral immunity following two doses of MMR vaccine using proteome microarray technology

    PubMed Central

    Haralambieva, Iana H.; Gibson, Michael J.; Kennedy, Richard B.; Ovsyannikova, Inna G.; Warner, Nathaniel D.; Grill, Diane E.

    2017-01-01

    Introduction/Background The lack of standardization of the currently used commercial anti-rubella IgG antibody assays leads to frequent misinterpretation of results for samples with low/equivocal antibody concentration. The use of alternative approaches in rubella serology could add new information leading to a fuller understanding of rubella protective immunity and neutralizing antibody response after vaccination. Methods We applied microarray technology to measure antibodies to all rubella virus proteins in 75 high and 75 low rubella virus-specific antibody responders after two MMR vaccine doses. These data were used in multivariate penalized logistic regression modeling of rubella-specific neutralizing antibody response after vaccination. Results We measured antibodies to all rubella virus structural proteins (i.e., the glycoproteins E1 and E2 and the capsid C protein) and to the non-structural protein P150. Antibody levels to each of these proteins correlated with the neutralizing antibody titer (p<0.006), differed between the high and the low antibody responder groups (p<0.008), and were components of the model associated with/predictive of vaccine-induced rubella virus-specific neutralizing antibody titers (misclassification error = 0.2). Conclusion Our study supports the use of this new technology, as well as the use of antibody profiles/patterns (rather than single antibody measures) as biomarkers of neutralizing antibody response and correlates of protective immunity in rubella virus serology. PMID:29145521

  18. Characterizing informative sequence descriptors and predicting binding affinities of heterodimeric protein complexes

    PubMed Central

    2015-01-01

    Background Protein-protein interactions (PPIs) are involved in various biological processes, and the underlying mechanism of these interactions plays a crucial role in therapeutics and protein engineering. Most machine learning approaches have been developed for predicting the binding affinity of protein-protein complexes based on structure and functional information. This work aims to predict the binding affinity of heterodimeric protein complexes from sequences only. Results This work proposes a support vector machine (SVM) based binding affinity classifier, called SVM-BAC, to classify heterodimeric protein complexes based on the prediction of their binding affinity. SVM-BAC identified 14 of 580 sequence descriptors (physicochemical, energetic and conformational properties of the 20 amino acids) to classify 216 heterodimeric protein complexes into low and high binding affinity. SVM-BAC yielded a training accuracy, sensitivity, specificity, AUC and test accuracy of 85.80%, 0.89, 0.83, 0.86 and 83.33%, respectively, better than existing machine learning algorithms. The 14 features and support vector regression were further used to estimate the binding affinities (Pkd) of 200 heterodimeric protein complexes. In a jackknife test, prediction performance was a correlation coefficient of 0.34 and a mean absolute error of 1.4. We further analyze three informative physicochemical properties according to their contribution to prediction performance. Results reveal that the following properties are effective in predicting the binding affinity of heterodimeric protein complexes: apparent partition energy based on buried molar fractions, relations between chemical structure and biological activity in principal component analysis IV, and normalized frequency of beta turn. Conclusions The proposed sequence-based prediction method SVM-BAC uses an optimal feature selection method to identify 14 informative features to classify and predict binding affinity of heterodimeric protein complexes. The characterization analysis revealed that the average numbers of beta turns and hydrogen bonds at protein-protein interfaces in high binding affinity complexes are more than those in low binding affinity complexes. PMID:26681483
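
    For readers unfamiliar with the classifier metrics quoted above, the short Python sketch below shows how sensitivity, specificity and accuracy follow from a confusion matrix. The function name and the counts are invented for illustration; they are not the SVM-BAC results, and "positive" is assumed here to mean the high binding affinity class.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard binary-classifier metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # fraction of true positives recovered
        "specificity": tn / (tn + fp),   # fraction of true negatives recovered
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }

# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = classification_metrics(tp=8, fn=2, tn=5, fp=5)
```

    With these made-up counts, sensitivity is 8/10, specificity is 5/10 and accuracy is 13/20, which shows that the three numbers can diverge substantially on the same set of predictions.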

  19. Structure to function prediction of hypothetical protein KPN_00953 (Ycbk) from Klebsiella pneumoniae MGH 78578 highlights possible role in cell wall metabolism

    PubMed Central

    2014-01-01

    Background Klebsiella pneumoniae plays a major role in causing nosocomial infection in immunocompromised patients. Infections caused by the pathogen range from respiratory and urinary tract infections to septicemia and, primarily, pneumonia. As more K. pneumoniae strains are becoming highly resistant to various antibiotics, treatment of this bacterium has been rendered more difficult. This situation, as a consequence, poses a threat to public health. Hence, identification of possible novel drug targets against this opportunistic pathogen needs to be undertaken. In the complete genome sequence of K. pneumoniae MGH 78578, approximately one-fourth of the genome encodes hypothetical proteins (HPs). Due to their low homology and relatedness to other known proteins, HPs may serve as potential, new drug targets. Results Sequence analysis on the HPs of K. pneumoniae MGH 78578 revealed that a particular HP termed KPN_00953 (YcbK) contains an M15_3 peptidases superfamily conserved domain. Some members of this superfamily are metalloproteases which are involved in cell wall metabolism. BLASTP similarity search on KPN_00953 (YcbK) revealed that the majority of the hits were hypothetical proteins, although two of the hits suggested that it may be a lipoprotein or related to the twin-arginine translocation (Tat) pathway important for transport of proteins to the cell membrane and periplasmic space. As lipoproteins and other components of the cell wall are important pathogenic factors, homology modeling of KPN_00953 was attempted to predict the structure and function of this protein. The three-dimensional model of the protein showed that its secondary structure topology and active site are similar to those found among metalloproteases, where two His residues, namely His169 and His209, and an Asp residue, Asp176, in KPN_00953 were found to be Zn-chelating residues. Interestingly, induced expression of the cloned KPN_00953 gene in lipoprotein-deficient E. coli JE5505 resulted in smoother cells with flattened edges. Some cells showed deposits of film-like material under a scanning electron microscope. Conclusions We postulate that KPN_00953 is a Zn metalloprotease and may play a role in bacterial cell wall metabolism. Structural biology studies to understand its structure, function and mechanism of action pose the possibility of utilizing this protein as a new drug target against K. pneumoniae in the future. PMID:24499172

  20. Genetic analyses of bone morphogenetic protein 2, 4 and 7 in congenital combined pituitary hormone deficiency

    PubMed Central

    2013-01-01

    Background The complex process of development of the pituitary gland is regulated by a number of signalling molecules and transcription factors. Mutations in these factors have been identified in rare cases of congenital hypopituitarism but for most subjects with combined pituitary hormone deficiency (CPHD) genetic causes are unknown. Bone morphogenetic proteins (BMPs) affect induction and growth of the pituitary primordium and thus represent plausible candidates for mutational screening of patients with CPHD. Methods We sequenced BMP2, 4 and 7 in 19 subjects with CPHD. For validation purposes, novel genetic variants were genotyped in 1046 healthy subjects. Additionally, potential functional relevance for most promising variants has been assessed by phylogenetic analyses and prediction of effects on protein structure. Results Sequencing revealed two novel variants and confirmed 30 previously known polymorphisms and mutations in BMP2, 4 and 7. Although phylogenetic analyses indicated that these variants map within strongly conserved gene regions, there was no direct support for their impact on protein structure when applying predictive bioinformatics tools. Conclusions A mutation in the BMP4 coding region resulting in an amino acid exchange (p.Arg300Pro) appeared most interesting among the identified variants. Further functional analyses are required to ultimately map the relevance of these novel variants in CPHD. PMID:24289245

  1. Functional and Structural Characterization of FAU Gene/Protein from Marine Sponge Suberites domuncula

    PubMed Central

    Perina, Dragutin; Korolija, Marina; Popović Hadžija, Marijana; Grbeša, Ivana; Belužić, Robert; Imešek, Mirna; Morrow, Christine; Marjanović, Melanija Posavec; Bakran-Petricioli, Tatjana; Mikoč, Andreja; Ćetković, Helena

    2015-01-01

    Finkel-Biskis-Reilly murine sarcoma virus (FBR-MuSV) ubiquitously expressed (FAU) gene is down-regulated in human prostate, breast and ovarian cancers. Moreover, its dysregulation is associated with poor prognosis in breast cancer. Sponges (Porifera) are animals without tissues which branched off first from the common ancestor of all metazoans. A large majority of genes implicated in human cancers have their homologues in the sponge genome. Our study suggests that FAU gene from the sponge Suberites domuncula reflects characteristics of the FAU gene from the metazoan ancestor, which have changed only slightly during the course of animal evolution. We found pro-apoptotic activity of sponge FAU protein. The same as its human homologue, sponge FAU increases apoptosis in human HEK293T cells. This indicates that the biological functions of FAU, usually associated with “higher” metazoans, particularly in cancer etiology, possess a biochemical background established early in metazoan evolution. The ancestor of all animals possibly possessed FAU protein with the structure and function similar to evolutionarily more recent versions of the protein, even before the appearance of true tissues and the origin of tumors and metastasis. It provides an opportunity to use pre-bilaterian animals as a simpler model for studying complex interactions in human cancerogenesis. PMID:26198235

  2. Full genome comparison and characterization of avian H10 viruses with different pathogenicity in Mink (Mustela vison) reveals genetic and functional differences in the non-structural gene

    PubMed Central

    2010-01-01

    Background The unique property of some avian H10 viruses, particularly the ability to cause severe disease in mink without prior adaptation, enabled our study. Coupled with previous experimental data and genetic characterization here we tried to investigate the possible influence of different genes on the virulence of these H10 avian influenza viruses in mink. Results Phylogenetic analysis revealed a close relationship between the viruses studied. Our study also showed that there are no genetic differences in receptor specificity or the cleavability of the haemagglutinin proteins of these viruses regardless of whether they are of low or high pathogenicity in mink. In poly I:C stimulated mink lung cells the NS1 protein of influenza A virus showing high pathogenicity in mink down regulated the type I interferon promoter activity to a greater extent than the NS1 protein of the virus showing low pathogenicity in mink. Conclusions Differences in pathogenicity and virulence in mink between these strains could be related to clear amino acid differences in the non structural 1 (NS1) protein. The NS gene of mink/84 appears to have contributed to the virulence of the virus in mink by helping the virus evade the innate immune responses. PMID:20591155

  3. Floral ontogeny and gene protein localization rules out euanthial interpretation of reproductive units in Lepironia (Cyperaceae, Mapanioideae, Chrysitricheae)

    PubMed Central

    Prychid, C. J.; Bruhl, J. J.

    2013-01-01

    Background and Aims In the sedge subfamily Mapanioideae there are considerable discrepancies between the standard trimerous monocot floral architecture expected and the complex floral and inflorescence morphologies seen. Decades of debate about whether the basic reproductive units are single flowers or pseudanthia have not resolved the question. This paper evaluates current knowledge about Mapaniid reproductive structures and presents an ontogenetic study of the Mapaniid genus Lepironia with the first floral protein expression maps for the family, localizing the products of the APETALA1/FRUITFULL-like (AP1/FUL) MADS-box genes with the aim of shedding light on this conundrum. Methods A range of reproductive developmental stages, from spikelet primordia through to infructescence material, were processed for anatomical and immunohistochemical analyses. Key Results The basic reproductive unit is subtended by a bract and possesses two prophyll-like structures, the first organs to be initiated on the primordium, which grow rapidly, enclosing two whorls of initiating leaf-like structures with intervening stamens and a central gynoecium, formed from an annular primordium. The subtending bract and prophyll-like structures possess very different morphologies from that of the internal leaf-like structures and do not show AP1/FUL-like protein localization, which is otherwise strongly localized in the internal leaf-like structures, stamens and gynoecia. Conclusions Results support the synanthial hypothesis as the evolutionary origin of the reproductive unit. Thus, the basic reproductive unit in Lepironia is an extremely condensed pseudanthium, of staminate flowers surrounding a central terminal pistillate female flower. Early in development the reproductive unit becomes enclosed by a split-prophyll, with the whole structure subtended by a bract. PMID:23723258

  4. Microfluidic Chips for In Situ Crystal X-ray Diffraction and In Situ Dynamic Light Scattering for Serial Crystallography.

    PubMed

    Gicquel, Yannig; Schubert, Robin; Kapis, Svetlana; Bourenkov, Gleb; Schneider, Thomas; Perbandt, Markus; Betzel, Christian; Chapman, Henry N; Heymann, Michael

    2018-04-24

    This protocol describes fabricating microfluidic devices with low X-ray background optimized for goniometer based fixed target serial crystallography. The devices are patterned from epoxy glue using soft lithography and are suitable for in situ X-ray diffraction experiments at room temperature. The sample wells are lidded on both sides with polymeric polyimide foil windows that allow diffraction data collection with low X-ray background. This fabrication method is undemanding and inexpensive. After the sourcing of a SU-8 master wafer, all fabrication can be completed outside of a cleanroom in a typical research lab environment. The chip design and fabrication protocol utilize capillary valving to microfluidically split an aqueous reaction into defined nanoliter sized droplets. This loading mechanism avoids the sample loss from channel dead-volume and can easily be performed manually without using pumps or other equipment for fluid actuation. We describe how isolated nanoliter sized drops of protein solution can be monitored in situ by dynamic light scattering to control protein crystal nucleation and growth. After suitable crystals are grown, complete X-ray diffraction datasets can be collected using goniometer based in situ fixed target serial X-ray crystallography at room temperature. The protocol provides custom scripts to process diffraction datasets using a suite of software tools to solve and refine the protein crystal structure. This approach avoids the artefacts possibly induced during cryo-preservation or manual crystal handling in conventional crystallography experiments. We present and compare three protein structures that were solved using small crystals with dimensions of approximately 10-20 µm grown in chip. By crystallizing and diffracting in situ, handling and hence mechanical disturbances of fragile crystals is minimized. 
The protocol details how to fabricate a custom X-ray transparent microfluidic chip suitable for in situ serial crystallography. As almost every crystal can be used for diffraction data collection, these microfluidic chips are a very efficient crystal delivery method.

  5. Plasma-assisted quadruple-channel optosensing of proteins and cells with Mn-doped ZnS quantum dots.

    PubMed

    Li, Chenghui; Wu, Peng; Hou, Xiandeng

    2016-02-21

    Information extraction from nano-bio-systems is crucial for understanding their inner molecular level interactions and can help in the development of multidimensional/multimodal sensing devices to realize novel or expanded functionalities. The intrinsic fluorescence (IF) of proteins has long been considered as an effective tool for studying protein structures and dynamics, but not for protein recognition analysis partially because it generally contributes to the fluorescence background in bioanalysis. Here we explored the use of IF as the fourth channel optical input for a multidimensional optosensing device, together with the triple-channel optical output of Mn-doped ZnS QDs (fluorescence from ZnS host, phosphorescence from Mn²⁺ dopant, and Rayleigh light scattering from the QDs), to dramatically improve the protein recognition and discrimination resolution. To further increase the cross-reactivity of the multidimensional optosensing device, plasma modification of proteins was explored to enhance the IF difference as well as their interactions with Mn-doped ZnS QDs. Such a sensor device was demonstrated for highly discriminative and precise identification of proteins in human serum and urine samples, and for cancer and normal cells as well.

  6. Dual Function of Novel Pollen Coat (Surface) Proteins: IgE-binding Capacity and Proteolytic Activity Disrupting the Airway Epithelial Barrier

    PubMed Central

    Bashir, Mohamed Elfatih H.; Ward, Jason M.; Cummings, Matthew; Karrar, Eltayeb E.; Root, Michael; Mohamed, Abu Bekr A.; Naclerio, Robert M.; Preuss, Daphne

    2013-01-01

    Background The pollen coat is the first structure of the pollen to encounter the mucosal immune system upon inhalation. Prior characterizations of pollen allergens have focused on water-soluble, cytoplasmic proteins, but have overlooked much of the extracellular pollen coat. Due to washing with organic solvents when prepared, these pollen coat proteins are typically absent from commercial standardized allergenic extracts (i.e., “de-fatted”), and, as a result, their involvement in allergy has not been explored. Methodology/Principal Findings Using a unique approach to search for pollen allergenic proteins residing in the pollen coat, we employed transmission electron microscopy (TEM) to assess the impact of organic solvents on the structural integrity of the pollen coat. TEM results indicated that de-fatting of Cynodon dactylon (Bermuda grass) pollen (BGP) by use of organic solvents altered the structural integrity of the pollen coat. The novel IgE-binding proteins of the BGP coat include a cysteine protease (CP) and endoxylanase (EXY). The full-length cDNA that encodes the novel IgE-reactive CP was cloned from floral RNA. The EXY and CP were purified to homogeneity and tested for IgE reactivity. The CP from the BGP coat increased the permeability of human airway epithelial cells, caused a clear concentration-dependent detachment of cells, and damaged their barrier integrity. Conclusions/Significance Using an immunoproteomics approach, novel allergenic proteins of the BGP coat were identified. These proteins represent a class of novel dual-function proteins residing on the coat of the pollen grain that have IgE-binding capacity and proteolytic activity, which disrupts the integrity of the airway epithelial barrier. The identification of pollen coat allergens might explain the IgE-negative response to available skin-prick-testing proteins in patients who have positive symptoms. 
Further study of the role of these pollen coat proteins in allergic responses is warranted and could potentially lead to the development of improved diagnostic and therapeutic tools. PMID:23308195

  7. Modern bioanalysis of proteins by electrophoretic techniques.

    PubMed

    Krizkova, Sona; Ryvolova, Marketa; Masarik, Michal; Zitka, Ondrej; Adam, Vojtech; Hubalek, Jaromir; Eckschlager, Tomas; Kizek, Rene

    2014-01-01

    In 1957, a cysteine-rich protein able to bind cadmium was isolated from horse kidney and named metallothionein (MT) according to its structural properties. Subsequently, this protein and metallothionein-like proteins have been found in tissues of other animal species, yeasts, fungi and plants. MT is a focus of interest as a potential cancer marker, and its properties, functions, and behavior under various conditions are intensively studied. Our protocol describes the separation of two major mammalian isoforms of MT (MT-1 and MT-2) using capillary electrophoresis (CE) coupled with a UV detector. This protocol enables separation of MT isoforms and the study of their basic behavior, as well as their quantification with a detection limit in units of ng per μL. Sodium borate buffer (20 mM, pH 9.5) was optimized as a background electrolyte, and the separation was carried out in a fused silica capillary with an internal diameter of 75 μm and an electric field intensity of 350 V/cm. The optimal detection wavelength was 254 nm.

  8. Genetic Background is a Key Determinant of Glomerular Extracellular Matrix Composition and Organization.

    PubMed

    Randles, Michael J; Woolf, Adrian S; Huang, Jennifer L; Byron, Adam; Humphries, Jonathan D; Price, Karen L; Kolatsi-Joannou, Maria; Collinson, Sophie; Denny, Thomas; Knight, David; Mironov, Aleksandr; Starborg, Toby; Korstanje, Ron; Humphries, Martin J; Long, David A; Lennon, Rachel

    2015-12-01

    Glomerular disease often features altered histologic patterns of extracellular matrix (ECM). Despite this, the potential complexities of the glomerular ECM in both health and disease are poorly understood. To explore whether genetic background and sex determine glomerular ECM composition, we investigated two mouse strains, FVB and B6, using RNA microarrays of isolated glomeruli combined with proteomic glomerular ECM analyses. These studies, undertaken in healthy young adult animals, revealed unique strain- and sex-dependent glomerular ECM signatures, which correlated with variations in levels of albuminuria and known predisposition to progressive nephropathy. Among the variation, we observed changes in netrin 4, fibroblast growth factor 2, tenascin C, collagen 1, meprin 1-α, and meprin 1-β. Differences in protein abundance were validated by quantitative immunohistochemistry and Western blot analysis, and the collective differences were not explained by mutations in known ECM or glomerular disease genes. Within the distinct signatures, we discovered a core set of structural ECM proteins that form multiple protein-protein interactions and are conserved from mouse to man. Furthermore, we found striking ultrastructural changes in glomerular basement membranes in FVB mice. Pathway analysis of merged transcriptomic and proteomic datasets identified potential ECM regulatory pathways involving inhibition of matrix metalloproteases, liver X receptor/retinoid X receptor, nuclear factor erythroid 2-related factor 2, notch, and cyclin-dependent kinase 5. These pathways may therefore alter ECM and confer susceptibility to disease. Copyright © 2015 by the American Society of Nephrology.

  9. Highly Sensitive Detection of Individual HEAT and ARM Repeats with HHpred and COACH

    PubMed Central

    Kippert, Fred; Gerloff, Dietlind L.

    2009-01-01

    Background HEAT and ARM repeats occur in a large number of eukaryotic proteins. As these repeats are often highly diverged, the prediction of HEAT or ARM domains can be challenging. Except for the most clear-cut cases, identification at the individual repeat level is indispensable, in particular for determining domain boundaries. However, methods using single sequence queries do not have the sensitivity required to deal with more divergent repeats and, when applied to proteins with known structures, in some cases failed to detect a single repeat. Methodology and Principal Findings Testing algorithms which use multiple sequence alignments as queries, we found two of them, HHpred and COACH, to detect HEAT and ARM repeats with greatly enhanced sensitivity. Calibration against experimentally determined structures suggests the use of three score classes with increasing confidence in the prediction, and prediction thresholds for each method. When we applied a new protocol using both HHpred and COACH to these structures, it detected 82% of HEAT repeats and 90% of ARM repeats, with the minimum for a given protein of 57% for HEAT repeats and 60% for ARM repeats. Application to bona fide HEAT and ARM proteins or domains indicated that similar numbers can be expected for the full complement of HEAT/ARM proteins. A systematic screen of the Protein Data Bank for false positive hits revealed their number to be low, in particular for ARM repeats. Double false positive hits for a given protein were rare for HEAT and not at all observed for ARM repeats. In combination with fold prediction and consistency checking (multiple sequence alignments, secondary structure prediction, and position analysis), repeat prediction with the new HHpred/COACH protocol dramatically improves prediction in the twilight zone of fold prediction methods, as well as the delineation of HEAT/ARM domain boundaries. 
Significance A protocol is presented for the identification of individual HEAT or ARM repeats which is straightforward to implement. It provides high sensitivity at a low false positive rate and will therefore greatly enhance the accuracy of predictions of HEAT and ARM domains. PMID:19777061

  10. Distantly related lipocalins share two conserved clusters of hydrophobic residues: use in homology modeling

    PubMed Central

    Adam, Benoit; Charloteaux, Benoit; Beaufays, Jerome; Vanhamme, Luc; Godfroid, Edmond; Brasseur, Robert; Lins, Laurence

    2008-01-01

    Background Lipocalins are widely distributed in nature and are found in bacteria, plants, arthropods and vertebrates. In hematophagous arthropods, they are implicated in the successful accomplishment of the blood meal, interfering with platelet aggregation, blood coagulation and inflammation and in the transmission of disease parasites such as Trypanosoma cruzi and Borrelia burgdorferi. The pairwise sequence identity is low among this family, often below 30%, despite a well conserved tertiary structure. Under the 30% identity threshold, alignment methods do not correctly assign and align proteins. The only safe way to assign a sequence to that family is by experimental determination. However, these procedures are long and costly and cannot always be applied. A way to circumvent the experimental approach is sequence and structure analysis. To further help in that task, the residues implicated in the stabilisation of the lipocalin fold were determined. This was done by analyzing the conserved interactions for ten lipocalins having a maximum pairwise identity of 28% and various functions. Results Analysis of the ten lipocalin structures and sequences showed that two hydrophobic clusters of residues are conserved. One cluster is internal to the barrel, involving all strands and the 3₁₀ helix. The other is external, involving four strands and the helix lying parallel to the barrel surface. These clusters are also present in RaHBP2, an unusual "outlier" lipocalin from the tick Rhipicephalus appendiculatus. This information was used to assess the assignment of LIR2, a protein from Ixodes ricinus, and to build a 3D model that helps to predict function. FTIR data support the lipocalin fold for this protein. Conclusion By sequence and structural analyses, two conserved clusters of interacting hydrophobic residues have been identified in lipocalins.
Since the residues implicated are not conserved for function, they should provide the minimal subset necessary to confer the lipocalin fold. This information has been used to assign LIR2 to lipocalins and to investigate its structure/function relationship. This study could be applied to other protein families with low pairwise similarity, such as the structurally related fatty acid binding proteins or avidins. PMID:18190694
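
The 28% maximum pairwise identity quoted in the record above can be illustrated with a minimal sketch. It assumes the sequences are already aligned to equal length and that identity is counted over columns where neither sequence has a gap; other denominators are in common use, and the abstract does not say which one the authors applied. The toy sequences are hypothetical and are not real lipocalins.

```python
from itertools import combinations


def percent_identity(a, b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: this denominator (ungapped columns) is one convention among
    several; the source abstract does not specify the one it used.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def max_pairwise_identity(aligned):
    """Largest percent identity over all pairs in a {name: sequence} mapping."""
    return max(
        percent_identity(aligned[p], aligned[q])
        for p, q in combinations(aligned, 2)
    )


# Hypothetical toy alignment, for illustration only.
toy_alignment = {
    "seqA": "MKT-LLVA",
    "seqB": "MRTALIVG",
    "seqC": "QKSALLCA",
}

print(round(max_pairwise_identity(toy_alignment), 1))
```

A set of sequences would fall in the low-identity regime described in the abstract when `max_pairwise_identity` returns a value below 30.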

  11. MultiSeq: unifying sequence and structure data for evolutionary analysis

    PubMed Central

    Roberts, Elijah; Eargle, John; Wright, Dan; Luthey-Schulten, Zaida

    2006-01-01

    Background Since the publication of the first draft of the human genome in 2000, bioinformatic data have been accumulating at an overwhelming pace. Currently, more than 3 million sequences and 35 thousand structures of proteins and nucleic acids are available in public databases. Finding correlations in and between these data to answer critical research questions is extremely challenging. This problem needs to be approached from several directions: information science to organize and search the data; information visualization to assist in recognizing correlations; mathematics to formulate statistical inferences; and biology to analyze chemical and physical properties in terms of sequence and structure changes. Results Here we present MultiSeq, a unified bioinformatics analysis environment that allows one to organize, display, align and analyze both sequence and structure data for proteins and nucleic acids. While special emphasis is placed on analyzing the data within the framework of evolutionary biology, the environment is also flexible enough to accommodate other usage patterns. The evolutionary approach is supported by the use of predefined metadata, adherence to standard ontological mappings, and the ability for the user to adjust these classifications using an electronic notebook. MultiSeq contains a new algorithm to generate complete evolutionary profiles that represent the topology of the molecular phylogenetic tree of a homologous group of distantly related proteins. The method, based on the multidimensional QR factorization of multiple sequence and structure alignments, removes redundancy from the alignments and orders the protein sequences by increasing linear dependence, resulting in the identification of a minimal basis set of sequences that spans the evolutionary space of the homologous group of proteins. 
Conclusion MultiSeq is a major extension of the Multiple Alignment tool that is provided as part of VMD, a structural visualization program for analyzing molecular dynamics simulations. Both are freely distributed by the NIH Resource for Macromolecular Modeling and Bioinformatics and MultiSeq is included with VMD starting with version 1.8.5. The MultiSeq website has details on how to download and use the software: PMID:16914055
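
The idea of ordering sequences by increasing linear dependence can be sketched with an ordinary column-pivoted orthogonalization. This is a simplified, one-dimensional analogue and not the multidimensional QR factorization implemented in MultiSeq; the numeric encoding of each sequence as a vector, the function name, and the tolerance are all assumptions made for illustration.

```python
import math


def pivoted_order(columns, tol=1e-9):
    """Greedy column-pivoted Gram-Schmidt.

    Returns a list of (column_index, residual_norm) pairs. Columns are chosen
    in order of decreasing residual norm, i.e. the most linearly independent
    column first. Columns whose residual norm is ~0 are linear combinations
    of the columns chosen before them.
    """
    residual = [[float(v) for v in col] for col in columns]
    remaining = list(range(len(columns)))
    order = []
    while remaining:
        norms = {j: math.sqrt(sum(v * v for v in residual[j])) for j in remaining}
        best = max(remaining, key=lambda j: norms[j])
        order.append((best, norms[best]))
        remaining.remove(best)
        if norms[best] > tol:
            # Remove the component along the chosen direction from the rest.
            q = [v / norms[best] for v in residual[best]]
            for j in remaining:
                dot = sum(a * b for a, b in zip(q, residual[j]))
                residual[j] = [a - dot * b for a, b in zip(residual[j], q)]
    return order


# Hypothetical numeric encodings of four sequences (one vector per sequence).
encoded = [
    [1, 0, 0],
    [2, 0, 0],  # parallel to the first vector
    [0, 3, 0],
    [1, 1, 0],  # combination of the first and third directions
]

order = pivoted_order(encoded)
basis = [idx for idx, norm in order if norm > 1e-9]
print("selection order:", [idx for idx, _ in order])
print("non-redundant subset:", basis)
```

In this sketch the columns with a residual norm above the tolerance play the role of the non-redundant set; the rest add no new direction to the space already spanned.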

  12. A Systems Biology-Based Approach to Uncovering Molecular Mechanisms Underlying Effects of Traditional Chinese Medicine Qingdai in Chronic Myelogenous Leukemia, Involving Integration of Network Pharmacology and Molecular Docking Technology.

    PubMed

    Zhou, Chao; Liu, LiJuan; Zhuang, Jing; Wei, JunYu; Zhang, TingTing; Gao, ChunDi; Liu, Cun; Li, HuaYao; Si, HongZong; Sun, ChangGang

    2018-06-23

    BACKGROUND The method of multiple targets overall control is increasingly used to predict the main active ingredient and potential target group of Chinese traditional medicines and to determine the mechanisms involved in their curative effects. Qingdai is the main traditional Chinese medicine used in the treatment of chronic myelogenous leukemia (CML), but the complex active ingredients and antitumor targets in treatment of CML have not been clearly defined in previous studies. MATERIAL AND METHODS We constructed a protein-protein interaction network diagram of CML with 638 nodes (proteins) and 1830 edges, based on the biological function of chronic myelocytic leukemia by use of Cytoscape, and we determined 19 key gene nodes in the CML molecule by network topological properties analysis in a data bank. Then, we used the Surflex-dock plugin in SYBYL7.3 docking and acquired the protein crystal structures of key genes involved in CML from the chemical composition of the traditional Chinese medicine Qingdai with key proteins in CML networks. RESULTS According to the score and the spatial structure, the pharmacodynamically active ingredients of Qingdai are Isdirubin, Isoindigo, N-phenyl-2-naphthylamine, and Isatin, among which Isdirubin is the most important. We further screened the most effective activity key protein structures of CML to find the best pharmacodynamically active ingredients of Qingdai, according to the binding interactions of the inhibitors at the catalytic site performed in best docking combinations. CONCLUSIONS The results suggest that Isdirubin plays a role in resistance to CML by altering the expressions of PIK3CA, MYC, JAK2, and TP53 target proteins. Network pharmacology and molecular docking technology can be used to search for possible reactive molecules in traditional Chinese medicines (TCM) and to elucidate their molecular mechanisms.

  13. Effects of storage temperature on airway exosome integrity for diagnostic and functional analyses

    PubMed Central

    Maroto, Rosario; Zhao, Yingxin; Jamaluddin, Mohammad; Popov, Vsevolod L.; Wang, Hongwang; Kalubowilage, Madumali; Zhang, Yueqing; Luisi, Jonathan; Sun, Hong; Culbertson, Christopher T.; Bossmann, Stefan H.; Motamedi, Massoud; Brasier, Allan R.

    2017-01-01

    ABSTRACT Background: Extracellular vesicles contain biological molecules specified by cell-type of origin and modified by microenvironmental changes. To conduct reproducible studies on exosome content and function, storage conditions need to have minimal impact on airway exosome integrity. Aim: We compared surface properties and protein content of airway exosomes that had been freshly isolated vs. those that had been treated with cold storage or freezing. Methods: Mouse bronchoalveolar lavage fluid (BALF) exosomes purified by differential ultracentrifugation were analysed immediately or stored at +4°C or −80°C. Exosomal structure was assessed by dynamic light scattering (DLS), transmission electron microscopy (TEM) and charge density (zeta potential, ζ). Exosomal protein content, including leaking/dissociating proteins, were identified by label-free LC-MS/MS. Results: Freshly isolated BALF exosomes exhibited a mean diameter of 95 nm and characteristic morphology. Storage had significant impact on BALF exosome size and content. Compared to fresh, exosomes stored at +4°C had a 10% increase in diameter, redistribution to polydisperse aggregates and reduced ζ. Storage at −80°C produced an even greater effect, resulting in a 25% increase in diameter, significantly reducing the ζ, resulting in multilamellar structure formation. In fresh exosomes, we identified 1140 high-confidence proteins enriched in 19 genome ontology biological processes. After storage at room temperature, 848 proteins were identified. In preparations stored at +4°C, 224 proteins appeared in the supernatant fraction compared to the wash fractions from freshly prepared exosomes; these proteins represent exosome leakage or dissociation of loosely bound “peri-exosomal” proteins. In preparations stored at −80°C, 194 proteins appeared in the supernatant fraction, suggesting that distinct protein groups leak from exosomes at different storage temperatures. 
Conclusions: Storage destabilizes the surface characteristics, morphological features and protein content of BALF exosomes. For preservation of the exosome protein content and representative functional analysis, airway exosomes should be analysed immediately after isolation. PMID:28819550
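
The reported size changes can be turned into approximate absolute diameters with a short calculation. This sketch assumes the stated percentage increases apply directly to the 95 nm mean diameter of fresh exosomes; the abstract gives only the relative changes, so the resulting values are implied figures and not reported measurements.

```python
FRESH_MEAN_DIAMETER_NM = 95.0  # mean diameter of fresh BALF exosomes (from the abstract)

# Relative diameter increases versus fresh preparations (from the abstract).
REPORTED_INCREASE_PCT = {"plus_4C": 10.0, "minus_80C": 25.0}


def diameter_after_storage(fresh_nm, increase_pct):
    """Diameter implied by a percentage increase over the fresh value."""
    return fresh_nm * (1.0 + increase_pct / 100.0)


implied_diameters = {
    condition: diameter_after_storage(FRESH_MEAN_DIAMETER_NM, pct)
    for condition, pct in REPORTED_INCREASE_PCT.items()
}

for condition, diameter in implied_diameters.items():
    print(f"{condition}: about {diameter:.1f} nm")
```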

  14. Effects of storage temperature on airway exosome integrity for diagnostic and functional analyses.

    PubMed

    Maroto, Rosario; Zhao, Yingxin; Jamaluddin, Mohammad; Popov, Vsevolod L; Wang, Hongwang; Kalubowilage, Madumali; Zhang, Yueqing; Luisi, Jonathan; Sun, Hong; Culbertson, Christopher T; Bossmann, Stefan H; Motamedi, Massoud; Brasier, Allan R

    2017-01-01

    Background : Extracellular vesicles contain biological molecules specified by cell-type of origin and modified by microenvironmental changes. To conduct reproducible studies on exosome content and function, storage conditions need to have minimal impact on airway exosome integrity. Aim : We compared surface properties and protein content of airway exosomes that had been freshly isolated vs. those that had been treated with cold storage or freezing. Methods : Mouse bronchoalveolar lavage fluid (BALF) exosomes purified by differential ultracentrifugation were analysed immediately or stored at +4°C or -80°C. Exosomal structure was assessed by dynamic light scattering (DLS), transmission electron microscopy (TEM) and charge density (zeta potential, ζ). Exosomal protein content, including leaking/dissociating proteins, were identified by label-free LC-MS/MS. Results : Freshly isolated BALF exosomes exhibited a mean diameter of 95 nm and characteristic morphology. Storage had significant impact on BALF exosome size and content. Compared to fresh, exosomes stored at +4°C had a 10% increase in diameter, redistribution to polydisperse aggregates and reduced ζ. Storage at -80°C produced an even greater effect, resulting in a 25% increase in diameter, significantly reducing the ζ, resulting in multilamellar structure formation. In fresh exosomes, we identified 1140 high-confidence proteins enriched in 19 genome ontology biological processes. After storage at room temperature, 848 proteins were identified. In preparations stored at +4°C, 224 proteins appeared in the supernatant fraction compared to the wash fractions from freshly prepared exosomes; these proteins represent exosome leakage or dissociation of loosely bound "peri-exosomal" proteins. In preparations stored at -80°C, 194 proteins appeared in the supernatant fraction, suggesting that distinct protein groups leak from exosomes at different storage temperatures. 
Conclusions : Storage destabilizes the surface characteristics, morphological features and protein content of BALF exosomes. For preservation of the exosome protein content and representative functional analysis, airway exosomes should be analysed immediately after isolation.

  15. Protein-protein docking using region-based 3D Zernike descriptors

    PubMed Central

    2009-01-01

    Background Protein-protein interactions are a pivotal component of many biological processes and mediate a variety of functions. Knowing the tertiary structure of a protein complex is therefore essential for understanding the interaction mechanism. However, experimental techniques to solve the structure of the complex are often found to be difficult. To this end, computational protein-protein docking approaches can provide a useful alternative to address this issue. Prediction of docking conformations relies on methods that effectively capture shape features of the participating proteins while giving due consideration to conformational changes that may occur. Results We present a novel protein docking algorithm based on the use of 3D Zernike descriptors as regional features of molecular shape. The key motivation of using these descriptors is their invariance to transformation, in addition to a compact representation of local surface shape characteristics. Docking decoys are generated using geometric hashing, which are then ranked by a scoring function that incorporates a buried surface area and a novel geometric complementarity term based on normals associated with the 3D Zernike shape description. Our docking algorithm was tested on both bound and unbound cases in the ZDOCK benchmark 2.0 dataset. In 74% of the bound docking predictions, our method was able to find a near-native solution (interface Cα RMSD ≤ 2.5 Å) within the top 1000 ranks. For unbound docking, among the 60 complexes for which our algorithm returned at least one hit, 60% of the cases were ranked within the top 2000. Comparison with existing shape-based docking algorithms shows that our method has a better performance than the others in unbound docking while remaining competitive for bound docking cases. Conclusion We show for the first time that the 3D Zernike descriptors are adept in capturing shape complementarity at the protein-protein interface and useful for protein docking prediction.
Rigorous benchmark studies show that our docking approach has a superior performance compared to existing methods. PMID:20003235
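
The near-native criterion quoted in the record above (interface Cα RMSD of at most 2.5 Å) can be sketched as a simple check. This example assumes the predicted and native interface Cα atoms are already matched one to one and expressed in the same reference frame; it performs no superposition, and the abstract does not describe how the authors aligned structures before computing the RMSD. The coordinates are hypothetical.

```python
import math

NEAR_NATIVE_CUTOFF_A = 2.5  # cutoff in angstroms, as quoted in the abstract


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z) points.

    Assumption: both sets are in the same frame; no optimal superposition
    (for example a Kabsch fit) is applied here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def is_near_native(predicted, native, cutoff=NEAR_NATIVE_CUTOFF_A):
    """True when the RMSD to the native interface is within the cutoff."""
    return rmsd(predicted, native) <= cutoff


# Hypothetical interface C-alpha coordinates (angstroms).
native_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
decoy_close = [(x + 1.0, y + 1.0, z) for x, y, z in native_ca]  # shifted by (1, 1, 0)
decoy_far = [(x + 3.0, y, z) for x, y, z in native_ca]          # shifted by (3, 0, 0)

print(is_near_native(decoy_close, native_ca), is_near_native(decoy_far, native_ca))
```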

  16. Identification of ER Proteins Involved in the Functional Organisation of the Early Secretory Pathway in Drosophila Cells by a Targeted RNAi Screen

    PubMed Central

    Kondylis, Vangelis; Tang, Yang; Fuchs, Florian; Boutros, Michael; Rabouille, Catherine

    2011-01-01

    Background In Drosophila, the early secretory apparatus comprises discrete paired Golgi stacks in close proximity to exit sites from the endoplasmic reticulum (tER sites), thus forming tER-Golgi units. Although many components involved in secretion have been identified, the structural components sustaining its organisation are less known. Here we set out to identify novel ER resident proteins involved in tER-Golgi unit organisation. Results To do so, we designed a novel screening strategy combining a bioinformatics pre-selection with an RNAi screen. We first selected 156 proteins exhibiting known or related ER retention/retrieval signals from a list of proteins predicted to have a signal sequence. We then performed a microscopy-based primary and confirmation RNAi screen in Drosophila S2 cells directly scoring the organisation of the tER-Golgi units. We identified 49 hits, most of which led to an increased number of smaller tER-Golgi units (MG for “more and smaller Golgi”) upon depletion. 16 of them were validated and characterised, showing that this phenotype was not due to an inhibition in secretion, a block in G2, or ER stress. Interestingly, the MG phenotype was often accompanied by an increase in the cell volume. Out of 6 proteins, 4 were localised to the ER. Conclusions This work has identified novel proteins involved in the organisation of the Drosophila early secretory pathway. It contributes to the effort of assigning protein functions to gene annotation in the secretory pathway, and analysis of the MG hits revealed an enrichment of ER proteins. These results suggest a link between ER localisation, aspects of cell metabolism and tER-Golgi structural organisation. PMID:21383842

  17. Sweetness determinant sites of brazzein, a small, heat-stable, sweet-tasting protein.

    PubMed

    Assadi-Porter, F M; Aceti, D J; Markley, J L

    2000-04-15

    Brazzein, originally isolated from the fruit of the African plant Pentadiplandra brazzeana Baillon, is the smallest, most heat-stable and pH-stable member of the set of proteins known to have intrinsic sweetness. These properties make brazzein an ideal system for investigating the chemical and structural requirements of a sweet-tasting protein. We have used the three-dimensional structure of the protein (J. E. Caldwell et al. (1998) Nat. Struct. Biol. 5, 427-431) as a guide in designing 15 synthetic genes in expression constructs aimed at delineating the sweetness determinants of brazzein. Protein was produced heterologously in Escherichia coli, isolated, and purified as described in the companion paper (Assadi-Porter, F. M., Aceti, D., Cheng, H., and Markley, J. L., this issue). Analysis by one-dimensional ¹H NMR spectroscopy indicated that all but one of these variants had folded properly under the conditions used. A taste panel compared the gustatory properties of solutions of these proteins to those of sucrose and brazzein isolated from fruit. Of the 14 mutations in the des-pGlu1-brazzein background, four exhibited almost no sweetness, six had significantly reduced sweetness, two had taste properties equivalent to des-pGlu1-brazzein (two times as sweet as the major form of brazzein isolated from fruit which contains pGlu1), and two were about twice as sweet as des-pGlu1-brazzein. Overall, the results suggest that two regions of the protein are critical for the sweetness of brazzein: a region that includes the N- and C-termini of the protein, which are located close to one another, and a region that includes the flexible loop around Arg43. Copyright 2000 Academic Press.

  18. Conditional protein splicing: a new tool to control protein structure and function in vitro and in vivo.

    PubMed

    Mootz, Henning D; Blum, Elyse S; Tyszkiewicz, Amy B; Muir, Tom W

    2003-09-03

    Protein splicing is a naturally occurring process in which an intervening intein domain excises itself out of a precursor polypeptide in an autocatalytic fashion with concomitant linkage of the two flanking extein sequences by a native peptide bond. We have recently reported an engineered split VMA intein whose splicing activity in trans between two polypeptides can be triggered by the small molecule rapamycin. In this report, we show that this conditional protein splicing (CPS) system can be used in mammalian cells. Two model constructs harboring maltose-binding protein (MBP) and a His-tag as exteins were expressed from a constitutive promoter after transient transfection. The splicing product MBP-His was detected by Western blotting and immunoprecipitation in cells treated with rapamycin or a nontoxic analogue thereof. No background splicing in the absence of the small-molecule inducer was observed over a 24-h time course. Product formation could be detected within 10 min of addition of rapamycin, indicating the advantage of the posttranslational nature of CPS for quick responses. The level of protein splicing was dose dependent and could be competitively attenuated with the small molecule ascomycin. In related studies, the geometric flexibility of the CPS components was investigated with a series of purified proteins. The FKBP and FRB domains, which are dimerized by rapamycin and thereby induce the reconstitution of the split intein, were fused to the extein sequences of the split intein halves. CPS was still triggered by rapamycin when FKBP and FRB occupied one or both of the extein positions. This finding suggests yet further applications of CPS in the area of proteomics. In summary, CPS holds great promise to become a powerful new tool to control protein structure and function in vitro and in living cells.

  19. Structural interactions between retroviral Gag proteins examined by cysteine cross-linking.

    PubMed Central

    Hansen, M S; Barklis, E

    1995-01-01

    We have examined structural interactions between Gag proteins within Moloney murine leukemia virus (M-MuLV) particles by making use of the cysteine-specific cross-linking agents iodine and bis-maleimido hexane. Virion-associated wild-type M-MuLV Pr65Gag proteins in immature particles were intermolecularly cross-linked at cysteines to form Pr65Gag oligomers, from dimers to pentamers or hexamers. Following a systematic approach of cysteine-to-serine mutagenesis, we have shown that cross-linking of Pr65Gag occurred at cysteines of the nucleocapsid (NC) Cys-His motif, suggesting that the Cys-His motifs within virus particles are packed in close proximity. The M-MuLV Pr65Gag protein did not cross-link to the human immunodeficiency virus Pr55Gag protein when the two molecules were coexpressed, indicating either that they did not coassemble or that heterologous Gag proteins were not in close enough proximity to be cross-linked. Using an assembly-competent, protease-minus, cysteine-minus Pr65Gag protein as a template, novel cysteine residues were generated in the M-MuLV capsid domain major homology region (MHR). Cross-linking of proteins containing MHR cysteines showed above-background levels of Gag-Gag dimers but also identified a novel cellular factor, present in virions, that cross-linked to MHR residues. Although the NC cysteine mutation was compatible with M-MuLV particle assembly, deletions of the NC domain were not tolerated. These results suggest that the Cys-His motif is held in close proximity within immature M-MuLV particles by interactions between CA domains and/or non-Cys-His motif domains of the NC. PMID:7815493

  20. Structure of human POFUT1, its requirement in ligand-independent oncogenic Notch signaling, and functional effects of Dowling-Degos mutations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McMillan, Brian J.; Zimmerman, Brandon; Egan, Emily D.

    Protein O-fucosyltransferase-1 (POFUT1), which transfers fucose residues to acceptor sites on serine and threonine residues of epidermal growth factor-like repeats of recipient proteins, is essential for Notch signal transduction in mammals. Here, we examine the consequences of POFUT1 loss on the oncogenic signaling associated with certain leukemia-associated mutations of human Notch1, report the structures of human POFUT1 in free and GDP-fucose bound states, and assess the effects of Dowling-Degos mutations on human POFUT1 function. CRISPR-mediated knockout of POFUT1 in U2OS cells suppresses both normal Notch1 signaling, and the ligand-independent signaling associated with leukemogenic mutations of Notch1. Normal and oncogenic signaling are rescued by wild-type POFUT1 but rescue is impaired by an active-site R240A mutation. The overall structure of the human enzyme closely resembles that of the Caenorhabditis elegans protein, with an overall backbone RMSD of 0.93 Å, despite primary sequence identity of only 39% in the mature protein. GDP-fucose binding to the human enzyme induces limited backbone conformational movement, though the side chains of R43 and D244 reorient to make direct contact with the fucose moiety in the complex. The reported Dowling-Degos mutations of POFUT1, except for M262T, fail to rescue Notch1 signaling efficiently in the CRISPR-engineered POFUT1 -/- background. Together, these studies identify POFUT1 as a potential target for cancers driven by Notch1 mutations and provide a structural roadmap for its inhibition.
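
    The backbone RMSD quoted in this record is a root-mean-square deviation over matched atom positions. The following is a minimal sketch of that formula only, assuming the two structures have already been superposed and their atoms paired; the coordinates are made up for illustration and are not taken from the POFUT1 structures, and the function name `rmsd` is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched, pre-superposed coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between one pair of matched atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Made-up coordinates (in Å) for three matched backbone atoms; illustration only.
model_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
model_b = [(0.0, 1.0, 0.0), (1.5, 1.0, 0.0), (3.0, 1.0, 0.0)]
print(rmsd(model_a, model_b))
```

    A real comparison would first find the optimal superposition (for example with the Kabsch algorithm) before applying this formula; that step is omitted here.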

  1. Adaptive expansion of the maize maternally expressed gene (Meg) family involves changes in expression patterns and protein secondary structures of its members

    PubMed Central

    2014-01-01

    Background The Maternally expressed gene (Meg) family is a locally-duplicated gene family of maize which encodes cysteine-rich proteins (CRPs). The founding member of the family, Meg1, is required for normal development of the basal endosperm transfer cell layer (BETL) and is involved in the allocation of maternal nutrients to growing seeds. Despite the important roles of Meg1 in maize seed development, the evolutionary history of the Meg cluster and the activities of the duplicate genes are not understood. Results In maize, the Meg gene cluster resides in a 2.3 Mb-long genomic region that exhibits many features of non-centromeric heterochromatin. Using phylogenetic reconstruction and syntenic alignments, we identified the pedigree of the Meg family, in which 11 of its 13 members arose in maize after allotetraploidization ~4.8 mya. Phylogenetic and population-genetic analyses identified possible signatures suggesting recent positive selection in Meg homologs. Structural analyses of the Meg proteins indicated potentially adaptive changes in secondary structure from α-helix to β-strand during the expansion. Transcriptomic analysis of the maize endosperm indicated that 6 Meg genes are selectively activated in the BETL, and younger Meg genes are more active than older ones. In endosperms from B73 by Mo17 reciprocal crosses, most Meg genes did not display parent-specific expression patterns. Conclusions Recently-duplicated Meg genes have different protein secondary structures, and their expressions in the BETL dominate over those of older members. Together with the signs of positive selections in the young Meg genes, these results suggest that the expansion of the Meg family involves potentially adaptive transitions in which new members with novel functions prevailed over older members. PMID:25084677

  2. Computational Modeling-Based Discovery of Novel Classes of Anti-Inflammatory Drugs That Target Lanthionine Synthetase C-Like Protein 2

    PubMed Central

    Lu, Pinyi; Hontecillas, Raquel; Horne, William T.; Carbo, Adria; Viladomiu, Monica; Pedragosa, Mireia; Bevan, David R.; Lewis, Stephanie N.; Bassaganya-Riera, Josep

    2012-01-01

    Background Lanthionine synthetase component C-like protein 2 (LANCL2) is a member of the eukaryotic lanthionine synthetase component C-Like protein family involved in signal transduction and insulin sensitization. LANCL2 was recently identified as a target for the binding and signaling of abscisic acid (ABA), a plant hormone with anti-diabetic and anti-inflammatory effects. Methodology/Principal Findings The goal of this study was to determine the role of LANCL2 as a potential therapeutic target for developing novel drugs and nutraceuticals against inflammatory diseases. Previously, we performed homology modeling to construct a three-dimensional structure of LANCL2 using the crystal structure of lanthionine synthetase component C-like protein 1 (LANCL1) as a template. Using this model, structure-based virtual screening was performed using compounds from NCI (National Cancer Institute) Diversity Set II, ChemBridge, ZINC natural products, and FDA-approved drugs databases. Several potential ligands were identified using molecular docking. In order to validate the anti-inflammatory efficacy of the top ranked compound (NSC61610) in the NCI Diversity Set II, a series of in vitro and pre-clinical efficacy studies were performed using a mouse model of dextran sodium sulfate (DSS)-induced colitis. Our findings showed that the lead compound, NSC61610, activated peroxisome proliferator-activated receptor gamma in a LANCL2- and adenylate cyclase/cAMP dependent manner in vitro and ameliorated experimental colitis by down-modulating colonic inflammatory gene expression and favoring regulatory T cell responses. Conclusions/Significance LANCL2 is a novel therapeutic target for inflammatory diseases. High-throughput, structure-based virtual screening is an effective computational-based drug design method for discovering anti-inflammatory LANCL2-based drug candidates. PMID:22509338

  3. Molecular dynamics simulation studies and in vitro site directed mutagenesis of avian beta-defensin Apl_AvBD2

    PubMed Central

    2010-01-01

    Background Defensins comprise a group of antimicrobial peptides, widely recognized as important elements of the innate immune system in both animals and plants. Cationicity, rather than the secondary structure, is believed to be the major factor defining the antimicrobial activity of defensins. To test this hypothesis and to improve the activity of the newly identified avian β-defensin Apl_AvBD2 by enhancing the cationicity, we performed in silico site directed mutagenesis, keeping the predicted secondary structure intact. Molecular dynamics (MD) simulation studies were done to predict the activity. Mutant proteins were made by in vitro site directed mutagenesis and recombinant protein expression, and tested for antimicrobial activity to confirm the results obtained in MD simulation analysis. Results MD simulation revealed subtle, but critical, structural variations between the wild type Apl_AvBD2 and the more cationic in silico mutants, which were not detected in the initial structural prediction by homology modelling. The C-terminal cationic 'claw' region, important in antimicrobial activity, which was intact in the wild type, showed changes in shape and orientation in all the mutant peptides. Mutant peptides also showed increased solvent accessible surface area and a greater number of hydrogen bonds with the surrounding water molecules. In functional studies, the Escherichia coli expressed, purified recombinant mutant proteins showed total loss of antimicrobial activity compared to the wild type protein. Conclusion The study revealed that cationicity alone is not the determining factor in the microbicidal activity of antimicrobial peptides. Factors affecting the molecular dynamics such as hydrophobicity, electrostatic interactions and the potential for oligomerization may also play fundamental roles. It points to the usefulness of MD simulation studies in successful engineering of antimicrobial peptides for improved activity and other desirable functions. PMID:20122244
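
    Cationicity, as discussed in this record, is often approximated by a peptide's net charge. The sketch below is a rough illustration under stated assumptions: it counts only lysine/arginine (+1) and aspartate/glutamate (-1), ignores histidine, the termini and pH effects, and uses invented sequences that are not Apl_AvBD2 or its mutants. The names `RESIDUE_CHARGE` and `net_charge` are hypothetical.

```python
# Simplified side-chain charges near neutral pH (an assumption of this sketch).
RESIDUE_CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def net_charge(sequence):
    """Return the approximate net charge of a peptide given as one-letter codes."""
    return sum(RESIDUE_CHARGE.get(residue, 0) for residue in sequence.upper())


# Invented toy sequences; the "mutant" swaps two acidic residues for lysines.
wild_type = "GKRCDE"
mutant = "GKRCKK"
print(net_charge(wild_type), net_charge(mutant))
```

    A charge count like this captures cationicity alone, which the record reports is not sufficient to predict antimicrobial activity.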

  4. Validating clustering of molecular dynamics simulations using polymer models

    PubMed Central

    2011-01-01

    Background Molecular dynamics (MD) simulation is a powerful technique for sampling the meta-stable and transitional conformations of proteins and other biomolecules. Computational data clustering has emerged as a useful, automated technique for extracting conformational states from MD simulation data. Despite extensive application, relatively little work has been done to determine if the clustering algorithms are actually extracting useful information. A primary goal of this paper therefore is to provide such an understanding through a detailed analysis of data clustering applied to a series of increasingly complex biopolymer models. Results We develop a novel series of models using basic polymer theory that have intuitive, clearly-defined dynamics and exhibit the essential properties that we are seeking to identify in MD simulations of real biomolecules. We then apply spectral clustering, an algorithm particularly well-suited for clustering polymer structures, to our models and MD simulations of several intrinsically disordered proteins. Clustering results for the polymer models provide clear evidence that the meta-stable and transitional conformations are detected by the algorithm. The results for the polymer models also help guide the analysis of the disordered protein simulations by comparing and contrasting the statistical properties of the extracted clusters. Conclusions We have developed a framework for validating the performance and utility of clustering algorithms for studying molecular biopolymer simulations that utilizes several analytic and dynamic polymer models which exhibit well-behaved dynamics including: meta-stable states, transition states, helical structures, and stochastic dynamics. We show that spectral clustering is robust to anomalies introduced by structural alignment and that different structural classes of intrinsically disordered proteins can be reliably discriminated from the clustering results. 
    To our knowledge, our framework is the first to utilize model polymers to rigorously test the utility of clustering algorithms for studying biopolymers. PMID:22082218

  5. Reconstruction of the experimentally supported human protein interactome: what can we learn?

    PubMed Central

    2013-01-01

    Background Understanding the topology and dynamics of the human protein-protein interaction (PPI) network will significantly contribute to biomedical research, therefore its systematic reconstruction is required. Several meta-databases integrate source PPI datasets, but the protein node sets of their networks vary depending on the PPI data combined. Due to this inherent heterogeneity, the way in which the human PPI network expands via multiple dataset integration has not been comprehensively analyzed. We aim at assembling the human interactome in a global structured way and exploring it to gain insights of biological relevance. Results First, we defined the UniProtKB manually reviewed human “complete” proteome as the reference protein-node set and then we mined five major source PPI datasets for direct PPIs exclusively between the reference proteins. We updated the protein and publication identifiers and normalized all PPIs to the UniProt identifier level. The reconstructed interactome covers approximately 60% of the human proteome and has a scale-free structure. No apparent differentiating gene functional classification characteristics were identified for the unrepresented proteins. The source dataset integration augments the network mainly in PPIs. Polyubiquitin emerged as the highest-degree node, but the inclusion of most of its identified PPIs may be reconsidered. The high number (>300) of connections of the subsequent fifteen proteins correlates well with their essential biological role. According to the power-law network structure, the unrepresented proteins should mainly have up to four connections with equally poorly-connected interactors. Conclusions Reconstructing the human interactome based on the a priori definition of the protein nodes enabled us to identify the currently included part of the human “complete” proteome, and discuss the role of the proteins within the network topology with respect to their function. 
    As the network expansion has to comply with the scale-free theory, we suggest that the core of the human interactome has essentially emerged. Thus, it could be employed in systems biology and biomedical research, despite the considerable number of currently unrepresented proteins. The latter are probably involved in specialized physiological conditions, justifying the scarcity of related PPI information, and their identification can assist in designing relevant functional experiments and targeted text mining algorithms. PMID:24088582
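
    The node degrees and degree distribution that underlie the scale-free description in this record can be computed directly from a list of pairwise interactions. This is a minimal sketch, assuming an undirected network given as an edge list; the protein identifiers are placeholders, not entries from the reconstructed interactome, and the function names are hypothetical.

```python
from collections import Counter


def node_degrees(edges):
    """Map each node to its number of distinct interaction partners."""
    neighbours = {}
    for a, b in edges:
        if a == b:
            continue  # self-interactions are ignored in this sketch
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return {node: len(partners) for node, partners in neighbours.items()}


def degree_distribution(degrees):
    """Count how many nodes have each degree."""
    return Counter(degrees.values())


# Placeholder edge list; ("P2", "P1") duplicates ("P1", "P2") and is counted once.
edges = [("HUB", "P1"), ("HUB", "P2"), ("HUB", "P3"), ("P1", "P2"), ("P2", "P1")]
degrees = node_degrees(edges)
print(degrees)
print(degree_distribution(degrees))
```

    Storing partners in sets removes duplicate reports of the same interaction, which matters when several source datasets are merged.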

  6. Exploiting protein flexibility to predict the location of allosteric sites

    PubMed Central

    2012-01-01

    Background Allostery is one of the most powerful and common ways of regulation of protein activity. However, for most allosteric proteins identified to date the mechanistic details of allosteric modulation are not yet well understood. Uncovering common mechanistic patterns underlying allostery would allow not only a better academic understanding of the phenomena, but it would also streamline the design of novel therapeutic solutions. This relatively unexplored therapeutic potential and the putative advantages of allosteric drugs over classical active-site inhibitors fuel the attention allosteric-drug research is receiving at present. A first step to harness the regulatory potential and versatility of allosteric sites, in the context of drug-discovery and design, would be to detect or predict their presence and location. In this article, we describe a simple computational approach, based on the effect allosteric ligands exert on protein flexibility upon binding, to predict the existence and position of allosteric sites on a given protein structure. Results By querying the literature and a recently available database of allosteric sites, we gathered 213 allosteric proteins with structural information that we further filtered into a non-redundant set of 91 proteins. We performed normal-mode analysis and observed significant changes in protein flexibility upon allosteric-ligand binding in 70% of the cases. These results agree with the current view that allosteric mechanisms are in many cases governed by changes in protein dynamics caused by ligand binding. Furthermore, we implemented an approach that achieves 65% positive predictive value in identifying allosteric sites within the set of predicted cavities of a protein (stricter parameters set, 0.22 sensitivity), by combining the current analysis on dynamics with previous results on structural conservation of allosteric sites. 
    We also analyzed four biological examples in detail, revealing that this simple coarse-grained methodology is able to capture the effects triggered by allosteric ligands already described in the literature. Conclusions We introduce a simple computational approach to predict the presence and position of allosteric sites in a protein based on the analysis of changes in protein normal modes upon the binding of a coarse-grained ligand at predicted cavities. Its performance has been demonstrated using a newly curated non-redundant set of 91 proteins with reported allosteric properties. The software developed in this work is available upon request from the authors. PMID:23095452
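
    The positive predictive value and sensitivity quoted in this record follow from standard confusion-matrix definitions. The sketch below shows only those definitions; the counts are invented for illustration and are not the study's data, and the function names are hypothetical.

```python
def positive_predictive_value(true_pos, false_pos):
    """Fraction of predicted sites that are real: TP / (TP + FP)."""
    predicted = true_pos + false_pos
    if predicted == 0:
        raise ValueError("no positive predictions were made")
    return true_pos / predicted


def sensitivity(true_pos, false_neg):
    """Fraction of real sites that were predicted: TP / (TP + FN)."""
    actual = true_pos + false_neg
    if actual == 0:
        raise ValueError("no actual positives in the data set")
    return true_pos / actual


# Invented counts: 3 correct predictions, 1 wrong prediction, 9 missed sites.
tp, fp, fn = 3, 1, 9
print(positive_predictive_value(tp, fp), sensitivity(tp, fn))
```

    Stricter parameters typically raise the first quantity at the cost of the second, which is the trade-off the record describes.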

  7. LXtoo: an integrated live Linux distribution for the bioinformatics community

    PubMed Central

    2012-01-01

    Background Recent advances in high-throughput technologies dramatically increase biological data generation. However, many research groups lack computing facilities and specialists. This is an obstacle that remains to be addressed. Here, we present a Linux distribution, LXtoo, to provide a flexible computing platform for bioinformatics analysis. Findings Unlike most of the existing live Linux distributions for bioinformatics, which limit their usage to sequence analysis and protein structure prediction, LXtoo incorporates a comprehensive collection of bioinformatics software, including data mining tools for microarray and proteomics, protein-protein interaction analysis, and computationally complex tasks like molecular dynamics. Moreover, most of the programs have been configured and optimized for high performance computing. Conclusions LXtoo aims to provide a well-supported computing environment tailored for bioinformatics research, reducing duplication of effort in building computing infrastructure. LXtoo is distributed as a Live DVD and freely available at http://bioinformatics.jnu.edu.cn/LXtoo. PMID:22813356

  8. EasyModeller: A graphical interface to MODELLER

    PubMed Central

    2010-01-01

    Background MODELLER is a program for automated protein homology modeling. It is one of the most widely used tools for homology or comparative modeling of protein three-dimensional structures, but most users find it somewhat difficult to get started with MODELLER because it is command-line based and requires knowledge of basic Python scripting to use efficiently. Findings The study aimed to develop the "EasyModeller" tool as a graphical front-end to MODELLER using Perl/Tk, which can be used as a standalone tool on the Windows platform with MODELLER and Python preinstalled. It helps inexperienced users to perform modeling, assessment, visualization, and optimization of protein models in a simple and straightforward way. Conclusion EasyModeller provides a straightforward graphical interface and functions as a stand-alone tool which can be used on a standard personal computer with Microsoft Windows as the operating system. PMID:20712861

  9. Re-visiting protein-centric two-tier classification of existing DNA-protein complexes

    PubMed Central

    2012-01-01

    Background Precise DNA-protein interactions play a vital role in maintaining the normal physiological functioning of the cell, as they control many high-fidelity cellular processes. Detailed study of the nature of these interactions has paved the way for understanding the mechanisms behind the biological processes in which they are involved. In 2000, a systematic classification of DNA-protein complexes based on the structural analysis of the proteins was proposed at two tiers, namely groups and families. With the growth in the number and resolution of structures of DNA-protein complexes deposited in the Protein Data Bank, it is important to revisit the existing classification. Results On the basis of the sequence analysis of DNA binding proteins, we have built upon the protein-centric, two-tier classification of DNA-protein complexes by adding new members to existing families and making new families and groups. While classifying the new complexes, we also observed the emergence of new groups and families. The new group observed was one in which a β-propeller interacts with DNA. There were 34 SCOP folds present in the complexes of both the old and new classifications, whereas 28 folds are present exclusively in the new complexes. Some of the new families were NarL transcription factor, Z-α DNA binding proteins, Forkhead transcription factor, AP2 protein, Methyl CpG binding protein etc. Conclusions Our results suggest that, with the increasing availability of DNA-protein complexes in the Protein Data Bank, the number of families in the classification increased approximately threefold. The folds present exclusively in newly classified complexes suggest the inclusion of proteins with new functions in the new classification, the most populated of which are the folds responsible for DNA damage repair. The proposed re-visited classification can be used to perform genome-wide surveys in the genomes of interest for the presence of DNA-binding proteins. Further analysis of these complexes can aid in developing algorithms for identifying DNA-binding proteins and their family members from mere sequence information. PMID:22800292

  10. Transcriptomic analysis of Arabidopsis developing stems: a close-up on cell wall genes

    PubMed Central

    Minic, Zoran; Jamet, Elisabeth; San-Clemente, Hélène; Pelletier, Sandra; Renou, Jean-Pierre; Rihouey, Christophe; Okinyo, Denis PO; Proux, Caroline; Lerouge, Patrice; Jouanin, Lise

    2009-01-01

    Background Different strategies (genetics, biochemistry, and proteomics) can be used to study proteins involved in cell biogenesis. The availability of the complete sequences of several plant genomes allowed the development of transcriptomic studies. Although the expression patterns of some Arabidopsis thaliana genes involved in cell wall biogenesis were identified at different physiological stages, detailed microarray analysis of plant cell wall genes has not been performed on any plant tissues. Using transcriptomic and bioinformatic tools, we studied the regulation of cell wall genes in Arabidopsis stems, i.e. genes encoding proteins involved in cell wall biogenesis and genes encoding secreted proteins. Results Transcriptomic analyses of stems were performed at three different developmental stages, i.e., young stems, intermediate stage, and mature stems. Many genes involved in the synthesis of cell wall components such as polysaccharides and monolignols were identified. A total of 345 genes encoding predicted secreted proteins with moderate or high level of transcripts were analyzed in detail. The encoded proteins were distributed into 8 classes, based on the presence of predicted functional domains. Proteins acting on carbohydrates and proteins of unknown function constituted the two most abundant classes. Other proteins were proteases, oxido-reductases, proteins with interacting domains, proteins involved in signalling, and structural proteins. Particularly high levels of expression were established for genes encoding pectin methylesterases, germin-like proteins, arabinogalactan proteins, fasciclin-like arabinogalactan proteins, and structural proteins. Finally, the results of these transcriptomic analyses were compared with those obtained through a cell wall proteomic analysis from the same material. Only a small proportion of genes identified by previous proteomic analyses were identified by transcriptomics. Conversely, only a few proteins encoded by genes having moderate or high level of transcripts were identified by proteomics. Conclusion Analysis of the genes predicted to encode cell wall proteins revealed that about 345 genes had moderate or high levels of transcripts. Among them, we identified many new genes possibly involved in cell wall biogenesis. The discrepancies observed between results of this transcriptomic study and a previous proteomic study on the same material revealed post-transcriptional mechanisms of regulation of expression of genes encoding cell wall proteins. PMID:19149885

  11. Structural and Physiological Analyses of the Alkanesulphonate-Binding Protein (SsuA) of the Citrus Pathogen Xanthomonas citri

    PubMed Central

    Tófoli de Araújo, Fabiano; Bolanos-Garcia, Victor M.; Pereira, Cristiane T.; Sanches, Mario; Oshiro, Elisa E.; Ferreira, Rita C. C.; Chigardze, Dimitri Y.; Barbosa, João Alexandre Gonçalves; de Souza Ferreira, Luís Carlos; Benedetti, Celso E.; Blundell, Tom L.; Balan, Andrea

    2013-01-01

    Background The uptake of sulphur-containing compounds plays a pivotal role in the physiology of bacteria that live in aerobic soils where organosulfur compounds such as sulphonates and sulphate esters represent more than 95% of the available sulphur. Until now, no information has been available on the uptake of sulphonates by bacterial plant pathogens, particularly those of the Xanthomonas genus, which encompasses several pathogenic species. In the present study, we characterised the alkanesulphonate uptake system (Ssu) of Xanthomonas axonopodis pv. citri 306 strain (X. citri), the etiological agent of citrus canker. Methodology/Principal Findings A single operon-like gene cluster (ssuEDACB) that encodes both the sulphur uptake system and enzymes involved in desulphurisation was detected in the genomes of X. citri and of the closely related species. We characterised X. citri SsuA protein, a periplasmic alkanesulphonate-binding protein that, together with SsuC and SsuB, defines the alkanesulphonate uptake system. The crystal structure of SsuA bound to MOPS, MES and HEPES, which is herein described for the first time, provides evidence for the importance of a conserved dipole in sulphate group coordination, identifies specific amino acids interacting with the sulphate group and shows the presence of a rather large binding pocket that explains the rather wide range of molecules recognised by the protein. Isolation of an isogenic ssuA-knockout derivative of the X. citri 306 strain showed that disruption of alkanesulphonate uptake affects both xanthan gum production and generation of canker lesions in sweet orange leaves. Conclusions/Significance The present study unravels unique structural and functional features of the X. citri SsuA protein and provides the first experimental evidence that an ABC uptake system affects the virulence of this phytopathogen. PMID:24282519

  12. External-Cavity Quantum Cascade Laser Spectroscopy for Mid-IR Transmission Measurements of Proteins in Aqueous Solution.

    PubMed

    Alcaráz, Mirta R; Schwaighofer, Andreas; Kristament, Christian; Ramer, Georg; Brandstetter, Markus; Goicoechea, Héctor; Lendl, Bernhard

    2015-07-07

    In this work, we report mid-IR transmission measurements of the protein amide I band in aqueous solution at large optical paths. A tunable external-cavity quantum cascade laser (EC-QCL) operated in pulsed mode at room temperature allowed one to apply a path length of up to 38 μm, which is four times larger than that applicable with conventional FT-IR spectrometers. To minimize temperature-induced variations caused by background absorption of the ν2-vibration of water (HOH-bending) overlapping with the amide I region, a highly stable temperature control unit with relative temperature stability within 0.005 °C was developed. An advanced data processing protocol was established to overcome fluctuations in the fine structure of the emission curve that are inherent to the employed EC-QCL due to its mechanical instabilities. To allow for wavenumber accuracy, a spectral calibration method has been elaborated to reference the acquired IR spectra to the absolute positions of the water vapor absorption bands. Employing this setup, characteristic spectral features of five well-studied proteins exhibiting different secondary structures could be measured at concentrations as low as 2.5 mg mL⁻¹. This concentration range could previously only be accessed by IR measurements in D2O. Mathematical evaluation of the spectral overlap and comparison of second derivative spectra confirm excellent agreement of the QCL transmission measurements with protein spectra acquired by FT-IR spectroscopy. This proves the potential of the applied setup to monitor secondary structure changes of proteins in aqueous solution at extended optical path lengths, which allow experiments in flow through configuration.

  13. Structural and Functional Analyses of a Sterol Carrier Protein in Spodoptera litura

    PubMed Central

    Xu, Rui; Zheng, Sichun; He, Hongwu; Wan, Jian; Feng, Qili

    2014-01-01

    Background In insects, cholesterol is one of the membrane components in cells and a precursor of ecdysteroid biosynthesis. Because insects lack two key enzymes, squalene synthase and lanosterol synthase, in the cholesterol biosynthesis pathway, they cannot autonomously synthesize cholesterol de novo from simple compounds and therefore have to obtain sterols from their diet. Sterol carrier protein (SCP) is a cholesterol-binding protein responsible for cholesterol absorption and transport. Results In this study, a model of the three-dimensional structure of SlSCPx-2 in Spodoptera litura, a destructive polyphagous agricultural pest insect in tropical and subtropical areas, was constructed. Docking of sterol and fatty acid ligands to SlSCPx-2 and ANS fluorescent replacement assay showed that SlSCPx-2 was able to bind with relatively high affinities to cholesterol, stearic acid, linoleic acid, stigmasterol, oleic acid, palmitic acid and arachidonate, implying that SlSCPx may play an important role in absorption and transport of these sterols and fatty acids from host plants. Site-directed mutation assay of SlSCPx-2 suggests that amino acid residues F53, W66, F89, F110, I115, T128 and Q131 are critical for the ligand-binding activity of the SlSCPx-2 protein. Virtual ligand screening resulted in identification of several lead compounds which are potential inhibitors of SlSCPx-2. Bioassay for inhibitory effect of five selected compounds showed that AH-487/41731687, AG-664/14117324, AG-205/36813059 and AG-205/07775053 inhibited the growth of S. litura larvae. Conclusions Compounds AH-487/41731687, AG-664/14117324, AG-205/36813059 and AG-205/07775053 selected based on structural modeling showed binding affinity to SlSCPx-2 protein and inhibitory effect on the growth of S. litura larvae. PMID:24454688

  14. Comparative proteomics analysis of oral cancer cell lines: identification of cancer associated proteins

    PubMed Central

    2014-01-01

    Background A limiting factor in performing proteomics analysis on cancerous cells is the difficulty in obtaining sufficient amounts of starting material. Cell lines can be used as a simplified model system for studying changes that accompany tumorigenesis. This study used two-dimensional gel electrophoresis (2DE) to compare the whole cell proteome of oral cancer cell lines vs normal cells in an attempt to identify cancer associated proteins. Results Three primary cell cultures of normal cells with a limited lifespan without hTERT immortalization have been successfully established. 2DE was used to compare the whole cell proteome of these cells with that of three oral cancer cell lines. Twenty four protein spots were found to have changed in abundance. MALDI TOF/TOF was then used to determine the identity of these proteins. Identified proteins were classified into seven functional categories – structural proteins, enzymes, regulatory proteins, chaperones and others. IPA core analysis predicted that 18 proteins were related to cancer with involvements in hyperplasia, metastasis, invasion, growth and tumorigenesis. The mRNA expressions of two proteins – 14-3-3 protein sigma and Stress-induced-phosphoprotein 1 – were found to correlate with the corresponding proteins’ abundance. Conclusions The outcome of this analysis demonstrated that a comparative study of whole cell proteome of cancer versus normal cell lines can be used to identify cancer associated proteins. PMID:24422745

  15. Stereophysicochemical variability plots highlight conserved antigenic areas in Flaviviruses

    PubMed Central

    Schein, Catherine H; Zhou, Bin; Braun, Werner

    2005-01-01

    Background Flaviviruses, which include Dengue (DV) and West Nile (WN), mutate in response to immune system pressure. Identifying escape mutants, variant progeny that replicate in the presence of neutralizing antibodies, is a common way to identify functionally important residues of viral proteins. However, the mutations typically occur at variable positions on the viral surface that are not essential for viral replication. Methods are needed to determine the true targets of the neutralizing antibodies. Results Stereophysicochemical variability plots (SVPs), 3-D images of protein structures colored according to variability, as determined by our PCPMer program, were used to visualize residues conserved in their physical-chemical properties (PCPs) near escape mutant positions. The analysis showed 1) that escape mutations in the flavivirus envelope protein are variable residues by our criteria and 2) that two escape mutants found at the same position in many flaviviruses sit above clusters of conserved residues from different regions of the linear sequence. Conservation patterns in T-cell epitopes in the NS3 protease suggest a similar mechanism of immune system evasion. Conclusion The SVPs add another dimension to structurally defining the binding sites of neutralizing antibodies. They provide a useful aid for determining antigenically important regions and designing vaccines. PMID:15845145

  16. J D Bernal and the genesis of structural biology

    NASA Astrophysics Data System (ADS)

    Caffrey, Martin

    2007-02-01

    I was invited to participate in this Symposium a month or so before the event. At that time, however, I knew little about J D Bernal. I vaguely remembered a brief conversation on the topic over a decade ago with Professor Vittorio Luzzati as we ambled around the gardens at the Palace of Versailles. Vittorio likely knew Bernal through his friend Rosalind Franklin who worked with Bernal at Birkbeck College. But beyond that I knew nothing about the man or his science. And so it was most fortunate that Andrew Brown's book J D Bernal: The Sage of Science appeared in 2005 and I was able to call on it. Indeed, much of the material included in this chapter is based on that source and on Dorothy Hodgkin's biographic memoir of J D Bernal, her postgraduate supervisor. Given that this chapter is to be published in a Physics journal I thought it appropriate to provide some background to the theme of my presentation, structural biology. Accordingly, I will begin with an introduction to proteins, one of structural biology's central characters, and to which Bernal devoted much energy and attention. How the molecular structure of a protein determines its activity and function will then be described. Bernal's major contribution in this area was to X-ray crystallography, the primary method by which a protein's structure is determined. The method, and aspects of its development, will be described. I will also make reference to some of Bernal's additional contributions in related fields. Finally, Vincent Casey, the symposium organizer, asked that I comment on how structural biology might impact on society. I will attempt to address that at the close of my presentation.

  17. Structural Integrity of the Greek Key Motif in βγ-Crystallins Is Vital for Central Eye Lens Transparency

    PubMed Central

    Vendra, Venkata Pulla Rao; Agarwal, Garima; Chandani, Sushil; Talla, Venu; Srinivasan, Narayanaswamy; Balasubramanian, Dorairajan

    2013-01-01

    Background We highlight an unrecognized physiological role for the Greek key motif, an evolutionarily conserved super-secondary structural topology of the βγ-crystallins. These proteins constitute the bulk of the human eye lens, packed at very high concentrations in a compact, globular, short-range order, generating transparency. Congenital cataract (affecting 400,000 newborns yearly worldwide), associated with 54 mutations in βγ-crystallins, occurs in two major phenotypes: nuclear cataract, which blocks the central visual axis, hampering the development of the growing eye and demanding earliest intervention, and the milder peripheral progressive cataract where surgery can wait. In order to understand this phenotypic dichotomy at the molecular level, we have studied the structural and aggregation features of representative mutations. Methods Wild type and several representative mutant proteins were cloned, expressed and purified and their secondary and tertiary structural details, as well as structural stability, were compared in solution, using spectroscopy. Their tendencies to aggregate in vitro and in cellulo were also compared. In addition, we analyzed their structural differences by molecular modeling in silico. Results Based on their properties, mutants are seen to fall into two classes. Mutants A36P, L45PL54P, R140X, and G165fs display lowered solubility and structural stability, expose several buried residues to the surface, aggregate in vitro and in cellulo, and disturb/distort the Greek key motif. And they are associated with nuclear cataract. In contrast, mutants P24T and R77S, associated with peripheral cataract, behave quite similarly to the wild type molecule, and do not affect the Greek key topology. Conclusion When a mutation distorts even one of the four Greek key motifs, the protein readily self-aggregates and precipitates, consistent with the phenotype of nuclear cataract, while mutations not affecting the motif display ‘native state aggregation’, leading to peripheral cataract, thus offering a protein structural rationale for the cataract phenotypic dichotomy “distort motif, lose central vision”. PMID:23936409

  18. An exploration of alternative visualisations of the basic helix-loop-helix protein interaction network

    PubMed Central

    Holden, Brian J; Pinney, John W; Lovell, Simon C; Amoutzias, Grigoris D; Robertson, David L

    2007-01-01

    Background Alternative representations of biochemical networks emphasise different aspects of the data and contribute to the understanding of complex biological systems. In this study we present a variety of automated methods for visualisation of a protein-protein interaction network, using the basic helix-loop-helix (bHLH) family of transcription factors as an example. Results Network representations that arrange nodes (proteins) according to either continuous or discrete information are investigated, revealing the existence of protein sub-families and the retention of interactions following gene duplication events. Methods of network visualisation in conjunction with a phylogenetic tree are presented, highlighting the evolutionary relationships between proteins, and clarifying the context of network hubs and interaction clusters. Finally, an optimisation technique is used to create a three-dimensional layout of the phylogenetic tree upon which the protein-protein interactions may be projected. Conclusion We show that by incorporating secondary genomic, functional or phylogenetic information into network visualisation, it is possible to move beyond simple layout algorithms based on network topology towards more biologically meaningful representations. These new visualisations can give structure to complex networks and will greatly help in interpreting their evolutionary origins and functional implications. Three open source software packages (InterView, TVi and OptiMage) implementing our methods are available. PMID:17683601

  19. How far in-silico computing meets real experiments. A study on the structure and dynamics of spin labeled vinculin tail protein by molecular dynamics simulations and EPR spectroscopy

    PubMed Central

    2013-01-01

    Background Investigation of conformational changes in a protein is a prerequisite to understand its biological function. To explore these conformational changes in proteins we developed a strategy with the combination of molecular dynamics (MD) simulations and electron paramagnetic resonance (EPR) spectroscopy. The major goal of this work is to investigate how far computer simulations can meet the experiments. Methods Vinculin tail protein is chosen as a model system as conformational changes within the vinculin protein are believed to be important for its biological function at the sites of cell adhesion. MD simulations were performed on vinculin tail protein both in water and in vacuo environments. Experimental EPR data are compared with the simulated data for the corresponding spin label positions. Results The calculated EPR spectra from MD simulation trajectories of selected spin labelled positions are comparable to experimental EPR spectra. The results show that the information contained in the spin label mobility provides a powerful means of mapping protein folds and their conformational changes. Conclusions The results suggest the localization of dynamic and flexible regions of the vinculin tail protein. This study shows MD simulations can be used as a complementary tool to interpret experimental EPR data. PMID:23445506

  20. Discovering protein complexes in protein interaction networks via exploring the weak ties effect

    PubMed Central

    2012-01-01

    Background Studying protein complexes is very important in biological processes since it helps reveal the structure-functionality relationships in biological networks and much attention has been paid to accurately predict protein complexes from the increasing amount of protein-protein interaction (PPI) data. Most of the available algorithms are based on the assumption that dense subgraphs correspond to complexes, failing to take into account the inherent organization within protein complexes and the roles of edges. Thus, there is a critical need to investigate the possibility of discovering protein complexes using the topological information hidden in edges. Results To provide an investigation of the roles of edges in PPI networks, we show that the edges connecting less similar vertices in topology are more significant in maintaining the global connectivity, indicating the weak ties phenomenon in PPI networks. We further demonstrate that there is a negative relation between the weak tie strength and the topological similarity. By using the bridges, a reliable virtual network is constructed, in which each maximal clique corresponds to the core of a complex. By this notion, the detection of the protein complexes is transformed into a classic all-clique problem. A novel core-attachment based method is developed, which detects the cores and attachments, respectively. A comprehensive comparison among the existing algorithms and our algorithm has been made by comparing the predicted complexes against benchmark complexes. Conclusions We proved that the weak tie effect exists in the PPI network and demonstrated that the density is insufficient to characterize the topological structure of protein complexes. Furthermore, the experimental results on the yeast PPI network show that the proposed method outperforms the state-of-the-art algorithms. The analysis of detected modules by the present algorithm suggests that most of these modules are biologically significant in the context of complexes, suggesting that the roles of edges are critical in discovering protein complexes. PMID:23046740
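
    The two graph ideas named in this abstract, topological similarity of the vertices joined by an edge and enumeration of maximal cliques, can be sketched as follows. This is a minimal illustration and not the authors' algorithm: the Jaccard index over closed neighborhoods is an assumed similarity measure, the clique search is the textbook Bron-Kerbosch recursion, and the six-protein network and all function names are hypothetical.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def edge_similarity(adj, u, v):
    """Jaccard index of the closed neighbourhoods of u and v (assumed measure).

    A low value means the edge joins topologically dissimilar vertices,
    which is the sense of a "weak tie" used in this sketch.
    """
    nu = adj[u] | {u}
    nv = adj[v] | {v}
    return len(nu & nv) / len(nu | nv)


def maximal_cliques(adj):
    """Enumerate all maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already-processed vertices
        if not p and not x:
            cliques.append(sorted(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)


# Hypothetical toy network: two triangles joined by a single bridging edge C-D.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"),
             ("D", "E"), ("D", "F"), ("E", "F"),
             ("C", "D")]
toy_adj = build_adjacency(toy_edges)
```

    In this toy network the bridging edge C-D scores lower than the edges inside either triangle, and the two triangles appear as separate maximal cliques, which is the qualitative behavior the abstract describes.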

  1. Predicting domain-domain interaction based on domain profiles with feature selection and support vector machines

    PubMed Central

    2010-01-01

    Background Protein-protein interaction (PPI) plays essential roles in cellular functions. The cost, time and other limitations associated with the current experimental methods have motivated the development of computational methods for predicting PPIs. As protein interactions generally occur via domains instead of the whole molecules, predicting domain-domain interaction (DDI) is an important step toward PPI prediction. Computational methods developed so far have utilized information from various sources at different levels, from primary sequences, to molecular structures, to evolutionary profiles. Results In this paper, we propose a computational method to predict DDI using support vector machines (SVMs), based on domains represented as interaction profile hidden Markov models (ipHMM) where interacting residues in domains are explicitly modeled according to the three dimensional structural information available at the Protein Data Bank (PDB). Features about the domains are extracted first as the Fisher scores derived from the ipHMM and then selected using singular value decomposition (SVD). Domain pairs are represented by concatenating their selected feature vectors, and classified by a support vector machine trained on these feature vectors. The method is tested by leave-one-out cross validation experiments with a set of interacting protein pairs adopted from the 3DID database. The prediction accuracy has shown significant improvement as compared to InterPreTS (Interaction Prediction through Tertiary Structure), an existing method for PPI prediction that also uses the sequences and complexes of known 3D structure. Conclusions We show that domain-domain interaction prediction can be significantly enhanced by exploiting information inherent in the domain profiles via feature selection based on Fisher scores, singular value decomposition and supervised learning based on support vector machines. Datasets and source code are freely available on the web at http://liao.cis.udel.edu/pub/svdsvm. Implemented in Matlab and supported on Linux and MS Windows. PMID:21034480

  2. The protein interaction map of bacteriophage lambda

    PubMed Central

    2011-01-01

    Background Bacteriophage lambda is a model phage for most other dsDNA phages and has been studied for over 60 years. Although it is probably the best-characterized phage there are still about 20 poorly understood open reading frames in its 48-kb genome. For a complete understanding we need to know all interactions among its proteins. We have manually curated the lambda literature and compiled a total of 33 interactions that have been found among lambda proteins. We set out to find out how many protein-protein interactions remain to be found in this phage. Results In order to map lambda's interactions, we have cloned 68 out of 73 lambda open reading frames (the "ORFeome") into Gateway vectors and systematically tested all proteins for interactions using exhaustive array-based yeast two-hybrid screens. These screens identified 97 interactions. We found 16 out of 30 previously published interactions (53%). We have also found at least 18 new plausible interactions among functionally related proteins. All previously found and new interactions are combined into structural and network models of phage lambda. Conclusions Phage lambda serves as a benchmark for future studies of protein interactions among phage, viruses in general, or large protein assemblies. We conclude that we could not find all the known interactions because they require chaperones, post-translational modifications, or multiple proteins for their interactions. The lambda protein network connects 12 proteins of unknown function with well characterized proteins, which should shed light on the functional associations of these uncharacterized proteins. PMID:21943085

  3. Giant viruses coexisted with the cellular ancestors and represent a distinct supergroup along with superkingdoms Archaea, Bacteria and Eukarya

    PubMed Central

    2012-01-01

    Background The discovery of giant viruses with genome and physical size comparable to cellular organisms, remnants of protein translation machinery and virus-specific parasites (virophages) have raised intriguing questions about their origin. Evidence advocates for their inclusion into global phylogenomic studies and their consideration as a distinct and ancient form of life. Results Here we reconstruct phylogenies describing the evolution of proteomes and protein domain structures of cellular organisms and double-stranded DNA viruses with medium-to-very-large proteomes (giant viruses). Trees of proteomes define viruses as a ‘fourth supergroup’ along with superkingdoms Archaea, Bacteria, and Eukarya. Trees of domains indicate they have evolved via massive and primordial reductive evolutionary processes. The distribution of domain structures suggests giant viruses harbor a significant number of protein domains including those with no cellular representation. The genomic and structural diversity embedded in the viral proteomes is comparable to the cellular proteomes of organisms with parasitic lifestyles. Since viral domains are widespread among cellular species, we propose that viruses mediate gene transfer between cells and crucially enhance biodiversity. Conclusions Results call for a change in the way viruses are perceived. They likely represent a distinct form of life that either predated or coexisted with the last universal common ancestor (LUCA) and constitute a very crucial part of our planet’s biosphere. PMID:22920653

  4. Domain fusion analysis by applying relational algebra to protein sequence and domain databases

    PubMed Central

    Truong, Kevin; Ikura, Mitsuhiko

    2003-01-01

    Background Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and amalgamated domain databases like InterPro continue to grow in size and quality, a computational method to perform domain fusion analysis that leverages on these efforts will become increasingly powerful. Results This paper proposes a computational method employing relational algebra to find domain fusions in protein sequence databases. The feasibility of this method was illustrated on the SWISS-PROT+TrEMBL sequence database using domain predictions from the Pfam HMM (hidden Markov model) database. We identified 235 and 189 putative functionally linked protein partners in H. sapiens and S. cerevisiae, respectively. From scientific literature, we were able to confirm many of these functional linkages, while the remainder offer testable experimental hypotheses. Results can be viewed at . Conclusion As the analysis can be computed quickly on any relational database that supports standard SQL (structured query language), it can be dynamically updated along with the sequence and domain databases, thereby improving the quality of predictions over time. PMID:12734020
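
    The idea of expressing domain fusion analysis as relational joins can be sketched with standard SQL run through Python's built-in sqlite3 module. This is an illustration under stated assumptions, not the paper's actual schema or queries: the single table protein_domain(protein, organism, domain), the function name and the example rows are all hypothetical. Two proteins of the target organism are reported as putatively linked when each carries one of two domains that occur together in some third, composite protein.

```python
import sqlite3

# Hypothetical query: self-joins of one protein/domain table.
FUSION_QUERY = """
SELECT DISTINCT a.protein, b.protein, c1.protein
FROM protein_domain AS c1
JOIN protein_domain AS c2
  ON c1.protein = c2.protein AND c1.domain < c2.domain  -- composite protein with two domains
JOIN protein_domain AS a ON a.domain = c1.domain         -- carrier of the first domain
JOIN protein_domain AS b ON b.domain = c2.domain         -- carrier of the second domain
WHERE a.organism = ? AND b.organism = ?
  AND a.protein <> b.protein
  AND a.protein <> c1.protein
  AND b.protein <> c1.protein
  -- neither candidate may already contain the other domain
  AND NOT EXISTS (SELECT 1 FROM protein_domain AS x
                  WHERE x.protein = a.protein AND x.domain = c2.domain)
  AND NOT EXISTS (SELECT 1 FROM protein_domain AS y
                  WHERE y.protein = b.protein AND y.domain = c1.domain)
ORDER BY a.protein, b.protein, c1.protein
"""


def find_domain_fusions(rows, target_organism):
    """Return (protein_a, protein_b, composite_protein) triples.

    rows: iterable of (protein, organism, domain) tuples (assumed layout).
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE protein_domain (protein TEXT, organism TEXT, domain TEXT)"
        )
        conn.executemany("INSERT INTO protein_domain VALUES (?, ?, ?)", rows)
        return conn.execute(
            FUSION_QUERY, (target_organism, target_organism)
        ).fetchall()
    finally:
        conn.close()


# Hypothetical example data: compC fuses domains D1 and D2, which occur
# separately in protA and protB of the target organism.
example_rows = [
    ("compC", "organism_1", "D1"),
    ("compC", "organism_1", "D2"),
    ("protA", "organism_2", "D1"),
    ("protB", "organism_2", "D2"),
    ("protZ", "organism_2", "D3"),
]
```

    Calling find_domain_fusions(example_rows, "organism_2") links protA and protB through compC and leaves protZ unpaired. Because the whole analysis is a single declarative query, it can be rerun whenever the underlying tables are updated, which is the property the abstract's conclusion emphasizes.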

  5. Allergic sensitization: screening methods

    PubMed Central

    2014-01-01

    Experimental in silico, in vitro, and rodent models for screening and predicting protein sensitizing potential are discussed, including whether there is evidence of new sensitizations and allergies since the introduction of genetically modified crops in 1996, the importance of linear versus conformational epitopes, and protein families that become allergens. Some common challenges for predicting protein sensitization are addressed: (a) exposure routes; (b) frequency and dose of exposure; (c) dose-response relationships; (d) role of digestion, food processing, and the food matrix; (e) role of infection; (f) role of the gut microbiota; (g) influence of the structure and physicochemical properties of the protein; and (h) the genetic background and physiology of consumers. The consensus view is that sensitization screening models are not yet validated to definitively predict the de novo sensitizing potential of a novel protein. However, they would be extremely useful in the discovery and research phases of understanding the mechanisms of food allergy development, and may prove fruitful to provide information regarding potential allergenicity risk assessment of future products on a case by case basis. These data and findings were presented at a 2012 international symposium in Prague organized by the Protein Allergenicity Technical Committee of the International Life Sciences Institute’s Health and Environmental Sciences Institute. PMID:24739743

  6. Protein expression, characterization and activity comparisons of wild type and mutant DUSP5 proteins

    DOE PAGES

    Nayak, Jaladhi; Gastonguay, Adam J.; Talipov, Marat R.; ...

    2014-12-18

    Background: The mitogen-activated protein kinases (MAPKs) pathway is critical for cellular signaling, and proteins such as phosphatases that regulate this pathway are important for normal tissue development. Based on our previous work on dual specificity phosphatase-5 (DUSP5), and its role in embryonic vascular development and disease, we hypothesized that mutations in DUSP5 will affect its function. Results: In this study, we tested this hypothesis by generating full-length glutathione-S-transferase-tagged DUSP5 and serine 147 proline mutant (S147P) proteins from bacteria. Light scattering analysis, circular dichroism, enzymatic assays and molecular modeling approaches have been performed to extensively characterize the protein form and function. We demonstrate that both proteins are active and, interestingly, the S147P protein is hypoactive as compared to the DUSP5 WT protein in two distinct biochemical substrate assays. Furthermore, due to the novel positioning of the S147P mutation, we utilize computational modeling to reconstruct full-length DUSP5 and S147P to predict a possible mechanism for the reduced activity of S147P. Conclusion: Taken together, this is the first evidence of the generation and characterization of an active, full-length, mutant DUSP5 protein which will facilitate future structure-function and drug development-based studies.

  7. A strategy to measure electrophysiological changes with photoacoustic imaging (Conference Presentation)

    NASA Astrophysics Data System (ADS)

    Sepela, Rebecka J.; Sherlock, Benjamin E.; Tian, Lin; Marcu, Laura; Sack, Jon

    2017-03-01

    Photoacoustic imaging is an emerging technology capable of both functional and structural biological imaging. Absorption and scattering in tissue limit the penetration depth of conventional microscopy techniques to <1mm. Photoacoustic imaging however, can offer high-resolution and contrast at depths of several centimeters. Though functional imaging of endogenous contrast agents, such as hemoglobin, is widely implemented, currently photoacoustic imaging is unable to functionally report electrophysiological changes within cells. We aim to develop photoacoustic contrast agents to fulfill this need. Cells throughout the brain and body create electrical signals using ion channel proteins. These proteins undergo structural changes to regulate the flux of salt ions into the cell. We have recently developed ion channel activity tracers that dissociate from ion channels after the protein changes structure. By conjugating the tracer to dyes that are sensitive to changes in their chemical environment, we can detect tracer dissociation and therefore ion channel activity. We are exploring whether a similar mechanism can create photoacoustic signal intensity changes. To test if the environmental sensitivity of the dye is photoacoustically distinguishable, we imaged the dye in different solvent backgrounds. We report that manipulation of the chemical environment of the contrast dye results in robust changes in photoacoustic properties. We are working to capture photoacoustic signal changes that occur when ion channel proteins activate using live cell imaging. This technology could permit photoacoustic imaging of electrophysiological dynamics in deep tissue, such as the brain. Further optimization of this technology could lead to concurrent imaging of neural activity and hemodynamic responses, a crucial step towards understanding neurovascular coupling in the brain.

  8. High-level expression and deuteration of sperm whale myoglobin: A study of its solvent structure by X-ray and neutron diffraction methods

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shu, F.; Ramakrishnan, V.; Schoenborn, B.P.

    1994-12-31

    Neutron diffraction has become one of the best ways to study light atoms, such as hydrogens. Hydrogen, however, has a negative coherent scattering factor and a large incoherent scattering factor, while deuterium has virtually no incoherent scattering but a large positive coherent scattering factor. Besides causing high background due to its incoherent scattering, the negative coherent scattering of hydrogen tends to cancel out the positive contribution from other atoms in a neutron density map. Therefore a fully deuterated sample will yield better diffraction data with stronger density in the hydrogen position. On this basis, a sperm whale myoglobin gene modified to include part of the A cII protein gene has been cloned into the T7 expression system. Milligram amounts of fully deuterated holo-myoglobin have been obtained and used for crystallization. The synthetic sperm whale myoglobin crystallized in P2₁ space group isomorphous with the native protein crystal. A complete X-ray diffraction dataset at 1.5 Å has been collected. This X-ray dataset, and a neutron data set collected previously on a protonated carbon-monoxymyoglobin crystal have been used for solvent structure studies. Both X-ray and neutron data have shown that there are ordered hydration layers around the protein surface. Solvent shell analysis on the neutron data further has shown that the first hydration layer behaves differently around polar and apolar regions of the protein surface. Finally, the structure of per-deuterated myoglobin has been refined using all reflections to a R factor of 17%.

  9. Structures formed by a cell membrane-associated arabinogalactan-protein on graphite or mica alone and with Yariv phenylglycosides

    PubMed Central

    Zhou, Li Hong; Weizbauer, Renate A.; Singamaneni, Srikanth; Xu, Feng; Genin, Guy M.; Pickard, Barbara G.

    2014-01-01

    Background Certain membrane-associated arabinogalactan-proteins (AGPs) with lysine-rich sub-domains participate in plant growth, development and resistance to stress. To complement fluorescence imaging of such molecules when tagged and introduced transgenically to the cell periphery and to extend the groundwork for assessing molecular structure, some behaviours of surface-spread AGPs were visualized at the nanometre scale in a simplified electrostatic environment. Methods Enhanced green fluorescent protein (EGFP)-labelled LeAGP1 was isolated from Arabidopsis thaliana leaves using antibody-coated magnetic beads, deposited on graphite or mica, and examined with atomic force microscopy (AFM). Key Results When deposited at low concentration on graphite, LeAGP can form independent clusters and rings a few nanometres in diameter, often defining deep pits; the aperture of the rings depends on plating parameters. On mica, intermediate and high concentrations, respectively, yielded lacy meshes and solid sheets that could dynamically evolve arcs, rings, ‘pores’ and ‘co-pores’, and pits. Glucosyl Yariv reagent combined with the AGP to make very large and distinctive rings. Conclusions Diverse cell-specific nano-patterns of native lysine-rich AGPs are expected at the wall–membrane interface and, while there will not be an identical patterning in different environmental settings, AFM imaging suggests protein tendencies for surficial organization and thus opens new avenues for experimentation. Nanopore formation with Yariv reagents suggests how the reagent might bind with AGP to admit Ca2+ to cells and hints at ways in which AGP might be structured at some cell surfaces. PMID:25164699

  10. Applying Recovery Biomarkers to Calibrate Self-Report Measures of Energy and Protein in the Hispanic Community Health Study/Study of Latinos

    PubMed Central

    Mossavar-Rahmani, Yasmin; Shaw, Pamela A.; Wong, William W.; Sotres-Alvarez, Daniela; Gellman, Marc D.; Van Horn, Linda; Stoutenberg, Mark; Daviglus, Martha L.; Wylie-Rosett, Judith; Siega-Riz, Anna Maria; Ou, Fang-Shu; Prentice, Ross L.

    2015-01-01

    We investigated measurement error in the self-reported diets of US Hispanics/Latinos, who are prone to obesity and related comorbidities, by background (Central American, Cuban, Dominican, Mexican, Puerto Rican, and South American) in 2010–2012. In 477 participants aged 18–74 years, doubly labeled water and urinary nitrogen were used as objective recovery biomarkers of energy and protein intakes. Self-report was captured from two 24-hour dietary recalls. All measures were repeated in a subsample of 98 individuals. We examined the bias of dietary recalls and their associations with participant characteristics using generalized estimating equations. Energy intake was underestimated by 25.3% (men, 21.8%; women, 27.3%), and protein intake was underestimated by 18.5% (men, 14.7%; women, 20.7%). Protein density was overestimated by 10.7% (men, 11.3%; women, 10.1%). Higher body mass index and Hispanic/Latino background were associated with underestimation of energy (P < 0.05). For protein intake, higher body mass index, older age, nonsmoking, Spanish speaking, and Hispanic/Latino background were associated with underestimation (P < 0.05). Systematic underreporting of energy and protein intakes and overreporting of protein density were found to vary significantly by Hispanic/Latino background. We developed calibration equations that correct for subject-specific error in reporting that can be used to reduce bias in diet-disease association studies. PMID:25995289
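
    The underestimation percentages quoted above compare self-reported intake with a recovery biomarker. A minimal sketch of one way such a figure can be computed is given here, assuming the bias is summarized as the geometric mean of the ratio of self-report to biomarker; the abstract does not state the exact estimator, so this is an assumption, and the function name and any input values are hypothetical.

```python
import math


def percent_reporting_bias(self_reported, biomarker):
    """Percent bias of self-report relative to a biomarker.

    Uses the geometric mean of the per-person ratio self_report / biomarker
    (an assumed summary). Negative values indicate underestimation.
    """
    if len(self_reported) != len(biomarker) or not self_reported:
        raise ValueError("inputs must be non-empty and of equal length")
    log_ratios = [math.log(s / b) for s, b in zip(self_reported, biomarker)]
    mean_log_ratio = sum(log_ratios) / len(log_ratios)
    return 100.0 * (math.exp(mean_log_ratio) - 1.0)
```

    Under this definition, a participant who reports three quarters of the energy intake measured by the biomarker contributes a bias of -25%, the same order as the energy underestimation reported in the abstract.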

  11. BiFCROS: A Low-Background Fluorescence Repressor Operator System for Labeling of Genomic Loci.

    PubMed

    Milbredt, Sarah; Waldminghaus, Torsten

    2017-06-07

    Fluorescence-based methods are widely used to analyze elementary cell processes such as DNA replication or chromosomal folding and segregation. Labeling DNA with a fluorescent protein allows the visualization of its temporal and spatial organization. One popular approach is FROS (fluorescence repressor operator system). This method specifically labels DNA in vivo through binding of a fusion of a fluorescent protein and a repressor protein to an operator array, which contains numerous copies of the repressor binding site integrated into the genomic site of interest. Bound fluorescent proteins are then visible as foci in microscopic analyses and can be distinguished from the background fluorescence caused by unbound fusion proteins. Even though this method is widely used, no attempt has been made so far to decrease the background fluorescence to facilitate analysis of the actual signal of interest. Here, we present a new method that greatly reduces the background signal of FROS. BiFCROS (Bimolecular Fluorescence Complementation and Repressor Operator System) is based on fusions of repressor proteins to halves of a split fluorescent protein. Binding to a hybrid FROS array results in fluorescence signals due to bimolecular fluorescence complementation. Only proteins bound to the hybrid FROS array fluoresce, greatly improving the signal to noise ratio compared to conventional FROS. We present the development of BiFCROS and discuss its potential to be used as a fast and single-cell readout for copy numbers of genetic loci. Copyright © 2017 Milbredt and Waldminghaus.

  12. BiFCROS: A Low-Background Fluorescence Repressor Operator System for Labeling of Genomic Loci

    PubMed Central

    Milbredt, Sarah; Waldminghaus, Torsten

    2017-01-01

    Fluorescence-based methods are widely used to analyze elementary cell processes such as DNA replication or chromosomal folding and segregation. Labeling DNA with a fluorescent protein allows the visualization of its temporal and spatial organization. One popular approach is FROS (fluorescence repressor operator system). This method specifically labels DNA in vivo through binding of a fusion of a fluorescent protein and a repressor protein to an operator array, which contains numerous copies of the repressor binding site integrated into the genomic site of interest. Bound fluorescent proteins are then visible as foci in microscopic analyses and can be distinguished from the background fluorescence caused by unbound fusion proteins. Even though this method is widely used, no attempt has been made so far to decrease the background fluorescence to facilitate analysis of the actual signal of interest. Here, we present a new method that greatly reduces the background signal of FROS. BiFCROS (Bimolecular Fluorescence Complementation and Repressor Operator System) is based on fusions of repressor proteins to halves of a split fluorescent protein. Binding to a hybrid FROS array results in fluorescence signals due to bimolecular fluorescence complementation. Only proteins bound to the hybrid FROS array fluoresce, greatly improving the signal to noise ratio compared to conventional FROS. We present the development of BiFCROS and discuss its potential to be used as a fast and single-cell readout for copy numbers of genetic loci. PMID:28450375

  13. Protein C-Terminal Labeling and Biotinylation Using Synthetic Peptide and Split-Intein

    PubMed Central

    Volkmann, Gerrit; Liu, Xiang-Qin

    2009-01-01

    Background Site-specific protein labeling or modification can facilitate the characterization of proteins with respect to their structure, folding, and interaction with other proteins. However, current methods of site-specific protein labeling are few and with limitations, therefore new methods are needed to satisfy the increasing need and sophistications of protein labeling. Methodology A method of protein C-terminal labeling was developed using a non-canonical split-intein, through an intein-catalyzed trans-splicing reaction between a protein and a small synthetic peptide carrying the desired labeling groups. As demonstrations of this method, three different proteins were efficiently labeled at their C-termini with two different labels (fluorescein and biotin) either in solution or on a solid surface, and a transferrin receptor protein was labeled on the membrane surface of live mammalian cells. Protein biotinylation and immobilization on a streptavidin-coated surface were also achieved in a cell lysate without prior purification of the target protein. Conclusions We have produced a method of site-specific labeling or modification at the C-termini of recombinant proteins. This method compares favorably with previous protein labeling methods and has several unique advantages. It is expected to have many potential applications in protein engineering and research, which include fluorescent labeling for monitoring protein folding, location, and trafficking in cells, and biotinylation for protein immobilization on streptavidin-coated surfaces including protein microchips. The types of chemical labeling may be limited only by the ability of chemical synthesis to produce the small C-intein peptide containing the desired chemical groups. PMID:20027230

  14. Multifunctionality and diversity of GDSL esterase/lipase gene family in rice (Oryza sativa L. japonica) genome: new insights from bioinformatics analysis

    PubMed Central

    2012-01-01

    Background GDSL esterases/lipases are a newly discovered subclass of lipolytic enzymes that are very important and attractive research subjects because of their multifunctional properties, such as broad substrate specificity and regiospecificity. Compared with the current knowledge regarding these enzymes in bacteria, our understanding of the plant GDSL enzymes is very limited, although the GDSL gene family in plant species include numerous members in many fully sequenced plant genomes. Only two genes from a large rice GDSL esterase/lipase gene family were previously characterised, and the majority of the members remain unknown. In the present study, we describe the rice OsGELP (Oryza sativa GDSL esterase/lipase protein) gene family at the genomic and proteomic levels, and use this knowledge to provide insights into the multifunctionality of the rice OsGELP enzymes. Results In this study, an extensive bioinformatics analysis identified 114 genes in the rice OsGELP gene family. A complete overview of this family in rice is presented, including the chromosome locations, gene structures, phylogeny, and protein motifs. Among the OsGELPs and the plant GDSL esterase/lipase proteins of known functions, 41 motifs were found that represent the core secondary structure elements or appear specifically in different phylogenetic subclades. The specification and distribution of identified putative conserved clade-common and -specific peptide motifs, and their location on the predicted protein three dimensional structure may possibly signify their functional roles. Potentially important regions for substrate specificity are highlighted, in accordance with protein three-dimensional model and location of the phylogenetic specific conserved motifs. The differential expression of some representative genes were confirmed by quantitative real-time PCR. 
The phylogenetic analysis, protein motif architectures, and expression profiling were used together to predict the possible biological functions of the rice OsGELP genes. Conclusions Our current genomic analysis, for the first time, presents fundamental information on the organization of the rice OsGELP gene family. By combining the genomic, phylogenetic, microarray expression, protein motif distribution, and protein structure analyses, we were able to create a supported basis for the functional prediction of many members of the rice GDSL esterase/lipase family. The present study provides a platform for the selection of candidate genes for further detailed functional study. PMID:22793791

  15. Cellular Retinoic Acid Binding Proteins: Genomic and Non-genomic Functions and their Regulation.

    PubMed

    Wei, Li-Na

    Cellular retinoic acid binding proteins (CRABPs) are high-affinity retinoic acid (RA) binding proteins that mainly reside in the cytoplasm. In mammals, this family has two members, CRABPI and II, both highly conserved during evolution. The two proteins share a very similar structure that is characteristic of a "β-clam" motif built up from 10 strands. The proteins are encoded by two different genes that share a very similar genomic structure. CRABPI is widely distributed and CRABPII has restricted expression in only certain tissues. The CrabpI gene is driven by a housekeeping promoter, but can be regulated by numerous factors, including thyroid hormones and RA, which engage a specific chromatin-remodeling complex containing either TRAP220 or RIP140 as coactivator and corepressor, respectively. The chromatin-remodeling complex binds the DR4 element in the CrabpI gene promoter to activate or repress this gene in different cellular backgrounds. The CrabpII gene promoter contains a TATA-box and is rapidly activated by RA through an RA response element. Biochemical and cell culture studies carried out in vitro show the two proteins have distinct biological functions. CRABPII mainly functions to deliver RA to the nuclear RA receptors for gene regulation, although recent studies suggest that CRABPII may also be involved in other cellular events, such as RNA stability. In contrast, biochemical and cell culture studies suggest that CRABPI functions mainly in the cytoplasm to modulate intracellular RA availability/concentration and to engage other signaling components such as ERK activity. However, these functional studies remain inconclusive because knocking out one or both genes in mice does not produce definitive phenotypes. Further studies are needed to unambiguously decipher the exact physiological activities of these two proteins.

  16. Evolution and characterization of a new reversibly photoswitching chromogenic protein, Dathail

    DOE PAGES

    Langan, Patricia S.; Close, Devin W.; Coates, Leighton; ...

    2016-03-18

    In this paper, we report the engineering of a new reversibly switching chromogenic protein, Dathail. Dathail was evolved from the extremely thermostable fluorescent proteins thermal green protein (TGP) and eCGP123 using directed evolution and ratiometric sorting. Dathail has two spectrally distinct chromogenic states with low quantum yields, corresponding to absorbance in a ground state with a maximum at 389 nm, and a photo-induced metastable state with a maximum at 497 nm. In contrast to all previously described photoswitchable proteins, both spectral states of Dathail are non-fluorescent. The photo-induced chromogenic state of Dathail has a lifetime of ~50 min at 293 K and pH 7.5 as measured by UV–Vis spectrophotometry, returning to the ground state through thermal relaxation. X-ray crystallography provided structural insights supporting a change in conformation and coordination in the chromophore pocket as being responsible for Dathail's photoswitching. Neutron crystallography, carried out for the first time on a protein from the green fluorescent protein family, showed a distribution of hydrogen atoms revealing protonation of the chromophore 4-hydroxybenzyl group in the ground state. Additionally, the neutron structure also supports the hypothesis that the photo-induced proton transfer from the chromophore occurs through water-mediated proton relay into the bulk solvent. Beyond its spectroscopic curiosity, Dathail has several characteristics that are improvements for applications, including low background fluorescence, large spectral separation, rapid switching time, and the ability to switch many times. Therefore, Dathail is likely to be extremely useful in the quickly developing fields of imaging and biosensors, including photochromic Förster resonance energy transfer, high-resolution microscopy, and live tracking within the cell.

  17. Fluorescent Applications to Crystallization

    NASA Technical Reports Server (NTRS)

    Pusey, Marc L.; Forsythe, Elizabeth; Achari, Aniruddha

    2006-01-01

    By covalently modifying a subpopulation, less than or equal to 1%, of a macromolecule with a fluorescent probe, the labeled material will add to a growing crystal as a microheterogeneous growth unit. Labeling procedures can be readily incorporated into the final stages of purification, and tests with model proteins have shown that labeling up to 5 percent of the protein molecules does not affect the X-ray data quality obtained. The presence of the trace fluorescent label gives a number of advantages. Since the label is covalently attached to the protein molecules, it "tracks" the protein's response to the crystallization conditions. The covalently attached probe will concentrate in the crystal relative to the solution, and under fluorescent illumination crystals show up as bright objects against a darker background. Non-protein structures, such as salt crystals, do not show up under fluorescent illumination. Crystals have the highest protein concentration and are readily observed against less bright precipitated phases, which under white light illumination may obscure the crystals. Automated image analysis to find crystals should be greatly facilitated, without having to first define crystallization drop boundaries, as the protein or protein structures are all that show up. Fluorescence intensity is a faster search parameter, whether visually or by automated methods, than looking for crystalline features. Preliminary tests, using model proteins, indicate that we can use high fluorescence intensity regions, in the absence of clear crystalline features or "hits", as a means for determining potential lead conditions. A working hypothesis is that more rapid amorphous precipitation kinetics may overwhelm and trap more slowly formed ordered assemblies, which subsequently show up as regions of brighter fluorescence intensity. Experiments are now being carried out to test this approach using a wider range of proteins.
The trace fluorescently labeled crystals will also emit with sufficient intensity to aid in the automation of crystal alignment using relatively low cost optics, further increasing throughput at synchrotrons.

  18. Taxonomic distribution and origins of the extended LHC (light-harvesting complex) antenna protein superfamily

    PubMed Central

    2010-01-01

    Background The extended light-harvesting complex (LHC) protein superfamily is a centerpiece of eukaryotic photosynthesis, comprising the LHC family and several families involved in photoprotection, like the LHC-like and the photosystem II subunit S (PSBS). The evolution of this complex superfamily has long remained elusive, partially due to previously missing families. Results In this study we present a meticulous search for LHC-like sequences in public genome and expressed sequence tag databases covering twelve representative photosynthetic eukaryotes from the three primary lineages of plants (Plantae): glaucophytes, red algae and green plants (Viridiplantae). By introducing a coherent classification of the different protein families based on both, hidden Markov model analyses and structural predictions, numerous new LHC-like sequences were identified and several new families were described, including the red lineage chlorophyll a/b-binding-like protein (RedCAP) family from red algae and diatoms. The test of alternative topologies of sequences of the highly conserved chlorophyll-binding core structure of LHC and PSBS proteins significantly supports the independent origins of LHC and PSBS families via two unrelated internal gene duplication events. This result was confirmed by the application of cluster likelihood mapping. Conclusions The independent evolution of LHC and PSBS families is supported by strong phylogenetic evidence. In addition, a possible origin of LHC and PSBS families from different homologous members of the stress-enhanced protein subfamily, a diverse and anciently paralogous group of two-helix proteins, seems likely. The new hypothesis for the evolution of the extended LHC protein superfamily proposed here is in agreement with the character evolution analysis that incorporates the distribution of families and subfamilies across taxonomic lineages. 
Intriguingly, stress-enhanced proteins, which are universally found in the genomes of green plants, red algae, glaucophytes and in diatoms with complex plastids, could represent an important and previously missing link in the evolution of the extended LHC protein superfamily. PMID:20673336

  19. Genetic Background is a Key Determinant of Glomerular Extracellular Matrix Composition and Organization

    PubMed Central

    Randles, Michael J.; Woolf, Adrian S.; Huang, Jennifer L.; Byron, Adam; Humphries, Jonathan D.; Price, Karen L.; Kolatsi-Joannou, Maria; Collinson, Sophie; Denny, Thomas; Knight, David; Mironov, Aleksandr; Starborg, Toby; Korstanje, Ron; Humphries, Martin J.; Long, David A.

    2015-01-01

    Glomerular disease often features altered histologic patterns of extracellular matrix (ECM). Despite this, the potential complexities of the glomerular ECM in both health and disease are poorly understood. To explore whether genetic background and sex determine glomerular ECM composition, we investigated two mouse strains, FVB and B6, using RNA microarrays of isolated glomeruli combined with proteomic glomerular ECM analyses. These studies, undertaken in healthy young adult animals, revealed unique strain- and sex-dependent glomerular ECM signatures, which correlated with variations in levels of albuminuria and known predisposition to progressive nephropathy. Among the variation, we observed changes in netrin 4, fibroblast growth factor 2, tenascin C, collagen 1, meprin 1-α, and meprin 1-β. Differences in protein abundance were validated by quantitative immunohistochemistry and Western blot analysis, and the collective differences were not explained by mutations in known ECM or glomerular disease genes. Within the distinct signatures, we discovered a core set of structural ECM proteins that form multiple protein–protein interactions and are conserved from mouse to man. Furthermore, we found striking ultrastructural changes in glomerular basement membranes in FVB mice. Pathway analysis of merged transcriptomic and proteomic datasets identified potential ECM regulatory pathways involving inhibition of matrix metalloproteases, liver X receptor/retinoid X receptor, nuclear factor erythroid 2-related factor 2, notch, and cyclin-dependent kinase 5. These pathways may therefore alter ECM and confer susceptibility to disease. PMID:25896609

  20. Genome-wide analysis of putative peroxiredoxin in unicellular and filamentous cyanobacteria

    PubMed Central

    2012-01-01

    Background Cyanobacteria are photoautotrophic prokaryotes with wide variations in genome sizes and ecological habitats. Peroxiredoxin (PRX) is an important protein that plays essential roles in protecting cells against reactive oxygen species (ROS). PRXs have been identified from mammals, fungi and higher plants. However, knowledge of cyanobacterial PRXs remains limited. With the availability of 37 sequenced cyanobacterial genomes, we performed a comprehensive comparative analysis of PRXs and explored their diversity, distribution, domain structure and evolution. Results Overall 244 putative prx genes were identified, which were abundant in filamentous diazotrophic cyanobacteria, Acaryochloris marina MBIC 11017, and unicellular cyanobacteria inhabiting freshwater and hot-springs, while poor in all Prochlorococcus and marine Synechococcus strains. Among these putative genes, 25 open reading frames (ORFs) encoding hypothetical proteins were identified as prx gene family members and the others were already annotated as prx genes. All 244 putative PRXs were classified into five major subfamilies (1-Cys, 2-Cys, BCP, PRX5_like, and PRX-like) according to their domain structures. The catalytic motifs of the cyanobacterial PRXs were similar to those of eukaryotic PRXs and highly conserved in all but the PRX-like subfamily. The classical thioredoxin motif (CXXC) was detected in protein sequences from the PRX-like subfamily. A phylogenetic tree constructed from the catalytic domains coincided well with the domain structures of PRXs and the phylogenies based on 16S rRNA. Conclusions The distribution of genes encoding PRXs in different unicellular and filamentous cyanobacteria, especially for subfamilies such as PRX-like or 1-Cys PRX, correlates with the genome size, eco-physiology, and physiological properties of the organisms. Cyanobacterial and eukaryotic PRXs share similar conserved motifs, indicating that cyanobacteria adopt similar catalytic mechanisms as eukaryotes. 
All cyanobacterial PRX proteins share highly similar structures, implying that these genes may originate from a common ancestor. In this study, a general framework of the sequence-structure-function connections of the PRXs was revealed, which may facilitate functional investigations of PRXs in various organisms. PMID:23157370
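
    The CXXC motif mentioned in this record (two cysteines separated by any two residues) can be located in a protein sequence with a simple pattern scan. The following is only a minimal sketch, not the pipeline used in the study: the helper name find_cxxc and the example sequence are made up for illustration, and a real analysis would normally rely on domain databases rather than a bare regular expression.

```python
import re

# A lookahead is used so that overlapping motifs are all reported.
# "X" is taken to mean any residue written as an uppercase letter (an assumption).
CXXC = re.compile(r"(?=(C[A-Z]{2}C))")


def find_cxxc(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based position, motif) pairs for each C-X-X-C occurrence."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in CXXC.finditer(seq)]


# Hypothetical sequence, not taken from any genome in the study.
example = "MKTWCGPCKAIAPCAACDDC"
hits = find_cxxc(example)
print(hits)
```

    Positions are 0-based; add 1 to obtain conventional residue numbering.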

  1. Examining Myddosome Formation by Luminescence-Based Mammalian Interactome Mapping (LUMIER).

    PubMed

    Wolz, Olaf-Oliver; Koegl, Manfred; Weber, Alexander N R

    2018-01-01

    Recent structural, biochemical, and functional studies have led to the notion that many of the post-receptor signaling complexes in innate immunity have a multimeric, multi-protein architecture whose hierarchical assembly is vital for function. The Myddosome is a post-receptor complex in the cytoplasmic signaling of Toll-like receptors (TLR) and the Interleukin-1 receptor (IL-1R), involving the proteins MyD88, IL-1R-associated kinase 4 (IRAK4), and IRAK2. Its importance is strikingly illustrated by the fact that rare germline mutations in MYD88 causing high susceptibility to infections are characterized by failure to assemble Myddosomes; conversely, gain-of-function MYD88 mutations leading to oncogenic hyperactivation of NF-κB show increased Myddosome formation. Reliable methods to probe Myddosome formation experimentally are therefore vital to further study the properties of this important post-receptor complex and its role in innate immunity, such as its regulation by posttranslational modification. Compared to structural and biochemical analyses, luminescence-based mammalian interactome mapping (LUMIER) is a straightforward, automatable, quantifiable, and versatile technique to study protein-protein interactions in a physiologically relevant context. We adapted LUMIER for Myddosome analysis and provide here a basic background of this technique, suitable experimental protocols, and its potential for medium-throughput screening. The principles presented herein can be adapted to other signaling pathways.

  2. Cleavable DNA-protein hybrid molecular beacon: A novel efficient signal translator for sensitive fluorescence anisotropy bioassay.

    PubMed

    Hu, Pan; Yang, Bin

    2016-01-15

    Due to its unique features such as high sensitivity, homogeneous format, and independence from fluorescence intensity, fluorescence anisotropy (FA) assay has become a hotspot of study in oligonucleotide-based bioassays. However, until now most FA probes require carefully customized structure designs, and thus are neither generalizable for different sensing systems nor effective to obtain sufficient signal response. To address this issue, a cleavable DNA-protein hybrid molecular beacon was successfully engineered for signal amplified FA bioassay, via combining the unique stable structure of molecular beacon and the large molecular mass of streptavidin. Compared with single DNA strand probe or conventional molecular beacon, the DNA-protein hybrid molecular beacon exhibited a much higher FA value, which offers the potential for a high signal-to-background ratio in the sensing process. As proof-of-principle, this novel DNA-protein hybrid molecular beacon was further applied for FA bioassay using DNAzyme-Pb(2+) as a model sensing system. This FA assay approach could selectively detect as low as 0.5 nM Pb(2+) in buffer solution, and was also successful in the analysis of real samples, with good recovery values. Compatible with most of oligonucleotide probes' designs and enzyme-based signal amplification strategies, the molecular beacon can serve as a novel signal translator to expand the application prospect of FA technology in various bioassays. Copyright © 2015 Elsevier B.V. All rights reserved.

  3. Mapping of the minimal inorganic phosphate transporting unit of human PiT2 suggests a structure universal to PiT-related proteins from all kingdoms of life

    PubMed Central

    2011-01-01

    Background The inorganic (Pi) phosphate transporter (PiT) family comprises known and putative Na+- or H+-dependent Pi-transporting proteins with representatives from all kingdoms. The mammalian members are placed in the outer cell membranes and suggested to supply cells with Pi to maintain house-keeping functions. Alignment of protein sequences representing PiT family members from all kingdoms reveals the presence of conserved amino acids and that bacterial phosphate permeases and putative phosphate permeases from archaea lack substantial parts of the protein sequence when compared to the mammalian PiT family members. Besides being Na+-dependent Pi (NaPi) transporters, the mammalian PiT paralogs, PiT1 and PiT2, also are receptors for gamma-retroviruses. We have here exploited the dual-function of PiT1 and PiT2 to study the structure-function relationship of PiT proteins. Results We show that the human PiT2 histidine, H502, and the human PiT1 glutamate, E70, - both conserved in eukaryotic PiT family members - are critical for Pi transport function. Noticeably, human PiT2 H502 is located in the C-terminal PiT family signature sequence, and human PiT1 E70 is located in ProDom domains characteristic for all PiT family members. A human PiT2 truncation mutant, which consists of the predicted 10 transmembrane (TM) domain backbone without a large intracellular domain (human PiT2ΔR254-V483), was found to be a fully functional Pi transporter. Further truncation of the human PiT2 protein by additional removal of two predicted TM domains together with the large intracellular domain created a mutant that resembles a bacterial phosphate permease and an archaeal putative phosphate permease. This human PiT2 truncation mutant (human PiT2ΔL183-V483) did also support Pi transport albeit at very low levels. Conclusions The results suggest that the overall structure of the Pi-transporting unit of the PiT family proteins has remained unchanged during evolution. 
Moreover, in combination, our studies of the gene structure of the human PiT1 and PiT2 genes (SLC20A1 and SLC20A2, respectively) and alignment of protein sequences of PiT family members from all kingdoms, along with the studies of the dual functions of the human PiT paralogs show that these proteins are excellent as models for studying the evolution of a protein's structure-function relationship. PMID:21586110

  4. Determinants of BH3 Binding Specificity for Mcl-1 versus Bcl-xL

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dutta, Sanjib; Gullá, Stefano; Chen, T. Scott

    2010-06-25

    Interactions among Bcl-2 family proteins are important for regulating apoptosis. Prosurvival members of the family interact with proapoptotic BH3 (Bcl-2-homology-3)-only members, inhibiting execution of cell death through the mitochondrial pathway. Structurally, this interaction is mediated by binding of the α-helical BH3 region of the proapoptotic proteins to a conserved hydrophobic groove on the prosurvival proteins. Native BH3-only proteins exhibit selectivity in binding prosurvival members, as do small molecules that block these interactions. Understanding the sequence and structural basis of interaction specificity in this family is important, as it may allow the prediction of new Bcl-2 family associations and/or the design of new classes of selective inhibitors to serve as reagents or therapeutics. In this work, we used two complementary techniques - yeast surface display screening from combinatorial peptide libraries and SPOT peptide array analysis - to elucidate specificity determinants for binding to Bcl-xL versus Mcl-1, two prominent prosurvival proteins. We screened a randomized library and identified BH3 peptides that bound to either Mcl-1 or Bcl-xL selectively or to both with high affinity. The peptides competed with native ligands for binding into the conserved hydrophobic groove, as illustrated in detail by a crystal structure of a specific peptide bound to Mcl-1. Mcl-1-selective peptides from the screen were highly specific for binding Mcl-1 in preference to Bcl-xL, Bcl-2, Bcl-w, and Bfl-1, whereas Bcl-xL-selective peptides showed some cross-interaction with related proteins Bcl-2 and Bcl-w. Mutational analyses using SPOT arrays revealed the effects of 170 point mutations made in the background of a peptide derived from the BH3 region of Bim, and a simple predictive model constructed using these data explained much of the specificity observed in our Mcl-1 versus Bcl-xL binders.

  5. Combining automated peak tracking in SAR by NMR with structure-based backbone assignment from 15N-NOESY

    PubMed Central

    2012-01-01

    Background Chemical shift mapping is an important technique in NMR-based drug screening for identifying the atoms of a target protein that potentially bind to a drug molecule upon the molecule's introduction in increasing concentrations. The goal is to obtain a mapping of peaks with known residue assignment from the reference spectrum of the unbound protein to peaks with unknown assignment in the target spectrum of the bound protein. Although a series of perturbed spectra help to trace a path from reference peaks to target peaks, a one-to-one mapping generally is not possible, especially for large proteins, due to errors, such as noise peaks, missing peaks, missing but then reappearing, overlapped, and new peaks not associated with any peaks in the reference. Due to these difficulties, the mapping is typically done manually or semi-automatically, which is not efficient for high-throughput drug screening. Results We present PeakWalker, a novel peak walking algorithm for fast-exchange systems that models the errors explicitly and performs many-to-one mapping. On the proteins: hBclXL, UbcH5B, and histone H1, it achieves an average accuracy of over 95% with less than 1.5 residues predicted per target peak. Given these mappings as input, we present PeakAssigner, a novel combined structure-based backbone resonance and NOE assignment algorithm that uses just 15N-NOESY, while avoiding TOCSY experiments and 13C-labeling, to resolve the ambiguities for a one-to-one mapping. On the three proteins, it achieves an average accuracy of 94% or better. Conclusions Our mathematical programming approach for modeling chemical shift mapping as a graph problem, while modeling the errors directly, is potentially a time- and cost-effective first step for high-throughput drug screening based on limited NMR data and homologous 3D structures. PMID:22536902

  6. Determinants of BH3 binding specificity for Mcl-1 vs. Bcl-xL

    PubMed Central

    Dutta, Sanjib; Gullá, Stefano; Chen, T. Scott; Fire, Emiko; Grant, Robert A.; Keating, Amy E.

    2010-01-01

    Interactions among Bcl-2 family proteins are important for regulating apoptosis. Pro-survival members of the family interact with pro-apoptotic BH3-only members, inhibiting execution of cell death through the mitochondrial pathway. Structurally, this interaction is mediated by binding of the alpha-helical BH3 region of the pro-apoptotic proteins to a conserved hydrophobic groove on the pro-survival proteins. Native BH3-only proteins exhibit selectivity in binding pro-survival members, as do small molecules that block these interactions. Understanding the sequence and structural basis of interaction specificity in this family is important, as it may allow the prediction of new Bcl-2 family associations and/or the design of new classes of selective inhibitors to serve as reagents or therapeutics. In this work we used two complementary techniques, yeast surface display screening from combinatorial peptide libraries and SPOT peptide array analysis, to elucidate specificity determinants for binding to Bcl-xL vs. Mcl-1, two prominent pro-survival proteins. We screened a randomized library and identified BH3 peptides that bound to either Mcl-1 or Bcl-xL selectively, or to both with high affinity. The peptides competed with native ligands for binding into the conserved hydrophobic groove, as illustrated in detail by a crystal structure of a specific peptide bound to Mcl-1. Mcl-1 selective peptides from the screen were highly specific for binding Mcl-1 in preference to Bcl-xL, Bcl-2, Bcl-w and Bfl-1, whereas Bcl-xL selective peptides showed some cross-interaction with related proteins Bcl-2 and Bcl-w. Mutational analyses using SPOT arrays revealed the effects of 170 point mutations made in the background of a peptide derived from the BH3 region of Bim, and a simple predictive model constructed using these data explained much of the specificity observed in our Mcl-1 vs. Bcl-xL binders. PMID:20363230

  7. A stochastic context free grammar based framework for analysis of protein sequences

    PubMed Central

    Dyrka, Witold; Nebel, Jean-Christophe

    2009-01-01

    Background In the last decade, there have been many applications of formal language theory in bioinformatics such as RNA structure prediction and detection of patterns in DNA. However, in the field of proteomics, the size of the protein alphabet and the complexity of relationship between amino acids have mainly limited the application of formal language theory to the production of grammars whose expressive power is not higher than stochastic regular grammars. However, these grammars, like other state of the art methods, cannot cover any higher-order dependencies such as nested and crossing relationships that are common in proteins. In order to overcome some of these limitations, we propose a Stochastic Context Free Grammar based framework for the analysis of protein sequences where grammars are induced using a genetic algorithm. Results This framework was implemented in a system aiming at the production of binding site descriptors. These descriptors not only allow detection of protein regions that are involved in these sites, but also provide insight in their structure. Grammars were induced using quantitative properties of amino acids to deal with the size of the protein alphabet. Moreover, we imposed some structural constraints on grammars to reduce the extent of the rule search space. Finally, grammars based on different properties were combined to convey as much information as possible. Evaluation was performed on sites of various sizes and complexity described either by PROSITE patterns, domain profiles or a set of patterns. Results show the produced binding site descriptors are human-readable and, hence, highlight biologically meaningful features. Moreover, they achieve good accuracy in both annotation and detection. In addition, findings suggest that, unlike current state-of-the-art methods, our system may be particularly suited to deal with patterns shared by non-homologous proteins. 
Conclusion A new Stochastic Context Free Grammar based framework has been introduced allowing the production of binding site descriptors for analysis of protein sequences. Experiments have shown that not only is this new approach valid, but produces human-readable descriptors for binding sites which have been beyond the capability of current machine learning techniques. PMID:19814800
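
    To illustrate how a stochastic context-free grammar assigns a probability to a sequence, the sketch below implements the standard inside (CYK-style) algorithm for a grammar in Chomsky normal form. It is a toy under stated assumptions, not the authors' system: the grammar, the two-letter property alphabet ("h" for hydrophobic, "p" for polar) and all names are hypothetical, and the genetic-algorithm induction step described in the record is not shown. The toy grammar generates nested strings of the form h...hp...p, a dependency that regular grammars cannot express.

```python
from collections import defaultdict


def inside_probability(sequence, lexical, binary, start="S"):
    """Probability that `start` derives `sequence` under a CNF stochastic grammar.

    lexical: {(A, terminal): prob} for rules A -> terminal
    binary:  {(A, B, C): prob}     for rules A -> B C
    """
    n = len(sequence)
    if n == 0:
        return 0.0
    # chart[(i, j)][A] = probability that nonterminal A derives sequence[i:j]
    chart = defaultdict(lambda: defaultdict(float))
    for i, symbol in enumerate(sequence):
        for (a, terminal), p in lexical.items():
            if terminal == symbol:
                chart[(i, i + 1)][a] += p
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point between the two children
                left, right = chart[(i, k)], chart[(k, j)]
                if not left or not right:
                    continue
                for (a, b, c), p in binary.items():
                    pb, pc = left.get(b, 0.0), right.get(c, 0.0)
                    if pb and pc:
                        chart[(i, j)][a] += p * pb * pc
    return chart[(0, n)].get(start, 0.0)


# Hypothetical toy grammar: S -> H P (0.7) | H T (0.3), T -> S P (1.0)
LEXICAL = {("H", "h"): 1.0, ("P", "p"): 1.0}
BINARY = {("S", "H", "P"): 0.7, ("S", "H", "T"): 0.3, ("T", "S", "P"): 1.0}

print(inside_probability("hhpp", LEXICAL, BINARY))
```

    In a descriptor-style application, a window of a protein sequence would first be mapped to property classes and then scored this way, with high-probability windows flagged as candidate sites.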

  8. Role of long- and short-range hydrophobic, hydrophilic and charged residues contact network in protein’s structural organization

    PubMed Central

    2012-01-01

    Background The three-dimensional structure of a protein can be described as a graph where nodes represent residues and the strength of non-covalent interactions between them are edges. These protein contact networks can be separated into long and short-range interactions networks depending on the positions of amino acids in primary structure. Long-range interactions play a distinct role in determining the tertiary structure of a protein while short-range interactions could largely contribute to the secondary structure formations. In addition, physico chemical properties and the linear arrangement of amino acids of the primary structure of a protein determines its three dimensional structure. Here, we present an extensive analysis of protein contact subnetworks based on the London van der Waals interactions of amino acids at different length scales. We further subdivided those networks in hydrophobic, hydrophilic and charged residues networks and have tried to correlate their influence in the overall topology and organization of a protein. Results The largest connected component (LCC) of long (LRN)-, short (SRN)- and all-range (ARN) networks within proteins exhibit a transition behaviour when plotted against different interaction strengths of edges among amino acid nodes. While short-range networks having chain like structures exhibit highly cooperative transition; long- and all-range networks, which are more similar to each other, have non-chain like structures and show less cooperativity. Further, the hydrophobic residues subnetworks in long- and all-range networks have similar transition behaviours with all residues all-range networks, but the hydrophilic and charged residues networks don’t. While the nature of transitions of LCC’s sizes is same in SRNs for thermophiles and mesophiles, there exists a clear difference in LRNs. 
The presence of larger clusters of interconnected long-range interactions in thermophiles than in mesophiles, even at higher interaction strengths between amino acids, gives extra stability to the tertiary structure of the thermophiles. All the subnetworks at different length scales (ARNs, LRNs and SRNs) show assortative mixing of their participating amino acids. While there is a significantly higher percentage of hydrophobic subclusters than of others in ARNs and LRNs, we do not find assortative mixing behaviour in any of the subclusters in SRNs. The clustering coefficient of hydrophobic subclusters in the long-range network is the highest among all types of subnetworks. There exist highly cliquish hydrophobic nodes followed by charged nodes in LRNs and ARNs; on the other hand, we observe the highest dominance of charged residue cliques in short-range networks. Studies on the perimeter of the cliques also show higher occurrences of hydrophobic and charged residue cliques. Conclusions The simple framework of protein contact networks and their subnetworks based on the London van der Waals force is able to capture several known properties of protein structure and to unravel several new features. The thermophiles not only have a higher number of long-range interactions; they also have larger clusters of connected residues at higher interaction strengths among amino acids than their mesophilic counterparts. The framework can reestablish the significant role of long-range hydrophobic clusters in protein folding and stabilization; at the same time, it sheds light on the higher communication ability of hydrophobic subnetworks over the others. The results give an indication of the controlling role of hydrophobic subclusters in determining a protein’s folding rate.
The higher perimeters of hydrophobic and charged cliques imply that charged as well as hydrophobic residues help stabilize distant parts of a protein's primary structure through London van der Waals interactions. PMID:22720789
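
    The transition analysis described in this abstract can be illustrated with a minimal sketch: edges are kept only when their interaction strength reaches a cutoff, and the size of the largest connected component (LCC) is recorded for each cutoff. The contact list, the weights, and the sequence-separation bounds `min_sep` and `max_sep` used to split long- from short-range contacts are hypothetical values chosen for illustration; they are not taken from the paper.

```python
from collections import defaultdict


def largest_component_size(n_nodes, edges):
    """Size of the largest connected component, using union-find."""
    parent = list(range(n_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj

    sizes = defaultdict(int)
    for node in range(n_nodes):
        sizes[find(node)] += 1
    return max(sizes.values()) if sizes else 0


def lcc_profile(n_residues, contacts, cutoffs, min_sep=None, max_sep=None):
    """LCC size at each strength cutoff, optionally restricted by sequence separation.

    contacts: iterable of (i, j, strength) with residue indices i and j.
    min_sep / max_sep: illustrative bounds on |i - j| (assumed, not from the source).
    """
    profile = []
    for cutoff in cutoffs:
        edges = [
            (i, j)
            for i, j, strength in contacts
            if strength >= cutoff
            and (min_sep is None or abs(i - j) >= min_sep)
            and (max_sep is None or abs(i - j) <= max_sep)
        ]
        profile.append(largest_component_size(n_residues, edges))
    return profile


# Toy six-residue network with made-up interaction strengths.
toy_contacts = [(0, 1, 5), (1, 2, 4), (2, 3, 1), (3, 4, 5), (4, 5, 2), (0, 5, 3)]
all_range = lcc_profile(6, toy_contacts, cutoffs=[1, 3, 5])
long_range = lcc_profile(6, toy_contacts, cutoffs=[1, 3, 5], min_sep=4)
```

    Plotting such a profile against the cutoff gives the kind of transition curve the abstract refers to; a sharper drop would correspond to a more cooperative transition.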

  9. ProteinShader: illustrative rendering of macromolecules

    PubMed Central

    Weber, Joseph R

    2009-01-01

    Background Cartoon-style illustrative renderings of proteins can help clarify structural features that are obscured by space filling or balls and sticks style models, and recent advances in programmable graphics cards offer many new opportunities for improving illustrative renderings. Results The ProteinShader program, a new tool for macromolecular visualization, uses information from Protein Data Bank files to produce illustrative renderings of proteins that approximate what an artist might create by hand using pen and ink. A combination of Hermite and spherical linear interpolation is used to draw smooth, gradually rotating three-dimensional tubes and ribbons with a repeating pattern of texture coordinates, which allows the application of texture mapping, real-time halftoning, and smooth edge lines. This free platform-independent open-source program is written primarily in Java, but also makes extensive use of the OpenGL Shading Language to modify the graphics pipeline. Conclusion By programming to the graphics processor unit, ProteinShader is able to produce high quality images and illustrative rendering effects in real-time. The main feature that distinguishes ProteinShader from other free molecular visualization tools is its use of texture mapping techniques that allow two-dimensional images to be mapped onto the curved three-dimensional surfaces of ribbons and tubes with minimum distortion of the images. PMID:19331660
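
    The two interpolation schemes named in this abstract can be sketched briefly. The code below is a generic illustration of cubic Hermite interpolation for positions and spherical linear interpolation (slerp) for unit quaternions; it is not the ProteinShader implementation (which is in Java and GLSL), and the function names and the near-parallel threshold are assumptions.

```python
import math


def hermite(p0, p1, m0, m1, t):
    """Cubic Hermite point between p0 and p1 with tangents m0 and m1, t in [0, 1]."""
    t2, t3 = t * t, t * t * t
    h00 = 2 * t3 - 3 * t2 + 1
    h10 = t3 - 2 * t2 + t
    h01 = -2 * t3 + 3 * t2
    h11 = t3 - t2
    return tuple(
        h00 * a + h10 * ma + h01 * b + h11 * mb
        for a, b, ma, mb in zip(p0, p1, m0, m1)
    )


def slerp(q0, q1, t):
    """Spherical linear interpolation between unit quaternions (w, x, y, z)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:  # take the shorter arc
        q1 = tuple(-c for c in q1)
        dot = -dot
    if dot > 0.9995:  # nearly parallel: fall back to normalized lerp
        q = tuple(a + t * (b - a) for a, b in zip(q0, q1))
        norm = math.sqrt(sum(c * c for c in q))
        return tuple(c / norm for c in q)
    theta = math.acos(dot)
    s0 = math.sin((1.0 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return tuple(s0 * a + s1 * b for a, b in zip(q0, q1))


# Halfway between the identity and a 90-degree rotation about z.
identity = (1.0, 0.0, 0.0, 0.0)
quarter_turn_z = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
halfway = slerp(identity, quarter_turn_z, 0.5)
```

    Interpolating the backbone position with `hermite` and the ribbon orientation with `slerp` at the same parameter `t` is one plausible way to obtain a smooth, gradually rotating tube or ribbon.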

  10. DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier.

    PubMed

    Kulmanov, Maxat; Khan, Mohammed Asif; Hoehndorf, Robert; Wren, Jonathan

    2018-02-15

    A large number of protein sequences are becoming available through the application of novel high-throughput sequencing technologies. Experimental functional characterization of these proteins is time-consuming and expensive, and is often only done rigorously for a few selected model organisms. Computational function prediction approaches have been suggested to fill this gap. The functions of proteins are classified using the Gene Ontology (GO), which contains over 40 000 classes. Additionally, proteins have multiple functions, making function prediction a large-scale, multi-class, multi-label problem. We have developed a novel method to predict protein function from sequence. We use deep learning to learn features from protein sequences as well as a cross-species protein-protein interaction network. Our approach specifically outputs information in the structure of the GO and utilizes the dependencies between GO classes as background information to construct a deep learning model. We evaluate our method using the standards established by the Computational Assessment of Function Annotation (CAFA) and demonstrate a significant improvement over baseline methods such as BLAST, in particular for predicting cellular locations. Web server: http://deepgo.bio2vec.net, Source code: https://github.com/bio-ontology-research-group/deepgo. Contact: robert.hoehndorf@kaust.edu.sa. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
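
    One simple way to make multi-label predictions respect the dependencies between ontology classes is to give every class a score at least as high as any of its descendants. The sketch below illustrates that idea on a toy hierarchy; it is not the DeepGO model itself, and the term names, scores, and the `propagate_scores` helper are hypothetical.

```python
def propagate_scores(scores, parents):
    """Return scores in which each term scores at least as high as its descendants.

    scores:  {term: raw prediction score}
    parents: {term: [parent terms]} describing a directed acyclic hierarchy.
    """
    children = {}
    for child, parent_terms in parents.items():
        for parent in parent_terms:
            children.setdefault(parent, []).append(child)

    memo = {}

    def consistent(term):
        # Maximum over the term's own score and its children's consistent scores.
        if term not in memo:
            candidates = [scores.get(term, 0.0)]
            candidates.extend(consistent(c) for c in children.get(term, []))
            memo[term] = max(candidates)
        return memo[term]

    terms = set(scores) | set(parents) | set(children)
    return {term: consistent(term) for term in terms}


# Toy hierarchy: D is a child of both B and C, which are children of A.
toy_parents = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}
toy_scores = {"A": 0.1, "B": 0.3, "C": 0.7, "D": 0.6}
consistent_scores = propagate_scores(toy_scores, toy_parents)
```

    After propagation, a protein predicted to have a specific function is never scored lower for the more general classes that contain it.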

  11. Discovery: an interactive resource for the rational selection and comparison of putative drug target proteins in malaria

    PubMed Central

    Joubert, Fourie; Harrison, Claudia M; Koegelenberg, Riaan J; Odendaal, Christiaan J; de Beer, Tjaart AP

    2009-01-01

    Background Up to half a billion human clinical cases of malaria are reported each year, resulting in about 2.7 million deaths, most of which occur in sub-Saharan Africa. Due to the over-and misuse of anti-malarials, widespread resistance to all the known drugs is increasing at an alarming rate. Rational methods to select new drug target proteins and lead compounds are urgently needed. The Discovery system provides data mining functionality on extensive annotations of five malaria species together with the human and mosquito hosts, enabling the selection of new targets based on multiple protein and ligand properties. Methods A web-based system was developed where researchers are able to mine information on malaria proteins and predicted ligands, as well as perform comparisons to the human and mosquito host characteristics. Protein features used include: domains, motifs, EC numbers, GO terms, orthologs, protein-protein interactions, protein-ligand interactions and host-pathogen interactions among others. Searching by chemical structure is also available. Results An in silico system for the selection of putative drug targets and lead compounds is presented, together with an example study on the bifunctional DHFR-TS from Plasmodium falciparum. Conclusion The Discovery system allows for the identification of putative drug targets and lead compounds in Plasmodium species based on the filtering of protein and chemical properties. PMID:19642978

  12. The Salivary Secretome of the Tsetse Fly Glossina pallidipes (Diptera: Glossinidae) Infected by Salivary Gland Hypertrophy Virus

    PubMed Central

    Kariithi, Henry M.; Ince, Ikbal A.; Boeren, Sjef; Abd-Alla, Adly M. M.; Parker, Andrew G.; Aksoy, Serap; Vlak, Just M.; van Oers, Monique M.

    2011-01-01

    Background The competence of the tsetse fly Glossina pallidipes (Diptera; Glossinidae) to acquire salivary gland hypertrophy virus (SGHV), to support virus replication and successfully transmit the virus depends on complex interactions between Glossina and SGHV macromolecules. Critical requisites to SGHV transmission are its replication and secretion of mature virions into the fly's salivary gland (SG) lumen. However, secretion of host proteins is of equal importance for successful transmission and requires cataloging of G. pallidipes secretome proteins from hypertrophied and non-hypertrophied SGs. Methodology/Principal Findings After electrophoretic profiling and in-gel trypsin digestion, saliva proteins were analyzed by nano-LC-MS/MS. A MaxQuant/Andromeda search of the MS data against the non-redundant (nr) GenBank database and a G. morsitans morsitans SG EST database yielded a total of 521 hits, 31 of which were SGHV-encoded. At a false discovery rate limit of 1% and a detection threshold of at least 2 unique peptides per protein, the analysis resulted in 292 Glossina and 25 SGHV MS-supported proteins. When annotated by the Blast2GO suite, at least one gene ontology (GO) term could be assigned to 89.9% (285/317) of the detected proteins. Five (∼1.8%) Glossina and three (∼12%) SGHV proteins remained without a predicted function after blast searches against the nr database. Sixty-five of the 292 detected Glossina proteins contained an N-terminal signal/secretion peptide sequence. Eight of the SGHV proteins were predicted to be non-structural (NS), and fourteen are known structural (VP) proteins. Conclusions/Significance SGHV alters the protein expression pattern in Glossina. The G. pallidipes SG secretome encompasses a spectrum of proteins that may be required during the SGHV infection cycle. These detected proteins have putative interactions with at least 21 of the 25 SGHV-encoded proteins.
Our findings open avenues for developing novel SGHV mitigation strategies to block SGHV infections in tsetse production facilities, such as using SGHV-specific antibodies and phage display-selected gut epithelia-binding peptides. PMID:22132244

  13. Crystallographic and Molecular Dynamics Analysis of Loop Motions Unmasking the Peptidoglycan-Binding Site in Stator Protein MotB of Flagellar Motor

    PubMed Central

    Nahar, Musammat F.; Buckle, Ashley M.; Roujeinikova, Anna

    2011-01-01

    Background The C-terminal domain of MotB (MotB-C) shows high sequence similarity to outer membrane protein A and related peptidoglycan (PG)-binding proteins. It is believed to anchor the power-generating MotA/MotB stator unit of the bacterial flagellar motor to the peptidoglycan layer of the cell wall. We previously reported the first crystal structure of this domain and made a puzzling observation that all conserved residues that are thought to be essential for PG recognition are buried and inaccessible in the crystal structure. In this study, we tested a hypothesis that peptidoglycan binding is preceded by, or accompanied by, some structural reorganization that exposes the key conserved residues. Methodology/Principal Findings We determined the structure of a new crystalline form (Form B) of Helicobacter pylori MotB-C. Comparisons with the existing Form A revealed conformational variations in the petal-like loops around the carbohydrate binding site near one end of the β-sheet. These variations are thought to reflect natural flexibility at this site required for insertion into the peptidoglycan mesh. In order to understand the nature of this flexibility we have performed molecular dynamics simulations of the MotB-C dimer. The results are consistent with the crystallographic data and provide evidence that the three loops move in a concerted fashion, exposing conserved MotB residues that have previously been implicated in binding of the peptide moiety of peptidoglycan. Conclusion/Significance Our structural analysis provides a new insight into the mechanism by which MotB inserts into the peptidoglycan mesh, thus anchoring the power-generating complex to the cell wall. PMID:21533052

  14. Structural molecular biology: Recent results from neutron diffraction

    NASA Astrophysics Data System (ADS)

    Timmins, Peter A.

    1995-02-01

    Neutron diffraction is of importance in structural biology at several different levels of resolution. In most cases the unique possibility arising from deuterium labelling or contrast variation is of fundamental importance in providing information complementary to that which can be obtained from X-ray diffraction. At high resolution, neutron crystallography of proteins allows the location of hydrogen atoms in the molecule or of the hydration water, both of which may be central to biological activity. A major difficulty in this field has been the poor signal-to-noise ratio of the data arising not only from relatively low beam intensities and small crystals but, most importantly, from the incoherent background due to hydrogen atoms in the sample. Modern methods of molecular biology now offer ways of producing fully deuterated proteins by cloning in bacteria grown on fully deuterated media. At a slightly lower resolution, there are a number of systems which may be ordered in one or two dimensions. This is the case in the purple membrane where neutron diffraction with deuterium labelling has complemented high resolution electron diffraction. Finally there is a class of very large macromolecular systems which can be crystallised and have been studied by X-ray diffraction but in which part of the structure is locally disordered and usually has insufficient contrast to be seen with X-rays. In this case the use of H2O/D2O contrast variation allows these components to be located. Examples of this are the nucleic acid in virus structures and detergent bound to membrane proteins.

  15. Using molecular principal axes for structural comparison: determining the tertiary changes of a FAB antibody domain induced by antigenic binding

    PubMed Central

    Silverman, B David

    2007-01-01

    Background Comparison of different protein x-ray structures has previously been made in a number of different ways; for example, by visual examination, by differences in the locations of secondary structures, by explicit superposition of structural elements, e.g. α-carbon atom locations, or by procedures that utilize a common symmetry element or geometrical feature of the structures to be compared. Results A new approach is applied to determine the structural changes that an antibody protein domain experiences upon its interaction with an antigenic target. These changes are determined with the use of two different, however comparable, sets of principal axes that are obtained by diagonalizing the second-order tensors that yield the moments-of-geometry as well as an ellipsoidal characterization of domain shape, prior to and after interaction. Determination of these sets of axes for structural comparison requires no internal symmetry features of the domains, depending solely upon their representation in three-dimensional space. This representation may involve atomic, Cα, or residue centroid coordinates. The present analysis utilizes residue centroids. When the structural changes are minimal, the principal axes of the domains, prior to and after interaction, are essentially comparable and consequently may be used for structural comparison. When the differences of the axes cannot be neglected, but are nevertheless slight, a smaller relatively invariant substructure of the domains may be utilized for comparison. The procedure yields two distance metrics for structural comparison. First, the displacements of the residue centroids due to antigenic binding, referenced to the ellipsoidal principal axes, are noted. Second, changes in the ellipsoidal distances with respect to the non-interacting structure provide a direct measure of the spatial displacements of the residue centroids, towards either the interior or exterior of the domain. 
Conclusion With use of x-ray data from the protein data bank (PDB), these two metrics are shown to highlight, in a manner different from before, the structural changes that are induced in the overall domains as well as in the H3 loops of the complementarity-determining regions (CDR) upon FAB antibody binding to a truncated and to a synthetic hemagglutinin viral antigenic target. PMID:17996091
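
    The principal-axes construction described in this abstract can be sketched in a few lines. The code below builds a second-order tensor from unit-weighted points (standing in for residue centroids) about their centroid and diagonalizes it with cyclic Jacobi rotations. The exact tensor definition, the helper names, and the toy coordinates are assumptions for illustration and may differ from the moments-of-geometry used in the paper.

```python
import math


def geometry_tensor(points):
    """Tensor sum(|r|^2 * I - r r^T) about the centroid, with unit weights."""
    n = len(points)
    centroid = [sum(p[k] for p in points) / n for k in range(3)]
    tensor = [[0.0] * 3 for _ in range(3)]
    for p in points:
        r = [p[k] - centroid[k] for k in range(3)]
        r2 = sum(c * c for c in r)
        for a in range(3):
            for b in range(3):
                tensor[a][b] += (r2 if a == b else 0.0) - r[a] * r[b]
    return centroid, tensor


def jacobi_eigh(matrix, sweeps=50, tol=1e-20):
    """Eigenpairs of a small symmetric matrix, sorted by ascending eigenvalue."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                if a[p][p] == a[q][q]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q of A and V
                    a[k][p], a[k][q] = c * a[k][p] - s * a[k][q], s * a[k][p] + c * a[k][q]
                    v[k][p], v[k][q] = c * v[k][p] - s * v[k][q], s * v[k][p] + c * v[k][q]
                for k in range(n):  # rotate rows p and q of A
                    a[p][k], a[q][k] = c * a[p][k] - s * a[q][k], s * a[p][k] + c * a[q][k]
    return sorted((a[i][i], [v[k][i] for k in range(n)]) for i in range(n))


def principal_axes(points):
    """Centroid plus (moment, unit axis) pairs, smallest moment first."""
    centroid, tensor = geometry_tensor(points)
    return centroid, jacobi_eigh(tensor)


# Toy point set elongated along a direction 30 degrees from x in the xy-plane.
angle = math.radians(30)
base = [(3, 0, 0), (-3, 0, 0), (0, 2, 0), (0, -2, 0), (0, 0, 1), (0, 0, -1)]
toy_points = [
    (x * math.cos(angle) - y * math.sin(angle), x * math.sin(angle) + y * math.cos(angle), z)
    for x, y, z in base
]
toy_centroid, toy_axes = principal_axes(toy_points)
```

    Expressing the centroid coordinates of the bound and unbound domains in their respective principal-axis frames then allows residue displacements to be compared without an explicit superposition.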

  16. Discrete structural features among interface residue-level classes

    PubMed Central

    2015-01-01

    Background Protein-protein interaction (PPI) is essential for molecular functions in biological cells. Investigation on protein interfaces of known complexes is an important step towards deciphering the driving forces of PPIs. Each PPI complex is specific, sensitive and selective to binding. Therefore, we have estimated the relative difference in percentage of polar residues between surface and the interface for each complex in a non-redundant heterodimer dataset of 278 complexes to understand the predominant forces driving binding. Results Our analysis showed ~60% of protein complexes with surface polarity greater than interface polarity (designated as class A). However, a considerable number of complexes (~40%) have interface polarity greater than surface polarity, (designated as class B), with a significantly different p-value of 1.66E-45 from class A. Comprehensive analyses of protein complexes show that interface features such as interface area, interface polarity abundance, solvation free energy gain upon interface formation, binding energy and the percentage of interface charged residue abundance distinguish among class A and class B complexes, while electrostatic visualization maps also help differentiate interface classes among complexes. Conclusions Class A complexes are classical with abundant non-polar interactions at the interface; however class B complexes have abundant polar interactions at the interface, similar to protein surface characteristics. Five physicochemical interface features analyzed from the protein heterodimer dataset are discriminatory among the interface residue-level classes. These novel observations find application in developing residue-level models for protein-protein binding prediction, protein-protein docking studies and interface inhibitor design as drugs. PMID:26679043

  17. Compounds that correct F508del-CFTR trafficking can also correct other protein trafficking diseases: an in vitro study using cell lines

    PubMed Central

    2013-01-01

    Background Many genetic diseases are due to defects in protein trafficking where the mutant protein is recognized by the quality control systems, retained in the endoplasmic reticulum (ER), and degraded by the proteasome. In many cases, the mutant protein retains function if it can be trafficked to its proper cellular location. We have identified structurally diverse correctors that restore the trafficking and function of the most common mutation causing cystic fibrosis, F508del-CFTR. Most of these correctors do not act directly as ligands of CFTR, but indirectly on other pathways to promote folding and correction. We hypothesize that these proteostasis regulators may also correct other protein trafficking diseases. Methods To test our hypothesis, we used stable cell lines or transient transfection to express 2 well-studied trafficking disease mutations in each of 3 different proteins: the arginine-vasopressin receptor 2 (AVPR2, also known as V2R), the human ether-a-go-go-related gene (KCNH2, also known as hERG), and finally the sulfonylurea receptor 1 (ABCC8, also known as SUR1). We treated cells expressing these mutant proteins with 9 structurally diverse F508del-CFTR correctors that function through different cellular mechanisms and assessed whether correction occurred via immunoblotting and functional assays. Results were deemed significantly different from controls by a one-way ANOVA (p < 0.05). Results Here we show that F508del-CFTR correctors RDR1, KM60 and KM57 also correct some mutant alleles of other protein trafficking diseases. We also show that one corrector, the cardiac glycoside ouabain, was found to alter the glycosylation of all mutant alleles tested. Conclusions Correctors of F508del-CFTR trafficking might have broader applications to other protein trafficking diseases. PMID:23316740

  18. Applying Recovery Biomarkers to Calibrate Self-Report Measures of Energy and Protein in the Hispanic Community Health Study/Study of Latinos.

    PubMed

    Mossavar-Rahmani, Yasmin; Shaw, Pamela A; Wong, William W; Sotres-Alvarez, Daniela; Gellman, Marc D; Van Horn, Linda; Stoutenberg, Mark; Daviglus, Martha L; Wylie-Rosett, Judith; Siega-Riz, Anna Maria; Ou, Fang-Shu; Prentice, Ross L

    2015-06-15

    We investigated measurement error in the self-reported diets of US Hispanics/Latinos, who are prone to obesity and related comorbidities, by background (Central American, Cuban, Dominican, Mexican, Puerto Rican, and South American) in 2010-2012. In 477 participants aged 18-74 years, doubly labeled water and urinary nitrogen were used as objective recovery biomarkers of energy and protein intakes. Self-report was captured from two 24-hour dietary recalls. All measures were repeated in a subsample of 98 individuals. We examined the bias of dietary recalls and their associations with participant characteristics using generalized estimating equations. Energy intake was underestimated by 25.3% (men, 21.8%; women, 27.3%), and protein intake was underestimated by 18.5% (men, 14.7%; women, 20.7%). Protein density was overestimated by 10.7% (men, 11.3%; women, 10.1%). Higher body mass index and Hispanic/Latino background were associated with underestimation of energy (P<0.05). For protein intake, higher body mass index, older age, nonsmoking, Spanish speaking, and Hispanic/Latino background were associated with underestimation (P<0.05). Systematic underreporting of energy and protein intakes and overreporting of protein density were found to vary significantly by Hispanic/Latino background. We developed calibration equations that correct for subject-specific error in reporting that can be used to reduce bias in diet-disease association studies. © The Author 2015. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health.

  19. The E2 glycoprotein of classical swine fever virus is a virulence determinant in swine.

    PubMed

    Risatti, G R; Borca, M V; Kutish, G F; Lu, Z; Holinka, L G; French, R A; Tulman, E R; Rock, D L

    2005-03-01

    To identify genetic determinants of classical swine fever virus (CSFV) virulence and host range, chimeras of the highly pathogenic Brescia strain and the attenuated vaccine strain CS were constructed and evaluated for viral virulence in swine. Upon initial screening, only chimeras 138.8v and 337.14v, the only chimeras containing the E2 glycoprotein of CS, were attenuated in swine despite exhibiting unaltered growth characteristics in primary porcine macrophage cell cultures. Additional viral chimeras were constructed to confirm the role of E2 in virulence. Chimeric virus 319.1v, which contained only the CS E2 glycoprotein in the Brescia background, was markedly attenuated in pigs, exhibiting significantly decreased virus replication in tonsils, a transient viremia, limited generalization of infection, and decreased virus shedding. Chimeras encoding all Brescia structural proteins in a CS genetic background remained attenuated, indicating that additional mutations outside the structural region are important for CS vaccine virus attenuation. These results demonstrate that CS E2 alone is sufficient for attenuating Brescia, indicating a significant role for the CSFV E2 glycoprotein in swine virulence.

  20. A genome-wide survey on basic helix-loop-helix transcription factors in giant panda.

    PubMed

    Dang, Chunwang; Wang, Yong; Zhang, Debao; Yao, Qin; Chen, Keping

    2011-01-01

    The giant panda (Ailuropoda melanoleuca) is a critically endangered mammalian species. Studies on functions of regulatory proteins involved in developmental processes would facilitate understanding of specific behavior in giant panda. The basic helix-loop-helix (bHLH) proteins play essential roles in a wide range of developmental processes in higher organisms. bHLH family members have been identified in over 20 organisms, including fruit fly, zebrafish, mouse and human. Our present study identified 107 bHLH family members encoded in the giant panda genome. Phylogenetic analyses revealed that they belong to 44 bHLH families with 46, 25, 15, 4, 11 and 3 members in groups A, B, C, D, E and F, respectively, while the remaining 3 members were assigned as "orphans". Compared to mouse, the giant panda does not encode seven bHLH proteins, namely Beta3a, Mesp2, Sclerax, S-Myc, Hes5 (or Hes6), EBF4 and Orphan 1. These results provide useful background information for future studies on structure and function of bHLH proteins in the regulation of giant panda development.

  1. A population-based evolutionary search approach to the multiple minima problem in de novo protein structure prediction

    PubMed Central

    2013-01-01

    Background Elucidating the native structure of a protein molecule from its sequence of amino acids, a problem known as de novo structure prediction, is a long standing challenge in computational structural biology. Difficulties in silico arise due to the high dimensionality of the protein conformational space and the ruggedness of the associated energy surface. The issue of multiple minima is a particularly troublesome hallmark of energy surfaces probed with current energy functions. In contrast to the true energy surface, these surfaces are weakly-funneled and rich in comparably deep minima populated by non-native structures. For this reason, many algorithms seek to be inclusive and obtain a broad view of the low-energy regions through an ensemble of low-energy (decoy) conformations. Conformational diversity in this ensemble is key to increasing the likelihood that the native structure has been captured. Methods We propose an evolutionary search approach to address the multiple-minima problem in decoy sampling for de novo structure prediction. Two population-based evolutionary search algorithms are presented that follow the basic approach of treating conformations as individuals in an evolving population. Coarse graining and molecular fragment replacement are used to efficiently obtain protein-like child conformations from parents. Potential energy is used both to bias parent selection and determine which subset of parents and children will be retained in the evolving population. The effect on the decoy ensemble of sampling minima directly is measured by additionally mapping a conformation to its nearest local minimum before considering it for retainment. The resulting memetic algorithm thus evolves not just a population of conformations but a population of local minima. Results and conclusions Results show that both algorithms are effective in terms of sampling conformations in proximity of the known native structure. 
The additional minimization is shown to be key to enhancing sampling capability and obtaining a diverse ensemble of decoy conformations, circumventing premature convergence to sub-optimal regions in the conformational space, and approaching the native structure with proximity that is comparable to state-of-the-art decoy sampling methods. The results are shown to be robust and valid when using two representative state-of-the-art coarse-grained energy functions. PMID:24565020
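
    The memetic scheme outlined in this abstract (fragment-style variation, energy-biased parent selection, local minimization of each child, and energy-based truncation of parents plus children) can be sketched on a toy problem. Everything below is a hypothetical stand-in: the "conformation" is a short list of angles, the energy is an arbitrary rugged function rather than a protein force field, and the parameter values are chosen only to keep the example fast.

```python
import math
import random


def energy(conf):
    """Toy rugged energy with several minima per angle (not a protein force field)."""
    return sum(math.cos(3.0 * a) + 0.5 * math.cos(a - 1.0) for a in conf)


def fragment_replace(parent, rng, frag_len=3):
    """Replace a contiguous stretch of angles, loosely mimicking fragment replacement."""
    child = parent[:]
    start = rng.randrange(len(child) - frag_len + 1)
    for k in range(start, start + frag_len):
        child[k] = rng.uniform(-math.pi, math.pi)
    return child


def local_minimize(conf, step=0.1, max_iters=100):
    """Greedy coordinate descent; accepts only moves that lower the energy."""
    best, best_e = conf[:], energy(conf)
    for _ in range(max_iters):
        improved = False
        for k in range(len(best)):
            for delta in (step, -step):
                trial = best[:]
                trial[k] += delta
                trial_e = energy(trial)
                if trial_e < best_e:
                    best, best_e, improved = trial, trial_e, True
        if not improved:
            break
    return best


def tournament(pop, rng):
    """Energy-biased parent selection: the better of two random individuals."""
    a, b = rng.choice(pop), rng.choice(pop)
    return a if energy(a) <= energy(b) else b


def memetic_search(n_angles=8, pop_size=10, generations=10, seed=0):
    rng = random.Random(seed)
    pop = [
        local_minimize([rng.uniform(-math.pi, math.pi) for _ in range(n_angles)])
        for _ in range(pop_size)
    ]
    initial_best = min(energy(c) for c in pop)
    for _ in range(generations):
        children = [
            local_minimize(fragment_replace(tournament(pop, rng), rng))
            for _ in range(pop_size)
        ]
        # Keep the lowest-energy individuals among parents and children.
        pop = sorted(pop + children, key=energy)[:pop_size]
    return pop, initial_best


final_pop, initial_best = memetic_search()
```

    Because every individual is minimized before it competes for retention, the population consists of approximate local minima rather than raw samples. Plain truncation, as used here, does not by itself preserve diversity, so a practical decoy sampler would likely need an additional diversity criterion.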

  2. Transcripts with in silico predicted RNA structure are enriched everywhere in the mouse brain

    PubMed Central

    2012-01-01

    Background Post-transcriptional control of gene expression is mostly conducted by specific elements in untranslated regions (UTRs) of mRNAs, in collaboration with specific binding proteins and RNAs. In several well characterized cases, these RNA elements are known to form stable secondary structures. RNA secondary structures also may have major functional implications for long noncoding RNAs (lncRNAs). Recent transcriptional data has indicated the importance of lncRNAs in brain development and function. However, no methodical efforts to investigate this have been undertaken. Here, we aim to systematically analyze the potential for RNA structure in brain-expressed transcripts. Results By comprehensive spatial expression analysis of the adult mouse in situ hybridization data of the Allen Mouse Brain Atlas, we show that transcripts (coding as well as non-coding) associated with in silico predicted structured probes are highly and significantly enriched in almost all analyzed brain regions. Functional implications of these RNA structures and their role in the brain are discussed in detail along with specific examples. We observe that mRNAs with a structure prediction in their UTRs are enriched for binding, transport and localization gene ontology categories. In addition, after manual examination we observe agreement between RNA binding protein interaction sites near the 3’ UTR structures and correlated expression patterns. Conclusions Our results show a potential use for RNA structures in expressed coding as well as noncoding transcripts in the adult mouse brain, and describe the role of structured RNAs in the context of intracellular signaling pathways and regulatory networks. Based on this data we hypothesize that RNA structure is widely involved in transcriptional and translational regulatory mechanisms in the brain and ultimately plays a role in brain function. PMID:22651826

  3. A comparative analysis of the foamy and ortho virus capsid structures reveals an ancient domain duplication.

    PubMed

    Taylor, William R; Stoye, Jonathan P; Taylor, Ian A

    2017-04-04

    The Spumaretrovirinae (foamy viruses) and the Orthoretrovirinae (e.g. HIV) share many similarities both in genome structure and the sequences of the core viral encoded proteins, such as the aspartyl protease and reverse transcriptase. Similarity in the gag region of the genome is less obvious at the sequence level but has been illuminated by the recent solution of the foamy virus capsid (CA) structure. This revealed a clear structural similarity to the orthoretrovirus capsids but with marked differences that left uncertainty in the relationship between the two domains that comprise the structure. We have applied protein structure comparison methods in order to try and resolve this ambiguous relationship. These included both the DALI method and the SAP method, with rigorous statistical tests applied to the results of both methods. For this, we employed collections of artificial fold 'decoys' (generated from the pair of native structures being compared) to provide a customised background distribution for each comparison, thus allowing significance levels to be estimated. We have shown that the relationship of the two domains conforms to a simple linear correspondence rather than a domain transposition. These similarities suggest that the origin of both viral capsids was a common ancestor with a double domain structure. In addition, we show that there is also a significant structural similarity between the amino and carboxy domains in both the foamy and ortho viruses. These results indicate that, as well as the duplication of the double domain capsid, there may have been an even more ancient gene-duplication that preceded the double domain structure. In addition, our structure comparison methodology demonstrates a general approach to problems where the components have a high intrinsic level of similarity.
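
    The idea of estimating significance against a customised background of decoy comparisons can be illustrated with a short sketch. It assumes that a higher score means greater structural similarity and that the decoy scores have already been computed by some comparison method; the function name and the toy numbers are hypothetical and do not come from the paper.

```python
import statistics


def significance(native_score, decoy_scores):
    """Z-score and empirical p-value of a native score against a decoy background."""
    mean = statistics.mean(decoy_scores)
    stdev = statistics.stdev(decoy_scores)  # sample standard deviation
    z_score = (native_score - mean) / stdev
    # Add-one correction so the empirical p-value is never exactly zero.
    at_least_as_good = sum(1 for s in decoy_scores if s >= native_score)
    p_value = (at_least_as_good + 1) / (len(decoy_scores) + 1)
    return z_score, p_value


# Toy background of five decoy scores and one native comparison score.
toy_z, toy_p = significance(7.0, [1.0, 2.0, 3.0, 4.0, 5.0])
```

    With only a handful of decoys the empirical p-value is bounded below by 1/(n + 1), so a large decoy collection is needed before small significance levels can be claimed.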

  4. High-Resolution Mapping of Chromatin Conformation in Cardiac Myocytes Reveals Structural Remodeling of the Epigenome in Heart Failure

    PubMed Central

    Rosa-Garrido, Manuel; Chapski, Douglas J.; Schmitt, Anthony D.; Kimball, Todd H.; Karbassi, Elaheh; Monte, Emma; Balderas, Enrique; Pellegrini, Matteo; Shih, Tsai-Ting; Soehalim, Elizabeth; Liem, David; Ping, Peipei; Galjart, Niels J.; Ren, Shuxun; Wang, Yibin; Ren, Bing

    2017-01-01

    Background: Cardiovascular disease is associated with epigenomic changes in the heart; however, the endogenous structure of cardiac myocyte chromatin has never been determined. Methods: To investigate the mechanisms of epigenomic function in the heart, genome-wide chromatin conformation capture (Hi-C) and DNA sequencing were performed in adult cardiac myocytes following development of pressure overload–induced hypertrophy. Mice with cardiac-specific deletion of CTCF (a ubiquitous chromatin structural protein) were generated to explore the role of this protein in chromatin structure and cardiac phenotype. Transcriptome analyses by RNA-seq were conducted as a functional readout of the epigenomic structural changes. Results: Depletion of CTCF was sufficient to induce heart failure in mice, and human patients with heart failure receiving mechanical unloading via left ventricular assist devices show increased CTCF abundance. Chromatin structural analyses revealed interactions within the cardiac myocyte genome at 5-kb resolution, enabling examination of intra- and interchromosomal events, and providing a resource for future cardiac epigenomic investigations. Pressure overload or CTCF depletion selectively altered boundary strength between topologically associating domains and A/B compartmentalization, measurements of genome accessibility. Heart failure involved decreased stability of chromatin interactions around disease-causing genes. In addition, pressure overload or CTCF depletion remodeled long-range interactions of cardiac enhancers, resulting in a significant decrease in local chromatin interactions around these functional elements. Conclusions: These findings provide a high-resolution chromatin architecture resource for cardiac epigenomic investigations and demonstrate that global structural remodeling of chromatin underpins heart failure. The newly identified principles of endogenous chromatin structure have key implications for epigenetic therapy. 
PMID:28802249

  5. GPU-Q-J, a fast method for calculating root mean square deviation (RMSD) after optimal superposition

    PubMed Central

    2011-01-01

    Background Calculation of the root mean square deviation (RMSD) between the atomic coordinates of two optimally superposed structures is a basic component of structural comparison techniques. We describe a quaternion-based method, GPU-Q-J, that is stable with single precision calculations and suitable for graphics processor units (GPUs). The application was implemented on an ATI 4770 graphics card in C/C++ and Brook+ in Linux where it was 260 to 760 times faster than existing unoptimized CPU methods. Source code is available from the Compbio website http://software.compbio.washington.edu/misc/downloads/st_gpu_fit/ or from the author LHH. Findings The Nutritious Rice for the World Project (NRW) on World Community Grid predicted, de novo, the structures of over 62,000 small proteins and protein domains, returning a total of 10 billion candidate structures. Clustering ensembles of structures on this scale requires calculation of large similarity matrices consisting of RMSDs between each pair of structures in the set. As a real-world test, we calculated the matrices for 6 different ensembles from NRW. The GPU method was 260 times faster than the fastest existing CPU-based method and over 500 times faster than the method that had been previously used. Conclusions GPU-Q-J is a significant advance over previous CPU methods. It relieves a major bottleneck in the clustering of large numbers of structures for NRW. It also has applications in structure comparison methods that involve multiple superposition and RMSD determination steps, particularly when such methods are applied on a proteome and genome wide scale. PMID:21453553
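
    The quantity described in this record, RMSD after optimal superposition, can be illustrated with a small CPU sketch. This is not the GPU-Q-J code: it is a minimal NumPy version of the general quaternion (eigenvalue) formulation, and the function name quaternion_rmsd is hypothetical. It assumes two coordinate sets with the same atoms in the same order.

```python
import numpy as np


def quaternion_rmsd(a, b):
    """Minimum RMSD between two (N, 3) coordinate sets over all rigid
    rotations and translations (quaternion / eigenvalue formulation)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape or a.ndim != 2 or a.shape[1] != 3:
        raise ValueError("expected two arrays of identical shape (N, 3)")

    # Remove translation by moving both centroids to the origin.
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)

    # 3x3 correlation matrix between the two centred coordinate sets.
    s = a.T @ b
    sxx, sxy, sxz = s[0]
    syx, syy, syz = s[1]
    szx, szy, szz = s[2]

    # Symmetric 4x4 matrix whose largest eigenvalue is the maximum
    # achievable overlap sum(a_i . R b_i) over proper rotations R.
    k = np.array([
        [sxx + syy + szz, syz - szy,        szx - sxz,        sxy - syx],
        [syz - szy,       sxx - syy - szz,  sxy + syx,        szx + sxz],
        [szx - sxz,       sxy + syx,       -sxx + syy - szz,  syz + szy],
        [sxy - syx,       szx + sxz,        syz + szy,       -sxx - syy + szz],
    ])
    largest = np.linalg.eigvalsh(k)[-1]  # eigenvalues come back ascending

    msd = (np.sum(a * a) + np.sum(b * b) - 2.0 * largest) / len(a)
    return float(np.sqrt(max(msd, 0.0)))  # clamp tiny negative round-off
```

    Only the largest eigenvalue is needed for the RMSD itself, so the rotation never has to be constructed; filling a pairwise similarity matrix for clustering would call such a function once per pair of structures.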

  6. Discovering Conformational Sub-States Relevant to Protein Function

    PubMed Central

    Ramanathan, Arvind; Savol, Andrej J.; Langmead, Christopher J.; Agarwal, Pratul K.; Chennubhotla, Chakra S.

    2011-01-01

    Background Internal motions enable proteins to explore a range of conformations, even in the vicinity of native state. The role of conformational fluctuations in the designated function of a protein is widely debated. Emerging evidence suggests that sub-groups within the range of conformations (or sub-states) contain properties that may be functionally relevant. However, low populations in these sub-states and the transient nature of conformational transitions between these sub-states present significant challenges for their identification and characterization. Methods and Findings To overcome these challenges we have developed a new computational technique, quasi-anharmonic analysis (QAA). QAA utilizes higher-order statistics of protein motions to identify sub-states in the conformational landscape. Further, the focus on anharmonicity allows identification of conformational fluctuations that enable transitions between sub-states. QAA applied to equilibrium simulations of human ubiquitin and T4 lysozyme reveals functionally relevant sub-states and protein motions involved in molecular recognition. In combination with a reaction pathway sampling method, QAA characterizes conformational sub-states associated with cis/trans peptidyl-prolyl isomerization catalyzed by the enzyme cyclophilin A. In these three proteins, QAA allows identification of conformational sub-states, with critical structural and dynamical features relevant to protein function. Conclusions Overall, QAA provides a novel framework to intuitively understand the biophysical basis of conformational diversity and its relevance to protein function. PMID:21297978

  7. Effects of Single Nucleotide Polymorphisms on Human N-Acetyltransferase 2 Structure and Dynamics by Molecular Dynamics Simulation

    PubMed Central

    Rajasekaran, M.; Abirami, Santhanam; Chen, Chinpan

    2011-01-01

    Background Arylamine N-acetyltransferase 2 (NAT2) is an important catalytic enzyme that metabolizes carcinogenic arylamines, hydrazine drugs and chemicals. This enzyme is highly polymorphic in different human populations. Several polymorphisms of NAT2, including the single amino acid substitutions R64Q, I114T, D122N, L137F, Q145P, R197Q, and G286E, are classified as slow acetylators, whereas the wild-type NAT2 is classified as a fast acetylator. The slow acetylators are often associated with drug toxicity and efficacy as well as cancer susceptibility. The biological functions of these 7 mutations have previously been characterized, but the structural basis behind the reduced catalytic activity and reduced protein level is not clear. Methodology/Principal Findings We performed multiple molecular dynamics simulations of these mutants as well as NAT2 to investigate the structural and dynamical effects throughout the protein structure, specifically the catalytic triad, cofactor binding site, and the substrate binding pocket. None of these mutations induced unfolding; instead, their effects were confined to the inter-domain, domain 3 and 17-residue insert regions, where the flexibility was significantly reduced relative to the wild-type. Structural effects of these mutations propagate through space and cause a change in catalytic triad conformation, cofactor binding site, substrate binding pocket size/shape and electrostatic potential. Conclusions/Significance Our results showed that the dynamical properties of all the mutant structures, especially in the inter-domain, domain 3 and 17-residue insert regions, were affected in the same manner. Similarly, the electrostatic potentials of all the mutants were altered, and functionally important regions such as the catalytic triad, cofactor binding site, and substrate binding pocket adopted different orientations and/or conformations relative to the wild-type, which may affect the functions of the mutants. 
Overall, our study may provide the structural basis for reduced catalytic activity and protein level, as was experimentally observed for these polymorphisms. PMID:21980537

  8. Hybrid structures based on gold nanoparticles and semiconductor quantum dots for biosensor applications

    PubMed Central

    Kurochkina, Margarita; Konshina, Elena; Oseev, Aleksandr; Hirsch, Soeren

    2018-01-01

    Background The luminescence amplification of semiconductor quantum dots (QD) in the presence of self-assembled gold nanoparticles (Au NPs) is one way of creating biosensors with highly efficient transduction. Aims The objective of this study was to fabricate hybrid structures based on semiconductor CdSe/ZnS QDs and Au NP arrays and to use them as biosensors for protein. Methods In this paper, the hybrid structures based on CdSe/ZnS QDs and Au NP arrays were fabricated using spin coating processes. Au NP arrays deposited on a glass wafer were investigated by optical microscopy and absorption spectroscopy as a function of the number of spin coating layers and their baking temperature. Bovine serum albumin (BSA) was used as the target protein analyte in a phosphate buffer. A confocal laser scanning microscope was used to study the luminescent properties of Au NP/QD hybrid structures and to test BSA. Results The dimensions of Au NP aggregates increased and the space between them decreased with increasing processing temperature. At the same time, a blue shift of the plasmon resonance peak in the absorption spectra of Au NP arrays was observed. The deposition of CdSe/ZnS QDs with a core diameter of 5 nm on the surface of the Au NP arrays caused an increase in absorption and a red shift of the plasmon peak in the spectra. The exciton–plasmon enhancement of the QDs’ photoluminescence intensity has been obtained at room temperature for hybrid structures with Au NP arrays pretreated at temperatures of 100°C and 150°C. It has been found that an increase in the weight content of BSA increases the photoluminescence intensity of such hybrid structures. Conclusion The feasibility of qualitative and quantitative determination of protein content in solution using the Au NP/QD structures as an optical biosensor has been shown experimentally. PMID:29731613

  9. Systematic analysis of human kinase genes: a large number of genes and alternative splicing events result in functional and structural diversity

    PubMed Central

    Milanesi, Luciano; Petrillo, Mauro; Sepe, Leandra; Boccia, Angelo; D'Agostino, Nunzio; Passamano, Myriam; Di Nardo, Salvatore; Tasco, Gianluca; Casadio, Rita; Paolella, Giovanni

    2005-01-01

    Background Protein kinases are a well defined family of proteins, characterized by the presence of a common kinase catalytic domain and playing a significant role in many important cellular processes, such as proliferation, maintenance of cell shape, and apoptosis. In many members of the family, additional non-kinase domains contribute further specialization, resulting in subcellular localization, protein binding and regulation of activity, among others. About 500 genes encode members of the kinase family in the human genome, and although many of them represent well known genes, a larger number of genes code for proteins of more recent identification, or for unknown proteins identified as kinases only after computational studies. Results A systematic in silico study performed on the human genome led to the identification of 5 genes, on chromosomes 1, 11, 13, 15 and 16, respectively, and 1 pseudogene on chromosome X; some of these genes are reported as kinases by NCBI but are absent in other databases, such as KinBase. Comparative analysis of 483 gene regions and subsequent computational analysis, aimed at identifying unannotated exons, indicates that a large number of kinases may code for alternately spliced forms or be incorrectly annotated. An InterProScan automated analysis was performed to study domain distribution and combination in the various families. At the same time, other structural features were also added to the annotation process, including the putative presence of transmembrane alpha helices, and the cysteine propensity to participate in a disulfide bridge. Conclusion The predicted human kinome was extended by identifying both additional genes and potential splice variants, resulting in a varied panorama where functionality may be searched at the gene and protein level. 
Structural analysis of kinase protein domains as defined in multiple sources, together with transmembrane alpha helix and signal peptide prediction, provides hints to function assignment. The results of the human kinome analysis are collected in the KinWeb database, available for browsing and searching over the internet, where all results from the comparative analysis and the gene structure annotation are made available, alongside the domain information. Kinases may be searched by domain combinations and the relative genes may be viewed in a graphic browser at various levels of magnification up to gene organization on the full chromosome set. PMID:16351747

  10. The evolutionary history of protein fold families and proteomes confirms that the archaeal ancestor is more ancient than the ancestors of other superkingdoms

    PubMed Central

    2012-01-01

    Background The entire evolutionary history of life can be studied using myriad sequences generated by genomic research. This includes the appearance of the first cells and of superkingdoms Archaea, Bacteria, and Eukarya. However, the use of molecular sequence information for deep phylogenetic analyses is limited by mutational saturation, differential evolutionary rates, lack of sequence site independence, and other biological and technical constraints. In contrast, protein structures are evolutionary modules that are highly conserved and diverse enough to enable deep historical exploration. Results Here we build phylogenies that describe the evolution of proteins and proteomes. These phylogenetic trees are derived from a genomic census of protein domains defined at the fold family (FF) level of structural classification. Phylogenomic trees of FF structures were reconstructed from genomic abundance levels of 2,397 FFs in 420 proteomes of free-living organisms. These trees defined timelines of domain appearance, with time spanning from the origin of proteins to the present. Timelines are divided into five different evolutionary phases according to patterns of sharing of FFs among superkingdoms: (1) a primordial protein world, (2) reductive evolution and the rise of Archaea, (3) the rise of Bacteria from the common ancestor of Bacteria and Eukarya and early development of the three superkingdoms, (4) the rise of Eukarya and widespread organismal diversification, and (5) eukaryal diversification. The relative ancestry of the FFs shows that reductive evolution by domain loss is dominant in the first three phases and is responsible for both the diversification of life from a universal cellular ancestor and the appearance of superkingdoms. On the other hand, domain gains are predominant in the last two phases and are responsible for organismal diversification, especially in Bacteria and Eukarya. 
Conclusions The evolution of functions that are associated with corresponding FFs along the timeline reveals that primordial metabolic domains evolved earlier than informational domains involved in translation and transcription, supporting the metabolism-first hypothesis rather than the RNA world scenario. In addition, phylogenomic trees of proteomes reconstructed from FFs appearing in each of the five phases of the protein world show that trees reconstructed from ancient domain structures were consistently rooted in archaeal lineages, supporting the proposal that the archaeal ancestor is more ancient than the ancestors of other superkingdoms. PMID:22284070

  11. Evaluation of the Pichia pastoris expression system for the production of GPCRs for structural analysis

    PubMed Central

    2011-01-01

    Background Various protein expression systems, such as Escherichia coli (E. coli), Saccharomyces cerevisiae (S. cerevisiae), Pichia pastoris (P. pastoris), insect cells and mammalian cell lines, have been developed for the synthesis of G protein-coupled receptors (GPCRs) for structural studies. Recently, the crystal structures of four recombinant human GPCRs, namely β2 adrenergic receptor, adenosine A2a receptor, CXCR4 and dopamine D3 receptor, were successfully determined using an insect cell expression system. GPCRs expressed in insect cells are believed to undergo mammalian-like posttranscriptional modifications and have functional properties similar to those in mammals. Crystal structures of GPCRs have not yet been solved using yeast expression systems. In the present study, P. pastoris and insect cell expression systems for the human muscarinic acetylcholine receptor M2 subtype (CHRM2) were developed and the quantity and quality of CHRM2 synthesized by both expression systems were compared for application in structural studies. Results The ideal conditions for the expression of CHRM2 in P. pastoris were 60 hr at 20°C in a buffer of pH 7.0. The specific activity of the expressed CHRM2 was 28.9 pmol/mg of membrane protein as determined by binding assays using [3H]-quinuclidinyl benzilate (QNB). Although the specific activity of the protein produced by P. pastoris was lower than that of Sf9 insect cells, CHRM2 yield in P. pastoris was 2-fold higher than in Sf9 insect cells because P. pastoris was cultured at high cell density. The dissociation constant (Kd) for QNB in P. pastoris was 101.14 ± 15.07 pM, which was similar to that in Sf9 insect cells (86.23 ± 8.57 pM). There were no differences in the binding affinity of CHRM2 for QNB between P. pastoris and Sf9 insect cells. Conclusion Compared to insect cells, P. pastoris is easier to handle, can be grown at lower cost, and allows quicker expression at a large scale. 
Yeast, P. pastoris, and insect cells are all effective expression systems for GPCRs. The results of the present study strongly suggested that protein expression in P. pastoris can be applied to the structural and biochemical studies of GPCRs. PMID:21513509
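
    To put the two reported dissociation constants in context, the standard one-site binding relation, occupancy = [L] / (Kd + [L]), can be evaluated directly. The sketch below is a generic illustration and not the authors' analysis; it assumes simple one-site equilibrium binding, and the function name fractional_occupancy is hypothetical.

```python
def fractional_occupancy(ligand_pm, kd_pm):
    """Fraction of receptor bound at equilibrium for a one-site model.

    Both arguments are concentrations in picomolar (pM).
    """
    if ligand_pm < 0 or kd_pm <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_pm / (kd_pm + ligand_pm)


if __name__ == "__main__":
    # Kd values quoted in the abstract above (pM).
    for system, kd in (("P. pastoris", 101.14), ("Sf9", 86.23)):
        theta = fractional_occupancy(100.0, kd)
        print(f"{system}: occupancy at 100 pM QNB = {theta:.3f}")
```

    At a ligand concentration equal to Kd the occupancy is one half by definition, which is why similar Kd values imply similar binding curves for the two expression systems.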

  12. Rapid sampling of local minima in protein energy surface and effective reduction through a multi-objective filter

    PubMed Central

    2013-01-01

    Background Many problems in protein modeling require obtaining a discrete representation of the protein conformational space as an ensemble of conformations. In ab-initio structure prediction, in particular, where the goal is to predict the native structure of a protein chain given its amino-acid sequence, the ensemble needs to satisfy energetic constraints. Given the thermodynamic hypothesis, an effective ensemble contains low-energy conformations which are similar to the native structure. The high-dimensionality of the conformational space and the ruggedness of the underlying energy surface currently make it very difficult to obtain such an ensemble. Recent studies have proposed that Basin Hopping is a promising probabilistic search framework to obtain a discrete representation of the protein energy surface in terms of local minima. Basin Hopping performs a series of structural perturbations followed by energy minimizations with the goal of hopping between nearby energy minima. This approach has been shown to be effective in obtaining conformations near the native structure for small systems. Recent work by us has extended this framework to larger systems through employment of the molecular fragment replacement technique, resulting in rapid sampling of large ensembles. Methods This paper investigates the algorithmic components in Basin Hopping to both understand and control their effect on the sampling of near-native minima. Realizing that such an ensemble is reduced before further refinement in full ab-initio protocols, we take an additional step and analyze the quality of the ensemble retained by ensemble reduction techniques. We propose a novel multi-objective technique based on the Pareto front to filter the ensemble of sampled local minima. 
Results and conclusions We show that controlling the magnitude of the perturbation allows directly controlling the distance between consecutively-sampled local minima and, in turn, steering the exploration towards conformations near the native structure. For the minimization step, we show that the addition of Metropolis Monte Carlo-based minimization is no more effective than a simple greedy search. Finally, we show that the size of the ensemble of sampled local minima can be effectively and efficiently reduced by a multi-objective filter to obtain a simpler representation of the probed energy surface. PMID:24564970
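
    The idea of reducing an ensemble with a Pareto-based multi-objective filter can be sketched in a few lines. This is a generic non-dominated filter, not the authors' implementation: each conformation is assumed to be summarised by a tuple of objective values that are all to be minimised (for example two energy terms), and the names dominates and pareto_front are hypothetical.

```python
def dominates(p, q):
    """True if p is no worse than q in every objective and strictly
    better in at least one (all objectives are minimised)."""
    return (all(x <= y for x, y in zip(p, q))
            and any(x < y for x, y in zip(p, q)))


def pareto_front(points):
    """Return the non-dominated points, preserving input order.

    Quadratic in the number of points, which is acceptable for a sketch
    but would need a faster scheme for very large ensembles.
    """
    return [p for p in points
            if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Hypothetical (objective_1, objective_2) scores for four local minima.
    ensemble = [(1.0, 5.0), (2.0, 2.0), (3.0, 3.0), (5.0, 1.0)]
    print(pareto_front(ensemble))
```

    In this toy ensemble the point (3.0, 3.0) is dropped because (2.0, 2.0) is better in both objectives, while the other three represent different trade-offs and are all retained.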

  13. Graphene as a protein crystal mounting material to reduce background scatter.

    PubMed

    Wierman, Jennifer L; Alden, Jonathan S; Kim, Chae Un; McEuen, Paul L; Gruner, Sol M

    2013-10-01

    The overall signal-to-noise ratio per unit dose for X-ray diffraction data from protein crystals can be improved by reducing the mass and density of all material surrounding the crystals. This article demonstrates a path towards the practical ultimate in background reduction by use of atomically thin graphene sheets as a crystal mounting platform for protein crystals. The results show the potential for graphene in protein crystallography and other cases where X-ray scatter from the mounting material must be reduced and specimen dehydration prevented, such as in coherent X-ray diffraction imaging of microscopic objects.

  14. Graphene as a protein crystal mounting material to reduce background scatter

    PubMed Central

    Wierman, Jennifer L.; Alden, Jonathan S.; Kim, Chae Un; McEuen, Paul L.; Gruner, Sol M.

    2013-01-01

    The overall signal-to-noise ratio per unit dose for X-ray diffraction data from protein crystals can be improved by reducing the mass and density of all material surrounding the crystals. This article demonstrates a path towards the practical ultimate in background reduction by use of atomically thin graphene sheets as a crystal mounting platform for protein crystals. The results show the potential for graphene in protein crystallography and other cases where X-ray scatter from the mounting material must be reduced and specimen dehydration prevented, such as in coherent X-ray diffraction imaging of microscopic objects. PMID:24068843

  15. Twisted cyanines: a non-planar fluorogenic dye with superior photostability and its use in a protein-based fluoromodule.

    PubMed

    Shank, Nathaniel I; Pham, Ha H; Waggoner, Alan S; Armitage, Bruce A

    2013-01-09

    The cyanine dye thiazole orange (TO) is a well-known fluorogenic stain for DNA and RNA, but this property precludes its use as an intracellular fluorescent probe for non-nucleic acid biomolecules. Further, as is the case with many cyanines, the dye suffers from low photostability. Here, we report the synthesis of a bridge-substituted version of TO named α-CN-TO, where the central methine hydrogen of TO is replaced by an electron withdrawing cyano group, which was expected to decrease the susceptibility of the dye toward singlet oxygen-mediated degradation. An X-ray crystal structure shows that α-CN-TO is twisted drastically out of plane, in contrast to TO, which crystallizes in the planar conformation. α-CN-TO retains the fluorogenic behavior of the parent dye TO in viscous glycerol/water solvent, but direct irradiation and indirect bleaching studies showed that α-CN-TO is essentially inert to visible light and singlet oxygen. In addition, the twisted conformation of α-CN-TO mitigates nonspecific binding and fluorescence activation by DNA and a previously selected TO-binding protein and exhibits low background fluorescence in HeLa cell culture. α-CN-TO was then used to select a new protein that binds and activates fluorescence from the dye. The new α-CN-TO/protein fluoromodule exhibits superior photostability to an analogous TO/protein fluoromodule. These properties indicate that α-CN-TO will be a useful fluorogenic dye in combination with specific RNA and protein binding partners for both in vitro and cell-based applications. More broadly, structural features that promote nonplanar conformations can provide an effective method for reducing nonspecific binding of cationic dyes to nucleic acids and other biomolecules.

  16. Twisted Cyanines: A Non-Planar Fluorogenic Dye with Superior Photostability and its Use in a Protein-Based Fluoromodule

    PubMed Central

    Shank, Nathaniel I.; Pham, Ha; Waggoner, Alan S.; Armitage, Bruce A.

    2013-01-01

    The cyanine dye thiazole orange (TO) is a well-known fluorogenic stain for DNA and RNA, but this property precludes its use as an intracellular fluorescent probe for non-nucleic acid biomolecules. Further, as is the case with many cyanines, the dye suffers from low photostability. Here we report the synthesis of a bridge-substituted version of TO named α-CN-TO, where the central methine hydrogen of TO is replaced by an electron withdrawing cyano group, which was expected to decrease the susceptibility of the dye toward singlet oxygen-mediated degradation. An X-ray crystal structure shows that α-CN-TO is twisted drastically out of plane, in contrast to TO, which crystallizes in the planar conformation. α-CN-TO retains the fluorogenic behavior of the parent dye TO in viscous glycerol/water solvent, but direct irradiation and indirect bleaching studies showed that α-CN-TO is essentially inert to visible light and singlet oxygen. In addition, the twisted conformation of α-CN-TO mitigates non-specific binding and fluorescence activation by DNA and a previously selected TO-binding protein and exhibits low background fluorescence in HeLa cell culture. α-CN-TO was then used to select a new protein that binds and activates fluorescence from the dye. The new α-CN-TO/protein fluoromodule exhibits superior photostability to an analogous TO/protein fluoromodule. These properties indicate that α-CN-TO will be a useful fluorogenic dye in combination with specific RNA and protein binding partners for both in vitro and cell-based applications. More broadly, structural features that promote nonplanar conformations can provide an effective method for reducing nonspecific binding of cationic dyes to nucleic acids and other biomolecules. PMID:23252842

  17. Genome-wide analysis of WRKY gene family in Cucumis sativus

    PubMed Central

    2011-01-01

    Background WRKY proteins are a large family of transcriptional regulators in higher plants. They are involved in many biological processes, such as plant development, metabolism, and responses to biotic and abiotic stresses. Prior to the present study, only one full-length cucumber WRKY protein had been reported. The recent publication of the draft genome sequence of cucumber allowed us to conduct a genome-wide search for cucumber WRKY proteins, and to compare these positively identified proteins with their homologs in model plants, such as Arabidopsis. Results We identified a total of 55 WRKY genes in the cucumber genome. According to structural features of their encoded proteins, the cucumber WRKY (CsWRKY) genes were classified into three groups (group 1-3). Analysis of expression profiles of CsWRKY genes indicated that 48 WRKY genes display differential expression either in their transcript abundance or in their expression patterns under normal growth conditions, and 23 WRKY genes were differentially expressed in response to at least one abiotic stress (cold, drought or salinity). The expression profiles of stress-inducible CsWRKY genes were correlated with those of their putative Arabidopsis WRKY (AtWRKY) orthologs, except for the group 3 WRKY genes. Interestingly, duplicated group 3 AtWRKY genes appear to have been under positive selection pressure during evolution. In contrast, there was no evidence of recent gene duplication or positive selection pressure among CsWRKY group 3 genes, which may have led to the expressional divergence of group 3 orthologs. Conclusions Fifty-five WRKY genes were identified in cucumber and the structure of their encoded proteins, their expression, and their evolution were examined. Considering that there has been extensive expansion of group 3 WRKY genes in angiosperms, the occurrence of different evolutionary events could explain the functional divergence of these genes. PMID:21955985

  18. Formin homology 2 domains occur in multiple contexts in angiosperms

    PubMed Central

    Cvrčková, Fatima; Novotný, Marian; Pícková, Denisa; Žárský, Viktor

    2004-01-01

    Background Involvement of conservative molecular modules and cellular mechanisms in the widely diversified processes of eukaryotic cell morphogenesis leads to the intriguing question: how do similar proteins contribute to dissimilar morphogenetic outputs. Formins (FH2 proteins) play a central part in the control of actin organization and dynamics, providing a good example of evolutionarily versatile use of a conserved protein domain in the context of a variety of lineage-specific structural and signalling interactions. Results In order to identify possible plant-specific sequence features within the FH2 protein family, we performed a detailed analysis of angiosperm formin-related sequences available in public databases, with particular focus on the complete Arabidopsis genome and the nearly finished rice genome sequence. This has led to revision of the current annotation of half of the 22 Arabidopsis formin-related genes. Comparative analysis of the two plant genomes revealed a good conservation of the previously described two subfamilies of plant formins (Class I and Class II), as well as several subfamilies within them that appear to predate the separation of monocot and dicot plants. Moreover, a number of plant Class II formins share an additional conserved domain, related to the protein phosphatase/tensin/auxilin fold. However, considerable inter-species variability sets limits to generalization of any functional conclusions reached on a single species such as Arabidopsis. Conclusions The plant-specific domain context of the conserved FH2 domain, as well as plant-specific features of the domain itself, may reflect distinct functional requirements in plant cells. The variability of formin structures found in plants far exceeds that known from both fungi and metazoans, suggesting a possible contribution of FH2 proteins in the evolution of the plant type of multicellularity. PMID:15256004

  19. Experimental validation of FINDSITEcomb virtual ligand screening results for eight proteins yields novel nanomolar and micromolar binders

    PubMed Central

    2014-01-01

    Background Identification of ligand-protein binding interactions is a critical step in drug discovery. Experimental screening of large chemical libraries, in spite of its specific role and importance in drug discovery, suffers from the disadvantages of being random, time-consuming and expensive. To accelerate the process, traditional structure- or ligand-based virtual ligand screening (VLS) approaches are combined with experimental high-throughput screening (HTS). Often a single protein or, at most, a protein family is considered. Large scale VLS benchmarking across diverse protein families is rarely done, and the reported success rate is very low. Here, we demonstrate the experimental HTS validation of a novel VLS approach, FINDSITEcomb, across a diverse set of medically-relevant proteins. Results For eight different proteins belonging to different fold-classes and from diverse organisms, the top 1% of FINDSITEcomb’s VLS predictions were tested, and depending on the protein target, 4%-47% of the predicted ligands were shown to bind with μM or better affinities. In total, 47 small molecule binders were identified. Low nanomolar (nM) binders for dihydrofolate reductase and protein tyrosine phosphatases (PTPs) and micromolar binders for the other proteins were identified. Six novel molecules had cytotoxic activity (<10 μg/ml) against the HCT-116 colon carcinoma cell line and one novel molecule had potent antibacterial activity. Conclusions We show that FINDSITEcomb is a promising new VLS approach that can assist drug discovery. PMID:24936211

  20. TINAGL1 and B3GALNT1 are potential therapy target genes to suppress metastasis in non-small cell lung cancer

    PubMed Central

    2014-01-01

    Background Non-small cell lung cancer (NSCLC) remains lethal despite the development of numerous drug therapy technologies. About 85% to 90% of lung cancers are NSCLC and the 5-year survival rate is at best still below 50%. Thus, it is important to find druggable target genes in order to develop an effective therapy for NSCLC. Results Integrated analysis of publicly available gene expression and promoter methylation patterns of two highly aggressive NSCLC cell lines generated by in vivo selection was performed. We selected eleven critical genes that may mediate metastasis using recently proposed principal component analysis based unsupervised feature extraction. The eleven selected genes were significantly related to cancer diagnosis. The tertiary protein structure of the selected genes was inferred by Full Automatic Modeling System, a profile-based protein structure inference software, to determine protein functions and to specify genes that could be potential drug targets. Conclusions We identified eleven potentially critical genes that may mediate NSCLC metastasis using bioinformatic analysis of publicly available data sets. These genes are potential target genes for the therapy of NSCLC. Among the eleven genes, TINAGL1 and B3GALNT1 are possible candidates for drug compounds that inhibit their gene expression. PMID:25521548

  1. Hit detection in serial femtosecond crystallography using X-ray spectroscopy of plasma emission.

    PubMed

    Jönsson, H Olof; Caleman, Carl; Andreasson, Jakob; Tîmneanu, Nicuşor

    2017-11-01

    Serial femtosecond crystallography is an emerging and promising method for determining protein structures, making use of the ultrafast and bright X-ray pulses from X-ray free-electron lasers. The upcoming X-ray laser sources will produce well above 1000 pulses per second and will pose a new challenge: how to quickly determine successful crystal hits and avoid a high-rate data deluge. Proposed here is a hit-finding scheme based on detecting photons from plasma emission after the sample has been intercepted by the X-ray laser. Plasma emission spectra are simulated for systems exposed to high-intensity femtosecond pulses, for both protein crystals and the liquid carrier systems that are used for sample delivery. The thermal radiation from the glowing plasma gives a strong background in the XUV region that depends on the intensity of the pulse, around the emission lines from light elements (carbon, nitrogen, oxygen). Sample hits can be reliably distinguished from the carrier liquid based on the characteristic emission lines from heavier elements present only in the sample, such as sulfur. For buffer systems with sulfur present, selenomethionine substitution is suggested, where the selenium emission lines could be used both as an indication of a hit and as an aid in phasing and structural reconstruction of the protein.

  2. Crossword: A Fully Automated Algorithm for the Segmentation and Quality Control of Protein Microarray Images

    PubMed Central

    2015-01-01

    Biological assays formatted as microarrays have become a critical tool for the generation of the comprehensive data sets required for systems-level understanding of biological processes. Manual annotation of data extracted from images of microarrays, however, remains a significant bottleneck, particularly for protein microarrays due to the sensitivity of this technology to weak artifact signal. In order to automate the extraction and curation of data from protein microarrays, we describe an algorithm called Crossword that logically combines information from multiple approaches to fully automate microarray segmentation. Automated artifact removal is also accomplished by segregating structured pixels from the background noise using iterative clustering and pixel connectivity. Correlation of the location of structured pixels across image channels is used to identify and remove artifact pixels from the image prior to data extraction. This component improves the accuracy of data sets while reducing the requirement for time-consuming visual inspection of the data. Crossword enables a fully automated protocol that is robust to significant spatial and intensity aberrations. Overall, the average amount of user intervention is reduced by an order of magnitude and the data quality is increased through artifact removal and reduced user variability. The increase in throughput should aid the further implementation of microarray technologies in clinical studies. PMID:24417579
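    The pixel-connectivity idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not the Crossword algorithm itself: the threshold, the 4-neighbour connectivity, the minimum component size, and the names `connected_components` and `structured_pixels` are all assumptions chosen for illustration.

```python
from collections import deque

def connected_components(mask):
    """Group True pixels of a 2D boolean mask into 4-connected components."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                # Breadth-first flood fill starting from an unvisited pixel.
                queue = deque([(r, c)])
                seen[r][c] = True
                component = []
                while queue:
                    y, x = queue.popleft()
                    component.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                components.append(component)
    return components

def structured_pixels(image, threshold, min_size):
    """Keep bright connected regions; isolated bright pixels are treated as noise."""
    mask = [[value > threshold for value in row] for row in image]
    return [comp for comp in connected_components(mask) if len(comp) >= min_size]

# Toy image (hypothetical intensities): one 2x2 block and one isolated bright pixel.
image = [
    [0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 9, 9, 0, 7],
    [0, 0, 0, 0, 0],
]
kept = structured_pixels(image, threshold=5, min_size=3)
```

    In this toy case only the 2x2 block survives, because the isolated pixel forms a component smaller than `min_size`.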

  3. When the Lowest Energy Does Not Induce Native Structures: Parallel Minimization of Multi-Energy Values by Hybridizing Searching Intelligences

    PubMed Central

    Lü, Qiang; Xia, Xiao-Yan; Chen, Rong; Miao, Da-Jun; Chen, Sha-Sha; Quan, Li-Jun; Li, Hai-Ou

    2012-01-01

    Background Protein structure prediction (PSP), which is usually modeled as a computational optimization problem, remains one of the biggest challenges in computational biology. PSP encounters two difficult obstacles: the inaccurate energy function problem and the searching problem. Even if the searching procedure happens to find the lowest energy, it is not guaranteed to yield the correct protein structure. Results A general parallel metaheuristic approach is presented to tackle the above two problems. Multiple energy functions are employed to simultaneously guide the parallel searching threads. Searching trajectories are in fact controlled by the parameters of heuristic algorithms. The parallel approach allows the parameters to be perturbed while the searching threads are running in parallel, with each thread searching for the lowest energy value determined by an individual energy function. By hybridizing the intelligences of parallel ant colonies and Monte Carlo Metropolis search, this paper demonstrates an implementation of our parallel approach for PSP. Sixteen classical instances were tested to show that the parallel approach is competitive for solving the PSP problem. Conclusions This parallel approach combines various sources of both searching intelligences and energy functions, and thus predicts protein conformations with good quality jointly determined by all the parallel searching threads and energy functions. It provides a framework to combine different searching intelligences embedded in heuristic algorithms. It also constructs a container to hybridize different not-so-accurate objective functions, which are usually derived from domain expertise. PMID:23028708
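    As an illustration of the Monte Carlo Metropolis search named in the abstract above, the following is a minimal sketch of the Metropolis acceptance rule applied to toy one-dimensional energy functions. It is not the authors' parallel implementation: the function names, the toy energies, the step size and the temperature are all hypothetical, and the two "threads" are simply run one after the other.

```python
import math
import random

def metropolis_minimize(energy, x0, steps=5000, step_size=0.5,
                        temperature=1.0, seed=0):
    """Metropolis search: always accept downhill moves, accept uphill
    moves with probability exp(-delta / temperature)."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for _ in range(steps):
        candidate = x + rng.uniform(-step_size, step_size)
        e_candidate = energy(candidate)
        delta = e_candidate - e
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            x, e = candidate, e_candidate
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e

# Two toy energy functions (assumptions) standing in for the
# "not-so-accurate" energy functions that guide separate searches.
def toy_energy_a(x):
    return (x - 2.0) ** 2

def toy_energy_b(x):
    return abs(x - 2.5)

# One search per energy function; a real implementation would run these
# in parallel and exchange information between them.
results = [metropolis_minimize(f, x0=-5.0) for f in (toy_energy_a, toy_energy_b)]
```

    Each entry of `results` is a `(best_x, best_e)` pair for one energy function; how such per-function minima are combined is the part the abstract attributes to the hybrid parallel scheme and is not modeled here.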

  4. IFACEwat: the interfacial water-implemented re-ranking algorithm to improve the discrimination of near native structures for protein rigid docking

    PubMed Central

    2014-01-01

    Background Protein-protein docking is an in silico method to predict the formation of protein complexes. Due to limited computational resources, the protein-protein docking approach has been developed under the assumption of rigid docking, in which one of the two protein partners remains rigid during the protein associations and water contribution is ignored or implicitly presented. Despite obtaining a number of acceptable complex predictions, it seems to-date that most initial rigid docking algorithms still find it difficult or even fail to discriminate successfully the correct predictions from the other incorrect or false positive ones. To improve the rigid docking results, re-ranking is one of the effective methods that help re-locate the correct predictions in top high ranks, discriminating them from the other incorrect ones. In this paper, we propose a new re-ranking technique using a new energy-based scoring function, namely IFACEwat - a combined Interface Atomic Contact Energy (IFACE) and water effect. The IFACEwat aims to further improve the discrimination of the near-native structures of the initial rigid docking algorithm ZDOCK3.0.2. Unlike other re-ranking techniques, the IFACEwat explicitly implements interfacial water into the protein interfaces to account for the water-mediated contacts during the protein interactions. Results Our results showed that the IFACEwat increased both the numbers of the near-native structures and improved their ranks as compared to the initial rigid docking ZDOCK3.0.2. In fact, the IFACEwat achieved a success rate of 83.8% for Antigen/Antibody complexes, which is 10% better than ZDOCK3.0.2. As compared to another re-ranking technique ZRANK, the IFACEwat obtains success rates of 92.3% (8% better) and 90% (5% better) respectively for medium and difficult cases. When comparing with the latest published re-ranking method F2Dock, the IFACEwat performed equivalently well or even better for several Antigen/Antibody complexes. 
    Conclusions With the inclusion of interfacial water, the IFACEwat improves most results of the initial rigid docking, especially for Antigen/Antibody complexes. The improvement is achieved by explicitly taking into account the contribution of water during the protein interactions, which was ignored or not fully represented by the initial rigid docking and other re-ranking techniques. In addition, the IFACEwat maintains sufficient computational efficiency of the initial docking algorithm, yet improves the ranks as well as the number of the near-native structures found. As our implementation has so far targeted improving the results of ZDOCK3.0.2, particularly for the Antigen/Antibody complexes, further implementations are expected in the near future to make it applicable to other initial rigid docking algorithms. PMID:25521441
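    The general idea of re-ranking docking poses with an additional energy term can be sketched as follows. This is only an illustration under stated assumptions: the `Pose` fields, the additive combination, the `water_weight` parameter and the toy numbers are hypothetical and do not reproduce the IFACEwat scoring function.

```python
from dataclasses import dataclass

@dataclass
class Pose:
    name: str
    initial_rank: int       # rank assigned by the initial rigid docking
    contact_energy: float   # hypothetical interface contact energy (lower is better)
    water_energy: float     # hypothetical water-mediated contribution (lower is better)
    near_native: bool       # whether the pose is close to the known complex

def rerank(poses, water_weight=1.0):
    """Sort poses by a combined score; the lowest combined energy comes first."""
    return sorted(poses, key=lambda p: p.contact_energy + water_weight * p.water_energy)

def best_near_native_rank(ordered):
    """Return the 1-based rank of the first near-native pose, or None."""
    for rank, pose in enumerate(ordered, start=1):
        if pose.near_native:
            return rank
    return None

# Toy poses (made-up values): the near-native pose starts at rank 2.
poses = [
    Pose("p1", 1, -10.0, 0.5, False),
    Pose("p2", 2, -9.0, -3.0, True),
    Pose("p3", 3, -8.0, 0.0, False),
]
initial_order = sorted(poses, key=lambda p: p.initial_rank)
reranked = rerank(poses)
```

    With these toy values the near-native pose moves from rank 2 to rank 1 after re-ranking, which is the kind of improvement a success-rate comparison would count.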

  5. Coiled-coil protein composition of 22 proteomes – differences and common themes in subcellular infrastructure and traffic control

    PubMed Central

    Rose, Annkatrin; Schraegle, Shannon J; Stahlberg, Eric A; Meier, Iris

    2005-01-01

    Background Long alpha-helical coiled-coil proteins are involved in diverse organizational and regulatory processes in eukaryotic cells. They provide cables and networks in the cyto- and nucleoskeleton, molecular scaffolds that organize membrane systems and tissues, motors, levers, rotating arms, and possibly springs. Mutations in long coiled-coil proteins have been implicated in a growing number of human diseases. Using the coiled-coil prediction program MultiCoil, we have previously identified all long coiled-coil proteins from the model plant Arabidopsis thaliana and have established a searchable Arabidopsis coiled-coil protein database. Results Here, we have identified all proteins with long coiled-coil domains from 21 additional fully sequenced genomes. Because regions predicted to form coiled-coils interfere with sequence homology determination, we have developed a sequence comparison and clustering strategy based on masking predicted coiled-coil domains. Comparing and grouping all long coiled-coil proteins from 22 genomes, the kingdom-specificity of coiled-coil protein families was determined. At the same time, a number of proteins with unknown function could be grouped with already characterized proteins from other organisms. Conclusion MultiCoil predicts proteins with extended coiled-coil domains (more than 250 amino acids) to be largely absent from bacterial genomes, but present in archaea and eukaryotes. The structural maintenance of chromosomes proteins and their relatives are the only long coiled-coil protein family clearly conserved throughout all kingdoms, indicating their ancient nature. Motor proteins, membrane tethering and vesicle transport proteins are the dominant eukaryote-specific long coiled-coil proteins, suggesting that coiled-coil proteins have gained functions in the increasingly complex processes of subcellular infrastructure maintenance and trafficking control of the eukaryotic cell. PMID:16288662
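    Masking predicted regions before sequence comparison, as described in the abstract above, can be sketched in a few lines. The function name, the 1-based inclusive coordinate convention and the use of "X" as the mask character are assumptions for illustration; the abstract does not specify how the authors implemented the masking.

```python
def mask_regions(sequence, regions, mask_char="X"):
    """Replace residues inside predicted regions with a mask character.

    `regions` holds (start, end) pairs, 1-based and inclusive
    (an assumed convention for this sketch).
    """
    residues = list(sequence)
    for start, end in regions:
        for i in range(start - 1, min(end, len(residues))):
            residues[i] = mask_char
    return "".join(residues)

# Toy sequence with one hypothetical predicted coiled-coil segment (residues 3-5).
masked = mask_regions("MKTAYIAKQR", [(3, 5)])
```

    The masked sequence keeps its length, so alignment coordinates of the remaining residues are unchanged when it is passed to a homology search.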

  6. Genetic variability and natural selection at the ligand domain of the Duffy binding protein in brazilian Plasmodium vivax populations

    PubMed Central

    2010-01-01

    Background Plasmodium vivax malaria is a major public health challenge in Latin America, Asia and Oceania, with 130-435 million clinical cases per year worldwide. Invasion of host blood cells by P. vivax mainly depends on a type I membrane protein called Duffy binding protein (PvDBP). The erythrocyte-binding motif of PvDBP is a 170 amino-acid stretch located in its cysteine-rich region II (PvDBPII), which is the most variable segment of the protein. Methods To test whether diversifying natural selection has shaped the nucleotide diversity of PvDBPII in Brazilian populations, this region was sequenced in 122 isolates from six different geographic areas. A Bayesian method was applied to test for the action of natural selection under a population genetic model that incorporates recombination. The analysis was integrated with a structural model of PvDBPII, and T- and B-cell epitopes were localized on the 3-D structure. Results The results suggest that: (i) recombination plays an important role in determining the haplotype structure of PvDBPII, and (ii) PvDBPII appears to contain neutrally evolving codons as well as codons evolving under natural selection. Diversifying selection preferentially acts on sites identified as epitopes, particularly on amino acid residues 417, 419, and 424, which show strong linkage disequilibrium. Conclusions This study shows that some polymorphisms of PvDBPII are present near the erythrocyte-binding domain and might serve to elude antibodies that inhibit cell invasion. Therefore, these polymorphisms should be taken into account when designing vaccines aimed at eliciting antibodies to inhibit erythrocyte invasion. PMID:21092207

  7. Membrane insertion and assembly of epitope-tagged gp9 at the tip of the M13 phage

    PubMed Central

    2011-01-01

    Background Filamentous M13 phage extrude from infected Escherichia coli with a tip structure composed of gp7 and gp9. This tip structure is extended by the assembly of the filament composed of the major coat protein gp8. Finally, gp3 and gp6 terminate the phage structure at the proximal end. Up to now, gp3 has been the primary tool for phage display technology. However, gp7, gp8 and gp9 could also be used for phage display and these phage particles should bind to two different or more surfaces when the modified coat proteins are combined. Therefore, we tested here if the amino-terminal end of gp9 can be modified and whether the modified portion is exposed and detectable on the M13 phage particles. Results The amino-terminal region of gp9 was modified by inserting short sequences that encode antigenic epitopes. We show here that the modified gp9 proteins correctly integrate into the membrane using the membrane insertase YidC exposing the modified epitope into the periplasm. The proteins are then efficiently assembled onto the phage particles. Also extensions up to 36 amino acid residues at the amino-terminal end of gp9 did not interfere with membrane integration and phage assembly. The exposure of the antigenic tags on the phage was visualised with immunogold labelling by electron microscopy and verified by dot blotting with antibodies to the tags. Conclusions Our results suggest that gp9 at the phage tip is suitable for the phage display technology. The modified gp9 can be supplied in trans from a plasmid and fully complements M13 phage with an amber mutation in gene 9. The modified phage tip is very well accessible to antibodies. PMID:21943062

  8. Fluorescent proteins function as a prey attractant: experimental evidence from the hydromedusa Olindias formosus and other marine organisms

    PubMed Central

    Haddock, Steven H. D.; Dunn, Casey W.

    2015-01-01

    ABSTRACT Although proteins in the green fluorescent protein family (GFPs) have been discovered in a wide array of taxa, their ecological functions in these organisms remain unclear. Many hypothesized roles are related to modifying bioluminescence spectra or modulating the light regime for algal symbionts, but these do not explain the presence of GFPs in animals that are non-luminous and non-symbiotic. Other hypothesized functions are unrelated to the visual signals themselves, including stress responses and antioxidant roles, but these cannot explain the localization of fluorescence in particular structures on the animals. Here we tested the hypothesis that fluorescence might serve to attract prey. In laboratory experiments, the predator was the hydromedusa Olindias formosus (previously known as O. formosa), which has fluorescent and pigmented patches on the tips of its tentacles. The prey, juvenile rockfishes in the genus Sebastes, were significantly more attracted (P<1×10⁻⁵) to the medusa's tentacles under lighting conditions where fluorescence was excited and tentacle tips were visible above the background. The fish did not respond significantly when treatments did not include fluorescent structures or took place under yellow or white lights, which did not generate fluorescence visible above the ambient light. Furthermore, underwater observations of the behavior of fishes when presented with a brightly illuminated point showed a strong attraction to this visual stimulus. In situ observations also provided evidence for fluorescent lures as supernormal stimuli in several other marine animals, including the siphonophore Rhizophysa eysenhardti. Our results support the idea that fluorescent structures can serve as prey attractants, thus providing a potential function for GFPs and other fluorescent proteins in a diverse range of organisms. PMID:26231627

  9. Paleo-Immunology: Evidence Consistent with Insertion of a Primordial Herpes Virus-Like Element in the Origins of Acquired Immunity

    PubMed Central

    Dreyfus, David H.

    2009-01-01

    Background The RAG encoded proteins, RAG-1 and RAG-2 regulate site-specific recombination events in somatic immune B- and T-lymphocytes to generate the acquired immune repertoire. Catalytic activities of the RAG proteins are related to the recombinase functions of a pre-existing mobile DNA element in the DDE recombinase/RNAse H family, sometimes termed the “RAG transposon”. Methodology/Principal Findings Novel to this work is the suggestion that the DDE recombinase responsible for the origins of acquired immunity was encoded by a primordial herpes virus, rather than a “RAG transposon.” A subsequent “arms race” between immunity to herpes infection and the immune system obscured primary amino acid similarities between herpes and immune system proteins but preserved regulatory, structural and functional similarities between the respective recombinase proteins. In support of this hypothesis, evidence is reviewed from previous published data that a modern herpes virus protein family with properties of a viral recombinase is co-regulated with both RAG-1 and RAG-2 by closely linked cis-acting co-regulatory sequences. Structural and functional similarity is also reviewed between the putative herpes recombinase and both DDE site of the RAG-1 protein and another DDE/RNAse H family nuclease, the Argonaute protein component of RISC (RNA induced silencing complex). Conclusions/Significance A “co-regulatory” model of the origins of V(D)J recombination and the acquired immune system can account for the observed linked genomic structure of RAG-1 and RAG-2 in non-vertebrate organisms such as the sea urchin that lack an acquired immune system and V(D)J recombination. 
    Initially the regulated expression of a viral recombinase in immune cells may have been positively selected by its ability to stimulate innate immunity to herpes virus infection rather than V(D)J recombination. Unlike the “RAG-transposon” hypothesis, the proposed model can be readily tested by comparative functional analysis of herpes virus replication and V(D)J recombination. PMID:19492059

  10. Disclosure of the differences of Mesorhizobium loti under the free-living and symbiotic conditions by comparative proteome analysis without bacteroid isolation

    PubMed Central

    2013-01-01

    Background Rhizobia are symbiotic nitrogen-fixing soil bacteria that show a symbiotic relationship with their host legume. Rhizobia have 2 different physiological conditions: a free-living condition in soil, and a symbiotic nitrogen-fixing condition in the nodule. The lifestyle of rhizobia remains largely unknown, although genome and transcriptome analyses have been carried out. To clarify the lifestyle of bacteria, proteome analysis is necessary because the protein profile directly reflects in vivo reactions of the organisms. In proteome analysis, high separation performance is required to analyze complex biological samples. Therefore, we used a liquid chromatography-tandem mass spectrometry system, equipped with a long monolithic silica capillary column, which is superior to conventional columns. In this study, we compared the protein profile of Mesorhizobium loti MAFF303099 under free-living condition to that of symbiotic conditions by using small amounts of crude extracts. Result We identified 1,533 and 847 proteins for M. loti under free-living and symbiotic conditions, respectively. Pathway analysis by Kyoto Encyclopedia of Genes and Genomes (KEGG) revealed that many of the enzymes involved in the central carbon metabolic pathway were commonly detected under both conditions. The proteins encoded in the symbiosis island, the transmissible chromosomal region that includes the genes that are highly upregulated under the symbiotic condition, were uniquely detected under the symbiotic condition. The features of the symbiotic condition that have been reported by transcriptome analysis were confirmed at the protein level by proteome analysis. In addition, the genes of the proteins involved in cell surface structure were repressed under the symbiotic nitrogen-fixing condition. Furthermore, farnesyl pyrophosphate (FPP) was found to be biosynthesized only in rhizobia under the symbiotic condition. 
    Conclusion The obtained protein profile appeared to reflect the difference in phenotypes under the free-living and symbiotic conditions. In addition, KEGG pathway analysis revealed that the cell surface structure of rhizobia was largely different under each condition, and surprisingly, rhizobia might provide FPP to the host as a source of secondary metabolism. M. loti changed its metabolism and cell surface structure in accordance with the surrounding conditions. PMID:23898917

  11. Proteomic changes in the base of chrysanthemum cuttings during adventitious root formation

    PubMed Central

    2013-01-01

    Background A lack of competence to form adventitious roots by cuttings of Chrysanthemum (Chrysanthemum morifolium) is an obstacle for the rapid fixation of elite genotypes. We performed a proteomic analysis of cutting bases of chrysanthemum cultivar ‘Jinba’ during adventitious root formation (ARF) in order to identify rooting ability associated protein and/or to get further insight into the molecular mechanisms controlling adventitious rooting. Results The protein profiles during ARF were analyzed by comparing the 2-DE gels between 0-day-old (just severed from the stock plant) and 5-day-old cutting bases of chrysanthemum. A total of 69 differentially accumulated protein spots (two-fold change; t-test: 95% significance) were excised and analyzed using MALDI-TOF/TOF, among which 42 protein spots (assigned as 24 types of proteins and 7 unknown proteins) were confidently identified using the NCBI database. The results demonstrated that 19% proteins were related to carbohydrate and energy metabolism, 16% to photosynthesis, 10% to protein fate, 7% to plant defense, 6% to cell structure, 7% to hormone related, 3% to nitrate metabolism, 3% to lipid metabolism, 3% to ascorbate biosynthesis and 3% to RNA binding, 23% were unknown proteins. Twenty types of differentially accumulated proteins including ACC oxidase (CmACO) were further analyzed at the transcription level, most of which were in accordance with the results of 2-DE. Moreover, the protein abundance changes of CmACO are supported by western blot experiments. Ethylene evolution was higher during the ARF compared with day 0 after cutting, while silver nitrate, an inhibitor of ethylene synthesis, pretreatment delayed the ARF. It suggested that ACC oxidase plays an important role in ARF of chrysanthemum. Conclusions The proteomic analysis of cutting bases of chrysanthemum allowed us to identify proteins whose expression was related to ARF. 
    We identified auxin-induced protein PCNT115 and ACC oxidase as positively and negatively correlated with ARF, respectively. Several other proteins related to carbohydrate and energy metabolism, protein degradation, photosynthesis and cell structure were also correlated with ARF. The induction of protein CmACO provides a strong case for ethylene as the immediate signal for ARF. This strongly suggests that the proteins we have identified will be valuable for further insight into the molecular mechanisms controlling ARF. PMID:24369042

  12. Corynebacterium diphtheriae invasion-associated protein (DIP1281) is involved in cell surface organization, adhesion and internalization in epithelial cells

    PubMed Central

    2010-01-01

    Background Corynebacterium diphtheriae, the causative agent of diphtheria, is well-investigated with respect to toxin production, while little is known about C. diphtheriae factors crucial for colonization of the host. In this study, we investigated the function of surface-associated protein DIP1281, previously annotated as hypothetical invasion-associated protein. Results Microscopic inspection of DIP1281 mutant strains revealed an increased size of the single cells in combination with an altered, less club-like shape and formation of chains of cells rather than the typical V-like division forms or palisades of growing C. diphtheriae cells. Cell viability was not impaired. Immuno-fluorescence microscopy, SDS-PAGE and 2-D PAGE of surface proteins revealed clear differences between wild-type and mutant protein patterns, which were verified by atomic force microscopy. DIP1281 mutant cells were not only altered in shape and surface structure but completely lacked the ability to adhere to host cells and consequently to invade them. Conclusions Our data indicate that DIP1281 is predominantly involved in the organization of the outer surface protein layer rather than in the separation of the peptidoglycan cell wall of dividing bacteria. The adhesion- and invasion-negative phenotype of corresponding mutant strains is an effect of rearrangements of the outer surface. PMID:20051108

  13. Inclusion bodies as potential vehicles for recombinant protein delivery into epithelial cells

    PubMed Central

    2012-01-01

    Background We present the potential of inclusion bodies (IBs) as a protein delivery method for polymeric filamentous proteins. We used a strain of E. coli, a conventional host organism, as the cell factory and keratin 14 (K14) as an example of a complex protein. Keratins build the intermediate filament cytoskeleton of all epithelial cells. In order to build filaments, monomeric K14 first needs to dimerize with its binding partner (keratin 5, K5), which is then followed by heterodimer assembly into filaments. Results K14 IBs were electroporated into SW13 cells grown in culture together with a “reporter” plasmid containing EYFP-labeled keratin 5 (K5) cDNA. As SW13 cells do not normally express keratins, and keratin filaments are built exclusively of keratin heterodimers (i.e. K5/K14), the short filamentous structures we obtained in this study can only arise if: a) both IBs and plasmid DNA are transfected simultaneously into the cell(s); b) once inside the cells, K14 protein is released from the IBs; and c) the released K14 is functional and able to form heterodimers with EYFP-K5. Conclusions Soluble IBs may also be developed for complex cytoskeletal proteins and used as nanoparticles for their delivery into epithelial cells. PMID:22624805

  14. Temperature dependence of phonons in photosynthesis proteins

    NASA Astrophysics Data System (ADS)

    Xu, Mengyang; Myles, Dean; Blankenship, Robert; Markelz, Andrea

    Protein long range vibrations are essential to biological function. For many proteins, these vibrations steer functional conformational changes. For photoharvesting proteins, the structural vibrations play an additional critical role in energy transfer to the reaction center by both phonon assisted energy transfer and energy dissipation. The characterization of these vibrations to understand how they are optimized to balance photoharvesting and photoprotection is challenging. To date this characterization has mainly relied on fluorescence line narrowing measurements at cryogenic temperatures. However, protein dynamics has a strong temperature dependence, with an apparent turn on in anharmonicity between 180-220 K. If this transition affects intramolecular vibrations, the low temperature measurements will not represent the phonon spectrum at biological temperatures. Here we use the new technique of anisotropic terahertz microscopy (ATM) to measure the intramolecular vibrations of FMO complex. ATM is uniquely capable of isolating protein vibrations from isotropic background. We find resonances both red and blue shift with temperature above the dynamical transition. The results indicate that the characterization of vibrations must be performed at biologically relevant temperatures to properly understand the energy overlap with the excitation energy transfer. This work was supported by NSF:DBI 1556359, BioXFEL seed Grant funding from NSF:DBI 1231306, DOE: DE-SC0016317, and the Bruce Holm University at Buffalo Research Foundation Grant.

  15. X-ray crystallography and its impact on understanding bacterial cell wall remodeling processes.

    PubMed

    Büttner, Felix Michael; Renner-Schneck, Michaela; Stehle, Thilo

    2015-02-01

    The molecular structure of matter defines its properties and function. This is especially true for biological macromolecules such as proteins, which participate in virtually all biochemical processes. A three dimensional structural model of a protein is thus essential for the detailed understanding of its physiological function and the characterization of essential properties such as ligand binding and reaction mechanism. X-ray crystallography is a well-established technique that has been used for many years, but it is still by far the most widely used method for structure determination. A particular strength of this technique is the elucidation of atomic details of molecular interactions, thus providing an invaluable tool for a multitude of scientific projects ranging from the structural classification of macromolecules over the validation of enzymatic mechanisms or the understanding of host-pathogen interactions to structure-guided drug design. In the first part of this review, we describe essential methodological and practical aspects of X-ray crystallography. We provide some pointers that should allow researchers without a background in structural biology to assess the overall quality and reliability of a crystal structure. To highlight its potential, we then survey the impact X-ray crystallography has had on advancing an understanding of a class of enzymes that modify the bacterial cell wall. A substantial number of different bacterial amidase structures have been solved, mostly by X-ray crystallography. Comparison of these structures highlights conserved as well as divergent features. In combination with functional analyses, structural information on these enzymes has therefore proven to be a valuable template not only for understanding their mechanism of catalysis, but also for targeted interference with substrate binding. Copyright © 2015 Elsevier GmbH. All rights reserved.

  16. Correlated Protein Motion Measurements of Dihydrofolate Reductase Crystals

    NASA Astrophysics Data System (ADS)

    Xu, Mengyang; Niessen, Katherine; Pace, James; Cody, Vivian; Markelz, Andrea

    2014-03-01

    We report the first direct measurements of the long range structural vibrational modes in dihydrofolate reductase (DHFR). DHFR is a universal housekeeping enzyme that catalyzes the reduction of 7,8-dihydrofolate to 5,6,7,8-tetra-hydrofolate, with the aid of coenzyme nicotinamide adenine dinucleotide phosphate (NADPH). This crucial enzymatic role as the target for anti-cancer [methotrexate (MTX)], and other clinically useful drugs, has made DHFR a long-standing target of enzymological studies. The terahertz (THz) frequency range (5-100 cm-1), corresponds to global correlated protein motions. In our lab we have developed Crystal Anisotropy Terahertz Microscopy (CATM), which directly measures these large scale intra-molecular protein vibrations, by removing the relaxational background of the solvent and residue side chain librational motions. We demonstrate narrowband features in the anisotropic absorbance for mouse DHFR with the ligand binding of NADPH and MTX single crystals as well as Escherichia coli DHFR with the ligand binding of NADPH and MTX single crystals. This work is supported by NSF grant MRI2 grant DBI2959989.

  17. In Silico Analysis for the Study of Botulinum Toxin Structure

    NASA Astrophysics Data System (ADS)

    Suzuki, Tomonori; Miyazaki, Satoru

    2010-01-01

    Protein-protein interactions play many important roles in biological function. Knowledge of protein-protein complex structure is required for understanding the function. Because the determination of protein-protein complex structure by experimental studies remains difficult, computational prediction of protein structures by structure modeling and docking studies is a valuable method. In addition, MD simulation is one of the most popular methods for modeling and characterizing protein structure. Here, we attempt to predict the structure and properties of a protein-protein complex using several bioinformatic methods, focusing on the botulinum toxin complex as the target structure.

  18. Search extension transforms Wiki into a relational system: A case for flavonoid metabolite database

    PubMed Central

    Arita, Masanori; Suwa, Kazuhiro

    2008-01-01

    Background In computer science, database systems are based on the relational model founded by Edgar Codd in 1970. On the other hand, in the area of biology the word 'database' often refers to loosely formatted, very large text files. Although such bio-databases may describe conflicts or ambiguities (e.g. a protein pair do and do not interact, or unknown parameters) in a positive sense, the flexibility of the data format sacrifices a systematic query mechanism equivalent to the widely used SQL. Results To overcome this disadvantage, we propose embeddable string-search commands on a Wiki-based system and have designed a half-formatted database. As proof of principle, a flavonoid database with 6902 molecular structures from over 1687 plant species was implemented on MediaWiki, the background system of Wikipedia. Registered users can describe any information in an arbitrary format. The structured part is subject to text-string searches to realize relational operations. The system was written in the PHP language as an extension of MediaWiki. All modifications are open-source and publicly available. Conclusion This scheme benefits from both the free-formatted Wiki style and the concise and structured relational-database style. MediaWiki supports multi-user environments for document management, and the cost for database maintenance is alleviated. PMID:18822113

  19. Genomes2Drugs: Identifies Target Proteins and Lead Drugs from Proteome Data

    PubMed Central

    Toomey, David; Hoppe, Heinrich C.; Brennan, Marian P.; Nolan, Kevin B.; Chubb, Anthony J.

    2009-01-01

    Background Genome sequencing and bioinformatics have provided the full hypothetical proteome of many pathogenic organisms. Advances in microarray and mass spectrometry have also yielded large output datasets of possible target proteins/genes. However, the challenge remains to identify new targets for drug discovery from this wealth of information. Further analysis includes bioinformatics and/or molecular biology tools to validate the findings. This is time consuming and expensive, and could fail to yield novel drugs if protein purification and crystallography is impossible. To pre-empt this, a researcher may want to rapidly filter the output datasets for proteins that show good homology to proteins that have already been structurally characterised or proteins that are already targets for known drugs. Critically, those researchers developing novel antibiotics need to select out the proteins that show close homology to any human proteins, as future inhibitors are likely to cross-react with the host protein, causing off-target toxicity effects later in clinical trials. Methodology/Principal Findings To solve many of these issues, we have developed a free online resource called Genomes2Drugs which ranks sequences to identify proteins that are (i) homologous to previously crystallized proteins or (ii) targets of known drugs, but are (iii) not homologous to human proteins. When tested using the Plasmodium falciparum malarial genome the program correctly enriched the ranked list of proteins with known drug target proteins. Conclusions/Significance Genomes2Drugs rapidly identifies proteins that are likely to succeed in drug discovery pipelines. This free online resource helps in the identification of potential drug targets. Importantly, the program further highlights proteins that are likely to be inhibited by FDA-approved drugs. These drugs can then be rapidly moved into Phase IV clinical studies under ‘change-of-application’ patents. PMID:19593435

  20. Adsorption and carbonylation of plasma proteins by dialyser membrane material: in vitro and in vivo proteomics investigations

    PubMed Central

    Pavone, Barbara; Sirolli, Vittorio; Bucci, Sonia; Libardi, Fulvio; Felaco, Paolo; Amoroso, Luigi; Sacchetta, Paolo; Urbani, Andrea; Bonomini, Mario

    2010-01-01

    Background. Protein carbonylation is an irreversible, non-repairable reaction caused by the introduction into proteins of carbonyl derivatives such as ketones and aldehydes, generated from direct oxidation processes or from secondary protein reaction with reactive carbonyl compounds. Several studies have demonstrated significantly increased levels of reactive carbonyl compounds, a general increase in plasma protein carbonyls and carbonyl formation on major plasma proteins in blood from uremic patients, particularly those undergoing chronic haemodialysis. Materials and methods. In the present preliminary study, we first used an in vitro filtration apparatus to assess the possible effects of different haemodialysis membrane materials on protein retention and carbonylation. We employed hollow fiber minidialyzers of identical structural characteristics composed of either polymethylmethacrylate, ethylenevinyl alcohol, or cellulose diacetate materials. Protein Western Blot and SDS-PAGE coupled to mass spectrometry analysis were applied to highlight the carbonylated protein-binding characteristics of the different materials. We also investigated in vivo protein carbonylation and carboxymethyl lysine modification in plasma obtained before and after a haemodialysis session. Results. Our data underline differing capacities for protein adsorption associated with the different properties of the filter materials, highlighting the central buffering and protective role of serum albumin. In particular, polymethylmethacrylate and cellulose diacetate showed, in vitro, the highest capacity for binding plasma proteins on the surface of the hollow fiber minidialyzers. Conclusions. The present study suggests that biomaterials used for fabrication of haemodialysis membranes may affect the carbonyl balance in chronic uremic patients. PMID:20606741

  1. Evolution of insect proteomes: insights into synapse organization and synaptic vesicle life cycle

    PubMed Central

    Yanay, Chava; Morpurgo, Noa; Linial, Michal

    2008-01-01

    Background The molecular components in synapses that are essential to the life cycle of synaptic vesicles are well characterized. Nonetheless, many aspects of synaptic processes, in particular how they relate to complex behaviour, remain elusive. The genomes of flies, mosquitoes, the honeybee and the beetle are now fully sequenced and span an evolutionary breadth of about 350 million years; this provides a unique opportunity to conduct a comparative genomics study of the synapse. Results We compiled a list of 120 gene prototypes that comprise the core of presynaptic structures in insects. Insects lack several scaffolding proteins in the active zone, such as bassoon and piccolo, and the most abundant protein in the mammalian synaptic vesicle, namely synaptophysin. The pattern of evolution of synaptic protein complexes is analyzed. According to this analysis, the components of presynaptic complexes as well as proteins that take part in organelle biogenesis are tightly coordinated. Most synaptic proteins are involved in rich protein interaction networks. Overall, the number of interacting proteins and the degrees of sequence conservation between human and insects are closely correlated. Such a correlation holds for exocytotic but not for endocytotic proteins. Conclusion This comparative study of human with insects sheds light on the composition and assembly of protein complexes in the synapse. Specifically, the nature of the protein interaction graphs differentiates exocytotic from endocytotic proteins and suggests unique evolutionary constraints for each set. General principles in the design of proteins of the presynaptic site can be inferred from a comparative study of human and insect genomes. PMID:18257909

  2. Relationships between IgE/IgG4 Epitopes, Structure and Function in Anisakis simplex Ani s 5, a Member of the SXP/RAL-2 Protein Family

    PubMed Central

    García-Mayoral, María Flor; Treviño, Miguel Angel; Pérez-Piñar, Teresa; Caballero, María Luisa; Knaute, Tobias; Umpierrez, Ana

    2014-01-01

    Background Anisakiasis is a re-emerging global disease caused by consumption of raw or lightly cooked fish contaminated with L3 Anisakis larvae. This zoonotic disease is characterized by severe gastrointestinal and/or allergic symptoms which may be misdiagnosed as appendicitis, gastric ulcer or other food allergies. The Anisakis allergen Ani s 5 is a protein belonging to the SXP/RAL-2 family; it is detected exclusively in nematodes. Previous studies showed that SXP/RAL-2 proteins are active antigens; however, their structure and function remain unknown. The aim of this study was to elucidate the three-dimensional structure of Ani s 5 and its main IgE and IgG4 binding regions. Methodology/Principal Findings The tertiary structure of recombinant Ani s 5 in solution was solved by nuclear magnetic resonance. Mg2+, but not Ca2+, binding was determined by band shift using SDS-PAGE. IgE and IgG4 epitopes were elucidated by microarray immunoassay and SPOTs membranes using sera from nine Anisakis allergic patients. The tertiary structure of Ani s 5 is composed of six alpha helices (H), with a calmodulin-like fold. H3 is a long, central helix that organizes the structure, with H1 and H2 packing at its N-terminus and H4 and H5 packing at its C-terminus. The orientation of H6 is undefined. Regarding epitopes recognized by IgE and IgG4 immunoglobulins, the same eleven peptides derived from Ani s 5 were bound by both IgE and IgG4. Peptides 14 (L40-K59), 26 (A76-A95) and 35 (I103-D122) were recognized by three out of nine sera. Conclusions/Significance This is the first reported 3D structure of an Anisakis allergen. Magnesium ion binding and structural resemblance to calmodulin suggest some putative functions for SXP/RAL-2 proteins. Furthermore, the IgE/IgG4 binding regions of Ani s 5 were identified as segments localized on its surface. 
These data will contribute towards a better understanding of the interactions that occur between immunoglobulins and allergens and, in turn, facilitate the design of novel diagnostic tests and immunotherapeutic strategies. PMID:24603892

  3. Fluorescent Approaches to High Throughput Crystallography

    NASA Technical Reports Server (NTRS)

    Pusey, Marc L.; Forsythe, Elizabeth

    2005-01-01

    X-ray crystallography remains the primary method for determining the structure of macromolecules. The first requirement is to have crystals, and obtaining them is often the rate-limiting step. The numbers of crystallization trials that are set up for any one protein for structural genomics, and the rate at which they are being set up, now overwhelm the ability for strictly human analysis of the results. Automated analysis methods are now being implemented with varying degrees of success, but these typically cannot reliably extract intermediate results. By covalently modifying a subpopulation, ≤1%, of a macromolecule solution with a fluorescent probe, the labeled material will add to a growing crystal as a microheterogeneous growth unit. Labeling procedures can be readily incorporated into the final stages of purification. The covalently attached probe will concentrate in the crystal relative to the solution, and under fluorescent illumination the crystals show up as bright objects against a dark background. As crystalline packing is more dense than amorphous precipitate, the fluorescence intensity can be used as a guide in distinguishing different types of precipitated phases, even in the absence of obvious crystalline features, widening the available potential lead conditions in the absence of clear "hits." Non-protein structures, such as salt crystals, will not incorporate the probe and will not show up under fluorescent illumination. Also, brightly fluorescent crystals are readily found against less fluorescent precipitated phases, which under white light illumination may serve to obscure the crystals. Automated image analysis to find crystals should be greatly facilitated, without having to first define crystallization drop boundaries and by having the protein or protein structures all that show up. 
The trace fluorescently labeled crystals will also emit with sufficient intensity to aid in the automation of crystal alignment using relatively low cost optics, further increasing throughput at synchrotrons. This presentation will focus on the methodology for fluorescent labeling, the crystallization results, and the effects of the trace labeling on the crystal quality.

  4. Fluorescent Approaches to High Throughput Crystallography

    NASA Technical Reports Server (NTRS)

    Minamitani, Elizabeth Forsythe; Pusey, Marc L.

    2004-01-01

    X-ray crystallography remains the primary method for determining the structure of macromolecules. The first requirement is to have crystals, and obtaining them is often the rate-limiting step. The numbers of crystallization trials that are set up for any one protein for structural genomics, and the rate at which they are being set up, now overwhelm the ability for strictly human analysis of the results. Automated analysis methods are now being implemented with varying degrees of success, but these typically cannot reliably extract intermediate results. By covalently modifying a subpopulation, ≤1%, of a macromolecule solution with a fluorescent probe, the labeled material will add to a growing crystal as a microheterogeneous growth unit. Labeling procedures can be readily incorporated into the final stages of a macromolecule's purification. The covalently attached probe will concentrate in the crystal relative to the solution, and under fluorescent illumination the crystals will show up as bright objects against a dark background. As crystalline packing is more dense than amorphous precipitate, the fluorescence intensity can be used as a guide in distinguishing different types of precipitated phases, even in the absence of obvious crystalline features, widening the available potential lead conditions in the absence of clear "hits." Non-protein structures, such as salt crystals, will not incorporate the probe and will not show up under fluorescent illumination. Also, brightly fluorescent crystals are readily found against less fluorescent precipitated phases, which under white light illumination may serve to obscure the crystals. Automated image analysis to find crystals should be greatly facilitated, without having to first define crystallization drop boundaries and by having the protein or protein structures all that show up. 
The trace fluorescently labeled crystals will also emit with sufficient intensity to aid in the automation of crystal alignment using relatively low cost optics, further increasing throughput at synchrotrons. This presentation will focus on the methodology for fluorescent labeling, the crystallization results, and the effects of the trace labeling on the crystal quality.

  5. Fluorescent Approaches to High Throughput Crystallography

    NASA Technical Reports Server (NTRS)

    Pusey, Marc L.; Forsythe, Elizabeth; Achari, Aniruddha

    2005-01-01

    X-ray crystallography remains the primary method for determining the structure of macromolecules. The first requirement is to have crystals, and obtaining them is often the rate-limiting step. The numbers of crystallization trials that are set up for any one protein for structural genomics, and the rate at which they are being set up, now overwhelm the ability for strictly human analysis of the results. Automated analysis methods are now being implemented with varying degrees of success, but these typically cannot reliably extract intermediate results. By covalently modifying a subpopulation, ≤1%, of a macromolecule solution with a fluorescent probe, the labeled material will add to a growing crystal as a microheterogeneous growth unit. Labeling procedures can be readily incorporated into the final stages of purification. The covalently attached probe will concentrate in the crystal relative to the solution, and under fluorescent illumination the crystals show up as bright objects against a dark background. As crystalline packing is more dense than amorphous precipitate, the fluorescence intensity can be used as a guide in distinguishing different types of precipitated phases, even in the absence of obvious crystalline features, widening the available potential lead conditions in the absence of clear "hits." Non-protein structures, such as salt crystals, will not incorporate the probe and will not show up under fluorescent illumination. Also, brightly fluorescent crystals are readily found against less fluorescent precipitated phases, which under white light illumination may serve to obscure the crystals. Automated image analysis to find crystals should be greatly facilitated, without having to first define crystallization drop boundaries and by having the protein or protein structures all that show up. 
The trace fluorescently labeled crystals will also emit with sufficient intensity to aid in the automation of crystal alignment using relatively low cost optics, further increasing throughput at synchrotrons. Preliminary experiments show that the presence of the fluorescent probe does not affect the nucleation process or the quality of the X-ray data obtained.

  6. Fluorescent Approaches to High Throughput Crystallography

    NASA Technical Reports Server (NTRS)

    Pusey, Marc L.; Forsythe, Elizabeth

    2004-01-01

    X-ray crystallography remains the primary method for determining the structure of macromolecules. The first requirement is to have crystals, and obtaining them is often the rate-limiting step. The numbers of crystallization trials that are set up for any one protein for structural genomics, and the rate at which they are being set up, now overwhelm the ability for strictly human analysis of the results. Automated analysis methods are now being implemented with varying degrees of success, but these typically cannot reliably extract intermediate results. By covalently modifying a subpopulation, ≤1%, of a macromolecule solution with a fluorescent probe, the labeled material will add to a growing crystal as a microheterogeneous growth unit. Labeling procedures can be readily incorporated into the final stages of purification. The covalently attached probe will concentrate in the crystal relative to the solution, and under fluorescent illumination the crystals show up as bright objects against a dark background. As crystalline packing is more dense than amorphous precipitate, the fluorescence intensity can be used as a guide in distinguishing different types of precipitated phases, even in the absence of obvious crystalline features, widening the available potential lead conditions in the absence of clear "hits." Non-protein structures, such as salt crystals, will not incorporate the probe and will not show up under fluorescent illumination. Also, brightly fluorescent crystals are readily found against less fluorescent precipitated phases, which under white light illumination may serve to obscure the crystals. Automated image analysis to find crystals should be greatly facilitated, without having to first define crystallization drop boundaries and by having the protein or protein structures all that show up. 
The trace fluorescently labeled crystals will also emit with sufficient intensity to aid in the automation of crystal alignment using relatively low cost optics, further increasing throughput at synchrotrons. This presentation will focus on the methodology for fluorescent labeling, the crystallization results, and the effects of the trace labeling on the crystal quality.

  7. Conformational and functional analysis of molecular dynamics trajectories by Self-Organising Maps

    PubMed Central

    2011-01-01

    Background Molecular dynamics (MD) simulations are powerful tools to investigate the conformational dynamics of proteins, which are often a critical element of their function. Identification of functionally relevant conformations is generally done by clustering the large ensemble of structures that are generated. Recently, Self-Organising Maps (SOMs) were reported to perform more accurately and to provide more consistent results than traditional clustering algorithms in various data mining problems. We present a novel strategy to analyse and compare conformational ensembles of protein domains using a two-level approach that combines SOMs and hierarchical clustering. Results The conformational dynamics of the α-spectrin SH3 protein domain and six single mutants were analysed by MD simulations. The Cα Cartesian coordinates of conformations sampled in the essential space were used as input data vectors for SOM training, then complete linkage clustering was performed on the SOM prototype vectors. A specific protocol to optimize a SOM for structural ensembles was proposed: the optimal SOM was selected by means of a Taguchi experimental design plan applied to different data sets, and the optimal sampling rate of the MD trajectory was selected. The proposed two-level approach was applied to single trajectories of the SH3 domain independently as well as to groups of them at the same time. The results demonstrated the potential of this approach in the analysis of large ensembles of molecular structures: the possibility of producing a topological mapping of the conformational space in a simple 2D visualisation, as well as of effectively highlighting differences in the conformational dynamics directly related to biological functions. Conclusions The use of a two-level approach combining SOMs and hierarchical clustering for conformational analysis of structural ensembles of proteins was proposed. 
It can easily be extended to other study cases and to conformational ensembles from other sources. PMID:21569575
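
    The two-level idea in this abstract (train a SOM, then apply complete linkage clustering to the SOM prototype vectors) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the toy 2D data, the grid size, the learning-rate and neighbourhood schedules, and the function names train_som and complete_linkage are all assumptions made for illustration.

```python
import random
from math import dist, exp

def train_som(data, rows, cols, steps=500, seed=0):
    """Train a small rectangular SOM; returns one prototype vector per grid node."""
    rng = random.Random(seed)
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    # Initialise prototypes from randomly chosen data points (an assumption).
    protos = [tuple(rng.choice(data)) for _ in grid]
    for t in range(steps):
        frac = 1.0 - t / steps                      # decays from 1 towards 0
        lr = 0.5 * frac + 0.01                      # illustrative learning-rate schedule
        radius = max(rows, cols) / 2 * frac + 0.5   # illustrative neighbourhood radius
        x = rng.choice(data)
        # Best-matching unit: the prototype closest to the sample.
        bmu = min(range(len(grid)), key=lambda k: dist(protos[k], x))
        for k, pos in enumerate(grid):
            # Gaussian neighbourhood measured on the 2D grid, not in data space.
            h = exp(-dist(pos, grid[bmu]) ** 2 / (2 * radius ** 2))
            protos[k] = tuple(w + lr * h * (xi - w) for w, xi in zip(protos[k], x))
    return protos

def complete_linkage(points, n_clusters):
    """Agglomerative clustering where cluster distance is the maximum pairwise distance."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = max(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)   # b > a, so index a is unaffected by the pop
    return clusters

# Toy stand-in for conformations: two well-separated blobs in 2D (hypothetical data).
rng = random.Random(1)
data = [(rng.gauss(0, 0.3), rng.gauss(0, 0.3)) for _ in range(30)]
data += [(rng.gauss(5, 0.3), rng.gauss(5, 0.3)) for _ in range(30)]

prototypes = train_som(data, rows=3, cols=3)         # level 1: SOM
groups = complete_linkage(prototypes, n_clusters=2)  # level 2: cluster the prototypes
print(groups)
```

    Clustering the handful of prototypes rather than every sampled conformation is what keeps the second level cheap; in a real analysis the input vectors would be the coordinate vectors described in the abstract rather than toy points.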

  8. MrGrid: A Portable Grid Based Molecular Replacement Pipeline

    PubMed Central

    Reboul, Cyril F.; Androulakis, Steve G.; Phan, Jennifer M. N.; Whisstock, James C.; Goscinski, Wojtek J.; Abramson, David; Buckle, Ashley M.

    2010-01-01

    Background The crystallographic determination of protein structures can be computationally demanding and for difficult cases can benefit from user-friendly interfaces to high-performance computing resources. Molecular replacement (MR) is a popular protein crystallographic technique that exploits the structural similarity between proteins that share some sequence similarity. But the need to trial permutations of search models, space group symmetries and other parameters makes MR time- and labour-intensive. However, MR calculations are embarrassingly parallel and thus ideally suited to distributed computing. In order to address this problem we have developed MrGrid, web-based software that allows multiple MR calculations to be executed across a grid of networked computers, allowing high-throughput MR. Methodology/Principal Findings MrGrid is a portable web based application written in Java/JSP and Ruby, and taking advantage of Apple Xgrid technology. Designed to interface with a user defined Xgrid resource the package manages the distribution of multiple MR runs to the available nodes on the Xgrid. We evaluated MrGrid using 10 different protein test cases on a network of 13 computers, and achieved an average speed up factor of 5.69. Conclusions MrGrid enables the user to retrieve and manage the results of tens to hundreds of MR calculations quickly and via a single web interface, as well as broadening the range of strategies that can be attempted. This high-throughput approach allows parameter sweeps to be performed in parallel, improving the chances of MR success. PMID:20386612
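
    Because each MR trial is independent of the others, a parameter sweep of the kind described here maps naturally onto a worker pool. The sketch below is only an illustration of that pattern using Python's standard library, not MrGrid's Java/JSP, Ruby and Xgrid implementation; run_mr_job is a hypothetical stand-in that returns a dummy score, and the model and space-group names are made up.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def run_mr_job(model, space_group):
    """Hypothetical stand-in for one MR run; a real job would launch an external program."""
    dummy_score = len(model) + len(space_group)  # placeholder, not a real MR score
    return (model, space_group, dummy_score)

def sweep(models, space_groups, workers=4):
    """Run every (model, space group) combination independently and return the best result."""
    jobs = list(product(models, space_groups))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda job: run_mr_job(*job), jobs))
    return max(results, key=lambda r: r[2])

best = sweep(["modelA", "modelB_trimmed"], ["P212121", "P21"])
print(best)

# Parallel efficiency implied by the figures quoted in the abstract:
# an average speed-up of 5.69 on a network of 13 computers.
efficiency = 5.69 / 13
print(f"{efficiency:.2f}")
```

    Threads are used here only because the placeholder job is trivial; a pool that dispatches work to separate machines, as in the abstract, follows the same submit-and-collect structure.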

  9. Diversity in Protein Glycosylation among Insect Species

    PubMed Central

    Vandenborre, Gianni; Smagghe, Guy; Ghesquière, Bart; Menschaert, Gerben; Nagender Rao, Rameshwaram; Gevaert, Kris; Van Damme, Els J. M.

    2011-01-01

    Background A very common protein modification in multicellular organisms is protein glycosylation or the addition of carbohydrate structures to the peptide backbone. Although the Class of the Insecta is the largest animal taxon on Earth, almost all information concerning glycosylation in insects is derived from studies with only one species, namely the fruit fly Drosophila melanogaster. Methodology/Principal Findings In this report, the differences in glycoproteomes between insects belonging to several economically important insect orders were studied. Using GNA (Galanthus nivalis agglutinin) affinity chromatography, different sets of glycoproteins with mannosyl-containing glycan structures were purified from the flour beetle (Tribolium castaneum), the silkworm (Bombyx mori), the honeybee (Apis mellifera), the fruit fly (D. melanogaster) and the pea aphid (Acyrthosiphon pisum). To identify and characterize the purified glycoproteins, LC-MS/MS analysis was performed. For all insect species, it was demonstrated that glycoproteins were related to a broad range of biological processes and molecular functions. Moreover, the majority of glycoproteins retained on the GNA column were unique to one particular insect species and only a few glycoproteins were present in the five different glycoprotein sets. Furthermore, these data support the hypothesis that insect glycoproteins can be decorated with mannosylated O-glycans. Conclusions/Significance The results presented here demonstrate that oligomannose N-glycosylation events are highly specific depending on the insect species. In addition, we also demonstrated that protein O-mannosylation in insect species may occur more frequently than currently believed. PMID:21373189

  10. Estimated Effects of Future Atmospheric CO2 Concentrations on Protein Intake and the Risk of Protein Deficiency by Country and Region

    PubMed Central

    Schwartz, Joel; Myers, Samuel S.

    2017-01-01

    Background: Crops grown under elevated atmospheric CO2 concentrations (eCO2) contain less protein. Crops particularly affected include rice and wheat, which are primary sources of dietary protein for many countries. Objectives: We aimed to estimate global and country-specific risks of protein deficiency attributable to anthropogenic CO2 emissions by 2050. Methods: To model per capita protein intake in countries around the world under eCO2, we first established the effect size of eCO2 on the protein concentration of edible portions of crops by performing a meta-analysis of published literature. We then estimated per-country protein intake under current and anticipated future eCO2 using global food balance sheets (FBS). We modeled protein intake distributions within countries using Gini coefficients, and we estimated those at risk of deficiency from estimated average protein requirements (EAR) weighted by population age structure. Results: Under eCO2, rice, wheat, barley, and potato protein contents decreased by 7.6%, 7.8%, 14.1%, and 6.4%, respectively. Consequently, 18 countries may lose >5% of their dietary protein, including India (5.3%). By 2050, assuming today’s diets and levels of income inequality, an additional 1.6% or 148.4 million of the world’s population may be placed at risk of protein deficiency because of eCO2. In India, an additional 53 million people may become at risk. Conclusions: Anthropogenic CO2 emissions threaten the adequacy of protein intake worldwide. Elevated atmospheric CO2 may widen the disparity in protein intake within countries, with plant-based diets being the most vulnerable. https://doi.org/10.1289/EHP41 PMID:28885977
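
    The step of turning a mean intake and a Gini coefficient into a share of people below a requirement can be sketched as follows. The abstract does not state which distribution family was used, so the log-normal form here is an assumption, as are the function names and the example inputs (60 g/day mean intake, Gini 0.15, 45 g/day requirement); only the 5.3% loss figure is taken from the abstract. For a log-normal distribution the Gini coefficient G and shape parameter sigma are related by G = 2*Phi(sigma/sqrt(2)) - 1.

```python
from math import log, sqrt
from statistics import NormalDist

def lognormal_sigma_from_gini(gini):
    """Invert G = 2*Phi(sigma/sqrt(2)) - 1 for the log-normal shape parameter sigma."""
    return sqrt(2.0) * NormalDist().inv_cdf((gini + 1.0) / 2.0)

def fraction_below(mean_intake, gini, requirement):
    """Share of a log-normally distributed population whose intake is below the requirement."""
    sigma = lognormal_sigma_from_gini(gini)
    mu = log(mean_intake) - sigma ** 2 / 2.0   # chosen so the distribution mean equals mean_intake
    return NormalDist(mu, sigma).cdf(log(requirement))

# Hypothetical inputs for illustration only (not values from the paper).
mean_intake, gini, requirement = 60.0, 0.15, 45.0

baseline = fraction_below(mean_intake, gini, requirement)
# Apply the 5.3% loss of dietary protein quoted for India in the abstract.
reduced = fraction_below(mean_intake * (1 - 0.053), gini, requirement)

print(f"at risk before: {baseline:.3f}, after: {reduced:.3f}")
```

    With inequality held fixed, lowering the mean shifts the whole distribution downward, so the share below the requirement can only grow; this is the mechanism behind the "additional" at-risk population reported in the abstract.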

  11. The structural basis for the functional comparability of Factor VIII and the long-acting variant recombinant Factor VIII Fc fusion protein

    PubMed Central

    Leksa, N.C.; Chiu, P.-L.; Bou-Assaf, G.M.; Quan, C.; Liu, Z.; Goodman, A.B.; Chambers, M.G.; Tsutakawa, S.E.; Hammel, M.; Peters, R.T.; Walz, T.; Kulman, J.D.

    2017-01-01

    SUMMARY Background Fusion of the human IgG1 Fc domain to the C-terminal C2 domain of B domain-deleted (BDD) factor VIII (FVIII) results in the rFVIIIFc fusion protein that has a 1.5-fold longer half-life in humans. Objective To assess the structural properties of rFVIIIFc by comparing its constituent FVIII and Fc elements with their respective isolated components and evaluating their structural independence within rFVIIIFc. Methods rFVIIIFc and its isolated FVIII and Fc components were compared by hydrogen-deuterium exchange mass spectrometry (HDX-MS). The structure of rFVIIIFc was also evaluated by X-ray crystallography, small-angle X-ray scattering (SAXS), and electron microscopy (EM). The degree of steric interference by the appended Fc domain was assessed by EM and surface plasmon resonance (SPR). Results HDX-MS analysis of rFVIIIFc revealed that fusion caused no structural perturbations in FVIII or Fc. The rFVIIIFc crystal structure showed that the FVIII component is indistinguishable from published BDD FVIII structures. The Fc domain was not observed, indicating high mobility. SAXS analysis was consistent with an ensemble of rigid-body models in which the Fc domain exists in a largely extended orientation relative to FVIII. Binding of Fab fragments of anti-C2 domain antibodies to BDD FVIII was visualized by EM, and the affinities of the corresponding intact antibodies for BDD FVIII and rFVIIIFc were comparable by SPR analysis. Conclusions The FVIII and Fc components of rFVIIIFc are structurally indistinguishable from their isolated constituents and exhibit a high degree of structural independence, consistent with the functional comparability of rFVIIIFc and unmodified FVIII. PMID:28397397

  12. Protein interface classification by evolutionary analysis

    PubMed Central

    2012-01-01

    Background Distinguishing biologically relevant interfaces from lattice contacts in protein crystals is a fundamental problem in structural biology. Despite efforts towards the computational prediction of interface character, many issues are still unresolved. Results We present here a protein-protein interface classifier that relies on evolutionary data to detect the biological character of interfaces. The classifier uses a simple geometric measure, number of core residues, and two evolutionary indicators based on the sequence entropy of homolog sequences. Both aim at detecting differential selection pressure between interface core and rim or rest of surface. The core residues, defined as fully buried residues (>95% burial), appear to be fundamental determinants of biological interfaces: their number is in itself a powerful discriminator of interface character and together with the evolutionary measures it is able to clearly distinguish evolved biological contacts from crystal ones. We demonstrate that this definition of core residues leads to distinctively better results than earlier definitions from the literature. The stringent selection and quality filtering of structural and sequence data was key to the success of the method. Most importantly we demonstrate that a more conservative selection of homolog sequences - with relatively high sequence identities to the query - is able to produce a clearer signal than previous attempts. Conclusions An evolutionary approach like the one presented here is key to the advancement of the field, which so far was missing an effective method exploiting the evolutionary character of protein interfaces. Its coverage and performance will only improve over time thanks to the incessant growth of sequence databases. Currently our method reaches an accuracy of 89% in classifying interfaces of the Ponstingl 2003 datasets and it lends itself to a variety of useful applications in structural biology and bioinformatics. 
We made the corresponding software implementation available to the community as an easy-to-use graphical web interface at http://www.eppic-web.org. PMID:23259833
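
    The evolutionary indicators mentioned in this abstract rest on per-position sequence entropy computed over homolog sequences, compared between interface core and rim. The sketch below shows only that basic ingredient under stated assumptions: the toy alignment, the core and rim position lists, and the function names are hypothetical, and the actual scoring used by the published classifier may differ.

```python
from collections import Counter
from math import log2

def column_entropy(column):
    """Shannon entropy (bits) of the residues in one alignment column, ignoring gaps."""
    counts = Counter(res for res in column if res != "-")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * log2(n / total) for n in counts.values())

def mean_entropy(alignment, positions):
    """Average column entropy over the given alignment positions."""
    columns = ("".join(seq[i] for seq in alignment) for i in positions)
    return sum(column_entropy(col) for col in columns) / len(positions)

# Toy alignment of homologs (hypothetical sequences, one string per homolog).
alignment = ["ACDEF", "ACDQF", "ACEKF", "ACDRW"]
core_positions = [0, 1]   # assumed fully buried interface residues
rim_positions = [3, 4]    # assumed partially exposed interface residues

core = mean_entropy(alignment, core_positions)
rim = mean_entropy(alignment, rim_positions)

# Lower entropy means stronger conservation; a core that is more conserved
# than the rim is the kind of signal expected for an evolved biological interface.
print(core < rim)
```

    A crystal contact, by contrast, would not be expected to show a systematic difference between the two sets of positions, which is why the comparison, not the absolute entropy, carries the information.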

  13. Fluorescent Approaches to High Throughput Crystallography

    NASA Technical Reports Server (NTRS)

    Pusey, Marc L.; Forsythe, Elizabeth; Achari, Aniruddha

    2006-01-01

    We have shown that by covalently modifying a subpopulation, less than or equal to 1%, of a macromolecule with a fluorescent probe, the labeled material will add to a growing crystal as a microheterogeneous growth unit. Labeling procedures can be readily incorporated into the final stages of purification, and the presence of the probe at low concentrations does not affect the X-ray data quality or the crystallization behavior. The presence of the trace fluorescent label gives a number of advantages when used with high throughput crystallizations. The covalently attached probe will concentrate in the crystal relative to the solution, and under fluorescent illumination crystals show up as bright objects against a dark background. Non-protein structures, such as salt crystals, will not incorporate the probe and will not show up under fluorescent illumination. Brightly fluorescent crystals are readily found against less bright precipitated phases, which under white light illumination may obscure the crystals. Automated image analysis to find crystals should be greatly facilitated, without having to first define crystallization drop boundaries, as the protein or protein structures are all that show up. Fluorescence intensity is a faster search parameter, whether visually or by automated methods, than looking for crystalline features. We are now testing the use of high fluorescence intensity regions, in the absence of clear crystalline features or "hits", as a means for determining potential lead conditions. A working hypothesis is that kinetics leading to non-structured phases may overwhelm and trap more slowly formed ordered assemblies, which subsequently show up as regions of brighter fluorescence intensity. Preliminary experiments with test proteins have resulted in the extraction of a number of crystallization conditions from screening outcomes based solely on the presence of bright fluorescent regions. 
Subsequent experiments will test this approach using a wider range of proteins. The trace fluorescently labeled crystals will also emit with sufficient intensity to aid in the automation of crystal alignment using relatively low cost optics, further increasing throughput at synchrotrons.

  14. Development of the lateral ventricular choroid plexus in a marsupial, Monodelphis domestica

    PubMed Central

    2010-01-01

    Background Choroid plexus epithelial cells are the site of blood/cerebrospinal fluid (CSF) barrier and regulate molecular transfer between the two compartments. Their mitotic activity in the adult is low. During development, the pattern of growth and timing of acquisition of functional properties of plexus epithelium are not known. Methods Numbers and size of choroid plexus epithelial cells and their nuclei were counted and measured in the lateral ventricular plexus from the first day of its appearance until adulthood. Newborn Monodelphis pups were injected with 5-bromo-2-deoxyuridine (BrdU) at postnatal day 3 (P3), P4 and P5. Additional animals were injected at P63, P64 and P65. BrdU-immunopositive nuclei were counted and their position mapped in the plexus structure at different ages after injections. Double-labelling immunocytochemistry with antibodies to plasma protein identified post-mitotic cells involved in protein transfer. Results Numbers of choroid plexus epithelial cells increased 10-fold between the time of birth and adulthood. In newborn pups each consecutive injection of BrdU labelled 20-40 of epithelial cells counted. After 3 injections, numbers of BrdU positive cells remained constant for at least 2 months. BrdU injections at an older age (P63, P64, P65) resulted in a smaller number of labelled plexus cells. Numbers of plexus cells immunopositive for both BrdU and plasma protein increased with age indicating that protein transferring properties are acquired post mitotically. Labelled nuclei were only detected on the dorsal arm of the plexus as it grows from the neuroependyma, moving along the structure in a 'conveyor belt' like fashion. Conclusions The present study established that lateral ventricular choroid plexus epithelial cells are born on the dorsal side of the structure only. Cells born in the first few days after choroid plexus differentiation from the neuroependyma remain present even two months later. 
Protein-transferring properties are acquired post-mitotically and relatively early in plexus development. PMID:20920364

  15. NMR spectroscopic and analytical ultracentrifuge analysis of membrane protein detergent complexes

    PubMed Central

    Maslennikov, Innokentiy; Kefala, Georgia; Johnson, Casey; Riek, Roland; Choe, Senyon; Kwiatkowski, Witek

    2007-01-01

    Background Structural studies of integral membrane proteins (IMPs) are hampered by inherent difficulties in their heterologous expression and in the purification of solubilized protein-detergent complexes (PDCs). The choice and concentrations of detergents used in an IMP preparation play a critical role in protein homogeneity and are thus important for successful crystallization. Results Seeking an effective and standardized means applicable to genomic approaches for the characterization of PDCs, we chose 1D-NMR spectroscopic analysis to monitor the detergent content throughout their purification: protein extraction, detergent exchange, and sample concentration. We demonstrate that a single NMR measurement combined with a SDS-PAGE of a detergent extracted sample provides a useful gauge of the detergent's extraction potential for a given protein. Furthermore, careful monitoring of the detergent content during the process of IMP production allows for a high level of reproducibility. We also show that in many cases a simple sedimentation velocity measurement provides sufficient data to estimate both the oligomeric state and the detergent-to-protein ratio in PDCs, as well as to evaluate the homogeneity of the samples prior to crystallization screening. Conclusion The techniques presented here facilitate the screening and selection of the extraction detergent, as well as help to maintain reproducibility in the detergent exchange and PDC concentration procedures. Such reproducibility is particularly important for the optimization of initial crystallization conditions, for which multiple purifications are routinely required. PMID:17988403

  16. Evolution of plant virus movement proteins from the 30K superfamily and of their homologs integrated in plant genomes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mushegian, Arcady R., E-mail: mushegian2@gmail.com; Elena, Santiago F., E-mail: sfelena@ibmcp.upv.es; The Santa Fe Institute, Santa Fe, NM 87501

    Homologs of Tobacco mosaic virus 30K cell-to-cell movement protein are encoded by diverse plant viruses. Mechanisms of action and evolutionary origins of these proteins remain obscure. We expand the picture of conservation and evolution of the 30K proteins, producing sequence alignment of the 30K superfamily with the broadest phylogenetic coverage thus far and illuminating structural features of the core all-beta fold of these proteins. Integrated copies of pararetrovirus 30K movement genes are prevalent in euphyllophytes, with at least one copy intact in nearly every examined species, and mRNAs detected for most of them. Sequence analysis suggests repeated integrations, pseudogenizations, and positive selection in those provirus genes. An unannotated 30K-superfamily gene in Arabidopsis thaliana genome is likely expressed as a fusion with the At1g37113 transcript. This molecular background of endopararetrovirus gene products in plants may change our view of virus infection and pathogenesis, and perhaps of cellular homeostasis in the hosts. - Highlights: • Sequence region shared by plant virus “30K” movement proteins has an all-beta fold. • Most euphyllophyte genomes contain integrated copies of pararetroviruses. • These integrated virus genomes often include intact movement protein genes. • Molecular evidence suggests that these “30K” genes may be selected for function.

  17. Burkholderia Hep_Hag autotransporter (BuHA) proteins elicit a strong antibody response during experimental glanders but not human melioidosis

    PubMed Central

    Tiyawisutsri, Rachaneeporn; Holden, Matthew TG; Tumapa, Sarinna; Rengpipat, Sirirat; Clarke, Simon R; Foster, Simon J; Nierman, William C; Day, Nicholas PJ; Peacock, Sharon J

    2007-01-01

    Background The bacterial biothreat agents Burkholderia mallei and Burkholderia pseudomallei are the cause of glanders and melioidosis, respectively. Genomic and epidemiological studies have shown that B. mallei is a recently emerged, host restricted clone of B. pseudomallei. Results Using bacteriophage-mediated immunoscreening we identified genes expressed in vivo during experimental equine glanders infection. A family of immunodominant antigens were identified that share protein domain architectures with hemagglutinins and invasins. These have been designated Burkholderia Hep_Hag autotransporter (BuHA) proteins. A total of 110/207 positive clones (53%) of a B. mallei expression library screened with sera from two infected horses belonged to this family. This contrasted with 6/189 positive clones (3%) of a B. pseudomallei expression library screened with serum from 21 patients with culture-proven melioidosis. Conclusion Members of the BuHA proteins are found in other Gram-negative bacteria and have been shown to have important roles related to virulence. Compared with other bacterial species, the genomes of both B. mallei and B. pseudomallei contain a relative abundance of this family of proteins. The domain structures of these proteins suggest that they function as multimeric surface proteins that modulate interactions of the cell with the host and environment. Their effect on the cellular immune response to B. mallei and their potential as diagnostics for glanders requires further study. PMID:17362501

  18. Interconversion of Anthozoa GFP-like fluorescent and non-fluorescent proteins by mutagenesis

    PubMed Central

    Bulina, Maria E; Chudakov, Dmitry M; Mudrik, Nikolay N; Lukyanov, Konstantin A

    2002-01-01

    Background Within the family of green fluorescent protein (GFP) homologs, one can mark two main groups, specifically, fluorescent proteins (FPs) and non-fluorescent or chromoproteins (CPs). The structural background of the differences between FPs and CPs is poorly understood to date. Results Here, we applied site-directed and random mutagenesis to transform a CP into an FP and vice versa. A purple chromoprotein asCP (asFP595) from Anemonia sulcata and a red fluorescent protein DsRed from Discosoma sp. were selected as representatives of CPs and FPs, respectively. For asCP, some substitutions at positions 148 and 165 (numbering in accordance with GFP) were found to dramatically increase the quantum yield of red fluorescence. For DsRed, substitutions at positions 148, 165, 167, and 203 significantly decreased fluorescence intensity, so that the spectral characteristics of these mutants became closer to those of CPs. Finally, a practically non-fluorescent mutant DsRed-NF was generated. This mutant carried four amino acid substitutions, specifically, S148C, I165N, K167M, and S203A. DsRed-NF possessed a high extinction coefficient and an extremely low quantum yield (< 0.001). These spectral characteristics allow one to regard DsRed-NF as a true chromoprotein. Conclusions We located a novel position in the asCP sequence (position 165) at which mutations can result in the appearance of red fluorescence. This finding could probably be applied to other CPs to generate red and far-red fluorescent mutants. The possibility of transforming an FP into a CP was demonstrated. A key role was revealed for residues adjacent to the chromophore's phenolic ring in determining the fluorescent or non-fluorescent state. PMID:11972899

  19. Allele-specific Characterization of Alanine: Glyoxylate Aminotransferase Variants Associated with Primary Hyperoxaluria

    PubMed Central

    Lage, Melissa D.; Pittman, Adrianne M. C.; Roncador, Alessandro; Cellini, Barbara; Tucker, Chandra L.

    2014-01-01

    Primary Hyperoxaluria Type 1 (PH1) is a rare autosomal recessive kidney stone disease caused by deficiency of the peroxisomal enzyme alanine: glyoxylate aminotransferase (AGT), which is involved in glyoxylate detoxification. Over 75 different missense mutations in AGT have been found associated with PH1. While some of the mutations have been found to affect enzyme activity, stability, and/or localization, approximately half of these mutations are completely uncharacterized. In this study, we sought to systematically characterize AGT missense mutations associated with PH1. To facilitate analysis, we used two high-throughput yeast-based assays: one that assesses AGT specific activity, and one that assesses protein stability. Approximately 30% of PH1-associated missense mutations are found in conjunction with a minor allele polymorphic variant, which can interact to elicit complex effects on protein stability and trafficking. To better understand this allele interaction, we functionally characterized each of 34 mutants on both the major (wild-type) and minor allele backgrounds, identifying mutations that synergize with the minor allele. We classify these mutants into four distinct categories depending on activity/stability results in the different alleles. Twelve mutants were found to display reduced activity in combination with the minor allele, compared with the major allele background. When mapped on the AGT dimer structure, these mutants reveal localized regions of the protein that appear particularly sensitive to interactions with the minor allele variant. While the majority of the deleterious effects on activity in the minor allele can be attributed to synergistic interaction affecting protein stability, we identify one mutation, E274D, that appears to specifically affect activity when in combination with the minor allele. PMID:24718375

  20. Revealing the functionality of hypothetical protein KPN00728 from Klebsiella pneumoniae MGH78578: molecular dynamics simulation approaches

    PubMed Central

    2011-01-01

    Background Previously, the hypothetical protein KPN00728 from Klebsiella pneumoniae MGH78578 was identified as the succinate dehydrogenase (SDH) chain C subunit via structural prediction and molecular docking simulation studies. However, owing to the limitations of docking simulation, an in-depth understanding of how SDH interaction occurs across the transmembrane of mitochondria could not be provided. Results In the present study, molecular dynamics (MD) simulation of KPN00728 and SDH chain D in a membrane was performed in order to gain a deeper insight into its molecular role as SDH. Structural stability was successfully obtained in the calculation for area per lipid, tail order parameter, thickness of lipid and secondary structural properties. Interestingly, water molecules were found to be highly likely to mediate the interaction between ubiquinone (UQ) and SDH chain C via interaction with the Ser27 and Arg31 residues, as compared with the earlier docking study. Polar residues such as Asp95 and Glu101 (KPN00728) and Asp15 and Glu78 (SDH chain D) might have contributed to the creation of a polar environment, which is essential for the electron transport chain in the Krebs cycle. Conclusions In conclusion, apart from the comparable structural stability, the dynamics of the interacting residues and the hydrogen bonding analysis have further proved that the interaction of KPN00728 as SDH is preserved, in good agreement with our earlier postulation. PMID:22372825

  1. Prediction of vitamin interacting residues in a vitamin binding protein using evolutionary information

    PubMed Central

    2013-01-01

    Background Vitamins are important cofactors in various enzymatic reactions. In the past, many inhibitors have been designed against vitamin binding pockets in order to inhibit vitamin-protein interactions. Thus, it is important to identify vitamin interacting residues in a protein. It is possible to detect vitamin-binding pockets on a protein if its tertiary structure is known. Unfortunately, tertiary structures are available for only a limited number of proteins. Therefore, it is important to develop in-silico models for predicting vitamin interacting residues in a protein from its primary structure. Results In this study, we first compared protein-interacting residues of vitamins with other ligands using Two Sample Logo (TSL). It was observed that ATP, GTP, NAD, FAD and mannose preferred {G,R,K,S,H}, {G,K,T,S,D,N}, {T,G,Y}, {G,Y,W} and {Y,D,W,N,E} residues respectively, whereas vitamins preferred {Y,F,S,W,T,G,H} residues for the interaction with proteins. Furthermore, compositional information of preferred and non-preferred residues along with patterns-specificity was also observed within different vitamin-classes. Vitamins A, B and B6 preferred {F,I,W,Y,L,V}, {S,Y,G,T,H,W,N,E} and {S,T,G,H,Y,N} interacting residues respectively. This suggested that the protein-binding patterns of vitamins are different from those of other ligands, and motivated us to develop separate predictors for vitamins and their sub-classes. Four different prediction modules, (i) vitamin interacting residues (VIRs), (ii) vitamin-A interacting residues (VAIRs), (iii) vitamin-B interacting residues (VBIRs) and (iv) pyridoxal-5-phosphate (vitamin B6) interacting residues (PLPIRs), have been developed. We applied various classifiers, including SVM, BayesNet, NaiveBayes, ComplementNaiveBayes, NaiveBayesMultinomial, RandomForest and IBk, as machine learning techniques, using binary and Position-Specific Scoring Matrix (PSSM) features of protein sequences. 
Finally, we selected the best performing SVM modules and obtained the highest MCC values of 0.53, 0.48, 0.61 and 0.81 for VIRs, VAIRs, VBIRs and PLPIRs respectively, using PSSM-based evolutionary information. All the modules developed in this study have been trained and tested on non-redundant datasets and evaluated using a five-fold cross-validation technique. The performances were also evaluated on the balanced and different independent datasets. Conclusions This study demonstrates that it is possible to predict VIRs, VAIRs, VBIRs and PLPIRs from evolutionary information of the protein sequence. In order to provide a service to the scientific community, we have developed a web server and standalone software, VitaPred (http://crdd.osdd.net/raghava/vitapred/). PMID:23387468
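
    The MCC values quoted in this abstract are Matthews correlation coefficients, computed from the counts of true and false positives and negatives for a binary residue classifier. The sketch below shows the standard formula only; the function names and the toy labels are illustrative assumptions and are not taken from VitaPred.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 marks an interacting residue."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC in [-1, 1]; returns 0.0 when any marginal total is zero (a common convention)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy example: four residues, one interacting residue missed by the predictor.
counts = confusion_counts([1, 1, 0, 0], [1, 0, 0, 0])
mcc = matthews_corrcoef(*counts)
```

    Unlike plain accuracy, MCC stays informative when interacting residues are much rarer than non-interacting ones, which is the usual situation in residue-level prediction.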

  2. Structure of Thermotoga maritima Stationary Phase Survival Protein SurE: A Novel Acid Phosphatase

    PubMed Central

    Zhang, R.-G.; Skarina, T.; Katz, J.E.; Beasley, S.; Khachatryan, A.; Vyas, S.; Arrowsmith, C.H.; Clarke, S.; Edwards, A.; Joachimiak, A.; Savchenko, A.

    2009-01-01

    Summary Background The rpoS, nlpD, pcm, and surE genes are among many whose expression is induced during the stationary phase of bacterial growth. rpoS codes for the stationary-phase RNA polymerase σ subunit, and nlpD codes for a lipoprotein. The pcm gene product repairs damaged proteins by converting the atypical isoaspartyl residues back to L-aspartyls. The physiological and biochemical functions of surE are unknown, but its importance in stress is supported by the duplication of the surE gene in E. coli subjected to high-temperature growth. The pcm and surE genes are highly conserved in bacteria, archaea, and plants. Results The structure of SurE from Thermotoga maritima was determined at 2.0 Å. The SurE monomer is composed of two domains: a conserved N-terminal domain, a Rossmann fold, and a C-terminal oligomerization domain, a new fold. Monomers form a dimer that assembles into a tetramer. Biochemical analysis suggests that SurE is an acid phosphatase, with an optimum pH of 5.5–6.2. The active site was identified in the N-terminal domain through analysis of conserved residues. Structure-based site-directed point mutations abolished phosphatase activity. T. maritima SurE intra- and inter-subunit salt bridges were identified that may explain the SurE thermostability. Conclusions The structure of SurE provided information about the protein’s fold, oligomeric state, and active site. The protein possessed magnesium-dependent acid phosphatase activity, but the physiologically relevant substrate(s) remains to be identified. The importance of three of the assigned active site residues in catalysis was confirmed by site-directed mutagenesis. PMID:11709173

  3. Purification of infectious human herpesvirus 6A virions and association of host cell proteins

    PubMed Central

    Hammarstedt, Maria; Ahlqvist, Jenny; Jacobson, Steven; Garoff, Henrik; Fogdell-Hahn, Anna

    2007-01-01

    Background Viruses that are incorporating host cell proteins might trigger autoimmune diseases. It is therefore of interest to identify possible host proteins associated with viruses, especially for enveloped viruses that have been suggested to play a role in autoimmune diseases, like human herpesvirus 6A (HHV-6A) in multiple sclerosis (MS). Results We have established a method for rapid and morphology preserving purification of HHV-6A virions, which in combination with parallel analyses with background control material released from mock-infected cells facilitates qualitative and quantitative investigations of the protein content of HHV-6A virions. In our iodixanol gradient purified preparation, we detected high levels of viral DNA by real-time PCR and viral proteins by metabolic labelling, silver staining and western blots. In contrast, the background level of cellular contamination was low in the purified samples as demonstrated by the silver staining and metabolic labelling analyses. Western blot analyses showed that the cellular complement protein CD46, the receptor for HHV-6A, is associated with the purified and infectious virions. Also, the cellular proteins clathrin, ezrin and Tsg101 are associated with intact HHV-6A virions. Conclusion Cellular proteins are associated with HHV-6A virions. The relevance of the association in disease and especially in autoimmunity will be further investigated. PMID:17949490

  4. A Library of Plasmodium vivax Recombinant Merozoite Proteins Reveals New Vaccine Candidates and Protein-Protein Interactions

    PubMed Central

    Hostetler, Jessica B.; Sharma, Sumana; Bartholdson, S. Josefin; Wright, Gavin J.; Fairhurst, Rick M.; Rayner, Julian C.

    2015-01-01

    Background A vaccine targeting Plasmodium vivax will be an essential component of any comprehensive malaria elimination program, but major gaps in our understanding of P. vivax biology, including the protein-protein interactions that mediate merozoite invasion of reticulocytes, hinder the search for candidate antigens. Only one ligand-receptor interaction has been identified, that between P. vivax Duffy Binding Protein (PvDBP) and the erythrocyte Duffy Antigen Receptor for Chemokines (DARC), and strain-specific immune responses to PvDBP make it a complex vaccine target. To broaden the repertoire of potential P. vivax merozoite-stage vaccine targets, we exploited a recent breakthrough in expressing full-length ectodomains of Plasmodium proteins in a functionally-active form in mammalian cells and initiated a large-scale study of P. vivax merozoite proteins that are potentially involved in reticulocyte binding and invasion. Methodology/Principal Findings We selected 39 P. vivax proteins that are predicted to localize to the merozoite surface or invasive secretory organelles, some of which show homology to P. falciparum vaccine candidates. Of these, we were able to express 37 full-length protein ectodomains in a mammalian expression system, which has been previously used to express P. falciparum invasion ligands such as PfRH5. To establish whether the expressed proteins were correctly folded, we assessed whether they were recognized by antibodies from Cambodian patients with acute vivax malaria. IgG from these samples showed at least a two-fold change in reactivity over naïve controls in 27 of 34 antigens tested, and the majority showed heat-labile IgG immunoreactivity, suggesting the presence of conformation-sensitive epitopes and native tertiary protein structures. Using a method specifically designed to detect low-affinity, extracellular protein-protein interactions, we confirmed a predicted interaction between P. 
vivax 6-cysteine proteins P12 and P41, further suggesting that the proteins are natively folded and functional. This screen also identified two novel protein-protein interactions, between P12 and PVX_110945, and between MSP3.10 and MSP7.1, the latter of which was confirmed by surface plasmon resonance. Conclusions/Significance We produced a new library of recombinant full-length P. vivax ectodomains, established that the majority of them contain tertiary structure, and used them to identify predicted and novel protein-protein interactions. As well as identifying new interactions for further biological studies, this library will be useful in identifying P. vivax proteins with vaccine potential, and studying P. vivax malaria pathogenesis and immunity. Trial Registration ClinicalTrials.gov NCT00663546 PMID:26701602

  5. Predicting protein-protein interactions on a proteome scale by matching evolutionary and structural similarities at interfaces using PRISM.

    PubMed

    Tuncbag, Nurcan; Gursoy, Attila; Nussinov, Ruth; Keskin, Ozlem

    2011-08-11

    Prediction of protein-protein interactions at the structural level on the proteome scale is important because it allows prediction of protein function, helps drug discovery and takes steps toward genome-wide structural systems biology. We provide a protocol (termed PRISM, protein interactions by structural matching) for large-scale prediction of protein-protein interactions and assembly of protein complex structures. The method consists of two components: rigid-body structural comparisons of target proteins to known template protein-protein interfaces and flexible refinement using a docking energy function. The PRISM rationale follows our observation that globally different protein structures can interact via similar architectural motifs. PRISM predicts binding residues by using structural similarity and evolutionary conservation of putative binding residue 'hot spots'. Ultimately, PRISM could help to construct cellular pathways and functional, proteome-scale annotation. PRISM is implemented in Python and runs in a UNIX environment. The program accepts Protein Data Bank-formatted protein structures and is available at http://prism.ccbb.ku.edu.tr/prism_protocol/.

  6. The minus-end actin capping protein, UNC-94/tropomodulin, regulates development of the Caenorhabditis elegans intestine

    PubMed Central

    Cox-Paulson, Elisabeth; Cannataro, Vincent; Gallagher, Thomas; Hoffman, Corey; Mantione, Gary; McIntosh, Matthew; Silva, Malan; Vissichelli, Nicole; Walker, Rachel; Simske, Jeffrey; Ono, Shoichiro; Hoops, Harold

    2014-01-01

    Background Tropomodulins are actin capping proteins that regulate the stability of the slow growing, minus-ends of actin filaments. The C. elegans tropomodulin homolog, UNC-94 has sequence and functional similarity to vertebrate tropomodulins. We investigated the role of UNC-94 in C. elegans intestinal morphogenesis. Results In the embryonic C. elegans intestine, UNC-94 localizes to the terminal web, an actin and intermediate filament rich structure that underlies the apical membrane. Loss of UNC-94 function results in areas of flattened intestinal lumen. In worms homozygous for the strong loss-of-function allele, unc-94(tm724), the terminal web is thinner and the amount of F-actin is reduced, pointing to a role for UNC-94 in regulating the structure of the terminal web. The non-muscle myosin, NMY-1, also localizes to the terminal web; and we present evidence that increasing actomyosin contractility by depleting the myosin phosphatase regulatory subunit, mel-11, can rescue the flattened lumen phenotype of unc-94 mutants. Conclusions The data support a model in which minus-end actin capping by UNC-94 promotes proper F-actin structure and contraction in the terminal web, yielding proper shape of the intestinal lumen. This establishes a new role for a tropomodulin in regulating lumen shape during tubulogenesis. PMID:24677443

  7. Influenza A H3N2 subtype virus NS1 protein targets into the nucleus and binds primarily via its C-terminal NLS2/NoLS to nucleolin and fibrillarin

    PubMed Central

    2012-01-01

    Background Influenza A virus non-structural protein 1 (NS1) is a virulence factor, which is targeted into the cell cytoplasm, nucleus and nucleolus. NS1 is a multi-functional protein that inhibits host cell pre-mRNA processing and counteracts host cell antiviral responses. Previously, we have shown that the NS1 protein of the H3N2 subtype influenza viruses possesses a C-terminal nuclear localization signal (NLS) that also functions as a nucleolar localization signal (NoLS) and targets the protein into the nucleolus. Results Here, we show that the NS1 protein of the human H3N2 virus subtype interacts in vitro primarily via its C-terminal NLS2/NoLS and to a minor extent via its N-terminal NLS1 with the nucleolar proteins, nucleolin and fibrillarin. Using chimeric green fluorescence protein (GFP)-NS1 fusion constructs, we show that the nucleolar retention of the NS1 protein is determined by its C-terminal NLS2/NoLS in vivo. Confocal laser microscopy analysis shows that the NS1 protein colocalizes with nucleolin in nucleoplasm and nucleolus and with B23 and fibrillarin in the nucleolus of influenza A/Udorn/72 virus-infected A549 cells. Since some viral proteins contain NoLSs, it is likely that viruses have evolved specific nucleolar functions. Conclusion NS1 protein of the human H3N2 virus interacts primarily via the C-terminal NLS2/NoLS and to a minor extent via the N-terminal NLS1 with the main nucleolar proteins, nucleolin, B23 and fibrillarin. PMID:22909121

  8. Functional region prediction with a set of appropriate homologous sequences-an index for sequence selection by integrating structure and sequence information with spatial statistics

    PubMed Central

    2012-01-01

    Background The detection of conserved residue clusters on a protein structure is one of the effective strategies for the prediction of functional protein regions. Various methods, such as Evolutionary Trace, have been developed based on this strategy. In such approaches, the conserved residues are identified through comparisons of homologous amino acid sequences. Therefore, the selection of homologous sequences is a critical step. It is empirically known that a certain degree of sequence divergence in the set of homologous sequences is required for the identification of conserved residues. However, the development of a method to select homologous sequences appropriate for the identification of conserved residues has not been sufficiently addressed. An objective and general method to select appropriate homologous sequences is desired for the efficient prediction of functional regions. Results We have developed a novel index to select the sequences appropriate for the identification of conserved residues, and implemented the index within our method to predict the functional regions of a protein. The implementation of the index improved the performance of the functional region prediction. The index represents the degree of conserved residue clustering on the tertiary structure of the protein. For this purpose, the structure and sequence information were integrated within the index by the application of spatial statistics. Spatial statistics is a field of statistics in which not only the attributes but also the geometrical coordinates of the data are considered simultaneously. Higher degrees of clustering generate larger index scores. We adopted the set of homologous sequences with the highest index score, under the assumption that the best prediction accuracy is obtained when the degree of clustering is the maximum. 
The set of sequences selected by the index led to higher functional region prediction performance than the sets of sequences selected by other sequence-based methods. Conclusions Appropriate homologous sequences are selected automatically and objectively by the index. Such sequence selection improved the performance of functional region prediction. As far as we know, this is the first approach in which spatial statistics have been applied to protein analyses. Such integration of structure and sequence information would be useful for other bioinformatics problems. PMID:22643026
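
    The abstract above does not say which spatial statistic underlies the index. As a rough illustration only, the sketch below uses Moran's I, a common spatial autocorrelation measure, as a stand-in: it scores how strongly per-residue conservation values cluster in 3D space, then picks the candidate sequence set with the highest score. The function names, the 8 Å neighbour cutoff and the input layout are assumptions, not the authors' method.

```python
import math

def morans_i(coords, scores, cutoff=8.0):
    """Moran's I with binary weights: w_ij = 1 when residues i and j
    lie within `cutoff` of each other (cutoff value is an assumption)."""
    n = len(scores)
    mean = sum(scores) / n
    dev = [s - mean for s in scores]
    cross = 0.0   # sum of w_ij * dev_i * dev_j
    w_sum = 0.0   # sum of all weights
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) <= cutoff:
                cross += dev[i] * dev[j]
                w_sum += 1.0
    denom = sum(d * d for d in dev)
    if w_sum == 0.0 or denom == 0.0:
        return 0.0
    return (n / w_sum) * (cross / denom)

def pick_best_set(coords, score_sets, cutoff=8.0):
    """Return the name of the sequence set whose conservation scores
    are most clustered on the structure (hypothetical helper)."""
    return max(score_sets, key=lambda name: morans_i(coords, score_sets[name], cutoff))
```

    Here `coords` would hold one representative atom position per residue (for example Cα) and each entry of `score_sets` the conservation scores computed from one candidate set of homologous sequences.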

  9. Molecular dynamics simulations on the Tre1 G protein-coupled receptor: exploring the role of the arginine of the NRY motif in Tre1 structure

    PubMed Central

    2013-01-01

    Background The arginine of the D/E/NRY motif in Rhodopsin family G protein-coupled receptors (GPCRs) is conserved in 96% of these proteins. In some GPCRs, this arginine in transmembrane 3 can form a salt bridge with an aspartic acid or glutamic acid in transmembrane 6. The Drosophila melanogaster GPCR Trapped in endoderm-1 (Tre1) is required for normal primordial germ cell migration. In a mutant form of the protein, Tre1sctt, eight amino acids RYILIACH are missing, resulting in a severe disruption of primordial germ cell development. The impact of the loss of these amino acids on Tre1 structure is unknown. Since the missing amino acids in Tre1sctt include the arginine that is part of the D/E/NRY motif in Tre1, molecular dynamics simulations were performed to explore the hypothesis that these amino acids are involved in salt bridge formation and help maintain Tre1 structure. Results Structural predictions of wild type Tre1 (Tre1+) and Tre1sctt were subjected to over 250 ns of molecular dynamics simulations. The ability of the model systems to form a salt bridge between the arginine of the D/E/NRY motif and an aspartic acid residue in transmembrane 6 was analyzed. The results indicate that a stable salt bridge can form in the Tre1+ systems and a weak salt bridge or no salt bridge, using an alternative arginine, is likely in the Tre1sctt systems. Conclusions The weak salt bridge or lack of a salt bridge in the Tre1sctt systems could be one possible explanation for the disrupted function of Tre1sctt in primordial germ cell migration. These results provide a framework for studying the importance of the arginine of the D/E/NRY motif in the structure and function of other GPCRs that are involved in cell migration, such as CXCR4 in the mouse, zebrafish, and chicken. PMID:24044607

  10. Loss of the insulator protein CTCF during nematode evolution

    PubMed Central

    Heger, Peter; Marin, Birger; Schierenberg, Einhard

    2009-01-01

    Background The zinc finger (ZF) protein CTCF (CCCTC-binding factor) is highly conserved in Drosophila and vertebrates where it has been shown to mediate chromatin insulation at a genomewide level. A mode of genetic regulation that involves insulators and insulator binding proteins to establish independent transcriptional units is currently not known in nematodes including Caenorhabditis elegans. We therefore searched in nematodes for orthologs of proteins that are involved in chromatin insulation. Results While orthologs for other insulator proteins were absent in all 35 analysed nematode species, we find orthologs of CTCF in a subset of nematodes. As an example for these we cloned the Trichinella spiralis CTCF-like gene and revealed a genomic structure very similar to the Drosophila counterpart. To investigate the pattern of CTCF occurrence in nematodes, we performed phylogenetic analysis with the ZF protein sets of completely sequenced nematodes. We show that three ZF proteins from three basal nematodes cluster together with known CTCF proteins whereas no zinc finger protein of C. elegans and other derived nematodes does so. Conclusion Our findings show that CTCF and possibly chromatin insulation are present in basal nematodes. We suggest that the insulator protein CTCF has been secondarily lost in derived nematodes like C. elegans. We propose a switch in the regulation of gene expression during nematode evolution, from the common vertebrate and insect type involving distantly acting regulatory elements and chromatin insulation to a so far poorly characterised mode present in more derived nematodes. Here, all or some of these components are missing. Instead operons, polycistronic transcriptional units common in derived nematodes, seemingly adopted their function. PMID:19712444

  11. Identification of amino acid residues in protein SRP72 required for binding to a kinked 5e motif of the human signal recognition particle RNA

    PubMed Central

    2010-01-01

    Background Human cells depend critically on the signal recognition particle (SRP) for the sorting and delivery of their proteins. The SRP is a ribonucleoprotein complex which binds to signal sequences of secretory polypeptides as they emerge from the ribosome. Among the six proteins of the eukaryotic SRP, the largest protein, SRP72, is essential for protein targeting and possesses a poorly characterized RNA binding domain. Results We delineated the minimal region of SRP72 capable of forming a stable complex with an SRP RNA fragment. The region encompassed residues 545 to 585 of the full-length human SRP72 and contained a lysine-rich cluster (KKKKKKKKGK) at positions 552 to 561 as well as a conserved Pfam motif with the sequence PDPXRWLPXXER at positions 572 to 583. We demonstrated by site-directed mutagenesis that both regions participated in the formation of a complex with the RNA. In agreement with biochemical data and results from chymotryptic digestion experiments, molecular modeling of SRP72 implied that the invariant W577 was located inside the predicted structure of an RNA binding domain. The 11-nucleotide 5e motif contained within the SRP RNA fragment was shown by comparative electrophoresis on native polyacrylamide gels to conform to an RNA kink-turn. The model of the complex suggested that the conserved A240 of the K-turn, previously identified as being essential for the binding to SRP72, could protrude into a groove of the SRP72 RNA binding domain, similar but not identical to how other K-turn recognizing proteins interact with RNA. Conclusions The results from the presented experiments provided insights into the molecular details of a functionally important and structurally interesting RNA-protein interaction. A model for how a ligand binding pocket of SRP72 can accommodate a new RNA K-turn in the 5e region of the eukaryotic SRP RNA is proposed. PMID:21073748

  12. Binding of undamaged double stranded DNA to vaccinia virus uracil-DNA glycosylase

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schormann, Norbert; Banerjee, Surajit; Ricciardi, Robert

    Background: Uracil-DNA glycosylases are evolutionarily conserved DNA repair enzymes. However, vaccinia virus uracil-DNA glycosylase (known as D4) also serves as an intrinsic and essential component of the processive DNA polymerase complex during DNA replication. In this complex D4 binds to a unique poxvirus-specific protein, A20, which tethers it to the DNA polymerase. At the replication fork the DNA scanning and repair function of D4 is coupled with DNA replication. So far, DNA binding to D4 has not been structurally characterized. Results: This manuscript describes the first structure of a DNA complex of a uracil-DNA glycosylase from the poxvirus family. This also represents the first structure of a uracil-DNA glycosylase in complex with an undamaged DNA. In the asymmetric unit two D4 subunits bind simultaneously to complementary strands of the DNA double helix. Each D4 subunit interacts mainly with the central region of one strand. DNA binds to the opposite side of the A20-binding surface on D4. Comparison of the present structure with the structure of uracil-containing DNA-bound human uracil-DNA glycosylase suggests that for DNA binding and uracil removal D4 employs a unique set of residues and motifs that are highly conserved within the poxvirus family but different in other organisms. Conclusion: The first structure of D4 bound to a truly non-specific undamaged double-stranded DNA suggests that initial binding of DNA may involve multiple non-specific interactions between the protein and the phosphate backbone.

  13. Binding of undamaged double stranded DNA to vaccinia virus uracil-DNA glycosylase

    DOE PAGES

    Schormann, Norbert; Banerjee, Surajit; Ricciardi, Robert; ...

    2015-06-02

    Background: Uracil-DNA glycosylases are evolutionarily conserved DNA repair enzymes. However, vaccinia virus uracil-DNA glycosylase (known as D4) also serves as an intrinsic and essential component of the processive DNA polymerase complex during DNA replication. In this complex D4 binds to a unique poxvirus-specific protein, A20, which tethers it to the DNA polymerase. At the replication fork the DNA scanning and repair function of D4 is coupled with DNA replication. So far, DNA binding to D4 has not been structurally characterized. Results: This manuscript describes the first structure of a DNA complex of a uracil-DNA glycosylase from the poxvirus family. This also represents the first structure of a uracil-DNA glycosylase in complex with an undamaged DNA. In the asymmetric unit two D4 subunits bind simultaneously to complementary strands of the DNA double helix. Each D4 subunit interacts mainly with the central region of one strand. DNA binds to the opposite side of the A20-binding surface on D4. Comparison of the present structure with the structure of uracil-containing DNA-bound human uracil-DNA glycosylase suggests that for DNA binding and uracil removal D4 employs a unique set of residues and motifs that are highly conserved within the poxvirus family but different in other organisms. Conclusion: The first structure of D4 bound to a truly non-specific undamaged double-stranded DNA suggests that initial binding of DNA may involve multiple non-specific interactions between the protein and the phosphate backbone.

  14. Utilization of protein intrinsic disorder knowledge in structural proteomics

    PubMed Central

    Oldfield, Christopher J.; Xue, Bin; Van, Ya-Yue; Ulrich, Eldon L.; Markley, John L.; Dunker, A. Keith; Uversky, Vladimir N.

    2014-01-01

    Intrinsically disordered proteins (IDPs) and proteins with long disordered regions are highly abundant in various proteomes. Despite their lack of well-defined ordered structure, these proteins and regions are frequently involved in crucial biological processes. Although in recent years these proteins have attracted the attention of many researchers, IDPs represent a significant challenge for structural characterization since these proteins can impact many of the processes in the structure determination pipeline. Here we investigate the effects of IDPs on the structure determination process and the utility of disorder prediction in selecting and improving proteins for structural characterization. Examination of the extent of intrinsic disorder in existing crystal structures found that relatively few protein crystal structures contain extensive regions of intrinsic disorder. Although intrinsic disorder is not the only cause of crystallization failures and many structured proteins cannot be crystallized, filtering out highly disordered proteins from structure-determination target lists is still likely to be cost effective. It is therefore desirable to exclude highly disordered proteins from structure-determination target lists, and we show that disorder prediction can be applied effectively to enrich structure determination pipelines with proteins more likely to yield crystal structures. For structural investigation of specific proteins, disorder prediction can be used to improve targets for structure determination. Finally, a framework for considering intrinsic disorder in the structure determination pipeline is proposed. PMID:23232152

  15. Structure based alignment and clustering of proteins (STRALCP)

    DOEpatents

    Zemla, Adam T.; Zhou, Carol E.; Smith, Jason R.; Lam, Marisa W.

    2013-06-18

    Disclosed are computational methods of clustering a set of protein structures based on local and pair-wise global similarity values. Pair-wise local and global similarity values are generated based on pair-wise structural alignments for each protein in the set of protein structures. Initially, the protein structures are clustered based on pair-wise local similarity values. The protein structures are then clustered based on pair-wise global similarity values. For each given cluster both a representative structure and spans of conserved residues are identified. The representative protein structure is used to assign newly-solved protein structures to a group. The spans are used to characterize conservation and assign a "structural footprint" to the cluster.

  16. The "Transport Specificity Ratio": a structure-function tool to search the protein fold for loci that control transition state stability in membrane transport catalysis

    PubMed Central

    King, Steven C

    2004-01-01

    Background In establishing structure-function relationships for membrane transport proteins, the interpretation of phenotypic changes can be problematic, owing to uncertainties in protein expression levels, sub-cellular localization, and protein-folding fidelity. A dual-label competitive transport assay called "Transport Specificity Ratio" (TSR) analysis has been developed that is simple to perform and circumvents the "expression problem," providing a reliable TSR phenotype (a constant) for comparison to other transporters. Results Using the Escherichia coli GABA (4-aminobutyrate) permease (GabP) as a model carrier, it is demonstrated that the TSR phenotype is largely independent of assay conditions, exhibiting: (i) indifference to the particular substrate concentrations used, (ii) indifference to extreme changes (40-fold) in transporter expression level, and within broad limits (iii) indifference to assay duration. The theoretical underpinnings of TSR analysis predict all of the above observations, supporting the conclusion that TSR has (i) applicability in the analysis of membrane transport, and (ii) particular utility in the face of incomplete information on protein expression levels and initial reaction rate intervals (e.g., in high-throughput screening situations). The TSR was used to identify gab permease (GabP) variants that exhibit relative changes in catalytic specificity (kcat/Km) for [14C]GABA (4-aminobutyrate) versus [3H]NA (nipecotic acid). Conclusions The TSR phenotype is an easily measured constant that reflects innate molecular properties of the transition state, and provides a reliable index of the difference in catalytic specificity that a carrier exhibits toward a particular pair of substrates. A change in the TSR phenotype, called a Δ(TSR), represents a specificity shift attributable to underlying changes in the intrinsic substrate binding energy (ΔGb) that translocation catalysts rely upon to decrease activation energy. 
TSR analysis is therefore a structure-function tool that enables parsimonious scanning for positions in the protein fold that couple to the transition state, creating stability and thereby serving as functional determinants of catalytic power (efficiency, or specificity). PMID:15548327
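
    The abstract above gives no formula for the TSR. The sketch below assumes the standard result for two substrates competing for one catalyst under Michaelis-Menten kinetics, where the ratio of rates equals the ratio of (kcat/Km)·[S] terms, so that normalising by the substrate concentrations leaves a constant. It illustrates why such a ratio would not depend on expression level or on the concentrations chosen. All function names and parameter values are hypothetical.

```python
def competing_rates(e_total, conc_a, conc_b, kcat_a, km_a, kcat_b, km_b):
    """Initial rates for substrates A and B competing for one carrier
    (Michaelis-Menten competition; e_total is the amount of carrier)."""
    denom = 1.0 + conc_a / km_a + conc_b / km_b
    v_a = e_total * kcat_a * (conc_a / km_a) / denom
    v_b = e_total * kcat_b * (conc_b / km_b) / denom
    return v_a, v_b

def transport_specificity_ratio(v_a, v_b, conc_a, conc_b):
    """Rate ratio normalised by concentrations; under the assumed model this
    equals (kcat/Km)_A / (kcat/Km)_B whatever the carrier amount."""
    return (v_a / v_b) * (conc_b / conc_a)
```

    Under these assumptions a 40-fold change in `e_total` scales both rates equally and cancels in the ratio, consistent with the indifference to expression level reported in the abstract.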

  17. Prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments

    PubMed Central

    Zheng, Ce; Kurgan, Lukasz

    2008-01-01

    Background The β-turn is a secondary protein structure type that plays a significant role in protein folding, stability, and molecular recognition. To date, several methods for prediction of β-turns from protein sequences have been developed, but they are characterized by relatively poor prediction quality. The novelty of the proposed sequence-based β-turn predictor stems from the use of window-based information extracted from four predicted three-state secondary structures, which together with a selected set of position specific scoring matrix (PSSM) values serve as an input to the support vector machine (SVM) predictor. Results We show that (1) all four predicted secondary structures are useful; (2) the most useful information extracted from the predicted secondary structure includes the structure of the predicted residue, secondary structure content in a window around the predicted residue, and features that indicate whether the predicted residue is inside a secondary structure segment; (3) the PSSM values of Asn, Asp, Gly, Ile, Leu, Met, Pro, and Val were among the top ranked features, which corroborates recent studies. The Asn, Asp, Gly, and Pro indicate potential β-turns, while the remaining four amino acids are useful to predict non-β-turns. Empirical evaluation using three nonredundant datasets shows favorable Qtotal, Qpredicted and MCC values when compared with over a dozen modern competing methods. Our method is the first to break the 80% Qtotal barrier and achieves Qtotal = 80.9%, MCC = 0.47, and Qpredicted higher by over 6% when compared with the second best method. We use feature selection to reduce the dimensionality of the feature vector used as the input for the proposed prediction method. The applied feature set is smaller by 86, 62 and 37% when compared with the second and two third-best (with respect to MCC) competing methods, respectively. 
Conclusion Experiments show that the proposed method constitutes an improvement over the competing prediction methods. The proposed prediction model can better discriminate between β-turns and non-β-turns due to obtaining lower numbers of false positive predictions. The prediction model and datasets are freely available at . PMID:18847492
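
    The abstract above reports Qtotal, Qpredicted and MCC without defining them. The sketch below assumes the definitions commonly used in β-turn prediction work: Qtotal is the percentage of residues classified correctly, Qpredicted is the percentage of predicted β-turn residues that are correct, and MCC is the Matthews correlation coefficient. The function name is hypothetical and the counts in the usage note are made up for illustration.

```python
import math

def beta_turn_metrics(tp, tn, fp, fn):
    """Return (Qtotal %, Qpredicted %, MCC) from per-residue counts:
    tp/fn = true turns predicted as turn/non-turn,
    tn/fp = true non-turns predicted as non-turn/turn."""
    total = tp + tn + fp + fn
    q_total = 100.0 * (tp + tn) / total
    q_predicted = 100.0 * tp / (tp + fp) if (tp + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return q_total, q_predicted, mcc
```

    For example, `beta_turn_metrics(30, 50, 10, 10)` gives Qtotal = 80.0, Qpredicted = 75.0 and an MCC of about 0.58. Because fewer false positives raise Qpredicted directly, this is the metric most sensitive to the reduction in false positives that the conclusion mentions.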

  18. An Analysis of Interactions between Fluorescently-Tagged Mutant and Wild-Type SOD1 in Intracellular Inclusions

    PubMed Central

    Qualls, David A.; Crosby, Keith; Brown, Hilda; Borchelt, David R.

    2013-01-01

    Background By mechanisms yet to be discerned, the co-expression of high levels of wild-type human superoxide dismutase 1 (hSOD1) with variants of hSOD1 encoding mutations linked to familial amyotrophic lateral sclerosis (fALS) hastens the onset of motor neuron degeneration in transgenic mice. Although it is known that spinal cords of paralyzed mice accumulate detergent-insoluble forms of WT hSOD1 along with mutant hSOD1, it has been difficult to determine whether there is co-deposition of the proteins in inclusion structures. Methodology/Principal Findings In the present study, we use cell culture models of mutant SOD1 aggregation, focusing on the A4V, G37R, and G85R variants, to examine interactions between WT-hSOD1 and misfolded mutant SOD1. In these studies, we fuse WT and mutant proteins to either yellow or red fluorescent protein so that the two proteins can be distinguished within inclusion structures. Conclusions/Significance Although the interpretation of the data is not entirely straightforward because we have strong evidence that the nature of the fused fluorophores affects the organization of the inclusions that form, our data are most consistent with the idea that normal dimeric WT-hSOD1 does not readily interact with misfolded forms of mutant hSOD1. We also demonstrate that the monomerization of WT-hSOD1 by experimental mutation does induce the protein to aggregate, although such monomerization may enable interactions with misfolded mutant SOD1. Our data suggest that WT-hSOD1 is not prone to become intimately associated with misfolded mutant hSOD1 within intracellular inclusions that can be generated in cultured cells. PMID:24391857

  19. The High Mobility Group A1 (HMGA1) Transcriptome in Cancer and Development

    PubMed Central

    Sumter, T.F.; Xian, L.; Huso, T.; Koo, M.; Chang, Y.-T.; Almasri, T.N.; Chia, L.; Inglis, C.; Reid, D.; Resar, L.M.S.

    2017-01-01

    Background & Objectives Chromatin structure is the single most important feature that distinguishes a cancer cell from a normal cell histologically. Chromatin remodeling proteins regulate chromatin structure and high mobility group A (HMGA1) proteins are among the most abundant, nonhistone chromatin remodeling proteins found in cancer cells. These proteins include HMGA1a/HMGA1b isoforms, which result from alternatively spliced mRNA. The HMGA1 gene is overexpressed in cancer and high levels portend a poor prognosis in diverse tumors. HMGA1 is also highly expressed during embryogenesis and postnatally in adult stem cells. Overexpression of HMGA1 drives neoplastic transformation in cultured cells, while inhibiting HMGA1 blocks oncogenic and cancer stem cell properties. Hmga1 transgenic mice succumb to aggressive tumors, demonstrating that dysregulated expression of HMGA1 causes cancer in vivo. HMGA1 is also required for reprogramming somatic cells into induced pluripotent stem cells. HMGA1 proteins function as ancillary transcription factors that bend chromatin and recruit other transcription factors to DNA. They induce oncogenic transformation by activating or repressing specific genes involved in this process and an HMGA1 “transcriptome” is emerging. Although prior studies reveal potent oncogenic properties of HMGA1, we are only beginning to understand the molecular mechanisms through which HMGA1 functions. In this review, we summarize the list of putative downstream transcriptional targets regulated by HMGA1. We also briefly discuss studies linking HMGA1 to Alzheimer’s disease and type-2 diabetes. Conclusion Further elucidation of HMGA1 function should lead to novel therapeutic strategies for cancer and possibly for other diseases associated with aberrant HMGA1 expression. PMID:26980699

  20. Overcoming the heterologous bias: an in vivo functional analysis of multidrug efflux transporter, CgCdr1p in matched pair clinical isolates of Candida glabrata.

    PubMed

    Puri, Nidhi; Manoharlal, Raman; Sharma, Monika; Sanglard, Dominique; Prasad, Rajendra

    2011-01-07

    We have taken advantage of the natural milieu of a matched pair of azole sensitive (AS) and azole resistant (AR) clinical isolates of Candida glabrata for expressing its major ABC multidrug transporter, CgCdr1p, for structural and functional analysis. This was accomplished by tagging a green fluorescent protein (GFP) downstream of the ORF of CgCDR1 and integrating the resultant fusion protein at its native chromosomal locus in AS and AR backgrounds. The characterization confirmed that, in comparison to the AS isolate, CgCdr1p-GFP was over-expressed in AR isolates due to its hyperactive native promoter and that the GFP tag did not affect its functionality in either construct. We observed that in addition to Rhodamine 6G (R6G) and Fluconazole (FLC), a recently identified fluorescent substrate of multidrug transporters, Nile Red (NR), could also be expelled by CgCdr1p. Competition assays with these substrates revealed the presence of overlapping multiple drug binding sites in CgCdr1p. Point mutations employing site directed mutagenesis confirmed that the roles played by unique amino acid residues critical to ATP catalysis and localization of ABC drug transporter proteins are as well conserved in C. glabrata as in other yeasts. This study demonstrates a novel in vivo system in which over-expression of a GFP tagged MDR transporter protein can be driven by its own hyperactive promoter in AR isolates. Taken together, this in vivo system can be exploited for the structural and functional analysis of CgCdr1p and similar proteins wherein the artefactual concerns encountered in using heterologous systems are totally excluded. Copyright © 2010 Elsevier Inc. All rights reserved.

  1. Thermal, Chemical and pH Induced Denaturation of a Multimeric β-Galactosidase Reveals Multiple Unfolding Pathways

    PubMed Central

    Kishore, Devesh; Kundu, Suman; Kayastha, Arvind M.

    2012-01-01

    Background In this case study, we analysed the properties of unfolded states and pathways leading to complete denaturation of a multimeric chick pea β-galactosidase (CpGAL), as obtained from treatment with guanidinium hydrochloride, urea, elevated temperature and extreme pH. Methodology/Principal Findings CpGAL, a heterodimeric protein with native molecular mass of 85 kDa, belongs to the α+β class of proteins. The conformational stability and thermodynamic parameters of CpGAL unfolding in different states were estimated and interpreted using circular dichroism and fluorescence spectroscopic measurements. The enzyme was found to be structurally and functionally stable in the entire pH range and up to 50°C. Further increase in temperature induces unfolding followed by aggregation. Chemically induced denaturation was found to be cooperative and transitions were irreversible, non-coincidental and sigmoidal. Free energy of protein unfolding (ΔG0) and unfolding constant (Kobs) were also calculated for chemically denatured CpGAL. Significance The protein seems to use different pathways for unfolding in different environments and is a classical example of how the environment dictates the path a protein might take to fold while its amino acid sequence only defines its final three-dimensional conformation. The knowledge accumulated could be of immense biotechnological significance as well. PMID:23185611
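
    The abstract above mentions a free energy of unfolding and an unfolding constant without stating how they were obtained. One textbook route, sketched below under the assumption of a two-state model, converts a spectroscopic signal into a fraction unfolded, then into an equilibrium constant K = f_u / (1 - f_u) and a free energy ΔG = -RT ln K. The function names and the default temperature are assumptions, and the authors may have used a different procedure, particularly since they report irreversible transitions.

```python
import math

R = 8.314  # gas constant in J/(mol*K)

def fraction_unfolded(signal, signal_native, signal_unfolded):
    """Fraction unfolded from a CD or fluorescence signal (two-state assumption)."""
    return (signal - signal_native) / (signal_unfolded - signal_native)

def unfolding_free_energy(f_unfolded, temperature_k=298.15):
    """Return (K, delta_G in J/mol) with K = f_u / (1 - f_u), delta_G = -R*T*ln(K)."""
    if not 0.0 < f_unfolded < 1.0:
        raise ValueError("fraction unfolded must lie strictly between 0 and 1")
    k = f_unfolded / (1.0 - f_unfolded)
    return k, -R * temperature_k * math.log(k)
```

    At the transition midpoint the fraction unfolded is 0.5, so K is 1 and ΔG is zero; values below the midpoint give a positive ΔG, meaning the native state is favoured.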

  2. Intrinsic disorder in Viral Proteins Genome-Linked: experimental and predictive analyses

    PubMed Central

    Hébrard, Eugénie; Bessin, Yannick; Michon, Thierry; Longhi, Sonia; Uversky, Vladimir N; Delalande, François; Van Dorsselaer, Alain; Romero, Pedro; Walter, Jocelyne; Declerk, Nathalie; Fargette, Denis

    2009-01-01

    Background VPgs are viral proteins linked to the 5' end of some viral genomes. Interactions between several VPgs and eukaryotic translation initiation factors eIF4Es are critical for plant infection. However, VPgs are not restricted to phytoviruses, being also involved in genome replication and protein translation of several animal viruses. To date, structural data are still limited to small picornaviral VPgs. Recently three phytoviral VPgs were shown to be natively unfolded proteins. Results In this paper, we report the bacterial expression, purification and biochemical characterization of two phytoviral VPgs, namely the VPgs of Rice yellow mottle virus (RYMV, genus Sobemovirus) and Lettuce mosaic virus (LMV, genus Potyvirus). Using far-UV circular dichroism and size exclusion chromatography, we show that RYMV and LMV VPgs are predominantly or partly unstructured in solution, respectively. Using several disorder predictors, we show that both proteins are predicted to possess disordered regions. We next extend these results to 14 VPgs representative of the viral diversity. Disordered regions were predicted in all VPg sequences whatever the genus and the family. Conclusion Based on these results, we propose that intrinsic disorder is a common feature of VPgs. The functional role of intrinsic disorder is discussed in light of the biological roles of VPgs. PMID:19220875

  3. Molecular and phylogenetic characterization of the sieve element occlusion gene family in Fabaceae and non-Fabaceae plants

    PubMed Central

    2010-01-01

    Background The phloem of dicotyledonous plants contains specialized P-proteins (phloem proteins) that accumulate during sieve element differentiation and remain parietally associated with the cisternae of the endoplasmic reticulum in mature sieve elements. Wounding causes P-protein filaments to accumulate at the sieve plates and block the translocation of photosynthate. Specialized, spindle-shaped P-proteins known as forisomes that undergo reversible calcium-dependent conformational changes have evolved exclusively in the Fabaceae. Recently, the molecular characterization of three genes encoding forisome components in the model legume Medicago truncatula (MtSEO1, MtSEO2 and MtSEO3; SEO = sieve element occlusion) was reported, but little is known about the molecular characteristics of P-proteins in non-Fabaceae. Results We performed a comprehensive genome-wide comparative analysis by screening the M. truncatula, Glycine max, Arabidopsis thaliana, Vitis vinifera and Solanum phureja genomes, and a Malus domestica EST library for homologs of MtSEO1, MtSEO2 and MtSEO3 and identified numerous novel SEO genes in Fabaceae and even non-Fabaceae plants, which do not possess forisomes. Even in Fabaceae some SEO genes appear to not encode forisome components. All SEO genes have a similar exon-intron structure and are expressed predominantly in the phloem. Phylogenetic analysis revealed the presence of several subgroups with Fabaceae-specific subgroups containing all of the known as well as newly identified forisome component proteins. We constructed Hidden Markov Models that identified three conserved protein domains, which characterize SEO proteins when present in combination. In addition, one common and three subgroup specific protein motifs were found in the amino acid sequences of SEO proteins. SEO genes are organized in genomic clusters and the conserved synteny allowed us to identify several M. truncatula vs G. max orthologs as well as paralogs within the G. 
max genome. Conclusions The unexpected occurrence of forisome-like genes in non-Fabaceae plants may indicate that these proteins encode species-specific P-proteins, which is backed up by the phloem-specific expression profiles. The conservation of gene structure, the presence of specific motifs and domains and the genomic synteny argue for a common phylogenetic origin of forisomes and other P-proteins. PMID:20932300

  4. Caveolin, sterol carrier protein-2, membrane cholesterol-rich microdomains and intracellular cholesterol trafficking.

    PubMed

    Schroeder, Friedhelm; Huang, Huan; McIntosh, Avery L; Atshaves, Barbara P; Martin, Gregory G; Kier, Ann B

    2010-01-01

    While the existence of membrane lateral microdomains has been known for over 30 years, interest in these structures accelerated in the past decade due to the discovery that cholesterol-rich microdomains serve important biological functions. It is increasingly appreciated that cholesterol-rich microdomains in the plasma membranes of eukaryotic cells represent an organizing nexus for multiple cellular proteins involved in transmembrane nutrient uptake (cholesterol, fatty acid, glucose, etc.), cell-signaling, immune recognition, pathogen entry, and many other roles. Despite these advances, however, relatively little is known regarding the organization of cholesterol itself in these plasma membrane microdomains. Although a variety of non-sterol markers indicate the presence of microdomains in the plasma membranes of living cells, none of these studies have demonstrated that cholesterol is enriched in these microdomains in living cells. Further, the role of cholesterol-rich membrane microdomains as targets for intracellular cholesterol trafficking proteins such as sterol carrier protein-2 (SCP-2) that facilitate cholesterol uptake and transcellular transport for targeting storage (cholesterol esters) or efflux is only beginning to be understood. Herein, we summarize the background as well as recent progress in this field that has advanced our understanding of these issues.

  5. In vivo system for analyzing the function of the PsbP protein using Chlamydomonas reinhardtii.

    PubMed

    Nishimura, Taishi; Sato, Fumihiko; Ifuku, Kentaro

    2017-09-01

    The PsbP protein is an extrinsic subunit of photosystem II (PSII) specifically developed in green-plant species including land plants and green algae. The protein-protein interactions involving PsbP and its effect on oxygen evolution have been investigated in vitro using isolated PSII membranes. However, the importance of those interactions needs to be examined at the cellular level. To this end, we developed a system expressing exogenous PsbP in the background of the Chlamydomonas BF25 mutant lacking native PsbP. Expression of His-tagged PsbP successfully restored the oxygen-evolving activity and photoautotrophic growth of the mutant, while PsbP-∆15 lacking the N-terminal 15 residues, which are crucial for the oxygen-evolving activity of spinach PSII in vitro, only partially did. This demonstrated the importance of N-terminal sequence of PsbP for the photosynthetic activity in vivo. Furthermore, the PSII-LHCII supercomplex can be specifically purified from the Chlamydomonas cells having His-tagged PsbP using a metal affinity chromatography. This study provides a platform not only for the functional analysis of PsbP in vivo but also for structural analysis of the PSII-LHCII supercomplex from green algae.

  6. Functional redundancy of division specific penicillin-binding proteins in Bacillus subtilis.

    PubMed

    Sassine, Jad; Xu, Meizhu; Sidiq, Karzan R; Emmins, Robyn; Errington, Jeff; Daniel, Richard A

    2017-10-01

    Bacterial cell division involves the dynamic assembly of a diverse set of proteins that coordinate the invagination of the cell membrane and synthesis of cell wall material to create the new cell poles of the separated daughter cells. Penicillin-binding protein PBP 2B is a key cell division protein in Bacillus subtilis proposed to have a specific catalytic role in septal wall synthesis. Unexpectedly, we find that a catalytically inactive mutant of PBP 2B supports cell division, but in this background the normally dispensable PBP 3 becomes essential. Phenotypic analysis of pbpC mutants (encoding PBP 3) shows that PBP 2B has a crucial structural role in assembly of the division complex, independent of catalysis, and that its biochemical activity in septum formation can be provided by PBP 3. Bioinformatic analysis revealed a close sequence relationship between PBP 3 and Staphylococcus aureus PBP 2A, which is responsible for methicillin resistance. These findings suggest that mechanisms for rescuing cell division when the biochemical activity of PBP 2B is perturbed evolved prior to the clinical use of β-lactams. © 2017 The Authors. Molecular Microbiology published by John Wiley & Sons Ltd.

  7. Mammalian Exo1 encodes both structural and catalytic functions that play distinct roles in essential biological processes

    PubMed Central

    Schaetzlein, Sonja; Chahwan, Richard; Avdievich, Elena; Roa, Sergio; Wei, Kaichun; Eoff, Robert L.; Sellers, Rani S.; Clark, Alan B.; Kunkel, Thomas A.; Scharff, Matthew D.; Edelmann, Winfried

    2013-01-01

    Mammalian Exonuclease 1 (EXO1) is an evolutionarily conserved, multifunctional exonuclease involved in DNA damage repair, replication, immunoglobulin diversity, meiosis, and telomere maintenance. It has been assumed that EXO1 participates in these processes primarily through its exonuclease activity, but recent studies also suggest that EXO1 has a structural function in the assembly of higher-order protein complexes. To dissect the enzymatic and nonenzymatic roles of EXO1 in the different biological processes in vivo, we generated an EXO1-E109K knockin (Exo1EK) mouse expressing a stable exonuclease-deficient protein and, for comparison, a fully EXO1-deficient (Exo1null) mouse. In contrast to Exo1null/null mice, Exo1EK/EK mice retained mismatch repair activity and displayed normal class switch recombination and meiosis. However, both Exo1-mutant lines showed defects in DNA damage response including DNA double-strand break repair (DSBR) through DNA end resection, chromosomal stability, and tumor suppression, indicating that the enzymatic function is required for those processes. On a transformation-related protein 53 (Trp53)-null background, the DSBR defect caused by the E109K mutation altered the tumor spectrum but did not affect the overall survival as compared with p53-Exo1null mice, whose defects in both DSBR and mismatch repair also compromised survival. The separation of these functions demonstrates the differential requirement for the structural function and nuclease activity of mammalian EXO1 in distinct DNA repair processes and tumorigenesis in vivo. PMID:23754438

  8. Identification of distinct SET/TAF-Iβ domains required for core histone binding and quantitative characterisation of the interaction

    PubMed Central

    Karetsou, Zoe; Emmanouilidou, Anastasia; Sanidas, Ioannis; Liokatis, Stamatis; Nikolakaki, Eleni; Politou, Anastasia S; Papamarcaki, Thomais

    2009-01-01

    Background The assembly of nucleosomes to higher-order chromatin structures is finely tuned by the relative affinities of histones for chaperones and nucleosomal binding sites. The myeloid leukaemia protein SET/TAF-Iβ belongs to the NAP1 family of histone chaperones and participates in several chromatin-based mechanisms, such as chromatin assembly, nucleosome reorganisation and transcriptional activation. To better understand the histone chaperone function of SET/TAF-Iβ, we designed several SET/TAF-Iβ truncations, examined their structural integrity by circular dichroism and assessed qualitatively and quantitatively the histone binding properties of wild-type protein and mutant forms using GST-pull down experiments and fluorescence spectroscopy-based binding assays. Results Wild-type SET/TAF-Iβ binds to histones H2B and H3 with Kd values of 2.87 and 0.15 μM, respectively. The preferential binding of SET/TAF-Iβ to histone H3 is mediated by its central region and the globular part of H3. On the contrary, the acidic C-terminal tail and the amino-terminal dimerisation domain of SET/TAF-Iβ, as well as the H3 amino-terminal tail, are dispensable for this interaction. Conclusion This type of analysis allowed us to assess the relative affinities of SET/TAF-Iβ for different histones and identify the domains of the protein required for effective histone recognition. Our findings are consistent with recent structural studies of SET/TAF-Iβ and can be valuable to understand the role of SET/TAF-Iβ in chromatin function. PMID:19358706
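
    The two Kd values reported in this record can be compared with a short calculation. The sketch below assumes a simple 1:1 equilibrium binding model, which the abstract does not state; the function name and the constant names are illustrative only. It shows how the reported values translate into a fold preference for H3 and into a fraction of chaperone bound at a given free histone concentration.

```python
def fraction_bound(free_ligand_um, kd_um):
    """Fraction of binding sites occupied under an assumed 1:1 binding model."""
    return free_ligand_um / (kd_um + free_ligand_um)


# Kd values taken from the abstract, in micromolar.
KD_H2B_UM = 2.87
KD_H3_UM = 0.15

# A lower Kd means tighter binding, so this ratio is the fold preference for H3.
fold_preference = KD_H2B_UM / KD_H3_UM

if __name__ == "__main__":
    print(f"H3 is bound about {fold_preference:.1f}-fold more tightly than H2B")
    for conc in (0.15, 1.0, 2.87):
        print(
            f"free histone {conc:5.2f} uM: "
            f"H2B {fraction_bound(conc, KD_H2B_UM):.2f}, "
            f"H3 {fraction_bound(conc, KD_H3_UM):.2f}"
        )
```

    By definition of Kd in this model, half of the sites are occupied when the free histone concentration equals Kd.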

  9. Discovery-2: an interactive resource for the rational selection and comparison of putative drug target proteins in malaria

    PubMed Central

    2013-01-01

    Background Drug resistance to anti-malarial compounds remains a serious problem, with resistance to newer pharmaceuticals developing at an alarming rate. The development of new anti-malarials remains a priority, and the rational selection of putative targets is a key element of this process. Discovery-2 is an update of the original Discovery in silico resource for the rational selection of putative drug target proteins, enabling researchers to obtain information for a protein which may be useful for the selection of putative drug targets, and to perform advanced filtering of proteins encoded by the malaria genome based on a series of molecular properties. Methods An updated in silico resource has been developed where researchers are able to mine information on malaria proteins and predicted ligands, as well as perform comparisons to the human and mosquito host characteristics. Protein properties used include: domains, motifs, EC numbers, GO terms, orthologs, protein-protein interactions, and protein-ligand interactions. Newly added features include druggability measures from ChEMBL, automated literature relations and links to clinical trial information. Searching by chemical structure is also available. Results The updated functionality of the Discovery-2 resource is presented, together with a detailed case study of the Plasmodium falciparum S-adenosyl-L-homocysteine hydrolase (PfSAHH) protein. A short example of a chemical search with pyrimethamine is also illustrated. Conclusion The updated Discovery-2 resource allows researchers to obtain detailed properties of proteins from the malaria genome, which may be of interest in the target selection process, and to perform advanced filtering and selection of proteins based on a relevant range of molecular characteristics. PMID:23537208

  10. NovelFam3000 – Uncharacterized human protein domains conserved across model organisms

    PubMed Central

    Kemmer, Danielle; Podowski, Raf M; Arenillas, David; Lim, Jonathan; Hodges, Emily; Roth, Peggy; Sonnhammer, Erik LL; Höög, Christer; Wasserman, Wyeth W

    2006-01-01

    Background Despite significant efforts from the research community, an extensive portion of the proteins encoded by human genes lack an assigned cellular function. Most metazoan proteins are composed of structural and/or functional domains, of which many appear in multiple proteins. Once a domain is characterized in one protein, the presence of a similar sequence in an uncharacterized protein serves as a basis for inference of function. Thus knowledge of a domain's function, or the protein within which it arises, can facilitate the analysis of an entire set of proteins. Description From the Pfam domain database, we extracted uncharacterized protein domains represented in proteins from humans, worms, and flies. A data centre was created to facilitate the analysis of the uncharacterized domain-containing proteins. The centre both provides researchers with links to dispersed internet resources containing gene-specific experimental data and enables them to post relevant experimental results or comments. For each human gene in the system, a characterization score is posted, allowing users to track the progress of characterization over time or to identify for study uncharacterized domains in well-characterized genes. As a test of the system, a subset of 39 domains was selected for analysis and the experimental results posted to the NovelFam3000 system. For 25 human protein members of these 39 domain families, detailed sub-cellular localizations were determined. Specific observations are presented based on the analysis of the integrated information provided through the online NovelFam3000 system. Conclusion Consistent experimental results between multiple members of a domain family allow for inferences of the domain's functional role. We unite bioinformatics resources and experimental data in order to accelerate the functional characterization of scarcely annotated domain families. PMID:16533400

  11. Validation of Molecular Dynamics Simulations for Prediction of Three-Dimensional Structures of Small Proteins.

    PubMed

    Kato, Koichi; Nakayoshi, Tomoki; Fukuyoshi, Shuichi; Kurimoto, Eiji; Oda, Akifumi

    2017-10-12

    Although various higher-order protein structure prediction methods have been developed, almost all of them were developed based on the three-dimensional (3D) structure information of known proteins. Here we predicted short protein structures by molecular dynamics (MD) simulations in which only Newton's equations of motion were used and 3D structural information of known proteins was not required. To evaluate the ability of MD simulation to predict protein structures, we simulated seven short test proteins (10-46 residues) starting from the denatured state and compared their predicted and experimental structures. The predicted structure for Trp-cage (20 residues) was close to the experimental structure after a 200-ns MD simulation. For proteins shorter or longer than Trp-cage, root-mean-square deviation values were larger than those for Trp-cage. However, secondary structures could be reproduced by MD simulations for proteins with 10-34 residues. Simulations by replica exchange MD were performed, but the results were similar to those from normal MD simulations. These results suggest that normal MD simulations can roughly predict short protein structures and that 200-ns simulations are frequently sufficient for estimating the secondary structures of proteins of approximately 20 residues. Structural prediction methods using only fundamental physical laws are useful for investigating non-natural proteins, such as primitive proteins and artificial proteins for peptide-based drug delivery systems.

  12. SHuffle, a novel Escherichia coli protein expression strain capable of correctly folding disulfide bonded proteins in its cytoplasm

    PubMed Central

    2012-01-01

    Background Production of correctly disulfide bonded proteins to high yields remains a challenge. Recombinant protein expression in Escherichia coli is the popular choice, especially within the research community. While there is an ever growing demand for new expression strains, few strains are dedicated to post-translational modifications, such as disulfide bond formation. Thus, new protein expression strains must be engineered and the parameters involved in producing disulfide bonded proteins must be understood. Results We have engineered a new E. coli protein expression strain named SHuffle, dedicated to producing correctly disulfide bonded active proteins to high yields within its cytoplasm. This strain is based on the trxB gor suppressor strain SMG96 where its cytoplasmic reductive pathways have been diminished, allowing for the formation of disulfide bonds in the cytoplasm. We have further engineered a major improvement by integrating into its chromosome a signal sequenceless disulfide bond isomerase, DsbC. We probed the redox state of DsbC in the oxidizing cytoplasm and evaluated its role in assisting the formation of correctly folded multi-disulfide bonded proteins. We optimized protein expression conditions, varying temperature, induction conditions, strain background and the co-expression of various helper proteins. We found that temperature has the biggest impact on improving yields and that the E. coli B strain background of this strain was superior to the K12 version. We also discovered that auto-expression of substrate target proteins using this strain resulted in higher yields of active pure protein. Finally, we found that co-expression of mutant thioredoxins and PDI homologs improved yields of various substrate proteins. Conclusions This work is the first extensive characterization of the trxB gor suppressor strain. The results presented should help researchers design the appropriate protein expression conditions using SHuffle strains. PMID:22569138

  13. Local functional descriptors for surface comparison based binding prediction

    PubMed Central

    2012-01-01

    Background Molecular recognition in proteins occurs due to appropriate arrangements of physical, chemical, and geometric properties of an atomic surface. Similar surface regions should create similar binding interfaces. Effective methods for comparing surface regions can be used in identifying similar regions, and to predict interactions without regard to the underlying structural scaffold that creates the surface. Results We present a new descriptor for protein functional surfaces and algorithms for using these descriptors to compare protein surface regions to identify ligand binding interfaces. Our approach uses descriptors of local regions of the surface, and assembles collections of matches to compare larger regions. Our approach uses a variety of physical, chemical, and geometric properties, adaptively weighting these properties as appropriate for different regions of the interface. Our approach builds a classifier based on a training corpus of examples of binding sites of the target ligand. The constructed classifiers can be applied to a query protein providing a probability for each position on the protein that the position is part of a binding interface. We demonstrate the effectiveness of the approach on a number of benchmarks, demonstrating performance that is comparable to the state-of-the-art, with an approach with more generality than these prior methods. Conclusions Local functional descriptors offer a new method for protein surface comparison that is sufficiently flexible to serve in a variety of applications. PMID:23176080

  14. Dietary L-arginine supplementation reduces Methotrexate-induced intestinal mucosal injury in rat

    PubMed Central

    2012-01-01

    Background Arginine (ARG) and nitric oxide maintain the mucosal integrity of the intestine in various intestinal disorders. In the present study, we evaluated the effects of oral ARG supplementation on intestinal structural changes, enterocyte proliferation and apoptosis following methotrexate (MTX)-induced intestinal damage in a rat. Methods Male rats were divided into four experimental groups: control rats; CONTR-ARG rats, which were treated with oral ARG given in drinking water 72 hours before and 72 hours following vehicle injection; MTX rats, which were treated with a single dose of methotrexate; and MTX-ARG rats, which were treated with oral ARG following injection of MTX. Intestinal mucosal damage, mucosal structural changes, enterocyte proliferation and enterocyte apoptosis were determined 72 hours following MTX injection. RT-PCR was used to determine bax and bcl-2 mRNA expression. Results MTX-ARG rats demonstrated greater jejunal and ileal bowel weight, greater ileal mucosal weight, greater ileal mucosal DNA and protein levels, greater villus height in jejunum and ileum and crypt depth in ileum, compared to MTX animals. A significant decrease in enterocyte apoptosis in the ileum of MTX-ARG rats (vs MTX) was accompanied by decreased bax mRNA and protein expression and increased bcl-2 protein levels. Conclusions Treatment with oral ARG prevents mucosal injury and improves intestinal recovery following MTX injury in the rat. PMID:22545735

  15. Comparative study of the effectiveness and limitations of current methods for detecting sequence coevolution.

    PubMed

    Mao, Wenzhi; Kaya, Cihan; Dutta, Anindita; Horovitz, Amnon; Bahar, Ivet

    2015-06-15

    With rapid accumulation of sequence data on several species, extracting rational and systematic information from multiple sequence alignments (MSAs) is becoming increasingly important. Currently, there is a plethora of computational methods for investigating coupled evolutionary changes in pairs of positions along the amino acid sequence, and making inferences on structure and function. Yet, the significance of coevolution signals remains to be established. Also, a large number of false positives (FPs) arise from insufficient MSA size, phylogenetic background and indirect couplings. Here, a set of 16 pairs of non-interacting proteins is thoroughly examined to assess the effectiveness and limitations of different methods. The analysis shows that recent computationally expensive methods designed to remove biases from indirect couplings outperform others in detecting tertiary structural contacts as well as eliminating intermolecular FPs; whereas traditional methods such as mutual information benefit from refinements such as shuffling, while being highly efficient. Computations repeated with 2,330 pairs of protein families from the Negatome database corroborated these results. Finally, using a training dataset of 162 families of proteins, we propose a combined method that outperforms existing individual methods. Overall, the study provides simple guidelines towards the choice of suitable methods and strategies based on available MSA size and computing resources. Software is freely available through the Evol component of ProDy API. © The Author 2015. Published by Oxford University Press.
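
    The mutual information measure named in this record can be illustrated on a toy alignment. The sketch below is plain mutual information between two alignment columns, without the shuffling refinement or the indirect-coupling corrections that the study evaluates; the function names and the four-sequence alignment are made up for illustration and are not taken from ProDy or from the paper.

```python
import math
from collections import Counter


def column(msa, i):
    """Return the residues found at alignment position i, one per sequence."""
    return [seq[i] for seq in msa]


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    col_i, col_j = column(msa, i), column(msa, j)
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical alignment: columns 0 and 1 vary together, column 3 varies
# independently of column 0, and column 2 is fully conserved.
toy_msa = ["AKLE", "AKLD", "GRLE", "GRLD"]

if __name__ == "__main__":
    for i, j in [(0, 1), (0, 3), (0, 2)]:
        print(f"MI(col {i}, col {j}) = {mutual_information(toy_msa, i, j):.3f} bits")
```

    With only four sequences the estimate is dominated by sampling noise in any real case, which is one source of the false positives from insufficient MSA size that the abstract mentions.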

  16. Template-based modeling and ab initio refinement of protein oligomer structures using GALAXY in CAPRI round 30.

    PubMed

    Lee, Hasup; Baek, Minkyung; Lee, Gyu Rie; Park, Sangwoo; Seok, Chaok

    2017-03-01

    Many proteins function as homo- or hetero-oligomers; therefore, attempts to understand and regulate protein functions require knowledge of protein oligomer structures. The number of available experimental protein structures is increasing, and oligomer structures can be predicted using the experimental structures of related proteins as templates. However, template-based models may have errors due to sequence differences between the target and template proteins, which can lead to functional differences. Such structural differences may be predicted by loop modeling of local regions or refinement of the overall structure. In CAPRI (Critical Assessment of PRotein Interactions) round 30, we used recently developed features of the GALAXY protein modeling package, including template-based structure prediction, loop modeling, model refinement, and protein-protein docking to predict protein complex structures from amino acid sequences. Out of the 25 CAPRI targets, medium and acceptable quality models were obtained for 14 and 1 target(s), respectively, for which proper oligomer or monomer templates could be detected. Symmetric interface loop modeling on oligomer model structures successfully improved model quality, while loop modeling on monomer model structures failed. Overall refinement of the predicted oligomer structures consistently improved the model quality, in particular in interface contacts. Proteins 2017; 85:399-407. © 2016 Wiley Periodicals, Inc.

  17. Protein flexibility in the light of structural alphabets

    PubMed Central

    Craveur, Pierrick; Joseph, Agnel P.; Esque, Jeremy; Narwani, Tarun J.; Noël, Floriane; Shinada, Nicolas; Goguet, Matthieu; Leonard, Sylvain; Poulain, Pierre; Bertrand, Olivier; Faure, Guilhem; Rebehmed, Joseph; Ghozlane, Amine; Swapna, Lakshmipuram S.; Bhaskara, Ramachandra M.; Barnoud, Jonathan; Téletchéa, Stéphane; Jallu, Vincent; Cerny, Jiri; Schneider, Bohdan; Etchebest, Catherine; Srinivasan, Narayanaswamy; Gelly, Jean-Christophe; de Brevern, Alexandre G.

    2015-01-01

    Protein structures are valuable tools to understand protein function. Nonetheless, proteins are often considered as rigid macromolecules while their structures exhibit specific flexibility, which is essential to complete their functions. Analyses of protein structures and dynamics are often performed with a simplified three-state description, i.e., the classical secondary structures. A more precise and complete description of protein backbone conformation can be obtained using libraries of small protein fragments that are able to approximate every part of protein structures. These libraries, called structural alphabets (SAs), have been widely used in the structure analysis field, from definition of ligand binding sites to superimposition of protein structures. SAs are also well suited to analyze the dynamics of protein structures. Here, we review innovative approaches that investigate protein flexibility based on SA descriptions. Coupled to various sources of experimental data (e.g., B-factors) and computational methodology (e.g., molecular dynamics simulation), SAs turn out to be powerful tools to analyze protein dynamics, e.g., to examine allosteric mechanisms in large sets of structures in complexes, or to identify order/disorder transitions. SAs were also shown to be quite efficient to predict protein flexibility from amino-acid sequence. Finally, in this review, we exemplify the interest of SAs for studying flexibility with different cases of proteins implicated in pathologies and diseases. PMID:26075209

  18. Fish proteins as targets of ferrous-catalyzed oxidation: identification of protein carbonyls by fluorescent labeling on two-dimensional gels and MALDI-TOF/TOF mass spectrometry.

    PubMed

    Pazos, Manuel; da Rocha, Angela Pereira; Roepstorff, Peter; Rogowska-Wrzesinska, Adelina

    2011-07-27

    Protein oxidation in fish meat is considered to affect negatively the muscle texture. An important source of free radicals taking part in this process is Fenton's reaction dependent on ferrous ions present in the tissue. The aim of this study was to investigate the susceptibility of cod muscle proteins in sarcoplasmic and myofibril fractions to in vitro metal-catalyzed oxidation and to point out protein candidates that might play a major role in the deterioration of fish quality. Extracted control proteins and proteins subjected to free radicals generated by Fe(II)/ascorbate mixture were labeled with fluorescein-5-thiosemicarbazide (FTSC) to tag carbonyl groups and separated by two-dimensional gel electrophoresis. Consecutive visualization of protein carbonyl levels by capturing the FTSC signal and total protein levels by capturing the SyproRuby staining signal allowed us to quantify the relative change in protein carbonyl levels corrected for changes in protein content. Proteins were identified using MALDI-TOF/TOF mass spectrometry and homology-based searches. The results show that freshly extracted cod muscle proteins exhibit a detectable carbonylation background and that the incubation with Fe(II)/ascorbate triggers a further oxidation of both sarcoplasmic and myofibril proteins. Different proteins exhibited various degrees of sensitivity to oxidation processes. Glyceraldehyde 3-phosphate dehydrogenase (GAPDH), nucleoside diphosphate kinase B (NDK), triosephosphate isomerase, phosphoglycerate mutase, lactate dehydrogenase, creatine kinase, and enolase were the sarcoplasmic proteins most vulnerable to ferrous-catalyzed oxidation. Moreover, NDK, phosphoglycerate mutase, and GAPDH were identified in several spots differing by their pI, and those forms showed different susceptibilities to metal-catalyzed oxidation, indicating that post-translational modifications may change the resistance of proteins to oxidative damage. The Fe(II)/ascorbate treatment significantly increased carbonylation of important structural proteins in fish muscle, mainly actin and myosin, and degradation products of those proteins were observed, some of them exhibiting increased carbonylation levels.

  19. Fast structure similarity searches among protein models: efficient clustering of protein fragments

    PubMed Central

    2012-01-01

    Background For many predictive applications a large number of models is generated and later clustered in subsets based on structure similarity. In most clustering algorithms an all-vs-all root mean square deviation (RMSD) comparison is performed. Most of the time is typically spent on comparison of non-similar structures. For sets with more than, say, 10,000 models this procedure is very time-consuming, and alternative faster algorithms that restrict comparisons to the most similar structures would be useful. Results We exploit the inverse triangle inequality on the RMSD between two structures given the RMSDs with a third structure. The lower bound on RMSD may be used, when restricting the search of similarity to a reasonably low RMSD threshold value, to speed up similarity searches significantly. Tests are performed on large sets of decoys which are widely used as test cases for predictive methods, with a speed-up of up to 100 times with respect to all-vs-all comparison depending on the set and parameters used. Sample applications are shown. Conclusions The algorithm presented here allows fast comparison of large data sets of structures with limited memory requirements. As an example of application we present clustering of more than 100,000 fragments of length 5 from the top500H dataset into a few hundred representative fragments. A more realistic scenario is provided by the search of similarity within the very large decoy sets used for the tests. Other applications include filtering nearly identical conformations in selected CASP9 datasets and clustering molecular dynamics snapshots. Availability A Linux executable and a Perl script with examples are given in the supplementary material (Additional file 1). The source code is available upon request from the authors. PMID:22642815
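
    The pruning idea in this record can be sketched in a few lines. For any distance that obeys the triangle inequality, |d(i, ref) - d(j, ref)| is a lower bound on d(i, j), so a pair whose lower bound already exceeds the threshold never needs a full comparison. The code below is a minimal illustration and not the authors' implementation: it uses a single reference structure, assumes the models are already in a common frame (no superposition step), and all names and toy coordinates are hypothetical.

```python
import math


def rmsd(a, b):
    """RMSD between two equally sized lists of (x, y, z) points.

    Assumption: both structures are already in a common frame. A real
    pipeline would superimpose them before measuring the deviation.
    """
    squared = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(squared / len(a))


def similar_pairs(models, threshold):
    """Return (pairs within threshold, number of full comparisons skipped)."""
    ref = models[0]
    to_ref = [rmsd(m, ref) for m in models]  # one pass against the reference
    pairs, skipped = [], 0
    for i in range(len(models)):
        for j in range(i + 1, len(models)):
            # Lower bound on d(i, j) from the two distances to the reference.
            if abs(to_ref[i] - to_ref[j]) > threshold:
                skipped += 1
                continue
            if rmsd(models[i], models[j]) <= threshold:
                pairs.append((i, j))
    return pairs, skipped


# Four hypothetical two-atom models forming two tight groups far apart.
toy_models = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.1), (1.0, 0.0, 0.1)],
    [(0.0, 0.0, 5.0), (1.0, 0.0, 5.0)],
    [(0.0, 0.0, 5.1), (1.0, 0.0, 5.1)],
]

if __name__ == "__main__":
    found, skipped = similar_pairs(toy_models, threshold=0.5)
    print("similar pairs:", found)
    print("full comparisons skipped:", skipped)
```

    The bound only rules pairs out; pairs that pass it still need the full RMSD, so the result is identical to an all-vs-all search.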

  20. Serum S100B Protein is Specifically Related to White Matter Changes in Schizophrenia

    PubMed Central

    Milleit, Berko; Smesny, Stefan; Rothermundt, Matthias; Preul, Christoph; Schroeter, Matthias L.; von Eiff, Christof; Ponath, Gerald; Milleit, Christine; Sauer, Heinrich; Gaser, Christian

    2016-01-01

    Background: Schizophrenia can be conceptualized as a form of dysconnectivity between brain regions. To investigate the neurobiological foundation of dysconnectivity, one approach is to analyze white matter structures, such as the pathology of fiber tracts. S100B is considered a marker protein for glial cells, in particular oligodendrocytes and astroglia, that passes the blood-brain barrier and is detectable in peripheral blood. Earlier studies have consistently reported increased S100B levels in schizophrenia. In this study, we aim to investigate associations between S100B and structural white matter abnormalities. Methods: We analyzed data of 17 unmedicated schizophrenic patients (first and recurrent episode) and 22 controls. We used voxel based morphometry (VBM) to detect group differences of white matter structures as obtained from T1-weighted MR-images and considered S100B serum levels as a regressor in an age-corrected interaction analysis. Results: S100B was increased in both patient subgroups. Using VBM, we found clusters indicating significant differences of the association between S100B concentration and white matter. Involved anatomical structures are the posterior cingulate bundle and temporal white matter structures assigned to the superior longitudinal fasciculus. Conclusions: S100B-associated alterations of white matter are shown to be present already at the time of first manifestation of psychosis and are distinct from findings in recurrent episode patients. This suggests involvement of S100B in an ongoing and dynamic process associated with structural brain changes in schizophrenia. However, it remains elusive whether increased S100B serum concentrations in psychotic patients represent a protective response to a continuous pathogenic process or if elevated S100B levels are actively involved in promoting structural brain damage. PMID:27013967

  1. Structural basis of Bloom syndrome (BS) causing mutations in the BLM helicase domain.

    PubMed Central

    Rong, S. B.; Väliaho, J.; Vihinen, M.

    2000-01-01

    BACKGROUND: Bloom syndrome (BS) is characterized by mutations within the BLM gene. The Bloom syndrome protein (BLM) has similarity to the RecQ subfamily of DNA helicases, which contain seven conserved helicase domains and share significant sequence and structural similarity with the Rep and PcrA DNA helicases. We modeled the three-dimensional structure of the BLM helicase domain to analyze the structural basis of BS-causing mutations. MATERIALS AND METHODS: The sequence alignment was performed for RecQ DNA helicases and Rep and PcrA helicases. The crystal structure of PcrA helicase (PDB entry 3PJR) was used as the template for modeling the BLM helicase domain. The model was used to infer the function of BLM and to analyze the effect of the mutations. RESULTS: The structural model with good stereochemistry of the BLM helicase domain contains two subdomains, 1A and 2A. The electrostatic potential of the model is highly negative over most of the surface, except for the cleft between subdomains 1A and 2A which is similar to the template protein. The ATP-binding site is located inside the model between subdomains 1A and 2A; whereas, the DNA-binding region is situated at the surface cleft, with positive potential between 1A and 2A. CONCLUSIONS: The three-dimensional structure of the BLM helicase domain was modeled and applied to interpret BS-causing mutations. The mutation I841T is likely to weaken DNA binding, while the mutations C891R, C901Y, and Q672R presumably disturb the ATP binding. In addition, other critical positions are discussed. PMID:10965492

  2. A Template-Based Protein Structure Reconstruction Method Using Deep Autoencoder Learning.

    PubMed

    Li, Haiou; Lyu, Qiang; Cheng, Jianlin

    2016-12-01

    Protein structure prediction is an important problem in computational biology, and is widely applied to various biomedical problems such as protein function study, protein design, and drug design. In this work, we developed a novel deep learning approach based on a deeply stacked denoising autoencoder for protein structure reconstruction. We applied our approach to a template-based protein structure prediction using only the 3D structural coordinates of homologous template proteins as input. The templates were identified for a target protein by a PSI-BLAST search. 3DRobot (a program that automatically generates diverse and well-packed protein structure decoys) was used to generate initial decoy models for the target from the templates. A stacked denoising autoencoder was trained on the decoys to obtain a deep learning model for the target protein. The trained deep model was then used to reconstruct the final structural model for the target sequence. With target proteins that have highly similar template proteins as benchmarks, the GDT-TS score of the predicted structures is greater than 0.7, suggesting that the deep autoencoder is a promising method for protein structure reconstruction.

  3. MDcons: Intermolecular contact maps as a tool to analyze the interface of protein complexes from molecular dynamics trajectories

    PubMed Central

    2014-01-01

    Background Molecular Dynamics (MD) simulations of protein complexes suffer from the lack of specific tools in the analysis step. Analyses of MD trajectories of protein complexes indeed generally rely on classical measures, such as the RMSD, RMSF and gyration radius, conceived and developed for single macromolecules. As a matter of fact, instead, researchers engaged in simulating the dynamics of a protein complex are mainly interested in characterizing the conservation/variation of its biological interface. Results On these bases, herein we propose a novel approach to the analysis of MD trajectories or other conformational ensembles of protein complexes, MDcons, which uses the conservation of inter-residue contacts at the interface as a measure of the similarity between different snapshots. A "consensus contact map" is also provided, where the conservation of the different contacts is drawn in a grey scale. Finally, the interface area of the complex is monitored during the simulations. To show its utility, we used this novel approach to study two protein-protein complexes with interfaces of comparable size and both dominated by hydrophilic interactions, but having binding affinities at the extremes of the experimental range. MDcons is demonstrated to be extremely useful to analyse the MD trajectories of the investigated complexes, adding important insight into the dynamic behavior of their biological interface. Conclusions MDcons specifically allows the user to highlight and characterize the dynamics of the interface in protein complexes and can thus be used as a complementary tool for the analysis of MD simulations of both experimental and predicted structures of protein complexes. PMID:25077693
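
    The idea of scoring contact conservation across snapshots can be sketched as follows. This is an illustrative simplification, not the MDcons implementation: it assumes one representative point per residue and a 5 Å cutoff, and all function names and coordinates are hypothetical.

```python
import math
from collections import Counter


def interface_contacts(chain_a, chain_b, cutoff=5.0):
    """Return the set of (i, j) residue pairs within `cutoff` in one snapshot."""
    return {
        (i, j)
        for i, a in enumerate(chain_a)
        for j, b in enumerate(chain_b)
        if math.dist(a, b) <= cutoff
    }


def contact_conservation(frames, cutoff=5.0):
    """Map each inter-chain contact to the fraction of snapshots containing it."""
    if not frames:
        raise ValueError("at least one snapshot is required")
    counts = Counter()
    for chain_a, chain_b in frames:
        counts.update(interface_contacts(chain_a, chain_b, cutoff))
    return {pair: n / len(frames) for pair, n in counts.items()}


# Two toy snapshots of a two-residue by two-residue interface (angstroms).
chain_a = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
frames = [
    (chain_a, [(0.0, 4.0, 0.0), (10.0, 4.0, 0.0)]),  # both contacts formed
    (chain_a, [(0.0, 4.0, 0.0), (10.0, 8.0, 0.0)]),  # second contact broken
]
conservation = contact_conservation(frames)
print(conservation)
```

    The resulting fractions are the kind of values a grey-scale consensus contact map would display, with fully conserved contacts drawn darkest.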

  4. Expression of three topologically distinct membrane proteins elicits unique stress response pathways in the yeast Saccharomyces cerevisiae.

    PubMed

    Buck, Teresa M; Jordan, Rick; Lyons-Weiler, James; Adelman, Joshua L; Needham, Patrick G; Kleyman, Thomas R; Brodsky, Jeffrey L

    2015-06-01

    Misfolded membrane proteins are retained in the endoplasmic reticulum (ER) and are subject to ER-associated degradation, which clears the secretory pathway of potentially toxic species. While the transcriptional response to environmental stressors has been extensively studied, limited data exist describing the cellular response to misfolded membrane proteins. To this end, we expressed and then compared the transcriptional profiles elicited by the synthesis of three ER retained, misfolded ion channels: The α-subunit of the epithelial sodium channel, ENaC, the cystic fibrosis transmembrane conductance regulator, CFTR, and an inwardly rectifying potassium channel, Kir2.1, which vary in their mass, membrane topologies, and quaternary structures. To examine transcriptional profiles in a null background, the proteins were expressed in yeast, which was previously used to examine the degradation requirements for each substrate. Surprisingly, the proteins failed to induce a canonical unfolded protein response or heat shock response, although messages encoding several cytosolic and ER lumenal protein folding factors rose when αENaC or CFTR was expressed. In contrast, the levels of these genes were unaltered by Kir2.1 expression; instead, the yeast iron regulon was activated. Nevertheless, a significant number of genes that respond to various environmental stressors were upregulated by all three substrates, and compared with previous microarray data we deduced the existence of a group of genes that reflect a novel misfolded membrane protein response. These data indicate that aberrant proteins in the ER elicit profound yet unique cellular responses. Copyright © 2015 the American Physiological Society.

  5. Protein structure similarity from Principle Component Correlation analysis.

    PubMed

    Zhou, Xiaobo; Chou, James; Wong, Stephen T C

    2006-01-25

    Owing to rapid expansion of protein structure databases in recent years, methods of structure comparison are becoming increasingly effective and important in revealing novel information on functional properties of proteins and their roles in the grand scheme of evolutionary biology. Currently, the structural similarity between two proteins is measured by the root-mean-square-deviation (RMSD) in their best-superimposed atomic coordinates. RMSD is the golden rule of measuring structural similarity when the structures are nearly identical; it, however, fails to detect the higher order topological similarities in proteins evolved into different shapes. We propose new algorithms for extracting geometrical invariants of proteins that can be effectively used to identify homologous protein structures or topologies in order to quantify both close and remote structural similarities. We measure structural similarity between proteins by correlating the principle components of their secondary structure interaction matrix. In our approach, the Principle Component Correlation (PCC) analysis, a symmetric interaction matrix for a protein structure is constructed with relationship parameters between secondary elements that can take the form of distance, orientation, or other relevant structural invariants. When using a distance-based construction in the presence or absence of encoded N to C terminal sense, there are strong correlations between the principle components of interaction matrices of structurally or topologically similar proteins. The PCC method is extensively tested for protein structures that belong to the same topological class but are significantly different by RMSD measure. The PCC analysis can also differentiate proteins having similar shapes but different topological arrangements. 
    Additionally, we demonstrate that when using two independently defined interaction matrices, comparison of their maximum eigenvalues can be highly effective in clustering structurally or topologically similar proteins. We believe that the PCC analysis of interaction matrix is highly flexible in adopting various structural parameters for protein structure comparison.

  6. Structure-Based Druggability Assessment of the Mammalian Structural Proteome with Inclusion of Light Protein Flexibility

    PubMed Central

    Loving, Kathryn A.; Lin, Andy; Cheng, Alan C.

    2014-01-01

    Advances reported over the last few years and the increasing availability of protein crystal structure data have greatly improved structure-based druggability approaches. However, in practice, nearly all druggability estimation methods are applied to protein crystal structures as rigid proteins, with protein flexibility often not directly addressed. The inclusion of protein flexibility is important in correctly identifying the druggability of pockets that would be missed by methods based solely on the rigid crystal structure. These include cryptic pockets and flexible pockets often found at protein-protein interaction interfaces. Here, we apply an approach that uses protein modeling in concert with druggability estimation to account for light protein backbone movement and protein side-chain flexibility in protein binding sites. We assess the advantages and limitations of this approach on widely-used protein druggability sets. Applying the approach to all mammalian protein crystal structures in the PDB results in identification of 69 proteins with potential druggable cryptic pockets. PMID:25079060

  7. Using simple manipulatives to improve student comprehension of a complex biological process: protein synthesis.

    PubMed

    Guzman, Karen; Bartlett, John

    2012-01-01

    Biological systems and living processes involve a complex interplay of biochemicals and macromolecular structures that can be challenging for undergraduate students to comprehend and, thus, misconceptions abound. Protein synthesis, or translation, is an example of a biological process for which students often hold many misconceptions. This article describes an exercise that was developed to illustrate the process of translation using simple objects to represent complex molecules. Animations, 3D physical models, computer simulations, laboratory experiments and classroom lectures are also used to reinforce the students' understanding of translation, but by focusing on the simple manipulatives in this exercise, students are better able to visualize concepts that can elude them when using the other methods. The translation exercise is described along with suggestions for background material, questions used to evaluate student comprehension and tips for using the manipulatives to identify common misconceptions. Copyright © 2012 Wiley Periodicals, Inc.

  8. Characterizing the Background Corona with SDO/AIA

    NASA Technical Reports Server (NTRS)

    Napier, Kate; Alexander, Caroline; Winebarger, Amy

    2014-01-01

    Characterizing the nature of the solar coronal background would enable scientists to more accurately determine plasma parameters, and may lead to a better understanding of the coronal heating problem. Because scientists study the 3D structure of the Sun in 2D, any line-of-sight includes both foreground and background material, and thus, the issue of background subtraction arises. By investigating the intensity values in and around an active region, using multiple wavelengths collected from the Atmospheric Imaging Assembly (AIA) on the Solar Dynamics Observatory (SDO) over an eight-hour period, this project aims to characterize the background as smooth or structured. Different methods were employed to measure the true coronal background and create minimum intensity images. These were then investigated for the presence of structure. The background images created were found to contain long-lived structures, including coronal loops, that were still present in all of the wavelengths, 131, 171, 193, 211, and 335 A. The intensity profiles across the active region indicate that the background is much more structured than previously thought.

  9. Reductive evolution and the loss of PDC/PAS domains from the genus Staphylococcus

    PubMed Central

    2013-01-01

    Background The Per-Arnt-Sim (PAS) domain represents a ubiquitous structural fold that is involved in bacterial sensing and adaptation systems, including several virulence related functions. Although PAS domains and the subclass of PhoQ-DcuS-CitA (PDC) domains have a common structure, there is limited amino acid sequence similarity. To gain greater insight into the evolution of PDC/PAS domains present in the bacterial kingdom and staphylococci in specific, the PDC/PAS domains from the genomic sequences of 48 bacteria, representing 5 phyla, were identified using the sensitive search method based on HMM-to-HMM comparisons (HHblits). Results A total of 1,007 PAS domains and 686 PDC domains distributed over 1,174 proteins were identified. For 28 Gram-positive bacteria, the distribution, organization, and molecular evolution of PDC/PAS domains were analyzed in greater detail, with a special emphasis on the genus Staphylococcus. Compared to other bacteria the staphylococci have relatively fewer proteins (6–9) containing PDC/PAS domains. As a general rule, the staphylococcal genomes examined in this study contain a core group of seven PDC/PAS domain-containing proteins consisting of WalK, SrrB, PhoR, ArlS, HssS, NreB, and GdpP. The exceptions to this rule are: 1) S. saprophyticus lacks the core NreB protein; 2) S. carnosus has two additional PAS domain containing proteins; 3) S. epidermidis, S. aureus, and S. pseudintermedius have an additional protein with two PDC domains that is predicted to code for a sensor histidine kinase; 4) S. lugdunensis has an additional PDC containing protein predicted to be a sensor histidine kinase. Conclusions This comprehensive analysis demonstrates that variation in PDC/PAS domains among bacteria has limited correlations to the genome size or pathogenicity; however, our analysis established that bacteria having a motile phase in their life cycle have significantly more PDC/PAS-containing proteins. 
    In addition, our analysis revealed a tremendous amount of variation in the number of PDC/PAS-containing proteins within genera. This variation extended to the Staphylococcus genus, which had between 6 and 9 PDC/PAS proteins and some of these appear to be previously undescribed signaling proteins. This latter point is important because most staphylococcal proteins that contain PDC/PAS domains regulate virulence factor synthesis or antibiotic resistance. PMID:23902280

  10. Small-molecule-based protein-labeling technology in live cell studies: probe-design concepts and applications.

    PubMed

    Mizukami, Shin; Hori, Yuichiro; Kikuchi, Kazuya

    2014-01-21

    The use of genetic engineering techniques allows researchers to combine functional proteins with fluorescent proteins (FPs) to produce fusion proteins that can be visualized in living cells, tissues, and animals. However, several limitations of FPs, such as slow maturation kinetics or issues with photostability under laser illumination, have led researchers to examine new technologies beyond FP-based imaging. Recently, new protein-labeling technologies using protein/peptide tags and tag-specific probes have attracted increasing attention. Although several protein-labeling systems are commercially available, researchers continue to work on addressing some of the limitations of this technology. To reduce the level of background fluorescence from unlabeled probes, researchers have pursued fluorogenic labeling, in which the labeling probes do not fluoresce until the target proteins are labeled. In this Account, we review two different fluorogenic protein-labeling systems that we have recently developed. First we give a brief history of protein labeling technologies and describe the challenges involved in protein labeling. In the second section, we discuss a fluorogenic labeling system based on a noncatalytic mutant of β-lactamase, which forms specific covalent bonds with β-lactam antibiotics such as ampicillin or cephalosporin. Based on fluorescence (or Förster) resonance energy transfer and other physicochemical principles, we have developed several types of fluorogenic labeling probes. To extend the utility of this labeling system, we took advantage of a hydrophobic β-lactam prodrug structure to achieve intracellular protein labeling. We also describe a small protein tag, photoactive yellow protein (PYP)-tag, and its probes. By utilizing a quenching mechanism based on close intramolecular contact, we incorporated a turn-on switch into the probes for fluorogenic protein labeling. One of these probes allowed us to rapidly image a protein while avoiding washout. In the future, we expect that protein-labeling systems with finely designed probes will lead to novel methodologies that allow researchers to image biomolecules and to perturb protein functions.

  11. What are the structural features that drive partitioning of proteins in aqueous two-phase systems?

    PubMed

    Wu, Zhonghua; Hu, Gang; Wang, Kui; Zaslavsky, Boris Yu; Kurgan, Lukasz; Uversky, Vladimir N

    2017-01-01

    Protein partitioning in aqueous two-phase systems (ATPSs) represents a convenient, inexpensive, and easy to scale-up protein separation technique. Since the partition behavior of a protein depends dramatically on the ATPS composition, it would be highly beneficial to have reliable means for (even qualitative) prediction of partitioning of a target protein under different conditions. Our aim was to understand which structural features of proteins contribute to partitioning of a query protein in a given ATPS. We undertook a systematic empirical analysis of relations between 57 numerical structural descriptors derived from the corresponding amino acid sequences and crystal structures of 10 well-characterized proteins and the partition behavior of these proteins in 29 different ATPSs. This analysis revealed that just a few structural characteristics of proteins can accurately determine behavior of these proteins in a given ATPS. However, partition behavior of proteins in different ATPSs relies on different structural features. In other words, we could not find a unique set of protein structural features derived from their crystal structures that could be used for the description of the protein partition behavior of all proteins in all ATPSs analyzed in this study. We likely need to gain better insight into relationships between protein-solvent interactions and protein structure peculiarities, in particular given the limitations of the crystal structures used here, to be able to construct a model that accurately predicts protein partition behavior across all ATPSs. Copyright © 2016 Elsevier B.V. All rights reserved.

  12. Identification of latexin by a proteomic analysis in rat normal articular cartilage

    PubMed Central

    2010-01-01

    Background Osteoarthritis (OA) is characterized by degeneration of articular cartilage. Induced animal models of OA are a widely used tool for studying the pathogenesis of the disease. Several proteomic techniques for selective extraction of proteins have provided protein profiles of chondrocytes and secretory patterns in normal and osteoarthritic cartilage, including the discovery of new and promising biomarkers. In this proteomic analysis of normal rat articular cartilage, two-dimensional electrophoresis and mass spectrometry (MS) were used. Interestingly, latexin (LXN) was found. Using an immunohistochemical technique, it was possible to determine its localization within the chondrocytes from normal and osteoarthritic articular cartilage. Results In this study, 147 proteins were visualized, and 47 proteins were identified by MS. A significant proportion of proteins are involved in metabolic processes and energy (32%), as well as participating in different biological functions including structural organization (19%), signal transduction and molecular signaling (11%), redox homeostasis (9%), transcription and protein synthesis (6%), and transport (6%). The identified proteins were assigned to one or more subcellular compartments. Among the identified proteins, we found some already recognized in other studies as OA-associated proteins. Interestingly, we identified LXN, an inhibitor of mammalian carboxypeptidases, which had not been described in articular cartilage. Immunolabeling assays for LXN showed a granular distribution pattern in the cytoplasm of most chondrocytes of the middle, deep and calcified zones of normal articular cartilage as well as in subchondral bone. In osteoarthritic cartilage, LXN was observed in superficial and deep zones. Conclusions This study provides the first proteomic analysis of normal articular cartilage of rat. We identified LXN, whose location was demonstrated by immunolabeling in the chondrocytes from the middle, deep and calcified zones of normal articular cartilage, and superficial and deep zones of osteoarthritic cartilage. PMID:20525390

  13. Investigating Molecular Structures of Bio-Fuel and Bio-Oil Seeds as Predictors To Estimate Protein Bioavailability for Ruminants by Advanced Nondestructive Vibrational Molecular Spectroscopy.

    PubMed

    Ban, Yajing; L Prates, Luciana; Yu, Peiqiang

    2017-10-18

    This study was conducted to (1) determine protein and carbohydrate molecular structure profiles and (2) quantify the relationship between structural features and protein bioavailability of newly developed carinata and canola seeds for dairy cows by using Fourier transform infrared molecular spectroscopy. Results showed similarity in protein structural makeup within the entire protein structural region between carinata and canola seeds. The highest area ratios related to structural CHO, total CHO, and cellulosic compounds were obtained for carinata seeds. Carinata and canola seeds showed similar carbohydrate and protein molecular structures by multivariate analyses. Carbohydrate molecular structure profiles were highly correlated to protein rumen degradation and intestinal digestion characteristics. In conclusion, the molecular spectroscopy can detect inherent structural characteristics in carinata and canola seeds in which carbohydrate-relative structural features are related to protein metabolism and utilization. Protein and carbohydrate spectral profiles could be used as predictors of rumen protein bioavailability in cows.

  14. Exploring Human Diseases and Biological Mechanisms by Protein Structure Prediction and Modeling.

    PubMed

    Wang, Juexin; Luttrell, Joseph; Zhang, Ning; Khan, Saad; Shi, NianQing; Wang, Michael X; Kang, Jing-Qiong; Wang, Zheng; Xu, Dong

    2016-01-01

    Protein structure prediction and modeling provide a tool for understanding protein functions by computationally constructing protein structures from amino acid sequences and analyzing them. With help from protein prediction tools and web servers, users can obtain the three-dimensional protein structure models and gain knowledge of functions from the proteins. In this chapter, we will provide several examples of such studies. As an example, structure modeling methods were used to investigate the relation between mutation-caused misfolding of protein and human diseases including epilepsy and leukemia. Protein structure prediction and modeling were also applied in nucleotide-gated channels and their interaction interfaces to investigate their roles in brain and heart cells. In molecular mechanism studies of plants, rice salinity tolerance mechanism was studied via structure modeling on crucial proteins identified by systems biology analysis; trait-associated protein-protein interactions were modeled, which sheds some light on the roles of mutations in soybean oil/protein content. In the age of precision medicine, we believe protein structure prediction and modeling will play more and more important roles in investigating biomedical mechanism of diseases and drug design.

  15. Evolutionary and structural analyses of alpha-papillomavirus capsid proteins yields novel insights into L2 structure and interaction with L1

    PubMed Central

    Lowe, John; Panda, Debasis; Rose, Suzanne; Jensen, Ty; Hughes, Willie A; Tso, For Yue; Angeletti, Peter C

    2008-01-01

    Background Papillomaviruses (PVs) are small, non-enveloped, double-stranded DNA viruses that have been identified as the primary etiological agent for cervical cancer, and their potential for malignant transformation in mucosal tissue has a large impact on public health. The PV family Papillomaviridae is organized into multiple genera based on sequential parsimony, host range, tissue tropism, and histology. We focused this analysis on the late gene products, major (L1) and minor (L2) capsid proteins from the family Papillomaviridae genus Alpha-papillomavirus. Alpha-PVs preferentially infect oral and anogenital mucosa of humans and primates with varied risk of oncogenic transformation. Development of evolutionary associations between PVs will likely provide novel information to assist in clarifying the currently elusive relationship between PV and its microenvironment (i.e., the single infected cell) and macro environment (i.e., the skin tissue). We attempt to identify the regions of the major capsid proteins as well as minor capsid proteins of alpha-papillomavirus that have been evolutionarily conserved, and define regions that are under constant selective pressure with respect to the entire family of viruses. Results This analysis shows the loops of L1 are in fact the most variable regions among the alpha-PVs. We also identify regions of L2, involved in interaction with L1, as evolutionarily conserved among the members of alpha-PVs. Finally, a predicted three-dimensional model was generated to further elucidate probable aspects of the L1 and L2 interaction. PMID:19087355

  16. The putative Notch ligand HyJagged is a transmembrane protein present in all cell types of adult Hydra and upregulated at the boundary between bud and parent

    PubMed Central

    2011-01-01

    Background The Notch signalling pathway is conserved in pre-bilaterian animals. In the Cnidarian Hydra it is involved in interstitial stem cell differentiation and in boundary formation during budding. Experimental evidence suggests that in Hydra Notch is activated by presenilin through proteolytic cleavage at the S3 site as in all animals. However, the endogenous ligand for HvNotch has not been described yet. Results We have cloned a cDNA from Hydra, which encodes a bona-fide Notch ligand with a conserved domain structure similar to that of Jagged-like Notch ligands from other animals. Hyjagged mRNA is undetectable in adult Hydra by in situ hybridisation but is strongly upregulated and easily visible at the border between bud and parent shortly before bud detachment. In contrast, HyJagged protein is found in all cell types of an adult hydra, where it localises to membranes and endosomes. Co-localisation experiments showed that it is present in the same cells as HvNotch, however not always in the same membrane structures. Conclusions The putative Notch ligand HyJagged is conserved in Cnidarians. Together with HvNotch it may be involved in the formation of the parent-bud boundary in Hydra. Moreover, protein distribution of both, HvNotch receptor and HyJagged indicate a more widespread function for these two transmembrane proteins in the adult hydra, which may be regulated by additional factors, possibly involving endocytic pathways. PMID:21899759

  17. Improving membrane protein expression and function using genomic edits

    DOE PAGES

    Jensen, Heather M.; Eng, Thomas; Chubukov, Victor; ...

    2017-10-12

    Expression of membrane proteins often leads to growth inhibition and perturbs central metabolism and this burden varies with the protein being overexpressed. There are also known strain backgrounds that allow greater expression of membrane proteins but that differ in efficacy across proteins. Here, we hypothesized that for any membrane protein, it may be possible to identify a modified strain background where its expression can be accommodated with less burden. To directly test this hypothesis, we used a bar-coded transposon insertion library in tandem with cell sorting to assess genome-wide impact of gene deletions on membrane protein expression. The expression of five membrane proteins (CyoB, CydB, MdlB, YidC, and LepI) and one soluble protein (GST), each fused to GFP, was examined. We identified Escherichia coli mutants that demonstrated increased membrane protein expression relative to that in wild type. For two of the proteins (CyoB and CydB), we conducted functional assays to confirm that the increase in protein expression also led to phenotypic improvement in function. This study represents a systematic approach to broadly identify genetic loci that can be used to improve membrane protein expression, and our method can be used to improve expression of any protein that poses a cellular burden.

  18. Improving membrane protein expression and function using genomic edits

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jensen, Heather M.; Eng, Thomas; Chubukov, Victor

    Expression of membrane proteins often leads to growth inhibition and perturbs central metabolism and this burden varies with the protein being overexpressed. There are also known strain backgrounds that allow greater expression of membrane proteins but that differ in efficacy across proteins. Here, we hypothesized that for any membrane protein, it may be possible to identify a modified strain background where its expression can be accommodated with less burden. To directly test this hypothesis, we used a bar-coded transposon insertion library in tandem with cell sorting to assess genome-wide impact of gene deletions on membrane protein expression. The expression of five membrane proteins (CyoB, CydB, MdlB, YidC, and LepI) and one soluble protein (GST), each fused to GFP, was examined. We identified Escherichia coli mutants that demonstrated increased membrane protein expression relative to that in wild type. For two of the proteins (CyoB and CydB), we conducted functional assays to confirm that the increase in protein expression also led to phenotypic improvement in function. This study represents a systematic approach to broadly identify genetic loci that can be used to improve membrane protein expression, and our method can be used to improve expression of any protein that poses a cellular burden.

  19. Expression in Escherichia coli, purification, refolding and antifungal activity of an osmotin from Solanum nigrum

    PubMed Central

    Campos, Magnólia de A; Silva, Marilia S; Magalhães, Cláudio P; Ribeiro, Simone G; Sarto, Rafael PD; Vieira, Eduardo A; Grossi de Sá, Maria F

    2008-01-01

    Background Heterologous protein expression in microorganisms may contribute to identify and demonstrate antifungal activity of novel proteins. The Solanum nigrum osmotin-like protein (SnOLP) gene encodes a member of pathogenesis-related (PR) proteins, from the PR-5 sub-group, the last comprising several proteins with different functions, including antifungal activity. Based on deduced amino acid sequence of SnOLP, computer modeling produced a tertiary structure which is indicative of antifungal activity. Results To validate the potential antifungal activity of SnOLP, a hexahistidine-tagged mature SnOLP form was overexpressed in Escherichia coli M15 strain carried out by a pQE30 vector construction. The urea solubilized His6-tagged mature SnOLP protein was affinity-purified by immobilized-metal (Ni2+) affinity column chromatography. As SnOLP requires the correct formation of eight disulfide bonds, not correctly formed in bacterial cells, we adapted an in vitro method to refold the E. coli expressed SnOLP by using reduced:oxidized glutathione redox buffer. This method generated biologically active conformations of the recombinant mature SnOLP, which exerted antifungal action towards plant pathogenic fungi (Fusarium solani f. sp. glycines, Colletotrichum spp., Macrophomina phaseolina) and oomycete (Phytophthora nicotiana var. parasitica) under in vitro conditions. Conclusion Since SnOLP displays activity against economically important plant pathogenic fungi and oomycete, it represents a novel PR-5 protein with promising utility for biotechnological applications. PMID:18334031

  20. Knowledge-based computational intelligence development for predicting protein secondary structures from sequences.

    PubMed

    Shen, Hong-Bin; Yi, Dong-Liang; Yao, Li-Xiu; Yang, Jie; Chou, Kuo-Chen

    2008-10-01

    In the postgenomic age, with the avalanche of protein sequences generated and relatively slow progress in determining their structures by experiments, it is important to develop automated methods to predict the structure of a protein from its sequence. The membrane proteins are a special group in the protein family that accounts for approximately 30% of all proteins; however, solved membrane protein structures only represent less than 1% of known protein structures to date. Although a great success has been achieved for developing computational intelligence techniques to predict secondary structures in both globular and membrane proteins, there is still much challenging work in this regard. In this review article, we firstly summarize the recent progress of automation methodology development in predicting protein secondary structures, especially in membrane proteins; we will then give some future directions in this research field.

  1. Energetically Unfavorable Amide Conformations for N6-Acetyllysine Side Chains in Refined Protein Structures

    PubMed Central

    Genshaft, Alexander; Moser, Joe-Ann S.; D'Antonio, Edward L.; Bowman, Christine M.; Christianson, David W.

    2013-01-01

    The reversible acetylation of lysine to form N6-acetyllysine in the regulation of protein function is a hallmark of epigenetics. Acetylation of the positively charged amino group of the lysine side chain generates a neutral N-alkylacetamide moiety that serves as a molecular “switch” for the modulation of protein function and protein-protein interactions. We now report the analysis of 381 N6-acetyllysine side chain amide conformations as found in 79 protein crystal structures and 11 protein NMR structures deposited in the Protein Data Bank (PDB) of the Research Collaboratory for Structural Bioinformatics. We find that only 74.3% of N6-acetyllysine residues in protein crystal structures and 46.5% in protein NMR structures contain amide groups with energetically preferred trans or generously trans conformations. Surprisingly, 17.6% of N6-acetyllysine residues in protein crystal structures and 5.3% in protein NMR structures contain amide groups with energetically unfavorable cis or generously cis conformations. Even more surprisingly, 8.1% of N6-acetyllysine residues in protein crystal structures and 48.2% in NMR structures contain amide groups with energetically prohibitive twisted conformations that approach the transition state structure for cis-trans isomerization. In contrast, 109 unique N-alkylacetamide groups contained in 84 highly-accurate small molecule crystal structures retrieved from the Cambridge Structural Database exclusively adopt energetically preferred trans conformations. Therefore, we conclude that cis and twisted N6-acetyllysine amides in protein structures deposited in the PDB are erroneously modeled due to their energetically unfavorable or prohibitive conformations. PMID:23401043
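    The trans / cis / twisted distinction in the record above rests on the amide torsion angle. The following is a minimal sketch of how such a classification could be computed from four consecutive atom positions spanning the amide bond. The function names and the 30-degree tolerance are illustrative assumptions; the abstract does not state the cutoffs the authors used.

```python
import math

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four 3D points."""
    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])
    b0 = sub(p0, p1)
    b1 = sub(p2, p1)
    b2 = sub(p3, p2)
    norm = math.sqrt(dot(b1, b1))
    b1 = tuple(x / norm for x in b1)          # unit vector along the central bond
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = sub(b0, tuple(dot(b0, b1) * x for x in b1))
    w = sub(b2, tuple(dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(dot(cross(b1, v), w), dot(v, w)))

def classify_amide(omega_deg, tol=30.0):
    """Label an amide torsion; `tol` is a hypothetical tolerance, not the paper's."""
    dev = abs(omega_deg)
    if dev >= 180.0 - tol:
        return "trans-like"
    if dev <= tol:
        return "cis-like"
    return "twisted"

# Planar test geometries: trans, cis, and a 90-degree twist.
trans = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 1, 0), (-1, 1, 0))
cis = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 1, 0), (1, 1, 0))
twist = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 1, 0), (0, 1, 1))
print(classify_amide(trans), classify_amide(cis), classify_amide(twist))
```

    Only the magnitude of the angle is used for the label, so the result does not depend on the sign convention of the dihedral.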

  2. The Prediction of Botulinum Toxin Structure Based on in Silico and in Vitro Analysis

    NASA Astrophysics Data System (ADS)

    Suzuki, Tomonori; Miyazaki, Satoru

    2011-01-01

    Many biological systems are mediated through protein-protein interactions. Knowledge of protein-protein complex structure is required for understanding their function. Determining the structure of large, flexible protein-protein complexes by experiment remains difficult, costly and time-consuming; computational prediction of protein structures by homology modeling and docking studies is therefore a valuable method. In addition, MD simulation is one of the most powerful methods for observing the dynamics of proteins. Here, we predict the protein-protein complex structure of botulinum toxin to analyze its properties. These bioinformatics methods are useful for reporting the relation between the flexibility of the backbone structure and activity.

  3. pi-Turns: types, systematics and the context of their occurrence in protein structures

    PubMed Central

    Dasgupta, Bhaskar; Chakrabarti, Pinak

    2008-01-01

    Background For a proper understanding of protein structure and folding it is important to know if a polypeptide segment adopts a conformation inherent in the sequence or it depends on the context of its flanking secondary structures. Turns of various lengths have been studied and characterized starting from three-residue γ-turn to six-residue π-turn. The Schellman motif occurring at the C-terminal end of α-helices is a classical example of hydrogen bonded π-turn involving residues at (i) and (i+5) positions. Hydrogen bonded and non-hydrogen bonded β- and α-turns have been identified previously; likewise, a systematic characterization of π-turns would provide valuable insight into turn structures. Results An analysis of protein structures indicates that at least 20% of π-turns occur independent of the Schellman motif. The two categories of π-turns, designated as π-HB and SCH, have been further classified on the basis of backbone conformation and both have AAAa as the major class. They differ in the residue usage at position (i+1), the former having a large preference for Pro that is absent in the latter. As in the case of shorter length β- and α-turns, π-turns have also been identified not only on the basis of the existence of hydrogen bond, but also using the distance between terminal Cα-atoms, and this resulted in a comparable number of non-hydrogen-bonded π-turns (π-NHB). The presence of shorter β- and α-turns within all categories of π-turns, the subtle variations in backbone torsion angles along the turn residues, the location of the turns in the context of tertiary structures have been studied. Conclusion π-turns have been characterized, first using hydrogen bond and the distance between Cα atoms of the terminal residues, and then using backbone torsion angles. While the Schellman motif has a structural role in helix termination, many of the π-HB turns, being located on surface cavities, have functional role and there is also sequence conservation. 
PMID:18808671
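    The distance-based criterion mentioned in the record above (non-hydrogen-bonded π-turns found from the separation of the terminal Cα atoms at positions i and i+5) can be sketched as a simple scan. This is an illustration only: the function name and the 7.0 Å cutoff are assumptions, since the abstract does not give the threshold used.

```python
import math

def ca_distance_turn_candidates(ca_coords, span=5, cutoff=7.0):
    """Return (i, i+span, distance) for residue pairs whose C-alpha atoms are close.

    ca_coords: list of (x, y, z) C-alpha positions in sequence order.
    cutoff:    hypothetical distance threshold in angstroms.
    """
    hits = []
    for i in range(len(ca_coords) - span):
        d = math.dist(ca_coords[i], ca_coords[i + span])
        if d <= cutoff:
            hits.append((i, i + span, d))
    return hits

# Toy chain that folds back on itself so residues 0 and 5 end up close.
chain = [(0, 0, 0), (3.8, 0, 0), (7.6, 0, 0),
         (7.6, 3.8, 0), (3.8, 3.8, 0), (0, 3.8, 0), (-3.8, 3.8, 0)]
candidates = ca_distance_turn_candidates(chain)
print(candidates)
```

    A real analysis would additionally check hydrogen bonding and backbone torsion angles, as the abstract describes.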

  4. Structural analysis on mutation residues and interfacial water molecules for human TIM disease understanding

    PubMed Central

    2013-01-01

    Background Human triosephosphate isomerase (HsTIM) deficiency is a genetic disease caused often by the pathogenic mutation E104D. This mutation, located at the side of an abnormally large cluster of water in the inter-subunit interface, reduces the thermostability of the enzyme. Why and how these water molecules are directly related to the excessive thermolability of the mutant have not been investigated in structural biology. Results This work compares the structure of the E104D mutant with its wild type counterparts. It is found that the water topology in the dimer interface of HsTIM is atypical, having a "wet-core-dry-rim" distribution with 16 water molecules tightly packed in a small deep region surrounded by 22 residues including GLU104. These water molecules are co-conserved with their surrounding residues in non-archaeal TIMs (dimers) but not conserved across archaeal TIMs (tetramers), indicating their importance in preserving the overall quaternary structure. As the structural permutation induced by the mutation is not significant, we hypothesize that the excessive thermolability of the E104D mutant is attributed to the easy propagation of atoms' flexibility from the surface into the core via the large cluster of water. It is indeed found that the B factor increment in the wet region is higher than other regions, and, more importantly, the B factor increment in the wet region is maintained in the deeply buried core. Molecular dynamics simulations revealed that for the mutant structure at normal temperature, a clear increase of the root-mean-square deviation is observed for the wet region contacting with the large cluster of interfacial water. Such increase is not observed for other interfacial regions or the whole protein. This clearly suggests that, in the E104D mutant, the large water cluster is responsible for the subunit interface flexibility and overall thermolability, and it ultimately leads to the deficiency of this enzyme. 
Conclusions Our study reveals that a large cluster of water buried in protein interfaces is fragile and high-maintenance, closely related to the structure, function and evolution of the whole protein. PMID:24564410

  5. The role of porcine reproductive and respiratory syndrome (PRRS) virus structural and non-structural proteins in virus pathogenesis.

    PubMed

    Music, Nedzad; Gagnon, Carl A

    2010-12-01

    Porcine reproductive and respiratory syndrome (PRRS) is an economically devastating viral disease affecting the swine industry worldwide. The etiological agent, PRRS virus (PRRSV), possesses an RNA viral genome with nine open reading frames (ORFs). The ORF1a and ORF1b replicase-associated genes encode the polyproteins pp1a and pp1ab, respectively. The pp1a is processed into nine non-structural proteins (nsps): nsp1α, nsp1β, and nsp2 to nsp8. Proteolytic cleavage of pp1ab generates products nsp9 to nsp12. The proteolytic pp1a cleavage products process and cleave pp1a and pp1ab into nsp products. The nsp9 to nsp12 are involved in virus genome transcription and replication. The 3' end of the viral genome encodes four minor and three major structural proteins. The GP(2a), GP₃ and GP₄ (encoded by ORF2a, 3 and 4) are glycosylated membrane-associated minor structural proteins. The fourth minor structural protein, the E protein (encoded by ORF2b), is an unglycosylated membrane-associated protein. The viral envelope contains two major structural proteins: a glycosylated major envelope protein GP₅ (encoded by ORF5) and an unglycosylated membrane M protein (encoded by ORF6). The third major structural protein is the nucleocapsid N protein (encoded by ORF7). All PRRSV non-structural and structural proteins are essential for virus replication, and PRRSV infectivity is relatively intolerant to subtle changes within the structural proteins. PRRSV virulence is multigenic and resides in both the non-structural and structural viral proteins. This review discusses the molecular characteristics and the biological and immunological functions of the PRRSV structural proteins and nsps, and their involvement in virus pathogenesis.

  6. Triplex DNA-binding proteins are associated with clinical outcomes revealed by proteomic measurements in patients with colorectal cancer

    PubMed Central

    2012-01-01

    Background Tri- and tetra-nucleotide repeats in mammalian genomes can induce formation of alternative non-B DNA structures such as triplexes and guanine (G)-quadruplexes. These structures can induce mutagenesis, chromosomal translocations and genomic instability. We wanted to determine if proteins that bind triplex DNA structures are quantitatively or qualitatively different between colorectal tumor and adjacent normal tissue and if this binding activity correlates with patient clinical characteristics. Methods Extracts from 63 human colorectal tumor and adjacent normal tissues were examined by gel shifts (EMSA) for triplex DNA-binding proteins, which were correlated with clinicopathological tumor characteristics using the Mann-Whitney U, Spearman’s rho, Kaplan-Meier and Mantel-Cox log-rank tests. Biotinylated triplex DNA and streptavidin agarose affinity binding were used to purify triplex-binding proteins in RKO cells. Western blotting and reverse-phase protein array were used to measure protein expression in tissue extracts. Results Increased triplex DNA-binding activity in tumor extracts correlated significantly with lymphatic disease, metastasis, and reduced overall survival. We identified three multifunctional splicing factors with biotinylated triplex DNA affinity: U2AF65 in cytoplasmic extracts, and PSF and p54nrb in nuclear extracts. Super-shift EMSA with anti-U2AF65 antibodies produced a shifted band of the major EMSA H3 complex, identifying U2AF65 as the protein present in the major EMSA band. U2AF65 expression correlated significantly with EMSA H3 values in all extracts and was higher in extracts from Stage III/IV vs. Stage I/II colon tumors (p = 0.024). EMSA H3 values and U2AF65 expression also correlated significantly with GSK3 beta, beta-catenin, and NF-κB p65 expression, whereas p54nrb and PSF expression correlated with c-Myc, cyclin D1, and CDK4.
EMSA values and expression of all three splicing factors correlated with ErbB1, mTOR, PTEN, and Stat5. Western blots confirmed that full-length and truncated beta-catenin expression correlated with U2AF65 expression in tumor extracts. Conclusions Increased triplex DNA-binding activity in vitro correlates with lymph node disease, metastasis, and reduced overall survival in colorectal cancer, and increased U2AF65 expression is associated with total and truncated beta-catenin expression in high-stage colorectal tumors. PMID:22682314

  7. Taking advantage of local structure descriptors to analyze interresidue contacts in protein structures and protein complexes.

    PubMed

    Martin, Juliette; Regad, Leslie; Etchebest, Catherine; Camproux, Anne-Claude

    2008-11-15

    Interresidue contacts in protein structures and at protein-protein interfaces are classically described by the amino acid types of the interacting residues, and the local structural context of the contact, if any, is described using secondary structures. In this study, we present an alternate analysis of interresidue contacts using local structures defined by the structural alphabet introduced by Camproux et al. This structural alphabet allows a 3D structure to be described as a sequence of prototype fragments called structural letters, of 27 different types. Each residue can then be assigned to a particular local structure, even in loop regions. The analysis of interresidue contacts within protein structures defined using Voronoï tessellations reveals that pairwise contact specificity is greater in terms of structural letters than amino acids. Using a simple heuristic based on specificity score comparison, we find that 74% of the long-range contacts within protein structures are better described using structural letters than amino acid types. The investigation is extended to a set of protein-protein complexes, showing that similar global rules apply as for intraprotein contacts, with 64% of the interprotein contacts best described by local structures. We then present an evaluation of pairing functions integrating structural letters into decoy scoring and show that some complexes could benefit from the use of structural letter-based pairing functions.

  8. Akirins in sea lice: first steps towards a deeper understanding.

    PubMed

    Carpio, Yamila; García, Claudia; Pons, Tirso; Haussmann, Denise; Rodríguez-Ramos, Tania; Basabe, Liliana; Acosta, Jannel; Estrada, Mario Pablo

    2013-10-01

    Sea lice (Copepoda, Caligidae) are the most widely distributed marine pathogens in the salmon industry. Vaccination could be an environmentally friendly alternative for sea lice control; however, research on the development of such vaccines is still at an early stage of development. Recent results have suggested that subolesin/akirin/my32 are good candidate antigens for the control of arthropod infestations, including sea lice, but background knowledge about these genes in crustaceans is limited. Herein, we characterize the my32 gene/protein from two important sea lice species, Caligus rogercresseyi and Lepeophtheirus salmonis, based on cDNA sequence isolation, phylogenetic relationships, three dimensional structure prediction and expression analysis. The results show that these genes/proteins have the main characteristics of akirins from invertebrates. In addition, immunization with purified recombinant my32 from L. salmonis elicited a specific antibody response in mice and fish. These results provide an improvement to our current knowledge about my32 proteins and their potential use as vaccine candidates against sea lice in fish. Copyright © 2013 Elsevier Inc. All rights reserved.

  9. Improved Success of Sparse Matrix Protein Crystallization Screening with Heterogeneous Nucleating Agents

    PubMed Central

    Thakur, Anil S.; Robin, Gautier; Guncar, Gregor; Saunders, Neil F. W.; Newman, Janet; Martin, Jennifer L.; Kobe, Bostjan

    2007-01-01

    Background Crystallization is a major bottleneck in the process of macromolecular structure determination by X-ray crystallography. Successful crystallization requires the formation of nuclei and their subsequent growth to crystals of suitable size. Crystal growth generally occurs spontaneously in a supersaturated solution as a result of homogenous nucleation. However, in a typical sparse matrix screening experiment, precipitant and protein concentration are not sampled extensively, and supersaturation conditions suitable for nucleation are often missed. Methodology/Principal Findings We tested the effect of nine potential heterogenous nucleating agents on crystallization of ten test proteins in a sparse matrix screen. Several nucleating agents induced crystal formation under conditions where no crystallization occurred in the absence of the nucleating agent. Four nucleating agents: dried seaweed; horse hair; cellulose and hydroxyapatite, had a considerable overall positive effect on crystallization success. This effect was further enhanced when these nucleating agents were used in combination with each other. Conclusions/Significance Our results suggest that the addition of heterogeneous nucleating agents increases the chances of crystal formation when using sparse matrix screens. PMID:17971854

  10. New paradigm in ankyrin repeats: Beyond protein-protein interaction module.

    PubMed

    Islam, Zeyaul; Nagampalli, Raghavendra Sashi Krishna; Fatima, Munazza Tamkeen; Ashraf, Ghulam Md

    2018-04-01

    Classically, ankyrin repeat (ANK) proteins are built from tandems of two or more repeats and form curved solenoid structures that are associated with protein-protein interactions. The ANK repeat is a short, widespread structural motif of around 33 amino acids that repeats in tandem, has a canonical helix-loop-helix fold, and is found individually or in combination with other domains. The multiplicity of structural patterns enables it to form assemblies of diverse sizes, which underlie its ability to confer multiple binding and structural roles on proteins. Three-dimensional structures of these repeats determined to date reveal a degree of structural variability that translates into the considerable functional versatility of this protein superfamily. Recent work on ANK proteins has provided novel structural information, especially on protein-lipid, protein-sugar and protein-protein interactions. Self-assembly of these repeats was also shown to prevent the associated protein from forming filaments. In this review, we summarize the latest findings and how the new structural information has increased our understanding of the structural determinants of ANK proteins. We discuss the latest findings on how these proteins participate in various interactions to diversify the ANK roles in numerous biological processes, and explore the emerging and evolving field of designer ankyrins and its framework for protein engineering, with an emphasis on biotechnological applications. Copyright © 2017 Elsevier B.V. All rights reserved.

  11. Classification of proteins: available structural space for molecular modeling.

    PubMed

    Andreeva, Antonina

    2012-01-01

    The wealth of available protein structural data provides unprecedented opportunity to study and better understand the underlying principles of protein folding and protein structure evolution. A key to achieving this lies in the ability to analyse these data and to organize them in a coherent classification scheme. Over the past years several protein classifications have been developed that aim to group proteins based on their structural relationships. Some of these classification schemes explore the concept of structural neighbourhood (structural continuum), whereas other utilize the notion of protein evolution and thus provide a discrete rather than continuum view of protein structure space. This chapter presents a strategy for classification of proteins with known three-dimensional structure. Steps in the classification process along with basic definitions are introduced. Examples illustrating some fundamental concepts of protein folding and evolution with a special focus on the exceptions to them are presented.

  12. Structural alterations in rat liver proteins due to streptozotocin-induced diabetes and the recovery effect of selenium: Fourier transform infrared microspectroscopy and neural network study

    NASA Astrophysics Data System (ADS)

    Bozkurt, Ozlem; Haman Bayari, Sevgi; Severcan, Mete; Krafft, Christoph; Popp, Jürgen; Severcan, Feride

    2012-07-01

    The relation between protein structural alterations and tissue dysfunction is a major concern as protein fibrillation and/or aggregation due to structural alterations has been reported in many disease states. In the current study, Fourier transform infrared microspectroscopic imaging has been used to investigate diabetes-induced changes on protein secondary structure and macromolecular content in streptozotocin-induced diabetic rat liver. Protein secondary structural alterations were predicted using neural network approach utilizing the amide I region. Moreover, the role of selenium in the recovery of diabetes-induced alterations on macromolecular content and protein secondary structure was also studied. The results revealed that diabetes induced a decrease in lipid to protein and glycogen to protein ratios in diabetic livers. Significant alterations in protein secondary structure were observed with a decrease in α-helical and an increase in β-sheet content. Both doses of selenium restored diabetes-induced changes in lipid to protein and glycogen to protein ratios. However, low-dose selenium supplementation was not sufficient to recover the effects of diabetes on protein secondary structure, while a higher dose of selenium fully restored diabetes-induced alterations in protein structure.

  13. A new graph-based method for pairwise global network alignment

    PubMed Central

    Klau, Gunnar W

    2009-01-01

    Background In addition to component-based comparative approaches, network alignments provide the means to study conserved network topology such as common pathways and more complex network motifs. Yet, unlike in classical sequence alignment, the comparison of networks becomes computationally more challenging, as most meaningful assumptions instantly lead to NP-hard problems. Most previous algorithmic work on network alignments is heuristic in nature. Results We introduce the graph-based maximum structural matching formulation for pairwise global network alignment. We relate the formulation to previous work and prove NP-hardness of the problem. Based on the new formulation we build upon recent results in computational structural biology and present a novel Lagrangian relaxation approach that, in combination with a branch-and-bound method, computes provably optimal network alignments. The Lagrangian algorithm alone is a powerful heuristic method, which produces solutions that are often near-optimal and – unlike those computed by pure heuristics – come with a quality guarantee. Conclusion Computational experiments on the alignment of protein-protein interaction networks and on the classification of metabolic subnetworks demonstrate that the new method is reasonably fast and has advantages over pure heuristics. Our software tool is freely available as part of the LISA library. PMID:19208162

  14. Correlations between Community Structure and Link Formation in Complex Networks

    PubMed Central

    Liu, Zhen; He, Jia-Lin; Kapoor, Komal; Srivastava, Jaideep

    2013-01-01

    Background Links in complex networks commonly represent specific ties between pairs of nodes, such as protein-protein interactions in biological networks or friendships in social networks. However, understanding the mechanism of link formation in complex networks is a long-standing challenge for network analysis and data mining. Methodology/Principal Findings Links in complex networks have a tendency to cluster locally and form so-called communities. This widespread phenomenon reflects some underlying mechanism of link formation. To study the correlations between community structure and link formation, we present a general computational framework including a theory for network partitioning and link probability estimation. Our approach enables us to accurately and efficiently identify missing links in partially observed networks. The links having high connection likelihoods in the communities reveal that links are formed preferentially to create cliques and accordingly promote the clustering level of the communities. The experimental results verify that such a mechanism can be well captured by our approach. Conclusions/Significance Our findings provide new insight into how links are created in the communities. The computational framework opens a wide range of possibilities to develop new approaches and applications, such as community detection and missing link prediction. PMID:24039818
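    The idea that missing links tend to close cliques inside communities can be illustrated with a small sketch. This is not the framework from the record above, whose partitioning theory and probability estimates are not given in the abstract; it swaps in a plain common-neighbour count restricted to node pairs that share a community label. The function name and the input format are assumptions.

```python
from itertools import combinations

def score_missing_links(edges, community):
    """Rank unlinked same-community node pairs by number of shared neighbours.

    edges:     iterable of (u, v) undirected links that were observed.
    community: dict mapping every node to a community label (assumed given).
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue                      # link already observed
        if community[u] != community[v]:
            continue                      # only score pairs inside one community
        shared = adj[u] & adj[v]
        if shared:
            scores[(u, v)] = len(shared)
    # Highest score first; ties broken by node names for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

observed = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
labels = {"a": 0, "b": 0, "c": 0, "d": 0, "e": 1}
ranking = score_missing_links(observed, labels)
print(ranking)
```

    In this toy network the two highest-ranked candidates are the pairs that would turn the four-node community into a clique.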

  15. Effect of Heating and Glycation on the Allergenicity of 2S Albumins (Ara h 2/6) from Peanut

    PubMed Central

    Skov, Per Stahl; Johnson, Phil E.; Rigby, Neil M.; Przybylski-Nicaise, Laetitia; Bernard, Hervé; Wal, Jean-Michel; Ballmer-Weber, Barbara; Zuidmeer-Jongejan, Laurian; Szépfalusi, Zsolt; Ruinemans-Koerts, Janneke; Jansen, Ad P. H.; Savelkoul, Huub F. J.; Wichers, Harry J.; Mackie, Alan R.; Mills, Clare E. N.; Adel-Patient, Karine

    2011-01-01

    Background Peanut allergy is one of the most common and severe food allergies, and processing is known to influence the allergenicity of peanut proteins. We aimed to establish the effect of heating and glycation on the IgE-binding properties and biological activity of 2S albumins (Ara h 2/6) from peanut. Methodology/Principal Findings Native Ara h 2/6 was purified from raw peanuts and heated in solution (15 min, 110°C) in the presence or absence of glucose. Ara h 2 and 6 were also purified from roasted peanut. Using PBMC and sera from peanut-allergic patients, the cellular proliferative potency and IgE reactivity (reverse EAST inhibition) and functionality (basophil degranulation capacity) of allergens were assessed. Heating Ara h 2/6 at 110°C resulted in extensive denaturation, hydrolysis and aggregation of the protein, whilst Ara h 2 and 6 isolated from roasted peanut retained its native conformation. Allergen stimulation of PBMC induced proliferation and Th2 cytokine secretion which was unaffected by thermal processing. Conversely, IgE reactivity and functionality of Ara h 2/6 was decreased by heating. Whilst heating-glycation further reduced the IgE binding capacity of the proteins, it moderated their loss of histamine releasing capacity. Ara h 2 and 6 purified from roasted peanut demonstrated the same IgE reactivity as unheated, native Ara h 2/6. Conclusions/Significance Although no effect of processing on T-cell reactivity was observed, heat induced denaturation reduced the IgE reactivity and subsequent functionality of Ara h 2/6. Conversely, Ara h 2 and 6 purified from roasted peanut retained the structure and IgE reactivity/functionality of the native protein which may explain the allergenic potency of this protein. Through detailed molecular study and allergenicity assessment approaches, this work then gives new insights into the effect of thermal processing on structure/allergenicity of peanut proteins. PMID:21901150

  16. Fetal-Maternal Interactions in the Synepitheliochorial Placenta Using the eGFP Cloned Cattle Model

    PubMed Central

    Mess, Andrea; Perecin, Felipe; Bressan, Fabiana Fernandes; Mesquita, Ligia Garcia; Miglino, Maria Angelica; Pimentel, José RodrigoValim; Neto, Paulo Fantinato; Meirelles, Flávio Vieira

    2013-01-01

    Background To investigate mechanisms of fetal-maternal cell interactions in the bovine placenta, we developed a model of transgenic enhanced Green Fluorescent Protein (t-eGFP) expressing bovine embryos produced by nuclear transfer (NT) to assess the distribution of fetal-derived products in the bovine placenta. In addition, we searched for male-specific DNA in the blood of females carrying in vitro produced male embryos. Our hypothesis is that the bovine placenta is more permeable to fetal-derived products than described elsewhere. Methodology/Principal Findings Samples of placentomes, chorion, endometrium, maternal peripheral blood leukocytes and blood plasma were collected during early gestation and processed for nested-PCR for eGFP and testis-specific Y-encoded protein (TSPY), western blotting and immunohistochemistry for eGFP detection, as well as transmission electron microscopy to verify the level of interaction between maternal and fetal cells. TSPY and eGFP DNA were present in the blood of cows carrying male pregnancies at day 60 of pregnancy. Protein and mRNA of eGFP were observed in the trophoblast and uterine tissues. In the placentomes, the protein expression was weak in the syncytial regions, but intense in neighboring cells on both sides of the fetal-maternal interface. Ultrastructurally, our samples from t-eGFP expressing NT pregnancies appeared normal, showing, for example, interdigitating structures between fetal and maternal cells. In addition, channel-like structures were present in the trophoblast cells. Conclusions/Significance The data suggest that fetal contents, including nucleic acids and proteins, are delivered to the maternal system at both systemic and local levels. The mechanisms involved in the transfer of fetal-derived molecules to the maternal system are not clear. This delivery may occur through nonclassical protein secretion, through transtrophoblastic-like channels and/or by apoptotic processes previously described. In conclusion, the bovine synepitheliochorial placenta displays an intimate fetal-maternal interaction, similar to other placental types, for instance those of human and mouse. PMID:23724045

  17. Gaia: automated quality assessment of protein structure models.

    PubMed

    Kota, Pradeep; Ding, Feng; Ramachandran, Srinivas; Dokholyan, Nikolay V

    2011-08-15

    Increasing use of structural modeling for understanding structure-function relationships in proteins has led to the need to ensure that the protein models being used are of acceptable quality. Quality of a given protein structure can be assessed by comparing various intrinsic structural properties of the protein to those observed in high-resolution protein structures. In this study, we present tools to compare a given structure to high-resolution crystal structures. We assess packing by calculating the total void volume, the percentage of unsatisfied hydrogen bonds, the number of steric clashes and the scaling of the accessible surface area. We assess covalent geometry by determining bond lengths, angles, dihedrals and rotamers. The statistical parameters for the above measures, obtained from high-resolution crystal structures, enable us to provide a quality score that points to specific areas where a given protein structural model needs improvement. We provide these tools that appraise protein structures in the form of a web server, Gaia (http://chiron.dokhlab.org). Gaia evaluates the packing and covalent geometry of a given protein structure and provides quantitative comparison of the given structure to high-resolution crystal structures. Contact: dokh@unc.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
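    One of the packing measures listed in the record above, the number of steric clashes, can be sketched as a pairwise distance check. This is an illustration under stated assumptions and not Gaia's own procedure: the clash rule (distance below the sum of van der Waals radii minus a 0.4 Å allowance), the function name and the input format are all hypothetical. The radii are the commonly quoted Bondi values for C, N and O.

```python
import math
from itertools import combinations

# Bondi van der Waals radii in angstroms.
VDW = {"C": 1.70, "N": 1.55, "O": 1.52}

def count_clashes(atoms, bonded=frozenset(), overlap=0.4):
    """Count non-bonded atom pairs that sit closer than an assumed clash limit.

    atoms:   list of (element, (x, y, z)) tuples.
    bonded:  set of index pairs (i, j) with i < j to exclude (covalent partners).
    overlap: hypothetical allowance subtracted from the summed radii.
    """
    clashes = 0
    for (i, a), (j, b) in combinations(enumerate(atoms), 2):
        if (i, j) in bonded:
            continue
        limit = VDW[a[0]] + VDW[b[0]] - overlap
        if math.dist(a[1], b[1]) < limit:
            clashes += 1
    return clashes

model = [("C", (0.0, 0.0, 0.0)), ("O", (2.0, 0.0, 0.0)), ("N", (10.0, 0.0, 0.0))]
print(count_clashes(model))                     # C and O are 2.0 A apart
print(count_clashes(model, bonded={(0, 1)}))    # same pair treated as bonded
```

    The all-pairs loop is quadratic in the number of atoms; a spatial grid or tree would be the usual choice for full-size structures.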

  18. Enhanced Induction of T Cell Immunity Using Dendritic Cells Pulsed with HIV Tat and HCMV-pp65 Fusion Protein In Vitro

    PubMed Central

    Park, Jung-Sun; Park, Soo-Young; Cho, Hyun-Il; Sohn, Hyun-Jung

    2011-01-01

    Background Cytotoxic T lymphocytes (CTLs) appear to play an important role in the control and prevention of human cytomegalovirus (HCMV) infection. The pp65 antigen is a structural protein, which has been defined as a potential target for effective immunity against HCMV infection. Incorporation of an 11 amino acid region of the HIV TAT protein transduction domain (Tat) into a protein facilitates rapid, efficient entry into cells. Methods To establish a strategy for the generation of HCMV-specific CTLs in vitro, recombinant truncated N- and C-terminal pp65 protein (pp65 N&C) and N- and C-terminal pp65 protein fused with Tat (Tat/pp65 N&C) were produced in an E. coli system. Peripheral blood mononuclear cells were stimulated with dendritic cells (DCs) pulsed with pp65 N&C or Tat/pp65 N&C protein, and the induced immune responses were examined using IFN-γ ELISPOT assay, cytotoxicity assay and tetramer staining. Results DCs pulsed with Tat/pp65 N&C protein could induce higher T-cell responses in vitro compared with pp65 N&C. Moreover, the DCs pulsed with Tat/pp65 N&C could stimulate both CD8+ and CD4+ T-cell responses. The T cells induced by DCs pulsed with Tat/pp65 N&C showed higher cytotoxicity than those induced by pp65-pulsed DCs against an autologous lymphoblastoid B-cell line (LCL) expressing the HCMV-pp65 antigen. Conclusion Our results suggest that DCs pulsed with Tat/pp65 N&C protein effectively induced pp65-specific CTLs in vitro. Tat fusion recombinant protein may be useful for the development of adoptive T-cell immunotherapy and DC-based vaccines. PMID:21860612

  19. The Influence of Tobacco Smoke on Protein and Metal Levels in the Serum of Women during Pregnancy

    PubMed Central

    Wrześniak, Marta; Kepinska, Marta; Królik, Małgorzata; Milnerowicz, Halina

    2016-01-01

    Background Tobacco smoking by pregnant women has a negative effect on fetal development and increases pregnancy risk by changing the oxidative balance and microelement levels. Smoking affects the concentration, structure and function of proteins, potentially leading to various negative effects on pregnancy outcomes. Methodology/Principal Findings The influence of tobacco smoke on key protein fractions in smoking and non-smoking healthy pregnant women was determined by capillary electrophoresis (CE). Concentrations of the proteins α1-antitrypsin, α1-acid glycoprotein, α2-macroglobulin and transferrin were determined by ELISA tests. Total protein concentration was measured by the Biuret method. Smoking status was established by cotinine levels. Cadmium (Cd) and zinc (Zn) concentrations were determined by flame atomic absorption spectrometry and the Zn/Cd ratio was calculated based on these numbers. Smoking women had a 3.7 times higher level of Cd than non-smoking women. Zn levels decreased during pregnancy for all women. The Zn/Cd ratio was three times lower in smoking women. Differences between the changes in the protein profile for smoking and non-smoking women were noted. Regarding proteins, α1-antitrypsin and α2-macroglobulin levels were lower in the non-smoking group than in the smoking group and correlated with Cd levels (r = −0.968, p = 0.032 for non-smokers; r = −0.835, p = 0.019 for smokers). Zn/Cd ratios correlated negatively with α1-, α2- and β-globulins. Conclusions/Significance Exposure to tobacco smoke increases the concentration of Cd in the blood of pregnant women and may lead to an elevated risk of pregnancy disorders. Concentrations of some proteins are altered during pregnancy. The correlation of Cd with proteins suggests that it is one of the causes of protein aberrations. PMID:27548057

  20. Macromolecular composition of phloem exudate from white lupin (Lupinus albus L.)

    PubMed Central

    2011-01-01

    Background Members of the legume genus Lupinus exude phloem 'spontaneously' from incisions made to the vasculature. This feature was exploited to document macromolecules present in exudate of white lupin (Lupinus albus [L.] cv Kiev mutant), in particular to identify proteins and RNA molecules, including microRNA (miRNA). Results Proteomic analysis tentatively identified 86 proteins from 130 spots collected from 2D gels analysed by partial amino acid sequence determination using MS/MS. Analysis of a cDNA library constructed from exudate identified 609 unique transcripts. Both proteins and transcripts were classified into functional groups. The largest group of proteins comprised those involved in metabolism (24%), followed by protein modification/turnover (9%), redox regulation (8%), cell structural components (6%), stress and defence response (6%) with fewer in other groups. More prominent proteins were cyclophilin, ubiquitin, a glycine-rich RNA-binding protein, a group of proteins that comprise a glutathione/ascorbate-based mechanism to scavenge oxygen radicals, enzymes of glycolysis and other metabolism including methionine and ethylene synthesis. Potential signalling macromolecules such as transcripts encoding proteins mediating calcium level and the Flowering locus T (FT) protein were also identified. From around 330 small RNA clones (18-25 nt) 12 were identified as probable miRNAs by homology with those from other species. miRNA composition of exudate varied with site of collection (e.g. upward versus downward translocation streams) and nutrition (e.g. phosphorus level). Conclusions This is the first inventory of macromolecule composition of phloem exudate from a species in the Fabaceae, providing a basis to identify systemic signalling macromolecules with potential roles in regulating development, growth and stress response of legumes. PMID:21342527

  1. Evolutionary dynamics of protein domain architecture in plants

    PubMed Central

    2012-01-01

    Background Protein domains are the structural, functional and evolutionary units of the protein. Protein domain architectures are the linear arrangements of domain(s) in individual proteins. Although the evolutionary history of protein domain architecture has been extensively studied in microorganisms, the evolutionary dynamics of domain architecture in the plant kingdom remains largely undefined. To address this question, we analyzed the lineage-based protein domain architecture content in 14 completed green plant genomes. Results Our analyses show that all 14 plant genomes maintain similar distributions of species-specific, single-domain, and multi-domain architectures. Approximately 65% of plant domain architectures are universally present in all plant lineages, while the remaining architectures are lineage-specific. Clear examples are seen of both the loss and gain of specific protein architectures in higher plants. There has been a dynamic, lineage-wise expansion of domain architectures during plant evolution. The data suggest that this expansion can be largely explained by changes in nuclear ploidy resulting from rounds of whole genome duplications. Indeed, there has been a decrease in the number of unique domain architectures when the genomes were normalized into a presumed ancestral genome that has not undergone whole genome duplications. Conclusions Our data show the conservation of universal domain architectures in all available plant genomes, indicating the presence of an evolutionarily conserved, core set of protein components. However, the occurrence of lineage-specific domain architectures indicates that domain architecture diversity has been maintained beyond these core components in plant genomes. 
Although several features of genome-wide domain architecture content are conserved in plants, the data clearly demonstrate lineage-wise, progressive changes and expansions of individual protein domain architectures, reinforcing the notion that plant genomes have undergone dynamic evolution. PMID:22252370
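
    The abstract defines a domain architecture as the linear arrangement of domains in a protein and reports the share of architectures present in all lineages. A minimal sketch of that bookkeeping follows, assuming each architecture is stored as a tuple of domain names and each genome as a set of such tuples. This representation and the function name are illustrative assumptions, and the study's exact definition of "universally present" may differ.

```python
def universal_fraction(genomes):
    """Fraction of distinct architectures that occur in every genome.

    genomes: dict mapping genome name -> set of architectures, where an
    architecture is a tuple of domain names in N- to C-terminal order.
    """
    sets = [set(archs) for archs in genomes.values()]
    if not sets:
        raise ValueError("need at least one genome")
    all_archs = set().union(*sets)            # every architecture seen anywhere
    shared = sets[0].intersection(*sets[1:])  # architectures seen everywhere
    return len(shared) / len(all_archs) if all_archs else 0.0
```

    Using tuples keeps domain order significant, so ("A", "B") and ("B", "A") count as different architectures.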

  2. Sponge non-metastatic Group I Nme gene/protein - structure and function is conserved from sponges to humans

    PubMed Central

    2011-01-01

    Background Nucleoside diphosphate kinases (NDPK) are evolutionarily conserved enzymes present in Bacteria, Archaea and Eukarya, with human Nme1 the most studied representative of the family and the first identified metastasis suppressor. Sponges (Porifera) are simple metazoans without tissues, closest to the common ancestor of all animals. They changed little during evolution and probably provide the best insight into the metazoan ancestor's genomic features. Recent studies show that sponges have a wide repertoire of genes, many of which are involved in diseases in more complex metazoans. The original functions of those genes and the way they have evolved in the animal lineage are largely unknown. Here we report new results on the metastasis suppressor gene/protein homolog from the marine sponge Suberites domuncula, NmeGp1Sd. The purpose of this study was to investigate the properties of the sponge Group I Nme gene and protein, and compare them to the human homolog in order to elucidate the evolution of the structure and function of Nme. Results We found that sponge genes coding for Group I Nme protein are intron-rich. Furthermore, we discovered that the sponge NmeGp1Sd protein has a level of kinase activity similar to that of its human homolog Nme1, does not cleave negatively supercoiled DNA and shows nonspecific DNA-binding activity. The sponge NmeGp1Sd forms a hexamer, like human Nme1 and all other eukaryotic Nme proteins. NmeGp1Sd interacts with human Nme1 in human cells and exhibits the same subcellular localization. Stable clones expressing sponge NmeGp1Sd inhibited the migratory potential of CAL 27 cells, as already reported for human Nme1, which suggests that Nme's function in migratory processes was engaged long before the composition of true tissues.
Conclusions This study suggests that the ancestor of all animals possessed a NmeGp1 protein with properties and functions similar to evolutionarily recent versions of the protein, even before the appearance of true tissues and the origin of tumors and metastasis. PMID:21457554

  3. An Efficient Strategy for Small-Scale Screening and Production of Archaeal Membrane Transport Proteins in Escherichia coli

    PubMed Central

    Ma, Pikyee; Varela, Filipa; Magoch, Malgorzata; Silva, Ana Rita; Rosário, Ana Lúcia; Brito, José; Oliveira, Tânia Filipa; Nogly, Przemyslaw; Pessanha, Miguel; Stelter, Meike; Kletzin, Arnulf; Henderson, Peter J. F.; Archer, Margarida

    2013-01-01

    Background Membrane proteins play a key role in many fundamental cellular processes such as transport of nutrients, sensing of environmental signals and energy transduction, and account for over 50% of all known drug targets. Despite their importance, structural and functional characterisation of membrane proteins remains a challenge, partly due to the difficulties in recombinant expression and purification. Therefore, the development of efficient methods for heterologous production is essential. Methodology/Principal Findings Fifteen integral membrane transport proteins from Archaea were selected as test targets, chosen to represent two superfamilies widespread in all organisms: the Major Facilitator Superfamily (MFS) and the 5-Helix Inverted Repeat Transporter superfamily (5HIRT). These proteins typically have eleven to twelve predicted transmembrane helices and are putative transporters for sugars, metabolites, nucleobases, vitamins or neurotransmitters. They include a wide range of examples from the following families: Metabolite-H+-symporter; Sugar Porter; Nucleobase-Cation-Symporter-1; Nucleobase-Cation-Symporter-2; and neurotransmitter-sodium-symporter. Overproduction of transporters was evaluated with three vectors (pTTQ18, pET52b, pWarf) and two Escherichia coli strains (BL21 Star and C43 (DE3)). Thirteen transporter genes were successfully expressed; only two did not express in any of the tested vector-strain combinations. Initial trials showed that seven transporters could be purified and six of these yielded quantities of ≥ 0.4 mg per litre suitable for functional and structural studies. Size-exclusion chromatography confirmed that two purified transporters were almost homogeneous while four others were shown to be non-aggregating, indicating that they are ready for up-scale production and crystallisation trials.
Conclusions/Significance Here, we describe an efficient strategy for heterologous production of membrane transport proteins in E. coli. Small-volume cultures (10 mL) produced sufficient amounts of protein to assess purity and aggregation state. The methods described in this work are simple to implement and can be easily applied to many more membrane proteins. PMID:24282478

  4. Protein Structure Prediction by Protein Threading

    NASA Astrophysics Data System (ADS)

    Xu, Ying; Liu, Zhijie; Cai, Liming; Xu, Dong

    The seminal work of Bowie, Lüthy, and Eisenberg (Bowie et al., 1991) on "the inverse protein folding problem" laid the foundation of protein structure prediction by protein threading. By using simple measures for fitness of different amino acid types to local structural environments defined in terms of solvent accessibility and protein secondary structure, the authors derived a simple and yet profoundly novel approach to assessing if a protein sequence fits well with a given protein structural fold. Their follow-up work (Elofsson et al., 1996; Fischer and Eisenberg, 1996; Fischer et al., 1996a,b) and the work by Jones, Taylor, and Thornton (Jones et al., 1992) on protein fold recognition led to the development of a new brand of powerful tools for protein structure prediction, which we now term "protein threading." These computational tools have played a key role in extending the utility of all the experimentally solved structures by X-ray crystallography and nuclear magnetic resonance (NMR), providing structural models and functional predictions for many of the proteins encoded in the hundreds of genomes that have been sequenced up to now.

  5. 3D-SURFER 2.0: web platform for real-time search and characterization of protein surfaces.

    PubMed

    Xiong, Yi; Esquivel-Rodriguez, Juan; Sael, Lee; Kihara, Daisuke

    2014-01-01

    The increasing number of uncharacterized protein structures necessitates the development of computational approaches for function annotation using the protein tertiary structures. Protein structure database search is the basis of any structure-based functional elucidation of proteins. 3D-SURFER is a web platform for real-time protein surface comparison of a given protein structure against the entire PDB using 3D Zernike descriptors. It can smoothly navigate the protein structure space in real-time from one query structure to another. A major new feature of Release 2.0 is the ability to compare the protein surface of a single chain, a single domain, or a single complex against databases of protein chains, domains, complexes, or a combination of all three in the latest PDB. Additionally, two types of protein structures can now be compared: all-atom-surface and backbone-atom-surface. The server can also accept a batch job for a large number of database searches. Pockets in protein surfaces can be identified by VisGrid and LIGSITE(csc). The server is available at http://kiharalab.org/3d-surfer/.
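
    Descriptor-based search of this kind can be illustrated with a short sketch. It assumes each surface has already been reduced to a fixed-length numeric vector and that vectors are compared by Euclidean distance. The function name and the in-memory dictionary are hypothetical and do not represent the 3D-SURFER implementation or its API.

```python
import math


def rank_by_descriptor(query, database, top_k=5):
    """Return the `top_k` database entries closest to `query`.

    query: sequence of floats (a precomputed shape descriptor).
    database: dict mapping entry name -> descriptor of the same length.
    Returns a list of (name, distance) pairs, nearest first.
    """
    scored = [(math.dist(query, vec), name) for name, vec in database.items()]
    scored.sort()  # ascending distance; ties broken by name
    return [(name, d) for d, name in scored[:top_k]]
```

    Because the descriptors are compared as plain vectors, no structural superposition is needed at query time, which is what makes a full-database scan fast enough for interactive use.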

  6. Protein Structure Determination using Metagenome sequence data

    PubMed Central

    Ovchinnikov, Sergey; Park, Hahnbeom; Varghese, Neha; Huang, Po-Ssu; Pavlopoulos, Georgios A.; Kim, David E.; Kamisetty, Hetunandan; Kyrpides, Nikos C.; Baker, David

    2017-01-01

    Despite decades of work by structural biologists, there are still ~5200 protein families with unknown structure outside the range of comparative modeling. We show that Rosetta structure prediction guided by residue-residue contacts inferred from evolutionary information can accurately model proteins that belong to large families, and that metagenome sequence data more than triples the number of protein families with sufficient sequences for accurate modeling. We then integrate metagenome data, contact based structure matching and Rosetta structure calculations to generate models for 614 protein families with currently unknown structures; 206 are membrane proteins and 137 have folds not represented in the PDB. This approach provides the representative models for large protein families originally envisioned as the goal of the protein structure initiative at a fraction of the cost. PMID:28104891

  7. Functional analysis of the Helicobacter pullorum N-linked protein glycosylation system.

    PubMed

    Jervis, Adrian J; Wood, Alison G; Cain, Joel A; Butler, Jonathan A; Frost, Helen; Lord, Elizabeth; Langdon, Rebecca; Cordwell, Stuart J; Wren, Brendan W; Linton, Dennis

    2018-04-01

    N-linked protein glycosylation systems operate in species from all three domains of life. The model bacterial N-linked glycosylation system from Campylobacter jejuni is encoded by pgl genes present at a single chromosomal locus. This gene cluster includes the pglB oligosaccharyltransferase responsible for transfer of glycan from lipid carrier to protein. Although all genomes from species of the Campylobacter genus contain a pgl locus, among the related Helicobacter genus only three evolutionarily related species (H. pullorum, H. canadensis and H. winghamensis) potentially encode N-linked protein glycosylation systems. Helicobacter putative pgl genes are scattered in five chromosomal loci and include two putative oligosaccharyltransferase-encoding pglB genes per genome. We have previously demonstrated the in vitro N-linked glycosylation activity of H. pullorum resulting in transfer of a pentasaccharide to a peptide at asparagine within the sequon (D/E)XNXS/T. In this study, we identified the first H. pullorum N-linked glycoprotein, termed HgpA. Production of histidine-tagged HgpA in the background of insertional knockout mutants of H. pullorum pgl/wbp genes followed by analysis of HgpA glycan structures demonstrated the role of individual gene products in the PglB1-dependent N-linked protein glycosylation pathway. Glycopeptide purification by zwitterionic-hydrophilic interaction liquid chromatography coupled with tandem mass spectrometry identified six glycosites from five H. pullorum proteins, which was consistent with proteins reactive with a polyclonal antiserum generated against glycosylated HgpA. This study demonstrates functioning of a H. pullorum N-linked general protein glycosylation system.
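
    The sequon quoted in the abstract, (D/E)XNXS/T, lends itself to a simple pattern search. The sketch below reads the motif literally, with X as any residue, and reports candidate asparagine positions in a one-letter protein sequence. The function name is hypothetical, and a match only marks a candidate site, not a confirmed glycosite.

```python
import re

# Literal reading of (D/E)XNXS/T. The lookahead lets overlapping
# candidates be reported instead of being skipped.
SEQUON = re.compile(r"(?=([DE].N.[ST]))")


def find_sequons(sequence):
    """Return (1-based position of the Asn, matched 5-mer) for each candidate."""
    seq = sequence.upper()
    # m.start() is the 0-based index of D/E; the Asn sits two residues later.
    return [(m.start() + 3, m.group(1)) for m in SEQUON.finditer(seq)]
```

    For example, `find_sequons("AADANGSAA")` returns `[(5, "DANGS")]`, since the asparagine is the fifth residue of that sequence.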

  8. Co-production of GroELS discriminates between intrinsic and thermally-induced recombinant protein aggregation during substrate quality control

    PubMed Central

    2011-01-01

    Background The effects and effectiveness of the chaperone pair GroELS on the yield and quality of recombinant polypeptides produced in Escherichia coli are a matter of controversy, as the reported activities of this complex are not always consistent and occasionally indicate undesired side effects. The divergence in the reported data could be due, at least partially, to different experimental conditions in independent research approaches. Results We therefore selected two structurally different model proteins (namely GFP and E. coli β-galactosidase) and two derived aggregation-prone fusions to explore, in a systematic way, the possible effects of GroELS co-production on yield, solubility and conformational quality. Host cells were cultured at two alternative temperatures below the threshold at which thermal stress is expected to be triggered, to minimize the involvement of independent stress factors. Conclusions From the analysis of protein yield, solubility and biological activity of the four model proteins produced alone or along with the chaperones, we conclude that GroELS affects the yield and quality of aggregation-prone proteins with intrinsic determinants but not thermally induced protein aggregation. No effective modifications of protein solubility were observed, but we did observe significant stabilization of small (encapsulable) substrates and moderate chaperone-induced degradation of larger (excluded) polypeptides. These findings indicate that the activities of this chaperone pair in the context of actively producing recombinant bacteria discriminate between intrinsic and thermally-induced protein aggregation, and that the side effects of GroELS overproduction might be determined by substrate size. PMID:21992454

  9. The Leishmania infantum PUF proteins are targets of the humoral response during visceral leishmaniasis

    PubMed Central

    2010-01-01

    Background RNA-binding proteins of the PUF family share a conserved domain consisting of tandemly repeated 36-40 amino acid motifs (typically eight) known as Puf repeats. Proteins containing tandem repeats are often dominant targets of humoral responses during infectious diseases. We therefore considered it of interest to analyze whether Leishmania PUF proteins are antigenic during visceral leishmaniasis (VL). Findings Here, employing whole-genome databases, we report the composition and structural features of the PUF family in Leishmania infantum. Additionally, the 10 genes of the L. infantum PUF family were cloned and used to express the Leishmania PUFs in bacteria as recombinant proteins. Finally, the antigenicity of these PUF proteins was evaluated by determining levels of specific antibodies in sera from experimentally infected hamsters. The Leishmania PUFs were all recognized by the sera, although with different degrees of reactivity and/or frequency of recognition. The reactivity of hamster sera against recombinant LiPUF1 and LiPUF2 was particularly prominent, and these proteins were subsequently assayed against sera from human patients. High antibody responses against rLiPUF1 and rLiPUF2 were found in sera from VL patients, but these proteins were also recognized by sera from Chagas' disease patients. Conclusion Our results suggest that Leishmania PUFs are targets of the humoral response during L. infantum infection and may represent candidates for serodiagnosis and/or vaccine reagents; however, the cross-reactivity of LiPUFs with antibodies induced against other trypanosomatids such as Trypanosoma cruzi should be kept in mind. PMID:20180988

  10. Visual signal detection in structured backgrounds. II. Effects of contrast gain control, background variations, and white noise

    NASA Technical Reports Server (NTRS)

    Eckstein, M. P.; Ahumada, A. J. Jr; Watson, A. B.

    1997-01-01

    Studies of visual detection of a signal superimposed on one of two identical backgrounds show performance degradation when the background has high contrast and is similar in spatial frequency and/or orientation to the signal. To account for this finding, models include a contrast gain control mechanism that pools activity across spatial frequency, orientation and space to inhibit (divisively) the response of the receptor sensitive to the signal. In tasks in which the observer has to detect a known signal added to one of M different backgrounds due to added visual noise, the main sources of degradation are the stochastic noise in the image and the suboptimal visual processing. We investigate how these two sources of degradation (contrast gain control and variations in the background) interact in a task in which the signal is embedded in one of M locations in a complex spatially varying background (structured background). We use backgrounds extracted from patient digital medical images. To isolate effects of the fixed deterministic background (the contrast gain control) from the effects of the background variations, we conduct detection experiments with three different background conditions: (1) uniform background, (2) a repeated sample of structured background, and (3) different samples of structured background. Results show that human visual detection degrades from the uniform background condition to the repeated background condition and degrades even further in the different backgrounds condition. These results suggest that both the contrast gain control mechanism and the background random variations degrade human performance in detection of a signal in a complex, spatially varying background. A filter model and added white noise are used to generate estimates of sampling efficiencies, an equivalent internal noise, an equivalent contrast-gain-control-induced noise, and an equivalent noise due to the variations in the structured background.

  11. HMPAS: Human Membrane Protein Analysis System

    PubMed Central

    2013-01-01

    Background Membrane proteins perform essential roles in diverse cellular functions and are regarded as major pharmaceutical targets. The significance of membrane proteins has led to the development of dozens of resources related to membrane proteins. However, most of these resources are built for specific well-known membrane protein groups, making it difficult to find common and specific features of various membrane protein groups. Methods We collected human membrane proteins from the dispersed resources and predicted novel membrane protein candidates by using ortholog information and our membrane protein classifiers. The membrane proteins were classified according to the type of interaction with the membrane, subcellular localization, and molecular function. We also made a new feature dataset to characterize the membrane proteins in various aspects including membrane protein topology, domain, biological process, disease, and drug. Moreover, protein structure and ICD-10-CM based integrated disease and drug information was newly included. To analyze the comprehensive information of membrane proteins, we implemented analysis tools to identify novel sequence and functional features of the classified membrane protein groups and to extract features from protein sequences. Results We constructed HMPAS with 28,509 collected known membrane proteins and 8,076 newly predicted candidates. This system provides integrated information of human membrane proteins individually and in groups organized by 45 subcellular locations and 1,401 molecular functions. As a case study, we identified associations between the membrane proteins and diseases and show that membrane proteins are promising targets for diseases related to the nervous system and circulatory system. A web-based interface of this system was constructed to enable researchers not only to retrieve organized information on individual proteins but also to use the tools to analyze the membrane proteins.
Conclusions HMPAS provides comprehensive information about human membrane proteins including specific features of certain membrane protein groups. In this system, users can obtain information on individual proteins and specified groups, focused on their conserved sequence features, involved cellular processes, and diseases. HMPAS may serve as a valuable resource for the inference of novel cellular mechanisms and pharmaceutical targets associated with the human membrane proteins. HMPAS is freely available at http://fcode.kaist.ac.kr/hmpas. PMID:24564858

  12. Characterizing the True Background Corona with SDO/AIA

    NASA Technical Reports Server (NTRS)

    Napier, Kate; Winebarger, Amy; Alexander, Caroline

    2014-01-01

    Characterizing the nature of the solar coronal background would enable scientists to more accurately determine plasma parameters, and may lead to a better understanding of the coronal heating problem. Because scientists study the 3D structure of the Sun in 2D, any line of sight includes both foreground and background material, and thus, the issue of background subtraction arises. By investigating the intensity values in and around an active region, using multiple wavelengths collected from the Atmospheric Imaging Assembly (AIA) on the Solar Dynamics Observatory (SDO) over an eight-hour period, this project aims to characterize the background as smooth or structured. Different methods were employed to measure the true coronal background and create minimum intensity images. These were then investigated for the presence of structure. The background images created were found to contain long-lived structures, including coronal loops, that were still present in all of the wavelengths: 193 Angstroms, 171 Angstroms, 131 Angstroms, and 211 Angstroms. The intensity profiles across the active region indicate that the background is much more structured than previously thought.

  13. Lessons from making the Structural Classification of Proteins (SCOP) and their implications for protein structure modelling.

    PubMed

    Andreeva, Antonina

    2016-06-15

    The Structural Classification of Proteins (SCOP) database has facilitated the development of many tools and algorithms and it has been successfully used in protein structure prediction and large-scale genome annotations. During the development of SCOP, numerous exceptions were found to topological rules, along with complex evolutionary scenarios and peculiarities in proteins including the ability to fold into alternative structures. This article reviews cases of structural variations observed for individual proteins and among groups of homologues, knowledge of which is essential for protein structure modelling. © 2016 The Author(s). Published by Portland Press Limited on behalf of the Biochemical Society.

  14. SDSL-ESR-based protein structure characterization.

    PubMed

    Strancar, Janez; Kavalenka, Aleh; Urbancic, Iztok; Ljubetic, Ajasja; Hemminga, Marcus A

    2010-03-01

    As proteins are key molecules in living cells, knowledge about their structure can provide important insights and applications in science, biotechnology, and medicine. However, many protein structures are still a big challenge for existing high-resolution structure-determination methods, as can be seen in the number of protein structures published in the Protein Data Bank. This is especially the case for less-ordered, more hydrophobic and more flexible protein systems. The lack of efficient methods for structure determination calls for urgent development of a new class of biophysical techniques. This work attempts to address this problem with a novel combination of site-directed spin labelling electron spin resonance spectroscopy (SDSL-ESR) and protein structure modelling, which is coupled by restriction of the conformational spaces of the amino acid side chains. Comparison of the application to four different protein systems enables us to generalize the new method and to establish a general procedure for determination of protein structure.

  15. Dissecting the relationship between protein structure and sequence variation

    NASA Astrophysics Data System (ADS)

    Shahmoradi, Amir; Wilke, Claus; Wilke Lab Team

    2015-03-01

    Over the past decade several independent works have shown that some structural properties of proteins are capable of predicting protein evolution. The strength and significance of these structure-sequence relations, however, appear to vary widely among different proteins, with absolute correlation strengths ranging from 0.1 to 0.8. Here we present the results from a comprehensive search for the potential biophysical and structural determinants of protein evolution by studying more than 200 structural and evolutionary properties in a dataset of 209 monomeric enzymes. We discuss the main protein characteristics responsible for the general patterns of protein evolution, and identify sequence divergence as the main determinant of the strengths of virtually all structure-evolution relationships, explaining ~10-30% of observed variation in sequence-structure relations. In addition to sequence divergence, we identify several protein structural properties that are moderately but significantly coupled with the strength of sequence-structure relations. In particular, proteins with more homogeneous backbone hydrogen bond energies, large fractions of helical secondary structures and low fraction of beta sheets tend to have the strongest sequence-structure relation. BEACON-NSF center for the study of evolution in action.

  16. Evaluation of 3D-Jury on CASP7 models

    PubMed Central

    Kaján, László; Rychlewski, Leszek

    2007-01-01

    Background 3D-Jury, the structure prediction consensus method publicly available in the Meta Server, was evaluated using models gathered in the 7th round of the Critical Assessment of Techniques for Protein Structure Prediction (CASP7). 3D-Jury is an automated expert process that generates protein structure meta-predictions from sets of models obtained from partner servers. Results The performance of 3D-Jury was analysed for three aspects. First, we examined the correlation between the 3D-Jury score and a model quality measure: the number of correctly predicted residues. The 3D-Jury score was shown to correlate significantly with the number of correctly predicted residues; the correlation is good enough to be used for prediction. 3D-Jury was also found to improve upon the competing servers' choice of the best structure model in most cases. The value of the 3D-Jury score as a generic reliability measure was also examined. We found that the 3D-Jury score separates bad models from good models better than the reliability score of the original server in 27 cases and falls short of it in only 5 cases out of a total of 38. We report the release of a new Meta Server feature: instant 3D-Jury scoring of uploaded user models. Conclusion The 3D-Jury score continues to be a good indicator of structural model quality. It also provides a generic reliability score, especially important for models that were not assigned such by the original server. Individual structure modellers can also benefit from the 3D-Jury scoring system by testing their models in the new instant scoring feature available in the Meta Server. PMID:17711571
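
    The abstract describes 3D-Jury as a consensus method but does not give its scoring formula. The sketch below shows only the general consensus idea: each model is scored by its average similarity to all other models in the set, and the best-supported model is picked. It is a simplified stand-in and not the published 3D-Jury score. The function names and the precomputed similarity matrix are assumptions.

```python
def consensus_scores(similarity):
    """Mean similarity of each model to every other model.

    similarity: square list of lists; similarity[i][j] is a pairwise
    structural similarity between models i and j (diagonal is ignored).
    """
    n = len(similarity)
    if n < 2:
        raise ValueError("need at least two models for a consensus")
    scores = []
    for i in range(n):
        others = [similarity[i][j] for j in range(n) if j != i]
        scores.append(sum(others) / len(others))
    return scores


def pick_consensus_model(similarity):
    """Index of the model with the highest consensus score."""
    scores = consensus_scores(similarity)
    return max(range(len(scores)), key=scores.__getitem__)
```

    The design rests on the assumption that independent servers tend to agree more often on correct structural features than on incorrect ones, so a model resembling many others is a reasonable default choice.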

  17. β-Propeller Blades as Ancestral Peptides in Protein Evolution

    PubMed Central

    Kopec, Klaus O.; Lupas, Andrei N.

    2013-01-01

    Proteins of the β-propeller fold are ubiquitous in nature and widely used as structural scaffolds for ligand binding and enzymatic activity. This fold comprises between four and twelve four-stranded β-meanders, the so called blades that are arranged circularly around a central funnel-shaped pore. Despite the large size range of β-propellers, their blades frequently show sequence similarity indicative of a common ancestry and it has been proposed that the majority of β-propellers arose divergently by amplification and diversification of an ancestral blade. Given the structural versatility of β-propellers and the hypothesis that the first folded proteins evolved from a simpler set of peptides, we investigated whether this blade may have given rise to other folds as well. Using sequence comparisons, we identified proteins of four other folds as potential homologs of β-propellers: the luminal domain of inositol-requiring enzyme 1 (IRE1-LD), type II β-prisms, β-pinwheels, and WW domains. Because, with increasing evolutionary distance and decreasing sequence length, the statistical significance of sequence comparisons becomes progressively harder to distinguish from the background of convergent similarities, we complemented our analyses with a new method that evaluates possible homology based on the correlation between sequence and structure similarity. Our results indicate a homologous relationship of IRE1-LD and type II β-prisms with β-propellers, and an analogous one for β-pinwheels and WW domains. Whereas IRE1-LD most likely originated by fold-changing mutations from a fully formed PQQ motif β-propeller, type II β-prisms originated by amplification and differentiation of a single blade, possibly also of the PQQ type. We conclude that both β-propellers and type II β-prisms arose by independent amplification of a blade-sized fragment, which represents a remnant of an ancient peptide world. PMID:24143202

  18. Protein enriched pasta: structure and digestibility of its protein network.

    PubMed

    Laleg, Karima; Barron, Cécile; Santé-Lhoutellier, Véronique; Walrand, Stéphane; Micard, Valérie

    2016-02-01

    Wheat (W) pasta was enriched in 6% gluten (G), 35% faba (F) or 5% egg (E) to increase its protein content (13% to 17%). The impact of the enrichment on the multiscale structure of the pasta and on in vitro protein digestibility was studied. Increasing the protein content (W- vs. G-pasta) strengthened pasta structure at molecular and macroscopic scales but reduced its protein digestibility by 3% by forming a higher covalently linked protein network. Greater changes in the macroscopic and molecular structure of the pasta were obtained by varying the nature of protein used for enrichment. Proteins in G- and E-pasta were highly covalently linked (28-32%) resulting in a strong pasta structure. Conversely, F-protein (98% SDS-soluble) altered the pasta structure by diluting gluten and formed a weak protein network (18% covalent link). As a result, protein digestibility in F-pasta was significantly higher (46%) than in E- (44%) and G-pasta (39%). The effect of low (55 °C, LT) vs. very high temperature (90 °C, VHT) drying on the protein network structure and digestibility was shown to cause greater molecular changes than pasta formulation. Whatever the pasta, a general strengthening of its structure, a 33% to 47% increase in covalently linked proteins and a higher β-sheet structure were observed. However, these structural differences were evened out after the pasta was cooked, resulting in identical protein digestibility in LT and VHT pasta. Even after VHT drying, F-pasta had the best amino acid profile with the highest protein digestibility, proof of its nutritional interest.

  19. CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction.

    PubMed

    Cui, Xuefeng; Lu, Zhiwu; Wang, Sheng; Jing-Yan Wang, Jim; Gao, Xin

    2016-06-15

    Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment, threading and alignment-free methods, protein homology detection remains a challenging open problem. Recently, network methods that try to find transitive paths in the protein structure space demonstrate the importance of incorporating network information of the structure space. Yet, current methods merge the sequence space and the structure space into a single space, and thus introduce inconsistency in combining different sources of information. We present a novel network-based protein homology detection method, CMsearch, based on cross-modal learning. Instead of exploring a single network built from the mixture of sequence and structure space information, CMsearch builds two separate networks to represent the sequence space and the structure space. It then learns sequence-structure correlation by simultaneously taking sequence information, structure information, sequence space information and structure space information into consideration. We tested CMsearch on two challenging tasks, protein homology detection and protein structure prediction, by querying all 8332 PDB40 proteins. Our results demonstrate that CMsearch is insensitive to the similarity metrics used to define the sequence and the structure spaces. By using HMM-HMM alignment as the sequence similarity metric, CMsearch clearly outperforms state-of-the-art homology detection methods and the CASP-winning template-based protein structure prediction methods. Our program is freely available for download from http://sfb.kaust.edu.sa/Pages/Software.aspx. Contact: xin.gao@kaust.edu.sa. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  20. Identification of Conserved Water Sites in Protein Structures for Drug Design.

    PubMed

    Jukič, Marko; Konc, Janez; Gobec, Stanislav; Janežič, Dušanka

    2017-12-26

    Identification of conserved waters in protein structures is a challenging task with applications in molecular docking and protein stability prediction. As an alternative to computationally demanding simulations of proteins in water, experimental cocrystallized waters in the Protein Data Bank (PDB) in combination with a local structure alignment algorithm can be used for reliable prediction of conserved water sites. We developed the ProBiS H2O approach based on the previously developed ProBiS algorithm, which enables identification of conserved water sites in proteins using experimental protein structures from the PDB or a set of custom protein structures available to the user. With a protein structure, a binding site, or an individual water molecule as a query, ProBiS H2O collects similar proteins from the PDB and performs local or binding site-specific superimpositions of the query structure with similar proteins using the ProBiS algorithm. It collects the experimental water molecules from the similar proteins and transposes them to the query protein. Transposed waters are clustered by their mutual proximity, which enables identification of discrete sites in the query protein with high water conservation. ProBiS H2O is a robust and fast new approach that uses existing experimental structural data to identify conserved water sites on the interfaces of protein complexes, for example protein-small molecule interfaces, and elsewhere on the protein structures. It has been successfully validated in several reported proteins in which conserved water molecules were found to play an important role in ligand binding with applications in drug design.
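
    The clustering step described above (grouping transposed waters by mutual proximity) can be sketched as follows. This is a simplified single-linkage illustration, not the ProBiS H2O code; the function name, the coordinates and the 1.5 Å cutoff are assumptions made for the example.

```python
import math


def cluster_waters(coords, cutoff=1.5):
    """Group water oxygen positions that lie within `cutoff` (in Å) of
    each other, using single-linkage clustering via union-find.

    Returns lists of indices into `coords`, largest cluster first.
    """
    parent = list(range(len(coords)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(coords)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values(), key=len, reverse=True)


# Hypothetical waters transposed onto a query protein from similar structures.
waters = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 10.0, 10.0)]
sites = cluster_waters(waters)
```

    A large cluster marks a position where many similar structures place a water, which is the intuition behind a conserved water site; how conservation is actually quantified is not stated in the abstract.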

  1. Lrs14 transcriptional regulators influence biofilm formation and cell motility of Crenarchaea

    PubMed Central

    Orell, Alvaro; Peeters, Eveline; Vassen, Victoria; Jachlewski, Silke; Schalles, Sven; Siebers, Bettina; Albers, Sonja-Verena

    2013-01-01

    Like bacteria, archaea predominately exist as biofilms in nature. However, the environmental cues and the molecular mechanisms driving archaeal biofilm development are not characterized. Here we provide data suggesting that the transcriptional regulators belonging to the Lrs14-like protein family constitute a key regulatory factor during Sulfolobus biofilm development. Among the six lrs14-like genes encoded by Sulfolobus acidocaldarius, the deletion of three led to markedly altered biofilm phenotypes. Although Δsaci1223 and Δsaci1242 deletion mutants were impaired in biofilm formation, the Δsaci0446 deletion strain exhibited a highly increased extracellular polymeric substance (EPS) production, leading to a robust biofilm structure. Moreover, although the expression of the adhesive pili (aap) genes was upregulated, the genes of the motility structure, the archaellum (fla), were downregulated rendering the Δsaci0446 strain non-motile. Gel shift assays confirmed that Saci0446 bound to the promoter regions of fla and aap thus controlling the expression of both cell surface structures. In addition, genetic epistasis analysis using Δsaci0446 as background strain identified a gene cluster involved in the EPS biosynthetic pathway of S. acidocaldarius. These results provide insights into both the molecular mechanisms that govern biofilm formation in Crenarchaea and the functionality of the Lrs14-like proteins, an archaea-specific class of transcriptional regulators. PMID:23657363

  2. Flavocoxid, a Natural Antioxidant, Protects Mouse Kidney from Cadmium-Induced Toxicity

    PubMed Central

    Trichilo, Vincenzo; Pisani, Antonina; Malta, Consuelo; Laurà, Rosalba; Santoro, Domenico; Germanà, Antonino; Minutoli, Letteria

    2018-01-01

    Background Cadmium (Cd), a diffused environmental pollutant, has adverse effects on urinary apparatus. The role of flavocoxid, a natural flavonoid with antioxidant activity, on the morphological and biochemical changes induced in vivo by Cd in mice kidney was evaluated. Methods C57 BL/6J mice received 0.9% NaCl alone, flavocoxid (20 mg/kg/day i.p.) alone, Cd chloride (CdCl2) (2 mg/kg/day i.p.) alone, or CdCl2 plus flavocoxid (2 mg/kg/day i.p. plus 20 mg/kg/day i.p.) for 14 days. The kidneys were processed for biochemical, structural, ultrastructural, and morphometric evaluation. Results Cd treatment alone significantly increased urea nitrogen and creatinine, iNOS, MMP-9, and pERK 1/2 expression and protein carbonyl; reduced GSH, GR, and GPx; and induced structural and ultrastructural changes in the glomeruli and in the tubular epithelium. After 14 days of treatment, flavocoxid administration reduced urea nitrogen and creatinine, iNOS, MMP-9, and pERK 1/2 expression and protein carbonyl; increased GSH, GR, and GPx; and showed an evident preservation of the glomerular and tubular structure and ultrastructure. Conclusions A protective role of flavocoxid against Cd-induced oxidative damages in mouse kidney was demonstrated for the first time. Flavocoxid may have a promising antioxidant role against environmental Cd harmful effects on glomerular and tubular lesions. PMID:29849925

  3. Functional assignment to JEV proteins using SVM.

    PubMed

    Sahoo, Ganesh Chandra; Dikhit, Manas Ranjan; Das, Pradeep

    2008-01-01

    Identification of different protein functions facilitates a mechanistic understanding of Japanese encephalitis virus (JEV) infection and opens novel means for drug development. Support vector machines (SVM), useful for predicting the functional class of distantly related proteins, are employed to ascribe a possible functional class to each JEV protein. Our analysis of the available JEV sequences with SVMProt suggests that the structural and non-structural proteins of the JEV genome possibly belong to diverse functional classes that are expected to occur in the life cycle of the virus. Functions common to both structural and non-structural proteins are iron-binding, metal-binding, lipid-binding, copper-binding, transmembrane, outer membrane, and channels/pores (pore-forming toxins, proteins and peptides). Non-structural proteins are assigned functions such as actin-binding, zinc-binding, calcium-binding, hydrolases, carbon-oxygen lyases, P-type ATPase, the major facilitator family (MFS), the secreting main terminal branch (MTB) family, phosphotransfer-driven group translocators and the ATP-binding cassette (ABC) family. Structural proteins, besides belonging to the expected structural classes (capsid, structural, envelope), are also assigned functions such as nuclear receptor, antibiotic resistance, RNA-binding, DNA-binding, magnesium-binding, isomerase (intramolecular) and oxidoreductase, and participation in the type II (general) secretory pathway (IISP).

  4. Functional assignment to JEV proteins using SVM

    PubMed Central

    Sahoo, Ganesh Chandra; Dikhit, Manas Ranjan; Das, Pradeep

    2008-01-01

    Identification of different protein functions facilitates a mechanistic understanding of Japanese encephalitis virus (JEV) infection and opens novel means for drug development. Support vector machines (SVM), useful for predicting the functional class of distantly related proteins, are employed to ascribe a possible functional class to each JEV protein. Our analysis of the available JEV sequences with SVMProt suggests that the structural and non-structural proteins of the JEV genome possibly belong to diverse functional classes that are expected to occur in the life cycle of the virus. Functions common to both structural and non-structural proteins are iron-binding, metal-binding, lipid-binding, copper-binding, transmembrane, outer membrane, and channels/pores (pore-forming toxins, proteins and peptides). Non-structural proteins are assigned functions such as actin-binding, zinc-binding, calcium-binding, hydrolases, carbon-oxygen lyases, P-type ATPase, the major facilitator family (MFS), the secreting main terminal branch (MTB) family, phosphotransfer-driven group translocators and the ATP-binding cassette (ABC) family. Structural proteins, besides belonging to the expected structural classes (capsid, structural, envelope), are also assigned functions such as nuclear receptor, antibiotic resistance, RNA-binding, DNA-binding, magnesium-binding, isomerase (intramolecular) and oxidoreductase, and participation in the type II (general) secretory pathway (IISP). PMID:19052658

  5. The four hexamerin genes in the honey bee: structure, molecular evolution and function deduced from expression patterns in queens, workers and drones

    PubMed Central

    2010-01-01

    Background Hexamerins are hemocyanin-derived proteins that have lost the ability to bind copper ions and transport oxygen; instead, they became storage proteins. The current study aimed to broaden our knowledge on the hexamerin genes found in the honey bee genome by exploring their structural characteristics, expression profiles, evolution, and functions in the life cycle of workers, drones and queens. Results The hexamerin genes of the honey bee (hex 70a, hex 70b, hex 70c and hex 110) diverge considerably in structure, so that the overall amino acid identity shared among their deduced protein subunits varies from 30 to 42%. Bioinformatics search for motifs in the respective upstream control regions (UCRs) revealed six overrepresented motifs including a potential binding site for Ultraspiracle (Usp), a target of juvenile hormone (JH). The expression of these genes was induced by topical application of JH on worker larvae. The four genes are highly transcribed by the larval fat body, although with significant differences in transcript levels, but only hex 110 and hex 70a are re-induced in the adult fat body in a caste- and sex-specific fashion, workers showing the highest expression. Transcripts for hex 110, hex 70a and hex 70b were detected in developing ovaries and testes, and hex 110 was highly transcribed in the ovaries of egg-laying queens. A phylogenetic analysis revealed that HEX 110 is located at the most basal position among the holometabola hexamerins, and like HEX 70a and HEX 70c, it shares potential orthology relationship with hexamerins from other hymenopteran species. Conclusions Striking differences were found in the structure and developmental expression of the four hexamerin genes in the honey bee. The presence of a potential binding site for Usp in the respective 5' UCRs, and the results of experiments on JH level manipulation in vivo support the hypothesis of regulation by JH. Transcript levels and patterns in the fat body and gonads suggest that, in addition to their primary role in supplying amino acids for metamorphosis, hexamerins serve as storage proteins for gonad development, egg production, and to support foraging activity. A phylogenetic analysis including the four deduced hexamerins and related proteins revealed a complex pattern of evolution, with independent radiation in insect orders. PMID:20346164

  6. Physics of protein folding

    NASA Astrophysics Data System (ADS)

    Finkelstein, A. V.; Galzitskaya, O. V.

    2004-04-01

    Protein physics is grounded on three fundamental experimental facts: a protein, this long heteropolymer, has a well-defined compact three-dimensional structure; this structure can spontaneously arise from the unfolded protein chain in an appropriate environment; and this structure is separated from the unfolded state of the chain by the “all-or-none” phase transition, which ensures robustness of protein structure and therefore of its action. The aim of this review is to consider the modern understanding of the physical principles of self-organization of protein structures and to overview such important features of this process as finding the unique protein structure among zillions of alternatives, nucleation of the folding process and metastable folding intermediates. Towards this end we will consider the main experimental facts and simple, mostly phenomenological theoretical models. We will concentrate on relatively small (single-domain) water-soluble globular proteins (whose structure and especially folding are much better studied and understood than those of large or membrane and fibrous proteins) and consider kinetic and structural aspects of the transition of initially unfolded protein chains into their final solid (“native”) 3D structures.

  7. Comprehensive comparison of two protein family of P-ATPases (13A1 and 13A3) in insects.

    PubMed

    Seddigh, Samin

    2017-06-01

    The P-type ATPases (P-ATPases) are present in all living cells, where they mediate ion transport across membranes at the expense of ATP hydrolysis. The ions transported by these pumps include protons, calcium, sodium, potassium, and heavy metals such as manganese, iron, copper, and zinc. Maintenance of the proper gradients for essential ions across cellular membranes makes P-ATPases crucial for cell survival. In this study, the characterization of two families of P-ATPases, P-ATPase 13A1 and P-ATPase 13A3, was compared in two different insect species from different orders. According to the conserved motifs found with MEME, nine motifs were shared by insects in the 13A1 family but only eight in the 13A3 family. Seven different insect species from the 13A1 family and five from the 13A3 family were selected as representative samples for functional and structural analyses. The structural and functional analyses were performed with the ProtParam, SOPMA, SignalP 4.1, TMHMM 2.0, ProtScale and ProDom tools in the ExPASy database. The tertiary structures of Bombus terrestris proteins, as a sample of each family, were predicted by the Phyre2 and TM-score servers and their similarities were verified by the SuperPose server. The tertiary structures were predicted via the "c3b9bA" model (PDB Accession Code: 3B9B) in the P-ATPase 13A1 family and the "c2zxeA" model (PDB Accession Code: 2ZXE) in the P-ATPase 13A3 family. A phylogenetic tree was constructed with MEGA 6.06 software using the neighbor-joining method. According to the results, there was high identity between the P-ATPase families, suggesting that they derive from a common ancestor, although they belonged to separate groups. In protein-protein interaction analysis by STRING 10.0, six common enriched KEGG pathways were identified in B. terrestris in both families. The obtained data provide a background for bioinformatic studies of the function and evolution of other insects and organisms.

  8. General overview on structure prediction of twilight-zone proteins.

    PubMed

    Khor, Bee Yin; Tye, Gee Jun; Lim, Theam Soon; Choong, Yee Siew

    2015-09-04

    Protein structure prediction from amino acid sequence has been one of the most challenging aspects of computational structural biology, despite the significant progress in recent years shown by the critical assessment of protein structure prediction (CASP) experiments. When experimentally determined structures are unavailable, predicted structures may serve as starting points to study a protein. If the target protein contains a homologous region, a high-resolution (typically <1.5 Å) model can be built via comparative modelling. However, when the target protein has low sequence similarity to available templates (a so-called twilight-zone protein, with sequence identity to available templates of less than 30%), the structure prediction has to be initiated from scratch. Traditionally, twilight-zone proteins are predicted via threading or ab initio methods. Based on the current trend, a combination of different methods brings improved success in the prediction of twilight-zone proteins. In this mini review, the methods, progress and challenges in the prediction of twilight-zone proteins are discussed.
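
    The 30% identity criterion mentioned above can be illustrated with a short sketch. It assumes the two sequences are already aligned (gaps written as "-") and computes identity over aligned, non-gap columns; other definitions of the denominator exist, and the function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def is_twilight_zone(aligned_a, aligned_b, threshold=30.0):
    """True when identity to the template falls below `threshold` percent."""
    return percent_identity(aligned_a, aligned_b) < threshold


# Toy alignments (not real proteins).
close_pair = percent_identity("ACDE-G", "ACDFHG")
distant = is_twilight_zone("ACDEFGHIKL", "AXXXXXXXXL")
```

    A target flagged this way would, following the review, be routed to threading or ab initio methods rather than comparative modelling.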

  9. Structure-Based Characterization of Multiprotein Complexes

    PubMed Central

    Wiederstein, Markus; Gruber, Markus; Frank, Karl; Melo, Francisco; Sippl, Manfred J.

    2014-01-01

    Summary Multiprotein complexes govern virtually all cellular processes. Their 3D structures provide important clues to their biological roles, especially through structural correlations among protein molecules and complexes. The detection of such correlations generally requires comprehensive searches in databases of known protein structures by means of appropriate structure-matching techniques. Here, we present a high-speed structure search engine capable of instantly matching large protein oligomers against the complete and up-to-date database of biologically functional assemblies of protein molecules. We use this tool to reveal unseen structural correlations on the level of protein quaternary structure and demonstrate its general usefulness for efficiently exploring complex structural relationships among known protein assemblies. PMID:24954616

  10. Mathematical methods for protein science

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hart, W.; Istrail, S.; Atkins, J.

    1997-12-31

    Understanding the structure and function of proteins is a fundamental endeavor in molecular biology. Currently, over 100,000 protein sequences have been determined by experimental methods. The three dimensional structure of the protein determines its function, but there are currently less than 4,000 structures known to atomic resolution. Accordingly, techniques to predict protein structure from sequence have an important role in aiding the understanding of the Genome and the effects of mutations in genetic disease. The authors describe current efforts at Sandia to better understand the structure of proteins through rigorous mathematical analyses of simple lattice models. The efforts have focused on two aspects of protein science: mathematical structure prediction, and inverse protein folding.

  11. Resource for structure related information on transmembrane proteins

    NASA Astrophysics Data System (ADS)

    Tusnády, Gábor E.; Simon, István

    Transmembrane proteins are involved in a wide variety of vital biological processes including transport of water-soluble molecules, flow of information and energy production. Despite significant efforts to determine the structures of these proteins, only a few thousand solved structures are known so far. Here, we review the various resources for structure-related information on these types of proteins ranging from the 3D structure to the topology and from the up-to-date databases to the various Internet sites and servers dealing with structure prediction and structure analysis. Abbreviations: 3D, three dimensional; PDB, Protein Data Bank; TMP, transmembrane protein.

  12. G-LoSA for Prediction of Protein-Ligand Binding Sites and Structures.

    PubMed

    Lee, Hui Sun; Im, Wonpil

    2017-01-01

    Recent advances in high-throughput structure determination and computational protein structure prediction have significantly enriched the universe of protein structure. However, there is still a large gap between the number of available protein structures and that of proteins with annotated function in high accuracy. Computational structure-based protein function prediction has emerged to reduce this knowledge gap. The identification of a ligand binding site and its structure is critical to the determination of a protein's molecular function. We present a computational methodology for predicting small molecule ligand binding site and ligand structure using G-LoSA, our protein local structure alignment and similarity measurement tool. All the computational procedures described here can be easily implemented using G-LoSA Toolkit, a package of standalone software programs and preprocessed PDB structure libraries. G-LoSA and G-LoSA Toolkit are freely available to academic users at http://compbio.lehigh.edu/GLoSA. We also illustrate a case study to show the potential of our template-based approach harnessing G-LoSA for protein function prediction.

  13. A structural-alphabet-based strategy for finding structural motifs across protein families

    PubMed Central

    Wu, Chih Yuan; Chen, Yao Chi; Lim, Carmay

    2010-01-01

    Proteins with insignificant sequence and overall structure similarity may still share locally conserved contiguous structural segments; i.e. structural/3D motifs. Most methods for finding 3D motifs require a known motif to search for other similar structures or functionally/structurally crucial residues. Here, without requiring a query motif or essential residues, a fully automated method for discovering 3D motifs of various sizes across protein families with different folds based on a 16-letter structural alphabet is presented. It was applied to structurally non-redundant proteins bound to DNA, RNA, obligate/non-obligate proteins as well as free DNA-binding proteins (DBPs) and proteins with known structures but unknown function. Its usefulness was illustrated by analyzing the 3D motifs found in DBPs. A non-specific motif was found with a ‘corner’ architecture that confers a stable scaffold and enables diverse interactions, making it suitable for binding not only DNA but also RNA and proteins. Furthermore, DNA-specific motifs present ‘only’ in DBPs were discovered. The motifs found can provide useful guidelines in detecting binding sites and computational protein redesign. PMID:20525797
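
    Once each backbone is encoded as a string over a structural alphabet, a search for contiguous segments shared between proteins reduces to a string problem. The sketch below uses exact k-mer matching as a deliberately simplified stand-in; the abstract does not describe the actual discovery procedure, and the function name, segment length and toy encodings are assumptions.

```python
from collections import defaultdict


def shared_segments(encoded, k=5, min_proteins=2):
    """Return structural-alphabet segments of length `k` that occur in
    at least `min_proteins` different proteins.

    `encoded` maps a protein name to its structural-alphabet string.
    """
    seen = defaultdict(set)
    for name, letters in encoded.items():
        for i in range(len(letters) - k + 1):
            seen[letters[i:i + k]].add(name)
    return {
        segment: sorted(names)
        for segment, names in seen.items()
        if len(names) >= min_proteins
    }


# Toy encodings; real structural-alphabet strings would come from 3D coordinates.
toy = {"p1": "ABCDEFG", "p2": "XXCDEFY", "p3": "QQQQQQ"}
motifs = shared_segments(toy, k=4)
```

    Exact matching misses segments that are structurally similar but encoded with neighbouring letters, so a practical method would need a tolerance for near matches.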

  14. The planetary biology of cytochrome P450 aromatases

    PubMed Central

    Gaucher, Eric A; Graddy, Logan G; Li, Tang; Simmen, Rosalia CM; Simmen, Frank A; Schreiber, David R; Liberles, David A; Janis, Christine M; Benner, Steven A

    2004-01-01

    Background Joining a model for the molecular evolution of a protein family to the paleontological and geological records (geobiology), and then to the chemical structures of substrates, products, and protein folds, is emerging as a broad strategy for generating hypotheses concerning function in a post-genomic world. This strategy expands systems biology to a planetary context, necessary for a notion of fitness to underlie (as it must) any discussion of function within a biomolecular system. Results Here, we report an example of such an expansion, where tools from planetary biology were used to analyze three genes from the pig Sus scrofa that encode cytochrome P450 aromatases–enzymes that convert androgens into estrogens. The evolutionary history of the vertebrate aromatase gene family was reconstructed. Transition redundant exchange silent substitution metrics were used to interpolate dates for the divergence of family members, the paleontological record was consulted to identify changes in physiology that correlated in time with the change in molecular behavior, and new aromatase sequences from peccary were obtained. Metrics that detect changing function in proteins were then applied, including KA/KS values and those that exploit structural biology. These identified specific amino acid replacements that were associated with changing substrate and product specificity during the time of presumed adaptive change. The combined analysis suggests that aromatase paralogs arose in pigs as a result of selection for Suoidea with larger litters than their ancestors, and permitted the Suoidea to survive the global climatic trauma that began in the Eocene. Conclusions This combination of bioinformatics analysis, molecular evolution, paleontology, cladistics, global climatology, structural biology, and organic chemistry serves as a paradigm in planetary biology. As the geological, paleontological, and genomic records improve, this approach should become widely useful to make systems biology statements about high-level function for biomolecular systems. PMID:15315709

  15. Overcoming the heterologous bias: An in vivo functional analysis of multidrug efflux transporter, CgCdr1p in matched pair clinical isolates of Candida glabrata

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Puri, Nidhi; Manoharlal, Raman; Sharma, Monika

    2011-01-07

    Research highlights: First report to demonstrate an in vivo expression system of an ABC multidrug transporter CgCdr1p of C. glabrata. First report on the structure and functional characterization of CgCdr1p. Functional conservation of divergent but typical residues of CgCdr1p. CgCdr1p elicits promiscuity towards substrates and has a large drug binding pocket with overlapping specificities. Abstract: We have taken advantage of the natural milieu of matched pair of azole sensitive (AS) and azole resistant (AR) clinical isolates of Candida glabrata for expressing its major ABC multidrug transporter, CgCdr1p for structure and functional analysis. This was accomplished by tagging a green fluorescent protein (GFP) downstream of ORF of CgCDR1 and integrating the resultant fusion protein at its native chromosomal locus in AS and AR backgrounds. The characterization confirmed that in comparison to AS isolate, CgCdr1p-GFP was over-expressed in AR isolates due to its hyperactive native promoter and the GFP tag did not affect its functionality in either construct. We observed that in addition to Rhodamine 6G (R6G) and Fluconazole (FLC), a recently identified fluorescent substrate of multidrug transporters Nile Red (NR) could also be expelled by CgCdr1p. Competition assays with these substrates revealed the presence of overlapping multiple drug binding sites in CgCdr1p. Point mutations employing site directed mutagenesis confirmed that the role played by unique amino acid residues critical to ATP catalysis and localization of ABC drug transporter proteins are well conserved in C. glabrata as in other yeasts. This study demonstrates a first in vivo novel system where over-expression of GFP tagged MDR transporter protein can be driven by its own hyperactive promoter of AR isolates. Taken together, this in vivo system can be exploited for the structure and functional analysis of CgCdr1p and similar proteins wherein the artifactual concerns encountered in using heterologous systems are totally excluded.

  16. Subjects harboring presenilin familial Alzheimer’s disease mutations exhibit diverse white matter biochemistry alterations

    PubMed Central

    Roher, Alex E; Maarouf, Chera L; Malek-Ahmadi, Michael; Wilson, Jeffrey; Kokjohn, Tyler A; Daugs, Ian D; Whiteside, Charisse M; Kalback, Walter M; Macias, MiMi P; Jacobson, Sandra A; Sabbagh, Marwan N; Ghetti, Bernardino; Beach, Thomas G

    2013-01-01

    Alzheimer’s disease (AD) dementia impacts all facets of higher order cognitive function and is characterized by the presence of distinctive pathological lesions in the gray matter (GM). The profound alterations in GM structure and function have fostered the view that AD impacts are primarily a consequence of GM damage. However, the white matter (WM) represents about 50% of the cerebrum and this area of the brain is substantially atrophied and profoundly abnormal in both sporadic AD (SAD) and familial AD (FAD). We examined the WM biochemistry by ELISA and Western blot analyses of key proteins in 10 FAD cases harboring mutations in the presenilin genes PSEN1 and PSEN2 as well as in 4 non-demented control (NDC) individuals and 4 subjects with SAD. The molecules examined were direct substrates of PSEN1 such as Notch-1 and amyloid precursor protein (APP). In addition, apolipoproteins, axonal transport molecules, cytoskeletal and structural proteins, neurotrophic factors and synaptic proteins were examined. PSEN-FAD subjects had, on average, higher amounts of WM amyloid-beta (Aβ) peptides compared to SAD, which may play a role in the devastating dysfunction of the brain. However, the PSEN-FAD mutations we examined did not produce uniform increases in the relative proportions of Aβ42 and exhibited substantial variability in total Aβ levels. These observations suggest that neurodegeneration and dementia do not depend solely on enhanced Aβ42 levels. Our data revealed additional complexities in PSEN-FAD individuals. Some direct substrates of γ-secretase, such as Notch, N-cadherin, Erb-B4 and APP, deviated substantially from the NDC group baseline for some, but not all, mutation types. Proteins that were not direct γ-secretase substrates, but play key structural and functional roles in the WM, likewise exhibited varied concentrations in the distinct PSEN mutation backgrounds. Detailing the diverse biochemical pathology spectrum of PSEN mutations may offer valuable insights into dementia progression and the design of effective therapeutic interventions for both SAD and FAD. PMID:24093083

  17. Improving predicted protein loop structure ranking using a Pareto-optimality consensus method

    PubMed Central

    2010-01-01

    Background: Accurate protein loop structure models are important for understanding the functions of many proteins. Identifying the native or near-native models by distinguishing them from the misfolded ones is a critical step in protein loop structure prediction. Results: We have developed a Pareto Optimal Consensus (POC) method, which is a consensus model ranking approach to integrate multiple knowledge- or physics-based scoring functions. The procedure of identifying the models of best quality in a model set includes: 1) identifying the models at the Pareto optimal front with respect to a set of scoring functions, and 2) ranking them based on the fuzzy dominance relationship to the rest of the models. We apply the POC method to a large number of decoy sets for loops of 4 to 12 residues in length using a functional space composed of several carefully selected scoring functions: Rosetta, DOPE, DDFIRE, OPLS-AA, and a triplet backbone dihedral potential developed in our lab. Our computational results show that the sets of Pareto-optimal decoys, which typically make up ~20% or less of the overall decoys in a set, have good coverage of the best or near-best decoys in more than 99% of the loop targets. Compared to the individual scoring function yielding the best selection accuracy in the decoy sets, the POC method yields 23%, 37%, and 64% fewer false positives in distinguishing the native conformation, identifying a near-native model (RMSD < 0.5 Å from the native) as top-ranked, and selecting at least one near-native model in the top-5-ranked models, respectively. Similar effectiveness of the POC method is also found in the decoy sets from membrane protein loops. Furthermore, the POC method outperforms the other popularly used consensus strategies in model ranking, such as rank-by-number, rank-by-rank, rank-by-vote, and regression-based methods. Conclusions: By integrating multiple knowledge- and physics-based scoring functions based on Pareto optimality and fuzzy dominance, the POC method is effective in distinguishing the best loop models from the other ones within a loop model set. PMID:20642859
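
    The two-step procedure described in this abstract (find the Pareto-optimal front, then rank its members by dominance) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that lower scores are better for every scoring function, and it replaces the fuzzy dominance ranking, which the abstract does not specify, with a plain count of how many models each front member strictly dominates. The function names and the toy score tuples are hypothetical.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if model a is no worse than b on every score and strictly
    better on at least one (assumption: lower score = better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Indices of models that no other model dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


def rank_front(scores: List[Sequence[float]]) -> List[int]:
    """Order the front by how many models each member dominates.
    This crisp count is a stand-in for fuzzy dominance, not the POC rule."""
    def dominated_count(i: int) -> int:
        return sum(dominates(scores[i], t) for t in scores)

    # Most-dominating first; ties broken by original index.
    return sorted(pareto_front(scores), key=lambda i: (-dominated_count(i), i))


# Toy decoy set: each tuple holds two hypothetical scoring-function values.
decoys = [
    (1.0, 5.0),  # 0: best on the first score
    (2.0, 2.0),  # 1: balanced
    (3.0, 3.0),  # 2: dominated by decoy 1
    (5.0, 1.0),  # 3: best on the second score
    (6.0, 6.0),  # 4: dominated by every other decoy
]

print(pareto_front(decoys))
print(rank_front(decoys))
```

    In this toy set, decoys 2 and 4 are each beaten on both scores by another decoy, so they fall outside the front; the remaining decoys represent different trade-offs between the two scores.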

  18. An overview of the structures of protein-DNA complexes

    PubMed Central

    Luscombe, Nicholas M; Austin, Susan E; Berman, Helen M; Thornton, Janet M

    2000-01-01

    On the basis of a structural analysis of 240 protein-DNA complexes contained in the Protein Data Bank (PDB), we have classified the DNA-binding proteins involved into eight different structural/functional groups, which are further classified into 54 structural families. Here we present this classification and review the functions, structures and binding interactions of these protein-DNA complexes. PMID:11104519

  19. The amyR-deletion strain of Aspergillus niger CICC2462 is a suitable host strain to express secreted protein with a low background.

    PubMed

    Zhang, Hui; Wang, Shuang; Zhang, Xiang Xiang; Ji, Wei; Song, Fuping; Zhao, Yue; Li, Jie

    2016-04-28

    The filamentous fungus Aspergillus niger is widely exploited as an important expression host for industrial production. The glucoamylase high-producing strain A. niger CICC2462 has been used as a host strain for the establishment of a secretion expression system. It expresses recombinant xylanase, mannase and asparaginase at a high level, but some highly secreted background proteins, such as alpha-amylase and alpha-glucosidase, still remain in these recombinant strains and lead to low purity of the fermentation products. The aim was to construct an A. niger host strain with a low background of protein secretion. The transcription factor amyR was deleted in A. niger CICC2462, and the results from enzyme activity assays and SDS-PAGE analysis showed that the glucoamylase and amylase activities of the ∆amyR strains were significantly lower than those of the wild-type strain. High-throughput RNA-sequencing and shotgun LC-MS/MS proteomic analysis demonstrated that the expression of amylolytic enzymes was decreased at both the transcriptional and translational levels in the ∆amyR strain. Interestingly, the ∆amyR strain had a better growth rate than the wild-type strain. Our findings clearly indicate that the ∆amyR strain of A. niger CICC2462 can be used as a host strain with a low background of protein secretion.

  20. Functional structural motifs for protein-ligand, protein-protein, and protein-nucleic acid interactions and their connection to supersecondary structures.

    PubMed

    Kinjo, Akira R; Nakamura, Haruki

    2013-01-01

    Protein functions are mediated by interactions between proteins and other molecules. One useful approach to analyze protein functions is to compare and classify the structures of interaction interfaces of proteins. Here, we describe the procedures for compiling a database of interface structures and efficiently comparing the interface structures. Doing so requires a good understanding of the data structures of the Protein Data Bank (PDB). Therefore, we also provide a detailed account of the PDB exchange dictionary necessary for extracting data that are relevant for analyzing interaction interfaces and secondary structures. We identify recurring structural motifs by classifying similar interface structures, and we define a coarse-grained representation of supersecondary structures (SSS) which represents a sequence of two or three secondary structure elements, including their relative orientations, as a string of four to seven letters. By examining the correspondence between structural motifs and SSS strings, we show that no SSS string has a particularly high propensity to be found at interaction interfaces in general, indicating that any SSS can be used as a binding interface. When individual structural motifs are examined, there are some SSS strings that have a high propensity for particular groups of structural motifs. In addition, it is shown that while the SSS strings found in particular structural motifs for nonpolymer and protein interfaces are as abundant as in other structural motifs that belong to the same subunit, structural motifs for nucleic acid interfaces exhibit a somewhat stronger preference for SSS strings. In regard to protein folds, many motif-specific SSS strings were found across many folds, suggesting that SSS may be a useful description to investigate the universality of ligand binding modes.
