Metagenomics and the protein universe
Godzik, Adam
2011-01-01
Metagenomics sequencing projects have dramatically increased our knowledge of the protein universe and provided over one-half of currently known protein sequences; they have also introduced a much broader phylogenetic diversity into the protein databases. The full analysis of metagenomic datasets is only beginning, but it has already led to the discovery of thousands of new protein families, likely representing novel functions specific to given environments. At the same time, a deeper analysis of such novel families, including experimental structure determination of some representatives, suggests that most of them represent distant homologs of already characterized protein families, and thus most of the protein diversity present in the new environments is due to functional divergence of the known protein families rather than the emergence of new ones. PMID:21497084
Pang, Siew Wai; Lahiri, Chandrajit; Poh, Chit Laa; Tan, Kuan Onn
2018-05-01
The Paraneoplastic Ma (PNMA) family comprises a growing number of members that share relatively conserved protein sequences; the corresponding genes are encoded by the human genome and localized to several human chromosomes, including the X chromosome. Based on sequence analysis, PNMA family members share sequence homology with the Gag protein of LTR retrotransposons, and several family members with aberrant protein expression have been reported to be closely associated with human Paraneoplastic Disorder (PND). In addition, gene mutations of specific members of the PNMA family are known to be associated with human mental retardation or 3-M syndrome, which consists of restricted post-natal growth or dwarfism and the development of skeletal abnormalities. Other than sequence homology, the physiological function of many members of this family remains unclear. However, several members of this family have been characterized, including cell signalling events mediated by these proteins that are associated with apoptosis and cancer in different cell types. Furthermore, while certain PNMA family members show restricted gene expression in the human brain and testis, other PNMA family members exhibit broader gene expression or preferential and selective protein interaction profiles, suggesting functional divergence within the family. Functional analysis of some members of this family has identified protein domains that are required for subcellular localization, protein-protein interactions, and cell signalling events, which are the focus of this review paper.
Acyl carrier protein structural classification and normal mode analysis
Cantu, David C; Forrester, Michael J; Charov, Katherine; Reilly, Peter J
2012-01-01
All acyl carrier protein primary and tertiary structures were gathered into the ThYme database. They are classified into 16 families by amino acid sequence similarity, with members of the different families having sequences with statistically highly significant differences. These classifications are supported by tertiary structure superposition analysis. Tertiary structures from a number of families are very similar, suggesting that these families may come from a single distant ancestor. Normal vibrational mode analysis was conducted on experimentally determined freestanding structures, showing greater fluctuations at chain termini and loops than in most helices. Their modes overlap more so within families than between different families. The tertiary structures of three acyl carrier protein families that lacked any known structures were predicted as well. PMID:22374859
2012-01-01
Background WASP family proteins stimulate the actin-nucleating activity of the ARP2/3 complex. They include members of the well-known WASP and WAVE/Scar proteins, and the recently identified WASH and WHAMM proteins. WASP family proteins contain family-specific N-terminal domains followed by proline-rich regions and C-terminal VCA domains that harbour the ARP2/3-activating regions. Results To reveal the evolution of ARP2/3 activation by WASP family proteins we performed a "holistic" analysis by manually assembling and annotating all homologs in most of the eukaryotic genomes available. We have identified two new families: the WAML proteins (WASP and MIM like), which combine the membrane-deforming and actin-bundling functions of the IMD domains with the ARP2/3-activating VCA regions, and the WAWH proteins (WASP without WH1 domain), which have been identified in amoebae, Apusozoa, and the anole lizard. Surprisingly, with one exception we did not identify any alternative splice forms for WASP family proteins, which is in strong contrast to other actin-binding proteins like Ena/VASP, MIM, or NHS proteins that share domains with WASP proteins. Conclusions Our analysis showed that the last common ancestor of the eukaryotes must have contained a homolog of WASP, WAVE, and WASH. Specific families have subsequently been lost in many taxa, such as the WASPs in plants, algae, Stramenopiles, and Euglenozoa, and the WASH proteins in fungi. The WHAMM proteins are metazoa-specific and have most probably been invented by the Eumetazoa. The diversity of WASP family proteins has been strongly increased by many species- and taxon-specific gene duplications and multimerisations. All data are freely accessible via http://www.cymobase.org. PMID:22316129
Pittman, Jon K; Hirschi, Kendal D
2016-12-01
The Ca(2+)/Cation Antiporter (CaCA) superfamily is an ancient and widespread family of ion-coupled cation transporters found in nearly all kingdoms of life. In animals, K(+)-dependent and K(+)-independent Na(+)/Ca(2+) exchangers (NCKX and NCX) are important CaCA members. Recently it was proposed that all rice and Arabidopsis CaCA proteins should be classified as NCX proteins. Here we performed phylogenetic analysis of CaCA genes and protein structure homology modelling to further characterise members of this transporter superfamily. Phylogenetic analysis of rice and Arabidopsis CaCAs in comparison with selected CaCA members from non-plant species demonstrated that these genes form clearly distinct families, with the H(+)/Cation exchanger (CAX) and cation/Ca(2+) exchanger (CCX) families dominant in higher plants but the NCKX and NCX families absent. NCX-related Mg(2+)/H(+) exchanger (MHX) and CAX-related Na(+)/Ca(2+) exchanger-like (NCL) proteins are instead present. Analysis of genomes of ten closely related rice species and four Arabidopsis-related species found that CaCA gene family structures are highly conserved within related plants, apart from minor variation. Protein structures were modelled for OsCAX1a and OsMHX1. Despite broad structural conservation, clear structural differences are observed between the different CaCA types. Members of the CaCA superfamily form clearly distinct families with different phylogenetic, structural and functional characteristics, and therefore should not be simply classified as NCX proteins, which should remain a separate gene family.
DWARF – a data warehouse system for analyzing protein families
Fischer, Markus; Thai, Quan K; Grieb, Melanie; Pleiss, Jürgen
2006-01-01
Background The emerging field of integrative bioinformatics provides the tools to organize and systematically analyze vast amounts of highly diverse biological data and thus makes it possible to gain a novel understanding of complex biological systems. The data warehouse DWARF applies integrative bioinformatics approaches to the analysis of large protein families. Description The data warehouse system DWARF integrates data on sequence, structure, and functional annotation for protein fold families. The underlying relational data model consists of three major sections representing entities related to the protein (biochemical function, source organism, classification into homologous families and superfamilies), the protein sequence (position-specific annotation, mutant information), and the protein structure (secondary structure information, superimposed tertiary structure). Tools for extracting, transforming and loading data from publicly available resources (ExPDB, GenBank, DSSP) are provided to populate the database. The data can be accessed by an interface for searching and browsing, and by analysis tools that operate on annotation, sequence, or structure. We applied DWARF to the family of α/β-hydrolases to host the Lipase Engineering database. Release 2.3 contains 6138 sequences and 167 experimentally determined protein structures, which are assigned to 37 superfamilies and 103 homologous families. Conclusion DWARF has been designed for constructing databases of large structurally related protein families and for evaluating their sequence-structure-function relationships by a systematic analysis of sequence, structure and functional annotation. It has been applied to predict biochemical properties from sequence, and serves as a valuable tool for protein engineering. PMID:17094801
SCOWLP classification: Structural comparison and analysis of protein binding regions
Teyra, Joan; Paszkowski-Rogacz, Maciej; Anders, Gerd; Pisabarro, M Teresa
2008-01-01
Background Detailed information about protein interactions is critical for our understanding of the principles governing protein recognition mechanisms. The structures of many proteins have been experimentally determined in complex with different ligands bound either in the same or different binding regions. Thus, the structural interactome requires the development of tools to classify protein binding regions. A proper classification may provide a general view of the regions that a protein uses to bind others and also facilitate a detailed comparative analysis of the interaction information for specific protein binding regions at the atomic level. Such a classification might be of potential use for deciphering protein interaction networks, understanding protein function, and rational engineering and design. Description Protein binding regions (PBRs) might ideally be described as well-defined, separated regions that share no interacting residues with one another. However, PBRs are often irregular and discontinuous, and can share a wide range of interacting residues among them. The criteria used to define an individual binding region can often be arbitrary and may differ from those used for other binding regions within a protein family. Therefore, the rationale behind protein interface classification should aim to fulfil the requirements of the analysis to be performed. We extract detailed interaction information on protein domains, peptides and interfacial solvent from the SCOWLP database and classify the PBRs of each domain family. For this purpose, we define a similarity index based on the overlap of interacting residues mapped in pair-wise structural alignments. We perform our classification with agglomerative hierarchical clustering using the complete-linkage method. Our classification is calculated at different similarity cut-offs to allow flexibility in the analysis of PBRs, a feature especially interesting for those protein families with conflicting binding regions. 
The hierarchical classification of PBRs is implemented in the SCOWLP database and extends the SCOP classification with three additional family sub-levels: Binding Region, Interface and Contacting Domains. SCOWLP contains 9,334 binding regions distributed within 2,561 families. In 65% of the cases we observe families containing more than one binding region. In addition, 22% of the regions form complexes with more than one protein family. Conclusion The current SCOWLP classification and its web application represent a framework for the study of protein interfaces and comparative analysis of protein family binding regions. This comparison can be performed at the atomic level and allows the user to study interactome conservation and variability. The new SCOWLP classification may be of great utility for reconstruction of protein complexes, understanding protein networks and ligand design. SCOWLP will be updated with every SCOP release. The web application is available at . PMID:18182098
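The clustering step described in the SCOWLP abstract can be illustrated with a minimal sketch. It assumes that each binding region is a set of aligned residue positions and that the similarity index is a Jaccard overlap; the abstract does not specify the actual index, and the names `overlap_similarity` and `complete_linkage` are hypothetical, not part of SCOWLP.

```python
from itertools import combinations


def overlap_similarity(a, b):
    """Jaccard overlap of two sets of interacting residue positions (assumed index)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def complete_linkage(regions, cutoff):
    """Agglomerative clustering: merge while the LEAST similar cross pair is >= cutoff."""
    clusters = [[name] for name in regions]

    def linkage(c1, c2):
        # Complete linkage scores two clusters by their weakest cross-cluster pair.
        return min(overlap_similarity(regions[x], regions[y]) for x in c1 for y in c2)

    while len(clusters) > 1:
        (i, j), best = max(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < cutoff:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return sorted(sorted(c) for c in clusters)


# Toy binding regions (hypothetical residue positions).
regions = {
    "A": {10, 11, 12, 15},
    "B": {10, 11, 12, 16},
    "C": {40, 41, 42},
    "D": {41, 42, 43},
}

for cutoff in (0.5, 0.6, 0.7):
    print(cutoff, complete_linkage(regions, cutoff))
```

Running the clustering at several cut-offs mirrors the flexibility mentioned in the abstract: a lower cut-off merges partially overlapping regions, while a higher one keeps them apart.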
Genome-Wide Identification and Comparative Analysis of Albumin Family in Vertebrates
Li, Shugang; Cao, Yiping; Geng, Fang
2017-01-01
Albumins are the most well-known globular proteins, and the most typical representatives are the serum albumins. However, apart from human and bovine serum albumin, the albumin family has received little attention. To characterize the features of the albumin family, we mined all the putative albumin proteins from the available genome sequences. The results showed that albumin is widely distributed in vertebrates but not present in bacteria and archaea. The phylogenetic analysis of the vertebrate albumin family implied an evolutionary relationship between serum albumin, α-fetoprotein, vitamin D–binding protein, and afamin. In addition, a new member of the albumin family was found, namely, extracellular matrix protein 1. The structural analysis revealed that the motifs for forming the internal disulfide bonds are highly conserved in the albumin family, despite the low overall sequence identity across the family. The domain arrangement of albumin proteins indicated that most vertebrate albumins contain 3 characteristic domains, arising from 2 evolutionary patterns. A significant trend was also observed: the albumin proteins in higher vertebrate species tend to possess more characteristic domains. This study provides the fundamental information required for a better understanding of albumin distribution, phylogenetic relationships, characteristic motifs, and structure, as well as new insights into the evolutionary pattern. PMID:28680266
Protein interactions and ligand binding: from protein subfamilies to functional specificity.
Rausell, Antonio; Juan, David; Pazos, Florencio; Valencia, Alfonso
2010-02-02
The divergence accumulated during the evolution of protein families translates into their internal organization as subfamilies, and it is directly reflected in the characteristic patterns of differentially conserved residues. These specifically conserved positions in protein subfamilies are known as "specificity determining positions" (SDPs). Previous studies have limited their analysis to the study of the relationship between these positions and ligand-binding specificity, demonstrating significant yet limited predictive capacity. We have systematically extended this observation to include the role of differential protein interactions in the segregation of protein subfamilies and explored in detail the structural distribution of SDPs at protein interfaces. Our results show the extensive influence of protein interactions in the evolution of protein families and the widespread association of SDPs with protein interfaces. The combined analysis of SDPs in interfaces and ligand-binding sites provides a more complete picture of the organization of protein families, constituting the necessary framework for a large scale analysis of the evolution of protein function.
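The notion of specificity determining positions can be made concrete with a small sketch. It uses a deliberately strict criterion (a column is fully conserved within every subfamily but not identical across subfamilies), which is an assumption made for illustration; published SDP methods rely on statistical scoring rather than this all-or-nothing rule. The function name `specificity_positions` and the toy alignment are hypothetical.

```python
def specificity_positions(subfamilies):
    """Return 0-based alignment columns that are conserved within each
    subfamily but differ between at least two subfamilies."""
    length = len(next(iter(subfamilies.values()))[0])
    positions = []
    for col in range(length):
        consensus = []
        for seqs in subfamilies.values():
            residues = {s[col] for s in seqs}
            # Require a single, non-gap residue inside the subfamily.
            if len(residues) != 1 or "-" in residues:
                break
            consensus.append(residues.pop())
        else:
            # Every subfamily is internally conserved; keep the column only
            # if the conserved residues are not all the same.
            if len(set(consensus)) > 1:
                positions.append(col)
    return positions


# Toy pre-aligned sequences grouped by subfamily.
alignment = {
    "subfamily_1": ["MKLSD", "MKLTD", "MKLSD"],
    "subfamily_2": ["MRLAD", "MRLGD", "MRLAD"],
}

print(specificity_positions(alignment))
```

In the toy alignment only the second column qualifies: it is K throughout one subfamily and R throughout the other, whereas the fourth column varies inside each subfamily and the remaining columns are identical everywhere.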
Jaiswal, Mamta; Dvorsky, Radovan; Ahmadian, Mohammad Reza
2013-02-08
The diffuse B-cell lymphoma (Dbl) family of the guanine nucleotide exchange factors is a direct activator of the Rho family proteins. The Rho family proteins are involved in almost every cellular process that ranges from fundamental (e.g. the establishment of cell polarity) to highly specialized processes (e.g. the contraction of vascular smooth muscle cells). Abnormal activation of the Rho proteins is known to play a crucial role in cancer, infectious and cognitive disorders, and cardiovascular diseases. However, the existence of 74 Dbl proteins and 25 Rho-related proteins in humans, which are largely uncharacterized, has led to increasing complexity in identifying specific upstream pathways. Thus, we comprehensively investigated sequence-structure-function-property relationships of 21 representatives of the Dbl protein family regarding their specificities and activities toward 12 Rho family proteins. The meta-analysis approach provides an unprecedented opportunity to broadly profile functional properties of Dbl family proteins, including catalytic efficiency, substrate selectivity, and signaling specificity. Our analysis has provided novel insights into the following: (i) understanding of the relative differences of various Rho protein members in nucleotide exchange; (ii) comparing and defining individual and overall guanine nucleotide exchange factor activities of a large representative set of the Dbl proteins toward 12 Rho proteins; (iii) grouping the Dbl family into functionally distinct categories based on both their catalytic efficiencies and their sequence-structural relationships; (iv) identifying conserved amino acids as fingerprints of the Dbl and Rho protein interaction; and (v) defining amino acid sequences conserved within, but not between, Dbl subfamilies. Therefore, the characteristics of such specificity-determining residues identified the regions or clusters conserved within the Dbl subfamilies.
Crystal structure of the YDR533c S. cerevisiae protein, a class II member of the Hsp31 family.
Graille, Marc; Quevillon-Cheruel, Sophie; Leulliot, Nicolas; Zhou, Cong-Zhao; Li de la Sierra Gallay, Ines; Jacquamet, Lilian; Ferrer, Jean-Luc; Liger, Dominique; Poupon, Anne; Janin, Joel; van Tilbeurgh, Herman
2004-05-01
The ORF YDR533c from Saccharomyces cerevisiae codes for a 25.5 kDa protein of unknown biochemical function. Transcriptome analysis of yeast has shown that this gene is activated in response to various stress conditions together with proteins belonging to the heat shock family. In order to clarify its biochemical function, we determined the crystal structure of YDR533c to 1.85 Å resolution by the single anomalous diffraction method. The protein possesses an alpha/beta hydrolase fold and a putative Cys-His-Glu catalytic triad common to a large enzyme family containing proteases, amidotransferases, lipases, and esterases. The protein has strong structural resemblance with the E. coli Hsp31 protein and the intracellular protease I from Pyrococcus horikoshii, which are considered class I and class III members of the Hsp31 family, respectively. Detailed structural analysis strongly suggests that the YDR533c protein crystal structure is the first one of a class II member of the Hsp31 family.
Liu, Xiuying; Luo, GuanZheng; Bai, Xiujuan; Wang, Xiu-Jie
2009-10-01
MicroRNAs are small non-coding RNAs approximately 22 nt long that play important regulatory roles in eukaryotes. The biogenesis and functional processes of microRNAs require the participation of many proteins, of which the best studied are Dicer, Drosha, Argonaute and Exportin 5. To systematically study these four protein families, we screened 11 animal genomes for genes encoding the above-mentioned proteins and identified some new members for each family. Domain analysis revealed that most proteins within the same family share identical or similar domains. Alternatively spliced transcript variants were found for some proteins. We also examined the expression patterns of these proteins in different human tissues and identified other proteins that could potentially interact with them. These findings provide systematic information on the four key proteins involved in microRNA biogenesis and functional pathways in animals, and will shed light on further functional studies of these proteins.
Re-visiting protein-centric two-tier classification of existing DNA-protein complexes
2012-01-01
Background Precise DNA-protein interactions play a most important and vital role in maintaining the normal physiological functioning of the cell, as they control many high-fidelity cellular processes. Detailed study of the nature of these interactions has paved the way for understanding the mechanisms behind the biological processes in which they are involved. Earlier, in 2000, a systematic classification of DNA-protein complexes based on the structural analysis of the proteins was proposed at two tiers, namely groups and families. With the advancement in the number and resolution of structures of DNA-protein complexes deposited in the Protein Data Bank, it is important to revisit the existing classification. Results On the basis of the sequence analysis of DNA-binding proteins, we have built upon the protein-centric, two-tier classification of DNA-protein complexes by adding new members to existing families and making new families and groups. While classifying the new complexes, we also observed the emergence of new groups and families. The new group observed was one in which a β-propeller interacts with DNA. There were 34 SCOP folds present in the complexes of both old and new classifications, whereas 28 folds are present exclusively in the new complexes. Some of the new families were NarL transcription factor, Z-α DNA binding proteins, Forkhead transcription factor, AP2 protein, Methyl CpG binding protein etc. Conclusions Our results suggest that with the increasing availability of DNA-protein complexes in the Protein Data Bank, the number of families in the classification increased approximately threefold. The folds present exclusively in newly classified complexes are suggestive of the inclusion of proteins with new functions in the new classification, the most populated of which are the folds responsible for DNA damage repair. 
The proposed re-visited classification can be used to perform genome-wide surveys in the genomes of interest for the presence of DNA-binding proteins. Further analysis of these complexes can aid in developing algorithms for identifying DNA-binding proteins and their family members from mere sequence information. PMID:22800292
Re-visiting protein-centric two-tier classification of existing DNA-protein complexes.
Malhotra, Sony; Sowdhamini, Ramanathan
2012-07-16
Precise DNA-protein interactions play a most important and vital role in maintaining the normal physiological functioning of the cell, as they control many high-fidelity cellular processes. Detailed study of the nature of these interactions has paved the way for understanding the mechanisms behind the biological processes in which they are involved. Earlier, in 2000, a systematic classification of DNA-protein complexes based on the structural analysis of the proteins was proposed at two tiers, namely groups and families. With the advancement in the number and resolution of structures of DNA-protein complexes deposited in the Protein Data Bank, it is important to revisit the existing classification. On the basis of the sequence analysis of DNA-binding proteins, we have built upon the protein-centric, two-tier classification of DNA-protein complexes by adding new members to existing families and making new families and groups. While classifying the new complexes, we also observed the emergence of new groups and families. The new group observed was one in which a β-propeller interacts with DNA. There were 34 SCOP folds present in the complexes of both old and new classifications, whereas 28 folds are present exclusively in the new complexes. Some of the new families were NarL transcription factor, Z-α DNA binding proteins, Forkhead transcription factor, AP2 protein, Methyl CpG binding protein etc. Our results suggest that with the increasing availability of DNA-protein complexes in the Protein Data Bank, the number of families in the classification increased approximately threefold. The folds present exclusively in newly classified complexes are suggestive of the inclusion of proteins with new functions in the new classification, the most populated of which are the folds responsible for DNA damage repair. 
The proposed re-visited classification can be used to perform genome-wide surveys in the genomes of interest for the presence of DNA-binding proteins. Further analysis of these complexes can aid in developing algorithms for identifying DNA-binding proteins and their family members from mere sequence information.
Evaluation of variability in high-resolution protein structures by global distance scoring.
Anzai, Risa; Asami, Yoshiki; Inoue, Waka; Ueno, Hina; Yamada, Koya; Okada, Tetsuji
2018-01-01
Systematic analysis of the statistical and dynamical properties of proteins is critical to understanding cellular events. Extraction of biologically relevant information from a set of high-resolution structures is important because it can provide mechanistic details behind the functional properties of protein families, enabling rational comparison between families. Most current structural comparisons are pairwise, which hampers global analysis of the growing contents of the Protein Data Bank. Additionally, pairing of protein structures introduces uncertainty with respect to reproducibility because it is frequently accompanied by other settings for superimposition. This study introduces intramolecular distance scoring for the global analysis of proteins for which at least several high-resolution structures are available. As a pilot study, we tested 300 human proteins and showed that the method can be used comprehensively to overview advances in each protein and protein family at the atomic level. This method, together with the interpretation of the model calculations, provides new criteria for understanding specific structural variation in a protein, enabling global comparison of the variability in proteins from different species.
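The appeal of intramolecular distances is that they need no superimposition: a distance between two atoms of the same molecule is unchanged by rotation or translation. The sketch below illustrates that idea only. The abstract does not give the scoring formula, so the per-pair population standard deviation and its mean are assumed stand-ins, and `distance_variability` and `global_score` are hypothetical names.

```python
from itertools import combinations
from math import dist
from statistics import mean, pstdev


def distance_variability(structures):
    """structures: list of coordinate lists with the same atoms in the same order.
    Returns {(i, j): std. dev. of the i-j distance across all structures}."""
    n_atoms = len(structures[0])
    return {
        (i, j): pstdev([dist(s[i], s[j]) for s in structures])
        for i, j in combinations(range(n_atoms), 2)
    }


def global_score(structures):
    """Assumed summary: mean per-pair variability (0 for identical conformations)."""
    return mean(list(distance_variability(structures).values()))


# Three-atom toy "structures" (coordinates are made up).
reference = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
shifted = [(x + 1.0, y + 1.0, z + 1.0) for x, y, z in reference]  # rigid translation
moved = [(1.0, 1.0, 1.0), (4.0, 1.0, 1.0), (1.0, 7.0, 1.0)]      # third atom displaced

print(global_score([reference, shifted]))
print(distance_variability([reference, moved]))
```

A rigid translation leaves every pairwise distance unchanged, so the first score is zero without any alignment step, while displacing one atom shows up only in the pairs that involve it.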
Structure-sequence based analysis for identification of conserved regions in proteins
Zemla, Adam T; Zhou, Carol E; Lam, Marisa W; Smith, Jason R; Pardes, Elizabeth
2013-05-28
Disclosed are computational methods, and associated hardware and software products, for scoring conservation in a protein structure based on a computationally identified family or cluster of protein structures. A method of computationally identifying a family or cluster of protein structures is also disclosed herein.
Structure-based analysis of catalysis and substrate definition in the HIT protein family.
Lima, C D; Klein, M G; Hendrickson, W A
1997-10-10
The histidine triad (HIT) protein family is among the most ubiquitous and highly conserved in nature, but a biological activity has not yet been identified for any member of the HIT family. Fragile histidine triad protein (FHIT) and protein kinase C interacting protein (PKCI) were used in a structure-based approach to elucidate characteristics of in vivo ligands and reactions. Crystallographic structures of apo, substrate analog, pentacovalent transition-state analog, and product states of both enzymes reveal a catalytic mechanism and define substrate characteristics required for catalysis, thus unifying the HIT family as nucleotidyl hydrolases, transferases, or both. The approach described here may be useful in identifying structure-function relations between protein families identified through genomics.
Song, Jian Bo; Wang, Yan Xiang; Li, Hai Bo; Li, Bo Wen; Zhou, Zhao Sheng; Gao, Shuai; Yang, Zhi Min
2015-07-01
F-box proteins are subunits of the Skp1-Rbx1-Cul1-F-box protein (SCF) complex, carry a typically conserved F-box motif of approximately 40 amino acids, and form one of the largest protein families in eukaryotes. F-box proteins play critical roles in selective and specific protein degradation through the 26S proteasome. In this study, we bioinformatically identified 972 putative F-box proteins from the Medicago truncatula genome. Our analysis showed that in addition to the conserved motif, the F-box proteins have several other functional domains in their C-terminal regions (e.g., LRRs, Kelch, FBA, and PP2), some of which were found to be M. truncatula species-specific. By phylogenetic analysis of the F-box motifs, these proteins can be classified into three major families, and each family can be further divided into subgroups. Analysis of the genomic distribution of F-box genes on M. truncatula chromosomes revealed that the evolutionary expansion of F-box genes in M. truncatula was probably due to localized gene duplications. To investigate the possible response of the F-box genes to abiotic stresses, both publicly available and customer-prepared microarrays were analyzed. Most of the F-box protein genes can respond to salt and heavy metal stresses. Real-time PCR analysis confirmed that some of the F-box protein genes containing heat, drought, salicylic acid, and abscisic acid responsive cis-elements were able to respond to the abiotic stresses.
FunShift: a database of function shift analysis on protein subfamilies
Abhiman, Saraswathi; Sonnhammer, Erik L. L.
2005-01-01
Members of a protein family normally have a general biochemical function in common, but frequently one or more subgroups have evolved a slightly different function, such as different substrate specificity. It is important to detect such function shifts for a more accurate functional annotation. The FunShift database described here is a compilation of function shift analysis performed between subfamilies in protein families. It consists of two main components: (i) subfamilies derived from protein domain families and (ii) pairwise subfamily comparisons analyzed for function shift. The present release, FunShift 12, was derived from Pfam 12 and consists of 151 934 subfamilies derived from 7300 families. We carried out function shift analysis by two complementary methods on families with up to 500 members. From a total of 179 210 subfamily pairs, 62 384 were predicted to be functionally shifted in 2881 families. Each subfamily pair is provided with a markup of probable functional specificity-determining sites. Tools for searching and exploring the data are provided to make this database a valuable resource for protein function annotation. Knowledge of these functionally important sites will be useful for experimental biologists performing functional mutation studies. FunShift is available at http://FunShift.cgb.ki.se. PMID:15608176
Dewhurst, Henry M.; Choudhury, Shilpa; Torres, Matthew P.
2015-01-01
Predicting the biological function potential of post-translational modifications (PTMs) is becoming increasingly important in light of the exponential increase in available PTM data from high-throughput proteomics. We developed structural analysis of PTM hotspots (SAPH-ire)—a quantitative PTM ranking method that integrates experimental PTM observations, sequence conservation, protein structure, and interaction data to allow rank order comparisons within or between protein families. Here, we applied SAPH-ire to the study of PTMs in diverse G protein families, a conserved and ubiquitous class of proteins essential for maintenance of intracellular structure (tubulins) and signal transduction (large and small Ras-like G proteins). A total of 1728 experimentally verified PTMs from eight unique G protein families were clustered into 451 unique hotspots, 51 of which have a known and cited biological function or response. Using customized software, the hotspots were analyzed in the context of 598 unique protein structures. By comparing distributions of hotspots with known versus unknown function, we show that SAPH-ire analysis is predictive for PTM biological function. Notably, SAPH-ire revealed high-ranking hotspots for which a functional impact has not yet been determined, including phosphorylation hotspots in the N-terminal tails of G protein gamma subunits—conserved protein structures never before reported as regulators of G protein coupled receptor signaling. To validate this prediction we used the yeast model system for G protein coupled receptor signaling, revealing that gamma subunit–N-terminal tail phosphorylation is activated in response to G protein coupled receptor stimulation and regulates protein stability in vivo. These results demonstrate the utility of integrating protein structural and sequence features into PTM prioritization schemes that can improve the analysis and functional power of modification-specific proteomics data. PMID:26070665
PATtyFams: Protein families for the microbial genomes in the PATRIC database
Davis, James J.; Gerdes, Svetlana; Olsen, Gary J.; ...
2016-02-08
The ability to build accurate protein families is a fundamental operation in bioinformatics that influences comparative analyses, genome annotation, and metabolic modeling. For several years we have been maintaining protein families for all microbial genomes in the PATRIC database (Pathosystems Resource Integration Center, patricbrc.org) in order to drive many of the comparative analysis tools that are available through the PATRIC website. However, due to the burgeoning number of genomes, traditional approaches for generating protein families are becoming prohibitive. In this report, we describe a new approach for generating protein families, which we call PATtyFams. This method uses the k-mer-based function assignments available through RAST (Rapid Annotation using Subsystem Technology) to rapidly guide family formation, and then differentiates the function-based groups into families using a Markov Cluster algorithm (MCL). In conclusion, this new approach for generating protein families is rapid, scalable and has properties that are consistent with alignment-based methods.
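The Markov Cluster step named in this abstract can be illustrated with a minimal sketch. It assumes a small, hypothetical symmetric similarity matrix for the proteins in one function-based group; the function name `markov_cluster`, the inflation value, and the convergence tolerance are illustrative choices, not details of the PATtyFams implementation.

```python
def _normalize_columns(matrix):
    """Scale each column so that it sums to 1 (column-stochastic matrix)."""
    n = len(matrix)
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [[matrix[i][j] / col_sums[j] for j in range(n)] for i in range(n)]


def _square(matrix):
    """Expansion step: multiply the matrix by itself."""
    n = len(matrix)
    return [[sum(matrix[i][k] * matrix[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_cluster(similarity, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster items from a square, non-negative similarity matrix.

    Hypothetical helper: alternates expansion (matrix squaring) and
    inflation (element-wise power followed by column normalization)
    until the flow matrix stops changing.
    """
    n = len(similarity)
    # Self-loops are a common MCL convention; they also keep column sums nonzero.
    flow = _normalize_columns(
        [[similarity[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    )
    for _ in range(max_iter):
        expanded = _square(flow)
        inflated = _normalize_columns(
            [[value ** inflation for value in row] for row in expanded]
        )
        change = max(abs(inflated[i][j] - flow[i][j])
                     for i in range(n) for j in range(n))
        flow = inflated
        if change < tol:
            break
    # Rows that retain flow act as attractors; their nonzero columns are members.
    clusters = {frozenset(j for j in range(n) if flow[i][j] > 1e-6)
                for i in range(n)}
    return sorted(sorted(cluster) for cluster in clusters if cluster)
```

Raising the inflation parameter generally yields smaller, tighter clusters, which is the knob one would tune when splitting a function-based group into families.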
PATtyFams: Protein families for the microbial genomes in the PATRIC database
DOE Office of Scientific and Technical Information (OSTI.GOV)
Davis, James J.; Gerdes, Svetlana; Olsen, Gary J.
The ability to build accurate protein families is a fundamental operation in bioinformatics that influences comparative analyses, genome annotation, and metabolic modeling. For several years we have been maintaining protein families for all microbial genomes in the PATRIC database (Pathosystems Resource Integration Center, patricbrc.org) in order to drive many of the comparative analysis tools that are available through the PATRIC website. However, due to the burgeoning number of genomes, traditional approaches for generating protein families are becoming prohibitive. In this report, we describe a new approach for generating protein families, which we call PATtyFams. This method uses the k-mer-based function assignments available through RAST (Rapid Annotation using Subsystem Technology) to rapidly guide family formation, and then differentiates the function-based groups into families using a Markov Cluster algorithm (MCL). In conclusion, this new approach for generating protein families is rapid, scalable and has properties that are consistent with alignment-based methods.
Pereira, Filipe; Duarte-Pereira, Sara; Silva, Raquel M.; da Costa, Luís Teixeira; Pereira-Castro, Isabel
2016-01-01
The NET (for NocA, Nlz, Elbow, TLP-1) protein family is a group of conserved zinc finger proteins linked to embryonic development and recently associated with breast cancer. The members of this family act as transcriptional repressors interacting with both class I histone deacetylases and Groucho/TLE co-repressors. In Drosophila, the NET family members Elbow and NocA are vital for the development of tracheae, eyes, wings and legs, whereas in vertebrates ZNF703 and ZNF503 are important for the development of the nervous system, eyes and limbs. Despite the relevance of this protein family in embryogenesis and cancer, many aspects of its origin and evolution remain unknown. Here, we show that NET family members are present and expressed in multiple metazoan lineages, from cnidarians to vertebrates. We identified several protein domains conserved in all metazoan species or in specific taxonomic groups. Our phylogenetic analysis suggests that the NET family emerged in the last common ancestor of cnidarians and bilaterians and that several rounds of independent events of gene duplication occurred throughout evolution. Overall, we provide novel data on the expression and evolutionary history of the NET family that can be relevant to understanding its biological role in both normal conditions and disease. PMID:27929068
Odronitz, Florian; Kollmar, Martin
2006-11-29
Annotation of protein sequences of eukaryotic organisms is crucial for the understanding of their function in the cell. Manual annotation is still by far the most accurate way to correctly predict genes. The classification of protein sequences, their phylogenetic relation and the assignment of function involve information from various sources. This often leads to a collection of heterogeneous data, which is hard to track. Cytoskeletal and motor proteins consist of large and diverse superfamilies comprising up to several dozen members per organism. To date, there is no integrated tool available to assist in the manual large-scale comparative genomic analysis of protein families. Pfarao (Protein Family Application for Retrieval, Analysis and Organisation) is a database-driven online working environment for the analysis of manually annotated protein sequences and their relationships. Currently, the system can store and interrelate a wide range of information about protein sequences, species, phylogenetic relations and sequencing projects as well as links to literature and domain predictions. Sequences can be imported from multiple sequence alignments that are generated during the annotation process. A web interface allows users to conveniently browse the database and to compile tabular and graphical summaries of its content. We implemented a protein sequence-centric web application to store, organize, interrelate, and present heterogeneous data that are generated in manual genome annotation and comparative genomics. The application has been developed for the analysis of cytoskeletal and motor proteins (CyMoBase) but can easily be adapted for any protein.
Genome-wide analysis of the TPX2 family proteins in Eucalyptus grandis.
Du, Pingzhou; Kumar, Manoj; Yao, Yuan; Xie, Qiaoli; Wang, Jinyan; Zhang, Baolong; Gan, Siming; Wang, Yuqi; Wu, Ai-Min
2016-11-24
The targeting protein for Xklp2 (TPX2) proteins belong to the microtubule-associated protein (MAP) family. All members of the family contain the conserved TPX2 motif, which can interact with microtubules, regulate microtubule dynamics or assist with different microtubule functions, for example, maintenance of cell morphology or regulation of cell growth and development. However, the roles of members of the TPX2 family have not been studied in the model tree species Eucalyptus to date. Here, we report the identification of the members of the TPX2 family in Eucalyptus grandis (Eg) and analyse the expression patterns and functions of these genes. In the present study, a comprehensive analysis of the plant TPX2 family proteins was performed. Phylogenetic analyses indicated that the genes can be classified into 6 distinct subfamilies. A genome-wide survey identified 12 members of the TPX2 family in the sequenced genome of Eucalyptus grandis. The basic genetic properties of the TPX2 family in Eucalyptus were analysed. Our results suggest that the TPX2 family proteins within different sub-groups are relatively conserved but there are important differences between groups. Quantitative real-time PCR (qRT-PCR) was performed to confirm the expression levels of the genes in different tissues. The results showed that in the whole plant, the levels of EgWDL5 transcript are the highest, followed by those of EgWDL4. Compared with other tissues, the level of the EgMAP20 transcript is the highest in the root. Over-expression of EgMAP20 in Arabidopsis resulted in organ twisting. The cotyledon petioles showed left-handed twisting while the hypocotyl epidermal cells produced right-handed helical twisting. Finally, EgMAP20, EgWDL3 and EgWDL3L were all able to decorate microtubules. Plant TPX2 family proteins were systematically analysed using bioinformatics methods. There are 12 TPX2 family proteins in Eucalyptus. 
We have performed an initial characterization of the functions of several members of the TPX2 family. We found that the gene products are localized to the microtubule cytoskeleton. Our results lay the foundation for future efforts to reveal the biological significance of TPX2 family proteins in Eucalyptus.
Dewhurst, Henry M; Choudhury, Shilpa; Torres, Matthew P
2015-08-01
Predicting the biological function potential of post-translational modifications (PTMs) is becoming increasingly important in light of the exponential increase in available PTM data from high-throughput proteomics. We developed structural analysis of PTM hotspots (SAPH-ire)--a quantitative PTM ranking method that integrates experimental PTM observations, sequence conservation, protein structure, and interaction data to allow rank order comparisons within or between protein families. Here, we applied SAPH-ire to the study of PTMs in diverse G protein families, a conserved and ubiquitous class of proteins essential for maintenance of intracellular structure (tubulins) and signal transduction (large and small Ras-like G proteins). A total of 1728 experimentally verified PTMs from eight unique G protein families were clustered into 451 unique hotspots, 51 of which have a known and cited biological function or response. Using customized software, the hotspots were analyzed in the context of 598 unique protein structures. By comparing distributions of hotspots with known versus unknown function, we show that SAPH-ire analysis is predictive for PTM biological function. Notably, SAPH-ire revealed high-ranking hotspots for which a functional impact has not yet been determined, including phosphorylation hotspots in the N-terminal tails of G protein gamma subunits--conserved protein structures never before reported as regulators of G protein coupled receptor signaling. To validate this prediction we used the yeast model system for G protein coupled receptor signaling, revealing that gamma subunit-N-terminal tail phosphorylation is activated in response to G protein coupled receptor stimulation and regulates protein stability in vivo. These results demonstrate the utility of integrating protein structural and sequence features into PTM prioritization schemes that can improve the analysis and functional power of modification-specific proteomics data. 
© 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
Odronitz, Florian; Kollmar, Martin
2006-01-01
Background: Annotation of protein sequences of eukaryotic organisms is crucial for the understanding of their function in the cell. Manual annotation is still by far the most accurate way to correctly predict genes. The classification of protein sequences, their phylogenetic relation and the assignment of function involve information from various sources. This often leads to a collection of heterogeneous data, which is hard to track. Cytoskeletal and motor proteins consist of large and diverse superfamilies comprising up to several dozen members per organism. To date, there is no integrated tool available to assist in the manual large-scale comparative genomic analysis of protein families. Description: Pfarao (Protein Family Application for Retrieval, Analysis and Organisation) is a database-driven online working environment for the analysis of manually annotated protein sequences and their relationships. Currently, the system can store and interrelate a wide range of information about protein sequences, species, phylogenetic relations and sequencing projects as well as links to literature and domain predictions. Sequences can be imported from multiple sequence alignments that are generated during the annotation process. A web interface allows users to conveniently browse the database and to compile tabular and graphical summaries of its content. Conclusion: We implemented a protein sequence-centric web application to store, organize, interrelate, and present heterogeneous data that are generated in manual genome annotation and comparative genomics. The application has been developed for the analysis of cytoskeletal and motor proteins (CyMoBase) but can easily be adapted for any protein. PMID:17134497
Campion, S R; Ameen, A S; Lai, L; King, J M; Munzenmaier, T N
2001-08-15
This report describes the application of a simple computational tool, AAPAIR.TAB, for the systematic analysis of the cysteine-rich EGF, Sushi, and Laminin motif/sequence families at the two-amino acid level. Automated dipeptide frequency/bias analysis detects preferences in the distribution of amino acids in established protein families, by determining which "ordered dipeptides" occur most frequently in comprehensive motif-specific sequence data sets. Graphic display of the dipeptide frequency/bias data revealed family-specific preferences for certain dipeptides, but more importantly detected a shared preference for employment of the ordered dipeptides Gly-Tyr (GY) and Gly-Phe (GF) in all three protein families. The dipeptide Asn-Gly (NG) also exhibited high-frequency and bias in the EGF and Sushi motif families, whereas Asn-Thr (NT) was distinguished in the Laminin family. Evaluation of the distribution of dipeptides identified by frequency/bias analysis subsequently revealed the highly restricted localization of the G(F/Y) and N(G/T) sequence elements at two separate sites of extreme conservation in the consensus sequence of all three sequence families. The similar employment of the high-frequency/bias dipeptides in three distinct protein sequence families was further correlated with the concurrence of these shared molecular determinants at similar positions within the distinctive scaffolds of three structurally divergent, but similarly employed, motif modules.
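The ordered-dipeptide counting described in this abstract can be sketched as follows. This is a minimal illustration, not a reconstruction of AAPAIR.TAB: the function names are hypothetical, and the bias measure used here (observed count divided by the count expected from independent residue frequencies) is an assumed definition, since the abstract does not state how bias was computed.

```python
from collections import Counter


def ordered_dipeptide_counts(sequences):
    """Count every ordered pair of adjacent residues across all sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - 1):
            counts[seq[i:i + 2]] += 1
    return counts


def dipeptide_bias(sequences):
    """Return observed/expected ratios for each dipeptide seen in the data.

    Assumed definition: the expected count is the product of the two
    residue frequencies multiplied by the total number of dipeptides.
    """
    counts = ordered_dipeptide_counts(sequences)
    residues = Counter("".join(seq.upper() for seq in sequences))
    total_residues = sum(residues.values())
    total_pairs = sum(counts.values())
    bias = {}
    for pair, observed in counts.items():
        freq_first = residues[pair[0]] / total_residues
        freq_second = residues[pair[1]] / total_residues
        bias[pair] = observed / (freq_first * freq_second * total_pairs)
    return bias
```

Because the pairs are ordered, `GY` and `YG` are tallied separately, which is what allows a preference for Gly-Tyr over Tyr-Gly to show up.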
Analysis of functional redundancies within the Arabidopsis TCP transcription factor family.
Danisman, Selahattin; van Dijk, Aalt D J; Bimbo, Andrea; van der Wal, Froukje; Hennig, Lars; de Folter, Stefan; Angenent, Gerco C; Immink, Richard G H
2013-12-01
Analyses of the functions of TEOSINTE-LIKE1, CYCLOIDEA, and PROLIFERATING CELL FACTOR1 (TCP) transcription factors have been hampered by functional redundancy between individual family members. In general, putative functionally redundant genes are predicted based on sequence similarity and confirmed by genetic analysis. In the TCP family, however, identification is impeded by relatively low overall sequence similarity. In a search for functionally redundant TCP pairs that control Arabidopsis leaf development, this work performed an integrative bioinformatics analysis, combining protein sequence similarities, gene expression data, and results of pair-wise protein-protein interaction studies for the 24 members of the Arabidopsis TCP transcription factor family. For this, the work completed any lacking gene expression and protein-protein interaction data experimentally and then performed a comprehensive prediction of potentially functionally redundant TCP pairs. Subsequently, redundant functions could be confirmed for selected predicted TCP pairs by genetic and molecular analyses. It is demonstrated that the previously uncharacterized class I TCP19 gene plays a role in the control of leaf senescence in a redundant fashion with TCP20. Altogether, this work shows the power of combining classical genetic and molecular approaches with bioinformatics predictions to unravel functional redundancies in the TCP transcription factor family.
2012-01-01
Background: GDSL esterases/lipases are a newly discovered subclass of lipolytic enzymes that are very important and attractive research subjects because of their multifunctional properties, such as broad substrate specificity and regiospecificity. Compared with the current knowledge regarding these enzymes in bacteria, our understanding of the plant GDSL enzymes is very limited, although the GDSL gene family in plant species includes numerous members in many fully sequenced plant genomes. Only two genes from a large rice GDSL esterase/lipase gene family were previously characterised, and the majority of the members remain unknown. In the present study, we describe the rice OsGELP (Oryza sativa GDSL esterase/lipase protein) gene family at the genomic and proteomic levels, and use this knowledge to provide insights into the multifunctionality of the rice OsGELP enzymes. Results: In this study, an extensive bioinformatics analysis identified 114 genes in the rice OsGELP gene family. A complete overview of this family in rice is presented, including the chromosome locations, gene structures, phylogeny, and protein motifs. Among the OsGELPs and the plant GDSL esterase/lipase proteins of known functions, 41 motifs were found that represent the core secondary structure elements or appear specifically in different phylogenetic subclades. The specification and distribution of identified putative conserved clade-common and -specific peptide motifs, and their location on the predicted protein three-dimensional structure, may signify their functional roles. Potentially important regions for substrate specificity are highlighted, in accordance with the protein three-dimensional model and the location of the phylogeny-specific conserved motifs. The differential expression of some representative genes was confirmed by quantitative real-time PCR. 
The phylogenetic analysis, together with protein motif architectures and the expression profiling, was used to predict the possible biological functions of the rice OsGELP genes. Conclusions: Our current genomic analysis, for the first time, presents fundamental information on the organization of the rice OsGELP gene family. With a combination of the genomic, phylogenetic, microarray expression, protein motif distribution, and protein structure analyses, we were able to create a supported basis for the functional prediction of many members in the rice GDSL esterase/lipase family. The present study provides a platform for the selection of candidate genes for further detailed functional study. PMID:22793791
Analysis of Cytoskeletal and Motility Proteins in the Sea Urchin Genome Assembly
Morris, RL; Hoffman, MP; Obar, RA; McCafferty, SS; Gibbons, IR; Leone, AD; Cool, J; Allgood, EL; Musante, AM; Judkins, KM; Rossetti, BJ; Rawson, AP; Burgess, DR
2007-01-01
The sea urchin embryo is a classical model system for studying the role of the cytoskeleton in such events as fertilization, mitosis, cleavage, cell migration and gastrulation. We have conducted an analysis of gene models derived from the Strongylocentrotus purpuratus genome assembly and have gathered strong evidence for the existence of multiple gene families encoding cytoskeletal proteins and their regulators in sea urchin. While many cytoskeletal genes have been cloned from sea urchin with sequences already existing in public databases, genome analysis reveals a significantly higher degree of diversity within certain gene families. Furthermore, genes are described corresponding to homologs of cytoskeletal proteins not previously documented in sea urchins. To illustrate the varying degree of sequence diversity that exists within cytoskeletal gene families, we conducted an analysis of genes encoding actins, specific actin-binding proteins, myosins, tubulins, kinesins, dyneins, specific microtubule-associated proteins, and intermediate filaments. We conducted ontological analysis of select genes to better understand the relatedness of urchin cytoskeletal genes to those of other deuterostomes. We analyzed developmental expression (EST) data to confirm the existence of select gene models and to understand their differential expression during various stages of early development. PMID:17027957
Molecular evolution of miraculin-like proteins in soybean Kunitz super-family.
Selvakumar, Purushotham; Gahloth, Deepankar; Tomar, Prabhat Pratap Singh; Sharma, Nidhi; Sharma, Ashwani Kumar
2011-12-01
Miraculin-like proteins (MLPs) belong to the soybean Kunitz super-family and have been characterized from many plant families, such as Rutaceae, Solanaceae, and Rubiaceae. Many of them possess trypsin inhibitory activity and are involved in plant defense. MLPs exhibit significant sequence identity (~30-95%) to native miraculin protein, which also belongs to the Kunitz super-family, compared with their identity to a typical Kunitz family member (~30%). The sequence and structure-function comparison of MLPs with that of a classical Kunitz inhibitor has demonstrated that MLPs have evolved to form a distinct group within the Kunitz super-family. Sequence analysis of new genes along with available MLP sequences in the literature revealed three major groups for these proteins. A significant feature of Rutaceae MLP type 2 sequences is the presence of a phosphorylation motif. Subtle changes are seen in putative reactive loop residues among different MLPs, suggesting altered specificities to specific proteases. In phylogenetic analysis, Rutaceae MLP type 1 and type 2 proteins clustered together on separate branches, whereas native miraculin along with other MLPs formed distinct clusters. Site-specific positive Darwinian selection was observed at many sites in both groups of Rutaceae MLP sequences, with most of the residues undergoing positive selection located in loop regions. The results demonstrate the sequence and thereby the structure-function divergence of MLPs as a distinct group within the soybean Kunitz super-family due to biotic and abiotic stresses of the local environment.
Gu, Qihui; Wu, Qingping; Zhang, Jumei; Guo, Weipeng; Wu, Huiqing; Sun, Ming
2017-07-07
Phenol is a hazardous chemical known to be widely distributed in aquatic environments. Biodegradation is an attractive option for removal of phenol from water sources. Acinetobacter sp. DW-1 isolated from drinking water biofilters can use phenol as a sole carbon and energy source. In this study, we found that immobilized Acinetobacter sp. DW-1 cells were effective in biodegradation of phenol. In addition, we performed proteome and transcriptome analysis of Acinetobacter sp. DW-1 during phenol biodegradation. The results showed that Acinetobacter sp. DW-1 degrades phenol mainly by the ortho pathway, as indicated by the induction of phenol hydroxylase and catechol-1,2-dioxygenase. Furthermore, some novel candidate proteins (OsmC-like family protein, MetA-pathway of phenol degradation family protein, fimbrial protein and coenzyme F390 synthetase) and transcriptional regulators (GntR/LuxR/CRP/FNR/TetR/Fis family transcriptional regulator) were successfully identified to be potentially involved in phenol biodegradation. In particular, MetA-pathway of phenol degradation family protein and fimbrial protein showed a strong positive correlation with phenol biodegradation, and Fis family transcriptional regulator is likely to exert its effect as an activator of gene expression. This study provides valuable clues for identifying global proteins and genes involved in phenol biodegradation and provides a fundamental platform for further studies to reveal the phenol degradation mechanism of Acinetobacter sp.
Practical analysis of specificity-determining residues in protein families.
Chagoyen, Mónica; García-Martín, Juan A; Pazos, Florencio
2016-03-01
Determining the residues that are important for the molecular activity of a protein is a topic of broad interest in biomedicine and biotechnology. This knowledge can help in understanding the protein's molecular mechanism as well as in fine-tuning its natural function, eventually with biotechnological or therapeutic implications. Some of the protein residues are essential for the function common to all members of a family of proteins, while others explain the particular specificities of certain subfamilies (like binding to different substrates or cofactors and distinct binding affinities). Owing to the difficulty in experimentally determining them, a number of computational methods were developed to detect these functional residues, generally known as 'specificity-determining positions' (or SDPs), from a collection of homologous protein sequences. These methods are mature enough to be routinely used by molecular biologists in directing experiments aimed at getting insight into the functional specificity of a family of proteins and eventually modifying it. In this review, we summarize some of the recent discoveries achieved through SDP computational identification in a number of relevant protein families, as well as the main approaches and software tools available to perform this type of analysis. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
Lentes, K U; Mathieu, E; Bischoff, R; Rasmussen, U B; Pavirani, A
1993-01-01
Current methods for comparative analyses of protein sequences are 1D-alignments of amino acid sequences based on the maximization of amino acid identity (homology) and the prediction of secondary structure elements. This method has a major drawback once the amino acid identity drops below 20-25%, since maximization of a homology score does not take into account any structural information. A new technique called Hydrophobic Cluster Analysis (HCA) has been developed by Lemesle-Varloot et al. (Biochimie 72, 555-574, 1990). This consists of comparing several sequences simultaneously and combining homology detection with secondary structure analysis. HCA is primarily based on the detection and comparison of structural segments constituting the hydrophobic core of globular protein domains, with or without transmembrane domains. We have applied HCA to the analysis of different families of G-protein coupled receptors, such as catecholamine receptors as well as peptide hormone receptors. Utilizing HCA, the thrombin receptor, a new and as yet unique member of the family of G-protein coupled receptors, can be clearly classified as being closely related to the family of neuropeptide receptors rather than to the catecholamine receptors, for which the shape of the hydrophobic clusters and the length of their third cytoplasmic loop are very different. Furthermore, the potential of HCA to predict relationships between new putative and already characterized members of this family of receptors will be presented.
A Protein Domain and Family Based Approach to Rare Variant Association Analysis.
Richardson, Tom G; Shihab, Hashem A; Rivas, Manuel A; McCarthy, Mark I; Campbell, Colin; Timpson, Nicholas J; Gaunt, Tom R
2016-01-01
It has become common practice to analyse large scale sequencing data with statistical approaches based around the aggregation of rare variants within the same gene. We applied a novel approach to rare variant analysis by collapsing variants together using protein domain and family coordinates, regarded as a more discrete definition of a biologically functional unit. Using Pfam definitions, we collapsed rare variants (Minor Allele Frequency ≤ 1%) together in three different ways: 1) variants within single genomic regions which map to individual protein domains; 2) variants within two individual protein domain regions which are predicted to be responsible for a protein-protein interaction; 3) all variants within combined regions from multiple genes responsible for coding the same protein domain (i.e. protein families). A conventional collapsing analysis using gene coordinates was also undertaken for comparison. We used UK10K sequence data and investigated associations between regions of variants and lipid traits using the sequence kernel association test (SKAT). We observed no strong evidence of association between regions of variants based on Pfam domain definitions and lipid traits. Quantile-Quantile plots illustrated that the overall distributions of p-values from the protein domain analyses were comparable to those of a conventional gene-based approach. Deviations from this distribution suggested that collapsing by either protein domain or gene definitions may be favourable depending on the trait analysed. We have collapsed rare variants together using protein domain and family coordinates to present an alternative approach over collapsing across conventionally used gene-based regions. Although no strong evidence of association was detected in these analyses, future studies may still find value in adopting these approaches to detect previously unidentified association signals.
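The collapsing step described in this abstract can be sketched in a few lines. The sketch below covers only the first and third grouping schemes (per individual domain region, and per domain family across genes); the record layout, field names, and the function `collapse_rare_variants` are hypothetical, and the association testing itself (SKAT in the study) is not shown.

```python
from collections import defaultdict

MAF_THRESHOLD = 0.01  # rare variants: minor allele frequency <= 1%, as in the abstract


def collapse_rare_variants(variants, domain_regions, by_family=False):
    """Group rare variant IDs by the protein domain region they fall in.

    variants:       dicts with hypothetical keys "id", "chrom", "pos", "maf"
    domain_regions: dicts with hypothetical keys "pfam_id", "gene",
                    "chrom", "start", "end" (inclusive genomic coordinates)
    by_family:      if True, pool all regions sharing a domain identifier,
                    regardless of gene (the protein-family scheme)
    """
    groups = defaultdict(list)
    for variant in variants:
        if variant["maf"] > MAF_THRESHOLD:
            continue  # common variant, excluded from collapsing
        for region in domain_regions:
            inside = (region["chrom"] == variant["chrom"]
                      and region["start"] <= variant["pos"] <= region["end"])
            if not inside:
                continue
            if by_family:
                key = region["pfam_id"]
            else:
                key = (region["pfam_id"], region["gene"])
            groups[key].append(variant["id"])
    return dict(groups)
```

Each resulting group would then be passed as one unit to a region-based association test, in place of the usual gene-based set.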
Co-evolution of SNF spliceosomal proteins with their RNA targets in trans-splicing nematodes.
Strange, Rex Meade; Russelburg, L Peyton; Delaney, Kimberly J
2016-08-01
Although the mechanism of pre-mRNA splicing has been well characterized, the evolution of spliceosomal proteins is poorly understood. The U1A/U2B″/SNF family (hereafter referred to as the SNF family) of RNA binding spliceosomal proteins participates in both the U1 and U2 small nuclear ribonucleoproteins (snRNPs). The highly constrained nature of this system has inhibited an analysis of co-evolutionary trends between the proteins and their RNA binding targets. Here we report accelerated sequence evolution in the SNF protein family in Phylum Nematoda, which has allowed an analysis of protein:RNA co-evolution. In a comparison of SNF genes from ecdysozoan species, we found a correlation between trans-splicing species (nematodes) and increased phylogenetic branch lengths of the SNF protein family, with respect to their sister clade Arthropoda. In particular, we found that nematodes (~70-80% of pre-mRNAs are trans-spliced) have experienced higher rates of SNF sequence evolution than arthropods (predominantly cis-spliced) at both the nucleotide and amino acid levels. Interestingly, this increased evolutionary rate correlates with the reliance on trans-splicing by nematodes, which would alter the role of the SNF family of spliceosomal proteins. We mapped amino acid substitutions to functionally important regions of the SNF protein, specifically to sites that are predicted to disrupt protein:RNA and protein:protein interactions. Finally, we investigated SNF's RNA targets: the U1 and U2 snRNAs. Both are more divergent in nematodes than in arthropods, suggesting the RNAs have co-evolved with SNF in order to maintain the necessarily high affinity interaction that has been characterized in other species.
The Sorcerer II Global Ocean Sampling expedition: expanding the universe of protein families.
Yooseph, Shibu; Sutton, Granger; Rusch, Douglas B; Halpern, Aaron L; Williamson, Shannon J; Remington, Karin; Eisen, Jonathan A; Heidelberg, Karla B; Manning, Gerard; Li, Weizhong; Jaroszewski, Lukasz; Cieplak, Piotr; Miller, Christopher S; Li, Huiying; Mashiyama, Susan T; Joachimiak, Marcin P; van Belle, Christopher; Chandonia, John-Marc; Soergel, David A; Zhai, Yufeng; Natarajan, Kannan; Lee, Shaun; Raphael, Benjamin J; Bafna, Vineet; Friedman, Robert; Brenner, Steven E; Godzik, Adam; Eisenberg, David; Dixon, Jack E; Taylor, Susan S; Strausberg, Robert L; Frazier, Marvin; Venter, J Craig
2007-03-01
Metagenomics projects based on shotgun sequencing of populations of micro-organisms yield insight into protein families. We used sequence similarity clustering to explore proteins with a comprehensive dataset consisting of sequences from available databases together with 6.12 million proteins predicted from an assembly of 7.7 million Global Ocean Sampling (GOS) sequences. The GOS dataset covers nearly all known prokaryotic protein families. A total of 3,995 medium- and large-sized clusters consisting of only GOS sequences are identified, out of which 1,700 have no detectable homology to known families. The GOS-only clusters contain a higher than expected proportion of sequences of viral origin, thus reflecting a poor sampling of viral diversity until now. Protein domain distributions in the GOS dataset and current protein databases show distinct biases. Several protein domains that were previously categorized as kingdom specific are shown to have GOS examples in other kingdoms. About 6,000 sequences (ORFans) from the literature that heretofore lacked similarity to known proteins have matches in the GOS data. The GOS dataset is also used to improve remote homology detection. Overall, besides nearly doubling the number of current proteins, the predicted GOS proteins also add a great deal of diversity to known protein families and shed light on their evolution. These observations are illustrated using several protein families, including phosphatases, proteases, ultraviolet-irradiation DNA damage repair enzymes, glutamine synthetase, and RuBisCO. The diversity added by GOS data has implications for choosing targets for experimental structure characterization as part of structural genomics efforts. Our analysis indicates that new families are being discovered at a rate that is linear or almost linear with the addition of new sequences, implying that we are still far from discovering all protein families in nature.
Vyas, Sejal; Chesarone-Cataldo, Melissa; Todorova, Tanya; Huang, Yun-Han; Chang, Paul
2013-01-01
The poly(ADP-ribose) polymerase (PARP) family of proteins use NAD+ as their substrate to modify acceptor proteins with adenosine diphosphate-ribose (ADPr) modifications. The function of most PARPs under physiological conditions is unknown. Here, to better understand this protein family, we systematically analyze the cell cycle localization of each PARP and of poly(ADP-ribose), a product of PARP activity, then identify the knock-down phenotype of each protein and perform secondary assays to elucidate function. We show that most PARPs are cytoplasmic, identify cell cycle differences in the ratio of nuclear to cytoplasmic poly(ADP-ribose), and identify four phenotypic classes of PARP function. These include the regulation of membrane structures, cell viability, cell division, and the actin cytoskeleton. Further analysis of PARP14 shows that it is a component of focal adhesion complexes required for proper cell motility and focal adhesion function. In total, we show that PARP proteins are critical regulators of eukaryotic physiology. PMID:23917125
Exploring the Common Dynamics of Homologous Proteins. Application to the Globin Family
Maguid, Sandra; Fernandez-Alberti, Sebastian; Ferrelli, Leticia; Echave, Julian
2005-01-01
We present a procedure to explore the global dynamics shared between members of the same protein family. The method allows the comparison of patterns of vibrational motion obtained by Gaussian network model analysis. After the identification of collective coordinates that were conserved during evolution, we quantify the common dynamics within a family. Representative vectors that describe these dynamics are defined using a singular value decomposition approach. As a test case, the globin heme-binding family is considered. The two lowest normal modes are shown to be conserved within this family. Our results encourage the development of models for protein evolution that take into account the conservation of dynamical features. PMID:15749782
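As a rough illustration of where Gaussian network model analysis begins, the sketch below builds the Kirchhoff (connectivity) matrix for a toy chain of residue positions. The function name `kirchhoff_matrix`, the 7.0 Å contact cutoff and the toy coordinates are assumptions chosen for illustration and are not taken from the abstract. The later steps the abstract describes (normal modes and the singular value decomposition across family members) are not shown.

```python
import math

def kirchhoff_matrix(coords, cutoff=7.0):
    """Build a GNM-style connectivity matrix from 3D coordinates.

    Off-diagonal entries are -1 for pairs closer than `cutoff`
    (an assumed value), 0 otherwise; each diagonal entry is the
    number of contacts of that residue.
    """
    n = len(coords)
    gamma = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1
                gamma[i][i] += 1
                gamma[j][j] += 1
    return gamma

# Hypothetical toy chain: four positions spaced 3.8 units apart on a line.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
gamma = kirchhoff_matrix(toy_coords)
for row in gamma:
    print(row)
```

In a full analysis the slow modes would come from an eigendecomposition of this matrix; every row sums to zero by construction, which is a quick sanity check on the contact map.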
Wu, Wentao; Liu, Yaxue; Wang, Yuqian; Li, Huimin; Liu, Jiaxi; Tan, Jiaxin; He, Jiadai; Bai, Jingwen; Ma, Haoli
2017-10-08
The plant hormone auxin plays pivotal roles in many aspects of plant growth and development. The auxin/indole-3-acetic acid (Aux/IAA) gene family encodes short-lived nuclear proteins acting on auxin perception and signaling, but the evolutionary history of this gene family remains to be elucidated. In this study, the Aux/IAA gene family in 17 plant species covering all major lineages of plants is identified and analyzed by using multiple bioinformatics methods. A total of 434 Aux/IAA genes were found among these plant species, and the gene copy number ranges from three (Physcomitrella patens) to 63 (Glycine max). The phylogenetic analysis shows that the canonical Aux/IAA proteins can be generally divided into five major clades, and the origin of Aux/IAA proteins could be traced back to the common ancestor of land plants and green algae. Many truncated Aux/IAA proteins were found, and some of these truncated Aux/IAA proteins may be generated from the C-terminal truncation of auxin response factor (ARF) proteins. Our results indicate that tandem and segmental duplications play dominant roles in the expansion of the Aux/IAA gene family, mainly under purifying selection. The putative nuclear localization signals (NLSs) in Aux/IAA proteins are conserved, and two kinds of new primordial bipartite NLSs in P. patens and Selaginella moellendorffii were discovered. Our findings not only give insights into the origin and expansion of the Aux/IAA gene family, but also provide a basis for understanding their functions during the course of evolution.
Wu, Wentao; Liu, Yaxue; Wang, Yuqian; Li, Huimin; Liu, Jiaxi; Tan, Jiaxin; He, Jiadai; Bai, Jingwen
2017-01-01
The plant hormone auxin plays pivotal roles in many aspects of plant growth and development. The auxin/indole-3-acetic acid (Aux/IAA) gene family encodes short-lived nuclear proteins acting on auxin perception and signaling, but the evolutionary history of this gene family remains to be elucidated. In this study, the Aux/IAA gene family in 17 plant species covering all major lineages of plants is identified and analyzed by using multiple bioinformatics methods. A total of 434 Aux/IAA genes were found among these plant species, and the gene copy number ranges from three (Physcomitrella patens) to 63 (Glycine max). The phylogenetic analysis shows that the canonical Aux/IAA proteins can be generally divided into five major clades, and the origin of Aux/IAA proteins could be traced back to the common ancestor of land plants and green algae. Many truncated Aux/IAA proteins were found, and some of these truncated Aux/IAA proteins may be generated from the C-terminal truncation of auxin response factor (ARF) proteins. Our results indicate that tandem and segmental duplications play dominant roles in the expansion of the Aux/IAA gene family, mainly under purifying selection. The putative nuclear localization signals (NLSs) in Aux/IAA proteins are conserved, and two kinds of new primordial bipartite NLSs in P. patens and Selaginella moellendorffii were discovered. Our findings not only give insights into the origin and expansion of the Aux/IAA gene family, but also provide a basis for understanding their functions during the course of evolution. PMID:28991190
Protein sectors: evolutionary units of three-dimensional structure
Halabi, Najeeb; Rivoire, Olivier; Leibler, Stanislas; Ranganathan, Rama
2011-01-01
Proteins display a hierarchy of structural features at primary, secondary, tertiary, and higher-order levels, an organization that guides our current understanding of their biological properties and evolutionary origins. Here, we reveal a structural organization distinct from this traditional hierarchy by statistical analysis of correlated evolution between amino acids. Applied to the S1A serine proteases, the analysis indicates a decomposition of the protein into three quasi-independent groups of correlated amino acids that we term “protein sectors”. Each sector is physically connected in the tertiary structure, has a distinct functional role, and constitutes an independent mode of sequence divergence in the protein family. Functionally relevant sectors are evident in other protein families as well, suggesting that they may be general features of proteins. We propose that sectors represent a structural organization of proteins that reflects their evolutionary histories. PMID:19703402
NASA Technical Reports Server (NTRS)
Gruppi, C. M.; Wolgemuth, D. J.
1993-01-01
This study extends to the protein level our previous observations, which had established the stage and cellular specificity of expression of hsp86 and hsp84 in the murine testis in the absence of exogenous stress. Immunoblot analysis was used to demonstrate that HSP86 protein was present throughout testicular development and that its levels increased with the appearance of differentiating germ cells. HSP86 was most abundant in the germ cell population and was present at significantly lower levels in the somatic cells. By contrast, the HSP84 protein was detected in the somatic cells of the testis rather than in germ cells. The steady-state levels of HSP86 and HSP84 paralleled the pattern of the expression of their respective mRNAs, suggesting that regulation at the level of translation was not a major mechanism controlling hsp90 gene expression in testicular cells. Immunoprecipitation analysis revealed that a 70-kDa protein coprecipitated with the HSP86/HSP84 proteins in testicular homogenates. This protein was identified as an HSP70 family member by immunoblot analysis, suggesting that HSP70 and HSP90 family members interact in testicular cells.
Liu, Ake; Wang, Yong; Zhang, Debao; Wang, Xuhua; Song, Huifang; Dang, Chunwang; Yao, Qin; Chen, Keping
2013-08-01
Basic helix-loop-helix (bHLH) proteins play essential regulatory roles in a variety of biological processes. These highly conserved proteins form a large transcription factor superfamily, and are commonly identified in large numbers within animal, plant, and fungal genomes. The bHLH domain has been well studied in many animal species, but has not yet been characterized in non-avian reptiles. In this study, we identified 102 putative bHLH genes in the genome of the green anole lizard, Anolis carolinensis. Based on phylogenetic analysis, these genes were classified into 43 families, with 43, 24, 16, 3, 10, and 3 members assigned to groups A, B, C, D, E, and F, respectively, and 3 members categorized as "orphans". Within-group evolutionary relationships inferred from the phylogenetic analysis were consistent with highly conserved patterns observed for introns and additional domains. Results from phylogenetic analysis of the H/E(spl) family suggest that genome and tandem gene duplications have contributed to this family's expansion. Our classification and evolutionary analysis provide insights into the evolutionary diversification of animal bHLH genes, and should aid future studies on bHLH protein regulation of key growth and developmental processes.
CHALMERS, IAIN W.; HOFFMANN, KARL F.
2012-01-01
During platyhelminth infection, a cocktail of proteins is released by the parasite to aid invasion, initiate feeding, facilitate adaptation and mediate modulation of the host immune response. Included amongst these proteins is the Venom Allergen-Like (VAL) family, part of the larger sperm coating protein/Tpx-1/Ag5/PR-1/Sc7 (SCP/TAPS) superfamily. To explore the significance of this protein family during Platyhelminthes development and host interactions, we systematically summarize all published proteomic, genomic and immunological investigations of the VAL protein family to date. By conducting new genomic and transcriptomic interrogations to identify over 200 VAL proteins (228) from species in all 4 traditional taxonomic classes (Trematoda, Cestoda, Monogenea and Turbellaria), we further expand our knowledge related to platyhelminth VAL diversity across the phylum. Subsequent phylogenetic and tertiary structural analyses reveal several class-specific VAL features, which likely indicate a range of roles mediated by this protein family. Our comprehensive analysis of platyhelminth VALs represents a unifying synopsis for understanding diversity within this protein family and a firm context in which to initiate future functional characterization of these enigmatic members. PMID:22717097
Integrating protein structural dynamics and evolutionary analysis with Bio3D.
Skjærven, Lars; Yao, Xin-Qiu; Scarabelli, Guido; Grant, Barry J
2014-12-10
Popular bioinformatics approaches for studying protein functional dynamics include comparisons of crystallographic structures, molecular dynamics simulations and normal mode analysis. However, determining how observed displacements and predicted motions from these traditionally separate analyses relate to each other, as well as to the evolution of sequence, structure and function within large protein families, remains a considerable challenge. This is in part due to the general lack of tools that integrate information of molecular structure, dynamics and evolution. Here, we describe the integration of new methodologies for evolutionary sequence, structure and simulation analysis into the Bio3D package. This major update includes unique high-throughput normal mode analysis for examining and contrasting the dynamics of related proteins with non-identical sequences and structures, as well as new methods for quantifying dynamical couplings and their residue-wise dissection from correlation network analysis. These new methodologies are integrated with major biomolecular databases as well as established methods for evolutionary sequence and comparative structural analysis. New functionality for directly comparing results derived from normal modes, molecular dynamics and principal component analysis of heterogeneous experimental structure distributions is also included. We demonstrate these integrated capabilities with example applications to dihydrofolate reductase and heterotrimeric G-protein families along with a discussion of the mechanistic insight provided in each case. The integration of structural dynamics and evolutionary analysis in Bio3D enables researchers to go beyond a prediction of single protein dynamics to investigate dynamical features across large protein families. The Bio3D package is distributed with full source code and extensive documentation as a platform independent R package under a GPL2 license from http://thegrantlab.org/bio3d/ .
Family-specific scaling laws in bacterial genomes.
De Lazzari, Eleonora; Grilli, Jacopo; Maslov, Sergei; Cosentino Lagomarsino, Marco
2017-07-27
Among several quantitative invariants found in evolutionary genomics, one of the most striking is the scaling of the overall abundance of proteins, or protein domains, sharing a specific functional annotation across genomes of given size. The sizes of these functional categories change, on average, as power laws in the total number of protein-coding genes. Here, we show that such regularities are not restricted to the overall behavior of high-level functional categories, but also exist systematically at the level of single evolutionary families of protein domains. Specifically, the number of proteins within each family follows family-specific scaling laws with genome size. Functionally similar sets of families tend to follow similar scaling laws, but this is not always the case. To understand this systematically, we provide a comprehensive classification of families based on their scaling properties. Additionally, we develop a quantitative score for the heterogeneity of the scaling of families belonging to a given category or predefined group. Under the common reasonable assumption that selection is driven solely or mainly by biological function, these findings point to fine-tuned and interdependent functional roles of specific protein domains, beyond our current functional annotations. This analysis provides a deeper view on the links between evolutionary expansion of protein families and the functional constraints shaping the gene repertoire of bacterial genomes. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
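The power-law scaling described in this abstract can be illustrated with a minimal sketch: if a family's copy number n grows with genome size g as n = a·g^b, then log n is linear in log g and the exponent b is the slope of a least-squares line. The function name `fit_power_law` and the synthetic data are assumptions made for illustration. This is an ordinary log-log regression, not necessarily the estimation procedure the authors used.

```python
import math

def fit_power_law(genome_sizes, family_counts):
    """Fit count = prefactor * size**exponent by least squares in log-log space."""
    xs = [math.log(g) for g in genome_sizes]
    ys = [math.log(c) for c in family_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    exponent = sxy / sxx                       # slope of the log-log line
    prefactor = math.exp(mean_y - exponent * mean_x)
    return prefactor, exponent

# Synthetic, noise-free example (hypothetical numbers): counts scale
# quadratically with the number of protein-coding genes.
sizes = [1000, 2000, 4000, 8000]
counts = [0.001 * s ** 2 for s in sizes]
prefactor, exponent = fit_power_law(sizes, counts)
print(round(prefactor, 6), round(exponent, 6))
```

Fitting each family separately in this way yields one exponent per family, which is the kind of quantity a classification by scaling properties would compare.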
Gruber, David F; Gaffney, Jean P; Mehr, Shaadi; DeSalle, Rob; Sparks, John S; Platisa, Jelena; Pieribone, Vincent A
2015-01-01
We report the identification and characterization of two new members of a family of bilirubin-inducible fluorescent proteins (FPs) from marine chlopsid eels and demonstrate a key region of the sequence that serves as an evolutionary switch from non-fluorescent to fluorescent fatty acid-binding proteins (FABPs). Using transcriptomic analysis of two species of brightly fluorescent Kaupichthys eels (Kaupichthys hyoproroides and Kaupichthys n. sp.), two new FPs were identified, cloned and characterized (Chlopsid FP I and Chlopsid FP II). We then performed phylogenetic analysis on 210 FABPs, spanning 16 vertebrate orders, and including 163 vertebrate taxa. We show that the fluorescent FPs diverged as a protein family and are the sister group to brain FABPs. Our results indicate that the evolution of this family involved at least three gene duplication events. We show that fluorescent FABPs possess a unique, conserved tripeptide Gly-Pro-Pro sequence motif, which is not found in non-fluorescent fatty acid binding proteins. This motif arose from a duplication event of the FABP brain isoforms and was under strong purifying selection, leading to the classification of this new FP family. Residues adjacent to the motif are under strong positive selection, suggesting a further refinement of the eel protein's fluorescent properties. We present a phylogenetic reconstruction of this emerging FP family and describe additional fluorescent FABP members from groups of distantly related eels. The elucidation of this class of fish FPs with diverse properties provides new templates for the development of protein-based fluorescent tools. The evolutionary adaptation from fatty acid-binding proteins to fluorescent fatty acid-binding proteins raises intrigue as to the functional role of bright green fluorescence in this cryptic genus of reclusive eels that inhabit a blue, nearly monochromatic, marine environment.
NovelFam3000 – Uncharacterized human protein domains conserved across model organisms
Kemmer, Danielle; Podowski, Raf M; Arenillas, David; Lim, Jonathan; Hodges, Emily; Roth, Peggy; Sonnhammer, Erik LL; Höög, Christer; Wasserman, Wyeth W
2006-01-01
Background: Despite significant efforts from the research community, an extensive portion of the proteins encoded by human genes lack an assigned cellular function. Most metazoan proteins are composed of structural and/or functional domains, of which many appear in multiple proteins. Once a domain is characterized in one protein, the presence of a similar sequence in an uncharacterized protein serves as a basis for inference of function. Thus knowledge of a domain's function, or the protein within which it arises, can facilitate the analysis of an entire set of proteins. Description: From the Pfam domain database, we extracted uncharacterized protein domains represented in proteins from humans, worms, and flies. A data centre was created to facilitate the analysis of the uncharacterized domain-containing proteins. The centre both provides researchers with links to dispersed internet resources containing gene-specific experimental data and enables them to post relevant experimental results or comments. For each human gene in the system, a characterization score is posted, allowing users to track the progress of characterization over time or to identify for study uncharacterized domains in well-characterized genes. As a test of the system, a subset of 39 domains was selected for analysis and the experimental results posted to the NovelFam3000 system. For 25 human protein members of these 39 domain families, detailed sub-cellular localizations were determined. Specific observations are presented based on the analysis of the integrated information provided through the online NovelFam3000 system. Conclusion: Consistent experimental results between multiple members of a domain family allow for inferences of the domain's functional role. We unite bioinformatics resources and experimental data in order to accelerate the functional characterization of scarcely annotated domain families. PMID:16533400
Guo, Min; Yang, Ruifu; Huang, Chen; Liao, Qiwen; Fan, Guangyi; Sun, Chenghang; Lee, Simon Ming-Yuen
2017-04-04
The nuclear envelope is considered a key classification marker that distinguishes prokaryotes from eukaryotes. However, this marker does not apply to the family Planctomycetaceae, which has intracellular spaces divided by lipidic intracytoplasmic membranes (ICMs). The nuclear localization signal (NLS), a short stretch of amino acid sequence, directs the transport of proteins from the cytoplasm into the nucleus, and is also associated with the development of the nuclear envelope. We investigated the NLS motifs in Planctomycetaceae genomes to demonstrate the potential molecular transition in the development of the intracellular membrane system. In this study, we identified NLS-like motifs that have the same amino acid compositions as experimentally identified NLSs in genomes of 11 representative species of the family Planctomycetaceae. A total of 15 NLS types and 170 NLS-bearing proteins were detected in the 11 strains. To determine the molecular transformation, we compared NLS-bearing protein abundances in the 11 representative Planctomycetaceae genomes with those in genomes of 16 taxonomically varied microorganisms: nine bacteria, two archaea and five fungi. In the 27 strains, 29 NLS types and 1101 NLS-bearing proteins were identified, and principal component analysis showed a significant transitional gradient from bacteria to Planctomycetaceae to fungi in their NLS-bearing protein abundance profiles. Then, we clustered the 993 non-redundant NLS-bearing proteins into 181 families and annotated the metabolic pathways in which they are involved. Afterwards, we aligned the ten types of NLS motifs from the 13 families containing NLS-bearing proteins among bacteria, Planctomycetaceae or fungi, considering their diversity, length and origin. A transition towards increased complexity from non-planctomycete bacteria to Planctomycetaceae to archaea and fungi was detected based on the complexity of the 10 types of NLS-like motifs in the 13 NLS-bearing protein families. The results of this study reveal that Planctomycetaceae separates slightly from the non-planctomycete bacteria but still differs substantially from fungi, based on the NLS-like motifs and NLS-bearing protein analysis.
Genome analysis of the platypus reveals unique signatures of evolution.
Warren, Wesley C; Hillier, LaDeana W; Marshall Graves, Jennifer A; Birney, Ewan; Ponting, Chris P; Grützner, Frank; Belov, Katherine; Miller, Webb; Clarke, Laura; Chinwalla, Asif T; Yang, Shiaw-Pyng; Heger, Andreas; Locke, Devin P; Miethke, Pat; Waters, Paul D; Veyrunes, Frédéric; Fulton, Lucinda; Fulton, Bob; Graves, Tina; Wallis, John; Puente, Xose S; López-Otín, Carlos; Ordóñez, Gonzalo R; Eichler, Evan E; Chen, Lin; Cheng, Ze; Deakin, Janine E; Alsop, Amber; Thompson, Katherine; Kirby, Patrick; Papenfuss, Anthony T; Wakefield, Matthew J; Olender, Tsviya; Lancet, Doron; Huttley, Gavin A; Smit, Arian F A; Pask, Andrew; Temple-Smith, Peter; Batzer, Mark A; Walker, Jerilyn A; Konkel, Miriam K; Harris, Robert S; Whittington, Camilla M; Wong, Emily S W; Gemmell, Neil J; Buschiazzo, Emmanuel; Vargas Jentzsch, Iris M; Merkel, Angelika; Schmitz, Juergen; Zemann, Anja; Churakov, Gennady; Kriegs, Jan Ole; Brosius, Juergen; Murchison, Elizabeth P; Sachidanandam, Ravi; Smith, Carly; Hannon, Gregory J; Tsend-Ayush, Enkhjargal; McMillan, Daniel; Attenborough, Rosalind; Rens, Willem; Ferguson-Smith, Malcolm; Lefèvre, Christophe M; Sharp, Julie A; Nicholas, Kevin R; Ray, David A; Kube, Michael; Reinhardt, Richard; Pringle, Thomas H; Taylor, James; Jones, Russell C; Nixon, Brett; Dacheux, Jean-Louis; Niwa, Hitoshi; Sekita, Yoko; Huang, Xiaoqiu; Stark, Alexander; Kheradpour, Pouya; Kellis, Manolis; Flicek, Paul; Chen, Yuan; Webber, Caleb; Hardison, Ross; Nelson, Joanne; Hallsworth-Pepin, Kym; Delehaunty, Kim; Markovic, Chris; Minx, Pat; Feng, Yucheng; Kremitzki, Colin; Mitreva, Makedonka; Glasscock, Jarret; Wylie, Todd; Wohldmann, Patricia; Thiru, Prathapan; Nhan, Michael N; Pohl, Craig S; Smith, Scott M; Hou, Shunfeng; Nefedov, Mikhail; de Jong, Pieter J; Renfree, Marilyn B; Mardis, Elaine R; Wilson, Richard K
2008-05-08
We present a draft genome sequence of the platypus, Ornithorhynchus anatinus. This monotreme exhibits a fascinating combination of reptilian and mammalian characters. For example, platypuses have a coat of fur adapted to an aquatic lifestyle; platypus females lactate, yet lay eggs; and males are equipped with venom similar to that of reptiles. Analysis of the first monotreme genome aligned these features with genetic innovations. We find that reptile and platypus venom proteins have been co-opted independently from the same gene families; milk protein genes are conserved despite platypuses laying eggs; and immune gene family expansions are directly related to platypus biology. Expansions of protein, non-protein-coding RNA and microRNA families, as well as repeat elements, are identified. Sequencing of this genome now provides a valuable resource for deep mammalian comparative analyses, as well as for monotreme biology and conservation.
Genome analysis of the platypus reveals unique signatures of evolution
Warren, Wesley C.; Hillier, LaDeana W.; Marshall Graves, Jennifer A.; Birney, Ewan; Ponting, Chris P.; Grützner, Frank; Belov, Katherine; Miller, Webb; Clarke, Laura; Chinwalla, Asif T.; Yang, Shiaw-Pyng; Heger, Andreas; Locke, Devin P.; Miethke, Pat; Waters, Paul D.; Veyrunes, Frédéric; Fulton, Lucinda; Fulton, Bob; Graves, Tina; Wallis, John; Puente, Xose S.; López-Otín, Carlos; Ordóñez, Gonzalo R.; Eichler, Evan E.; Chen, Lin; Cheng, Ze; Deakin, Janine E.; Alsop, Amber; Thompson, Katherine; Kirby, Patrick; Papenfuss, Anthony T.; Wakefield, Matthew J.; Olender, Tsviya; Lancet, Doron; Huttley, Gavin A.; Smit, Arian F. A.; Pask, Andrew; Temple-Smith, Peter; Batzer, Mark A.; Walker, Jerilyn A.; Konkel, Miriam K.; Harris, Robert S.; Whittington, Camilla M.; Wong, Emily S. W.; Gemmell, Neil J.; Buschiazzo, Emmanuel; Vargas Jentzsch, Iris M.; Merkel, Angelika; Schmitz, Juergen; Zemann, Anja; Churakov, Gennady; Kriegs, Jan Ole; Brosius, Juergen; Murchison, Elizabeth P.; Sachidanandam, Ravi; Smith, Carly; Hannon, Gregory J.; Tsend-Ayush, Enkhjargal; McMillan, Daniel; Attenborough, Rosalind; Rens, Willem; Ferguson-Smith, Malcolm; Lefèvre, Christophe M.; Sharp, Julie A.; Nicholas, Kevin R.; Ray, David A.; Kube, Michael; Reinhardt, Richard; Pringle, Thomas H.; Taylor, James; Jones, Russell C.; Nixon, Brett; Dacheux, Jean-Louis; Niwa, Hitoshi; Sekita, Yoko; Huang, Xiaoqiu; Stark, Alexander; Kheradpour, Pouya; Kellis, Manolis; Flicek, Paul; Chen, Yuan; Webber, Caleb; Hardison, Ross; Nelson, Joanne; Hallsworth-Pepin, Kym; Delehaunty, Kim; Markovic, Chris; Minx, Pat; Feng, Yucheng; Kremitzki, Colin; Mitreva, Makedonka; Glasscock, Jarret; Wylie, Todd; Wohldmann, Patricia; Thiru, Prathapan; Nhan, Michael N.; Pohl, Craig S.; Smith, Scott M.; Hou, Shunfeng; Renfree, Marilyn B.; Mardis, Elaine R.; Wilson, Richard K.
2009-01-01
We present a draft genome sequence of the platypus, Ornithorhynchus anatinus. This monotreme exhibits a fascinating combination of reptilian and mammalian characters. For example, platypuses have a coat of fur adapted to an aquatic lifestyle; platypus females lactate, yet lay eggs; and males are equipped with venom similar to that of reptiles. Analysis of the first monotreme genome aligned these features with genetic innovations. We find that reptile and platypus venom proteins have been co-opted independently from the same gene families; milk protein genes are conserved despite platypuses laying eggs; and immune gene family expansions are directly related to platypus biology. Expansions of protein, non-protein-coding RNA and microRNA families, as well as repeat elements, are identified. Sequencing of this genome now provides a valuable resource for deep mammalian comparative analyses, as well as for monotreme biology and conservation. PMID:18464734
A genomewide survey of basic helix–loop–helix factors in Drosophila
Moore, Adrian W.; Barbel, Sandra; Jan, Lily Yeh; Jan, Yuh Nung
2000-01-01
The basic helix–loop–helix (bHLH) transcription factors play important roles in the specification of tissue type during the development of animals. We have used the information contained in the recently published genomic sequence of Drosophila melanogaster to identify 12 additional bHLH proteins. By sequence analysis we have assigned these proteins to families defined by Atonal, Hairy-Enhancer of Split, Hand, p48, Mesp, MYC/USF, and the bHLH-Per, Arnt, Sim (PAS) domain. In addition, one single protein represents a unique family of bHLH proteins. mRNA in situ analysis demonstrates that the genes encoding these proteins are expressed in several tissue types but are particularly concentrated in the developing nervous system and mesoderm. PMID:10973473
Genome-wide identification and characterisation of F-box family in maize.
Jia, Fengjuan; Wu, Bingjiang; Li, Hui; Huang, Jinguang; Zheng, Chengchao
2013-11-01
F-box-containing proteins, as the key components of the protein degradation machinery, are widely distributed in higher plants and are considered one of the largest known families of regulatory proteins. The F-box protein family plays a crucial role in plant growth and development and in response to biotic and abiotic stresses. However, systematic analysis of the F-box family in maize (Zea mays) has not been reported yet. In this paper, we identified and characterised the maize F-box genes on a genome-wide scale, including phylogenetic analysis, chromosome distribution, gene structure, promoter analysis and gene expression profiles. A total of 359 F-box genes were identified and divided into 15 subgroups by phylogenetic analysis. The F-box domain was relatively conserved, whereas additional motifs outside the F-box domain may indicate the functional diversification of maize F-box genes. These genes were unevenly distributed across the ten maize chromosomes, suggesting that they expanded in the maize genome because of tandem and segmental duplication events. The expression profiles suggested that the maize F-box genes had temporal and spatial expression patterns. Putative cis-acting regulatory DNA elements involved in abiotic stresses were observed in maize F-box gene promoters. The gene expression profiles under abiotic stresses also suggested that some genes participated in stress-responsive pathways. Furthermore, ten genes were chosen for quantitative real-time PCR analysis under drought stress and the results were consistent with the microarray data. This study has produced a comparative genomics analysis of the maize ZmFBX gene family that can be used in further studies to uncover their roles in maize growth and development.
Yu, Panpan; Agbaegbu, Chinyere; Malide, Daniela A.; Wu, Xufeng; Katagiri, Yasuhiro; Hammer, John A.; Geller, Herbert M.
2015-01-01
The lipid phosphate phosphatase-related proteins (LPPRs), also known as plasticity-related genes (PRGs), are classified as a new brain-enriched subclass of the lipid phosphate phosphatase (LPP) superfamily. They induce membrane protrusions, neurite outgrowth or dendritic spine formation in cell lines and primary neurons. However, the exact roles of LPPRs and the mechanisms underlying their effects are not certain. Here, we present the results of a large-scale proteome analysis to determine LPPR1-interacting proteins using co-immunoprecipitation coupled to mass spectrometry. We identified putative LPPR1-binding proteins involved in various biological processes. Most interestingly, we identified the interaction of LPPR1 with its family members LPPR3, LPPR4 and LPPR5. Their interactions were characterized by co-immunoprecipitation and colocalization analysis using confocal and super-resolution microscopy. Moreover, co-expressing two LPPR members mutually elevated their protein levels, facilitated their plasma membrane localization and resulted in an increased induction of membrane protrusions as well as the phosphorylation of S6 ribosomal protein. Taken together, we revealed a new functional cooperation between LPPR family members and discovered for the first time that LPPRs likely exert their function by forming complexes with other family members. PMID:26183180
Comparative Study of Lectin Domains in Model Species: New Insights into Evolutionary Dynamics
Van Holle, Sofie; De Schutter, Kristof; Eggermont, Lore; Tsaneva, Mariya; Dang, Liuyi; Van Damme, Els J. M.
2017-01-01
Lectins are present throughout the plant kingdom and are reported to be involved in diverse biological processes. In this study, we provide a comparative analysis of the lectin families from model species in a phylogenetic framework. The analysis focuses on the different plant lectin domains identified in five representative core angiosperm genomes (Arabidopsis thaliana, Glycine max, Cucumis sativus, Oryza sativa ssp. japonica and Oryza sativa ssp. indica). The genomes were screened for genes encoding lectin domains using a combination of Basic Local Alignment Search Tool (BLAST), hidden Markov models, and InterProScan analysis. Additionally, phylogenetic relationships were investigated by constructing maximum likelihood phylogenetic trees. The results demonstrate that the majority of the lectin families are present in each of the species under study. Domain organization analysis showed that most identified proteins are multi-domain proteins, owing to the modular rearrangement of protein domains during evolution. Most of these multi-domain proteins are widespread, while others display a lineage-specific distribution. Furthermore, the phylogenetic analyses reveal that some lectin families evolved to be similar to the phylogeny of the plant species, while others share a closer evolutionary history based on the corresponding protein domain architecture. Our results yield insights into the evolutionary relationships and functional divergence of plant lectins. PMID:28587095
Zhu, Changjun; Zhao, Jian; Bibikova, Marina; Leverson, Joel D.; Bossy-Wetzel, Ella; Fan, Jian-Bing; Abraham, Robert T.; Jiang, Wei
2005-01-01
Microtubule (MT)-based motor proteins, kinesins and dyneins, play important roles in multiple cellular processes including cell division. In this study, we describe the generation and use of an Escherichia coli RNase III-prepared human kinesin/dynein esiRNA library to systematically analyze the functions of all human kinesin/dynein MT motor proteins. Our results indicate that at least 12 kinesins are involved in mitosis and cytokinesis. Eg5 (a member of the kinesin-5 family), Kif2A (a member of the kinesin-13 family), and KifC1 (a member of the kinesin-14 family) are crucial for spindle formation; KifC1, MCAK (a member of the kinesin-13 family), CENP-E (a member of the kinesin-7 family), Kif14 (a member of the kinesin-3 family), Kif18 (a member of the kinesin-8 family), and Kid (a member of the kinesin-10 family) are required for chromosome congression and alignment; Kif4A and Kif4B (members of the kinesin-4 family) have roles in anaphase spindle dynamics; and Kif4A, Kif4B, MKLP1, and MKLP2 (members of the kinesin-6 family) are essential for cytokinesis. Using immunofluorescence analysis, time-lapse microscopy, and rescue experiments, we investigate the roles of these 12 kinesins in detail. PMID:15843429
He, Yan; Luo, Majing; Yi, Minhan; Sheng, Yue; Cheng, Yibin; Zhou, Rongjia; Cheng, Hanhua
2013-01-01
Gonad differentiation is one of the most important developmental events in vertebrates. Some heat shock proteins are associated with gonad development. Heat shock protein 70 (Hsp70) in the teleost fish and its roles in sex differentiation are poorly understood. We have identified a testis-enriched heat shock protein Hspa8b2 in the swamp eel using Western blot analysis and Mass Spectrometry (MS). Fourteen Hsp70 family genes were further identified in this species based on transcriptome information. The phylogenetic tree of Hsp70 family was constructed using the Maximum Likelihood method and their expression patterns in the swamp eel gonads were analyzed by reverse transcription-polymerase chain reaction (RT-PCR). There are fourteen gene members in the Hsp70 family in the swamp eel genome. Hsp70 family, particularly Hspa8, has expanded in the species. One of the family members Hspa8b2 is predominantly expressed in testis of the swamp eel.
Li, Ying Hong; Xu, Jing Yu; Tao, Lin; Li, Xiao Feng; Li, Shuang; Zeng, Xian; Chen, Shang Ying; Zhang, Peng; Qin, Chu; Zhang, Cheng; Chen, Zhe; Zhu, Feng; Chen, Yu Zong
2016-01-01
Knowledge of protein function is important for biological, medical and therapeutic studies, but the functions of many proteins are still unknown. There is a need for improved functional prediction methods. Our SVM-Prot web-server employed a machine learning method for predicting protein functional families from protein sequences irrespective of similarity, which complemented similarity-based and other methods in predicting diverse classes of proteins, including distantly related proteins and homologous proteins of different functions. Since its publication in 2003, we made major improvements to SVM-Prot with (1) expanded coverage from 54 to 192 functional families, (2) more diverse protein descriptors for protein representation, (3) improved predictive performance due to the use of more enriched training datasets and a greater variety of protein descriptors, (4) a newly integrated BLAST analysis option for assessing proteins in the SVM-Prot predicted functional families that were similar in sequence to a query protein, and (5) a newly added batch submission option for supporting the classification of multiple proteins. Moreover, two more machine learning approaches, K nearest neighbor and probabilistic neural networks, were added to facilitate collective assessment of protein functions by multiple methods. SVM-Prot can be accessed at http://bidd2.nus.edu.sg/cgi-bin/svmprot/svmprot.cgi.
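The idea of classifying proteins from sequence-derived descriptors rather than from sequence similarity can be illustrated with a minimal sketch. It is not the SVM-Prot implementation: it assumes a single, very simple descriptor (amino acid composition) and a plain K nearest neighbor vote, and the function names, the toy sequences and the labels "basic-rich" and "acidic-rich" are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Return the 20-dimensional amino acid composition of a sequence."""
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def knn_predict(query, training, k=3):
    """Predict a label by majority vote among the k nearest training sequences.

    Distance is Euclidean distance between composition vectors.
    """
    q = composition(query)
    ranked = sorted(training, key=lambda item: math.dist(q, composition(item[0])))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy training set, for illustration only.
TOY_TRAINING = [
    ("KKKRRKKRKR", "basic-rich"),
    ("RRKKRKRKKK", "basic-rich"),
    ("KRKRKKKRRA", "basic-rich"),
    ("DDEEDEDEED", "acidic-rich"),
    ("EEDDEDEDDE", "acidic-rich"),
    ("DEDEEDDEEA", "acidic-rich"),
]

if __name__ == "__main__":
    print(knn_predict("KKRKRAKKRR", TOY_TRAINING))
```

A real predictor would use many more descriptors and curated training families; the sketch only shows that the classification step needs no alignment between the query and the training sequences.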
Estrada-Gómez, Sebastian; Vargas-Muñoz, Leidy Johana; Saldarriaga-Córdoba, Mónica; Cifuentes, Yeimy; Perafan, Carlos
2017-04-01
Theraphosidae spider venoms are well known to contain a complex mixture of protein and non-protein compounds. The objective of this study was to report and identify different proteins translated from the venom gland DNA information of the recently described Theraphosidae spider Pamphobeteus verdolaga. Using a venom gland transcriptomic analysis, we reported a set of the first complete sequences of seven different proteins of the recently described Theraphosidae spider P. verdolaga. Protein analysis indicates the presence of different proteins in the venom composition of this new spider, some of them uncommon in the Theraphosidae family. MS/MS analysis of P. verdolaga showed different fragments matching sphingomyelinases (sicaritoxin), barytoxins, hexatoxins, latroinsectotoxins, and linear (zadotoxins) peptides. Only four of the MS/MS fragments showed 100% sequence similarity with one of the transcribed proteins. Transcriptomic analysis showed the presence of different groups of proteins such as phospholipases, hyaluronidases, and inhibitory cysteine knot (ICK) peptides, among others. The three databases of protein domains used in this study (Pfam, SMART and CDD) showed congruency in the search for a unique conserved protein domain for only four of the translated proteins. Those proteins matched EF-hand proteins, cysteine-rich secretory proteins, jingzhaotoxins, theraphotoxins and hexatoxins from different Mygalomorphae spiders belonging to the families Theraphosidae, Barychelidae and Hexathelidae. None of the analyzed sequences showed a complete 100% similarity. Copyright © 2017 Elsevier Ltd. All rights reserved.
2010-01-01
Background: The extended light-harvesting complex (LHC) protein superfamily is a centerpiece of eukaryotic photosynthesis, comprising the LHC family and several families involved in photoprotection, like the LHC-like and the photosystem II subunit S (PSBS). The evolution of this complex superfamily has long remained elusive, partially due to previously missing families. Results: In this study we present a meticulous search for LHC-like sequences in public genome and expressed sequence tag databases covering twelve representative photosynthetic eukaryotes from the three primary lineages of plants (Plantae): glaucophytes, red algae and green plants (Viridiplantae). By introducing a coherent classification of the different protein families based on both hidden Markov model analyses and structural predictions, numerous new LHC-like sequences were identified and several new families were described, including the red lineage chlorophyll a/b-binding-like protein (RedCAP) family from red algae and diatoms. The test of alternative topologies of sequences of the highly conserved chlorophyll-binding core structure of LHC and PSBS proteins significantly supports the independent origins of LHC and PSBS families via two unrelated internal gene duplication events. This result was confirmed by the application of cluster likelihood mapping. Conclusions: The independent evolution of LHC and PSBS families is supported by strong phylogenetic evidence. In addition, a possible origin of LHC and PSBS families from different homologous members of the stress-enhanced protein subfamily, a diverse and anciently paralogous group of two-helix proteins, seems likely. The new hypothesis for the evolution of the extended LHC protein superfamily proposed here is in agreement with the character evolution analysis that incorporates the distribution of families and subfamilies across taxonomic lineages. Intriguingly, stress-enhanced proteins, which are universally found in the genomes of green plants, red algae, glaucophytes and in diatoms with complex plastids, could represent an important and previously missing link in the evolution of the extended LHC protein superfamily. PMID:20673336
Genome-Wide Identification and Expression of Xenopus F-Box Family of Proteins.
Saritas-Yildirim, Banu; Pliner, Hannah A; Ochoa, Angelica; Silva, Elena M
2015-01-01
Protein degradation via the multistep ubiquitin/26S proteasome pathway is a rapid way to alter the protein profile and drive cell processes and developmental changes. Many key regulators of embryonic development are targeted for degradation by E3 ubiquitin ligases. The most studied family of E3 ubiquitin ligases is the SCF ubiquitin ligases, which use F-box adaptor proteins to recognize and recruit target proteins. Here, we used a bioinformatics screen and phylogenetic analysis to identify and annotate the family of F-box proteins in the Xenopus tropicalis genome. To shed light on the function of the F-box proteins, we analyzed expression of F-box genes during early stages of Xenopus development. Many F-box genes are broadly expressed with expression domains localized to diverse tissues including brain, spinal cord, eye, neural crest derivatives, somites, kidneys, and heart. All together, our genome-wide identification and expression profiling of the Xenopus F-box family of proteins provide a foundation for future research aimed to identify the precise role of F-box dependent E3 ubiquitin ligases and their targets in the regulatory circuits of development.
Bevans, Carville G.; Krettler, Christoph; Reinhart, Christoph; Watzka, Matthias; Oldenburg, Johannes
2015-01-01
In humans and other vertebrate animals, vitamin K 2,3-epoxide reductase (VKOR) family enzymes are the gatekeepers between nutritionally acquired K vitamins and the vitamin K cycle responsible for posttranslational modifications that confer biological activity upon vitamin K-dependent proteins with crucial roles in hemostasis, bone development and homeostasis, hormonal carbohydrate regulation and fertility. We report a phylogenetic analysis of the VKOR family that identifies five major clades. Combined phylogenetic and site-specific conservation analyses point to clade-specific similarities and differences in structure and function. We discovered a single-site determinant uniquely identifying VKOR homologs belonging to human pathogenic, obligate intracellular prokaryotes and protists. Building on previous work by Sevier et al. (Protein Science 14:1630), we analyzed structural data from both VKOR and prokaryotic disulfide bond formation protein B (DsbB) families and hypothesize an ancient evolutionary relationship between the two families where one family arose from the other through a gene duplication/deletion event. This has resulted in circular permutation of primary sequence threading through the four-helical bundle protein folds of both families. This is the first report of circular permutation relating distant α-helical membrane protein sequences and folds. In conclusion, we suggest a chronology for the evolution of the five extant VKOR clades. PMID:26230708
Chloroplast outer envelope protein P39 in Arabidopsis thaliana belongs to the Omp85 protein family.
Hsueh, Yi-Ching; Flinner, Nadine; Gross, Lucia E; Haarmann, Raimund; Mirus, Oliver; Sommer, Maik S; Schleiff, Enrico
2017-08-01
Proteins of the Omp85 family chaperone the membrane insertion of β-barrel-shaped outer membrane proteins in bacteria, mitochondria, and probably chloroplasts and facilitate the transfer of nuclear-encoded cytosolically synthesized preproteins across the outer envelope of chloroplasts. This protein family is characterized by N-terminal polypeptide transport-associated (POTRA) domains and a C-terminal membrane-embedded β-barrel. We have investigated a recently identified Omp85 family member of Arabidopsis thaliana annotated as P39. We show by in vitro and in vivo experiments that P39 is localized in chloroplasts. The electrophysiological properties of P39 are consistent with those of other Omp85 family members, confirming the sequence-based assignment of P39 to this family. Bioinformatic analysis showed that P39 lacks any POTRA domain, while a complete 16-stranded β-barrel including the highly conserved L6 loop is proposed. The electrophysiological properties are most comparable to Toc75-V, which is consistent with the phylogenetic clustering of P39 in the Toc75-V rather than the Toc75-III branch of the Omp85 family tree. Taken together, P39 forms a pore with Omp85 family protein characteristics. The bioinformatic comparison of the pore region of Toc75-III, Toc75-V, and P39 shows distinctions of the barrel region most likely related to function. Proteins 2017; 85:1391-1401. © 2014 Wiley Periodicals, Inc.
2014-01-01
Background: Bacteroides spp. form a significant part of our gut microbiome and are well known for optimized metabolism of diverse polysaccharides. Initial analysis of the archetypal Bacteroides thetaiotaomicron genome identified 172 glycosyl hydrolases and a large number of uncharacterized proteins associated with polysaccharide metabolism. Results: BT_1012 from Bacteroides thetaiotaomicron VPI-5482 is a protein of unknown function and a member of a large protein family consisting entirely of uncharacterized proteins. Initial sequence analysis predicted that this protein has two domains, one at the N-terminus and one at the C-terminus. A PSI-BLAST search found over 150 full-length and over 90 half-size homologs consisting only of the N-terminal domain. The experimentally determined three-dimensional structure of the BT_1012 protein confirms its two-domain architecture, and structural analysis of both domains suggests their specific functions. The N-terminal domain is a putative catalytic domain with significant similarity to known glycoside hydrolases. The C-terminal domain has a beta-sandwich fold typically found in C-terminal domains of other glycosyl hydrolases; however, these domains are typically involved in substrate binding. We describe the structure of the BT_1012 protein and discuss its sequence-structure relationship and possible functional implications. Conclusions: Structural and sequence analyses of the BT_1012 protein identify it as a glycosyl hydrolase, expanding an already impressive catalog of enzymes involved in polysaccharide metabolism in Bacteroides spp. Based on this, we have renamed the Pfam families representing the two domains found in the BT_1012 protein, PF13204 and PF12904, as putative glycoside hydrolase and glycoside hydrolase-associated C-terminal domain, respectively. PMID:24742328
Inupakutika, Madhuri A; Sengupta, Soham; Nechushtai, Rachel; Jennings, Patricia A; Onuchic, José N; Azad, Rajeev K; Padilla, Pamela; Mittler, Ron
2017-02-16
NEET proteins belong to a unique family of iron-sulfur proteins in which the 2Fe-2S cluster is coordinated by a CDGSH domain that is followed by the "NEET" motif. They are involved in the regulation of iron and reactive oxygen metabolism, and have been associated with the progression of diabetes, cancer, aging and neurodegenerative diseases. Despite their important biological functions, the evolution and diversification of eukaryotic NEET proteins are largely unknown. Here we used the three members of the human NEET protein family (CISD1, mitoNEET; CISD2, NAF-1 or Miner 1; and CISD3, Miner2) as our guides to conduct a phylogenetic analysis of eukaryotic NEET proteins and their evolution. Our findings identified the slime mold Dictyostelium discoideum's CISD proteins as the closest to the ancient archetype of eukaryotic NEET proteins. We further identified CISD3 homologs in fungi that were previously reported not to contain any NEET proteins, and revealed that plants lack homolog(s) of CISD3. Furthermore, our study suggests that the mammalian NEET proteins, mitoNEET (CISD1) and NAF-1 (CISD2), emerged via gene duplication around the origin of vertebrates. Our findings provide new insights into the classification and expansion of the NEET protein family, as well as offer clues to the diverged functions of the human mitoNEET and NAF-1 proteins.
Comparative Structural Analysis of Human DEAD-Box RNA Helicases
Schütz, Patrick; Karlberg, Tobias; van den Berg, Susanne; Collins, Ruairi; Lehtiö, Lari; Högbom, Martin; Holmberg-Schiavone, Lovisa; Tempel, Wolfram; Park, Hee-Won; Hammarström, Martin; Moche, Martin; Thorsell, Ann-Gerd; Schüler, Herwig
2010-01-01
DEAD-box RNA helicases play various, often critical, roles in all processes where RNAs are involved. Members of this family of proteins are linked to human disease, including cancer and viral infections. DEAD-box proteins contain two conserved domains that both contribute to RNA and ATP binding. Despite recent advances, the molecular details of how these enzymes convert chemical energy into RNA remodeling are unknown. We present crystal structures of the isolated DEAD-domains of human DDX2A/eIF4A1, DDX2B/eIF4A2, DDX5, DDX10/DBP4, DDX18/myc-regulated DEAD-box protein, DDX20, DDX47, DDX52/ROK1, and DDX53/CAGE, and of the helicase domains of DDX25 and DDX41. Together with prior knowledge, this enables a family-wide comparative structural analysis. We propose a general mechanism for opening of the RNA binding site. This analysis also provides insights into the diversity of DExD/H-box proteins, with implications for understanding the functions of individual family members. PMID:20941364
Zhao, Jie
2010-01-01
Arabinogalactan proteins (AGPs) comprise a family of hydroxyproline-rich glycoproteins that are implicated in plant growth and development. In this study, 69 AGPs are identified from the rice genome, including 13 classical AGPs, 15 arabinogalactan (AG) peptides, three non-classical AGPs, three early nodulin-like AGPs (eNod-like AGPs), eight non-specific lipid transfer protein-like AGPs (nsLTP-like AGPs), and 27 fasciclin-like AGPs (FLAs). The results from expressed sequence tags, microarrays, and massively parallel signature sequencing tags are used to analyse the expression of AGP-encoding genes, which is confirmed by real-time PCR. The results reveal that several rice AGP-encoding genes are predominantly expressed in anthers and display differential expression patterns in response to abscisic acid, gibberellic acid, and abiotic stresses. Based on the results obtained from this analysis, an attempt has been made to link the protein structures and expression patterns of rice AGP-encoding genes to their functions. Taken together, the genome-wide identification and expression analysis of the rice AGP gene family might facilitate further functional studies of rice AGPs. PMID:20423940
CORAL: aligning conserved core regions across domain families.
Fong, Jessica H; Marchler-Bauer, Aron
2009-08-01
Homologous protein families share highly conserved sequence and structure regions that are frequent targets for comparative analysis of related proteins and families. Many protein families, such as the curated domain families in the Conserved Domain Database (CDD), exhibit similar structural cores. To improve accuracy in aligning such protein families, we propose a profile-profile method CORAL that aligns individual core regions as gap-free units. CORAL computes optimal local alignment of two profiles with heuristics to preserve continuity within core regions. We benchmarked its performance on curated domains in CDD, which have pre-defined core regions, against COMPASS, HHalign and PSI-BLAST, using structure superpositions and comprehensive curator-optimized alignments as standards of truth. CORAL improves alignment accuracy on core regions over general profile methods, returning a balanced score of 0.57 for over 80% of all domain families in CDD, compared with the highest balanced score of 0.45 from other methods. Further, CORAL provides E-values to aid in detecting homologous protein families and, by respecting block boundaries, produces alignments with improved 'readability' that facilitate manual refinement. CORAL will be included in future versions of the NCBI Cn3D/CDTree software, which can be downloaded at http://www.ncbi.nlm.nih.gov/Structure/cdtree/cdtree.shtml. Supplementary data are available at Bioinformatics online.
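The local alignment step mentioned above can be sketched with the classic dynamic-programming recurrence. This is a simplified illustration, not the CORAL algorithm: it assumes plain sequences instead of profiles, a linear gap penalty, and no gap-free core blocks, and the function name and scoring parameters are hypothetical choices.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score of two sequences.

    Uses the Smith-Waterman recurrence with a linear gap penalty and
    keeps only two rows of the score matrix in memory.
    """
    cols = len(b) + 1
    prev = [0] * cols  # scores for the previous row of the matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                   # start a new local alignment here
                prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                prev[j] + gap,       # gap in b
                curr[j - 1] + gap,   # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(local_alignment_score("XXACGTXX", "YYACGTYY"))
```

A block-aware method would additionally forbid the two gap moves inside a core region, so that each core block is aligned as a single ungapped unit.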
Xu, Ruirui; Liu, Caiyun; Li, Ning; Zhang, Shizhong
2016-12-01
Argonaute (AGO) proteins, which are found in yeast, animals, and plants, are the core molecules of the RNA-induced silencing complex. These proteins play important roles in plant growth, development, and responses to biotic stresses. The complete analysis and classification of the AGO gene family have been recently reported in different plants. Nevertheless, systematic analysis and expression profiling of these genes have not been performed in apple (Malus domestica). Approximately 15 AGO genes were identified in the apple genome. The phylogenetic tree, chromosome location, conserved protein motifs, gene structure, and expression of the AGO gene family in apple were analyzed for gene prediction. All AGO genes were phylogenetically clustered into four groups (i.e., AGO1, AGO4, MEL1/AGO5, and ZIPPY/AGO7) with the AGO genes of Arabidopsis. These groups of the AGO gene family were statistically analyzed and compared among 31 plant species. The predicted apple AGO genes are distributed across nine chromosomes at different densities and include three segment duplications. Expression studies indicated that 15 AGO genes exhibit different expression patterns in at least one of the tissues tested. Additionally, analysis of gene expression levels indicated that the genes are mostly involved in responses to NaCl, PEG, heat, and low-temperature stresses. Hence, several candidate AGO genes are involved in different aspects of physiological and developmental processes and may play an important role in abiotic stress responses in apple. To the best of our knowledge, this study is the first to report a comprehensive analysis of the apple AGO gene family. Our results provide useful information to understand the classification and putative functions of these proteins, especially for gene members that may play important roles in abiotic stress responses in M. hupehensis.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jeon, Hyejin; Zheng, Long Tai; Lee, Shinrye
2011-08-15
The small G protein superfamily consists of more than 150 members, and is classified into six families: the Ras, Rho, Rab, Arf, Ran, and RGK families. They regulate a wide variety of cell functions such as cell proliferation/differentiation, cytoskeletal reorganization, vesicle trafficking, nucleocytoplasmic transport and microtubule organization. The small G proteins have also been shown to regulate cell death/survival and cell shape. In this study, we compared the role of representative members of the six families of small G proteins in cell migration and cell death/survival, two cellular phenotypes that are associated with inflammation, tumorigenesis, and metastasis. Our results show that small G proteins of the six families differentially regulate cell death and cell cycle distribution. In particular, our results indicate that the Rho family of small G proteins is antiapoptotic. Ras, Rho, and Ran families promoted cell migration. There was no significant correlation between the cell death- and cell migration-regulating activities of the small G proteins. Nevertheless, RalA was not only cytoprotective against multiple chemotherapeutic drugs, but also promigratory, inducing stress fiber formation, which was accompanied by the activation of Akt and Erk pathways. Our study provides a framework for further systematic investigation of small G proteins in the perspectives of cell death/survival and motility in inflammation and cancer.
Chai, Wenbo; Jiang, Pengfei; Huang, Guoyu; Jiang, Haiyang; Li, Xiaoyu
2017-10-01
The TCP family is a group of plant-specific transcription factors. TCP genes encode proteins harboring a bHLH structure, which is implicated in DNA binding and protein-protein interactions and is known as the TCP domain. TCP genes play important roles in plant development and have been evolutionarily and functionally elaborated in various plants; however, no overall phylogenetic analysis or expression profiling of TCP genes in Zea mays has been reported. In the present study, a systematic analysis of molecular evolution and functional prediction of TCP family genes in maize (Z. mays L.) has been conducted. We performed a genome-wide survey of TCP genes in maize, revealing the gene structure, chromosomal location and phylogenetic relationship of family members. Microsynteny between grass species and tissue-specific expression profiles were also investigated. In total, 29 TCP genes were identified in the maize genome, unevenly distributed on the 10 maize chromosomes. Additionally, ZmTCP genes were categorized into nine classes based on phylogeny, and purifying selection may largely be responsible for maintaining the functions of maize TCP genes. Moreover, microsynteny analysis suggested that TCP genes have been conserved during evolution. Finally, expression analysis revealed that most TCP genes are expressed in the stem and ear, which suggests that ZmTCP genes influence stem and ear growth. This result is consistent with the previous finding that maize TCP genes repress the growth of axillary organs and enable the formation of female inflorescences. Altogether, this study presents a thorough overview of the TCP family in maize and provides a new perspective on the evolution of this gene family. The results also indicate that TCP family genes may be involved in developmental stages under plant growing conditions. Additionally, our results will be useful for further functional analysis of the TCP gene family in maize.
Woo, Young-Min; Hu, David Wang-Nan; Larkins, Brian A.; Jung, Rudolf
2001-01-01
We analyzed cDNA libraries from developing endosperm of the B73 maize inbred line to evaluate the expression of storage protein genes. This study showed that zeins are by far the most highly expressed genes in the endosperm, but we found an inverse relationship between the number of zein genes and the relative amount of specific mRNAs. Although α-zeins are encoded by large multigene families, only a few of these genes are transcribed at high or detectable levels. In contrast, relatively small gene families encode the γ- and δ-zeins, and members of these gene families, especially the γ-zeins, are highly expressed. Knowledge of expressed storage protein genes allowed the development of DNA and antibody probes that distinguish between closely related gene family members. Using in situ hybridization, we found differences in the temporal and spatial expression of the α-, γ-, and δ-zein gene families, which provides evidence that γ-zeins are synthesized throughout the endosperm before α- and δ-zeins. This observation is consistent with earlier studies that suggested that γ-zeins play an important role in prolamin protein body assembly. Analysis of endosperm cDNAs also revealed several previously unidentified proteins, including a 50-kD γ-zein, an 18-kD α-globulin, and a legumin-related protein. Immunolocalization of the 50-kD γ-zein showed this protein to be located at the surface of prolamin-containing protein bodies, similar to other γ-zeins. The 18-kD α-globulin, however, is deposited in novel, vacuole-like organelles that were not described previously in maize endosperm. PMID:11595803
Bartho, Joseph D.; Bellini, Dom; Wuerges, Jochen; Demitri, Nicola; Toccafondi, Mirco; Schmitt, Armin O.; Zhao, Youfu; Walsh, Martin A.
2017-01-01
AmyR is a stress and virulence associated protein from the plant pathogenic Enterobacteriaceae species Erwinia amylovora, and is a functionally conserved ortholog of YbjN from Escherichia coli. The crystal structure of E. amylovora AmyR reveals a class I type III secretion chaperone-like fold, despite the lack of sequence similarity between these two classes of protein and lacking any evidence of a secretion-associated role. The results indicate that AmyR, and YbjN proteins in general, function through protein-protein interactions without any enzymatic action. The YbjN proteins of Enterobacteriaceae show remarkably low sequence similarity with other members of the YbjN protein family in Eubacteria, yet a high level of structural conservation is observed. Across the YbjN protein family sequence conservation is limited to residues stabilising the protein core and dimerization interface, while interacting regions are only conserved between closely related species. This study presents the first structure of a YbjN protein from Enterobacteriaceae, the most highly divergent and well-studied subgroup of YbjN proteins, and an in-depth sequence and structural analysis of this important but poorly understood protein family. PMID:28426806
Bartho, Joseph D; Bellini, Dom; Wuerges, Jochen; Demitri, Nicola; Toccafondi, Mirco; Schmitt, Armin O; Zhao, Youfu; Walsh, Martin A; Benini, Stefano
2017-01-01
AmyR is a stress and virulence associated protein from the plant pathogenic Enterobacteriaceae species Erwinia amylovora, and is a functionally conserved ortholog of YbjN from Escherichia coli. The crystal structure of E. amylovora AmyR reveals a class I type III secretion chaperone-like fold, despite the lack of sequence similarity between these two classes of protein and lacking any evidence of a secretion-associated role. The results indicate that AmyR, and YbjN proteins in general, function through protein-protein interactions without any enzymatic action. The YbjN proteins of Enterobacteriaceae show remarkably low sequence similarity with other members of the YbjN protein family in Eubacteria, yet a high level of structural conservation is observed. Across the YbjN protein family sequence conservation is limited to residues stabilising the protein core and dimerization interface, while interacting regions are only conserved between closely related species. This study presents the first structure of a YbjN protein from Enterobacteriaceae, the most highly divergent and well-studied subgroup of YbjN proteins, and an in-depth sequence and structural analysis of this important but poorly understood protein family.
Brown, S M; Crouch, M L
1990-01-01
We have isolated and characterized cDNA clones of a gene family (P2) expressed in Oenothera organensis pollen. This family contains approximately six to eight family members and is expressed at high levels only in pollen. The predicted protein sequence from a near full-length cDNA clone shows that the protein products of these genes are at least 38,000 daltons. We identified the protein encoded by one of the cDNAs in this family by using antibodies to beta-galactosidase/pollen cDNA fusion proteins. Immunoblot analysis using these antibodies identifies a family of proteins of approximately 40 kilodaltons that is present in mature pollen, indicating that these mRNAs are not stored solely for translation after pollen germination. These proteins accumulate late in pollen development and are not detectable in other parts of the plant. Although not present in unpollinated or self-pollinated styles, the 40-kilodalton to 45-kilodalton antigens are detectable in extracts from cross-pollinated styles, suggesting that the proteins are present in pollen tubes growing through the style during pollination. The proteins are also present in pollen tubes growing in vitro. Both nucleotide and amino acid sequences are similar to the published sequences for cDNAs encoding the enzyme polygalacturonase, which suggests that the P2 gene family may function in depolymerizing pectin during pollen development, germination, and tube growth. Cross-hybridizing RNAs and immunoreactive proteins were detected in pollen from a wide variety of plant species, which indicates that the P2 family of polygalacturonase-like genes is conserved and may be expressed in the pollen of many angiosperms. PMID:2152116
Mushegian, Arcady; Karin, Eli Levy; Pupko, Tal
2018-01-01
The order Herpesvirales includes animal viruses with large double-stranded DNA genomes replicating in the nucleus. The main capsid protein in the best-studied family Herpesviridae contains a domain with HK97-like fold related to bacteriophage head proteins, and several virion maturation factors are also homologous between phages and herpesviruses. The origin of herpesvirus DNA replication proteins is less well understood. While analyzing the genomes of herpesviruses in the family Malacoherpesviridae, we identified nearly 30 families of proteins conserved in other herpesviruses, including several phage-related domains in morphogenetic proteins. Herpesvirus DNA replication factors have a complex evolutionary history: some are related to cellular proteins, but others are closer to homologs from large nucleocytoplasmic DNA viruses. Phylogenetic analyses suggest that the core replication machinery of herpesviruses may have been recruited from the same pool as in the case of other large DNA viruses of eukaryotes. Published by Elsevier Inc.
Ohta, K; Iwai, K; Kasahara, Y; Taniguchi, N; Krajewski, S; Reed, J C; Miyawaki, T
1995-11-01
The ability of Bcl-2 to inhibit apoptotic cell death is well established. Several homologues of the bcl-2 gene, such as bax, bcl-x or mcl-1, have recently been identified. Like Bcl-2, both Bcl-XL and Mcl-1 appear to function as repressors of apoptotic cell death, whereas Bax facilitates it, indicating possible interactions among them in the control of cellular survival. To investigate the in vivo role of expression of bcl-2 gene family products, immunoblot analysis using corresponding specific antisera was performed for peripheral blood cells and some lymphoid tissues in humans. We demonstrated that all Bcl-2 family proteins were expressed at various levels in hematolymphoid cell subpopulations isolated from peripheral blood, tonsil, spleen and thymus. Lymphoid expression of Bcl-2 family proteins tended to increase following activation, but declined with time in culture. Loss of Bcl-2 in cultured lymphoid cells was especially marked. Sole expression of Bax, but not other members of the Bcl-2 family, was observed on neutrophils, seemingly reflecting their shortest life-span among blood leukocytes. The results support the notion that a balance of expression of Bcl-2 family proteins may regulate the life and death of hematolymphoid cells at different stages of cell differentiation and activation.
ITEP: an integrated toolkit for exploration of microbial pan-genomes.
Benedict, Matthew N; Henriksen, James R; Metcalf, William W; Whitaker, Rachel J; Price, Nathan D
2014-01-03
Comparative genomics is a powerful approach for studying variation in physiological traits as well as the evolution and ecology of microorganisms. Recent technological advances have enabled sequencing large numbers of related genomes in a single project, requiring computational tools for their integrated analysis. In particular, accurate annotations and identification of gene presence and absence are critical for understanding and modeling the cellular physiology of newly sequenced genomes. Although many tools are available to compare the gene contents of related genomes, new tools are necessary to enable close examination and curation of protein families from large numbers of closely related organisms, to integrate curation with the analysis of gain and loss, and to generate metabolic networks linking the annotations to observed phenotypes. We have developed ITEP, an Integrated Toolkit for Exploration of microbial Pan-genomes, to curate protein families, compute similarities to externally-defined domains, analyze gene gain and loss, and generate draft metabolic networks from one or more curated reference network reconstructions in groups of related microbial species among which the combination of core and variable genes constitutes their "pan-genomes". The ITEP toolkit consists of: (1) a series of modular command-line scripts for identification, comparison, curation, and analysis of protein families and their distribution across many genomes; (2) a set of Python libraries for programmatic access to the same data; and (3) pre-packaged scripts to perform common analysis workflows on a collection of genomes. ITEP's capabilities include de novo protein family prediction, ortholog detection, analysis of functional domains, identification of core and variable genes and gene regions, sequence alignments and tree generation, annotation curation, and the integration of cross-genome analysis and metabolic networks for study of metabolic network evolution.
ITEP is a powerful, flexible toolkit for generation and curation of protein families. ITEP's modular design allows for straightforward extension as analysis methods and tools evolve. By integrating comparative genomics with the development of draft metabolic networks, ITEP harnesses the power of comparative genomics to build confidence in links between genotype and phenotype and helps disambiguate gene annotations when they are evaluated in both evolutionary and metabolic network contexts.
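The distinction between core and variable genes mentioned in the ITEP abstract can be illustrated with a minimal sketch. This is not ITEP's own code or API; the function name, genome labels, and family IDs below are hypothetical, and the sketch simply assumes that a family is "core" when it is present in every genome and "variable" when it is present in only some.

```python
from typing import Dict, Set, Tuple


def split_core_variable(
    presence: Dict[str, Set[str]], genomes: Set[str]
) -> Tuple[Set[str], Set[str]]:
    """Split protein families into core and variable sets.

    presence maps a (hypothetical) family ID to the set of genomes that
    contain at least one member of that family.
    """
    # Core: the family is found in every genome under study.
    core = {fam for fam, found in presence.items() if genomes <= found}
    # Variable: found in at least one genome, but not in all of them.
    variable = {
        fam for fam, found in presence.items() if found and not genomes <= found
    }
    return core, variable


# Toy presence/absence data (illustrative only).
genomes = {"genome_A", "genome_B", "genome_C"}
presence = {
    "fam1": {"genome_A", "genome_B", "genome_C"},
    "fam2": {"genome_A", "genome_C"},
    "fam3": {"genome_B"},
}

core, variable = split_core_variable(presence, genomes)
print(sorted(core))      # ['fam1']
print(sorted(variable))  # ['fam2', 'fam3']
```

Under this assumed definition the pan-genome is the union of the two sets; real toolkits may apply softer thresholds (for example, presence in nearly all genomes) to tolerate incomplete assemblies.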
Huang, Jianyan; Zhao, Xiaobo; Weng, Xiaoyu; Wang, Lei; Xie, Weibo
2012-01-01
Background: The B-box (BBX)-containing proteins are a class of zinc finger proteins that contain one or two B-box domains and play important roles in plant growth and development. The Arabidopsis BBX gene family has recently been re-identified and renamed. However, there has not been a genome-wide survey of the rice BBX (OsBBX) gene family until now. Methodology/Principal Findings: In this study, we identified 30 rice BBX genes through a comprehensive bioinformatics analysis. Each gene was assigned a uniform nomenclature. We described the chromosome localizations, gene structures, protein domains, phylogenetic relationship, whole life-cycle expression profile and diurnal expression patterns of the OsBBX family members. Based on the phylogeny and domain constitution, the OsBBX gene family was classified into five subfamilies. The gene duplication analysis revealed that only chromosomal segmental duplication contributed to the expansion of the OsBBX gene family. The expression profile of the OsBBX genes was analyzed by Affymetrix GeneChip microarrays throughout the entire life-cycle of rice cultivar Zhenshan 97 (ZS97). In addition, microarray analysis was performed to obtain the expression patterns of these genes under light/dark conditions and after three phytohormone treatments. This analysis revealed that the expression patterns of the OsBBX genes could be classified into eight groups. Eight genes were regulated under the light/dark treatments, and eleven genes showed differential expression under at least one phytohormone treatment. Moreover, we verified the diurnal expression of the OsBBX genes using the data obtained from the Diurnal Project and qPCR analysis, and the results indicated that many of these genes had a diurnal expression pattern. Conclusions/Significance: The combination of the genome-wide identification and the expression and diurnal analysis of the OsBBX gene family should facilitate additional functional studies of the OsBBX genes. PMID:23118960
Elucidating Proteoform Families from Proteoform Intact-Mass and Lysine-Count Measurements
2016-01-01
Proteomics is presently dominated by the “bottom-up” strategy, in which proteins are enzymatically digested into peptides for mass spectrometric identification. Although this approach is highly effective at identifying large numbers of proteins present in complex samples, the digestion into peptides renders it impossible to identify the proteoforms from which they were derived. We present here a powerful new strategy for the identification of proteoforms and the elucidation of proteoform families (groups of related proteoforms) from the experimental determination of the accurate proteoform mass and number of lysine residues contained. Accurate proteoform masses are determined by standard LC–MS analysis of undigested protein mixtures in an Orbitrap mass spectrometer, and the lysine count is determined using the NeuCode isotopic tagging method. We demonstrate the approach in analysis of the yeast proteome, revealing 8637 unique proteoforms and 1178 proteoform families. The elucidation of proteoforms and proteoform families afforded here provides an unprecedented new perspective upon proteome complexity and dynamics. PMID:26941048
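The idea of identifying a proteoform from two measurements, intact mass and lysine count, can be sketched as a simple filter over candidate proteoforms. This is an illustrative sketch only, not the authors' software or scoring scheme: the candidate names, masses, and the 10 ppm tolerance are hypothetical values chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    name: str        # hypothetical proteoform label
    mass_da: float   # theoretical intact mass in daltons
    lysines: int     # number of lysine residues in the sequence


def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match_proteoforms(
    observed_mass: float,
    lysine_count: int,
    candidates: List[Candidate],
    tol_ppm: float = 10.0,  # assumed tolerance, not taken from the paper
) -> List[Candidate]:
    """Keep candidates that agree on lysine count and lie within the mass tolerance."""
    return [
        c
        for c in candidates
        if c.lysines == lysine_count
        and abs(ppm_error(observed_mass, c.mass_da)) <= tol_ppm
    ]


# Toy candidate database (illustrative values only).
candidates = [
    Candidate("proteoform_A", 10000.00, 5),
    Candidate("proteoform_B", 10000.05, 7),  # close in mass, different lysine count
    Candidate("proteoform_C", 10042.01, 5),  # same lysine count, different mass
]

hits = match_proteoforms(10000.04, 5, candidates)
print([c.name for c in hits])  # ['proteoform_A']
```

In the toy data, proteoform_B is indistinguishable from proteoform_A by mass alone at this tolerance; the lysine count is what separates them, which is the kind of added constraint the abstract describes.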
Understanding protein lids: kinetic analysis of active hinge mutants in triosephosphate isomerase.
Sun, J; Sampson, N S
1999-08-31
In previous work we tested what three amino acid sequences could serve as a protein hinge in triosephosphate isomerase [Sun, J., and Sampson, N. S. (1998) Protein Sci. 7, 1495-1505]. We generated a genetic library encoding all 8000 possible 3 amino acid combinations at the C-terminal hinge and selected for those combinations of amino acids that formed active mutants. These mutants were classified into six phylogenetic families. Two families resembled wild-type hinges, and four families represented new types of hinges. In this work, the kinetic characteristics and thermal stabilities of mutants representing each of these families were determined in order to understand what properties make an efficient protein hinge, and why all of the families are not observed in nature. From a steady-state kinetic analysis of our mutants, it is clear that the partitioning between protonation of intermediate to form product and intermediate release from the enzyme surface to form methylglyoxal (a decomposition product) is not affected. The two most impaired mutants undergo a change in rate-limiting step from enediol formation to dihydroxyacetone phosphate binding. Thus, it appears that k(cat)/K(m)'s are reduced relative to wild type as a result of slower Michaelis complex formation and dissociation, rather than increased loop opening speed.
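The library size quoted in the abstract follows from simple counting: with 20 standard amino acids at each of 3 hinge positions, there are 20^3 = 8000 possible combinations. A short sketch that enumerates them (the function name is hypothetical and the code is only an illustration of the arithmetic, not part of the study):

```python
from itertools import product

# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def hinge_variants(length: int = 3):
    """Yield every amino acid combination of the given length."""
    for combo in product(AMINO_ACIDS, repeat=length):
        yield "".join(combo)


variants = list(hinge_variants())
print(len(variants))  # 8000
print(variants[0], variants[-1])  # AAA YYY
```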
Structure and Function of Na+-Symporters with Inverted Repeats
Abramson, Jeff; Wright, Ernest M.
2009-01-01
Summary Symporters are membrane proteins that couple energy stored in electrochemical potential gradients to drive the cotransport of molecules and ions into cells. Traditionally, proteins are classified into gene families based on sequence homology and functional properties, e.g. the sodium glucose (SLC5 or Sodium Solute Symporter Family, SSS or SSF) and GABA (SLC6 or Neurotransmitter Sodium Symporter Family, NSS or SNF) symporter families [1-4]. Recently, it has been established that four Na+-symporter proteins with unrelated sequences have a common structural core containing an inverted repeat of 5 transmembrane (TM) helices [5-8]. Analysis of these four structures reveals that they reside in different conformations along the transport cycle providing atomic insight into the mechanism of sodium solute cotransport. PMID:19631523
Complete nucleotide sequence of jasmine virus H, a new member of the family Tombusviridae.
Zhuo, Tao; Zhu, Li-Juan; Lu, Cheng-Cong; Jiang, Chao-Yang; Chen, Zi-Yin; Zhang, Guangzhi; Wang, Zong-Hua; Jovel, Juan; Han, Yan-Hong
2018-03-01
Jasmine virus H (JaVH) is a novel virus associated with symptoms of yellow mosaic on jasmine. The JaVH genome is 3,867 nt in length with five open reading frames (ORFs) encoding a 27-kDa protein (ORF 1), an 87-kDa replicase protein (ORF 2), two centrally located movement proteins (ORF 3 and 4), and a 37-kDa capsid protein (ORF 5). Based on genomic and phylogenetic analysis, JaVH is predicted to be a member of the genus Pelarspovirus in the family Tombusviridae.
Gong, Wei; He, Kun; Covington, Mike; Dinesh-Kumar, S. P.; Snyder, Michael; Harmer, Stacey L.; Zhu, Yu-Xian; Deng, Xing Wang
2009-01-01
We used our collection of Arabidopsis transcription factor (TF) ORFeome clones to construct protein microarrays containing as many as 802 TF proteins. These protein microarrays were used for both protein-DNA and protein-protein interaction analyses. For protein-DNA interaction studies, we examined AP2/ERF family TFs and their cognate cis-elements. By careful comparison of the DNA-binding specificity of 13 TFs on the protein microarray with previous non-microarray data, we showed that protein microarrays provide an efficient and high throughput tool for genome-wide analysis of TF-DNA interactions. This microarray protein-DNA interaction analysis allowed us to derive a comprehensive view of DNA-binding profiles of AP2/ERF family proteins in Arabidopsis. It also revealed four TFs that bound the EE (evening element) and had the expected phased gene expression under clock-regulation, thus providing a basis for further functional analysis of their roles in clock regulation of gene expression. We also developed procedures for detecting protein interactions using this TF protein microarray and discovered four novel partners that interact with HY5, which can be validated by yeast two-hybrid assays. Thus, plant TF protein microarrays offer an attractive high-throughput alternative to traditional techniques for TF functional characterization on a global scale. PMID:19802365
Exploration of Uncharted Regions of the Protein Universe
Jaroszewski, Lukasz; Li, Zhanwen; Krishna, S. Sri; Bakolitsa, Constantina; Wooley, John; Deacon, Ashley M.; Wilson, Ian A.; Godzik, Adam
2009-01-01
The genome projects have unearthed an enormous diversity of genes of unknown function that are still awaiting biological and biochemical characterization. These genes, as most others, can be grouped into families based on sequence similarity. The PFAM database currently contains over 2,200 such families, referred to as domains of unknown function (DUF). In a coordinated effort, the four large-scale centers of the NIH Protein Structure Initiative have determined the first three-dimensional structures for more than 250 of these DUF families. Analysis of the first 248 reveals that about two thirds of the DUF families likely represent very divergent branches of already known and well-characterized families, which allows hypotheses to be formulated about their biological function. The remainder can be formally categorized as new folds, although about one third of these show significant substructure similarity to previously characterized folds. These results imply that, despite the enormous increase in the number and the diversity of new genes being uncovered, the fold space of the proteins they encode is gradually becoming saturated. The previously unexplored sectors of the protein universe appear to be primarily shaped by extreme diversification of known protein families, which then enables organisms to evolve new functions and adapt to particular niches and habitats. Notwithstanding, these DUF families still constitute the richest source for discovery of the remaining protein folds and topologies. PMID:19787035
Dong, Chen; Hu, Huigang; Xie, Jianghui
2016-12-01
DNA-binding with one finger (Dof) domain proteins are a multigene family of plant-specific transcription factors involved in numerous aspects of plant growth and development. In this study, we report a genome-wide search for Musa acuminata Dof (MaDof) genes and their expression profiles at different developmental stages and in response to various abiotic stresses. In addition, a complete overview of the Dof gene family in bananas is presented, including the gene structures, chromosomal locations, cis-regulatory elements, conserved protein domains, and phylogenetic inferences. Based on the genome-wide analysis, we identified 74 full-length protein-coding MaDof genes unevenly distributed on 11 chromosomes. Phylogenetic analysis with Dof members from diverse plant species showed that MaDof genes can be classified into four subgroups (StDof I, II, III, and IV). The detailed genomic information of the MaDof gene homologs in the present study provides opportunities for functional analyses to unravel the exact role of the genes in plant growth and development.
Cloning and sequence analysis of Hemonchus contortus HC58cDNA.
Muleke, Charles I; Ruofeng, Yan; Lixin, Xu; Xinwen, Bo; Xiangrui, Li
2007-06-01
The complete coding sequence of Hemonchus contortus HC58cDNA was generated by rapid amplification of cDNA ends and polymerase chain reaction using primers based on the 5' and 3' ends of the parasite mRNA, accession no. AF305964. The HC58cDNA gene was 851 bp long, with an open reading frame of 717 bp encoding a 239-amino-acid precursor protein of approximately 27 kDa. Analysis of the amino acid sequence revealed conserved residues of cysteine, histidine, asparagine, occluding loop pattern, hemoglobinase motif and glutamine of the oxyanion hole characteristic of cathepsin B-like proteases (CBL). Comparison of the predicted amino acid sequences showed the protein shared 33.5-58.7% identity to cathepsin B homologues in the papain clan CA family (family C1). Phylogenetic analysis revealed close evolutionary proximity of the protein sequence to counterpart sequences in the CBL, suggesting that HC58cDNA was a member of the papain family.
Zhang, C H; Ma, R J; Shen, Z J; Sun, X; Korir, N K; Yu, M L
2014-04-08
In this study, 33 homeodomain-leucine zipper (HD-ZIP) genes were identified in peach using the HD-ZIP amino acid sequences of Arabidopsis thaliana as a probe. Based on the phylogenetic analysis and the individual gene or protein characteristics, the HD-ZIP gene family in peach can be classified into 4 subfamilies, HD-ZIP I, II, III, and IV, containing 14, 7, 4, and 8 members, respectively. The most closely related peach HD-ZIP members within the same subfamilies shared very similar gene structure in terms of either intron/exon numbers or lengths. Almost all members of the same subfamily shared common motif compositions, thereby implying that the HD-ZIP proteins within the same subfamily may have functional similarity. The 33 peach HD-ZIP genes were distributed across scaffolds 1 to 7. Although the primary structure varied among HD-ZIP family proteins, their tertiary structures were similar. The results from this study will be useful in selecting candidate genes from specific subfamilies for functional analysis.
Wei, Wei; Chai, Zhuangzhuang; Xie, Yinge; Gao, Kuan; Cui, Mengyuan; Jiang, Ying
2017-01-01
Mitogen-activated protein kinases (MAPKs) play essential roles in mediating biotic and abiotic stress responses in plants. However, the MAPK gene family in strawberry has not been systematically characterized. Here, we performed a genome-wide survey and identified 12 MAPK genes in the Fragaria vesca genome. Protein domain analysis indicated that all FvMAPKs have typical protein kinase domains. Sequence alignments and phylogenetic analysis classified the FvMAPK genes into four different groups. Conserved motif and exon-intron organization supported the evolutionary relationships inferred from the phylogenetic analysis. Analysis of the stress-related cis-regulatory element in the promoters and subcellular localization predictions of FvMAPKs were also performed. Gene transcript profile analysis showed that the majority of the FvMAPK genes were ubiquitously transcribed in strawberry leaves after Podosphaera aphanis inoculation and after treatment with cold, heat, drought, salt and the exogenous hormones abscisic acid, ethephon, methyl jasmonate, and salicylic acid. RT-qPCR showed that six selected FvMAPK genes comprehensively responded to various stimuli. Additionally, interaction networks revealed that the crucial signaling transduction controlled by FvMAPKs may be involved in the biotic and abiotic stress responses. Our results may provide useful information for future research on the function of the MAPK gene family and the genetic improvement of strawberry resistance to environmental stresses. PMID:28562633
Yang, Jinhua; Gao, Min; Huang, Li; Wang, Yaqiong; van Nocker, Steve; Wan, Ran; Guo, Chunlei; Wang, Xiping; Gao, Hua
2017-02-09
Basic helix-loop-helix (bHLH) proteins, which are characterized by a conserved bHLH domain, comprise one of the largest families of transcription factors in both plants and animals, and have been shown to have a wide range of biological functions. However, there have been very few studies of bHLH proteins from perennial tree species. We describe here the identification and characterization of 175 bHLH transcription factors from apple (Malus × domestica). Phylogenetic analysis of apple bHLH (MdbHLH) genes and their Arabidopsis thaliana (Arabidopsis) orthologs indicated that they can be classified into 23 subgroups. Moreover, integrated synteny analysis suggested that the large-scale expansion of the bHLH transcription factor family occurred before the divergence of apple and Arabidopsis. An analysis of the exon/intron structure and protein domains was conducted to suggest their functional roles. Finally, we observed that MdbHLH subgroup III and IV genes displayed diverse expression profiles in various organs, as well as in response to abiotic stresses and various hormone treatments. Taken together, these data provide new information regarding the composition and diversity of the apple bHLH transcription factor family that will provide a platform for future targeted functional characterization.
Kusch, Stefan; Pesch, Lina; Panstruga, Ralph
2016-01-01
Mildew resistance Locus O (MLO) proteins are polytopic integral membrane proteins that have long been considered plant-specific and primarily involved in plant–powdery mildew interactions. However, research in the past decade has revealed that MLO proteins diverged into a family with several clades whose members are associated with different physiological processes. We provide a largely increased dataset of MLO amino acid sequences, comprising nearly all major land plant lineages. Based on this comprehensive dataset, we defined seven phylogenetic clades and reconstructed the likely evolution of the MLO family in embryophytes. We further identified several MLO peptide motifs that are either conserved in all MLO proteins or confined to one or several clades, supporting the notion that clade-specific diversification of MLO functions is associated with particular sequence motifs. In baker’s yeast, some of these motifs are functionally linked to transmembrane (TM) transport of organic molecules and ions. In addition, we attempted to define the evolutionary origin of the MLO family and found that MLO-like proteins with highly diverse membrane topologies are present in green algae, but also in the distantly related red algae (Rhodophyta), Amoebozoa, and Chromalveolata. Finally, we discovered several instances of putative fusion events between MLO proteins and different kinds of proteins. Such Rosetta stone-type hybrid proteins might be instructive for future analysis of potential MLO functions. Our findings suggest that MLO is an ancient protein that possibly evolved in unicellular photosynthetic eukaryotes, and consolidated in land plants with a conserved topology, comprising seven TM domains and an intrinsically unstructured C-terminus. PMID:26893454
NASA Astrophysics Data System (ADS)
Inupakutika, Madhuri A.; Sengupta, Soham; Nechushtai, Rachel; Jennings, Patricia A.; Onuchic, José N.; Azad, Rajeev K.; Padilla, Pamela; Mittler, Ron
2017-02-01
NEET proteins belong to a unique family of iron-sulfur proteins in which the 2Fe-2S cluster is coordinated by a CDGSH domain that is followed by the “NEET” motif. They are involved in the regulation of iron and reactive oxygen metabolism, and have been associated with the progression of diabetes, cancer, aging and neurodegenerative diseases. Despite their important biological functions, the evolution and diversification of eukaryotic NEET proteins are largely unknown. Here we used the three members of the human NEET protein family (CISD1, mitoNEET; CISD2, NAF-1 or Miner 1; and CISD3, Miner2) as our guides to conduct a phylogenetic analysis of eukaryotic NEET proteins and their evolution. Our findings identified the slime mold Dictyostelium discoideum’s CISD proteins as the closest to the ancient archetype of eukaryotic NEET proteins. We further identified CISD3 homologs in fungi that were previously reported not to contain any NEET proteins, and revealed that plants lack homolog(s) of CISD3. Furthermore, our study suggests that the mammalian NEET proteins, mitoNEET (CISD1) and NAF-1 (CISD2), emerged via gene duplication around the origin of vertebrates. Our findings provide new insights into the classification and expansion of the NEET protein family, as well as offer clues to the diverged functions of the human mitoNEET and NAF-1 proteins.
Inupakutika, Madhuri A.; Sengupta, Soham; Nechushtai, Rachel; Jennings, Patricia A.; Onuchic, José N.; Azad, Rajeev K.; Padilla, Pamela; Mittler, Ron
2017-01-01
NEET proteins belong to a unique family of iron-sulfur proteins in which the 2Fe-2S cluster is coordinated by a CDGSH domain that is followed by the “NEET” motif. They are involved in the regulation of iron and reactive oxygen metabolism, and have been associated with the progression of diabetes, cancer, aging and neurodegenerative diseases. Despite their important biological functions, the evolution and diversification of eukaryotic NEET proteins are largely unknown. Here we used the three members of the human NEET protein family (CISD1, mitoNEET; CISD2, NAF-1 or Miner 1; and CISD3, Miner2) as our guides to conduct a phylogenetic analysis of eukaryotic NEET proteins and their evolution. Our findings identified the slime mold Dictyostelium discoideum’s CISD proteins as the closest to the ancient archetype of eukaryotic NEET proteins. We further identified CISD3 homologs in fungi that were previously reported not to contain any NEET proteins, and revealed that plants lack homolog(s) of CISD3. Furthermore, our study suggests that the mammalian NEET proteins, mitoNEET (CISD1) and NAF-1 (CISD2), emerged via gene duplication around the origin of vertebrates. Our findings provide new insights into the classification and expansion of the NEET protein family, as well as offer clues to the diverged functions of the human mitoNEET and NAF-1 proteins. PMID:28205535
Functional Evolution of PLP-dependent Enzymes based on Active-Site Structural Similarities
Catazaro, Jonathan; Caprez, Adam; Guru, Ashu; Swanson, David; Powers, Robert
2014-01-01
Families of distantly related proteins typically have very low sequence identity, which hinders evolutionary analysis and functional annotation. Slowly evolving features of proteins, such as an active site, are therefore valuable for annotating putative and distantly related proteins. To date, a complete evolutionary analysis of the functional relationship of an entire enzyme family based on active-site structural similarities has not yet been undertaken. Pyridoxal-5’-phosphate (PLP) dependent enzymes are primordial enzymes that diversified in the last universal ancestor. Using the Comparison of Protein Active Site Structures (CPASS) software and database, we show that the active site structures of PLP-dependent enzymes can be used to infer evolutionary relationships based on functional similarity. The enzymes successfully clustered together based on substrate specificity, function, and three-dimensional fold. This study demonstrates the value of using active site structures for functional evolutionary analysis and the effectiveness of CPASS. PMID:24920327
Functional evolution of PLP-dependent enzymes based on active-site structural similarities.
Catazaro, Jonathan; Caprez, Adam; Guru, Ashu; Swanson, David; Powers, Robert
2014-10-01
Families of distantly related proteins typically have very low sequence identity, which hinders evolutionary analysis and functional annotation. Slowly evolving features of proteins, such as an active site, are therefore valuable for annotating putative and distantly related proteins. To date, a complete evolutionary analysis of the functional relationship of an entire enzyme family based on active-site structural similarities has not yet been undertaken. Pyridoxal-5'-phosphate (PLP) dependent enzymes are primordial enzymes that diversified in the last universal ancestor. Using the comparison of protein active site structures (CPASS) software and database, we show that the active site structures of PLP-dependent enzymes can be used to infer evolutionary relationships based on functional similarity. The enzymes successfully clustered together based on substrate specificity, function, and three-dimensional-fold. This study demonstrates the value of using active site structures for functional evolutionary analysis and the effectiveness of CPASS. © 2014 Wiley Periodicals, Inc.
Analysis of functional redundancies within the Arabidopsis TCP transcription factor family
Danisman, Selahattin; de Folter, Stefan; Immink, Richard G. H.
2013-01-01
Analyses of the functions of TEOSINTE-LIKE1, CYCLOIDEA, and PROLIFERATING CELL FACTOR1 (TCP) transcription factors have been hampered by functional redundancy between its individual members. In general, putative functionally redundant genes are predicted based on sequence similarity and confirmed by genetic analysis. In the TCP family, however, identification is impeded by relatively low overall sequence similarity. In a search for functionally redundant TCP pairs that control Arabidopsis leaf development, this work performed an integrative bioinformatics analysis, combining protein sequence similarities, gene expression data, and results of pair-wise protein–protein interaction studies for the 24 members of the Arabidopsis TCP transcription factor family. For this, the work completed any lacking gene expression and protein–protein interaction data experimentally and then performed a comprehensive prediction of potential functional redundant TCP pairs. Subsequently, redundant functions could be confirmed for selected predicted TCP pairs by genetic and molecular analyses. It is demonstrated that the previously uncharacterized class I TCP19 gene plays a role in the control of leaf senescence in a redundant fashion with TCP20. Altogether, this work shows the power of combining classical genetic and molecular approaches with bioinformatics predictions to unravel functional redundancies in the TCP transcription factor family. PMID:24129704
Niskanen, Einari A; Hytönen, Vesa P; Grapputo, Alessandro; Nordlund, Henri R; Kulomaa, Markku S; Laitinen, Olli H
2005-01-01
Background A chicken egg contains several biotin-binding proteins (BBPs), whose complete DNA and amino acid sequences are not known. In order to identify and characterise these genes and proteins we studied chicken cDNAs and genes available in the NCBI database and chicken genome database using the reported N-terminal amino acid sequences of chicken egg-yolk BBPs as search strings. Results Two separate hits showing significant homology for these N-terminal sequences were discovered. For one of these hits, the chromosomal location in the immediate proximity of the avidin gene family was found. Both of these hits encode proteins having high sequence similarity with avidin suggesting that chicken BBPs are paralogous to avidin family. In particular, almost all residues corresponding to biotin binding in avidin are conserved in these putative BBP proteins. One of the found DNA sequences, however, seems to encode a carboxy-terminal extension not present in avidin. Conclusion We describe here the predicted properties of the putative BBP genes and proteins. Our present observations link BBP genes together with avidin gene family and shed more light on the genetic arrangement and variability of this family. In addition, comparative modelling revealed the potential structural elements important for the functional and structural properties of the putative BBP proteins. PMID:15777476
Thomas, Paul D; Kejariwal, Anish; Campbell, Michael J; Mi, Huaiyu; Diemer, Karen; Guo, Nan; Ladunga, Istvan; Ulitsky-Lazareva, Betty; Muruganujan, Anushya; Rabkin, Steven; Vandergriff, Jody A; Doremieux, Olivier
2003-01-01
The PANTHER database was designed for high-throughput analysis of protein sequences. One of the key features is a simplified ontology of protein function, which allows browsing of the database by biological functions. Biologist curators have associated the ontology terms with groups of protein sequences rather than individual sequences. Statistical models (Hidden Markov Models, or HMMs) are built from each of these groups. The advantage of this approach is that new sequences can be automatically classified as they become available. To ensure accurate functional classification, HMMs are constructed not only for families, but also for functionally distinct subfamilies. Multiple sequence alignments and phylogenetic trees, including curator-assigned information, are available for each family. The current version of the PANTHER database includes training sequences from all organisms in the GenBank non-redundant protein database, and the HMMs have been used to classify gene products across the entire genomes of human and Drosophila melanogaster. The ontology terms and protein families and subfamilies, as well as Drosophila gene classifications, can be browsed and searched for free. Due to outstanding contractual obligations, access to human gene classifications and to protein family trees and multiple sequence alignments will temporarily require a nominal registration fee. PANTHER is publicly available on the web at http://panther.celera.com.
Brand, Thomas; Schindler, Roland
2017-12-01
The cyclic 3',5'-adenosine monophosphate (cAMP) signalling pathway constitutes an ancient signal transduction pathway present in prokaryotes and eukaryotes. Previously, it was thought that in eukaryotes three effector proteins mediate cAMP signalling, namely protein kinase A (PKA), exchange factor directly activated by cAMP (EPAC) and the cyclic-nucleotide gated channels. However, recently a novel family of cAMP effector proteins emerged and was termed the Popeye domain containing (POPDC) family, which consists of three members POPDC1, POPDC2 and POPDC3. POPDC proteins are transmembrane proteins, which are abundantly present in striated and smooth muscle cells. POPDC proteins bind cAMP with high affinity comparable to PKA. Presently, their biochemical activity is poorly understood. However, mutational analysis in animal models as well as the disease phenotype observed in patients carrying missense mutations suggests that POPDC proteins are acting by modulating membrane trafficking of interacting proteins. In this review, we will describe the current knowledge about this gene family and also outline the apparent gaps in our understanding of their role in cAMP signalling and beyond. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Genome-Wide Identification and Analysis of the TIFY Gene Family in Grape
Zhang, Yucheng; Gao, Min; Singer, Stacy D.; Fei, Zhangjun; Wang, Hua; Wang, Xiping
2012-01-01
Background The TIFY gene family constitutes a plant-specific group of genes with a broad range of functions. This family encodes four subfamilies of proteins, including ZML, TIFY, PPD and JASMONATE ZIM-Domain (JAZ) proteins. JAZ proteins are targets of the SCFCOI1 complex, and function as negative regulators in the JA signaling pathway. Recently, it has been reported in both Arabidopsis and rice that TIFY genes, and especially JAZ genes, may be involved in plant defense against insect feeding, wounding, pathogens and abiotic stresses. Nonetheless, knowledge concerning the specific expression patterns and evolutionary history of plant TIFY family members is limited, especially in a woody species such as grape. Methodology/Principal Findings A total of two TIFY, four ZML, two PPD and 11 JAZ genes were identified in the Vitis vinifera genome. Phylogenetic analysis of TIFY protein sequences from grape, Arabidopsis and rice indicated that the grape TIFY proteins are more closely related to those of Arabidopsis than those of rice. Both segmental and tandem duplication events have been major contributors to the expansion of the grape TIFY family. In addition, synteny analysis between grape and Arabidopsis demonstrated that homologues of several grape TIFY genes were found in the corresponding syntenic blocks of Arabidopsis, suggesting that these genes arose before the divergence of lineages that led to grape and Arabidopsis. Analyses of microarray and quantitative real-time RT-PCR expression data revealed that grape TIFY genes are not a major player in the defense against biotrophic pathogens or viruses. However, many of these genes were responsive to JA and ABA, but not SA or ET. Conclusion The genome-wide identification, evolutionary and expression analyses of grape TIFY genes should facilitate further research of this gene family and provide new insights regarding their evolutionary history and regulatory control. PMID:22984514
Singh, Anil Kumar; Sharma, Vishal; Pal, Awadhesh Kumar; Acharya, Vishal; Ahuja, Paramvir Singh
2013-08-01
NAC [no apical meristem (NAM), Arabidopsis thaliana transcription activation factor (ATAF1/2) and cup-shaped cotyledon (CUC2)] proteins belong to one of the largest plant-specific transcription factor (TF) families and play important roles in plant development processes, response to biotic and abiotic cues and hormone signalling. Our genome-wide analysis identified 110 StNAC genes in potato encoding 136 proteins, including 14 membrane-bound TFs. The physical map positions of StNAC genes on 12 potato chromosomes were non-random, and 40 genes were found to be distributed in 16 clusters. The StNAC proteins were phylogenetically clustered into 12 subgroups. Phylogenetic analysis of StNACs along with their Arabidopsis and rice counterparts divided these proteins into 18 subgroups. Our comparative analysis has also identified 36 putative TNAC proteins, which appear to be restricted to the Solanaceae family. In silico expression analysis, using Illumina RNA-seq transcriptome data, revealed tissue-specific, biotic, abiotic stress and hormone-responsive expression profiles of StNAC genes. Several StNAC genes, including StNAC072 and StNAC101 that are orthologs of the known stress-responsive Arabidopsis RESPONSIVE TO DEHYDRATION 26 (RD26), were identified as highly abiotic stress responsive. Quantitative real-time polymerase chain reaction analysis largely corroborated the expression profile of StNAC genes as revealed by the RNA-seq data. Taken together, this analysis points to putative functions of several StNAC TFs, which will provide a blueprint for their functional characterization and utilization in potato improvement.
Finkina, Ekaterina I; Melnikova, Daria N; Bogdanov, Ivan V; Ovchinnikova, Tatiana V
2017-01-01
Pathogenesis-related (PR) proteins are components of the innate immunity system in plants. They play an important role in plant defense against pathogens. Lipid transfer proteins (LTPs) and Bet v 1 homologs comprise two separate families of PR-proteins. Both LTPs (PR-14) and Bet v 1 homologs (PR-10) are multifunctional small proteins involved in plant response to abiotic and biotic stress conditions. The representatives of these PR-protein families do not show any sequence similarity but have other common biochemical features such as low molecular masses, the presence of hydrophobic cavities, ligand binding properties, and antimicrobial activities. Besides, many members of the PR-10 and PR-14 families are ubiquitous plant panallergens which are able to cause sensitization of the human immune system and crossreactive allergic reactions to plant food and pollen. This review is aimed at comparative analysis of structure-functional and allergenic properties of the PR-10 and PR-14 families, as well as prospects for their medicinal application. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Wang, Wei; Xia, Minxuan; Chen, Jie; Deng, Fenni; Yuan, Rui; Zhang, Xiaopei; Shen, Fafu
2016-12-01
The data presented in this paper support the research article "Genome-Wide Analysis of Superoxide Dismutase Gene Family in Gossypium raimondii and G. arboreum" [1]. In this data article, we present a phylogenetic tree showing dichotomy with two different clusters of SODs inferred by the Bayesian method of MrBayes (version 3.2.4), "Bayesian phylogenetic inference under mixed models" [2], Ramachandran plots of G. raimondii and G. arboreum SODs, the protein sequences used to generate the 3D structure of proteins and the template accession via the SWISS-MODEL server, "SWISS-MODEL: modelling protein tertiary and quaternary structure using evolutionary information." [3] and motif sequences of SODs identified by InterProScan (version 4.8) with the Pfam database, "Pfam: the protein families database" [4].
Candat, Adrien; Paszkiewicz, Gaël; Neveu, Martine; Gautier, Romain; Logan, David C.; Avelange-Macherel, Marie-Hélène; Macherel, David
2014-01-01
Late embryogenesis abundant (LEA) proteins are hydrophilic, mostly intrinsically disordered proteins, which play major roles in desiccation tolerance. In Arabidopsis thaliana, 51 genes encoding LEA proteins clustered into nine families have been inventoried. To increase our understanding of the yet enigmatic functions of these gene families, we report the subcellular location of each protein. Experimental data highlight the limits of in silico predictions for analysis of subcellular localization. Thirty-six LEA proteins localized to the cytosol, with most being able to diffuse into the nucleus. Three proteins were exclusively localized in plastids or mitochondria, while two others were found dually targeted to these organelles. Targeting cleavage sites could be determined for five of these proteins. Three proteins were found to be endoplasmic reticulum (ER) residents, two were vacuolar, and two were secreted. A single protein was identified in pexophagosomes. While most LEA protein families have a unique subcellular localization, members of the LEA_4 family are widely distributed (cytosol, mitochondria, plastid, ER, and pexophagosome) but share the presence of the class A α-helix motif. They are thus expected to establish interactions with various cellular membranes under stress conditions. The broad subcellular distribution of LEA proteins highlights the requirement for each cellular compartment to be provided with protective mechanisms to cope with desiccation or cold stress. PMID:25005920
Rappoport, Nadav; Linial, Michal
2015-08-07
Insects belong to a class that accounts for the majority of animals on earth. With over one million identified species, insects display a huge diversity and occupy extreme environments. At present, there are dozens of fully sequenced insect genomes that cover a range of habitats, social behavior and morphologies. In view of such diverse collection of genomes, revealing evolutionary trends and charting functional relationships of proteins remain challenging. We analyzed the relatedness of 17 complete proteomes representative of proteomes from insects including louse, bee, beetle, ants, flies and mosquitoes, as well as an out-group from the crustaceans. The analyzed proteomes mostly represented the orders of Hymenoptera and Diptera. The 287,405 protein sequences from the 18 proteomes were automatically clustered into 20,933 families, including 799 singletons. A comprehensive analysis based on statistical considerations identified the families that were significantly expanded or reduced in any of the studied organisms. Among all the tested species, ants are characterized by an exceptionally high rate of family gain and loss. By assigning annotations to hundreds of species-specific families, the functional diversity among species and between the major clades (Diptera and Hymenoptera) is revealed. We found that many species-specific families are associated with receptor signaling, stress-related functions and proteases. The highest variability among insects associates with the function of transposition and nucleic acids processes (collectively coined TNAP). Specifically, the wasp and ants have an order of magnitude more TNAP families and proteins relative to species that belong to Diptera (mosquitoes and flies). An unsupervised clustering methodology combined with a comparative functional analysis unveiled proteomic signatures in the major clades of winged insects. 
We propose that the expansion of TNAP families in Hymenoptera potentially contributes to the accelerated genome dynamics that characterize the wasp and ants.
Guo, Yong; Qiu, Li-Juan
2013-01-01
The Dof domain protein family is a classic plant-specific zinc-finger transcription factor family involved in a variety of biological processes. There is great diversity in the number of Dof genes in different plants. However, there are only very limited reports on the characterization of Dof transcription factors in soybean (Glycine max). In the present study, 78 putative Dof genes were identified from the whole-genome sequence of soybean. The predicted GmDof genes were non-randomly distributed within and across 19 out of 20 chromosomes and 97.4% (38 pairs) were preferentially retained duplicate paralogous genes located in duplicated regions of the genome. Soybean-specific segmental duplications contributed significantly to the expansion of the soybean Dof gene family. These Dof proteins were phylogenetically clustered into nine distinct subgroups among which the gene structure and motif compositions were considerably conserved. Comparative phylogenetic analysis of these Dof proteins revealed four major groups, similar to those reported for Arabidopsis and rice. Most of the GmDofs showed specific expression patterns based on RNA-seq data analyses. The expression patterns of some duplicate genes were partially redundant while others showed functional diversity, suggesting the occurrence of sub-functionalization during subsequent evolution. Comprehensive expression profile analysis also provided insights into the soybean-specific functional divergence among members of the Dof gene family. Cis-regulatory element analysis of these GmDof genes suggested diverse functions associated with different processes. Taken together, our results provide useful information for the functional characterization of soybean Dof genes by combining phylogenetic analysis with global gene-expression profiling.
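The figure "97.4% (38 pairs)" above can be checked with a short worked calculation. This is a sketch under one stated assumption that the abstract does not spell out: each duplicated gene belongs to exactly one pair, so 38 pairs correspond to 76 of the 78 GmDof genes. The function name `percent_in_pairs` is hypothetical.

```python
def percent_in_pairs(n_pairs: int, n_genes: int) -> float:
    """Percentage of genes that sit in duplicate pairs.

    Assumes every paired gene is counted in exactly one pair,
    so the number of paired genes is 2 * n_pairs.
    """
    return 100.0 * (2 * n_pairs) / n_genes


# 38 pairs among 78 putative GmDof genes, as reported in the abstract.
print(round(percent_in_pairs(38, 78), 1))
```

Under that assumption, 76 of 78 genes gives a value that rounds to the reported 97.4%, leaving two genes without a retained paralog.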
Aguilar-Hernández, Victor; Aguilar-Henonin, Laura; Guzmán, Plinio
2011-01-01
Ubiquitin-ligases or E3s are components of the ubiquitin proteasome system (UPS) that coordinate the transfer of ubiquitin to the target protein. A major class of ubiquitin-ligases consists of RING-finger domain proteins that include the substrate recognition sequences in the same polypeptide; these are known as single-subunit RING finger E3s. We are studying a particular family of RING finger E3s, named ATL, that contain a transmembrane domain and the RING-H2 finger domain; none of the members of the family contains any other previously described domain. Although the study of a few members in A. thaliana and O. sativa has been reported, the role of this family in the life cycle of a plant is still vague. To provide tools to advance the functional analysis of this family we have undertaken a phylogenetic analysis of ATLs in twenty-four plant genomes. ATLs were found in all the 24 plant species analyzed, in numbers ranging from 20-28 in two basal species to 162 in soybean. Analysis of ATLs arrayed in tandem indicates that sets of genes are expanding in a species-specific manner. To gain insight into the domain architecture of ATLs we generated 75 pHMM LOGOs from 1815 ATLs, and unraveled potential protein-protein interaction regions by means of yeast two-hybrid assays. Several ATLs were found to interact with DSK2a/ubiquilin through a region at the amino-terminal end, suggesting that this is a widespread interaction that may assist in the mode of action of ATLs; the region was traced to a distinct sequence LOGO. Our analysis provides significant observations on the evolution and expansion of the ATL family in addition to information on the domain structure of this class of ubiquitin-ligases that may be involved in plant adaptation to environmental stress.
Ferreira Filho, Jaire Alves; Horta, Maria Augusta Crivelente; Beloti, Lilian Luzia; Dos Santos, Clelton Aparecido; de Souza, Anete Pereira
2017-10-12
Trichoderma harzianum is used in biotechnology applications due to its ability to produce powerful enzymes for the conversion of lignocellulosic substrates into soluble sugars. Active enzymes involved in carbohydrate metabolism are defined as carbohydrate-active enzymes (CAZymes), and the most abundant family in the CAZy database is the glycoside hydrolases. The enzymes of this family play a fundamental role in the decomposition of plant biomass. In this study, the CAZymes of T. harzianum were identified and classified using bioinformatic approaches after which the expression profiles of all annotated CAZymes were assessed via RNA-Seq, and a phylogenetic analysis was performed. A total of 430 CAZymes (3.7% of the total proteins for this organism) were annotated in T. harzianum, including 259 glycoside hydrolases (GHs), 101 glycosyl transferases (GTs), 6 polysaccharide lyases (PLs), 22 carbohydrate esterases (CEs), 42 auxiliary activities (AAs) and 46 carbohydrate-binding modules (CBMs). Among the identified T. harzianum CAZymes, 47% were predicted to harbor a signal peptide sequence and were therefore classified as secreted proteins. The GH families were the CAZyme class with the greatest number of expressed genes, including GH18 (23 genes), GH3 (17 genes), GH16 (16 genes), GH2 (13 genes) and GH5 (12 genes). A phylogenetic analysis of the proteins in the AA9/GH61, CE5 and GH55 families showed high functional variation among the proteins. Identifying the main proteins used by T. harzianum for biomass degradation can ensure new advances in the biofuel production field. Herein, we annotated and characterized the expression levels of all of the CAZymes from T. harzianum, which may contribute to future studies focusing on the functional and structural characterization of the identified proteins.
Benoit, Joshua B; Attardo, Geoffrey M; Michalkova, Veronika; Krause, Tyler B; Bohova, Jana; Zhang, Qirui; Baumann, Aaron A; Mireji, Paul O; Takáč, Peter; Denlinger, David L; Ribeiro, Jose M; Aksoy, Serap
2014-04-01
In tsetse flies, nutrients for intrauterine larval development are synthesized by the modified accessory gland (milk gland) and provided in mother's milk during lactation. Interference with at least two milk proteins has been shown to extend larval development and reduce fecundity. The goal of this study was to perform a comprehensive characterization of tsetse milk proteins using lactation-specific transcriptome/milk proteome analyses and to define functional role(s) for the milk proteins during lactation. Differential analysis of RNA-seq data from lactating and dry (non-lactating) females revealed enrichment of transcripts coding for protein synthesis machinery, lipid metabolism and secretory proteins during lactation. Among the genes induced during lactation were those encoding the previously identified milk proteins (milk gland proteins 1-3, transferrin and acid sphingomyelinase 1) and seven new genes (mgp4-10). The genes encoding mgp2-10 are organized on a 40 kb syntenic block in the tsetse genome, have similar exon-intron arrangements, and share regions of amino acid sequence similarity. Expression of mgp2-10 is female-specific and high during milk secretion. While knockdown of a single mgp failed to reduce fecundity, simultaneous knockdown of multiple variants reduced milk protein levels and lowered fecundity. The genomic localization, gene structure similarities, and functional redundancy of MGP2-10 suggest that they constitute a novel highly divergent protein family. Our data indicates that MGP2-10 function both as the primary amino acid resource for the developing larva and in the maintenance of milk homeostasis, similar to the function of the mammalian casein family of milk proteins. This study underscores the dynamic nature of the lactation cycle and identifies a novel family of lactation-specific proteins, unique to Glossina sp., that are essential to larval development. 
The specificity of MGP2-10 to tsetse and their critical role during lactation suggests that these proteins may be an excellent target for tsetse-specific population control approaches.
Schokraie, Elham; Warnken, Uwe; Hotz-Wagenblatt, Agnes; Grohme, Markus A.; Hengherr, Steffen; Förster, Frank; Schill, Ralph O.; Frohme, Marcus; Dandekar, Thomas; Schnölzer, Martina
2012-01-01
Tardigrades have fascinated researchers for more than 300 years because of their extraordinary capability to undergo cryptobiosis and survive extreme environmental conditions. However, the survival mechanisms of tardigrades are still poorly understood mainly due to the absence of detailed knowledge about the proteome and genome of these organisms. Our study was intended to provide a basis for the functional characterization of expressed proteins in different states of tardigrades. High-throughput, high-accuracy proteomics in combination with a newly developed tardigrade specific protein database resulted in the identification of more than 3000 proteins in three different states: early embryonic state and adult animals in active and anhydrobiotic state. This comprehensive proteome resource includes protein families such as chaperones, antioxidants, ribosomal proteins, cytoskeletal proteins, transporters, protein channels, nutrient reservoirs, and developmental proteins. A comparative analysis of protein families in the different states was performed by calculating the exponentially modified protein abundance index which classifies proteins in major and minor components. This is the first step to analyzing the proteins involved in early embryonic development, and furthermore proteins which might play an important role in the transition into the anhydrobiotic state. PMID:23029181
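The exponentially modified protein abundance index (emPAI) mentioned above is commonly defined as 10^PAI - 1, where PAI is the number of observed peptides divided by the number of observable peptides for a protein, and a protein's share of the sample is its emPAI divided by the sum over all proteins. The sketch below assumes that standard definition; the abstract does not give the cutoff used to separate major from minor components, so no cutoff is applied here, and the function names and peptide counts are hypothetical.

```python
def empai(n_observed: int, n_observable: int) -> float:
    """emPAI = 10 ** (observed / observable) - 1 for a single protein."""
    if n_observable <= 0:
        raise ValueError("n_observable must be positive")
    return 10 ** (n_observed / n_observable) - 1


def molar_percent(empai_values: dict) -> dict:
    """Convert emPAI values to molar percentages of the total."""
    total = sum(empai_values.values())
    if total == 0:
        return {name: 0.0 for name in empai_values}
    return {name: 100.0 * value / total for name, value in empai_values.items()}


# Invented peptide counts for placeholder proteins (not data from the study).
counts = {"protein_1": (10, 10), "protein_2": (5, 10), "protein_3": (0, 10)}
scores = {name: empai(obs, total) for name, (obs, total) in counts.items()}
for name, share in molar_percent(scores).items():
    print(name, round(share, 1))
```

Because the index is exponential in the observed fraction, a protein with all observable peptides detected contributes far more than twice the share of one with half of them detected, which is what makes a split into major and minor components practical.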
Milanesi, Luciano; Petrillo, Mauro; Sepe, Leandra; Boccia, Angelo; D'Agostino, Nunzio; Passamano, Myriam; Di Nardo, Salvatore; Tasco, Gianluca; Casadio, Rita; Paolella, Giovanni
2005-01-01
Background Protein kinases are a well defined family of proteins, characterized by the presence of a common kinase catalytic domain and playing a significant role in many important cellular processes, such as proliferation, maintenance of cell shape, and apoptosis. In many members of the family, additional non-kinase domains contribute further specialization, resulting in subcellular localization, protein binding and regulation of activity, among others. About 500 genes encode members of the kinase family in the human genome, and although many of them represent well known genes, a larger number of genes code for proteins of more recent identification, or for unknown proteins identified as kinases only after computational studies. Results A systematic in silico study performed on the human genome led to the identification of 5 genes, on chromosomes 1, 11, 13, 15 and 16 respectively, and 1 pseudogene on chromosome X; some of these genes are reported as kinases from NCBI but are absent in other databases, such as KinBase. Comparative analysis of 483 gene regions and subsequent computational analysis, aimed at identifying unannotated exons, indicates that a large number of kinases may code for alternately spliced forms or be incorrectly annotated. An InterProScan automated analysis was performed to study domain distribution and combination in the various families. At the same time, other structural features were also added to the annotation process, including the putative presence of transmembrane alpha helices, and the cysteine propensity to participate in a disulfide bridge. Conclusion The predicted human kinome was extended by identifying both additional genes and potential splice variants, resulting in a varied panorama where functionality may be searched at the gene and protein level. 
Structural analysis of kinase protein domains as defined in multiple sources together with transmembrane alpha helices and signal peptide prediction provides hints to function assignment. The results of the human kinome analysis are collected in the KinWeb database, available for browsing and searching over the internet, where all results from the comparative analysis and the gene structure annotation are made available, alongside the domain information. Kinases may be searched by domain combinations and the relative genes may be viewed in a graphic browser at various levels of magnification up to gene organization on the full chromosome set. PMID:16351747
Kim, Jieun; Lee, Haeryung; Kim, Yujin; Yoo, Sooyeon; Park, Eunjeong; Park, Soochul
2010-04-01
We recently reported that the phosphotyrosine-binding (PTB) domain of Anks family proteins binds to EphA8, thereby positively regulating EphA8-mediated signaling pathways. In the current study, we identified a potential role for the SAM domains of Anks family proteins in EphA signaling. We found that SAM domains of Anks family proteins directly bind to ubiquitin, suggesting that Anks proteins regulate the degradation of ubiquitinated EphA receptors. Consistent with the role of Cbl ubiquitin ligases in the degradation of Eph receptors, our results revealed that the ubiquitin ligase c-Cbl induced the ubiquitination and degradation of EphA8 upon ligand binding. Ubiquitinated EphA8 also bound to the SAM domains of Odin, a member of the Anks family proteins. More importantly, the overexpression of wild-type Odin protected EphA8 and EphA2 from undergoing degradation following ligand stimulation and promoted EphA-mediated inhibition of cell migration. In contrast, a SAM domain deletion mutant of Odin strongly impaired the function of endogenous Odin, suggesting that the mutant functions in a dominant-negative manner. An analysis of Odin-deficient primary embryonic fibroblasts indicated that Odin levels play a critical role in regulating the stability of EphA2 in response to ligand stimulation. Taken together, our studies suggest that the SAM domains of Anks family proteins play a pivotal role in enhancing the stability of EphA receptors by modulating the ubiquitination process.
Redundancy and divergence in the amyloid precursor protein family.
Shariati, S Ali M; De Strooper, Bart
2013-06-27
Gene duplication provides genetic material required for functional diversification. An interesting example is the amyloid precursor protein (APP) protein family. The APP gene family has experienced both expansion and contraction during evolution. The three mammalian members have been studied quite extensively in combined knock out models. The underlying assumption is that APP, amyloid precursor like protein 1 and 2 (APLP1, APLP2) are functionally redundant. This assumption is primarily supported by the similarities in biochemical processing of APP and APLPs and on the fact that the different APP genes appear to genetically interact at the level of the phenotype in combined knockout mice. However, unique features in each member of the APP family possibly contribute to specification of their function. In the current review, we discuss the evolution and the biology of the APP protein family with special attention to the distinct properties of each homologue. We propose that the functions of APP, APLP1 and APLP2 have diverged after duplication to contribute distinctly to different neuronal events. Our analysis reveals that APLP2 is significantly diverged from APP and APLP1. Copyright © 2013 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
Li, Si-Bei; OuYang, Wei-Zhi; Hou, Xiao-Jin; Xie, Liang-Liang; Hu, Chun-Gen; Zhang, Jin-Zhi
2015-01-01
Auxin response factors (ARFs) are an important family of proteins in auxin-mediated response, with key roles in various physiological and biochemical processes. To date, a genome-wide overview of the ARF gene family in citrus has not been available. A systematic analysis of this gene family in citrus was begun by carrying out a genome-wide search for the homologs of ARFs. A total of 19 nonredundant ARF genes (CiARF) were found and validated from the sweet orange. A comprehensive overview of the CiARFs was undertaken, including the gene structures, phylogenetic analysis, chromosome locations, conserved motifs of proteins, and cis-elements in promoters of CiARF. Furthermore, expression profiling using real-time PCR revealed expression of many CiARF genes, albeit with different patterns depending on types of tissues and/or developmental stages. Comprehensive expression analysis of these genes was also performed under two hormone treatments using real-time PCR. Indole-3-acetic acid (IAA) and N-1-naphthylphthalamic acid (NPA) treatment experiments revealed differential up-regulation and down-regulation, respectively, of the 19 citrus ARF genes in the callus of sweet orange. Our comprehensive analysis of ARF genes further elucidates the roles of CiARF family members during citrus growth and development. PMID:25870601
Identification of a novel Gig2 gene family specific to non-amniote vertebrates.
Zhang, Yi-Bing; Liu, Ting-Kai; Jiang, Jun; Shi, Jun; Liu, Ying; Li, Shun; Gui, Jian-Fang
2013-01-01
Gig2 (grass carp reovirus (GCRV)-induced gene 2) was first identified as a novel fish interferon (IFN)-stimulated gene (ISG). Overexpression of a zebrafish Gig2 gene can protect cultured fish cells from virus infection. In the present study, we identify a novel gene family that is comprised of genes homologous to the previously characterized Gig2. EST/GSS search and in silico cloning identify 190 Gig2 homologous genes in 51 vertebrate species ranging from lampreys to amphibians. Further large-scale search of vertebrate and invertebrate genome databases indicates that the Gig2 gene family is specific to non-amniotes including lampreys, sharks/rays, ray-finned fishes and amphibians. Phylogenetic analysis and synteny analysis reveal lineage-specific expansion of the Gig2 gene family and also provide valuable evidence for the fish-specific genome duplication (FSGD) hypothesis. Although Gig2 family proteins exhibit no significant sequence similarity to any known proteins, a typical Gig2 protein appears to consist of two conserved parts: an N-terminus that bears very low homology to the catalytic domains of poly(ADP-ribose) polymerases (PARPs), and a novel C-terminal domain that is unique to this gene family. Expression profiling of zebrafish Gig2 family genes shows that some duplicate pairs have diverged in function via acquisition of novel spatial and/or temporal expression under stresses. The specificity of this gene family to non-amniotes might contribute to a large extent to distinct physiology in non-amniote vertebrates.
Jin, Lily L.; Wybenga-Groot, Leanne E.; Tong, Jiefei; Taylor, Paul; Minden, Mark D.; Trudel, Suzanne; McGlade, C. Jane; Moran, Michael F.
2015-01-01
Src homology 2 (SH2) domains are modular protein structures that bind phosphotyrosine (pY)-containing polypeptides and regulate cellular functions through protein-protein interactions. Proteomics analysis showed that the SH2 domains of Src family kinases are themselves tyrosine phosphorylated in blood system cancers, including acute myeloid leukemia, chronic lymphocytic leukemia, and multiple myeloma. Using the Src family kinase Lyn SH2 domain as a model, we found that phosphorylation at the conserved SH2 domain residue Y194 impacts the affinity and specificity of SH2 domain binding to pY-containing peptides and proteins. Analysis of the Lyn SH2 domain crystal structure supports a model wherein phosphorylation of Y194 on the EF loop modulates the binding pocket that engages amino acid side chains at the pY+2/+3 position. These data indicate another level of regulation wherein SH2-mediated protein-protein interactions are modulated by SH2 kinases and phosphatases. PMID:25587033
Kumar, Arun; Babu, Mohan; Kimberling, William J; Venkatesh, Conjeevaram P
2004-11-24
Usher syndrome (USH) is a rare autosomal recessive disorder characterized by deafness and retinitis pigmentosa. The purpose of this study was to determine the genetic cause of USH in a four generation Indian family. Peripheral blood samples were collected from individuals for genomic DNA isolation. To determine the linkage of this family to known USH loci, microsatellite markers were selected from the candidate regions of known loci and used to genotype the family. Exon specific intronic primers for the MYO7A gene were used to amplify DNA samples from one affected individual from the family. PCR products were subsequently sequenced to detect mutations. PCR-SSCP analysis was used to determine if the mutation segregated with the disease in the family and was not present in 50 control individuals. All affected individuals had a classic USH type I (USH1) phenotype which included deafness, vestibular dysfunction and retinitis pigmentosa. Pedigree analysis suggested an autosomal recessive mode of inheritance of USH in the family. Haplotype analysis suggested linkage of this family to the USH1B locus on chromosome 11q. DNA sequence analysis of the entire coding region of the MYO7A gene showed a novel insertion mutation c.2663_2664insA in a homozygous state in all affected individuals, resulting in truncation of MYO7A protein. This is the first study from India which reports a novel MYO7A insertion mutation in a four generation USH family. The mutation is predicted to produce a truncated MYO7A protein. With the novel mutation reported here, the total number of USH causing mutations in the MYO7A gene described to date reaches 75.
Intrinsic disorder in spondins and some of their interacting partners
Alowolodu, Oluwole; Johnson, Gbemisola; Addou, Iqbal; Zhdanova, Irina V.; Uversky, Vladimir N.
2016-01-01
Spondins, which are proteins that inhibit and promote adherence of embryonic cells so as to aid axonal growth, are part of the thrombospondin-1 family. Spondins function in several important biological processes, such as apoptosis, angiogenesis, etc. Spondins constitute a thrombospondin subfamily that includes F-spondin, a protein that interacts with Aβ precursor protein and inhibits its proteolytic processing; R-spondin, a 4-membered group of proteins that regulate the Wnt pathway and have other functions, such as regulation of kidney proliferation, induction of epithelial proliferation, the tumor suppressant action; M-spondin that mediates mechanical linkage between the muscles and apodemes; and the SCO-spondin, a protein important for neuronal development. In this study, we investigated intrinsic disorder status of human spondins and their interacting partners, such as members of the LRP family, LGR family, Frizzled family, and several other binding partners in order to establish the existence and importance of disordered regions in spondins and their interacting partners by conducting a detailed analysis of their sequences, finding disordered regions, and establishing a correlation between their structure and biological functions. PMID:28232900
Jaimes-Becerra, Adrian; Chung, Ray; Morandini, André C; Weston, Andrew J; Padilla, Gabriel; Gacesa, Ranko; Ward, Malcolm; Long, Paul F; Marques, Antonio C
2017-10-01
Cnidarians are probably the oldest group of animals to be venomous, yet our current picture of cnidarian venom evolution is highly imbalanced due to limited taxon sampling. High-throughput tandem mass spectrometry was used to determine venom composition of the scyphozoan Chrysaora lactea and two cubozoans Tamoya haplonema and Chiropsalmus quadrumanus. Protein recruitment patterns were then compared against 5 other cnidarian venom proteomes taken from the literature. A total of 28 putative toxin protein families were identified, many for the first time in Cnidaria. Character mapping analysis revealed that 17 toxin protein families with predominantly cytolytic biological activities were likely recruited into the cnidarian venom proteome before the lineage split between Anthozoa and Medusozoa. Thereafter, venoms of Medusozoa and Anthozoa differed during subsequent divergence of cnidarian classes. Recruitment and loss of toxin protein families did not correlate with accepted phylogenetic patterns of Cnidaria. Selective pressures that drive toxin diversification independent of taxonomic positioning have yet to be identified in Cnidaria and now warrant experimental consideration. Copyright © 2017 Elsevier Ltd. All rights reserved.
Mackie, P.M.; Gharbi, K.; Ballantyne, J.S.; McCormick, S.D.; Wright, P.A.
2007-01-01
Smoltification involves morphological and physiological changes in the gills that prepare anadromous salmonids to osmoregulate efficiently in seawater. In a previous study, we found that different families of Atlantic salmon (Salmo salar) smolts vary in their ability to osmoregulate when abruptly transferred to cold seawater and that these differences are correlated with gill Na+/K+ ATPase activity. Here we extend these findings to test whether other key transport proteins, namely Na+/K+/2Cl- cotransporter (NKCC) and the Cl- channel or cystic fibrosis transmembrane conductance regulator (CFTR), play a significant role in osmoregulatory differences between families. To facilitate molecular analysis of NKCC, we first isolated a gill cDNA containing the complete coding region (1147 aa) of an isoform previously reported as a partial sequence. Phylogenetic analysis showed that this isoform is most closely related to isoforms of the NKCC1a subfamily found in European eel and Mozambique tilapia. In a second step, we quantified NKCC protein abundance as well as mRNA expression levels for NKCC1a and two CFTR isoforms (CFTRI and CFTRII) in 0+ smolts from three families prior to and following seawater transfer. The family with the lowest salinity tolerance also showed significant increases in gill NKCC1a mRNA after seawater transfer. Taken together with our previous study, these data indicate that family differences in expression of transport proteins are in part related to salinity tolerance, although the best indicator of osmoregulatory performance between families may be gill Na+/K+ ATPase activity and CFTR I mRNA levels, rather than Na+/K+ ATPase and NKCC1a mRNA levels or NKCC protein abundance. © 2007 Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Perry, R.T.; Go, R.C.P.; Harrell, L.E.
1995-02-27
Alzheimer's disease (AD) is a progressive, degenerative neurological disorder of the central nervous system. AD is the fourth leading cause of death in elderly persons 65 years or older in Western industrialized societies. The etiology of AD is unknown, but clinical, pathological, epidemiological, and molecular investigations suggest it is etiologically heterogeneous. Mutations in the amyloid protein are rare and segregate with the disease in a few early-onset familial AD (FAD) families. Similarities between AD and the unconventional viral (UCV) diseases, and between the amyloid and prion proteins, implicate the human prion protein gene (PRNP) as another candidate gene. Single strand conformation polymorphism (SSCP) analysis was used to screen for mutations at this locus in 82 AD patients from 54 families (30 FAD), vs. 39 age-matched controls. A 24-bp deletion around codon 68 that codes for one of five Gly-Pro rich octarepeats was identified in two affected sibs and one offspring of one late-onset FAD family. Two other affected sibs, three unaffected sibs, and three offspring from this family, in addition to one sporadic AD patient and three age-matched controls, were heterozygous for another octarepeat deletion located around codon 82. Two of the four affected sibs had features of PD, including one who was autopsy-verified AD and PD. Although these deletions were found infrequently in other AD patients and controls, they appear to be a rare polymorphism that is segregating in this FAD family. It does not appear that mutations at the PRNP locus are frequently associated with AD in this population. 54 refs., 4 figs.
Iyer, Lakshminarayan M; Tahiliani, Mamta; Rao, Anjana; Aravind, L
2009-06-01
Modified bases in nucleic acids present a layer of information that directs biological function over and beyond the coding capacity of the conventional bases. While a large number of modified bases have been identified, many of the enzymes generating them still remain to be discovered. Recently, members of the 2-oxoglutarate- and iron(II)-dependent dioxygenase super-family, which modify diverse substrates from small molecules to biopolymers, were predicted and subsequently confirmed to catalyze oxidative modification of bases in nucleic acids. Of these, two distinct families, namely the AlkB and the kinetoplastid base J binding proteins (JBP) catalyze in situ hydroxylation of bases in nucleic acids. Using sensitive computational analysis of sequences, structures and contextual information from genomic structure and protein domain architectures, we report five distinct families of 2-oxoglutarate- and iron(II)-dependent dioxygenase that we predict to be involved in nucleic acid modifications. Among the DNA-modifying families, we show that the dioxygenase domains of the kinetoplastid base J-binding proteins belong to a larger family that includes the Tet proteins, prototyped by the human oncogene Tet1, and proteins from basidiomycete fungi, chlorophyte algae, heterolobosean amoeboflagellates and bacteriophages. We present evidence that some of these proteins are likely to be involved in oxidative modification of the 5-methyl group of cytosine leading to the formation of 5-hydroxymethylcytosine. The Tet/JBP homologs from basidiomycete fungi such as Laccaria and Coprinopsis show large lineage-specific expansions and a tight linkage with genes encoding a novel and distinct family of predicted transposases, and a member of the Maelstrom-like HMG family. We propose that these fungal members are part of a mobile transposon. 
To the best of our knowledge, this is the first report of a eukaryotic transposable element that encodes its own DNA-modification enzyme with a potential regulatory role. Through a wider analysis of other poorly characterized DNA-modifying enzymes we also show that the phage Mu Mom-like proteins, which catalyze the N6-carbamoylmethylation of adenines, are also linked to diverse families of bacterial transposases, suggesting that DNA modification by transposable elements might have a more general presence than previously appreciated. Among the other families of 2-oxoglutarate- and iron(II)-dependent dioxygenases identified in this study, one which is found in algae, is predicted to mainly comprise of RNA-modifying enzymes and shows a striking diversity in protein domain architectures suggesting the presence of RNA modifications with possibly unique adaptive roles. The results presented here are likely to provide the means for future investigation of unexpected epigenetic modifications, such as hydroxymethyl cytosine, that could profoundly impact our understanding of gene regulation and processes such as DNA demethylation.
High-throughput analysis of peptide binding modules
Liu, Bernard A.; Engelmann, Brett; Nash, Piers D.
2014-01-01
Modular protein interaction domains that recognize linear peptide motifs are found in hundreds of proteins within the human genome. Some protein interaction domains such as SH2, 14-3-3, Chromo and Bromo domains serve to recognize post-translational modification of amino acids (such as phosphorylation, acetylation, methylation etc.) and translate these into discrete cellular responses. Other modules such as SH3 and PDZ domains recognize linear peptide epitopes and serve to organize protein complexes based on localization and regions of elevated concentration. In both cases, the ability to nucleate specific signaling complexes is in large part dependent on the selectivity of a given protein module for its cognate peptide ligand. High throughput analysis of peptide-binding domains by peptide or protein arrays, phage display, mass spectrometry or other HTP techniques provides new insight into the potential protein-protein interactions prescribed by individual or even whole families of modules. Systems level analyses have also promoted a deeper understanding of the underlying principles that govern selective protein-protein interactions and how selectivity evolves. Lastly, there is a growing appreciation for the limitations and potential pitfalls of high-throughput analysis of protein-peptide interactomes. This review will examine some of the common approaches utilized for large-scale studies of protein interaction domains and suggest a set of standards for the analysis and validation of datasets from large-scale studies of peptide-binding modules. We will also highlight how data from large-scale studies of modular interaction domain families can provide insight into systems level properties such as the linguistics of selective interactions. PMID:22610655
An overview of the structures of protein-DNA complexes
Luscombe, Nicholas M; Austin, Susan E; Berman, Helen M; Thornton, Janet M
2000-01-01
On the basis of a structural analysis of 240 protein-DNA complexes contained in the Protein Data Bank (PDB), we have classified the DNA-binding proteins involved into eight different structural/functional groups, which are further classified into 54 structural families. Here we present this classification and review the functions, structures and binding interactions of these protein-DNA complexes. PMID:11104519
Building toy models of proteins using coevolutionary information
NASA Astrophysics Data System (ADS)
Cheng, Ryan; Raghunathan, Mohit; Onuchic, Jose
2015-03-01
Recent developments in global statistical methodologies have advanced the analysis of large collections of protein sequences for coevolutionary information. Coevolution between amino acids in a protein arises from compensatory mutations that are needed to maintain the stability or function of a protein over the course of evolution. This gives rise to quantifiable correlations between amino acid positions within the multiple sequence alignment of a protein family. Here, we use Direct Coupling Analysis (DCA) to infer a Potts model Hamiltonian governing the correlated mutations in a protein family to obtain the sequence-dependent interaction energies of a toy protein model. We demonstrate that this methodology predicts residue-residue interaction energies that are consistent with experimental mutational changes in protein stabilities as well as other computational methodologies. Furthermore, we demonstrate with several examples that DCA could be used to construct a structure-based model that quantitatively agrees with experimental data on folding mechanisms. This work serves as a potential framework for generating models of proteins that are enriched by evolutionary data that can potentially be used to engineer key functional motions and interactions in protein systems. This research has been supported by the NSF INSPIRE award MCB-1241332 and by the CTBP sponsored by the NSF (Grant PHY-1427654).
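For orientation, the Potts model mentioned above is usually written in the following form in the DCA literature. The abstract does not give its own notation, so the symbols here are assumptions: $a_i$ is the amino acid at alignment position $i$, $L$ the alignment length, $h_i$ the single-site fields, $J_{ij}$ the pairwise couplings, and $Z$ the normalizing partition function.

```latex
P(a_1,\dots,a_L) = \frac{1}{Z}\exp\bigl(-H(a_1,\dots,a_L)\bigr),
\qquad
H(a_1,\dots,a_L) = -\sum_{1 \le i < j \le L} J_{ij}(a_i,a_j) \;-\; \sum_{i=1}^{L} h_i(a_i)
```

Under this convention, strongly coupled position pairs (large $J_{ij}$) are the ones read as candidate residue-residue interactions; how the authors convert the couplings into interaction energies for their toy model is not stated in the abstract.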
Vasilakis, Nikos; Widen, Steven; Mayer, Sandra V.; Seymour, Robert; Wood, Thomas G.; Popov, Vsevolov; Guzman, Hilda; da Rosa, Amelia P.A. Travassos; Ghedin, Elodie; Holmes, Edward C.; Walker, Peter J.; Tesh, Robert B.
2013-01-01
Members of the family Rhabdoviridae have been assigned to eight genera but many remain unassigned. Rhabdoviruses have a remarkably diverse host range that includes terrestrial and marine animals, invertebrates and plants. Transmission of some rhabdoviruses often requires an arthropod vector, such as mosquitoes, midges, sandflies, ticks, aphids and leafhoppers, in which they replicate. Herein we characterize Niakha virus (NIAV), a previously uncharacterized rhabdovirus isolated from phlebotomine sandflies in Senegal. Analysis of the 11,124 nt genome sequence indicates that it encodes the five common rhabdovirus proteins with alternative ORFs in the M, G and L genes. Phylogenetic analysis of the L protein indicates that NIAV’s closest relative is Oak Vale rhabdovirus, although in this analysis NIAV is still so phylogenetically distinct that it might be classified as distinct from the eight currently recognized Rhabdoviridae genera. This observation highlights the vast, and not yet fully recognized, diversity of this family. PMID:23773405
The family B1 GPCR: structural aspects and interaction with accessory proteins.
Couvineau, Alain; Laburthe, Marc
2012-01-01
G protein coupled receptors (GPCRs) play a crucial role in physiology and pathophysiology in humans. Besides the large family A (rhodopsin-like receptors) and family C GPCR (metabotropic glutamate receptors), the small family B1 GPCR (secretin-like receptors) includes important receptors such as vasoactive intestinal peptide receptors (VPAC), pituitary adenylyl cyclase activating peptide receptor (PAC1R), secretin receptor (SECR), growth hormone releasing factor receptor (GRFR), glucagon receptor (GCGR), glucagon like-peptide 1 and 2 receptors (GLPR), gastric inhibitory peptide receptor (GIPR), parathyroid hormone receptors (PTHR), calcitonin receptors (CTR) and corticotropin-releasing factor receptors (CRFR). They represent very promising targets for the development of drugs having therapeutical impact on many diseases such as chronic inflammation, neurodegeneration, diabetes, stress and osteoporosis. Over the past decade, structure-function relationship studies have demonstrated that the N-terminal ectodomain (N-ted) of family B1 receptors plays a pivotal role in natural ligand recognition. Structural analysis of some family B1 GPCR N-teds revealed the existence of a Sushi domain fold consisting of two antiparallel β sheets stabilized by three disulfide bonds and a salt bridge. The family B1 GPCRs promote cellular responses through a signaling pathway including predominantly the Gs-adenylyl cyclase-cAMP pathway activation. Family B1 GPCRs also interact with a few accessory proteins which play a role in cell signaling, receptor expression and/or pharmacological profiles of receptors. These accessory proteins may represent new targets for the design of new drugs. Here, we review the current knowledge regarding: i) the structure of family B1 GPCR binding domain for natural ligands and ii) the interaction of family B1 GPCRs with accessory proteins.
Identification and analysis of mutational hotspots in oncogenes and tumour suppressors.
Baeissa, Hanadi; Benstead-Hume, Graeme; Richardson, Christopher J; Pearl, Frances M G
2017-03-28
The key to interpreting the contribution of a disease-associated mutation in the development and progression of cancer is an understanding of the consequences of that mutation both on the function of the affected protein and on the pathways in which that protein is involved. Protein domains encapsulate function, and position-specific domain-based analysis of mutations has been shown to help elucidate their phenotypes. In this paper we examine the domain biases in oncogenes and tumour suppressors, and find that their domain compositions substantially differ. Using data from over 30 different cancers from whole-exome sequencing cancer genomic projects we mapped over one million mutations to their respective Pfam domains to identify which domains are enriched in any of three different classes of mutation: missense, indels or truncations. Next, we identified the mutational hotspots within domain families by mapping small mutations to equivalent positions in multiple sequence alignments of protein domains. We find that gain of function mutations from oncogenes and loss of function mutations from tumour suppressors are normally found in different domain families and when observed in the same domain families, hotspot mutations are located at different positions within the multiple sequence alignment of the domain. By considering hotspots in tumour suppressors and oncogenes independently, we find that there are different specific positions within domain families that are particularly suited to accommodate either a loss or a gain of function mutation. The position is also dependent on the class of mutation. We find rare mutations co-located with well-known functional mutation hotspots, in members of homologous domain superfamilies, and we detect novel mutation hotspots in domain families previously unconnected with cancer. The results of this analysis can be accessed through the MOKCa database (http://strubiol.icr.ac.uk/extra/MOKCa).
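The idea of mapping mutations to equivalent positions in a multiple sequence alignment can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy alignment, and the mutation list are all hypothetical, and the sketch only counts mutations per alignment column without any statistical test for enrichment.

```python
from collections import Counter

def residue_to_column(aligned_seq):
    """Map 1-based residue positions of one aligned sequence to 0-based alignment columns."""
    mapping = {}
    residue_pos = 0
    for column, char in enumerate(aligned_seq):
        if char not in "-.":          # skip gap characters
            residue_pos += 1
            mapping[residue_pos] = column
    return mapping

def column_counts(alignment, mutations):
    """Count mutations per alignment column.

    alignment: dict of sequence name -> gapped aligned sequence (hypothetical toy data)
    mutations: iterable of (sequence name, 1-based residue position)
    """
    counts = Counter()
    for name, position in mutations:
        column = residue_to_column(alignment[name]).get(position)
        if column is not None:        # ignore positions outside the aligned domain
            counts[column] += 1
    return counts

# Toy example: two domain sequences whose third residues fall in the same column.
alignment = {"protA": "MK-LV", "protB": "M-RLV"}
mutations = [("protA", 3), ("protB", 3), ("protA", 1)]
counts = column_counts(alignment, mutations)
hotspot_column, hotspot_count = counts.most_common(1)[0]
```

In the toy data, residue 3 of both sequences lands in the same alignment column, so mutations from different proteins accumulate at one equivalent position, which is the sense in which a domain-level hotspot is defined.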
Hrle, Ajla; Maier, Lisa-Katharina; Sharma, Kundan; Ebert, Judith; Basquin, Claire; Urlaub, Henning; Marchfelder, Anita; Conti, Elena
2014-01-01
Upon pathogen invasion, bacteria and archaea activate an RNA-interference-like mechanism termed CRISPR (clustered regularly interspaced short palindromic repeats). A large family of Cas (CRISPR-associated) proteins mediates the different stages of this sophisticated immune response. Bioinformatic studies have classified the Cas proteins into families, according to their sequences and respective functions. These range from the insertion of the foreign genetic elements into the host genome to the activation of the interference machinery as well as target degradation upon attack. Cas7 family proteins are central to the type I and type III interference machineries as they constitute the backbone of the large interference complexes. Here we report the crystal structure of Thermofilum pendens Csc2, a Cas7 family protein of type I-D. We found that Csc2 forms a core RRM-like domain, flanked by three peripheral insertion domains: a lid domain, a Zinc-binding domain and a helical domain. Comparison with other Cas7 family proteins reveals a set of similar structural features both in the core and in the peripheral domains, despite the absence of significant sequence similarity. T. pendens Csc2 binds single-stranded RNA in vitro in a sequence-independent manner. Using a crosslinking - mass-spectrometry approach, we mapped the RNA-binding surface to a positively charged surface patch on T. pendens Csc2. Thus our analysis of the key structural and functional features of T. pendens Csc2 highlights recurring themes and evolutionary relationships in type I and type III Cas proteins.
Rab11 family expression in the human placenta: Localization at the maternal-fetal interface
Artemiuk, Patrycja A.; Hanscom, Sara R.; Lindsay, Andrew J.; Wuebbolt, Danielle; Breathnach, Fionnuala M.; Tully, Elizabeth C.; Khan, Amir R.; McCaffrey, Mary W.
2017-01-01
Rab proteins are a family of small GTPases involved in a variety of cellular processes. The Rab11 subfamily in particular directs key steps of intracellular functions involving vesicle trafficking of the endosomal recycling pathway. This Rab subfamily works through a series of effector proteins including the Rab11-FIPs (Rab11 Family-Interacting Proteins). While the Rab11 subfamily has been well characterized at the cellular level, its function within human organ systems is still being explored. In an effort to further study these proteins, we conducted a preliminary investigation of a subgroup of endosomal Rab proteins in a range of human cell lines by Western blotting. The results from this analysis indicated that Rab11a, Rab11c(Rab25) and Rab14 were expressed in a wide range of cell lines, including the human placental trophoblastic BeWo cell line. These findings encouraged us to further analyse the localization of these Rabs and their common effector protein, the Rab Coupling Protein (RCP), by immunofluorescence microscopy and to extend this work to normal human placental tissue. The placenta is a highly active exchange interface, facilitating transfer between mother and fetus during pregnancy. As Rab11 proteins are closely involved in transcytosis we hypothesized that the placenta would be an interesting human tissue model system for Rab investigation. By immunofluorescence microscopy, Rab11a, Rab11c(Rab25), Rab14 as well as their common FIP effector RCP showed prominent expression in the placental cell lines. We also identified the expression of these proteins in human placental lysates by Western blot analysis. Further, via fluorescent immunohistochemistry, we noted abundant localization of these proteins within key functional areas of primary human placental tissues, namely the outer syncytial layer of placental villous tissue and the endothelia of fetal blood vessels. 
Overall these findings highlight the expression of the Rab11 family within the human placenta, with novel localization at the maternal-fetal interface. PMID:28922401
Dynamic Palmitoylation and the Role of DHHC Proteins in T Cell Activation and Anergy
Ladygina, Nadejda; Martin, Brent R.; Altman, Amnon
2017-01-01
Although protein S-palmitoylation was first characterized >30 years ago, and is implicated in the function, trafficking, and localization of many proteins, little is known about the regulation and physiological implications of this posttranslational modification. Palmitoylation of various signaling proteins required for TCR-induced T cell activation is also necessary for their proper function. LAT (linker for activation of T cells) is an essential scaffolding protein involved in T cell development and activation, and we found that its palmitoylation is selectively impaired in anergic T cells. The recent discovery of the DHHC family of palmitoyl acyl transferases (PATs) and the establishment of sensitive and quantitative proteomics-based methods for global analysis of the palmitoyl proteome led to significant progress in studying the biology and underlying mechanisms of cellular protein palmitoylation. We are using these approaches to explore the palmitoyl proteome in T lymphocytes and, specifically, the mechanistic basis for the impaired palmitoylation of LAT in anergic T cells. This chapter reviews the history of protein palmitoylation and its role in T cell activation, the DHHC family and new methodologies for global analysis of the palmitoyl proteome, and summarizes our recent work in this area. The new methodologies will accelerate the pace of research and provide a greatly improved mechanistic and molecular understanding of the complex process of protein palmitoylation and its regulation, and the substrate specificity of the novel DHHC family. Reversible protein palmitoylation will likely prove to be an important posttranslational mechanism that regulates cellular responses, similar to protein phosphorylation and ubiquitination. PMID:21569911
Knutson, Stacy T; Westwood, Brian M; Leuthaeuser, Janelle B; Turner, Brandon E; Nguyendac, Don; Shea, Gabrielle; Kumar, Kiran; Hayden, Julia D; Harper, Angela F; Brown, Shoshana D; Morris, John H; Ferrin, Thomas E; Babbitt, Patricia C; Fetrow, Jacquelyn S
2017-04-01
Protein function identification remains a significant problem. Solving this problem at the molecular functional level would allow identification of mechanistic determinants: amino acids that distinguish details between functional families within a superfamily. Active site profiling was developed to identify mechanistic determinants. DASP and DASP2 were developed as tools to search sequence databases using active site profiling. Here, TuLIP (Two-Level Iterative clustering Process) is introduced as an iterative, divisive clustering process that utilizes active site profiling to separate structurally characterized superfamily members into functionally relevant clusters. Underlying TuLIP is the observation that functionally relevant families (curated by Structure-Function Linkage Database, SFLD) self-identify in DASP2 searches; clusters containing multiple functional families do not. Each TuLIP iteration produces candidate clusters, each evaluated to determine if it self-identifies using DASP2. If so, it is deemed a functionally relevant group. Divisive clustering continues until each structure is either a functionally relevant group member or a singlet. TuLIP is validated on enolase and glutathione transferase structures, superfamilies well-curated by SFLD. Correlation is strong; small numbers of structures prevent statistically significant analysis. TuLIP-identified enolase clusters are used in DASP2 GenBank searches to identify sequences sharing functional site features. Analysis shows a true positive rate of 96%, false negative rate of 4%, and maximum false positive rate of 4%. F-measure and performance analysis on the enolase search results and comparison to GEMMA and SCI-PHY demonstrate that TuLIP avoids the over-division problem of these methods. Mechanistic determinants for enolase families are evaluated and shown to correlate well with literature results. © 2017 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.
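As a reminder of how an F-measure is computed from search results, here is a small sketch. The counts used in the example are hypothetical and are not taken from the paper; the abstract reports rates, not the underlying counts, and the exact F-measure variant the authors used is not stated.

```python
def f_measure(tp, fp, fn, beta=1.0):
    """F-beta score from true positive, false positive and false negative counts.

    Returns 0.0 when there are no true positives, which avoids division by zero.
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)           # recall is the true positive rate
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)

# Hypothetical counts chosen only for illustration.
example_f1 = f_measure(tp=96, fp=4, fn=4)
```

With `beta=1.0` this is the harmonic mean of precision and recall, so a method that over-divides families (high precision, low recall) is penalized relative to one that balances the two.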
Evolutionary distance from human homologs reflects allergenicity of animal food proteins.
Jenkins, John A; Breiteneder, Heimo; Mills, E N Clare
2007-12-01
In silico analysis of allergens can identify putative relationships among protein sequence, structure, and allergenic properties. Such systematic analysis reveals that most plant food allergens belong to a restricted number of protein superfamilies, with pollen allergens behaving similarly. We have investigated the structural relationships of animal food allergens and their evolutionary relatedness to human homologs to define how closely a protein must resemble a human counterpart to lose its allergenic potential. Profile-based sequence homology methods were used to classify animal food allergens into Pfam families, and in silico analyses of their evolutionary and structural relationships were performed. Animal food allergens could be classified into 3 main families--tropomyosins, EF-hand proteins, and caseins--along with 14 minor families each composed of 1 to 3 allergens. The evolutionary relationships of each of these allergen superfamilies showed that in general, proteins with a sequence identity to a human homolog above approximately 62% were rarely allergenic. Single substitutions in otherwise highly conserved regions containing IgE epitopes in EF-hand parvalbumins may modulate allergenicity. These data support the premise that certain protein structures are more allergenic than others. Contrasting with plant food allergens, animal allergens, such as the highly conserved tropomyosins, challenge the capability of the human immune system to discriminate between foreign and self-proteins. Such immune responses run close to becoming autoimmune responses. Exploiting the closeness between animal allergens and their human homologs in the development of recombinant allergens for immunotherapy will need to consider the potential for developing unanticipated autoimmune responses.
Zou, Zhi; Yang, Lifu; Gong, Jun; Mo, Yeyong; Wang, Jikun; Cao, Jianhua; An, Feng; Xie, Guishui
2016-01-01
Aquaporins (AQPs) are channel-forming integral membrane proteins that transport water and other small solutes across biological membranes. Despite the vital role of AQPs, to date, little is known about them in physic nut (Jatropha curcas L., Euphorbiaceae), an important non-edible oilseed crop with great potential for the production of biodiesel. In this study, 32 AQP genes were identified from the physic nut genome, and the family is relatively small in comparison to the 51 members in another Euphorbiaceae plant, rubber tree (Hevea brasiliensis Muell. Arg.). Based on the phylogenetic analysis, the JcAQPs were assigned to five subfamilies, i.e., nine plasma membrane intrinsic proteins (PIPs), nine tonoplast intrinsic proteins (TIPs), eight NOD26-like intrinsic proteins (NIPs), two X intrinsic proteins (XIPs), and four small basic intrinsic proteins (SIPs). As in rubber tree and other plant species, functional prediction based on the aromatic/arginine selectivity filter, Froger's positions, and specificity-determining positions showed a remarkable difference in substrate specificity among subfamilies of JcAQPs. Genome-wide comparative analysis revealed the specific expansion of PIP and TIP subfamilies in rubber tree and the specific gene loss of the XIP subfamily in physic nut. Furthermore, by analyzing deep transcriptome sequencing data, the expression evolution, especially the expression divergence, of duplicated HbAQP genes was also investigated and discussed. Results obtained from this study not only provide valuable information for future functional analysis and utilization of Jc/HbAQP genes, but also provide a useful reference to survey the gene family expansion and evolution in Euphorbiaceae plants and other plant species. PMID:27066041
Santucci, Laura; Candiano, Giovanni; Anglani, Franca; Bruschi, Maurizio; Tosetto, Enrica; Cremasco, Daniela; Murer, Luisa; D'Ambrosio, Chiara; Scaloni, Andrea; Petretto, Andrea; Caridi, Gianluca; Rossi, Roberta; Bonanni, Alice; Ghiggeri, Gian Marco
2016-01-01
Definition of the urinary protein composition would represent a potential tool for diagnosis in many clinical conditions. The use of new proteomic technologies allows detection of genetic and post-translational variants, which increases the sensitivity of the approach but complicates comparison within a heterogeneous patient population. Overall, this limits research on urinary biomarkers. Monogenic diseases are useful models to address this issue since genetic variability is reduced among first- and second-degree relatives of the same family. We applied this concept to Dent's disease, a monogenic condition characterised by low-molecular-weight proteinuria that is inherited as an X-linked trait. Results are presented here on a combined proteomic approach (LC-mass spectrometry, Western blot and zymograms for proteases and inhibitors) to characterise urine proteins in a large family (18 members: 6 hemizygous patients, 6 carrier females, and 6 normals) with Dent's disease due to the 1070G>T mutation of the CLCN5 gene. Gene ontology analysis on more than 1000 proteins showed that several clusters of proteins characterised urine of affected patients compared to carrier females and normal subjects: proteins involved in extracellular matrix remodelling were the major group. Specific analysis on metalloproteases and their inhibitors underscored unexpected mechanisms potentially involved in renal fibrosis. Studying the members of a large family with Dent's disease who share the same molecular defect, using new-generation proteomic techniques, yielded highly reproducible results that justify these conclusions. Identification in urine of proteins actively involved in interstitial matrix remodelling raises the question of using anti-fibrotic drugs in Dent's patients. Copyright © 2015 Elsevier B.V. All rights reserved.
Nasir, Muhammad; Ahmad, Nafees; Sieber, Christian M K; Latif, Amir; Malik, Salman Akbar; Hameed, Abdul
2013-09-24
Xeroderma Pigmentosum (XP) is a rare skin disorder characterized by skin hypersensitivity to sunlight and abnormal pigmentation. The aim of this study was to investigate the genetic cause of a severe XP phenotype in a consanguineous Pakistani family and to characterize in silico any identified disease-associated mutation. The XP complementation group was assigned by genotyping the family for known XP loci. Genotyping data mapped the family to the complementation group A locus, involving the XPA gene. Mutation analysis of the candidate XP gene by DNA sequencing revealed a novel deletion mutation (c.654del A) in exon 5 of the XPA gene. The c.654del A mutation causes a frameshift that prematurely terminates the protein and results in a truncated product of 222 amino acid (aa) residues instead of 273 (p.Lys218AsnfsX5). In silico tools were applied to study the likelihood of changes in structural motifs and thus in the interaction of the mutated protein with its binding partners. In silico analysis of the mutant protein sequence predicted that the mutation affects aa residues that adopt a coiled-coil structure. The coiled-coil structure has an important role in key cellular interactions, especially with DNA damage-binding protein 2 (DDB2), which has an important role in the DDB-mediated nucleotide excision repair (NER) system. Our findings support the genetic and clinical heterogeneity of XP. The study also predicts a critical role for the DDB2-binding region of the XPA protein in the NER pathway and opens an avenue for further research on the functional role of the mutated protein domain.
Syed, Khajamohiddin; Mashele, Samson Sitheni
2014-01-01
Cytochrome P450 monooxygenases (P450s) are heme-thiolate proteins distributed across the biological kingdoms. P450s are catalytically versatile and play key roles in organisms' primary and secondary metabolism. Identification of P450s across the biological kingdoms depends largely on the identification of two P450 signature motifs, EXXR and CXG, in the protein sequence. Once a putative protein has been identified as a P450, it is assigned to a family and subfamily based on the criteria that P450s within a family share more than 40% homology and members of subfamilies share more than 55% homology. However, to date, no evidence has been presented that can distinguish members of a P450 family. Here, for the first time, we report the identification of EXXR- and CXG-motif-based amino acid patterns that are characteristic of the P450 family. Analysis of P450 signature motifs in the under-explored fungal P450s from four different phyla, ascomycota, basidiomycota, zygomycota and chytridiomycota, indicated that the EXXR motif is highly variable and the CXG motif is somewhat variable. The amino acids threonine and leucine are preferred as second and third amino acids in the EXXR motif and proline and glycine are preferred as second and third amino acids in the CXG motif in fungal P450s. Analysis of 67 P450 families from biological kingdoms such as plants, animals, bacteria and fungi showed conservation of a set of amino acid patterns characteristic of a particular P450 family in EXXR and CXG motifs. This suggests that during the divergence of P450 families from a common ancestor these amino acid patterns evolved and were retained in each P450 family as a signature of that family. The role of amino acid patterns characteristic of a P450 family in the structural and/or functional aspects of members of the P450 family is a topic for future research. PMID:24743800
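Locating the two signature motifs named in this abstract is a simple pattern search, sketched below with regular expressions in which `X` is any residue. The function name and the example sequence are invented for illustration; real P450 annotation relies on more context than these two short motifs, which can occur by chance in unrelated proteins.

```python
import re
from typing import Dict, List, Tuple

# Motifs as written in the abstract, with X matching any residue.
MOTIF_PATTERNS: Dict[str, str] = {
    "EXXR": "E..R",
    "CXG": "C.G",
}


def find_motifs(sequence: str) -> Dict[str, List[Tuple[int, str]]]:
    """Return 0-based start positions and matched text for each motif.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    sequence = sequence.upper()
    hits: Dict[str, List[Tuple[int, str]]] = {}
    for name, pattern in MOTIF_PATTERNS.items():
        regex = re.compile(f"(?=({pattern}))")
        hits[name] = [(m.start(), m.group(1)) for m in regex.finditer(sequence)]
    return hits


# Made-up peptide, not a real P450 sequence.
example_hits = find_motifs("MKETLRAAFGCPGLL")
```

Here `example_hits["EXXR"]` contains the match `ETLR` starting at index 2, and `example_hits["CXG"]` contains `CPG` starting at index 10.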
Anantharaman, Vivek; Aravind, L
2003-01-01
Peptidoglycan is hydrolyzed by a diverse set of enzymes during bacterial growth, development and cell division. The NlpC/P60 proteins define a family of cell-wall peptidases that are widely represented in various bacterial lineages. Currently characterized members are known to hydrolyze D-gamma-glutamyl-meso-diaminopimelate or N-acetylmuramate-L-alanine linkages. Detailed analysis of the NlpC/P60 peptidases showed that these proteins define a large superfamily encompassing several diverse groups of proteins. In addition to the well characterized P60-like proteins, this superfamily includes the AcmB/LytN and YaeF/YiiX families of bacterial proteins, the amidase domain of bacterial and kinetoplastid glutathionylspermidine synthases (GSPSs), and several proteins from eukaryotes, phages, poxviruses, positive-strand RNA viruses, and certain archaea. The eukaryotic members include lecithin retinol acyltransferase (LRAT), nematode developmental regulator Egl-26, and candidate tumor suppressor H-rev107. These eukaryotic proteins, along with the bacterial YaeF/poxviral G6R family, show a circular permutation of the catalytic domain. We identified three conserved residues, namely a cysteine, a histidine and a polar residue, that are involved in the catalytic activities of this superfamily. Evolutionary analysis of this superfamily shows that it comprises four major families, with diverse domain architectures in each of them. Several related, but distinct, catalytic activities, such as murein degradation, acyl transfer and amide hydrolysis, have emerged in the NlpC/P60 superfamily. The three conserved catalytic residues of this superfamily are shown to be equivalent to the catalytic triad of the papain-like thiol peptidases. The predicted structural features indicate that the NlpC/P60 enzymes contain a fold similar to the papain-like peptidases, transglutaminases and arylamine acetyltransferases.
2009-01-01
Background: Subcellular trafficking is a hallmark of eukaryotic cells. Because of their pivotal role in the process, a great deal of attention has been paid to the SNARE proteins. Most R-SNAREs, or "longins", however, also possess a highly conserved, N-terminal fold. This "longin domain" is known to play multiple roles in regulating SNARE activity and targeting via interaction with other trafficking proteins. However, the diversity and complement of longins in eukaryotes is poorly understood. Results: Our comparative genome survey identified a novel family of longin-related proteins, dubbed the "Phytolongins" because they are specific to land plants. Phytolongins share with longins the N-terminal longin domain and the C-terminal transmembrane domain; however, in the central region, the SNARE motif is replaced by a novel region. Phylogenetic analysis pinpoints the Phytolongins as a derivative of the plant-specific VAMP72 longin sub-family and allows elucidation of Phytolongin evolution. Conclusion: "Longins" have been defined as R-SNAREs composed of both a longin domain and a SNARE motif. However, expressed gene isoforms and splice variants of longins are examples of longins that lack a SNARE motif. The discovery of Phytolongins, a family of non-SNARE longin domain proteins, together with recent evidence on the conservation of the longin-like fold in proteins involved in both vesicle fusion (e.g. the Trs20 tether) and vesicle formation (e.g. σ and μ adaptin), highlights the importance of the longin-like domain in protein trafficking and suggests that it was one of the primordial building blocks of the eukaryotic membrane-trafficking machinery. PMID:19889231
Candida albicans Iff11, a secreted protein required for cell wall structure and virulence.
Bates, Steven; de la Rosa, José M; MacCallum, Donna M; Brown, Alistair J P; Gow, Neil A R; Odds, Frank C
2007-06-01
The Candida albicans cell wall is the immediate point of contact with the host and is implicated in the host-fungal interaction and virulence. To date, a number of cell wall proteins have been identified and associated with virulence. Analysis of the C. albicans genome has identified the IFF gene family as encoding the largest family of cell wall-related proteins. This family is also conserved in a range of other Candida species. Iff11 differs from other family members in lacking a GPI anchor, and we have demonstrated it to be O glycosylated and secreted in C. albicans. A null mutant lacking IFF11 was hypersensitive to cell wall-damaging agents, suggesting a role in cell wall organization. In a murine model of systemic infection the null mutant was highly attenuated in virulence, and survival-standardized infections suggest it is required to establish an infection. This work provides the first evidence of the importance of this gene family in the host-fungal interaction and virulence.
Generation of henipavirus nucleocapsid proteins in yeast Saccharomyces cerevisiae.
Juozapaitis, Mindaugas; Serva, Andrius; Zvirbliene, Aurelija; Slibinskas, Rimantas; Staniulis, Juozas; Sasnauskas, Kestutis; Shiell, Brian J; Wang, Lin-Fa; Michalski, Wojtek P
2007-03-01
Hendra and Nipah viruses are newly emerged, zoonotic viruses and their genomes have nucleotide and predicted amino acid homologies placing them in the family Paramyxoviridae. Currently these viruses are classified in the new genus Henipavirus, within the subfamily Paramyxovirinae, family Paramyxoviridae. The genes encoding HeV and NiV nucleocapsid proteins were cloned into the yeast Saccharomyces cerevisiae expression vector pFGG3 under control of the GAL7 promoter. A high level of expression of these proteins (18-20 mg l(-1) of yeast culture) was obtained. Mass spectrometric analysis confirmed the primary structure of both proteins with 92% sequence coverage obtained using MS/MS analysis. Electron microscopy demonstrated that the purified recombinant nucleocapsid proteins assemble into the typical herring-bone structures characteristic of other paramyxoviruses. The nucleocapsid proteins were stable in yeast and can be easily purified by cesium chloride gradient ultracentrifugation. HeV nucleocapsid protein was detected by sera derived from fruit bats, humans, and horses infected with HeV, and NiV nucleocapsid protein was immunodetected with sera from fruit bats, humans and pigs. The development of an efficient and cost-effective system for generation of henipavirus nucleocapsid proteins might help to improve reagents for diagnosis of these viruses.
Han, Ying-Li; Hou, Cong-Cong; Du, Chen; Zhu, Jun-Quan
2017-01-01
Heat shock proteins 70 (HSP70s) are molecular chaperones that aid in protection against environmental stress. In this study, we cloned and characterized five members of the HSP70 family (designated as HSPa1a, HSC70-1, HSC70-2, HSPa4 and HSPa14) from Lateolabrax maculatus using rapid amplification of cDNA ends (RACE). Multiple sequence alignment and structural analysis revealed that all members of the HSP70 family had a conserved domain architecture, with some distinguishing features unique to each HSP70. Quantitative real-time PCR (qPCR) analysis revealed that all members of the HSP70 family were ubiquitously and differentially expressed in all major types of tissues, including testicular tissue. This indicated that HSP70s have vital and conserved biological functions, and may also function in the development of germinal cells. The mRNA expression of the five HSP70 family members was significantly increased in the head kidney, intestine and gill after Vibrio harveyi challenge, suggesting that HSP70s play an important role in the immune response. Copyright © 2016 Elsevier Ltd. All rights reserved.
Zebra: a web server for bioinformatic analysis of diverse protein families.
Suplatov, Dmitry; Kirilin, Evgeny; Takhaveev, Vakil; Svedas, Vytas
2014-01-01
During evolution of proteins from a common ancestor, one functional property can be preserved while others can vary, leading to functional diversity. A systematic study of the corresponding adaptive mutations provides a key to one of the most challenging problems of modern structural biology - understanding the impact of amino acid substitutions on protein function. The subfamily-specific positions (SSPs) are conserved within functional subfamilies but are different between them and, therefore, seem to be responsible for functional diversity in protein superfamilies. Consequently, a corresponding method to perform the bioinformatic analysis of sequence and structural data has to be implemented in common laboratory practice to study the structure-function relationship in proteins and develop novel protein engineering strategies. This paper describes the Zebra web server - a powerful remote platform that implements a novel bioinformatic analysis algorithm to study diverse protein families. It is the first application that provides specificity determinants at different levels of functional classification, therefore addressing complex functional diversity of large superfamilies. Statistical analysis is implemented to automatically select a set of highly significant SSPs to be used as hotspots for directed evolution or rational design experiments and analyzed when studying the structure-function relationship. Zebra results are provided in two ways - (1) as a single all-in-one parsable text file and (2) as PyMol sessions with structural representation of SSPs. The Zebra web server is available at http://biokinet.belozersky.msu.ru/zebra.
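The definition of subfamily-specific positions given in this abstract (conserved within subfamilies, different between them) can be made concrete with a strict toy version. This is not the Zebra algorithm, which uses statistical scoring; the sketch below assumes pre-aligned sequences of equal length, requires perfect conservation within each subfamily, and uses invented names and sequences throughout.

```python
from typing import Dict, List


def subfamily_specific_positions(subfamilies: Dict[str, List[str]]) -> List[int]:
    """Return 0-based alignment columns that are identical within every
    subfamily but not identical across all subfamilies."""
    lengths = {len(seq) for seqs in subfamilies.values() for seq in seqs}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    (length,) = lengths

    positions: List[int] = []
    for column in range(length):
        consensus: List[str] = []
        for seqs in subfamilies.values():
            residues = {seq[column] for seq in seqs}
            if len(residues) != 1 or "-" in residues:
                break  # not strictly conserved (or gapped) in this subfamily
            consensus.append(residues.pop())
        else:
            # Conserved in every subfamily; keep it only if the
            # conserved residue differs between at least two of them.
            if len(set(consensus)) > 1:
                positions.append(column)
    return positions


# Made-up alignment with two subfamilies.
toy_alignment = {
    "sub1": ["MKDAG", "MKDSG"],
    "sub2": ["MREAG", "MRETG"],
}
ssp_columns = subfamily_specific_positions(toy_alignment)
```

In the toy alignment, columns 1 and 2 qualify (K versus R, D versus E), column 0 and column 4 are conserved across the whole family, and column 3 varies within subfamilies.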
Yang, Tao; Jia, Quanzhang; Guo, Hong; Xu, Jianzhong; Bai, Yun; Yang, Kai; Luo, Fei; Zhang, Zehua; Hou, Tianyong
2012-06-01
To investigate the effects of genetic factors on idiopathic scoliosis (IS) and genetic modes through genetic epidemiological survey on IS in Chongqing City, China, and to determine whether SH3GL1, GADD45B, and FGF22 in the chromosome 19p13.3 are the pathogenic genes of IS through genetic sequence analysis. 214 nuclear families were investigated to analyse the age incidence, familial aggregation, and heritability. SH3GL1, GADD45B, and FGF22 were chosen as candidate genes for mutation screening in 56 IS patients of 214 families. The sequence alignment analysis was performed to determine mutations and predict the protein structure. The average age of onset of 10.8 years suggests that IS is an early-onset disease. Incidences of IS in first-, second-, third-degree relatives and the overall incidence in families (5.68%) were also significantly higher than that in the general population (1.04%). The U test indicated a significant difference, suggesting that IS shows familial aggregation. The heritability of first-degree relatives (77.68 ±10.39%), second-degree relatives (69.89 ±3.14%), and third-degree relatives (62.14 ±11.92%) illustrated that genetic factors play an important role in IS pathogenesis. The incidence in first-degree relatives (10.01%), second-degree relatives (2.55%) and third-degree relatives (1.76%) illustrated that IS is not in simple accord with monogenic Mendelian inheritance but manifests the traits of a multifactorial hereditary disease. Sequence alignment of exons of SH3GL1, GADD45B, and FGF22 showed 17 base mutations, of which 16 mutations do not induce open reading frame (ORF) shift or amino acid changes, whereas one mutation (C→T) in SH3GL1 results in formation of a termination codon, which alters the protein reading frame. Prediction analysis of protein sequence showed that the SH3GL1 mutant encoded a truncated protein, thus affecting the protein structure. 
IS is a multifactorial genetic disease and SH3GL1 may be one of the pathogenic genes for IS.
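How a single C→T substitution can create a termination codon and truncate a protein can be shown on a toy coding sequence. The sequence below is invented and is not SH3GL1; the abstract does not state which codon is affected, so the CAG→TAG change here is only one possible example. Translation uses the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def substitute(cds: str, index: int, new_base: str) -> str:
    """Return the sequence with one base replaced at a 0-based index."""
    return cds[:index] + new_base + cds[index + 1:]


# Hypothetical wild-type sequence: ATG GCT CAG GAA TTT TAA
wild_type = "ATGGCTCAGGAATTTTAA"
mutant = substitute(wild_type, 6, "T")  # CAG (Gln) becomes TAG (stop)

wild_type_protein = translate(wild_type)
mutant_protein = translate(mutant)
```

The wild-type sequence translates to five residues, while the mutant stops after two, which is the kind of truncation the abstract predicts for the SH3GL1 variant.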
Characterization of the Avian Trojan Gene Family Reveals Contrasting Evolutionary Constraints
Petrov, Petar; Syrjänen, Riikka; Smith, Jacqueline; Gutowska, Maria Weronika; Uchida, Tatsuya; Vainio, Olli; Burt, David W
2015-01-01
“Trojan” is a leukocyte-specific, cell surface protein originally identified in the chicken. Its molecular function has been hypothesized to be related to anti-apoptosis and the proliferation of immune cells. The Trojan gene has been localized onto the Z sex chromosome. The adjacent two genes also show significant homology to Trojan, suggesting the existence of a novel gene/protein family. Here, we characterize this Trojan family, identify homologues in other species and predict evolutionary constraints on these genes. The two Trojan-related proteins in chicken were predicted as a receptor-type tyrosine phosphatase and a transmembrane protein, bearing a cytoplasmic immuno-receptor tyrosine-based activation motif. We identified the Trojan gene family in ten other bird species and found related genes in three reptiles and a fish species. The phylogenetic analysis of the homologues revealed a gradual diversification among the family members. Evolutionary analyses of the avian genes predicted that the extracellular regions of the proteins have been subjected to positive selection. Such selection was possibly a response to evolving interacting partners or to pathogen challenges. We also observed an almost complete lack of intracellular positively selected sites, suggesting a conserved signaling mechanism of the molecules. Therefore, the contrasting patterns of selection likely correlate with the interaction and signaling potential of the molecules. PMID:25803627
Mudgil, Yashwanti; Shiu, Shin-Han; Stone, Sophia L.; Salt, Jennifer N.; Goring, Daphne R.
2004-01-01
The Arabidopsis genome was searched to identify predicted proteins containing armadillo (ARM) repeats, a motif known to mediate protein-protein interactions in a number of different animal proteins. Using domain database predictions and models generated in this study, 108 Arabidopsis proteins were identified that contained a minimum of two ARM repeats with the majority of proteins containing four to eight ARM repeats. Clustering analysis showed that the 108 predicted Arabidopsis ARM repeat proteins could be divided into multiple groups with wide differences in their domain compositions and organizations. Interestingly, 41 of the 108 Arabidopsis ARM repeat proteins contained a U-box, a motif present in a family of E3 ligases, and these proteins represented the largest class of Arabidopsis ARM repeat proteins. In 14 of these U-box/ARM repeat proteins, there was also a novel conserved domain identified in the N-terminal region. Based on the phylogenetic tree, representative U-box/ARM repeat proteins were selected for further study. RNA-blot analyses revealed that these U-box/ARM proteins are expressed in a variety of tissues in Arabidopsis. In addition, the selected U-box/ARM proteins were found to be functional E3 ubiquitin ligases. Thus, these U-box/ARM proteins represent a new family of E3 ligases in Arabidopsis. PMID:14657406
Benoit, Joshua B.; Attardo, Geoffrey M.; Michalkova, Veronika; Krause, Tyler B.; Bohova, Jana; Zhang, Qirui; Baumann, Aaron A.; Mireji, Paul O.; Takáč, Peter; Denlinger, David L.; Ribeiro, Jose M.; Aksoy, Serap
2014-01-01
In tsetse flies, nutrients for intrauterine larval development are synthesized by the modified accessory gland (milk gland) and provided in mother's milk during lactation. Interference with at least two milk proteins has been shown to extend larval development and reduce fecundity. The goal of this study was to perform a comprehensive characterization of tsetse milk proteins using lactation-specific transcriptome/milk proteome analyses and to define functional role(s) for the milk proteins during lactation. Differential analysis of RNA-seq data from lactating and dry (non-lactating) females revealed enrichment of transcripts coding for protein synthesis machinery, lipid metabolism and secretory proteins during lactation. Among the genes induced during lactation were those encoding the previously identified milk proteins (milk gland proteins 1–3, transferrin and acid sphingomyelinase 1) and seven new genes (mgp4–10). The genes encoding mgp2–10 are organized on a 40 kb syntenic block in the tsetse genome, have similar exon-intron arrangements, and share regions of amino acid sequence similarity. Expression of mgp2–10 is female-specific and high during milk secretion. While knockdown of a single mgp failed to reduce fecundity, simultaneous knockdown of multiple variants reduced milk protein levels and lowered fecundity. The genomic localization, gene structure similarities, and functional redundancy of MGP2–10 suggest that they constitute a novel highly divergent protein family. Our data indicates that MGP2–10 function both as the primary amino acid resource for the developing larva and in the maintenance of milk homeostasis, similar to the function of the mammalian casein family of milk proteins. This study underscores the dynamic nature of the lactation cycle and identifies a novel family of lactation-specific proteins, unique to Glossina sp., that are essential to larval development. 
The specificity of MGP2–10 to tsetse and their critical role during lactation suggest that these proteins may be an excellent target for tsetse-specific population control approaches. PMID:24763277
Antonenkov, Vasily D; Ohlmeier, Steffen; Sormunen, Raija T; Hiltunen, J Kalervo
2007-05-25
Mammalian UK114 belongs to a highly conserved family of proteins with unknown functions. Although it is believed that UK114 is a cytosolic or mitochondrial protein, there is no detailed study of its intracellular localization. Using analytical subcellular fractionation, the electron microscopic colloidal gold technique, and two-dimensional gel electrophoresis of peroxisomal matrix proteins combined with mass spectrometric analysis, we show here that a large portion of UK114 is present in rat liver peroxisomes. The peroxisomal UK114 is a soluble matrix protein and it is not inducible by the peroxisomal proliferator clofibrate. The data predict involvement of UK114 in peroxisomal metabolism.
Proteomic Characterization of a Mouse Model of Familial Danish Dementia
Vitale, Monica; Renzone, Giovanni; Matsuda, Shuji; Scaloni, Andrea; D'Adamio, Luciano; Zambrano, Nicola
2012-01-01
A dominant mutation in the ITM2B/BRI2 gene causes familial Danish dementia (FDD) in humans. To model FDD in animal systems, a knock-in approach was recently implemented in mice expressing a wild-type and mutant allele, which bears the FDD-associated mutation. Since these FDDKI mice show behavioural alterations and impaired synaptic function, we characterized their synaptosomal proteome via two-dimensional differential in-gel electrophoresis. After identification by nanoliquid chromatography coupled to electrospray-linear ion trap tandem mass spectrometry, the differentially expressed proteins were classified according to their gene ontology descriptions and their predicted functional interactions. The Dlg4/Psd95 scaffold protein and additional signalling proteins, including protein phosphatases, were revealed by STRING analysis as potential players in the altered synaptic function of FDDKI mice. Immunoblotting analysis finally demonstrated the actual downregulation of the synaptosomal scaffold protein Dlg4/Psd95 and of the dual-specificity phosphatase Dusp3 in the synaptosomes of FDDKI mice. PMID:22619496
Taddei, Lucilla; Stella, Giulio Rocco; Rogato, Alessandra; Bailleul, Benjamin; Fortunato, Antonio Emidio; Annunziata, Rossella; Sanges, Remo; Thaler, Michael; Lepetit, Bernard; Lavaud, Johann; Jaubert, Marianne; Finazzi, Giovanni; Bouly, Jean-Pierre; Falciatore, Angela
2016-01-01
Diatoms are phytoplanktonic organisms that grow successfully in the ocean where light conditions are highly variable. Studies of the molecular mechanisms of light acclimation in the marine diatom Phaeodactylum tricornutum show that carotenoid de-epoxidation enzymes and LHCX1, a member of the light-harvesting protein family, both contribute to dissipate excess light energy through non-photochemical quenching (NPQ). In this study, we investigate the role of the other members of the LHCX family in diatom stress responses. Our analysis of available genomic data shows that the presence of multiple LHCX genes is a conserved feature of diatom species living in different ecological niches. Moreover, an analysis of the levels of four P. tricornutum LHCX transcripts in relation to protein expression and photosynthetic activity indicates that LHCXs are differentially regulated under different light intensities and nutrient starvation, mostly modulating NPQ capacity. We conclude that multiple abiotic stress signals converge to regulate the LHCX content of cells, providing a way to fine-tune light harvesting and photoprotection. Moreover, our data indicate that the expansion of the LHCX gene family reflects functional diversification of its members which could benefit cells responding to highly variable ocean environments. PMID:27225826
Stattin, Eva-Lena; Wiklund, Fredrik; Lindblom, Karin; Önnerfjord, Patrik; Jonsson, Björn-Anders; Tegner, Yelverton; Sasaki, Takako; Struglics, André; Lohmander, Stefan; Dahl, Niklas; Heinegård, Dick; Aspberg, Anders
2010-01-01
Osteochondritis dissecans is a disorder in which fragments of articular cartilage and subchondral bone dislodge from the joint surface. We analyzed a five-generation family in which affected members had autosomal-dominant familial osteochondritis dissecans. A genome-wide linkage analysis identified aggrecan (ACAN) as a prime candidate gene for the disorder. Sequence analysis of ACAN revealed heterozygosity for a missense mutation (c.6907G > A) in affected individuals, resulting in a p.V2303M amino acid substitution in the aggrecan G3 domain C-type lectin, which mediates interactions with other proteins in the cartilage extracellular matrix. Binding studies with recombinant mutated and wild-type G3 proteins showed loss of fibulin-1, fibulin-2, and tenascin-R interactions for the V2303M protein. Mass spectrometric analyses of aggrecan purified from patient cartilage verified that V2303M aggrecan is produced and present in the tissue. Our results provide a molecular mechanism for the etiology of familial osteochondritis dissecans and show the importance of the aggrecan C-type lectin interactions for cartilage function in vivo. PMID:20137779
Identification, cloning and characterization of the tomato TCP transcription factor family.
Parapunova, Violeta; Busscher, Marco; Busscher-Lange, Jacqueline; Lammers, Michiel; Karlova, Rumyana; Bovy, Arnaud G; Angenent, Gerco C; de Maagd, Ruud A
2014-06-06
TCP proteins are plant-specific transcription factors, which are known to have a wide range of functions in different plant species such as in leaf development, flower symmetry, shoot branching, and senescence. Only a small number of TCP genes has been characterised from tomato (Solanum lycopersicum). Here we report several functional features of the members of the entire family present in the tomato genome. We have identified 30 Solanum lycopersicum SlTCP genes, most of which have not been described before. Phylogenetic analysis clearly distinguishes two homology classes of the SlTCP transcription factor family - class I and class II. Class II differentiates in two subclasses, the CIN-TCP subclass and the CYC/TB1 subclass, involved in leaf development and axillary shoots formation, respectively. The expression patterns of all members were determined by quantitative PCR. Several SlTCP genes, like SlTCP12, SlTCP15 and SlTCP18 are preferentially expressed in the tomato fruit, suggesting a role during fruit development or ripening. These genes are regulated by RIN (RIPENING INHIBITOR), CNR (COLORLESS NON-RIPENING) and SlAP2a (APETALA2a) proteins, which are transcription factors with key roles in ripening. With a yeast one-hybrid assay we demonstrated that RIN binds the promoter fragments of SlTCP12, SlTCP15 and SlTCP18, and that CNR binds the SlTCP18 promoter. This data strongly suggests that these class I SlTCP proteins are involved in ripening. Furthermore, we demonstrate that SlTCPs bind the promoter fragments of members of their own family, indicating that they regulate each other. Additional yeast one-hybrid studies performed with Arabidopsis transcription factors revealed binding of the promoter fragments by proteins involved in the ethylene signal transduction pathway, contributing to the idea that these SlTCP genes are involved in the ripening process. 
Yeast two-hybrid data shows that SlTCP proteins can form homo and heterodimers, suggesting that they act together in order to form functional protein complexes and together regulate developmental processes in tomato. The comprehensive analysis we performed, like phylogenetic analysis, expression studies, identification of the upstream regulators and the dimerization specificity of the tomato TCP transcription factor family provides the basis for functional studies to reveal the role of this family in tomato development.
Identification, cloning and characterization of the tomato TCP transcription factor family
2014-01-01
Background TCP proteins are plant-specific transcription factors, which are known to have a wide range of functions in different plant species such as in leaf development, flower symmetry, shoot branching, and senescence. Only a small number of TCP genes has been characterised from tomato (Solanum lycopersicum). Here we report several functional features of the members of the entire family present in the tomato genome. Results We have identified 30 Solanum lycopersicum SlTCP genes, most of which have not been described before. Phylogenetic analysis clearly distinguishes two homology classes of the SlTCP transcription factor family - class I and class II. Class II differentiates in two subclasses, the CIN-TCP subclass and the CYC/TB1 subclass, involved in leaf development and axillary shoots formation, respectively. The expression patterns of all members were determined by quantitative PCR. Several SlTCP genes, like SlTCP12, SlTCP15 and SlTCP18 are preferentially expressed in the tomato fruit, suggesting a role during fruit development or ripening. These genes are regulated by RIN (RIPENING INHIBITOR), CNR (COLORLESS NON-RIPENING) and SlAP2a (APETALA2a) proteins, which are transcription factors with key roles in ripening. With a yeast one-hybrid assay we demonstrated that RIN binds the promoter fragments of SlTCP12, SlTCP15 and SlTCP18, and that CNR binds the SlTCP18 promoter. This data strongly suggests that these class I SlTCP proteins are involved in ripening. Furthermore, we demonstrate that SlTCPs bind the promoter fragments of members of their own family, indicating that they regulate each other. Additional yeast one-hybrid studies performed with Arabidopsis transcription factors revealed binding of the promoter fragments by proteins involved in the ethylene signal transduction pathway, contributing to the idea that these SlTCP genes are involved in the ripening process. 
Yeast two-hybrid data shows that SlTCP proteins can form homo and heterodimers, suggesting that they act together in order to form functional protein complexes and together regulate developmental processes in tomato. Conclusions The comprehensive analysis we performed, like phylogenetic analysis, expression studies, identification of the upstream regulators and the dimerization specificity of the tomato TCP transcription factor family provides the basis for functional studies to reveal the role of this family in tomato development. PMID:24903607
Mao, Ke; Dong, Qinglong; Li, Chao; Liu, Changhai; Ma, Fengwang
2017-01-01
The bHLH (basic helix-loop-helix) transcription factor family is the second largest in plants. It occurs in all three eukaryotic kingdoms, and plays important roles in regulating growth and development. However, family members have not previously been studied in apple. Here, we identified 188 MdbHLH proteins in apple “Golden Delicious” (Malus × domestica Borkh.), which could be classified into 18 groups. We also investigated the gene structures and 12 conserved motifs in these MdbHLHs. Coupled with expression analysis and protein interaction network prediction, we identified several genes that might be responsible for abiotic stress responses. This study provides insight and rich resources for subsequent investigations of such proteins in apple. PMID:28443104
Matthews, R J; Cahir, E D; Thomas, M L
1990-01-01
Protein-tyrosine-phosphatases (protein-tyrosine-phosphate phosphohydrolase, EC 3.1.3.48) have been implicated in the regulation of cell growth; however, to date few tyrosine phosphatases have been characterized. To identify additional family members, the cDNA for the human tyrosine phosphatase leukocyte common antigen (LCA; CD45) was used to screen, under low stringency, a mouse pre-B-cell cDNA library. Two cDNA clones were isolated and sequence analysis predicts a protein sequence of 793 amino acids. We have named the molecule LRP (LCA-related phosphatase). RNA transfer analysis indicates that the cDNAs were derived from a 3.2-kilobase mRNA. The LRP mRNA is transcribed in a wide variety of tissues. The predicted protein structure can be divided into the following structural features: a short 19-amino acid leader sequence, an exterior domain of 123 amino acids that is predicted to be highly glycosylated, a 24-amino acid membrane-spanning region, and a 627-amino acid cytoplasmic region. The cytoplasmic region contains two approximately 260-amino acid domains, each with homology to the tyrosine phosphatase family. One of the cDNA clones differed in that it had a 108-base-pair insertion that, while preserving the reading frame, would disrupt the first protein-tyrosine-phosphatase domain. Analysis of genomic DNA indicates that the insertion is due to an alternatively spliced exon. LRP appears to be evolutionarily conserved as a putative homologue has been identified in the invertebrate Styela plicata. PMID:2162042
A novel class of dual-family immunophilins.
Adams, Brian; Musiyenko, Alla; Kumar, Rajinder; Barik, Sailen
2005-07-01
Immunophilins are protein chaperones with peptidylprolyl isomerase activity that belong to one of two large families, the cyclosporin-binding cyclophilins (CyPs) and the FK506-binding proteins (FKBPs). Each family displays characteristic and conserved sequence features that differ between the two families. We report a novel group of dual-family immunophilins that contain both CyP and FKBP domains for which we propose the name FCBP (FK506- and cyclosporin-binding protein). The FCBP of Toxoplasma gondii, a protozoan parasite, contained N-terminal FKBP and C-terminal CyP domains joined by tetratricopeptide repeats. Structure-function analysis revealed that both domains were functional and exhibited family-specific drug sensitivity. The individual domains of FCBP inhibited calcineurin (protein phosphatase 2B) in the presence of the appropriate drugs. In binding studies, FCBP recruited calcineurin in the presence of FK506 and a putative target of rapamycin homolog in the presence of rapamycin. Two additional FCBP sequences in Flavobacterium and one in Treponema (spirochete) were also identified in which the CyP and FKBP domains were in the reverse order. T. gondii growth was inhibited by cyclosporin and FK506 in a moderately synergistic manner. The knockdown of FCBP by RNA interference revealed its essentiality for T. gondii growth. Clearly, the FCBPs are novel chaperones and potential targets of multiple immunosuppressant drugs.
A deep insight into the sialotranscriptome of the mosquito, Psorophora albipes
2013-01-01
Background Psorophora mosquitoes are exclusively found in the Americas and have been associated with transmission of encephalitis and West Nile fever viruses, among other arboviruses. Mosquito salivary glands represent the final route of differentiation and transmission of many parasites. They also secrete molecules with powerful pharmacologic actions that modulate host hemostasis, inflammation, and immune response. Here, we employed next generation sequencing and proteome approaches to investigate for the first time the salivary composition of a mosquito member of the Psorophora genus. We additionally discuss the evolutionary position of this mosquito genus into the Culicidae family by comparing the identity of its secreted salivary compounds to other mosquito salivary proteins identified so far. Results Illumina sequencing resulted in 13,535,229 sequence reads, which were assembled into 3,247 contigs. All families were classified according to their in silico-predicted function/ activity. Annotation of these sequences allowed classification of their products into 83 salivary protein families, twenty (24.39%) of which were confirmed by our subsequent proteome analysis. Two protein families were deorphanized from Aedes and one from Ochlerotatus, while four protein families were described as novel to Psorophora genus because they had no match with any other known mosquito salivary sequence. Several protein families described as exclusive to Culicines were present in Psorophora mosquitoes, while we did not identify any member of the protein families already known as unique to Anophelines. Also, the Psorophora salivary proteins had better identity to homologs in Aedes (69.23%), followed by Ochlerotatus (8.15%), Culex (6.52%), and Anopheles (4.66%), respectively. 
Conclusions This is the first sialome (from the Greek sialo = saliva) catalog of salivary proteins from a Psorophora mosquito, which may be useful for better understanding the lifecycle of this mosquito and the role of its salivary secretion in arboviral transmission. PMID:24330624
Torres, Matthew P; Dewhurst, Henry; Sundararaman, Niveda
2016-11-01
Post-translational modifications (PTMs) regulate protein behavior through modulation of protein-protein interactions, enzymatic activity, and protein stability essential in the translation of genotype to phenotype in eukaryotes. Currently, less than 4% of all eukaryotic PTMs are reported to have biological function - a statistic that continues to decrease with an increasing rate of PTM detection. Previously, we developed SAPH-ire (Structural Analysis of PTM Hotspots) - a method for the prioritization of PTM function potential that has been used effectively to reveal novel PTM regulatory elements in discrete protein families (Dewhurst et al., 2015). Here, we apply SAPH-ire to the set of eukaryotic protein families containing experimental PTM and 3D structure data - capturing 1,325 protein families with 50,839 unique PTM sites organized into 31,747 modified alignment positions (MAPs), of which 2010 (∼6%) possess known biological function. Here, we show that using an artificial neural network model (SAPH-ire NN) trained to identify MAP hotspots with biological function results in prediction outcomes that far surpass the use of single hotspot features, including nearest neighbor PTM clustering methods. We find the greatest enhancement in prediction for positions with PTM counts of five or less, which represent 98% of all MAPs in the eukaryotic proteome and 90% of all MAPs found to have biological function. Analysis of the top 1092 MAP hotspots revealed 267 of truly unknown function (containing 5443 distinct PTMs). Of these, 165 hotspots could be mapped to human KEGG pathways for normal and/or disease physiology. Many high-ranking hotspots were also found to be disease-associated pathogenic sites of amino acid substitution despite the lack of observable PTM in the human protein family member. 
Taken together, these experiments demonstrate that the functional relevance of a PTM can be predicted very effectively by neural network models, revealing a large but testable body of potential regulatory elements that impact hundreds of different biological processes important in eukaryotic biology and human health. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
Dewhurst, Henry; Sundararaman, Niveda
2016-01-01
Post-translational modifications (PTMs) regulate protein behavior through modulation of protein-protein interactions, enzymatic activity, and protein stability essential in the translation of genotype to phenotype in eukaryotes. Currently, less than 4% of all eukaryotic PTMs are reported to have biological function - a statistic that continues to decrease with an increasing rate of PTM detection. Previously, we developed SAPH-ire (Structural Analysis of PTM Hotspots) - a method for the prioritization of PTM function potential that has been used effectively to reveal novel PTM regulatory elements in discrete protein families (Dewhurst et al., 2015). Here, we apply SAPH-ire to the set of eukaryotic protein families containing experimental PTM and 3D structure data - capturing 1,325 protein families with 50,839 unique PTM sites organized into 31,747 modified alignment positions (MAPs), of which 2010 (∼6%) possess known biological function. Here, we show that using an artificial neural network model (SAPH-ire NN) trained to identify MAP hotspots with biological function results in prediction outcomes that far surpass the use of single hotspot features, including nearest neighbor PTM clustering methods. We find the greatest enhancement in prediction for positions with PTM counts of five or less, which represent 98% of all MAPs in the eukaryotic proteome and 90% of all MAPs found to have biological function. Analysis of the top 1092 MAP hotspots revealed 267 of truly unknown function (containing 5443 distinct PTMs). Of these, 165 hotspots could be mapped to human KEGG pathways for normal and/or disease physiology. Many high-ranking hotspots were also found to be disease-associated pathogenic sites of amino acid substitution despite the lack of observable PTM in the human protein family member. 
Taken together, these experiments demonstrate that the functional relevance of a PTM can be predicted very effectively by neural network models, revealing a large but testable body of potential regulatory elements that impact hundreds of different biological processes important in eukaryotic biology and human health. PMID:27697855
Invasion of host cells by malaria parasites: a tale of two protein families.
Iyer, Jayasree; Grüner, Anne Charlotte; Rénia, Laurent; Snounou, Georges; Preiser, Peter R
2007-07-01
Malaria parasites are obligate intracellular parasites whose invasive stages select and invade the unique host cell in which they can develop with exquisite specificity and efficacy. Most studies aimed at elucidating the molecules and the mechanisms implicated in the selection and invasion processes have been conducted on the merozoite, the stage that invades erythrocytes to perpetuate the pathological cycles of parasite multiplication in the blood. Bioinformatic analysis has helped identify the members of two parasite protein families, the reticulocyte-binding protein homologues (RBL) and erythrocyte binding like (EBL), in recently sequenced genomes of different Plasmodium species. In this article we review data from classical studies and gene disruption experiments that are helping to illuminate the role of these proteins in the selection-invasion processes. The manner in which subsets of proteins from each of the families act in concert suggests a model to explain the ability of the parasites to use alternate pathways of invasion. Future perspectives and implications are discussed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Adams, Melanie A.; Udell, Christian M.; Pal, Gour Pada
The crystallization and preliminary X-ray diffraction analysis of MraZ, formerly known as hypothetical protein YabB, from Escherichia coli K-12 is presented. The MraZ family of proteins, also referred to as the UPF0040 family, are highly conserved in bacteria and are thought to play a role in cell-wall biosynthesis and cell division. The murein region A (mra) gene cluster encodes MraZ proteins along with a number of other proteins involved in this complex process. To date, there has been no clear functional assignment provided for MraZ proteins and the structure of a homologue from Mycoplasma pneumoniae, MPN314, failed to suggest a molecular function. The b0081 gene from Escherichia coli that encodes the MraZ protein was cloned and the protein was overexpressed, purified and crystallized. This data is presented along with evidence that the E. coli homologue exists in a different oligomeric state to the MPN314 protein.
McLaughlin, Margaret; Lockhart, Ben; Jordan, Ramon; Denton, Geoff; Mollov, Dimitre
2017-05-01
Clematis chlorotic mottle virus (ClCMV) is a previously undescribed virus associated with symptoms of yellow mottling and veining, chlorotic ring spots, line pattern mosaics, and flower distortion and discoloration on ornamental Clematis. The ClCMV genome is 3,880 nt in length with five open reading frames (ORFs) encoding a 27-kDa protein (ORF 1), an 87-kDa replicase protein (ORF 2), two centrally located movement proteins (ORF 3 and 4), and a 37-kDa capsid protein (ORF 5). Based on morphological, genomic, and phylogenetic analysis, ClCMV is predicted to be a member of the genus Pelarspovirus in the family Tombusviridae.
A novel ATTR L32V mutation causes familial amyloid polyneuropathy in a Bolivian family.
Martínez-Ulloa, Pedro L; Vallejo, Manuela; Corral, Iñigo; García-Barragán, Nuria; Alcazar, Alberto; Martínez-Alonso, Emma; Martínez-Poles, Javier; Pian, Hector; Jiménez-Escrig, Adriano
2017-09-01
We report a new transthyretin (ATTR) gene c.272C>G mutation and variant protein, p.Leu32Val, in a kindred of Bolivian origin with a rapid progressive peripheral neuropathy and cardiomyopathy. Three individuals from a kindred with peripheral nerve and cardiac amyloidosis were examined. Analysis of the TTR gene was performed by Sanger direct sequencing. Neuropathologic examination was obtained on the index patient with mass spectrometry study of the ATTR deposition. Direct DNA sequence analysis of exons 2, 3, and 4 of the TTR gene demonstrated a c.272 C>G mutation in exon 2 (p.L32V). Sural nerve biopsy revealed massive amyloid deposition in the perineurium, endoneurium and vasa nervorum. Mass spectrometric analyses of ATTR immunoprecipitated from nerve biopsy showed the presence of both wild-type and variant proteins. The observed mass results for the wild-type and variant proteins were consistent with the predicted values calculated from the genetic analysis data. The ATTR L32V is associated with a severe course. This has implications for treatment of affected individuals and counseling of family members. © 2017 Peripheral Nerve Society.
Xu, Ning; Zhao, Hong-Yan; Yin, Yin; Shen, Shan-Shan; Shan, Lin-Lin; Chen, Chuan-Xi; Zhang, Yan-Xia; Gao, Jian-Fang; Ji, Xiang
2017-04-21
We conducted an omics-analysis of the venom of Naja kaouthia from China. Proteomics analysis revealed six protein families [three-finger toxins (3-FTx), phospholipase A2 (PLA2), nerve growth factor, snake venom metalloproteinase (SVMP), cysteine-rich secretory protein and ohanin], and venom-gland transcriptomics analysis revealed 28 protein families from 79 unigenes. 3-FTx (56.5% in proteome/82.0% in transcriptome) and PLA2 (26.9%/13.6%) were identified as the most abundant families in venom proteome and venom-gland transcriptome. Furthermore, N. kaouthia venom expressed strong lethality (i.p. LD50: 0.79 μg/g) and myotoxicity (CK: 5939 U/l) in mice, and showed notable activity in PLA2 but weak activity in SVMP, l-amino acid oxidase or 5' nucleotidase. Antivenomic assessment revealed that several venom components (nearly 17.5% of total venom) from N. kaouthia could not be thoroughly immunocaptured by commercial Naja atra antivenom. ELISA analysis revealed that there was no difference in the cross-reaction between N. kaouthia and N. atra venoms against the N. atra antivenom. The use of commercial N. atra antivenom in treatment of snakebites caused by N. kaouthia is reasonable, but the design of novel antivenoms that enhance the immune response to non-immunocaptured components should be encouraged. The venomics, antivenomics and venom-gland transcriptome of the monocled cobra (Naja kaouthia) from China have been elucidated. Quantitative and qualitative differences are evident when venom proteomic and venom-gland transcriptomic profiles are compared. Two protein families (3-FTx and PLA2) are found to be the predominant components in N. kaouthia venom, and are considered the major players in the functional role of the venom. Other protein families with relatively low abundance appear to be of minor functional significance. Antivenomics and ELISA evaluation reveal that the N. kaouthia venom can be effectively immunorecognized by commercial N. atra antivenom, but a small number of venom components still could not be thoroughly immunocaptured. The findings indicate that exploring the precise composition of snake venom should be executed by an integrated omics-approach, and elucidating the venom composition is helpful in understanding composition-function relationships and will facilitate the clinical application of antivenoms. Copyright © 2017 Elsevier B.V. All rights reserved.
Molecular characterization and expression analysis of importin α family genes in rainbow trout
USDA-ARS's Scientific Manuscript database
The importin α/importin β-mediated import pathway plays an essential role in the transport of proteins bearing nuclear localization signals (NLS) into the nucleus. Importin α serves to recognize the cargo proteins. In mammals, 7 importin α proteins (KPNA1 to 7) have been characterized and each impor...
Wang, Yu-Ling; Goh, King-Xiang; Wu, Wen-guey; Chen, Chun-Jung
2004-10-01
Cysteine-rich secretory proteins (CRISPs) play an important role in the innate immune system and are transcriptionally regulated by androgens in several tissues. The proteins are mostly found in the epididymis and granules of mammals, whilst a number of snake venoms also contain CRISP-family proteins. The natrin protein from the venom of Naja atra (Taiwan cobra), which belongs to a family of CRISPs and has a cysteine-rich C-terminal amino-acid sequence, has been purified using a three-stage chromatography procedure and crystals suitable for X-ray analysis have been obtained using the hanging-drop vapour-diffusion method. X-ray diffraction data were collected to 1.58 Å resolution using synchrotron radiation; the crystals belong to space group C222₁, with unit-cell parameters a = 59.172, b = 65.038, c = 243.156 Å. There are two protein molecules in the asymmetric unit and the Matthews coefficient is estimated to be 2.35 Å³ Da⁻¹, corresponding to a solvent content of 47.60%.
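The relationship between the Matthews coefficient and the solvent content quoted above can be illustrated with a short calculation. This is a minimal sketch, not the authors' procedure: it assumes the conventional constant of 1.23 Å³/Da for the protein volume, and, because the abstract does not state the molecular weight, it only back-calculates the weight implied by the reported cell and coefficient. The function names are illustrative.

```python
# Sketch: solvent fraction from a Matthews coefficient V_M (in Å^3 per Da).
# Assumption: the conventional constant 1.23 Å^3/Da is used; the abstract
# does not say which value the authors applied.
PROTEIN_VOLUME_CONSTANT = 1.23


def unit_cell_volume_orthorhombic(a, b, c):
    """Volume of an orthorhombic cell (all angles 90 degrees), in Å^3."""
    return a * b * c


def matthews_coefficient(cell_volume, molecules_per_cell, molecular_weight_da):
    """V_M = cell volume / (molecules per cell * molecular weight)."""
    return cell_volume / (molecules_per_cell * molecular_weight_da)


def solvent_fraction(v_m):
    """Estimated solvent fraction for a given Matthews coefficient."""
    return 1.0 - PROTEIN_VOLUME_CONSTANT / v_m


# Cell parameters and V_M as reported in the abstract.
volume = unit_cell_volume_orthorhombic(59.172, 65.038, 243.156)
# C222(1) has 8 asymmetric units per cell; 2 molecules in each.
molecules_per_cell = 8 * 2
# Molecular weight implied by the reported V_M (not stated in the abstract).
implied_mw = volume / (molecules_per_cell * 2.35)

print(f"cell volume: {volume:.0f} Å^3")
print(f"implied molecular weight: {implied_mw:.0f} Da")
print(f"solvent content: {100 * solvent_fraction(2.35):.1f}%")
```

With V_M = 2.35 the estimate comes out near 47.7%, close to the reported 47.60%; the small difference presumably reflects rounding of the coefficient.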
Jin, Lily L; Wybenga-Groot, Leanne E; Tong, Jiefei; Taylor, Paul; Minden, Mark D; Trudel, Suzanne; McGlade, C Jane; Moran, Michael F
2015-03-01
Src homology 2 (SH2) domains are modular protein structures that bind phosphotyrosine (pY)-containing polypeptides and regulate cellular functions through protein-protein interactions. Proteomics analysis showed that the SH2 domains of Src family kinases are themselves tyrosine phosphorylated in blood system cancers, including acute myeloid leukemia, chronic lymphocytic leukemia, and multiple myeloma. Using the Src family kinase Lyn SH2 domain as a model, we found that phosphorylation at the conserved SH2 domain residue Y(194) impacts the affinity and specificity of SH2 domain binding to pY-containing peptides and proteins. Analysis of the Lyn SH2 domain crystal structure supports a model wherein phosphorylation of Y(194) on the EF loop modulates the binding pocket that engages amino acid side chains at the pY+2/+3 position. These data indicate another level of regulation wherein SH2-mediated protein-protein interactions are modulated by SH2 kinases and phosphatases. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
Evolution-Based Functional Decomposition of Proteins
Rivoire, Olivier; Reynolds, Kimberly A.; Ranganathan, Rama
2016-01-01
The essential biological properties of proteins—folding, biochemical activities, and the capacity to adapt—arise from the global pattern of interactions between amino acid residues. The statistical coupling analysis (SCA) is an approach to defining this pattern that involves the study of amino acid coevolution in an ensemble of sequences comprising a protein family. This approach indicates a functional architecture within proteins in which the basic units are coupled networks of amino acids termed sectors. This evolution-based decomposition has potential for new understandings of the structural basis for protein function. To facilitate its usage, we present here the principles and practice of the SCA and introduce new methods for sector analysis in a python-based software package (pySCA). We show that the pattern of amino acid interactions within sectors is linked to the divergence of functional lineages in a multiple sequence alignment—a model for how sector properties might be differentially tuned in members of a protein family. This work provides new tools for studying proteins and for generally testing the concept of sectors as the principal units of function and adaptive variation. PMID:27254668
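The idea of measuring coevolution between alignment positions can be illustrated with a much simpler statistic than SCA itself. The sketch below computes the mutual information between two columns of a toy alignment; it is a stand-in chosen for brevity, not the SCA formalism and not the pySCA API, and the alignment and function name are invented for illustration.

```python
from collections import Counter
from math import log2


def column_mutual_information(alignment, i, j):
    """Mutual information (bits) between columns i and j of an alignment.

    A simple coevolution proxy; SCA uses a different, conservation-weighted
    correlation measure.
    """
    n = len(alignment)
    freq_i = Counter(seq[i] for seq in alignment)
    freq_j = Counter(seq[j] for seq in alignment)
    freq_ij = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return mi


# Toy alignment (hypothetical): columns 0 and 1 always change together,
# while column 2 varies independently of column 0.
toy_alignment = ["AKL", "AKV", "GEL", "GEV"]

coupled = column_mutual_information(toy_alignment, 0, 1)
independent = column_mutual_information(toy_alignment, 0, 2)
print(coupled, independent)
```

In this toy case the coupled pair carries one bit of mutual information and the independent pair carries none. Real alignments would additionally need sequence weighting and corrections for phylogenetic bias.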
Functional Properties and Genomics of Glucose Transporters
Zhao, Feng-Qi; Keating, Aileen F
2007-01-01
Glucose is the major energy source for mammalian cells as well as an important substrate for protein and lipid synthesis. Mammalian cells take up glucose from extracellular fluid into the cell through two families of structurally related glucose transporters. The facilitative glucose transporter family (solute carriers SLC2A, protein symbol GLUT) mediates a bidirectional and energy-independent process of glucose transport in most tissues and cells, while the Na+/glucose cotransporter family (solute carriers SLC5A, protein symbol SGLT) mediates an active, Na+-linked transport process against an electrochemical gradient. The GLUT family consists of thirteen members (GLUT1-12 and HMIT). Phylogenetically, the members of the GLUT family are split into three classes based on protein similarities. Up to now, at least six members of the SGLT family have been cloned (SGLT1-6). In this review, we report both the genomic structure and function of each transporter as well as intra-species comparative genomic analysis of some of these transporters. The affinity for glucose and transport kinetics of each transporter differs and ranges from 0.2 to 17 mM. The ability of each protein to transport alternative substrates also differs and includes substrates such as fructose and galactose. In addition, the tissue distribution pattern varies between species. There are different regulation mechanisms of these transporters. Characterization of transcriptional control of some of the gene promoters has been investigated and alternative promoter usage to generate different protein isoforms has been demonstrated. We also introduce some pathophysiological roles of these transporters in humans. PMID:18660845
Hawley, Robert G; Chen, Yuzhong; Riz, Irene; Zeng, Chen
2012-05-04
In this study, we utilized an integrated bioinformatics and computational biology approach in search of new BH3-only proteins belonging to the BCL2 family of apoptotic regulators. The BH3 (BCL2 homology 3) domain mediates specific binding interactions among various BCL2 family members. It is composed of an amphipathic α-helical region of approximately 13 residues that has only a few amino acids that are highly conserved across all members. Using a generalized motif, we performed a genome-wide search for novel BH3-containing proteins in the NCBI Consensus Coding Sequence (CCDS) database. In addition to known pro-apoptotic BH3-only proteins, 197 proteins were recovered that satisfied the search criteria. These were categorized according to α-helical content and predictive binding to BCL-xL (encoded by BCL2L1) and MCL-1, two representative anti-apoptotic BCL2 family members, using position-specific scoring matrix models. Notably, the list is enriched for proteins associated with autophagy as well as a broad spectrum of cellular stress responses such as endoplasmic reticulum stress, oxidative stress, antiviral defense, and the DNA damage response. Several potential novel BH3-containing proteins are highlighted. In particular, the analysis strongly suggests that the apoptosis inhibitor and DNA damage response regulator, AVEN, which was originally isolated as a BCL-xL-interacting protein, is a functional BH3-only protein representing a distinct subclass of BCL2 family members.
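A motif-based scan of the kind described above can be sketched with a regular expression. The pattern below is a deliberately loose, hypothetical BH3-like pattern chosen only for illustration; the abstract does not give the generalized motif used in the study, and the sequences are invented toy data rather than CCDS entries.

```python
import re

# Hypothetical pattern (NOT the study's motif): Leu, any three residues,
# a small residue (Gly/Ala/Ser), then Asp. The lookahead lets overlapping
# matches be reported.
BH3_LIKE = re.compile(r"(?=(L.{3}[GAS]D))")


def find_motif_hits(sequences):
    """Return {name: [(1-based start, matched segment), ...]} for each hit."""
    hits = {}
    for name, seq in sequences.items():
        positions = [(m.start() + 1, m.group(1)) for m in BH3_LIKE.finditer(seq)]
        if positions:
            hits[name] = positions
    return hits


# Invented toy sequences for demonstration only.
toy_sequences = {
    "toy_protein_A": "MKALRRIGDELNAY",
    "toy_protein_B": "MSTPQQWWKK",
}

hits = find_motif_hits(toy_sequences)
print(hits)
```

A scan like this only produces candidates; the study additionally filtered by predicted α-helical content and scored binding with position-specific scoring matrices, which this sketch does not attempt.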
Retracing Evolution of Red Fluorescence in GFP-Like Proteins from Faviina Corals
Field, Steven F.; Matz, Mikhail V.
2010-01-01
Proteins of the green fluorescent protein family represent a convenient experimental model to study evolution of novelty at the molecular level. Here, we focus on the origin of Kaede-like red fluorescent proteins characteristic of the corals of the Faviina suborder. We demonstrate, using an original approach involving resurrection and analysis of the library of possible evolutionary intermediates, that it takes on the order of 12 mutations, some of which strongly interact epistatically, to fully recapitulate the evolution of a red fluorescent phenotype from the ancestral green. Five of the identified mutations would not have been found without the help of ancestral reconstruction, because the corresponding site states are shared between extant red and green proteins due to their recent descent from a dual-function common ancestor. Seven of the 12 mutations affect residues that are not in close contact with the chromophore and thus must exert their effect indirectly through adjustments of the overall protein fold; the relevance of these mutations could not have been anticipated from the purely theoretical analysis of the protein's structure. Our results introduce a powerful experimental approach for comparative analysis of functional specificity in protein families even in the cases of pronounced epistasis, provide foundation for the detailed studies of evolutionary trajectories leading to novelty and complexity, and will help rational modification of existing fluorescent labels. PMID:19793832
Semova, Natalia; Kapanadze, Bagrat; Corcoran, Martin; Kutsenko, Alexei; Baranova, Ancha; Semov, Alexandre
2003-09-01
IRLB was originally identified as a partial cDNA clone, encoding a 191-aa protein binding the interferon-stimulated response element (ISRE) in the P2 promoter of human MYC. Here, we cloned the full-size IRLB using different bioinformatics tools and an RT-PCR approach. The full-size gene encompasses 131 kb within chromosome 15q22 and consists of 32 exons. IRLB is transcribed as a 6.6-kb mRNA encoding a protein of 1865 aa. IRLB is ubiquitously expressed and its expression is regulated in a growth- and cell cycle-dependent manner. In addition to the ISRE-binding domain IRLB contains a tripartite DENN domain, a nuclear localization signal, two PPRs, and a calmodulin-binding domain. The presence of DENN domains predicts possible interactions of IRLB with GTPases from the Rab family or regulation of growth-induced MAPKs. Strongly homologous proteins were identified in all available vertebrate genomes as well as in Caenorhabditis elegans and Drosophila melanogaster. In human and mouse a family of IRLB proteins exists, consisting of at least three members.
Ravagnani, Adriana; Finan, Christopher L; Young, Michael
2005-03-17
In Micrococcus luteus, growth and resuscitation from starvation-induced dormancy are controlled by the production of a secreted growth factor. This autocrine resuscitation-promoting factor (Rpf) is the founder member of a family of proteins found throughout and confined to the actinobacteria (high G + C Gram-positive bacteria). The aim of this work was to search for and characterise a cognate gene family in the firmicutes (low G + C Gram-positive bacteria) and obtain information about how they may control bacterial growth and resuscitation. In silico analysis of the accessory domains of the Rpf proteins permitted their classification into several subfamilies. The RpfB subfamily is related to a group of firmicute proteins of unknown function, represented by YabE of Bacillus subtilis. The actinobacterial RpfB and firmicute YabE proteins have very similar domain structures and genomic contexts, except that in YabE, the actinobacterial Rpf domain is replaced by another domain, which we have called Sps. Although totally unrelated in both sequence and secondary structure, the Rpf and Sps domains fulfil the same function. We propose that these proteins have undergone "non-orthologous domain displacement", a phenomenon akin to "non-orthologous gene displacement" that has been described previously. Proteins containing the Sps domain are widely distributed throughout the firmicutes and they too fall into a number of distinct subfamilies. Comparative analysis of the accessory domains in the Rpf and Sps proteins, together with their weak similarity to lytic transglycosylases, provides clear evidence that they are muralytic enzymes. The results indicate that the firmicute Sps proteins and the actinobacterial Rpf proteins are cognate and that they control bacterial culturability via enzymatic modification of the bacterial cell envelope.
RecA family proteins in archaea: RadA and its cousins.
Haldenby, Sam; White, Malcolm F; Allers, Thorsten
2009-02-01
Recombinases of the RecA family are essential for homologous recombination and underpin genome stability, by promoting the repair of double-stranded DNA breaks and the rescue of collapsed DNA replication forks. Until now, our understanding of homologous recombination has relied on studies of bacterial and eukaryotic model organisms. Archaea provide new opportunities to study how recombination operates in a lineage distinct from bacteria and eukaryotes. In the present paper, we focus on RadA, the archaeal RecA family recombinase, and its homologues in archaea and other domains. On the basis of phylogenetic analysis, we propose that a family of archaeal proteins with a single RecA domain, which are currently annotated as KaiC, be renamed aRadC.
Protein arginine methylation: Cellular functions and methods of analysis.
Pahlich, Steffen; Zakaryan, Rouzanna P; Gehring, Heinz
2006-12-01
During the last few years, new members of the growing family of protein arginine methyltransferases (PRMTs) have been identified and the role of arginine methylation in manifold cellular processes like signaling, RNA processing, transcription, and subcellular transport has been extensively investigated. In this review, we describe recent methods and findings that have yielded new insights into the cellular functions of arginine-methylated proteins, and we evaluate the currently used procedures for the detection and analysis of arginine methylation.
2013-01-01
Background Xeroderma Pigmentosum (XP) is a rare skin disorder characterized by skin hypersensitivity to sunlight and abnormal pigmentation. The aim of this study was to investigate the genetic cause of a severe XP phenotype in a consanguineous Pakistani family and to characterize in silico any identified disease-associated mutation. Results The XP complementation group was assigned by genotyping the family for known XP loci. Genotyping data mapped the family to the complementation group A locus, which involves the XPA gene. Mutation analysis of the candidate XP gene by DNA sequencing revealed a novel deletion mutation (c.654del A) in exon 5 of the XPA gene. The c.654del A deletion causes a frameshift that prematurely terminates the protein and results in a truncated product of 222 amino acid (aa) residues instead of 273 (p.Lys218AsnfsX5). In silico tools were applied to study the likelihood of changes in structural motifs and thus in the interaction of the mutated protein with binding partners. In silico analysis of the mutant protein sequence predicted that the mutation affects aa residues that adopt a coiled-coil structure. The coiled-coil structure has an important role in key cellular interactions, especially with DNA damage-binding protein 2 (DDB2), which has an important role in the DDB-mediated nucleotide excision repair (NER) system. Conclusions Our findings support the genetic and clinical heterogeneity of XP. The study also predicts a critical role for the DDB2-binding region of the XPA protein in the NER pathway and opens an avenue for further research into the functional role of the mutated protein domain. PMID:24063568
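As a small worked check of the numbers in this abstract, the sketch below maps the coding-DNA position of the deletion to its codon and computes how many residues the truncated product is reported to lose. It assumes the usual convention that coding position 1 is the first base of the start codon; the function name is illustrative and not from the paper.

```python
def codon_of_cdna_position(c_pos):
    """Map a 1-based coding-DNA position to (codon number, position within codon 1..3)."""
    if c_pos < 1:
        raise ValueError("coding positions are 1-based")
    codon_number = (c_pos - 1) // 3 + 1
    position_in_codon = (c_pos - 1) % 3 + 1
    return codon_number, position_in_codon


# Position of the deleted base reported in the abstract (c.654del A).
codon, offset = codon_of_cdna_position(654)
print(codon, offset)  # codon 218, third base of that codon

# Lengths quoted in the abstract: 273 aa wild type, 222 aa truncated product.
residues_lost = 273 - 222
print(residues_lost)
```

The deleted base falls in codon 218, which agrees with the protein-level description p.Lys218AsnfsX5: the first altered residue is at position 218.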
Comprehensive analysis of orthologous protein domains using the HOPS database.
Storm, Christian E V; Sonnhammer, Erik L L
2003-10-01
One of the most reliable methods for protein function annotation is to transfer experimentally known functions from orthologous proteins in other organisms. Most methods for identifying orthologs operate on a subset of organisms with a completely sequenced genome, and treat proteins as single-domain units. However, it is well known that proteins are often made up of several independent domains, and there is a wealth of protein sequences from genomes that are not completely sequenced. A comprehensive set of protein domain families is found in the Pfam database. We wanted to apply orthology detection to Pfam families, but first some issues needed to be addressed. First, orthology detection becomes impractical and unreliable when too many species are included. Second, shorter domains contain less information. It is therefore important to assess the quality of the orthology assignment and avoid very short domains altogether. We present a database of orthologous protein domains in Pfam called HOPS: Hierarchical grouping of Orthologous and Paralogous Sequences. Orthology is inferred in a hierarchic system of phylogenetic subgroups using ortholog bootstrapping. To avoid the frequent errors stemming from horizontally transferred genes in bacteria, the analysis is presently limited to eukaryotic genes. The results are accessible in the graphical browser NIFAS, a Java tool originally developed for analyzing phylogenetic relations within Pfam families. The method was tested on a set of curated orthologs with experimentally verified function. In comparison to tree reconciliation with a complete species tree, our approach finds significantly more orthologs in the test set. Examples for investigating gene fusions and domain recombination using HOPS are given.
Heinz, Eva; Stubenrauch, Christopher J.; Grinter, Rhys; Croft, Nathan P.; Purcell, Anthony W.; Strugnell, Richard A.; Dougan, Gordon; Lithgow, Trevor
2016-01-01
The bacterial cell surface proteins intimin and invasin are virulence factors that share a common domain structure and bind selectively to host cell receptors in the course of bacterial pathogenesis. The β-barrel domains of intimin and invasin show significant sequence and structural similarities. Conversely, a variety of proteins with sometimes limited sequence similarity have also been annotated as “intimin-like” and “invasin” in genome datasets, while other recent work on apparently unrelated virulence-associated proteins ultimately revealed similarities to intimin and invasin. Here we characterize the sequence and structural relationships across this complex protein family. Surprisingly, intimins and invasins represent a very small minority of the sequence diversity in what has previously been called the “intimin/invasin protein family”. Analysis of the assembly pathway for expression of the classic intimin, EaeA, and a characteristic example of the most prevalent members of the group, FdeC, revealed a dependence on the translocation and assembly module as a common feature for both these proteins. While the majority of the sequences in the grouping are most similar to FdeC, a further, widespread group consists of two-partner secretion systems that use the β-barrel domain as the delivery device for secretion of a variety of virulence factors. This comprehensive analysis supports the adoption of the “inverse autotransporter protein family” as the most accurate nomenclature for the family and, in turn, has important consequences for our overall understanding of the Type V secretion systems of bacterial pathogens. PMID:27190006
Johnson, Glynis; Moore, Samuel W
2013-09-01
Short linear motifs confer evolutionary flexibility on proteins as they can be added with relative ease allowing the acquisition of new functions. Such motifs may mediate a variety of signalling functions. The adhesion-mediating Leu-Arg-Glu (LRE) motif is enriched in laminin beta 2, and has been observed in other proteins, including members of the carboxylesterase/cholinesterase family. It acts as a stop signal for growing axons in the developing neuromuscular junction, binding to the voltage-gated calcium channel. In this bioinformatic analysis, we have investigated the presence of the motif in proteins of the neuromuscular junction, and have also examined its structural position and potential for ligand interaction, as well as phylogenetic conservation, in the carboxylesterase/cholinesterase family. The motif was observed to occur with a significantly higher frequency than expected in the UniProt/Swiss-Prot database, as well as in four individual species (human, mouse, Caenorhabditis elegans and Drosophila melanogaster). Examination of its presence in neuromuscular junction proteins showed it to be enriched in certain proteins of the synaptic basement membrane, including laminin, agrin, acetylcholinesterase and tenascin. A highly significant enrichment was observed in cytoskeletal proteins, particularly intermediate filament proteins and members of the spectrin family. In the carboxylesterase/cholinesterase family, the motif was observed in four conserved positions in the protein structure. It is present in the majority of mammalian acetylcholinesterases, as well as acetylcholinesterases from electric fish and a number of invertebrates. In insects, it is present in the ace-2, rather than in the synaptic ace-1, enzyme. It is also observed in the cholinesterase-like adhesion molecules (neuroligins, neurotactin and glutactin). It is never seen in butyrylcholinesterases, which do not mediate cell adhesion. 
In conclusion, the significant enrichment of the motif in certain classes of protein, as well as its conserved presence and structural positioning in one protein family, suggests that it has specific functions both in cell adhesion in the neuromuscular junction and in maintaining the structural integrity of the cytoskeleton. Copyright © 2013 Elsevier Inc. All rights reserved.
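The comparison of observed against expected motif frequency can be illustrated with a minimal sketch. The abstract does not describe the statistical model used, so the null model here is an assumption: residues are treated as independent draws from the pooled amino acid composition of the input sequences. The toy sequences are invented.

```python
from collections import Counter


def count_overlapping(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def motif_enrichment(sequences, motif="LRE"):
    """Return (observed count, expected count) under a composition-only null model."""
    composition = Counter("".join(sequences))
    total = sum(composition.values())
    # Probability of the motif at one window if residues were independent.
    p_motif = 1.0
    for residue in motif:
        p_motif *= composition[residue] / total
    # Number of windows in which the motif could start.
    windows = sum(max(len(s) - len(motif) + 1, 0) for s in sequences)
    observed = sum(count_overlapping(s, motif) for s in sequences)
    return observed, windows * p_motif


# Invented toy data, for illustration only.
toy_sequences = ["LRELRE", "AAAAAA"]
observed, expected = motif_enrichment(toy_sequences)
print(observed, expected, observed / expected)
```

A ratio well above 1 suggests enrichment relative to this simple null; a real analysis would also need a significance test and a background appropriate to the database being searched.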
Evolutionary Descent of Prion Genes from the ZIP Family of Metal Ion Transporters
Schmitt-Ulms, Gerold; Ehsani, Sepehr; Watts, Joel C.; Westaway, David; Wille, Holger
2009-01-01
In the more than twenty years since its discovery, both the phylogenetic origin and cellular function of the prion protein (PrP) have remained enigmatic. Insights into a possible function of PrP may be obtained through the characterization of its molecular neighborhood in cells. Quantitative interactome data demonstrated the spatial proximity of two metal ion transporters of the ZIP family, ZIP6 and ZIP10, to mammalian prion proteins in vivo. A subsequent bioinformatic analysis revealed the unexpected presence of a PrP-like amino acid sequence within the N-terminal, extracellular domain of a distinct sub-branch of the ZIP protein family that includes ZIP5, ZIP6 and ZIP10. Additional structural threading and orthologous sequence alignment analyses argued that the prion gene family is phylogenetically derived from a ZIP-like ancestral molecule. The level of sequence homology and the presence of prion protein genes in most chordate species place the split from the ZIP-like ancestor gene at the base of the chordate lineage. This relationship explains structural and functional features found within mammalian prion proteins as elements of an ancient involvement in the transmembrane transport of divalent cations. The phylogenetic and spatial connection to ZIP proteins is expected to open new avenues of research to elucidate the biology of the prion protein in health and disease. PMID:19784368
Multi-Harmony: detecting functional specificity from sequence alignment
Brandt, Bernd W.; Feenstra, K. Anton; Heringa, Jaap
2010-01-01
Many protein families contain sub-families with functional specialization, such as binding different ligands or being involved in different protein–protein interactions. A small number of amino acids generally determine functional specificity. The identification of these residues can aid the understanding of protein function and help in finding targets for experimental analysis. Here, we present multi-Harmony, an interactive web server for detecting sub-type-specific sites in proteins starting from a multiple sequence alignment. Combining our Sequence Harmony (SH) and multi-Relief (mR) methods in one web server allows simultaneous analysis and comparison of specificity residues; furthermore, both methods have been significantly improved and extended. SH has been extended to cope with more than two sub-groups. mR has been changed from a sampling implementation to a deterministic one, making it more consistent and user friendly. For both methods Z-scores are reported. The multi-Harmony web server produces a dynamic output page, which includes interactive connections to the Jalview and Jmol applets, thereby allowing interactive analysis of the results. Multi-Harmony is available at http://www.ibi.vu.nl/programs/shmrwww. PMID:20525785
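To give a feel for what detecting sub-type-specific sites from an alignment involves, the sketch below ranks alignment columns by how differently two sub-groups use amino acids. It uses the Jensen-Shannon divergence as a stand-in score; this is not the published Sequence Harmony or multi-Relief formula, and the function names and toy alignment are assumptions made for illustration.

```python
import math
from collections import Counter


def column_frequencies(column):
    """Relative residue frequencies in one alignment column, ignoring gaps."""
    residues = [r for r in column if r != "-"]
    counts = Counter(residues)
    n = len(residues)
    return {r: c / n for r, c in counts.items()} if n else {}


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint distributions."""
    score = 0.0
    for r in set(p) | set(q):
        pr, qr = p.get(r, 0.0), q.get(r, 0.0)
        mr = 0.5 * (pr + qr)
        if pr > 0:
            score += 0.5 * pr * math.log2(pr / mr)
        if qr > 0:
            score += 0.5 * qr * math.log2(qr / mr)
    return score


def rank_specificity_sites(group_a, group_b):
    """Return [(1-based column, score)] sorted from most to least group-specific."""
    scores = []
    for i in range(len(group_a[0])):
        p = column_frequencies([s[i] for s in group_a])
        q = column_frequencies([s[i] for s in group_b])
        scores.append((i + 1, js_divergence(p, q)))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Invented toy alignment: two sub-groups, three columns, equal-length rows.
group_a = ["ACD", "ACD"]
group_b = ["ACE", "GCE"]
ranked = rank_specificity_sites(group_a, group_b)
print(ranked)
```

Columns where the two groups share no residues score highest, and fully conserved columns score zero. Columns that are all gaps in one group are not treated specially in this sketch and should be filtered out before ranking.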
Analysis of gene expression profile microarray data in complex regional pain syndrome.
Tan, Wulin; Song, Yiyan; Mo, Chengqiang; Jiang, Shuangjian; Wang, Zhongxing
2017-09-01
The aim of the present study was to predict key genes and proteins associated with complex regional pain syndrome (CRPS) using bioinformatics analysis. The gene expression profiling microarray data, GSE47603, which included peripheral blood samples from 4 patients with CRPS and 5 healthy controls, was obtained from the Gene Expression Omnibus (GEO) database. The differentially expressed genes (DEGs) in CRPS patients compared with healthy controls were identified using the GEO2R online tool. Functional enrichment analysis was then performed using The Database for Annotation Visualization and Integrated Discovery online tool. Protein‑protein interaction (PPI) network analysis was subsequently performed using Search Tool for the Retrieval of Interaction Genes database and analyzed with Cytoscape software. A total of 257 DEGs were identified, including 243 upregulated genes and 14 downregulated ones. Genes in the human leukocyte antigen (HLA) family were most significantly differentially expressed. Enrichment analysis demonstrated that signaling pathways, including immune response, cell motion, adhesion and angiogenesis were associated with CRPS. PPI network analysis revealed that key genes, including early region 1A binding protein p300 (EP300), CREB‑binding protein (CREBBP), signal transducer and activator of transcription (STAT)3, STAT5A and integrin α M were associated with CRPS. The results suggest that the immune response may therefore serve an important role in CRPS development. In addition, genes in the HLA family, such as HLA‑DQB1 and HLA‑DRB1, may present potential biomarkers for the diagnosis of CRPS. Furthermore, EP300, its paralog CREBBP, and the STAT family genes, STAT3 and STAT5 may be important in the development of CRPS.
Eichenberger, Ramon M; Ramakrishnan, Chandra; Russo, Giancarlo; Deplazes, Peter; Hehl, Adrian B
2017-06-13
Infections of dogs with virulent strains of Babesia canis are characterized by rapid onset and high mortality, comparable to complicated human malaria. As in other apicomplexan parasites, most Babesia virulence factors responsible for survival and pathogenicity are secreted to the host cell surface and beyond where they remodel and biochemically modify the infected cell interacting with host proteins in a very specific manner. Here, we investigated factors secreted by B. canis during acute infections in dogs and report on in silico predictions and experimental analysis of the parasite's exportome. As a backdrop, we generated a fully annotated B. canis genome sequence of a virulent Hungarian field isolate (strain BcH-CHIPZ) underpinned by extensive genome-wide RNA-seq analysis. We find evidence for conserved factors in apicomplexan hemoparasites involved in immune-evasion (e.g. VESA-protein family), proteins secreted across the iRBC membrane into the host bloodstream (e.g. SA- and Bc28 protein families), potential moonlighting proteins (e.g. profilin and histones), and uncharacterized antigens present during acute crisis in dogs. The combined data provides a first predicted and partially validated set of potential virulence factors exported during fatal infections, which can be exploited for urgently needed innovative intervention strategies aimed at facilitating diagnosis and management of canine babesiosis.
Valenzuela-Muñoz, Valentina; Sturm, Armin; Gallardo-Escárate, Cristian
2015-04-09
The ATP-binding cassette (ABC) protein family comprises membrane proteins involved in the transport of various biomolecules across the cellular membrane. These proteins have been identified in all taxa and have important physiological functions, including insecticide detoxification in arthropods. For that reason the ectoparasite Caligus rogercresseyi represents a model species for understanding the molecular underpinnings of insecticide drug resistance. Illumina sequencing was performed using sea lice exposed to 2 and 3 ppb of deltamethrin and azamethiphos. Contigs obtained from de novo assembly were annotated by Blastx. RNA-Seq analysis was performed and validated by qPCR analysis. From the transcriptome database of C. rogercresseyi, 57 putative ABC protein sequences were identified and phylogenetically classified into the eight subfamilies described for ABC transporters in arthropods. Transcriptomic profiles for the ABC protein subfamilies were evaluated throughout C. rogercresseyi development. Moreover, RNA-Seq analysis was performed for adult male and female salmon lice exposed to the delousing drugs azamethiphos and deltamethrin. High transcript levels of the ABCB and ABCC subfamilies were observed. Furthermore, SNP mining was carried out on the ABC protein sequences, revealing pivotal genomic information. The present study gives a comprehensive transcriptome analysis of ABC proteins from C. rogercresseyi, providing relevant information about transporter roles during ontogeny and in relation to delousing drug responses in salmon lice. This genomic information represents a valuable tool for pest management in the Chilean salmon aquaculture industry.
Liu, Zhao; Ge, Xiaoyang; Yang, Zuoren; Zhang, Chaojun; Zhao, Ge; Chen, Eryong; Liu, Ji; Zhang, Xueyan; Li, Fuguang
2017-06-12
Sucrose non-fermenting-1-related protein kinase 2 (SnRK2) is a plant-specific serine/threonine kinase family that is involved in the abscisic acid (ABA) signaling pathway and responds to osmotic stress. A genome-wide analysis of this protein family has been conducted previously in some plant species, but little is known about SnRK2 genes in upland cotton (Gossypium hirsutum L.). The recent release of the G. hirsutum genome sequence provides an opportunity to identify and characterize the SnRK2 kinase family in upland cotton. We identified 20 putative SnRK2 sequences in the G. hirsutum genome, designated as GhSnRK2.1 to GhSnRK2.20. All of the sequences encoded hydrophilic proteins. Phylogenetic analysis showed that the GhSnRK2 genes were classifiable into three groups. The chromosomal location and phylogenetic analysis of the cotton SnRK2 genes indicated that segmental duplication likely contributed to the diversification and evolution of the genes. The gene structure and motif composition of the cotton SnRK2 genes were analyzed. Nine exons were conserved in length among all members of the GhSnRK2 family. Although the C-terminus was divergent, seven conserved motifs were present. All GhSnRK2 genes showed expression patterns under abiotic stress based on transcriptome data. The expression profiles of five selected genes were verified in various tissues by quantitative real-time RT-PCR (qRT-PCR). Transcript levels of some family members were up-regulated in response to drought, salinity or ABA treatments, consistent with potential roles in response to abiotic stress. This study is the first comprehensive analysis of SnRK2 genes in upland cotton. Our results provide fundamental information for the functional dissection of GhSnRK2s and a resource for improving plant stress tolerance using GhSnRK2s.
CLK-1/Coq7p is a DMQ mono-oxygenase and a new member of the di-iron carboxylate protein family.
Rea, S
2001-12-14
Strains of Caenorhabditis elegans mutant for clk-1 exhibit a 20-40% increase in mean lifespan. clk-1 encodes a mitochondrial protein thought to be either an enzyme or a regulatory molecule acting within the ubiquinone biosynthesis pathway. Here CLK-1 is shown to be related to the ubiquinol oxidase, alternative oxidase, and to belong to the functionally diverse di-iron-carboxylate protein family, which includes bacterioferritin and methane mono-oxygenase. Construction and analysis of a homology model indicates that CLK-1 is a 2-polyprenyl-3-methyl-6-methoxy-1,4-benzoquinone mono-oxygenase, as originally predicted. Analysis of known CLK-1/Coq7p mutations also supports this notion. These findings raise the possibility of developing CLK-1-specific inhibitors to test for lifespan extension in higher organisms.
Arboretum and Puerto Almendras viruses: two novel rhabdoviruses isolated from mosquitoes in Peru.
Vasilakis, Nikos; Castro-Llanos, Fanny; Widen, Steven G; Aguilar, Patricia V; Guzman, Hilda; Guevara, Carolina; Fernandez, Roberto; Auguste, Albert J; Wood, Thomas G; Popov, Vsevolod; Mundal, Kirk; Ghedin, Elodie; Kochel, Tadeusz J; Holmes, Edward C; Walker, Peter J; Tesh, Robert B
2014-04-01
Arboretum virus (ABTV) and Puerto Almendras virus (PTAMV) are two mosquito-associated rhabdoviruses isolated from pools of Psorophora albigenu and Ochlerotatus fulvus mosquitoes, respectively, collected in the Department of Loreto, Peru, in 2009. Initial tests suggested that both viruses were novel rhabdoviruses and this was confirmed by complete genome sequencing. Analysis of their 11 482 nt (ABTV) and 11 876 nt (PTAMV) genomes indicates that they encode the five canonical rhabdovirus structural proteins (N, P, M, G and L) with an additional gene (U1) encoding a small hydrophobic protein. Evolutionary analysis of the L protein indicates that ABTV and PTAMV are novel and phylogenetically distinct rhabdoviruses that cannot be classified as members of any of the eight currently recognized genera within the family Rhabdoviridae, highlighting the vast diversity of this virus family. PMID:24421116
Vasilakis, Nikos; Widen, Steven; Mayer, Sandra V; Seymour, Robert; Wood, Thomas G; Popov, Vsevolov; Guzman, Hilda; Travassos da Rosa, Amelia P A; Ghedin, Elodie; Holmes, Edward C; Walker, Peter J; Tesh, Robert B
2013-09-01
Members of the family Rhabdoviridae have been assigned to eight genera but many remain unassigned. Rhabdoviruses have a remarkably diverse host range that includes terrestrial and marine animals, invertebrates and plants. Transmission of some rhabdoviruses often requires an arthropod vector, such as mosquitoes, midges, sandflies, ticks, aphids and leafhoppers, in which they replicate. Herein we characterize Niakha virus (NIAV), a previously uncharacterized rhabdovirus isolated from phlebotomine sandflies in Senegal. Analysis of the 11,124 nt genome sequence indicates that it encodes the five common rhabdovirus proteins with alternative ORFs in the M, G and L genes. Phylogenetic analysis of the L protein indicates that NIAV's closest relative is Oak Vale rhabdovirus, although in this analysis NIAV is still so phylogenetically distinct that it might be classified as distinct from the eight currently recognized Rhabdoviridae genera. This observation highlights the vast, and not yet fully recognized, diversity of this family. Copyright © 2013 Elsevier Inc. All rights reserved.
Evolution and functional divergence of the anoctamin family of membrane proteins
2010-01-01
Background The anoctamin family of transmembrane proteins is found in all eukaryotes and consists of 10 members in vertebrates. Ano1 and ano2 were observed to have Ca2+-activated Cl- channel activity. Recent findings, however, have revealed that ano6 and ano7 can also produce chloride currents, although with different properties. In contrast, ano9 and ano10 suppress baseline Cl- conductance when co-expressed with ano1, thus suggesting that different anoctamins can interfere with each other. In order to elucidate the intrinsic functional diversity and the underlying evolutionary mechanism among anoctamins, we performed a comprehensive bioinformatics analysis of the anoctamin gene family. Results Our results show that anoctamin protein paralogs evolved from several gene duplication events followed by functional divergence of vertebrate anoctamins. Most of the amino acid replacements responsible for the functional divergence were fixed by adaptive evolution, and this seems to be a common pattern in anoctamin gene family evolution. Strong purifying selection and the loss of many gene duplication products indicate rigid structure-function relationships among anoctamins. Conclusions Our study suggests that anoctamins have evolved by a series of duplication events, and that they are constrained by purifying selection. In addition, we identified a number of protein domains and amino acid residues that contribute to the predicted functional divergence. Hopefully, this work will facilitate future functional characterization of the anoctamin membrane protein family. PMID:20964844
Suárez-Ortegón, M F; Arbeláez, A; Mosquera, M; Méndez, F; Aguilar-de Plata, C
2012-08-01
Ferritin levels have been associated with metabolic syndrome and insulin resistance. The aim of the present study was to evaluate the prediction of ferritin levels by variables related to cardiometabolic disease risk in a multivariate analysis. For this aim, 123 healthy women (72 premenopausal and 51 postmenopausal) were recruited. Data were collected through anthropometric measurements, questionnaires on personal/familial antecedents and dietary intake (24-h recall), and biochemical determinations (ferritin, C-reactive protein (CRP), glucose, insulin, and lipid profile) in blood serum samples. Multiple linear regression analysis was used, and variables that were not normally distributed were log-transformed for this analysis. In premenopausal women, a model to explain log-ferritin levels was found with log-CRP levels, heart attack familial history, and waist circumference as independent predictors. Ferritin behaves like other cardiovascular markers in that its levels are predicted by documented predictors of cardiometabolic disease and related disorders. This is the first report of a relationship between heart attack familial history and ferritin levels. Further research is required to evaluate the mechanism that explains the relationship of central body fat and heart attack familial history with body iron store values.
Weekes, Michael P; Antrobus, Robin; Talbot, Suzanne; Hör, Simon; Simecek, Nikol; Smith, Duncan L; Bloor, Stuart; Randow, Felix; Lehner, Paul J
2012-03-02
The endoplasmic reticulum chaperone gp96 is required for the cell surface expression of a narrow range of proteins, including toll-like receptors (TLRs) and integrins. To identify a more comprehensive repertoire of proteins whose cell surface expression is dependent on gp96, we developed plasma membrane profiling (PMP), a technique that combines SILAC labeling with selective cell surface aminooxy-biotinylation. This approach allowed us to compare the relative abundance of plasma membrane (PM) proteins on gp96-deficient versus gp96-reconstituted murine pre-B cells. Analysis of unfractionated tryptic peptides initially identified 113 PM proteins, which extended to 706 PM proteins using peptide prefractionation. We confirmed a requirement for gp96 in the cell surface expression of certain TLRs and integrins and found a marked decrease in cell surface expression of four members of the extended LDL receptor family (LDLR, LRP6, Sorl1 and LRP8) in the absence of gp96. Other novel gp96 client proteins included CD180/Ly86, important in the B-cell response to lipopolysaccharide. We highlight common structural motifs in these client proteins that may be recognized by gp96, including the beta-propeller and leucine-rich repeat. This study therefore identifies the extended LDL receptor family as an important new family of proteins whose cell surface expression is regulated by gp96. PMID:22292497
ACLAME: a CLAssification of Mobile genetic Elements, update 2010.
Leplae, Raphaël; Lima-Mendez, Gipsi; Toussaint, Ariane
2010-01-01
The ACLAME database is dedicated to the collection, analysis and classification of sequenced mobile genetic elements (MGEs, in particular phages and plasmids). In addition to providing information on MGE content, classifications are available at various levels of organization. At the gene/protein level, families group similar sequences that are expected to share the same function. Families of four or more proteins are manually assigned a functional annotation using the GeneOntology and the locally developed ontology MeGO, which is dedicated to MGEs. At the genome level, evolutionary cohesive modules group sets of protein families shared among MGEs. At the population level, networks display the reticulate evolutionary relationships among MGEs. To increase the coverage of the phage sequence space, ACLAME version 0.4 incorporates 760 high-quality predicted prophages selected from the Prophinder database. Most of the data can be downloaded from the freely accessible ACLAME web site (http://aclame.ulb.ac.be). The BLAST interface for querying the database has been extended and numerous tools for in-depth analysis of the results have been added.
Cheng, Y; Yao, Z P; Ruan, M Y; Ye, Q J; Wang, R Q; Zhou, G Z; Luo, J
2016-09-23
The WRKY family is one of the most important transcription factor families in plants, involved in the regulation of a broad range of biological roles. The recent releases of whole-genome sequences of pepper (Capsicum annuum L.) allow us to perform a genome-wide identification and characterization of the WRKY family. In this study, 61 CaWRKY proteins were identified in the pepper genome. Based on protein structural and phylogenetic analyses, these proteins were classified into four main groups (I, II, III, and NG), and Group II was further divided into five subgroups (IIa to IIe). Chromosome mapping analysis indicated that CaWRKY genes are distributed across all 12 chromosomes, although the location of four CaWRKYs (CaWRKY58-CaWRKY61) could not be identified. Two pairs of CaWRKYs located on chromosome 01 appear to be tandem duplications. Furthermore, the phylogenetic tree showed a close evolutionary relationship of WRKYs in three species from Solanaceae. In conclusion, this comprehensive analysis of CaWRKYs will provide rich resources for further functional studies in pepper.
Laver, John D; Li, Xiao; Ray, Debashish; Cook, Kate B; Hahn, Noah A; Nabeel-Shah, Syed; Kekis, Mariana; Luo, Hua; Marsolais, Alexander J; Fung, Karen Yy; Hughes, Timothy R; Westwood, J Timothy; Sidhu, Sachdev S; Morris, Quaid; Lipshitz, Howard D; Smibert, Craig A
2015-05-12
Brain tumor (BRAT) is a Drosophila member of the TRIM-NHL protein family. This family is conserved among metazoans and its members function as post-transcriptional regulators. BRAT was thought to be recruited to mRNAs indirectly through interaction with the RNA-binding protein Pumilio (PUM). However, it has recently been demonstrated that BRAT directly binds to RNA. The precise sequence recognized by BRAT, the extent of BRAT-mediated regulation, and the exact roles of PUM and BRAT in post-transcriptional regulation are unknown. Genome-wide identification of transcripts associated with BRAT or with PUM in Drosophila embryos shows that they bind largely non-overlapping sets of mRNAs. BRAT binds mRNAs that encode proteins associated with a variety of functions, many of which are distinct from those implemented by PUM-associated transcripts. Computational analysis of in vitro and in vivo data identified a novel RNA motif recognized by BRAT that confers BRAT-mediated regulation in tissue culture cells. The regulatory status of BRAT-associated mRNAs suggests a prominent role for BRAT in post-transcriptional regulation, including a previously unidentified role in transcript degradation. Transcriptomic analysis of embryos lacking functional BRAT reveals an important role in mediating the decay of hundreds of maternal mRNAs during the maternal-to-zygotic transition. Our results represent the first genome-wide analysis of the mRNAs associated with a TRIM-NHL protein and the first identification of an RNA motif bound by this protein family. BRAT is a prominent post-transcriptional regulator in the early embryo through mechanisms that are largely independent of PUM.
Faure, Guilhem; Callebaut, Isabelle
2013-07-15
Describing domain architecture is a critical step in the functional characterization of proteins. However, some orphan domains do not match any profile stored in dedicated domain databases and are therefore difficult to analyze. We present here a novel approach, called TREMOLO-HCA, for the analysis of orphan domain sequences, inspired by our experience in the use of Hydrophobic Cluster Analysis (HCA). Hidden relationships between protein sequences can be more easily identified from the PSI-BLAST results, using information on domain architecture, HCA plots and the degree of conservation of amino acids that may participate in the protein core. This can reveal remote relationships with known families of domains, as illustrated here with the identification of a hidden Tudor tandem in the human BAHCC1 protein and a hidden ET domain in the Saccharomyces cerevisiae Taf14p and human AF9 proteins. The results obtained in this way are consistent with those provided by HHPRED, which is based on pairwise comparisons of HMMs. Our approach can, however, be applied even in the absence of domain profiles or known 3D structures for the identification of novel families of domains. It can also be used in reverse for refining domain profiles, by starting from known protein domain families and identifying highly divergent members hitherto considered orphans. We provide a possible integration of this approach in an open TREMOLO-HCA package, which is fully implemented in Python v2.7 and is available on request. Instructions are available at http://www.impmc.upmc.fr/∼callebau/tremolohca.html. Contact: isabelle.callebaut@impmc.upmc.fr. Supplementary data are available at Bioinformatics online.
Comparative and evolutionary analysis of the 14-3-3 family genes in eleven fishes.
Cao, Jun; Tan, Xiaona
2018-07-01
14-3-3 proteins are highly conserved acidic proteins that are distributed over a wide variety of organisms and are involved in multiple cellular processes. However, a comparative and evolutionary analysis of this gene family has not been available for fish species. In this study, we identified 101 putative 14-3-3 genes in 11 fish species and divided them into 5 groups via phylogenetic analysis. Synteny analysis implied both conserved and dynamic evolutionary characteristics near the 14-3-3 gene loci in some vertebrates. We also found that some recombination events have accelerated the evolution of this gene family. Moreover, a positive selection site was identified, and mutation of this site could reduce 14-3-3 stability. Divergent expression profiles of the zebrafish 14-3-3 genes were further investigated under organophosphorus stress, suggesting that they may be involved in different osmoregulation and immune responses. The results will serve as a foundation for further functional investigation of the 14-3-3 genes in fishes. Copyright © 2018 Elsevier B.V. All rights reserved.
Gaona-López, Carlos; Julián-Sánchez, Adriana
2016-01-01
Background Alcohol dehydrogenase (ADH) activity is widely distributed in the three domains of life. Currently, there are three non-homologous NAD(P)+-dependent ADH families reported: type I ADH comprises Zn-dependent ADHs; type II ADH comprises short-chain ADHs described first in Drosophila; and type III ADH comprises iron-containing ADHs (FeADHs). These three families arose independently throughout evolution and possess different structures and mechanisms of reaction. While type I and II ADHs have been extensively studied, analyses of the evolution and diversity of (type III) FeADHs have not been published yet. Therefore, in this work, a phylogenetic analysis of FeADHs was performed to gain insight into the evolution of this protein family and to explore the diversity of FeADHs in eukaryotes. Principal Findings Results showed that FeADHs from eukaryotes are distributed in thirteen protein subfamilies, eight of them possessing protein sequences distributed in the three domains of life. Interestingly, none of these protein subfamilies possesses protein sequences found simultaneously in animals, plants and fungi. Many FeADHs are activated by or contain Fe2+, but many others bind a variety of metals, or even lack a metal cofactor. Animal FeADHs are found in just one protein subfamily, the hydroxyacid-oxoacid transhydrogenase (HOT) subfamily, which includes protein sequences widely distributed in fungi (but not in plants) and in several taxa from lower eukaryotes, bacteria and archaea. Fungal FeADHs are found mainly in two subfamilies, HOT and maleylacetate reductase (MAR), but some are also found in three other protein subfamilies. Plant FeADHs are found only in chlorophyta but not in higher plants, and are distributed in three different protein subfamilies. Conclusions/Significance FeADHs are a diverse and ancient protein family that shares a common 3D scaffold with a patchy distribution in eukaryotes. The majority of sequenced FeADHs from eukaryotes are distributed in just two subfamilies, HOT and MAR (found mainly in animals and fungi). These two subfamilies comprise almost 85% of all sequenced FeADHs in eukaryotes. PMID:27893862
Xu, Xiao-Man; Zhang, Man-Li; Zhang, Yi; Zhao, Li
2016-01-01
In the present study, we investigated the effects and mechanisms of Osthole on the apoptosis of non-small cell lung cancer (NSCLC) cells and its synergistic effect with Embelin. Our results revealed that treatment with either Osthole or Embelin inhibited cell proliferation. Notably, combination treatment with Osthole and Embelin inhibited cell proliferation more significantly than monotherapy. In addition, morphological analysis and Annexin V/propidium iodide analysis revealed that the combination of Osthole and Embelin enhanced their effect on cell apoptosis. We further examined the effect of Osthole on the expression of inhibitor of apoptosis protein (IAP) family proteins. Treatment of A549 lung cancer cells with various concentrations of Osthole was observed to decrease the protein expression of X-chromosome-encoded IAP, c-IAP1, c-IAP2 and Survivin, and to increase Smac expression in a dose-dependent manner. Furthermore, Osthole or Embelin alone increased the expression of BAX, caspase-3, caspase-9, cleaved caspase-3 and cleaved caspase-9, and decreased Bcl-2 levels following treatment. Osthole and Embelin combination treatment had a synergistic effect on the regulation of these proteins. In conclusion, our study demonstrated that Osthole inhibited proliferation and induced the apoptosis of lung cancer cells via IAP family proteins in a dose-dependent manner. Osthole enhances the antitumor effect of Embelin, indicating that the combination of Osthole and Embelin has potential clinical significance in the treatment of NSCLC. PMID:27895730
Avsian-Kretchmer, Orna; Hsueh, Aaron J W
2004-01-01
TGF-beta family proteins with a cystine knot motif serve as ligands for diverse families of plasma membrane receptors. Bone morphogenetic protein (BMP) antagonists represent a subgroup of these proteins, some of which bind BMPs and antagonize their actions during development and morphogenesis. Availability of completed genome sequences from diverse organisms allows bioinformatic analysis of the evolution of BMP antagonists and facilitates their classification. Using a regular expression algorithm (http://BioRegEx.stanford.edu), an exhaustive search of the human genome identified all cystine knot-containing BMP antagonists. Based on the size of the cystine ring, these proteins were divided into three subfamilies: CAN (eight-membered ring), twisted gastrulation (nine-membered ring), as well as chordin and noggin (10-membered ring). The CAN family can be divided further into four subgroups based on a conserved arrangement of additional cysteine residues: gremlin and PRDC, cerberus and coco, and DAN, together with USAG-1 and sclerostin. We searched for orthologs of human BMP antagonists in the genomes of model organisms and analyzed their phylogenetic relationship. New human paralogs were identified together with the verification of orthologous relationships of known genes. We also discuss the physiological roles of the CAN subfamily of BMP antagonists and the associated genetic defects. Based on the known three-dimensional structure of key cystine knot proteins, we postulated disulfide bondings for eight-membered ring BMP antagonists to predict their potential folding and dimerization.
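As a rough illustration of the regular-expression search mentioned in the abstract above, the following Python sketch scans protein sequences for a cysteine-spacing pattern. The pattern, the function name find_knot_like, and the toy sequences are all assumptions made for this example; they are not the pattern or code used by the BioRegEx tool.

```python
import re

# Hypothetical spacing pattern (illustrative assumption only, NOT the
# published cystine-knot signature): C, three residues, G, one residue,
# C, then 5-20 residues, then C-x-C.
KNOT_LIKE = re.compile(r"C.{3}G.C.{5,20}C.C")

def find_knot_like(sequences):
    """Return {name: (start, end)} for the first pattern match in each sequence."""
    hits = {}
    for name, seq in sequences.items():
        match = KNOT_LIKE.search(seq.upper())
        if match:
            hits[name] = (match.start(), match.end())
    return hits

# Toy sequences invented for the example.
toy = {
    "seqA": "MKTCAAAGLCPPPPPPCACKK",  # contains the hypothetical pattern
    "seqB": "MKTAAAAGLAPPPPPPAAAKK",  # no cysteines, so no match
}
print(find_knot_like(toy))
```

A real search would presumably need a curated pattern and a check of each hit against known family members, since short spacing patterns can match unrelated proteins by chance.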
Chlamydomonas reinhardtii LFO1 Is an IsdG Family Heme Oxygenase
Lojek, Lisa J.; Farrand, Allison J.; Wisecaver, Jennifer H.; ...
2017-08-16
Heme is essential for respiration across all domains of life. However, heme accumulation can lead to toxicity if cells are unable to either degrade or export heme or its toxic by-products. Under aerobic conditions, heme degradation is performed by heme oxygenases, enzymes which utilize oxygen to cleave the tetrapyrrole ring of heme. The HO-1 family of heme oxygenases has been identified in both bacterial and eukaryotic cells, whereas the IsdG family has thus far been described only in bacteria. We identified a hypothetical protein in the eukaryotic green alga Chlamydomonas reinhardtii, which encodes a protein containing an antibiotic biosynthesis monooxygenase (ABM) domain consistent with those associated with IsdG family members. This protein, which we have named LFO1, degrades heme, contains similarities in predicted secondary structures to IsdG family members, and retains the functionally conserved catalytic residues found in all IsdG family heme oxygenases. These data establish LFO1 as an IsdG family member and extend our knowledge of the distribution of IsdG family members beyond bacteria. To gain further insight into the distribution of the IsdG family, we used the LFO1 sequence to identify 866 IsdG family members, including representatives from all domains of life. These results indicate that the distribution of IsdG family heme oxygenases is more expansive than previously appreciated, underscoring the broad relevance of this enzyme family. This work establishes a protein in the freshwater alga Chlamydomonas reinhardtii as an IsdG family heme oxygenase. This protein, LFO1, exhibits predicted secondary structure and catalytic residues conserved in IsdG family members, in addition to a chloroplast localization sequence. Additionally, the catabolite that results from the degradation of heme by LFO1 is distinct from that of other heme degradation products. Using LFO1 as a seed, we performed phylogenetic analysis, revealing that the IsdG family is conserved in all domains of life. Also, C. reinhardtii contains two previously identified HO-1 family heme oxygenases, making C. reinhardtii the first organism shown to contain two families of heme oxygenases. These data indicate that C. reinhardtii may have unique mechanisms for regulating iron homeostasis within the chloroplast.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Karpinets, Tatiana V; Park, Byung; Syed, Mustafa H
2010-01-01
The Carbohydrate-Active Enzyme (CAZy) database provides a rich set of manually annotated enzymes that degrade, modify, or create glycosidic bonds. Despite rich and invaluable information stored in the database, software tools utilizing this information for annotation of newly sequenced genomes by CAZy families are limited. We have employed two annotation approaches to fill the gap between manually curated high-quality protein sequences collected in the CAZy database and the growing number of other protein sequences produced by genome or metagenome sequencing projects. The first approach is based on a similarity search against the entire non-redundant sequences of the CAZy database. The second approach performs annotation using links or correspondences between the CAZy families and protein family domains. The links were discovered using the association rule learning algorithm applied to sequences from the CAZy database. The approaches complement each other and in combination achieved high specificity and sensitivity when cross-evaluated with the manually curated genomes of Clostridium thermocellum ATCC 27405 and Saccharophagus degradans 2-40. The capability of the proposed framework to predict the function of unknown protein domains (DUF) and of hypothetical proteins in the genome of Neurospora crassa is demonstrated. The framework is implemented as a Web service, the CAZymes Analysis Toolkit (CAT), and is available at http://cricket.ornl.gov/cgi-bin/cat.cgi.
Park, Byung H; Karpinets, Tatiana V; Syed, Mustafa H; Leuze, Michael R; Uberbacher, Edward C
2010-12-01
The Carbohydrate-Active Enzyme (CAZy) database provides a rich set of manually annotated enzymes that degrade, modify, or create glycosidic bonds. Despite rich and invaluable information stored in the database, software tools utilizing this information for annotation of newly sequenced genomes by CAZy families are limited. We have employed two annotation approaches to fill the gap between manually curated high-quality protein sequences collected in the CAZy database and the growing number of other protein sequences produced by genome or metagenome sequencing projects. The first approach is based on a similarity search against the entire nonredundant sequences of the CAZy database. The second approach performs annotation using links or correspondences between the CAZy families and protein family domains. The links were discovered using the association rule learning algorithm applied to sequences from the CAZy database. The approaches complement each other and in combination achieved high specificity and sensitivity when cross-evaluated with the manually curated genomes of Clostridium thermocellum ATCC 27405 and Saccharophagus degradans 2-40. The capability of the proposed framework to predict the function of unknown protein domains and of hypothetical proteins in the genome of Neurospora crassa is demonstrated. The framework is implemented as a Web service, the CAZymes Analysis Toolkit, and is available at http://cricket.ornl.gov/cgi-bin/cat.cgi.
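The second annotation approach in the abstract above rests on association rules linking protein family domains to CAZy families. A minimal sketch of how support and confidence for such "domain -> family" rules could be computed is given below. The function name rule_stats, the domain labels domA and domB, and the pairings with the family labels are invented for illustration and do not reflect the toolkit's actual data or implementation.

```python
from collections import Counter

def rule_stats(records):
    """Compute support and confidence for rules of the form 'domain -> family'.

    records: list of (set_of_domains, family) pairs, one per protein.
    Returns {(domain, family): (support, confidence)}, where
      support    = fraction of all proteins with both the domain and the family
      confidence = fraction of proteins with the domain that are in the family
    """
    n = len(records)
    domain_counts = Counter()
    pair_counts = Counter()
    for domains, family in records:
        for d in domains:
            domain_counts[d] += 1
            pair_counts[(d, family)] += 1
    return {
        (d, fam): (count / n, count / domain_counts[d])
        for (d, fam), count in pair_counts.items()
    }

# Toy data: hypothetical domain labels paired with family labels
# (the associations are made up for this example).
toy = [
    ({"domA"}, "GH5"),
    ({"domA", "domB"}, "GH5"),
    ({"domB"}, "GT2"),
    ({"domA"}, "GT2"),
]
stats = rule_stats(toy)
for rule, (support, confidence) in sorted(stats.items()):
    print(rule, round(support, 2), round(confidence, 2))
```

In practice one would keep only rules whose support and confidence exceed chosen thresholds; suitable threshold values are not stated in the abstract and would have to be chosen by the user.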
Madio, Bruno; Undheim, Eivind A B; King, Glenn F
2017-08-23
More than a century of research on sea anemone venoms has shown that they contain a diversity of biologically active proteins and peptides. However, recent omics studies have revealed that much of the venom proteome remains unexplored. We used, for the first time, a combination of proteomic and transcriptomic techniques to obtain a holistic overview of the venom arsenal of the well-studied sea anemone Stichodactyla haddoni. A purely search-based approach to identify putative toxins in a transcriptome from tentacles regenerating after venom extraction identified 508 unique toxin-like transcripts grouped into 63 families. However, proteomic analysis of venom revealed that 52 of these toxin families are likely false positives. In contrast, the combination of transcriptomic and proteomic data enabled positive identification of 23 families of putative toxins, 12 of which have no homology to known proteins or peptides. Our data highlight the importance of using proteomics of milked venom to correctly identify venom proteins/peptides, both known and novel, while minimizing false positive identifications from non-toxin homologues identified in transcriptomes of venom-producing tissues. This work lays the foundation for uncovering the role of individual toxins in sea anemone venom and how they contribute to the envenomation of prey, predators, and competitors. Proteomic analysis of milked venom combined with analysis of a tentacle transcriptome revealed the full extent of the venom arsenal of the sea anemone Stichodactyla haddoni. This combined approach led to the discovery of 12 entirely new families of disulfide-rich peptides and proteins in a genus of anemones that has been studied for over a century. Copyright © 2017 Elsevier B.V. All rights reserved.
Jasrapuria, Sinu; Specht, Charles A.; Kramer, Karl J.; Beeman, Richard W.; Muthukrishnan, Subbaratnam
2012-01-01
The functional characterization of an entire class of 17 genes from the red flour beetle, Tribolium castaneum, which encode two families of Cuticular Proteins Analogous to Peritrophins (CPAPs) has been carried out. CPAP genes in T. castaneum are expressed exclusively in cuticle-forming tissues and have been classified into two families, CPAP1 and CPAP3, based on whether the proteins contain either one (CPAP1), or three copies (CPAP3) of the chitin-binding domain, ChtBD2, with its six characteristically spaced cysteine residues. Individual members of the TcCPAP1 and TcCPAP3 gene families have distinct developmental patterns of expression. Many of these proteins serve essential and non-redundant functions in maintaining the structural integrity of the cuticle in different parts of the insect anatomy. Three genes of the TcCPAP1 family and five genes of the TcCPAP3 family are essential for insect development, molting, cuticle integrity, proper locomotion or fecundity. RNA interference (RNAi) targeting TcCPAP1-C, TcCPAP1-H, TcCPAP1-J or TcCPAP3-C transcripts resulted in death at the pharate adult stage of development. RNAi for TcCPAP3-A1, TcCPAP3-B, TcCPAP3-D1 or TcCPAP3-D2 genes resulted in different developmental defects, including adult/embryonic mortality, abnormal elytra or hindwings, or an abnormal ‘stiff-jointed’ gait. These results provide experimental support for specialization in the functions of CPAP proteins in T. castaneum and a biological rationale for the conservation of CPAP orthologs in other orders of insects. This is the first comprehensive functional analysis of an entire class of cuticular proteins with one or more ChtBD2 domains in any insect species. PMID:23185457
Diversity and evolution of cytochrome P450 monooxygenases in Oomycetes.
Sello, Mopeli Marshal; Jafta, Norventia; Nelson, David R; Chen, Wanping; Yu, Jae-Hyuk; Parvez, Mohammad; Kgosiemang, Ipeleng Kopano Rosinah; Monyaki, Richie; Raselemane, Seiso Caiphus; Qhanya, Lehlohonolo Benedict; Mthakathi, Ntsane Trevor; Sitheni Mashele, Samson; Syed, Khajamohiddin
2015-07-01
Cytochrome P450 monooxygenases (P450s) are heme-thiolate proteins whose role as drug targets against pathogens, as well as in valuable chemical production and bioremediation, has been explored. In this study we performed a comprehensive comparative analysis of P450s in 13 newly explored oomycete pathogens. Three hundred and fifty-six P450s were found in oomycetes. These P450s were grouped into 15 P450 families and 84 P450 subfamilies. Among these, nine P450 families and 31 P450 subfamilies were newly found in oomycetes. The analysis revealed that oomycetes belonging to different orders contain distinct P450 families and subfamilies in their genomes. Evolutionary analysis and sequence homology data revealed P450 family blooms in oomycetes. Tandem arrangement of a large number of P450s belonging to the same family indicated that P450 family blooming is possibly due to duplication of family members. A unique combination of amino acid patterns was observed at the EXXR and CXG motifs for the P450 families CYP5014, CYP5015 and CYP5017. A novel P450 fusion protein (CYP5619 family) with an N-terminal P450 domain fused to a heme peroxidase/dioxygenase domain was discovered in Saprolegnia diclina. Oomycete P450 patterns suggested host influence in shaping their P450 content. This manuscript serves as a reference for future P450 annotations in newly explored oomycetes.
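The EXXR and CXG motifs named in the abstract above are short patterns in which X stands for any residue. A minimal Python sketch of locating them in a sequence follows; the function name motif_positions and the toy sequence are assumptions for this example, and the sketch does not reproduce how the authors annotated their P450s.

```python
import re

# X in the motif names means "any residue", written as "." in a regex.
EXXR = re.compile(r"E..R")
CXG = re.compile(r"C.G")

def motif_positions(seq):
    """Return 0-based start positions of EXXR and CXG matches in a protein sequence."""
    seq = seq.upper()
    return {
        "EXXR": [m.start() for m in EXXR.finditer(seq)],
        "CXG": [m.start() for m in CXG.finditer(seq)],
    }

# Toy sequence invented for the example (not a real P450).
toy = "MAEKLRPPFGAGSRNCIGAA"
print(motif_positions(toy))
```

Because both patterns are only three or four residues long, they will also match by chance in unrelated proteins, so any hit would presumably need to be confirmed in the context of a full-length alignment.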
Proteins with an Euonymus lectin-like domain are ubiquitous in Embryophyta
2009-01-01
Background Cloning of the Euonymus lectin led to the discovery of a novel domain that also occurs in some stress-induced plant proteins. The distribution and the diversity of proteins with an Euonymus lectin (EUL) domain were investigated using detailed analysis of sequences in publicly accessible genome and transcriptome databases. Results Comprehensive in silico analyses indicate that the recently identified Euonymus europaeus lectin domain represents a conserved structural unit of a novel family of putative carbohydrate-binding proteins, hereafter referred to as the Euonymus lectin (EUL) family. The EUL domain is widespread among plants. Analysis of retrieved sequences revealed that some sequences consist of a single EUL domain linked to an unrelated N-terminal domain, whereas others comprise two tandemly arrayed EUL domains. A new classification system for these lectins is proposed based on the overall domain architecture. Evolutionary relationships among the sequences with EUL domains are discussed. Conclusion The identification of the EUL family provides the first evidence for the occurrence in terrestrial plants of a highly conserved plant-specific domain. The widespread distribution of the EUL domain contrasts strikingly with the more limited or even narrow distribution of most other lectin domains found in plants. The apparent omnipresence of the EUL domain is indicative of a universal role of this lectin domain in plants. Although there is unambiguous evidence that several EUL domains possess carbohydrate-binding activity, further research is required to corroborate the carbohydrate-binding properties of different members of the EUL family. PMID:19930663
Ponsuwanna, Patrath; Kümpornsin, Krittikorn; Chookajorn, Thanat
2014-01-01
Even though antigenic variation is employed among parasitic protozoa for host immune evasion, Tetrahymena thermophila, a free-living ciliate, can also change its surface protein antigens. These cysteine-rich glycosylphosphatidylinositol (GPI)-linked surface proteins are encoded by a family of polymorphic Ser genes. Despite the availability of T. thermophila genome, a comprehensive analysis of the Ser family is limited by its high degree of polymorphism. In order to overcome this problem, a new approach was adopted by searching for Ser candidates with common motif sequences, namely length-specific repetitive cysteine pattern and GPI anchor site. The candidate genes were phylogenetically compared with the previously identified Ser genes and classified into subtypes. Ser candidates were often found to be located as tandem arrays of the same subtypes on several chromosomal scaffolds. Certain Ser candidates located in the same chromosomal arrays were transcriptionally expressed at specific T. thermophila developmental stages. These Ser candidates selected by the motif analysis approach can form the foundation for a systematic identification of the entire Ser gene family, which will contribute to the understanding of their function and the basis of T. thermophila antigenic variation. PMID:25133747
Torres, Marina W; Corrêa, Régis L; Schrago, Carlos G
2005-12-30
The coat protein (CP) of the family Luteoviridae is directly associated with the success of infection. It participates in various steps of the virus life cycle, such as virion assembly, stability, systemic infection, and transmission. Despite its importance, extensive studies on the molecular evolution of this protein are lacking. In the present study, we investigate the action of differential selective forces on the CP coding region using maximum likelihood methods. We found that the protein is subjected to heterogeneous selective pressures and some sites may be evolving near neutrality. Based on the proposed 3-D model of the CP S-domain, we showed that nearly neutral sites are predominantly located in the region of the protein that faces the interior of the capsid, in close contact with the viral RNA, while highly conserved sites are mainly part of beta-strands, in the protein's major framework.
Boldt, Lynda; Yellowlees, David; Leggat, William
2012-01-01
The superfamily of light-harvesting complex (LHC) proteins is comprised of proteins with diverse functions in light-harvesting and photoprotection. LHC proteins bind chlorophyll (Chl) and carotenoids and include a family of LHCs that bind Chl a and c. Dinophytes (dinoflagellates) are predominantly Chl c binding algal taxa, bind peridinin or fucoxanthin as the primary carotenoid, and can possess a number of LHC subfamilies. Here we report 11 LHC sequences for the chlorophyll a-chlorophyll c2-peridinin protein complex (acpPC) subfamily isolated from Symbiodinium sp. C3, an ecologically important peridinin binding dinoflagellate taxon. Phylogenetic analysis of these proteins suggests the acpPC subfamily forms at least three clades within the Chl a/c binding LHC family; Clade 1 clusters with rhodophyte, cryptophyte and peridinin binding dinoflagellate sequences, Clade 2 with peridinin binding dinoflagellate sequences only and Clade 3 with heterokontophytes, fucoxanthin and peridinin binding dinoflagellate sequences. PMID:23112815
BCL-2 family proteins: changing partners in the dance towards death.
Kale, Justin; Osterlund, Elizabeth J; Andrews, David W
2018-01-01
The BCL-2 family of proteins controls cell death primarily by direct binding interactions that regulate mitochondrial outer membrane permeabilization (MOMP) leading to the irreversible release of intermembrane space proteins, subsequent caspase activation and apoptosis. The affinities and relative abundance of the BCL-2 family proteins dictate the predominant interactions between anti-apoptotic and pro-apoptotic BCL-2 family proteins that regulate MOMP. We highlight the core mechanisms of BCL-2 family regulation of MOMP with an emphasis on how the interactions between the BCL-2 family proteins govern cell fate. We address the critical importance of both the concentration and affinities of BCL-2 family proteins and show how differences in either can greatly change the outcome. Further, we explain the importance of using full-length BCL-2 family proteins (versus truncated versions or peptides) to parse out the core mechanisms of MOMP regulation by the BCL-2 family. Finally, we discuss how post-translational modifications and differing intracellular localizations alter the mechanisms of apoptosis regulation by BCL-2 family proteins. Successful therapeutic intervention of MOMP regulation in human disease requires an understanding of the factors that mediate the major binding interactions between BCL-2 family proteins in cells.
Evolutionary plasticity of plasma membrane interaction in DREPP family proteins.
Vosolsobě, Stanislav; Petrášek, Jan; Schwarzerová, Kateřina
2017-05-01
The plant-specific DREPP protein family comprises proteins that were shown to regulate the actin and microtubular cytoskeleton in a calcium-dependent manner. Our phylogenetic analysis showed that DREPPs first appeared in ferns and that DREPPs have a rapid and plastic evolutionary history in plants. Arabidopsis DREPP paralogues called AtMDP25/PCaP1 and AtMAP18/PCaP2 are N-myristoylated, which has been reported as a key factor in plasma membrane localization. Here we show that N-myristoylation is neither conserved nor ancestral for the DREPP family. Instead, by using confocal microscopy and a new method for quantitative evaluation of protein membrane localization, we show that DREPPs rely on two mechanisms ensuring their plasma membrane localization. These include N-myristoylation and electrostatic interaction of a polybasic amino acid cluster. We propose that various plasma membrane association mechanisms resulting from the evolutionary plasticity of DREPPs are important for refining plasma membrane interaction of these signalling proteins under various conditions and in various cells. Copyright © 2017 Elsevier B.V. All rights reserved.
Mutations Affecting G-Protein Subunit α11 in Hypercalcemia and Hypocalcemia
Babinsky, Valerie N.; Head, Rosie A.; Cranston, Treena; Rust, Nigel; Hobbs, Maurine R.; Heath, Hunter; Thakker, Rajesh V.
2013-01-01
BACKGROUND Familial hypocalciuric hypercalcemia is a genetically heterogeneous disorder with three variants: types 1, 2, and 3. Type 1 is due to loss-of-function mutations of the calcium-sensing receptor, a guanine nucleotide–binding protein (G-protein)–coupled receptor that signals through the G-protein subunit α11 (Gα11). Type 3 is associated with adaptor-related protein complex 2, sigma 1 subunit (AP2S1) mutations, which result in altered calcium-sensing receptor endocytosis. We hypothesized that type 2 is due to mutations effecting Gα11 loss of function, since Gα11 is involved in calcium-sensing receptor signaling, and its gene (GNA11) and the type 2 locus are colocalized on chromosome 19p13.3. We also postulated that mutations effecting Gα11 gain of function, like the mutations effecting calcium-sensing receptor gain of function that cause autosomal dominant hypocalcemia type 1, may lead to hypocalcemia. METHODS We performed GNA11 mutational analysis in a kindred with familial hypocalciuric hypercalcemia type 2 and in nine unrelated patients with familial hypocalciuric hypercalcemia who did not have mutations in the gene encoding the calcium-sensing receptor (CASR) or AP2S1. We also performed this analysis in eight unrelated patients with hypocalcemia who did not have CASR mutations. In addition, we studied the effects of GNA11 mutations on Gα11 protein structure and calcium-sensing receptor signaling in human embryonic kidney 293 (HEK293) cells. RESULTS The kindred with familial hypocalciuric hypercalcemia type 2 had an in-frame deletion of a conserved Gα11 isoleucine (Ile200del), and one of the nine unrelated patients with familial hypocalciuric hypercalcemia had a missense GNA11 mutation (Leu135Gln). Missense GNA11 mutations (Arg181Gln and Phe341Leu) were detected in two unrelated patients with hypocalcemia; they were therefore identified as having autosomal dominant hypocalcemia type 2. 
All four GNA11 mutations predicted disrupted protein structures, and assessment on the basis of in vitro expression showed that familial hypocalciuric hypercalcemia type 2–associated mutations decreased the sensitivity of cells expressing calcium-sensing receptors to changes in extracellular calcium concentrations, whereas autosomal dominant hypocalcemia type 2–associated mutations increased cell sensitivity. CONCLUSIONS Gα11 mutants with loss of function cause familial hypocalciuric hypercalcemia type 2, and Gα11 mutants with gain of function cause a clinical disorder designated as autosomal dominant hypocalcemia type 2. (Funded by the United Kingdom Medical Research Council and others.) PMID:23802516
Members of the YjgF/YER057c/UK114 family of proteins inhibit phosphoribosylamine synthesis in vitro.
Lambrecht, Jennifer A; Browne, Beth Ann; Downs, Diana M
2010-11-05
The YjgF/YER057c/UK114 family of proteins is highly conserved across all three domains of life and currently lacks a consensus biochemical function. Analysis of Salmonella enterica strains lacking yjgF has led to a working model in which YjgF functions to remove potentially toxic secondary products of cellular enzymes. Strains lacking yjgF synthesize the thiamine precursor phosphoribosylamine (PRA) by a TrpD-dependent mechanism that is not present in wild-type strains. Here, PRA synthesis was reconstituted in vitro with anthranilate phosphoribosyltransferase (TrpD), threonine dehydratase (IlvA), threonine, and phosphoribosyl pyrophosphate. TrpD-dependent PRA formation in vitro was inhibited by S. enterica YjgF and the human homolog UK114. Thus, the work herein describes the first biochemical assay for diverse members of the highly conserved YjgF/YER057c/UK114 family of proteins and provides a means to dissect the cellular functions of these proteins.
Graph pyramids for protein function prediction
2015-01-01
Background Uncovering the hidden organizational characteristics and regularities among biological sequences is the key issue for detailed understanding of an underlying biological phenomenon. Thus pattern recognition from nucleic acid sequences is an important task for protein function prediction. As proteins from the same family exhibit similar characteristics, homology based approaches predict protein functions via protein classification. But conventional classification approaches mostly rely on the global features by considering only strong protein similarity matches. This leads to significant loss of prediction accuracy. Methods Here we construct the Protein-Protein Similarity (PPS) network, which captures the subtle properties of protein families. The proposed method considers the local as well as the global features, by examining the interactions among 'weakly interacting proteins' in the PPS network and by using hierarchical graph analysis via the graph pyramid. Different underlying properties of the protein families are uncovered by operating the proposed graph based features at various pyramid levels. Results Experimental results on benchmark data sets show that the proposed hierarchical voting algorithm using graph pyramid helps to improve computational efficiency as well as the protein classification accuracy. Quantitatively, among 14,086 test sequences, on average the proposed method misclassified only 21.1 sequences whereas the baseline BLAST score based global feature matching method misclassified 362.9 sequences. With each correctly classified test sequence, the fast incremental learning ability of the proposed method further enhances the training model. Thus it has achieved more than 96% protein classification accuracy using only 20% per class training data. PMID:26044522
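The misclassification counts quoted in this abstract can be turned into accuracies with simple arithmetic. The sketch below only reworks the three numbers given in the abstract; the function name is illustrative, and the resulting percentages are derived here rather than reported by the authors (the "more than 96%" figure in the abstract refers to the 20% training-data setting and is not recomputed).

```python
# Worked arithmetic on the figures quoted in the abstract.
TEST_SEQUENCES = 14_086          # number of test sequences (from the abstract)
MISCLASSIFIED_PROPOSED = 21.1    # average misclassified, graph-pyramid method
MISCLASSIFIED_BASELINE = 362.9   # average misclassified, BLAST-score baseline


def accuracy(misclassified: float, total: int) -> float:
    """Fraction of test sequences classified correctly."""
    return 1.0 - misclassified / total


proposed_acc = accuracy(MISCLASSIFIED_PROPOSED, TEST_SEQUENCES)
baseline_acc = accuracy(MISCLASSIFIED_BASELINE, TEST_SEQUENCES)
# How many times more errors the baseline makes than the proposed method.
error_ratio = MISCLASSIFIED_BASELINE / MISCLASSIFIED_PROPOSED

print(f"proposed accuracy: {proposed_acc:.2%}")
print(f"baseline accuracy: {baseline_acc:.2%}")
print(f"baseline/proposed error ratio: {error_ratio:.1f}")
```

Under these figures the proposed method is correct on roughly 99.85% of the test sequences and the baseline on roughly 97.42%, a difference of about 17-fold in the number of errors.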
Graph pyramids for protein function prediction.
Sandhan, Tushar; Yoo, Youngjun; Choi, Jin; Kim, Sun
2015-01-01
Uncovering the hidden organizational characteristics and regularities among biological sequences is the key issue for detailed understanding of an underlying biological phenomenon. Thus pattern recognition from nucleic acid sequences is an important task for protein function prediction. As proteins from the same family exhibit similar characteristics, homology based approaches predict protein functions via protein classification. But conventional classification approaches mostly rely on the global features by considering only strong protein similarity matches. This leads to significant loss of prediction accuracy. Here we construct the Protein-Protein Similarity (PPS) network, which captures the subtle properties of protein families. The proposed method considers the local as well as the global features, by examining the interactions among 'weakly interacting proteins' in the PPS network and by using hierarchical graph analysis via the graph pyramid. Different underlying properties of the protein families are uncovered by operating the proposed graph based features at various pyramid levels. Experimental results on benchmark data sets show that the proposed hierarchical voting algorithm using graph pyramid helps to improve computational efficiency as well as the protein classification accuracy. Quantitatively, among 14,086 test sequences, on average the proposed method misclassified only 21.1 sequences whereas the baseline BLAST score based global feature matching method misclassified 362.9 sequences. With each correctly classified test sequence, the fast incremental learning ability of the proposed method further enhances the training model. Thus it has achieved more than 96% protein classification accuracy using only 20% per class training data.
Fushiki, Daisuke; Hamada, Yasuo; Yoshimura, Ryoichi; Endo, Yasuhisa
2010-04-01
All multi-cellular animals, including hydra, insects and vertebrates, develop gap junctions, which communicate directly with neighboring cells. Gap junctions consist of protein families called connexins in vertebrates and innexins in invertebrates. Connexins and innexins have no homology in their amino acid sequence, but both are thought to have some similar characteristics, such as a tetra-membrane-spanning structure, formation of a channel by hexamer, and transmission of small molecules (e.g. ions) to neighboring cells. Pannexins were recently identified as a homolog of innexins in vertebrate genomes. Although pannexins are thought to share the function of intercellular communication with connexins and innexins, there is little information about the relationship among these three protein families of gap junctions. We phylogenetically and bioinformatically examined these protein families and other tetra-membrane-spanning proteins using a database and three analytical software packages. The clades formed by the pannexin families do not follow species classification but instead correspond to paralogs of each pannexin member. Amino acid sequences of pannexins are closely related to those of innexins but less to those of connexins. These data suggest that innexins and pannexins have a common origin, but the relationship between innexins/pannexins and connexins is as slight as that of other tetra-membrane-spanning members.
Kugler, Jamie E.; Horsch, Marion; Huang, Di; Furusawa, Takashi; Rochman, Mark; Garrett, Lillian; Becker, Lore; Bohla, Alexander; Hölter, Sabine M.; Prehn, Cornelia; Rathkolb, Birgit; Racz, Ildikó; Aguilar-Pimentel, Juan Antonio; Adler, Thure; Adamski, Jerzy; Beckers, Johannes; Busch, Dirk H.; Eickelberg, Oliver; Klopstock, Thomas; Ollert, Markus; Stöger, Tobias; Wolf, Eckhard; Wurst, Wolfgang; Yildirim, Ali Önder; Zimmer, Andreas; Gailus-Durner, Valérie; Fuchs, Helmut; Hrabě de Angelis, Martin; Garfinkel, Benny; Orly, Joseph; Ovcharenko, Ivan; Bustin, Michael
2013-01-01
The nuclei of most vertebrate cells contain members of the high mobility group N (HMGN) protein family, which bind specifically to nucleosome core particles and affect chromatin structure and function, including transcription. Here, we study the biological role of this protein family by systematic analysis of phenotypes and tissue transcription profiles in mice lacking functional HMGN variants. Phenotypic analysis of Hmgn1tm1/tm1, Hmgn3tm1/tm1, and Hmgn5tm1/tm1 mice and their wild type littermates with a battery of standardized tests uncovered variant-specific abnormalities. Gene expression analysis of four different tissues in each of the Hmgntm1/tm1 lines reveals very little overlap between genes affected by specific variants in different tissues. Pathway analysis reveals that loss of an HMGN variant subtly affects expression of numerous genes in specific biological processes. We conclude that within the biological framework of an entire organism, HMGNs modulate the fidelity of the cellular transcriptional profile in a tissue- and HMGN variant-specific manner. PMID:23620591
Stamm, I; Leclerque, A; Plaga, W
1999-09-01
Prominent low-molecular-weight proteins were isolated from vegetative cells of the myxobacterium Stigmatella aurantiaca and were found to be members of the cold-shock protein family. A first gene of this family (cspA) was cloned and sequenced. It encodes a protein of 68 amino acid residues that displays up to 71% sequence identity with other bacterial cold-shock(-like) proteins. A cysteine residue within the RNP-2 motif is a peculiarity of Stigmatella CspA. A cspA::(Δtrp-lacZ) fusion gene construct was introduced into Stigmatella by electroporation, a method that has not been used previously for this strain. Analysis of the resultant transformants revealed that cspA transcription occurs at high levels during vegetative growth at 20 and 32 °C, and during fruiting body formation.
2011-01-01
Background The drug/metabolite transporter superfamily comprises a diversity of protein domain families with multiple functions including transport of nucleotide sugars. Drug/metabolite transporter domains are contained in both solute carrier families 30, 35 and 39 proteins as well as in acyl-malonyl condensing enzyme proteins. In this paper, we present an evolutionary analysis of nucleotide sugar transporters in relation to the entire superfamily of drug/metabolite transporters that considers crucial intra-protein duplication events that have shaped the transporters. We use a method that combines the strengths of hidden Markov models and maximum likelihood to find relationships between drug/metabolite transporter families, and branches within families. Results We present evidence that the triose-phosphate transporters, domain unknown function 914, uracil-diphosphate glucose-N-acetylglucosamine, and nucleotide sugar transporter families have evolved from a domain duplication event before the radiation of Viridiplantae in the EamA family (previously called domain unknown function 6). We identify previously unknown branches in the solute carrier 30, 35 and 39 protein families that emerged simultaneously as key physiological developments after the radiation of Viridiplantae, including the "35C/E" branch of EamA, which formed in the lineage of T. adhaerens (Animalia). We identify a second cluster of DMTs, called the domain unknown function 1632 cluster, which has non-cytosolic N- and C-termini, and thus appears to have been formed from a different domain duplication event. We identify a previously uncharacterized motif, G-X(6)-G, which is overrepresented in the fifth transmembrane helix of C-terminal domains. We present evidence that the family called fatty acid elongases are homologous to transporters, not enzymes as had previously been thought. 
Conclusions The nucleotide sugar transporters families were formed through differentiation of the gene cluster EamA (domain unknown function 6) before Viridiplantae, showing for the first time the significance of EamA. PMID:21569384
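The G-X(6)-G motif named in this abstract (a glycine, any six residues, then a glycine) can be searched for with a regular expression. The sketch below is a minimal illustration, not the authors' method: the function name and the example sequence are hypothetical, and the search ignores transmembrane-helix position, which the study takes into account.

```python
import re

# G, then any six residues (one-letter codes), then G.
# The lookahead lets overlapping occurrences be reported too.
GX6G = re.compile(r"(?=(G[A-Z]{6}G))")


def find_gx6g(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for every G-X(6)-G occurrence."""
    return [(m.start(), m.group(1)) for m in GX6G.finditer(sequence.upper())]


# Hypothetical peptide, used only to exercise the pattern.
example = "MAGLLVIAAGTT"
print(find_gx6g(example))
```

For the hypothetical peptide, the only occurrence starts at index 2 and covers the eight residues GLLVIAAG.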
DOE Office of Scientific and Technical Information (OSTI.GOV)
Osipiuk, J.; Gornicki, P.; Maj, L.
The structure of the YlxR protein of unknown function from Streptococcus pneumonia was determined to 1.35 Angstroms. YlxR is expressed from the nusA/infB operon in bacteria and belongs to a small protein family (COG2740) that shares a conserved sequence motif GRGA(Y/W). The family shows no significant amino-acid sequence similarity with other proteins. Three-wavelength diffraction MAD data were collected to 1.7 Angstroms from orthorhombic crystals using synchrotron radiation and the structure was determined using a semi-automated approach. The YlxR structure resembles a two-layer α/β sandwich with the overall shape of a cylinder and shows no structural homology to proteins of known structure. Structural analysis revealed that the YlxR structure represents a new protein fold that belongs to the α-β plait superfamily. The distribution of the electrostatic surface potential shows a large positively charged patch on one side of the protein, a feature often found in nucleic acid-binding proteins. Three sulfate ions bind to this positively charged surface. Analysis of potential binding sites uncovered several substantial clefts, with the largest spanning 3/4 of the protein. A similar distribution of binding sites and a large sharply bent cleft are observed in RNA-binding proteins that are unrelated in sequence and structure. It is proposed that YlxR is an RNA-binding protein.
Streptococcus pneumonia YlxR at 1.35 A shows a putative new fold.
Osipiuk, J; Górnicki, P; Maj, L; Dementieva, I; Laskowski, R; Joachimiak, A
2001-11-01
The structure of the YlxR protein of unknown function from Streptococcus pneumonia was determined to 1.35 A. YlxR is expressed from the nusA/infB operon in bacteria and belongs to a small protein family (COG2740) that shares a conserved sequence motif GRGA(Y/W). The family shows no significant amino-acid sequence similarity with other proteins. Three-wavelength diffraction MAD data were collected to 1.7 A from orthorhombic crystals using synchrotron radiation and the structure was determined using a semi-automated approach. The YlxR structure resembles a two-layer alpha/beta sandwich with the overall shape of a cylinder and shows no structural homology to proteins of known structure. Structural analysis revealed that the YlxR structure represents a new protein fold that belongs to the alpha-beta plait superfamily. The distribution of the electrostatic surface potential shows a large positively charged patch on one side of the protein, a feature often found in nucleic acid-binding proteins. Three sulfate ions bind to this positively charged surface. Analysis of potential binding sites uncovered several substantial clefts, with the largest spanning 3/4 of the protein. A similar distribution of binding sites and a large sharply bent cleft are observed in RNA-binding proteins that are unrelated in sequence and structure. It is proposed that YlxR is an RNA-binding protein.
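The conserved motif GRGA(Y/W) described above reads as the residues G, R, G, A followed by either Y or W. A minimal sketch of scanning a sequence for it is shown below; the function name and both test peptides are hypothetical and are not taken from the YlxR sequence.

```python
import re

# GRGA followed by tyrosine (Y) or tryptophan (W).
GRGA_YW = re.compile(r"GRGA[YW]")


def find_grga_motif(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each GRGA(Y/W) occurrence."""
    return [(m.start(), m.group(0)) for m in GRGA_YW.finditer(sequence.upper())]


# Hypothetical peptides: the first carries the motif, the second does not.
print(find_grga_motif("MKKGRGAYLLE"))
print(find_grga_motif("MKKGRGAFLLE"))
```

A plain pattern match like this only flags candidates; assigning a protein to the COG2740 family would still rest on the broader sequence and structural evidence the abstract describes.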
The Friedreich ataxia critical region spans a 150-kb interval on chromosome 9q13
DOE Office of Scientific and Technical Information (OSTI.GOV)
Montermini, L.; Zara, F.; Patel, P.I.
1995-11-01
By analysis of crossovers in key recombinant families and by homozygosity analysis of inbred families, the Friedreich ataxia (FRDA) locus was localized in a 300-kb interval between the X104 gene and the microsatellite marker FR8 (D9S888). By homology searches of the sequence databases, we identified X104 as the human tight junction protein ZO-2 gene. We generated a large-scale physical map of the FRDA region by pulsed-field gel electrophoresis analysis of genomic DNA and of three YAC clones derived from different libraries, and we constructed an uninterrupted cosmid contig spanning the FRDA locus. The cAMP-dependent protein kinase γ-catalytic subunit gene was identified within the critical FRDA interval, but it was excluded as candidate because of its biological properties and because of lack of mutations in FRDA patients. Six new polymorphic markers were isolated between FR2 (D9S886) and FR8 (D9S888), which were used for homozygosity analysis in a family in which parents of an affected child are distantly related. An ancient recombination involving the centromeric FRDA flanking markers had been previously demonstrated in this family. Homozygosity analysis indicated that the FRDA gene is localized in the telomeric 150 kb of the FR2-FR8 interval. 17 refs., 3 figs., 1 tab.
Expression analysis of genes encoding double B-box zinc finger proteins in maize.
Li, Wenlan; Wang, Jingchao; Sun, Qi; Li, Wencai; Yu, Yanli; Zhao, Meng; Meng, Zhaodong
2017-11-01
The B-box proteins play key roles in plant development. The double B-box (DBB) family is one of the subfamilies of the B-box family, with two B-box domains and without a CCT domain. In this study, 12 maize double B-box genes (ZmDBBs) were identified through a genome-wide survey. Phylogenetic analysis of DBB proteins from maize, rice, Sorghum bicolor, Arabidopsis, and poplar classified them into five major clades. Gene duplication analysis indicated that segmental duplications made a large contribution to the expansion of ZmDBBs. Furthermore, a large number of cis-acting regulatory elements related to plant development, response to light and phytohormone were identified in the promoter regions of the ZmDBB genes. The expression patterns of the ZmDBB genes in various tissues and different developmental stages demonstrated that ZmDBBs might play essential roles in plant development, and some ZmDBB genes might have unique functions in specific developmental stages. In addition, several ZmDBB genes showed a diurnal expression pattern. The expression levels of some ZmDBB genes changed significantly under light/dark treatment conditions and phytohormone treatments, implying that they might participate in the light signaling pathway and hormone signaling. Our results will provide new information to better understand the complexity of the DBB gene family in maize.
Yockey, C E; Shimizu, N
1998-02-01
Members of the TEA/ATTS family of transcription factors have been found in most representative eukaryotic organisms. In vertebrates, the TEA family contains at least four members, which share overlapping DNA-binding specificity and have similar transcriptional activation properties. In this article, we describe the cDNA cloning and characterization of the murine TEA proteins DTEF-1 (mDTEF-1) and ETF. Using in situ hybridization analysis of mouse embryos, we found that mDTEF-1 and ETF transcript distributions substantially overlap. ETF is expressed throughout the embryo except in the myocardium early in development, whereas late in development, it is enriched in lung and neuroectoderm. Mouse DTEF-1 is expressed at a much lower level throughout development and is substantially enriched in ectoderm and skin, as well as in the developing pituitary at midgestation. Northern blot analysis of adult mouse tissue total RNA showed that both ETF and mDTEF-1 are abundant in uterus and lung relative to other tissues. Using gel mobility shift assays and GAL4-fusion protein analysis, we demonstrated that the full coding sequences of ETF and mDTEF-1 encode M-CAT/GT-IIC-binding proteins containing activation domains.
Bernkopf, Marie; Webersinke, Gerald; Tongsook, Chanakan; Koyani, Chintan N.; Rafiq, Muhammad A.; Ayaz, Muhammad; Müller, Doris; Enzinger, Christian; Aslam, Muhammad; Naeem, Farooq; Schmidt, Kurt; Gruber, Karl; Speicher, Michael R.; Malle, Ernst; Macheroux, Peter; Ayub, Muhammad; Vincent, John B.; Windpassinger, Christian; Duba, Hans-Christoph
2014-01-01
We describe the characterization of a gene for mild nonsyndromic autosomal recessive intellectual disability (ID) in two unrelated families, one from Austria, the other from Pakistan. Genome-wide single nucleotide polymorphism microarray analysis enabled us to define a region of homozygosity by descent on chromosome 17q25. Whole-exome sequencing and analysis of this region in an affected individual from the Austrian family identified a 5 bp frameshifting deletion in the METTL23 gene. By means of Sanger sequencing of METTL23, a nonsense mutation was detected in a consanguineous ID family from Pakistan for which homozygosity-by-descent mapping had identified a region on 17q25. Both changes lead to truncation of the putative METTL23 protein, which disrupts the predicted catalytic domain and alters the cellular localization. 3D-modelling of the protein indicates that METTL23 is strongly predicted to function as an S-adenosyl-methionine (SAM)-dependent methyltransferase. Expression analysis of METTL23 indicated a strong association with heat shock proteins, which suggests that these may act as a putative substrate for methylation by METTL23. A number of methyltransferases have been described recently in association with ID. Disruption of METTL23 presented here supports the importance of methylation processes for intact neuronal function and brain development. PMID:24626631
Selective Loss of Cysteine Residues and Disulphide Bonds in a Potato Proteinase Inhibitor II Family
Li, Xiu-Qing; Zhang, Tieling; Donnelly, Danielle
2011-01-01
Disulphide bonds between cysteine residues in proteins play a key role in protein folding, stability, and function. Loss of a disulphide bond is often associated with functional differentiation of the protein. The evolution of disulphide bonds is still actively debated; analysis of naturally occurring variants can promote understanding of the protein evolutionary process. One of the disulphide bond-containing protein families is the potato proteinase inhibitor II (PI-II, or Pin2, for short) superfamily, which is found in most solanaceous plants and participates in plant development, stress response, and defence. Each PI-II domain contains eight cysteine residues (8C), and two similar PI-II domains form a functional protein that has eight disulphide bonds and two non-identical reaction centres. It is still unclear which patterns and processes affect cysteine residue loss in PI-II. Through cDNA sequencing and data mining, we found six natural variants missing cysteine residues involved in one or two disulphide bonds at the first reaction centre. We named these variants Pi7C and Pi6C for the proteins missing one or two pairs of cysteine residues, respectively. This PI-II-7C/6C family was found exclusively in potato. The missing cysteine residues were in bonding pairs but distant from one another at the nucleotide/protein sequence level. The non-synonymous/synonymous substitution (Ka/Ks) ratio analysis suggested a positive evolutionary gene selection for Pi6C and various Pi7C. The selective deletion of the first reaction centre cysteine residues that are structure-level-paired but sequence-level-distant in PI-II illustrates the flexibility of PI-II domains and suggests the functionality of their transient gene versions during evolution. PMID:21494600
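The Ka/Ks analysis mentioned in this abstract compares the rate of non-synonymous substitutions (Ka) with the rate of synonymous substitutions (Ks). By the usual convention, a ratio above 1 suggests positive selection, a ratio near 1 suggests neutral evolution, and a ratio below 1 suggests purifying selection. The sketch below illustrates only that interpretation step; the function name, the tolerance band around 1 and the input values are assumptions for illustration and do not come from the paper, which does not report its Ka and Ks estimates in the abstract.

```python
def classify_selection(ka: float, ks: float, tolerance: float = 0.1) -> str:
    """Interpret a Ka/Ks ratio using the conventional thresholds.

    `tolerance` is an illustrative band around 1.0 treated as neutral;
    real analyses use statistical tests rather than a fixed cutoff.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    ratio = ka / ks
    if ratio > 1.0 + tolerance:
        return "positive selection"
    if ratio < 1.0 - tolerance:
        return "purifying selection"
    return "approximately neutral"


# Hypothetical rate estimates, used only to exercise the function.
print(classify_selection(0.2, 0.1))
print(classify_selection(0.05, 0.5))
print(classify_selection(0.1, 0.1))
```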
Pal Choudhury, Pabitra
2017-01-01
The periplasmic c7-type cytochrome A (PpcA) protein has been characterized in Geobacter sulfurreducens along with its four homologs (PpcB-E). Crystal structures indicate that PpcA can bind deoxycholate (DXCA), whereas its homologs do not, but the reason for this has yet to be established with certainty from primary protein sequence information. This study is based primarily on primary protein sequence analysis through the chemical properties of the constituent amino acids. Firstly, we compute chemical group-specific scores of the amino acids. Along with this, we have developed a new methodology for phylogenetic analysis based on chemical group dissimilarities of amino acids. This methodology is applied to the cytochrome c7 family members and pinpoints how a particular sequence differs from the others. Secondly, we build a graph-theoretic model of the amino acid sequences, which is also applied to the cytochrome c7 family members, and some unique characteristics and their domains are highlighted. Thirdly, we search for unique subsequence patterns that are common to the group or specific to an individual member. In all cases, we are able to show some distinct features of PpcA that set it apart from its homologs and may account for its binding to deoxycholate. Similarly, some notable features of the structurally dissimilar protein PpcD compared to the other homologs are also brought out. Further, because the five members of the cytochrome c7 family are homologous proteins, they must share some significant common features, which are also enumerated in this study. PMID:28362850
EMSA Analysis of DNA Binding By Rgg Proteins
LaSarre, Breah; Federle, Michael J.
2016-01-01
In bacteria, interaction of various proteins with DNA is essential for the regulation of specific target gene expression. Electrophoretic mobility shift assay (EMSA) is an in vitro approach allowing for the visualization of these protein-DNA interactions. Rgg proteins comprise a family of transcriptional regulators widespread among the Firmicutes. Some of these proteins function independently to regulate target gene expression, while others have now been demonstrated to function as effectors of cell-to-cell communication, having regulatory activities that are modulated via direct interaction with small signaling peptides. EMSA analysis can be used to assess DNA binding of either type of Rgg protein. EMSA analysis of Rgg protein activity has facilitated in vitro confirmation of regulatory targets, identification of precise DNA binding sites via DNA probe mutagenesis, and characterization of the mechanism by which some cognate signaling peptides modulate Rgg protein function (e.g. interruption of DNA-binding in some cases). PMID:27430004
Hosaka, Takashi; Ishii, Kazuhiro; Miura, Takeshi; Mezaki, Naomi; Kasuga, Kensaku; Ikeuchi, Takeshi; Tamaoka, Akira
2017-09-15
Progranulin gene (GRN) mutations are major causes of frontotemporal lobar degeneration. To date, 68 pathogenic GRN mutations have been identified. However, very few of these mutations have been reported in Asians. Moreover, some GRN mutations manifest with familial phenotypic heterogeneity. Here, we present a novel GRN mutation resulting in frontotemporal lobar degeneration with a distinct clinical phenotype, and we review reports of GRN mutations associated with familial phenotypic heterogeneity. We describe the case of a 74-year-old woman with left frontotemporal lobe atrophy who presented with progressive anarthria and non-fluent aphasia. Her brother had been diagnosed with corticobasal syndrome (CBS) with right-hand limb-kinetic apraxia, aphasia, and a similar pattern of brain atrophy. Laboratory blood examinations did not reveal abnormalities that could have caused cognitive dysfunction. In the cerebrospinal fluid, cell counts and protein concentrations were within normal ranges, and concentrations of tau protein and phosphorylated tau protein were also normal. Since similar familial cases due to mutation of GRN and microtubule-associated protein tau gene (MAPT) were reported, we performed genetic analysis. No pathological mutations of MAPT were identified, but we identified a novel GRN frameshift mutation (c.1118_1119delCCinsG: p.Pro373ArgX37) that resulted in progranulin haploinsufficiency. This is the first report of a GRN mutation associated with familial phenotypic heterogeneity in Japan. Literature review of GRN mutations associated with familial phenotypic heterogeneity revealed no tendency of mutation sites. The role of progranulin has been reported in this and other neurodegenerative diseases, and the analysis of GRN mutations may lead to the discovery of a new therapeutic target.
Functional analysis of the Arabidopsis PHT4 family of intracellular phosphate transporters.
Guo, B; Jin, Y; Wussler, C; Blancaflor, E B; Motes, C M; Versaw, W K
2008-01-01
The transport of phosphate (Pi) between subcellular compartments is central to metabolic regulation. Although some of the transporters involved in controlling the intracellular distribution of Pi have been identified in plants, others are predicted from genetic, biochemical and bioinformatics studies. Heterologous expression in yeast, and gene expression and localization in plants were used to characterize all six members of an Arabidopsis thaliana membrane transporter family designated here as PHT4. PHT4 proteins share similarity with SLC17/type I Pi transporters, a diverse group of animal proteins involved in the transport of Pi, organic anions and chloride. All of the PHT4 proteins mediate Pi transport in yeast with high specificity. Bioinformatic analysis and localization of PHT4-GFP fusion proteins indicate that five of the proteins are targeted to the plastid envelope, and the sixth resides in the Golgi apparatus. PHT4 genes are expressed in both roots and leaves, although two of the genes are expressed predominantly in leaves and one mostly in roots. These expression patterns, together with Pi transport activities and subcellular locations, suggest roles for PHT4 proteins in the transport of Pi between the cytosol and chloroplasts, heterotrophic plastids and the Golgi apparatus.
Jones, John T; Kumar, Amar; Pylypenko, Liliya A; Thirugnanasambandam, Amarnath; Castelli, Lydia; Chapman, Sean; Cock, Peter J A; Grenier, Eric; Lilley, Catherine J; Phillips, Mark S; Blok, Vivian C
2009-11-01
In this article, we describe the analysis of over 9000 expressed sequence tags (ESTs) from cDNA libraries obtained from various life cycle stages of Globodera pallida. We have identified over 50 G. pallida effectors from this dataset using bioinformatics analysis, by screening clones in order to identify secreted proteins up-regulated after the onset of parasitism and using in situ hybridization to confirm the expression in pharyngeal gland cells. A substantial gene family encoding G. pallida SPRYSEC proteins has been identified. The expression of these genes is restricted to the dorsal pharyngeal gland cell. Different members of the SPRYSEC family of proteins from G. pallida show different subcellular localization patterns in plants, with some localized to the cytoplasm and others to the nucleus and nucleolus. Differences in subcellular localization may reflect diverse functional roles for each individual protein or, more likely, variety in the compartmentalization of plant proteins targeted by the nematode. Our data are therefore consistent with the suggestion that the SPRYSEC proteins suppress host defences, as suggested previously, and that they achieve this through interaction with a range of host targets.
The sugar transporter inventory of tomato: genome-wide identification and expression analysis.
Reuscher, Stefan; Akiyama, Masahito; Yasuda, Tomohide; Makino, Haruko; Aoki, Koh; Shibata, Daisuke; Shiratake, Katsuhiro
2014-06-01
The mobility of sugars between source and sink tissues in plants depends on sugar transport proteins. Studying the corresponding genes allows the manipulation of the sink strength of developing fruits, thereby improving fruit quality for human consumption. Tomato (Solanum lycopersicum) is both a major horticultural crop and a model for the development of fleshy fruits. In this article we provide a comprehensive inventory of tomato sugar transporters, including the SUCROSE TRANSPORTER family, the SUGAR TRANSPORTER PROTEIN family, the SUGAR FACILITATOR PROTEIN family, the POLYOL/MONOSACCHARIDE TRANSPORTER family, the INOSITOL TRANSPORTER family, the PLASTIDIC GLUCOSE TRANSLOCATOR family, the TONOPLAST MONOSACCHARIDE TRANSPORTER family and the VACUOLAR GLUCOSE TRANSPORTER family. Expressed sequence tag (EST) sequencing and phylogenetic analyses established a nomenclature for all analyzed tomato sugar transporters. In total we identified 52 genes in tomato putatively encoding sugar transporters. The expression of 29 sugar transporter genes in vegetative tissues and during fruit development was analyzed. Several sugar transporter genes were expressed in a tissue- or developmental stage-specific manner. This information will be helpful to better understand source to sink movement of photoassimilates in tomato. Identification of fruit-specific sugar transporters might be a first step to find novel genes contributing to tomato fruit sugar accumulation. © The Author 2014. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved.
Gene and domain duplication in the chordate Otx gene family: insights from amphioxus Otx.
Williams, N A; Holland, P W
1998-05-01
We report the genomic organization and deduced protein sequence of a cephalochordate member of the Otx homeobox gene family (AmphiOtx) and show its probable single-copy state in the genome. We also present molecular phylogenetic analysis indicating that there was a single ancestral Otx gene in the first chordates, which was duplicated in the vertebrate lineage after it had split from the lineage leading to the cephalochordates. Duplication of a C-terminal protein domain has occurred specifically in the vertebrate lineage, strengthening the case for a single Otx gene in an ancestral chordate whose gene structure has been retained in an extant cephalochordate. Comparative analysis of protein sequences and published gene expression patterns suggests that the ancestral chordate Otx gene had roles in patterning the anterior mesendoderm and central nervous system. These roles were elaborated following Otx gene duplication in vertebrates, accompanied by regulatory and structural divergence, particularly of Otx1 descendant genes.
The sieve element occlusion gene family in dicotyledonous plants
Jekat, Stephan B; Nordzieke, Steffen; Reineke, Anna R; Müller, Boje; Bornberg-Bauer, Erich; Noll, Gundula A
2011-01-01
Sieve element occlusion (SEO) genes encoding forisome subunits have been identified in Medicago truncatula and other legumes. Forisomes are structural phloem proteins uniquely found in Fabaceae sieve elements. They undergo a reversible conformational change after wounding, from a condensed to a dispersed state, thereby blocking sieve tube translocation and preventing the loss of photoassimilates. Recently, we identified SEO genes in several non-Fabaceae plants (lacking forisomes) and concluded that they most probably encode conventional non-forisome P-proteins. Molecular and phylogenetic analysis of the SEO gene family has identified domains that are characteristic for SEO proteins. Here, we extended our phylogenetic analysis by including additional SEO genes from several diverse species based on recently published genomic data. Our results strengthen the original assumption that SEO genes seem to be widespread in dicotyledonous angiosperms, and further underline the divergent evolution of SEO genes within the Fabaceae. PMID:21422825
Ivanov, Stefan M; Huber, Roland G; Warwicker, Jim; Bond, Peter J
2016-11-01
Critical regulatory pathways are replete with instances of intra- and interfamily protein-protein interactions due to the pervasiveness of gene duplication throughout evolution. Discerning the specificity determinants within these systems has proven a challenging task. Here, we present an energetic analysis of the specificity determinants within the Bcl-2 family of proteins (key regulators of the intrinsic apoptotic pathway) via a total of ∼20 μs of simulation of 60 distinct protein-protein complexes. We demonstrate where affinity and specificity of protein-protein interactions arise across the family, and corroborate our conclusions with extensive experimental evidence. We identify energy and specificity hotspots that may offer valuable guidance in the design of targeted therapeutics for manipulating the protein-protein interactions within the apoptosis-regulating pathway. Moreover, we propose a conceptual framework that allows us to quantify the relationship between sequence, structure, and binding energetics. This approach may represent a general methodology for investigating other paralogous protein-protein interaction sites. Copyright © 2016 Elsevier Ltd. All rights reserved.
Robson, James F; Barker, Daniel
2015-10-13
To demonstrate the bioinformatics capabilities of a low-cost computer, the Raspberry Pi, we present a comparison of the protein-coding gene content of two species in phylum Chlamydiae: Chlamydia trachomatis, a common sexually transmitted infection of humans, and Candidatus Protochlamydia amoebophila, a recently discovered amoebal endosymbiont. Identifying species-specific proteins and differences in protein families could provide insights into the unique phenotypes of the two species. Using a Raspberry Pi computer, sequence similarity-based protein families were predicted across the two species, C. trachomatis and P. amoebophila, and their members counted. Examples include nine multi-protein families unique to C. trachomatis, 132 multi-protein families unique to P. amoebophila and one family with multiple copies in both. Most families unique to C. trachomatis were polymorphic outer-membrane proteins. Additionally, multiple protein families lacking functional annotation were found. Predicted functional interactions suggest one of these families is involved with the exodeoxyribonuclease V complex. The Raspberry Pi computer is adequate for a comparative genomics project of this scope. The protein families unique to P. amoebophila may provide a basis for investigating the host-endosymbiont interaction. However, additional species should be included; and further laboratory research is required to identify the functions of unknown or putative proteins. Multiple outer membrane proteins were found in C. trachomatis, suggesting importance for host evasion. The tyrosine transport protein family is shared between both species, with four proteins in C. trachomatis and two in P. amoebophila. Shared protein families could provide a starting point for discovery of wide-spectrum drugs against Chlamydiae.
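The family-counting step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is taken to be a mapping from a family ID to (species, protein) pairs produced by some earlier clustering step, the function name `classify_families` is hypothetical, and the toy data are invented rather than taken from the study.

```python
from collections import defaultdict

def classify_families(family_members):
    """Split multi-protein families by the species they occur in.

    family_members maps a family ID to a list of (species, protein_id) pairs.
    Returns (unique, shared): unique maps a species to the families found only
    in that species; shared lists families with multiple copies in every
    species that contains them. Mixed cases (one copy in one species, several
    in another) are deliberately left out of both results.
    """
    unique = defaultdict(list)
    shared = []
    for fam, members in family_members.items():
        counts = defaultdict(int)
        for species, _protein in members:
            counts[species] += 1
        if sum(counts.values()) < 2:
            continue  # a single protein is not a multi-protein family
        if len(counts) == 1:
            unique[next(iter(counts))].append(fam)
        elif all(n >= 2 for n in counts.values()):
            shared.append(fam)
    return dict(unique), shared

# Hypothetical toy input, not data from the study.
toy = {
    "fam1": [("C. trachomatis", "ct_a"), ("C. trachomatis", "ct_b")],
    "fam2": [("P. amoebophila", "pa_a"), ("P. amoebophila", "pa_b"),
             ("P. amoebophila", "pa_c")],
    "fam3": [("C. trachomatis", "ct_c"), ("C. trachomatis", "ct_d"),
             ("P. amoebophila", "pa_d"), ("P. amoebophila", "pa_e")],
    "fam4": [("C. trachomatis", "ct_e")],
}
unique, shared = classify_families(toy)
print(unique)
print(shared)
```

Because the work is a single pass over the family table with small dictionaries, this kind of tally is well within reach of low-memory hardware.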
Keck, Michael; van Dijk, Roelof Maarten; Deeg, Cornelia A; Kistler, Katharina; Walker, Andreas; von Rüden, Eva-Lotta; Russmann, Vera; Hauck, Stefanie M; Potschka, Heidrun
2018-04-01
Information about epileptogenesis-associated changes in protein expression patterns is of particular interest for future selection of target and biomarker candidates. Bioinformatic analysis of proteomic data sets can increase our knowledge about molecular alterations characterizing the different phases of epilepsy development following an initial epileptogenic insult. Here, we report findings from a focused analysis of proteomic data obtained for the hippocampus and parahippocampal cortex samples collected during the early post-insult phase, latency phase, and chronic phase of a rat model of epileptogenesis. The study focused on proteins functionally associated with cell stress, cell death, extracellular matrix (ECM) remodeling, cell-ECM interaction, cell-cell interaction, angiogenesis, and blood-brain barrier function. The analysis revealed prominent pathway enrichment providing information about the complex expression alterations of the respective protein groups. In the hippocampus, the number of differentially expressed proteins declined over time during the course of epileptogenesis. In contrast, a peak in the regulation of proteins linked with cell stress and death as well as ECM and cell-cell interaction became evident at later phases during epileptogenesis in the parahippocampal cortex. The data sets provide valuable information about the time course of protein expression patterns during epileptogenesis for a series of proteins. Moreover, the findings provide comprehensive novel information about expression alterations of proteins that have not yet been discussed in the context of epileptogenesis. These include, for instance, different members of the lamin protein family as well as the fermitin family member 2 (FERMT2). Induction of FERMT2 and other selected proteins, CD18 (ITGB2), CD44 and Nucleolin, was confirmed by immunohistochemistry.
Taken together, focused bioinformatic analysis of the proteomic data sets completes our knowledge about molecular alterations linked with cell death and cellular plasticity during epileptogenesis. The analysis provided can guide future selection of target and biomarker candidates. Copyright © 2018 Elsevier Inc. All rights reserved.
Shi, Xiao-Feng; Li, Yi-Nü; Yi, Yong-Zhu; Xiao, Xing-Guo; Zhang, Zhi-Fang
2015-01-01
The 30 K proteins, the major group of hemolymph proteins in the silkworm, Bombyx mori (Lepidoptera: Bombycidae), are structurally related with molecular masses of ∼30 kDa and are involved in various physiological processes, e.g., energy storage, embryonic development, and immune responses. For this report, known 30 K protein gene sequences were used as Blastn queries against sequences in the B. mori transcriptome (SilkTransDB). Twenty-nine cDNAs (Bm30K-1–29) were retrieved, including four that were previously unidentified in the Lipoprotein_11 family. The genomic structures of the 29 genes were analyzed and they were mapped to their corresponding chromosomes. Furthermore, phylogenetic analysis revealed that the 29 genes encode three types of 30 K proteins. The increase in members of each type is mainly a result of gene duplication, with the appearance of each type preceding the differentiation of the species included in the tree. Real-Time Quantitative Polymerase Chain Reaction (Q-PCR) confirmed that the genes could be expressed, and that the three types have different temporal expression patterns. Proteins from the hemolymph were separated by SDS-PAGE, and those with a molecular mass of ∼30 kDa were isolated and identified by mass spectrometry sequencing in combination with searches of various databases containing B. mori 30 K protein sequences. Of the 34 proteins identified, 13 are members of the 30 K protein family, including one that had not been found in SilkTransDB, although it had been found in the B. mori genome. Taken together, our results indicate that the 30 K protein family contains many members with various functions. Other methods will be required to find more members of the family. PMID:26078299
Wu, Zhi-Jun; Li, Xing-Hui; Liu, Zhi-Wei; Li, Hui; Wang, Yong-Xin; Zhuang, Jing
2016-02-01
Tea plant [Camellia sinensis (L.) O. Kuntze] is a leaf-type crop used to produce a healthy non-alcoholic beverage and has been introduced widely around the world. Tea is rich in various secondary metabolites, which are important for human health. However, varied climate and complex geography have posed challenges for tea plant survival. The WRKY gene family in plants is a large transcription factor family that is involved in biological processes related to stress defenses, development, and metabolite synthesis. Identification and analysis of WRKY family transcription factors in tea plant are therefore of considerable significance. In the present study, 50 putative C. sinensis WRKY proteins (CsWRKYs) with a complete WRKY domain were identified and divided into three groups (Group I-III) on the basis of phylogenetic analysis. The distribution of WRKY family transcription factors among plantae, fungi, and protozoa showed that the number of WRKY genes increased in higher plants, whereas the number of these genes did not correspond to the evolutionary relationships of different species. Structural feature and annotation analyses showed that CsWRKY proteins contained WRKYGQK/WRKYGKK domains and a C2H2/C2HC-type zinc-finger structure: D-X18-R-X1-Y-X2-C-X4-7-C-X23-H motif; CsWRKY proteins may be associated with the biological processes of abiotic and biotic stresses, tissue development, and hormone and secondary metabolite biosynthesis. Analysis under temperature stresses suggested that the candidate CsWRKY genes were involved in responses to extreme temperatures. The current study established an extensive overview of the WRKY family transcription factors in tea plant. This study also provided a global survey of CsWRKY transcription factors and a foundation for future functional identification and molecular breeding.
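The motif quoted in this abstract, D-X18-R-X1-Y-X2-C-X4-7-C-X23-H, translates directly into a regular expression, which may be a convenient way to screen candidate sequences. The sketch below is a literal transcription of the motif as written, with X read as "any residue"; the function name `scan` and the synthetic test sequence are assumptions made for illustration and are not real CsWRKY data.

```python
import re

# Literal transcription of D-X18-R-X1-Y-X2-C-X4-7-C-X23-H (X = any residue).
ZINC_FINGER = re.compile(r"D.{18}R.Y.{2}C.{4,7}C.{23}H")
# The two heptapeptide variants named in the abstract: WRKYGQK and WRKYGKK.
WRKY_CORE = re.compile(r"WRKYG[QK]K")

def scan(seq):
    """Return 0-based start positions of both patterns in a protein sequence."""
    seq = seq.upper()
    return {
        "wrky_core": [m.start() for m in WRKY_CORE.finditer(seq)],
        "zinc_finger": [m.start() for m in ZINC_FINGER.finditer(seq)],
    }

# Synthetic sequence built only to satisfy both patterns (hypothetical).
synthetic = ("MA" + "WRKYGQK" + "D" + "A" * 18 + "R" + "A" + "Y" + "AA"
             + "C" + "A" * 5 + "C" + "A" * 23 + "H" + "GG")
print(scan(synthetic))
```

A regular expression only reports exact pattern matches; profile-based domain searches would be needed to catch divergent family members.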
Using random forests for assistance in the curation of G-protein coupled receptor databases.
Shkurin, Aleksei; Vellido, Alfredo
2017-08-18
Biology is experiencing a gradual but fast transformation from a laboratory-centred science towards a data-centred one. As such, it requires robust data engineering and the use of quantitative data analysis methods as part of database curation. This paper focuses on G protein-coupled receptors, a large and heterogeneous super-family of cell membrane proteins of interest to biology in general. One of its families, Class C, is of particular interest to pharmacology and drug design. This family is quite heterogeneous on its own, and the discrimination of its several sub-families is a challenging problem. In the absence of known crystal structure, such discrimination must rely on their primary amino acid sequences. We are interested not as much in achieving maximum sub-family discrimination accuracy using quantitative methods, but in exploring sequence misclassification behavior. Specifically, we are interested in isolating those sequences showing consistent misclassification, that is, sequences that are very often misclassified and almost always to the same wrong sub-family. Random forests are used for this analysis due to their ensemble nature, which makes them naturally suited to gauge the consistency of misclassification. This consistency is here defined through the voting scheme of their base tree classifiers. Detailed consistency results for the random forest ensemble classification were obtained for all receptors and for all data transformations of their unaligned primary sequences. Shortlists of the most consistently misclassified receptors for each subfamily and transformation, as well as an overall shortlist including those cases that were consistently misclassified across transformations, were obtained. The latter should be referred to experts for further investigation as a data curation task. The automatic discrimination of the Class C sub-families of G protein-coupled receptors from their unaligned primary sequences shows clear limits. 
This study has investigated in some detail the consistency of their misclassification using random forest ensemble classifiers. Different sub-families have been shown to display very different discrimination consistency behaviors. The individual identification of consistently misclassified sequences should provide a tool for quality control to GPCR database curators.
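One way to picture "consistency of misclassification" from the votes of the base trees is sketched below. The abstract does not give the exact definition used, so the measure here (the share of trees voting for the single most popular wrong sub-family), the 0.8 threshold, the function names and the labels "A", "B", "C" are all assumptions for illustration. The per-tree votes are taken as given, for example from out-of-bag predictions.

```python
from collections import Counter

def misclassification_consistency(votes, true_label):
    """Return (most_voted_wrong_label, fraction_of_all_trees_voting_for_it).

    votes holds the label predicted by each tree for one sequence.
    Returns (None, 0.0) when no tree voted for a wrong label.
    """
    wrong = Counter(v for v in votes if v != true_label)
    if not wrong:
        return None, 0.0
    label, n = wrong.most_common(1)[0]
    return label, n / len(votes)

def shortlist(tree_votes, true_labels, threshold=0.8):
    """List sequences whose trees agree on the same wrong label at least
    `threshold` of the time, most consistent first."""
    rows = []
    for seq_id, votes in tree_votes.items():
        label, frac = misclassification_consistency(votes, true_labels[seq_id])
        if label is not None and frac >= threshold:
            rows.append((seq_id, true_labels[seq_id], label, frac))
    return sorted(rows, key=lambda r: r[3], reverse=True)

# Hypothetical votes from a 10-tree forest.
tree_votes = {
    "seqA": ["B"] * 9 + ["A"],              # almost always sent to B
    "seqB": ["C"] * 5 + ["B"] * 3 + ["A"] * 2,  # wrong, but inconsistently
    "seqC": ["A"] * 10,                     # always correct
}
true_labels = {"seqA": "A", "seqB": "A", "seqC": "A"}
print(shortlist(tree_votes, true_labels))
```

Sequences such as `seqA` are the ones a curator would most plausibly want to re-examine, since the disagreement with the database label is systematic rather than noisy.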
Lai, Jason; Jin, Jing; Kubelka, Jan; Liberles, David A
2012-09-21
Since the dynamic nature of protein structures is essential for enzymatic function, it is expected that functional evolution can be inferred from the changes in protein dynamics. However, dynamics can also diverge neutrally with sequence substitution between enzymes without changes of function. In this study, a phylogenetic approach is implemented to explore the relationship between enzyme dynamics and function through evolutionary history. Protein dynamics are described by normal mode analysis based on a simplified harmonic potential force field applied to the reduced C(α) representation of the protein structure while enzymatic function is described by Enzyme Commission numbers. Similarity of the binding pocket dynamics at each branch of the protein family's phylogeny was analyzed in two ways: (1) explicitly by quantifying the normal mode overlap calculated for the reconstructed ancestral proteins at each end and (2) implicitly using a diffusion model to obtain the reconstructed lineage-specific changes in the normal modes. Both explicit and implicit ancestral reconstruction identified generally faster rates of change in dynamics compared with the expected change from neutral evolution at the branches of potential functional divergences for the α-amylase, D-isomer-specific 2-hydroxyacid dehydrogenase, and copper-containing amine oxidase protein families. Normal mode analysis added additional information over just comparing the RMSD of static structures. However, the branch-specific changes were not statistically significant compared to background function-independent neutral rates of change of dynamic properties and blind application of the analysis would not enable prediction of changes in enzyme specificity. Copyright © 2012 Elsevier Ltd. All rights reserved.
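The abstract does not spell out how normal mode overlap is computed. A common convention, assumed here, is the absolute value of the normalized dot product between two mode vectors, which is 1 for identical directions of motion and 0 for orthogonal ones. The sketch below uses that convention; the function name `mode_overlap` and the example vectors are hypothetical.

```python
import math

def mode_overlap(u, v):
    """Absolute normalized dot product of two mode vectors of equal length.

    The sign is dropped because a normal mode and its negative describe the
    same oscillation.
    """
    if len(u) != len(v):
        raise ValueError("mode vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("mode vectors must be non-zero")
    return abs(dot) / (norm_u * norm_v)

# Hypothetical 3-component vectors; real modes have 3N components for N C-alpha atoms.
print(mode_overlap([1.0, 0.0, 0.0], [2.0, 0.0, 0.0]))
print(mode_overlap([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))
```

Comparing modes between two proteins also requires that their components refer to equivalent, structurally aligned residues, a step this sketch leaves out.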
Darbro, Benjamin W.; Mahajan, Vinit B.; Gakhar, Lokesh; Skeie, Jessica M.; Campbell, Elizabeth; Wu, Shu; Bing, Xinyu; Millen, Kathleen J.; Dobyns, William B.; Kessler, John A.; Jalali, Ali; Cremer, James; Segre, Alberto; Manak, J. Robert; Aldinger, Kimerbly A.; Suzuki, Satoshi; Natsume, Nagato; Ono, Maya; Hai, Huynh Dai; Viet, Le Thi; Loddo, Sara; Valente, Enza M.; Bernardini, Laura; Ghonge, Nitin; Ferguson, Polly J.; Bassuk, Alexander G.
2013-01-01
We performed whole-exome sequencing of a family with autosomal dominant Dandy-Walker malformation and occipital cephaloceles (ADDWOC) and detected a mutation in the extracellular matrix protein encoding gene NID1. In a second family, protein interaction network analysis identified a mutation in LAMC1, which encodes a NID1 binding partner. Structural modeling of the NID1-LAMC1 complex demonstrated that each mutation disrupts the interaction. These findings implicate the extracellular matrix in the pathogenesis of Dandy-Walker spectrum disorders. PMID:23674478
COGNAT: a web server for comparative analysis of genomic neighborhoods.
Klimchuk, Olesya I; Konovalov, Kirill A; Perekhvatov, Vadim V; Skulachev, Konstantin V; Dibrova, Daria V; Mulkidjanian, Armen Y
2017-11-22
In prokaryotic genomes, functionally coupled genes can be organized in conserved gene clusters enabling their coordinated regulation. Such clusters could contain one or several operons, which are groups of co-transcribed genes. Those genes that evolved from a common ancestral gene by speciation (i.e. orthologs) are expected to have similar genomic neighborhoods in different organisms, whereas those copies of the gene that are responsible for dissimilar functions (i.e. paralogs) could be found in dissimilar genomic contexts. Comparative analysis of genomic neighborhoods facilitates the prediction of co-regulated genes and helps to discern different functions in large protein families. We intended, building on the attribution of gene sequences to the clusters of orthologous groups of proteins (COGs), to provide a method for visualization and comparative analysis of genomic neighborhoods of evolutionary related genes, as well as a respective web server. Here we introduce the COmparative Gene Neighborhoods Analysis Tool (COGNAT), a web server for comparative analysis of genomic neighborhoods. The tool is based on the COG database, as well as the Pfam protein families database. As an example, we show the utility of COGNAT in identifying a new type of membrane protein complex that is formed by paralog(s) of one of the membrane subunits of the NADH:quinone oxidoreductase of type 1 (COG1009) and a cytoplasmic protein of unknown function (COG3002). This article was reviewed by Drs. Igor Zhulin, Uri Gophna and Igor Rogozin.
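The idea that orthologs tend to sit in similar genomic neighborhoods can be illustrated with a small comparison of the COG IDs flanking a gene in two genomes. This is a sketch under assumptions and not COGNAT's actual method: genomes are modeled as ordered lists of COG IDs, similarity is a plain Jaccard index, and the identifiers X1 to X7 are made-up placeholders. Only COG1009 and COG3002 come from the abstract.

```python
def neighborhood(genome, index, flank=2):
    """COG IDs of up to `flank` genes on each side of position `index`,
    excluding the gene at `index` itself."""
    left = genome[max(0, index - flank):index]
    right = genome[index + 1:index + 1 + flank]
    return set(left) | set(right)

def jaccard(a, b):
    """Shared fraction of two sets; two empty sets are scored 0.0."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)

# Hypothetical gene orders; each entry is the COG assigned to one gene.
genome_a = ["X1", "COG1009", "COG3002", "X2", "X3"]
genome_b = ["X4", "COG3002", "COG1009", "X1", "X5"]
genome_c = ["X6", "COG1009", "X7"]

n_a = neighborhood(genome_a, genome_a.index("COG1009"), flank=1)
n_b = neighborhood(genome_b, genome_b.index("COG1009"), flank=1)
n_c = neighborhood(genome_c, genome_c.index("COG1009"), flank=1)
print(jaccard(n_a, n_b))
print(jaccard(n_a, n_c))
```

A high score between two copies hints at a conserved context, while a low score for a third copy is the kind of signal that could point to a paralog with a different role.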
Molecular Physiology of SPAK and OSR1: Two Ste20-Related Protein Kinases Regulating Ion Transport
Gagnon, Kenneth B.; Delpire, Eric
2015-01-01
SPAK (Ste20-related proline alanine rich kinase) and OSR1 (oxidative stress responsive kinase) are members of the germinal center kinase VI sub-family of the mammalian Ste20 (Sterile20)-related protein kinase family. Although there are 30 enzymes in this protein kinase family, their conservation across the fungi, plant and animal kingdom confirms their evolutionary importance. Already, a large volume of work has accumulated on the tissue distribution, binding partners, signaling cascades, and physiological roles of mammalian SPAK and OSR1 in multiple organ systems. After reviewing this basic information, we will examine newer studies that demonstrate the pathophysiological consequences to SPAK and/or OSR1 disruption, discuss the development and analysis of genetically-engineered mouse models, and address the possible role these serine/threonine kinases might have in cancer proliferation and migration. PMID:23073627
Two suppressors of RNA silencing encoded by cereal-infecting members of the family Luteoviridae.
Liu, Yan; Zhai, Hao; Zhao, Kun; Wu, Beilei; Wang, Xifeng
2012-08-01
Several members of the family Luteoviridae are important pathogens of cultivated plant species of the family Gramineae. In this study, we explored RNA-silencing suppressors (RSSs) encoded by two cereal-infecting luteoviruses: barley yellow dwarf virus and wheat yellow dwarf virus (BYDV and WYDV, respectively). The P0 protein of WYDV-GPV (P0(GPV)) and the P6 protein of BYDV-GAV (P6(GAV)) displayed RSS activities when expressed in agro-infiltrated leaves of Nicotiana benthamiana, by their local ability to inhibit post-transcriptional gene silencing of GFP. Analysis of GFP, mRNA and GFP-specific small interfering RNA indicated that both P0(GPV) and P6(GAV) are suppressors of silencing that can restrain not only local but also systemic gene silencing. This is the first report of RSS activity of the P6 protein in a member of the genus Luteovirus.
Plants, symbiosis and parasites: a calcium signalling connection.
Harper, Jeffrey F; Harmon, Alice
2005-07-01
A unique family of protein kinases has evolved with regulatory domains containing sequences that are related to Ca(2+)-binding EF-hands. In this family, the archetypal Ca(2+)-dependent protein kinases (CDPKs) have been found in plants and some protists, including the malarial parasite, Plasmodium falciparum. Recent genetic evidence has revealed isoform-specific functions for a CDPK that is essential for Plasmodium berghei gametogenesis, and for a related chimeric Ca(2+) and calmodulin-dependent protein kinase (CCaMK) that is essential to the formation of symbiotic nitrogen-fixing nodules in plants. In Arabidopsis thaliana, the analysis of 42 isoforms of CDPK and related kinases is expected to delineate Ca(2+) signalling pathways in all aspects of plant biology.
Pereira-Santana, Alejandro; Alcaraz, Luis David; Castaño, Enrique; Sanchez-Calderon, Lenin; Sanchez-Teyer, Felipe; Rodriguez-Zapata, Luis
2015-01-01
NAC proteins constitute one of the largest groups of plant-specific transcription factors and are known to play essential roles in various developmental processes. They are also important in plant responses to stresses such as drought, soil salinity, cold, and heat, which adversely affect growth. The current knowledge regarding the distribution of NAC proteins in plant lineages comes from relatively small samplings from the available data. In the present study, we broadened the number of plant species used to study the origin and evolution of the NAC family, in order to shed new light on the evolutionary history of this family in angiosperms. A comparative genome analysis was performed on 24 land plant species, and NAC ortholog groups were identified by means of bidirectional BLAST hits. Large NAC gene families are found in those species that have experienced more whole-genome duplication events, pointing to an expansion of the NAC family with divergent functions in flowering plants. A total of 3,187 NAC transcription factors that clustered into six major groups were used in the phylogenetic analysis. Many orthologous groups were found in the monocot and eudicot lineages, but only five orthologous groups were found between P. patens and each representative taxon of flowering plants. These groups were called basal orthologous groups and likely expanded into more recent taxa to cope with their environmental needs. This analysis of the angiosperm NAC family represents an effort to grasp the evolutionary and functional diversity within this gene family while providing a basis for further functional research on vascular plant gene families. PMID:26569117
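The "bidirectional BLAST hits" step mentioned in this abstract is commonly implemented as a reciprocal best hit search. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the hit tuples, the use of bit score as the ranking criterion, and the gene names are all hypothetical, and parsing of real BLAST output is omitted.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, bitscore) tuples, for example
    parsed from a tabular BLAST report (parsing is not shown here).
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)


# Toy data with hypothetical gene names from two species, A and B.
a_vs_b = [
    ("NAC_A1", "NAC_B1", 250.0),
    ("NAC_A1", "NAC_B2", 180.0),
    ("NAC_A2", "NAC_B2", 90.0),
    ("NAC_A3", "NAC_B2", 210.0),
]
b_vs_a = [
    ("NAC_B1", "NAC_A1", 245.0),
    ("NAC_B2", "NAC_A3", 205.0),
    ("NAC_B2", "NAC_A2", 88.0),
]

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In this toy case NAC_A2 is dropped because its best hit, NAC_B2, points back to NAC_A3 instead, which is the behavior that makes the bidirectional criterion stricter than a one-way search.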
Malviya, N; Gupta, S; Singh, V K; Yadav, M K; Bisht, N C; Sarangi, B K; Yadav, D
2015-02-01
The DNA binding with One Finger (Dof) protein is a plant-specific transcription factor involved in the regulation of a wide range of processes. Analysis of the whole genome sequence of pigeonpea identified 38 putative Dof genes (CcDof) distributed on 8 chromosomes. A total of 17 of the 38 CcDof genes were found to be intronless. A comprehensive in silico characterization of the CcDof gene family, including gene structure, chromosome location, protein motif, phylogeny, gene duplication and functional divergence, has been attempted. The phylogenetic analysis resulted in 3 major clusters, and closely related members in the phylogenetic tree revealed a common motif distribution. The in silico cis-regulatory element analysis revealed functional diversity with a predominance of light-responsive and stress-responsive elements, indicating that these CcDof genes may be associated with photoperiodic control and with biotic and abiotic stress. The duplication pattern showed that tandem duplication is predominant over segmental duplication events. The comparative phylogenetic analysis of these Dof proteins along with 78 soybean, 36 Arabidopsis and 30 rice Dof proteins revealed 7 major clusters. Several groups of orthologs and paralogs were identified based on the phylogenetic tree constructed. Our study provides useful information for functional characterization of CcDof genes.
Integrated genome sequence and linkage map of physic nut (Jatropha curcas L.), a biodiesel plant.
Wu, Pingzhi; Zhou, Changpin; Cheng, Shifeng; Wu, Zhenying; Lu, Wenjia; Han, Jinli; Chen, Yanbo; Chen, Yan; Ni, Peixiang; Wang, Ying; Xu, Xun; Huang, Ying; Song, Chi; Wang, Zhiwen; Shi, Nan; Zhang, Xudong; Fang, Xiaohua; Yang, Qing; Jiang, Huawu; Chen, Yaping; Li, Meiru; Wang, Ying; Chen, Fan; Wang, Jun; Wu, Guojiang
2015-03-01
The family Euphorbiaceae includes some of the most efficient biomass accumulators. Whole genome sequencing and the development of genetic maps of these species are important components in molecular breeding and genetic improvement. Here we report the draft genome of physic nut (Jatropha curcas L.), a biodiesel plant. The assembled genome has a total length of 320.5 Mbp and contains 27,172 putative protein-coding genes. We established a linkage map containing 1208 markers and anchored the genome assembly (81.7%) to this map to produce 11 pseudochromosomes. After gene family clustering, 15,268 families were identified, of which 13,887 existed in the castor bean genome. Analysis of the genome highlighted specific expansion and contraction of a number of gene families during the evolution of this species, including the ribosome-inactivating proteins and oil biosynthesis pathway enzymes. The genomic sequence and linkage map provide a valuable resource not only for fundamental and applied research on physic nut but also for evolutionary and comparative genomics analysis, particularly in the Euphorbiaceae. © 2015 The Authors The Plant Journal © 2015 John Wiley & Sons Ltd.
Bul Proteins, a Nonredundant, Antagonistic Family of Ubiquitin Ligase Regulatory Proteins
Novoselova, Tatiana V.; Zahira, Kiran; Rose, Ruth-Sarah
2012-01-01
Like other Nedd4 ligases, Saccharomyces cerevisiae E3 Rsp5p utilizes adaptor proteins to interact with some substrates. Previous studies have identified Bul1p and Bul2p as adaptor proteins that facilitate the ligase-substrate interaction. Here, we show the identification of a third member of the Bul family, Bul3p, the product of two adjacent open reading frames separated by a stop codon that undergoes readthrough translation. Combinatorial analysis of BUL gene deletions reveals that they regulate some, but not all, of the cellular pathways known to involve Rsp5p. Surprisingly, we find that Bul proteins can act antagonistically to regulate the same ubiquitin-dependent process, and the nature of this antagonistic activity varies between different substrates. We further show, using in vitro ubiquitination assays, that the Bul proteins have different specificities for WW domains and that the two forms of Bul3p interact differently with Rsp5p, potentially leading to alternate functional outcomes. These data introduce a new level of complexity into the regulatory interactions that take place between Rsp5p and its adaptors and substrates and suggest a more critical role for the Bul family of proteins in controlling adaptor-mediated ubiquitination. PMID:22307975
Evolution of the Translocation and Assembly Module (TAM)
Heinz, Eva; Selkrig, Joel; Belousoff, Matthew J.; Lithgow, Trevor
2015-01-01
Bacterial outer membrane proteins require the beta-barrel assembly machinery (BAM) for their correct folding and function. The central component of this machinery is BamA, an Omp85 protein that is essential and found in all Gram-negative bacteria. An additional feature of the BAM is the translocation and assembly module (TAM), comprising TamA (an Omp85 family protein) and TamB. We report that TamA and a closely related protein, TamL, are confined almost exclusively to Proteobacteria and Bacteroidetes/Chlorobi, respectively, whereas TamB is widely distributed across the majority of Gram-negative bacterial lineages. A comprehensive phylogenetic and secondary structure analysis of the TamB protein family revealed that TamB was present very early in the evolution of bacteria. Several sequence characteristics were discovered to define the TamB protein family: a signal-anchor linkage to the inner membrane, beta-helical structure, conserved domain architecture and a C-terminal region that mimics outer membrane protein beta-strands. Taken together, the structural and phylogenetic analyses suggest that the TAM likely evolved from an original combination of BamA and TamB, with a later gene duplication event of BamA, giving rise to an additional Omp85 sequence that evolved to be TamA in Proteobacteria and TamL in Bacteroidetes/Chlorobi. PMID:25994932
Statistical discovery of site inter-dependencies in sub-molecular hierarchical protein structuring.
Durston, Kirk K; Chiu, David Ky; Wong, Andrew Kc; Li, Gary Cl
2012-07-13
Much progress has been made in understanding the 3D structure of proteins using methods such as NMR and X-ray crystallography. The resulting 3D structures are extremely informative, but do not always reveal which sites and residues within the structure are of special importance. Recently, there have been indications that multiple-residue, sub-domain structural relationships within the larger 3D consensus structure of a protein can be inferred from the analysis of the multiple sequence alignment data of a protein family. These intra-dependent clusters of associated sites are used to indicate hierarchical inter-residue relationships within the 3D structure. To reveal the patterns of associations among individual amino acids or sub-domain components within the structure, we apply a k-modes attribute (aligned site) clustering algorithm to the ubiquitin and transthyretin families in order to discover associations among groups of sites within the multiple sequence alignment. We then observe what these associations imply within the 3D structure of these two protein families. The k-modes site clustering algorithm we developed maximizes the intra-group interdependencies based on a normalized mutual information measure. The clusters formed correspond to sub-structural components or binding and interface locations. Applying this data-directed method to the ubiquitin and transthyretin protein family multiple sequence alignments as a test bed, we located numerous interesting associations of interdependent sites. These clusters were then arranged into cluster tree diagrams which revealed four structural sub-domains within the single domain structure of ubiquitin and a single large sub-domain within transthyretin associated with the interface among transthyretin monomers. In addition, several clusters of mutually interdependent sites were discovered for each protein family, each of which appears to play an important role in the molecular structure and/or function. Our results demonstrate that the method we present here using a k-modes site clustering algorithm based on interdependency evaluation among sites obtained from a sequence alignment of homologous proteins can provide significant insights into the complex, hierarchical inter-residue structural relationships within the 3D structure of a protein family. PMID:22793672
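The interdependency measure between aligned sites described in this abstract can be illustrated with a pairwise normalized mutual information calculation. This is a minimal sketch, not the authors' algorithm: it covers only the pairwise score (not the k-modes clustering), the normalization by joint entropy is one common choice assumed here because the abstract does not state the formula, and the four-sequence alignment is invented toy data.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy in bits of a Counter of observed symbols."""
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def normalized_mutual_information(col_x, col_y):
    """Return I(X;Y) / H(X,Y) for two alignment columns of equal length.

    The normalization by joint entropy is an assumption of this sketch.
    A pair with zero joint entropy (both columns constant) scores 0.0.
    """
    n = len(col_x)
    h_x = entropy(Counter(col_x), n)
    h_y = entropy(Counter(col_y), n)
    h_xy = entropy(Counter(zip(col_x, col_y)), n)
    if h_xy == 0.0:
        return 0.0
    return (h_x + h_y - h_xy) / h_xy


# Toy alignment (hypothetical): columns 0 and 1 co-vary perfectly,
# column 2 is fully conserved, column 3 varies independently of column 0.
alignment = ["AKLD", "AKLE", "GRLD", "GRLE"]
columns = list(zip(*alignment))

nmi_covarying = normalized_mutual_information(columns[0], columns[1])
nmi_conserved = normalized_mutual_information(columns[0], columns[2])
nmi_independent = normalized_mutual_information(columns[0], columns[3])
print(nmi_covarying, nmi_conserved, nmi_independent)
```

A site clustering method of the kind described would then group columns so that scores like these are high within a group; how the groups are seeded and updated is specific to the published method and is not reproduced here.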
Calcium-binding protein from mouse Ehrlich ascites-tumour cells is homologous to human calcyclin.
Kuźnicki, J; Filipek, A; Hunziker, P E; Huber, S; Heizmann, C W
1989-01-01
A Ca2+-binding protein was purified from mouse Ehrlich ascites-tumour cells. The protein forms monomers and disulphide-linked dimers, which can be separated by reverse-phase h.p.l.c. A partial amino acid sequence analysis demonstrated that the protein has an EF-hand structure. A striking homology was found to rat and human calcyclin (a member of the S-100 protein family), which is possibly involved in cell-cycle regulation. PMID:2597136
Yang, Hai-Ling; Liu, Yan-Jing; Wang, Cai-Ling; Zeng, Qing-Yin
2012-01-01
Trehalose-6-phosphate synthase (TPS) plays important roles in trehalose metabolism and signaling. Plant TPS proteins contain both a TPS and a trehalose-6-phosphate phosphatase (TPP) domain, which are coded by a multi-gene family. The plant TPS gene family has been divided into class I and class II. A previous study showed that the Populus, Arabidopsis, and rice genomes have seven class I and 27 class II TPS genes. In this study, we found that all class I TPS genes had 16 introns within the protein-coding region, whereas class II TPS genes had two introns. A significant sequence difference between the two classes of TPS proteins was observed by pairwise sequence comparisons of the 34 TPS proteins. A phylogenetic analysis revealed that at least seven TPS genes were present in the monocot–dicot common ancestor. Segmental duplications contributed significantly to the expansion of this gene family. At least five and three TPS genes were created by segmental duplication events in the Populus and rice genomes, respectively. Both the TPS and TPP domains of 34 TPS genes have evolved under purifying selection, but the selective constraint on the TPP domain was more relaxed than that on the TPS domain. Among 34 TPS genes from Populus, Arabidopsis, and rice, four class I TPS genes (AtTPS1, OsTPS1, PtTPS1, and PtTPS2) were under stronger purifying selection, whereas three Arabidopsis class I TPS genes (AtTPS2, 3, and 4) apparently evolved under relaxed selective constraint. Additionally, a reverse transcription polymerase chain reaction analysis showed the expression divergence of the TPS gene family in Populus, Arabidopsis, and rice under normal growth conditions and in response to stressors. Our findings provide new insights into the mechanisms of gene family expansion and functional evolution. PMID:22905132
Genome-wide identification and expression profiling of the SnRK2 gene family in Malus prunifolia.
Shao, Yun; Qin, Yuan; Zou, Yangjun; Ma, Fengwang
2014-11-15
Sucrose non-fermenting-1-related protein kinase 2 (SnRK2) constitutes a small plant-specific serine/threonine kinase family with essential roles in the abscisic acid (ABA) signal pathway and in responses to osmotic stress. Although a genome-wide analysis of this family has been conducted in some species, little is known about SnRK2 genes in apple (Malus domestica). We identified 14 putative sequences encoding 12 deduced SnRK2 proteins within the apple genome. Gene chromosomal location and synteny analysis of the apple SnRK2 genes indicated that tandem and segmental duplications have likely contributed to the expansion and evolution of these genes. All 12 full-length coding sequences were confirmed by cloning from Malus prunifolia. The gene structure and motif compositions of the apple SnRK2 genes were analyzed. Phylogenetic analysis showed that MpSnRK2s could be classified into four groups. Profiling of these genes presented differential patterns of expression in various tissues. Under stress conditions, transcript levels for some family members were up-regulated in the leaves in response to drought, salinity, or ABA treatments. This suggested their possible roles in plant response to abiotic stress. Our findings provide essential information about SnRK2 genes in apple and will contribute to further functional dissection of this gene family. Copyright © 2014 Elsevier B.V. All rights reserved.
KinFin: Software for Taxon-Aware Analysis of Clustered Protein Sequences.
Laetsch, Dominik R; Blaxter, Mark L
2017-10-05
The field of comparative genomics is concerned with the study of similarities and differences between the information encoded in the genomes of organisms. A common approach is to define gene families by clustering protein sequences based on sequence similarity, and analyze protein cluster presence and absence in different species groups as a guide to biology. Due to the high dimensionality of these data, downstream analysis of protein clusters inferred from large numbers of species, or species with many genes, is nontrivial, and few solutions exist for transparent, reproducible, and customizable analyses. We present KinFin, a streamlined software solution capable of integrating data from common file formats and delivering aggregative annotation of protein clusters. KinFin delivers analyses based on systematic taxonomy of the species analyzed, or on user-defined groupings of taxa, for example, sets based on attributes such as life history traits, organismal phenotypes, or competing phylogenetic hypotheses. Results are reported through graphical and detailed text output files. We illustrate the utility of the KinFin pipeline by addressing questions regarding the biology of filarial nematodes, which include parasites of veterinary and medical importance. We resolve the phylogenetic relationships between the species and explore functional annotation of proteins in clusters in key lineages and between custom taxon sets, identifying gene families of interest. KinFin can easily be integrated into existing comparative genomic workflows, and promotes transparent and reproducible analysis of clustered protein data. Copyright © 2017 Laetsch and Blaxter.
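The idea of analyzing cluster presence and absence across user-defined taxon sets can be sketched in a few lines. This is an illustration of the general approach only and does not use or reproduce KinFin's interface or file formats; the function names, cluster identifiers, taxon labels, and set names below are all hypothetical.

```python
def clusters_by_taxon_set(clusters, protein_to_taxon, taxon_sets):
    """For each cluster, list the user-defined taxon sets it has members in."""
    summary = {}
    for cluster_id, proteins in clusters.items():
        taxa = {protein_to_taxon[p] for p in proteins}
        summary[cluster_id] = sorted(
            name for name, members in taxon_sets.items() if taxa & members
        )
    return summary


def specific_to(summary, set_name):
    """Return cluster IDs whose members all come from a single taxon set."""
    return sorted(cid for cid, present in summary.items() if present == [set_name])


# Hypothetical clustering of proteins from three species into three clusters.
clusters = {
    "OG1": ["sp1_g1", "sp2_g1", "sp3_g1"],
    "OG2": ["sp1_g2", "sp2_g2"],
    "OG3": ["sp3_g2"],
}
protein_to_taxon = {
    "sp1_g1": "sp1", "sp1_g2": "sp1",
    "sp2_g1": "sp2", "sp2_g2": "sp2",
    "sp3_g1": "sp3", "sp3_g2": "sp3",
}
# Hypothetical grouping by a life history trait.
taxon_sets = {"parasitic": {"sp1", "sp2"}, "free_living": {"sp3"}}

summary = clusters_by_taxon_set(clusters, protein_to_taxon, taxon_sets)
parasite_specific = specific_to(summary, "parasitic")
print(summary)
print(parasite_specific)
```

Swapping in a different `taxon_sets` mapping (for example one reflecting a competing phylogenetic hypothesis) re-partitions the same clusters without re-running the clustering, which is the kind of flexibility the abstract describes.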
NASA Astrophysics Data System (ADS)
Mittal, Shikha; Banduni, Pooja; Mallikarjuna, Mallana G.; Rao, Atmakuri R.; Jain, Prashant A.; Dash, Prasanta K.; Thirunavukkarasu, Nepolean
2018-05-01
Drought is one of the major threats to maize production. In order to improve production and to breed tolerant hybrids, understanding the genes and regulatory mechanisms active during drought stress is important. Transcription factors (TFs) play a major role in gene regulation, and many TFs have been identified in response to drought stress. In our experiment, a set of 15 major TF families comprising 1436 genes was structurally and functionally characterized using in-silico tools and a gene expression assay. All 1436 genes were mapped onto the 10 chromosomes of maize. The functional annotation indicated the involvement of these genes in ABA signaling, ROS scavenging, photosynthesis, stomatal regulation, and sucrose metabolism. Duplication was identified as the primary force in divergence and expansion of TF families. Phylogenetic relationships were inferred individually for each TF family as well as for the combined TF families. Phylogenetic analysis grouped the TF family genes into TF-specific and mixed groups. Phylogenetic analysis of genes belonging to various TF families suggested that the origin of TFs occurred in the lineage of maize evolution. Gene structure analysis revealed that more genes were intron-rich than intronless. Drought-responsive cis-regulatory elements (CREs) such as ABREA, ABREB, DRE1 and DRECRTCOREAT have been identified. Expression and interaction analyses identified a leaf-specific bZIP TF, GRMZM2G140355, as a potential contributor toward drought tolerance in maize. We also analyzed the protein-protein interaction network of 269 drought-responsive genes belonging to different drought-related TFs. The information generated on structural and functional characteristics, expression and interaction of the drought-related TF families will be useful to decipher the drought tolerance mechanisms and to derive drought-tolerant genotypes in maize.
Saleha, Shamim; Ajmal, Muhammad; Jamil, Muhammad; Nasir, Muhammad; Hameed, Abdul
2016-01-01
The aim was to map the Usher phenotype in a consanguineous Pakistani family and to identify the disease-associated mutation in the causative gene in order to establish a phenotype-genotype correlation. A consanguineous Pakistani family in which the Usher phenotype was segregating as an autosomal recessive trait was ascertained. On the basis of clinical investigations of affected members of this family, the disease was diagnosed as Usher syndrome (USH). To identify the locus responsible for the Usher phenotype in this family, genomic DNA from a blood sample of each individual was genotyped using microsatellite Short Tandem Repeat (STR) markers for the known Usher syndrome loci. Direct sequencing was then performed to find disease-associated mutations in the candidate gene. By genetic linkage analysis, the USH phenotype of this family was mapped to the PCDH15 locus on chromosome 10q21.1. Three different point mutations in exon 11 of PCDH15 were identified, and one of them, c.1304A>C, was found to segregate with the disease phenotype in this family. This c.1304A>C transversion predicts an amino acid substitution of aspartic acid with alanine at residue 435 (p.D435A) of the protein product. Moreover, in silico analysis revealed conservation of aspartic acid at position 435 and predicted this change to be pathogenic. The identification of the c.1304A>C pathogenic mutation in the PCDH15 gene and its association with Usher syndrome in a consanguineous Pakistani family is the first example of a missense mutation of PCDH15 causing the USH1 phenotype. In previous reports, it was hypothesized that severe mutations, such as those producing a truncated PCDH15 protein, lead to the Usher I phenotype and that missense variants are mainly responsible for non-syndromic hearing impairment.
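The correspondence between the coding-sequence change c.1304A>C and the protein change p.D435A can be checked with simple codon arithmetic. The sketch below is illustrative only: the helper names are hypothetical, the codon table is restricted to the four standard codons needed here, and the actual reference codon at this position in PCDH15 is not given in the abstract, so both standard aspartic acid codons are tried.

```python
def codon_position(cds_pos):
    """Return (codon_number, position_in_codon), both 1-based,
    for a 1-based coordinate in the coding sequence."""
    codon_number = (cds_pos - 1) // 3 + 1
    position_in_codon = (cds_pos - 1) % 3 + 1
    return codon_number, position_in_codon


def substitute(codon, position_in_codon, new_base):
    """Replace the base at a 1-based position within a codon."""
    i = position_in_codon - 1
    return codon[:i] + new_base + codon[i + 1:]


# Subset of the standard genetic code: aspartic acid (D) and alanine (A).
CODONS = {"GAT": "D", "GAC": "D", "GCT": "A", "GCC": "A"}

residue, offset = codon_position(1304)
print(residue, offset)

# Which reference codon PCDH15 uses here is an unknown in this sketch,
# so apply the A>C change to both standard aspartic acid codons.
outcomes = {}
for ref_codon in ("GAT", "GAC"):
    assert ref_codon[offset - 1] == "A"  # the substituted base must be an A
    outcomes[ref_codon] = CODONS[substitute(ref_codon, offset, "C")]
print(outcomes)
```

Nucleotide 1304 falls at the second position of codon 435, and an A to C change at that position converts either aspartic acid codon into an alanine codon, which agrees with the reported p.D435A.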
[Analysis of the NDP gene in a Chinese family with X-linked recessive Norrie disease].
Mei, Libin; Huang, Yanru; Pan, Qian; Liang, Desheng; Wu, Lingqian
2015-05-01
The purpose of the current research was to investigate the NDP (Norrie disease protein) gene in one Chinese family with Norrie disease (ND) and to characterize the related clinical features. Clinical data of the proband and his family members were collected. Complete ophthalmic examinations were carried out on the proband. Genomic DNA was extracted from peripheral blood leukocytes of 35 family members. Molecular analysis of the NDP gene was performed by polymerase chain reaction and direct sequencing of all exons and flanking regions. A hemizygous NDP missense mutation, c.362G>A (p.Arg121Gln), in exon 3 was identified in the affected members, but not in any of the unaffected family members. The missense mutation c.362G>A in NDP is responsible for Norrie disease in this family. This discovery will help provide the family members with accurate and reliable genetic counseling and prenatal diagnosis.
Molano, Eddy Patricia Lopez; Cabrera, Odalys García; Jose, Juliana; do Nascimento, Leandro Costa; Carazzolle, Marcelo Falsarella; Teixeira, Paulo José Pereira Lima; Alvarez, Javier Correa; Tiburcio, Ricardo Augusto; Tokimatu Filho, Paulo Massanari; de Lima, Gustavo Machado Alvares; Guido, Rafael Victório Carvalho; Corrêa, Thamy Lívia Ribeiro; Leme, Adriana Franco Paes; Mieczkowski, Piotr; Pereira, Gonçalo Amarante Guimarães
2018-01-17
The Ceratocystis genus harbors a large number of phytopathogenic fungi that cause xylem parenchyma degradation and vascular destruction on a broad range of economically important plants. Ceratocystis cacaofunesta is a necrotrophic fungus responsible for lethal wilt disease in cacao. The aim of this work is to analyze the genome of C. cacaofunesta through a comparative approach with genomes of other Sordariomycetes in order to better understand the molecular basis of pathogenicity in the Ceratocystis genus. We present an analysis of the C. cacaofunesta genome focusing on secreted proteins that might constitute pathogenicity factors. Comparative genome analyses among five Ceratocystidaceae species and 23 other Sordariomycetes fungi showed a strong reduction in gene content of the Ceratocystis genus. However, some gene families displayed a remarkable expansion, in particular, the Phosphatidylinositol specific phospholipases-C (PI-PLC) family. Also, evolutionary rate calculations suggest that the evolution process of this family was guided by positive selection. Interestingly, among the 82 PI-PLCs genes identified in the C. cacaofunesta genome, 70 genes encoding extracellular PI-PLCs are grouped in eight small scaffolds surrounded by transposon fragments and scars that could be involved in the rapid evolution of the PI-PLC family. Experimental secretome using LC-MS/MS validated 24% (86 proteins) of the total predicted secretome (342 proteins), including four PI-PLCs and other important pathogenicity factors. Analysis of the Ceratocystis cacaofunesta genome provides evidence that PI-PLCs may play a role in pathogenicity. Subsequent functional studies will be aimed at evaluating this hypothesis. The observed genetic arsenals, together with the analysis of the PI-PLC family shown in this work, reveal significant differences in the Ceratocystis genome compared to the classical vascular fungi, Verticillium and Fusarium. 
Altogether, our analyses provide new insights into the evolution and the molecular basis of plant pathogenicity.
Labbunruang, Nipawan; Phadungsil, Wansika; Tesana, Smarn; Smooker, Peter M; Grams, Rudi
2016-05-01
Opisthorchis viverrini is the causative agent of human opisthorchiasis in Thailand, and long-lasting infection with the parasite has been correlated with the development of cholangiocarcinoma. In this work we have molecularly characterized the first member of a protein family carrying two DM9 repeats in this parasite (OvDM9-1). InterPro and other protein family databases describe the DM9 repeat as a protein domain of unknown function that was first noted in Drosophila melanogaster. Two paralogous proteins have been partially characterized in the genus Fasciola: Fasciola hepatica TP16.5, a novel tegumental antigen in human fascioliasis, and, more recently, F. gigantica DM9-1, a parenchymal protein with structural similarity to nematode cytoplasmic motility protein (MFP2). In this study, we show further evidence that this family of trematode proteins is related to MFP2 in sequence and structure. Soluble recombinant OvDM9-1 was used for structural analyses and for production of specific antisera. The native protein was detected in soluble and insoluble crude worm extracts and, in the latter, in what appeared to be various oligomeric forms. The potential for oligomerization was supported by cross-linking experiments with recombinant OvDM9-1. Structure prediction suggested a β-rich secondary structure of the protein, and this was supported by a circular dichroism analysis. Molecular modeling in Phyre2 identified both MFP2 domains as distant homologs of OvDM9-1. The protein was located in tegumental type tissue and the cecal epithelium in the mature parasite. Recombinant OvDM9-1 was used as the target in indirect ELISA, but sera from infected hamsters showed only marginal reactivity towards it. It is proposed that OvDM9-1 and other members of this protein family have a role in cellular transport through functions on the cytoskeleton. Copyright © 2016 Elsevier B.V. All rights reserved.
Albrecht, Steffen; Bogdanovic, Nenad; Ghetti, Bernardino; Winblad, Bengt; LeBlanc, Andréa C.
2010-01-01
We previously demonstrated the activation of Caspase-6 in the hippocampus and cortex in cases of mild, moderate, severe and very severe Alzheimer disease (AD). To determine whether Caspase-6 is also activated in familial AD, we performed an immunohistochemical analysis of active Caspase-6 and Tau cleaved by Caspase-6 in temporal cortex and hippocampal tissue sections from cases of familial AD. The cases included 5 carrying the amyloid precursor protein K670N, M671L Swedish mutation, 1 carrying the amyloid precursor protein E693G Arctic mutation, 2 each carrying the Presenilin I M146V, F105L, A431E, V261F, Y115C mutations, and 1 with the Presenilin II N141I mutation. Active Caspase-6 immunoreactivity was found in all cases. Caspase-6 immunoreactivity was observed in neuritic plaques or cotton wool plaques in some cases, neuropil threads and neurofibrillary tangles. These results indicate that Caspase-6 is activated in familial forms of AD, as previously observed in sporadic forms. Since sporadic and familial AD cases have similar pathological features, these results support a fundamental role of Caspase-6 in the pathophysiology of both familial and sporadic AD. PMID:19915487
Large-scale structure prediction by improved contact predictions and model quality assessment.
Michel, Mirco; Menéndez Hurtado, David; Uziela, Karolis; Elofsson, Arne
2017-07-15
Accurate contact predictions can be used for predicting the structure of proteins. Until recently these methods were limited to very big protein families, decreasing their utility. However, recent progress by combining direct coupling analysis with machine learning methods has made it possible to predict accurate contact maps for smaller families. To what extent these predictions can be used to produce accurate models of the families is not known. We present the PconsFold2 pipeline that uses contact predictions from PconsC3, the CONFOLD folding algorithm and model quality estimations to predict the structure of a protein. We show that the model quality estimation significantly increases the number of models that reliably can be identified. Finally, we apply PconsFold2 to 6379 Pfam families of unknown structure and find that PconsFold2 can, with an estimated 90% specificity, predict the structure of up to 558 Pfam families of unknown structure. Out of these, 415 have not been reported before. Datasets as well as models of all the 558 Pfam families are available at http://c3.pcons.net/ . All programs used here are freely available. arne@bioinfo.se. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Iida, Aya; Ohnishi, Yasuo; Horinouchi, Sueharu
2008-07-01
Via N-acylhomoserine lactones, the GinI/GinR quorum-sensing system in Gluconacetobacter intermedius NCI1051, a gram-negative acetic acid bacterium, represses acetic acid and gluconic acid fermentation. Two-dimensional polyacrylamide gel electrophoretic analysis of protein profiles of strain NCI1051 and ginI and ginR mutants identified a protein that was produced in response to the GinI/GinR regulatory system. Cloning and nucleotide sequencing of the gene encoding this protein revealed that it encoded an OmpA family protein, named GmpA. gmpA was a member of the gene cluster containing three adjacent homologous genes, gmpA to gmpC, the organization of which appeared to be unique to vinegar producers, including "Gluconacetobacter polyoxogenes." In addition, GmpA was unique among the OmpA family proteins in that its N-terminal membrane domain forming eight antiparallel transmembrane beta-strands contained an extra sequence in one of the surface-exposed loops. Transcriptional analysis showed that only gmpA of the three adjacent gmp genes was activated by the GinI/GinR quorum-sensing system. However, gmpA was not controlled directly by GinR but was controlled by an 89-amino-acid protein, GinA, a target of this quorum-sensing system. A gmpA mutant grew more rapidly in the presence of 2% (vol/vol) ethanol and accumulated acetic acid and gluconic acid in greater final yields than strain NCI1051. Thus, GmpA plays a role in repressing oxidative fermentation, including acetic acid fermentation, which is unique to acetic acid bacteria and allows ATP synthesis via ethanol oxidation. Consistent with the involvement of gmpA in oxidative fermentation, its transcription was also enhanced by ethanol and acetic acid.
Zhang, Min; Wei, Zhiyi; Chang, Shaojie; Teng, Maikun; Gong, Weimin
2006-04-21
A 31 kDa cysteine protease, SPE31, was isolated from the seeds of a legume plant, Pachyrhizus erosus. The protein was purified, crystallized and the 3D structure solved using molecular replacement. The cDNA was obtained by RT-PCR followed by amplification using mRNA isolated from the seeds of the legume plant as a template. Analysis of the cDNA sequence and the 3D structure indicated that the protein belongs to the papain family. Detailed analysis of the structure revealed an unusual replacement of the conserved catalytic Cys with Gly. Replacement of another conserved residue Ala/Gly by a Phe sterically blocks the access of the substrate to the active site. A polyethyleneglycol molecule and a natural peptide fragment were bound to the surface of the active site. Asn159 was found to be glycosylated. The SPE31 cDNA sequence shares several features with P34, a protein found in soybeans, that is implicated in plant defense mechanisms as an elicitor receptor binding to syringolide. P34 has also been shown to interact with vegetative storage proteins and NADH-dependent hydroxypyruvate reductase. These roles suggest that SPE31 and P34 form a unique subfamily within the papain family. The crystal structure of SPE31 complexed with a natural peptide ligand reveals a unique active site architecture. In addition, the clear evidence of glycosylated Asn159 provides useful information towards understanding the functional mechanism of SPE31/P34.
Russo, Roberta; Chiaramonte, Marco; Matranga, Valeria; Arizza, Vincenzo
2015-08-01
The innate immune response involves proteins such as the membrane receptors of the Toll-like family (TLRs), which trigger different intracellular signalling pathways that are dependent on specific stimulating molecules. In sea urchins, TLR proteins are encoded by members of a large multigenic family composed of 60-250 genes in different species. Here, we report a newly identified mRNA sequence encoding a TLR protein (referred to as Pl-Tlr) isolated from Paracentrotus lividus immune cells. The partial protein sequence contained the conserved Toll/IL-1 receptor (TIR) domain, the transmembrane domain and part of the leucine repeats. Phylogenetic analysis of the Pl-Tlr protein was accomplished by comparing its sequence with those of TLRs from different classes of vertebrates and invertebrates. This analysis was suggestive of an evolutionary path that most likely represented the course of millions of years, starting from simple organisms and extending to humans. Challenge of the sea urchin immune system with poly-I:C, a chemical compound that mimics dsRNA, caused time-dependent Pl-Tlr mRNA up-regulation that was detected by QPCR. In contrast, bacterial LPS injury did not affect Pl-Tlr transcription. The study of the Tlr genes in the sea urchin model system may provide new perspectives on the role of Tlrs in the invertebrate immune response and clues concerning their evolution in a changing world. Copyright © 2015 Elsevier Ltd. All rights reserved.
Srinivasan, Jagan; Dillman, Adler R.; Macchietto, Marissa G.; Heikkinen, Liisa; Lakso, Merja; Fracchia, Kelley M.; Antoshechkin, Igor; Mortazavi, Ali; Wong, Garry; Sternberg, Paul W.
2013-01-01
Nematodes compose an abundant and diverse invertebrate phylum with members inhabiting nearly every ecological niche. Panagrellus redivivus (the “microworm”) is a free-living nematode frequently used to understand the evolution of developmental and behavioral processes given its phylogenetic distance to Caenorhabditis elegans. Here we report the de novo sequencing of the genome, transcriptome, and small RNAs of P. redivivus. Using a combination of automated gene finders and RNA-seq data, we predict 24,249 genes and 32,676 transcripts. Small RNA analysis revealed 248 microRNA (miRNA) hairpins, of which 63 had orthologs in other species. Fourteen miRNA clusters containing 42 miRNA precursors were found. The RNA interference, dauer development, and programmed cell death pathways are largely conserved. Analysis of protein family domain abundance revealed that P. redivivus has experienced a striking expansion of BTB domain-containing proteins and an unprecedented expansion of the cullin scaffold family of proteins involved in multi-subunit ubiquitin ligases, suggesting proteolytic plasticity and/or tighter regulation of protein turnover. The eukaryotic release factor protein family has also been dramatically expanded and suggests an ongoing evolutionary arms race with viruses and transposons. The P. redivivus genome provides a resource to advance our understanding of nematode evolution and biology and to further elucidate the genomic architecture leading to free-living lineages, taking advantage of the many fascinating features of this worm revealed by comparative studies. PMID:23410827
Effect of Wnt3a on Keratinocytes Utilizing in Vitro and Bioinformatics Analysis
Nam, Ju-Suk; Chakraborty, Chiranjib; Sharma, Ashish Ranjan; Her, Young; Bae, Kee-Jeong; Sharma, Garima; Doss, George Priya; Lee, Sang-Soo; Hong, Myung-Sun; Song, Dong-Keun
2014-01-01
Wingless-type (Wnt) signaling proteins participate in various cell developmental processes. A suppressive role of Wnt5a on keratinocyte growth has already been observed. However, the role of other Wnt proteins in proliferation and differentiation of keratinocytes remains unknown. Here, we investigated the effects of the Wnt ligand, Wnt3a, on proliferation and differentiation of keratinocytes. Keratinocytes from normal human skin were cultured and treated with recombinant Wnt3a alone or in combination with the inflammatory cytokine, tumor necrosis factor α (TNFα). Furthermore, using bioinformatics, we analyzed the biochemical parameters, molecular evolution, and protein–protein interaction network for the Wnt family. Application of recombinant Wnt3a showed an anti-proliferative effect on keratinocytes in a dose-dependent manner. After treatment with TNFα, Wnt3a still demonstrated an anti-proliferative effect on human keratinocytes. Exogenous treatment of Wnt3a was unable to alter mRNA expression of differentiation markers of keratinocytes, whereas an altered expression was observed in TNFα-stimulated keratinocytes. In silico phylogenetic, biochemical, and protein–protein interaction analysis showed several close relationships among the family members of the Wnt family. Moreover, a close phylogenetic and biochemical similarity was observed between Wnt3a and Wnt5a. Finally, we proposed a hypothetical mechanism to illustrate how the Wnt3a protein may inhibit the process of proliferation in keratinocytes, which would be useful for future researchers. PMID:24686518
Carrascal, Montserrat; Gay, Marina; Ovelleiro, David; Casas, Vanessa; Gelpí, Emilio; Abian, Joaquin
2010-02-05
Major plasma protein families play different roles in blood physiology and hemostasis and in immunodefense. Other proteins in plasma can be involved in signaling as chemical messengers or constitute biological markers of the status of distant tissues. In this respect, the plasma phosphoproteome holds potentially relevant information on the mechanisms modulating these processes through the regulation of protein activity. In this work we describe for the first time a collection of phosphopeptides identified in human plasma using immunoaffinity separation of the seven major serum protein families from other plasma proteins, SCX fractionation, and TiO2 purification prior to LC-MS/MS analysis. One hundred and twenty-seven phosphosites in 138 phosphopeptides mapping 70 phosphoproteins were identified with FDR < 1%. A high-confidence collection of phosphosites was obtained using a combined search with the OMSSA, SEQUEST, and Phenyx search engines.
Database of amino acid-nucleotide contacts in DNA-homeodomain protein
NASA Astrophysics Data System (ADS)
Grokhlina, T. I.; Zrelov, P. V.; Ivanov, V. V.; Polozov, R. V.; Chirgadze, Yu. N.; Sivozhelezov, V. S.
2013-09-01
The analysis of amino acid-nucleotide contacts in interfaces of the protein-DNA complexes, intended to find consistencies in the protein-DNA recognition, is a complex problem that requires an analysis of the physicochemical characteristics of these contacts and the positions of the participating amino acids and nucleotides in the chains of the protein and the DNA, respectively, as well as the conservation of these contacts. These heterogeneous data therefore need to be systematized. For this purpose we have developed a database of amino acid-nucleotide contacts, ANTPC (Amino acid Nucleotide Type Position Conservation), following the archetypal example of the proteins in the homeodomain family. We show that it can be used to compare and classify the interfaces of the protein-DNA complexes.
Analysis of sDMA modifications of PIWI proteins
Honda, Shozo; Kirino, Yoriko; Kirino, Yohei
2015-01-01
Arginine methylation is an important post-translational protein modification that modulates protein function for a wide range of biological processes. PIWI proteins, a subclade of the Argonaute family proteins, contain evolutionarily conserved symmetrical dimethylarginines (sDMAs). It has become increasingly apparent that the sDMAs of PIWI proteins serve as binding elements for TUDOR-domain containing proteins and that sDMA-dependent protein interactions play crucial roles in the biogenesis and function of PIWI-interacting RNAs (piRNAs). We describe a method for detecting PIWI sDMAs and purifying PIWI/piRNA complexes using anti-sDMA antibodies. PMID:24178562
NASA Astrophysics Data System (ADS)
Arce, DP; Krsticevic, FJ; Ezpeleta, J.; Ponce, SD; Pratta, GR; Tapia, E.
2016-04-01
The small heat shock proteins (sHSPs) have been found to play a critical role under physiological stress conditions by protecting proteins from irreversible aggregation. To characterize the gene expression profile of four sHsps with a tandem gene structure arrangement in the domesticated Solanum lycopersicum (Heinz 1706) genome and its close wild relative Solanum pimpinellifolium (LA1589), differential gene expression analysis using RNA-Seq was conducted at three ripening stages in the fruits of both cultivars. Gene promoter analysis was performed to explain the heterogeneous pattern of gene expression found for these tandem duplicated sHsps. The in silico results help to refocus wet-lab analysis of tomato sHsp family proteins.
Gao, Feng; Song, Weibo; Katz, Laura A.
2014-01-01
In most lineages, diversity among gene family members results from gene duplication followed by sequence divergence. Because of the genome rearrangements during the development of somatic nuclei, gene family evolution in ciliates involves more complex processes. Previous work on the ciliate Chilodonella uncinata revealed that macronuclear β-tubulin gene family members are generated by alternative processing, in which germline regions are alternatively used in multiple macronuclear chromosomes. To further study genome evolution in this ciliate, we analyzed its transcriptome and found that: 1) alternative processing is extensive among gene families; and 2) such gene families are likely to be C. uncinata-specific. We characterized additional macronuclear and micronuclear copies of one candidate alternatively processed gene family -- a protein kinase domain containing protein (PKc) -- from two C. uncinata strains. Analysis of the PKc sequences reveals: 1) multiple PKc gene family members in the macronucleus share some identical regions flanked by divergent regions; and 2) the shared identical regions are processed from a single micronuclear chromosome. We discuss analogous processes in lineages across the eukaryotic tree of life to provide further insights on the impact of genome structure on gene family evolution in eukaryotes. PMID:24749903
LC3/GABARAP family proteins: autophagy-(un)related functions.
Schaaf, Marco B E; Keulers, Tom G; Vooijs, Marc A; Rouschop, Kasper M A
2016-12-01
From yeast to mammals, autophagy is an important mechanism for sustaining cellular homeostasis through facilitating the degradation and recycling of aged and cytotoxic components. During autophagy, cargo is captured in double-membraned vesicles, the autophagosomes, and degraded through lysosomal fusion. In yeast, autophagy initiation, cargo recognition, cargo engulfment, and vesicle closure are Atg8 dependent. In higher eukaryotes, Atg8 has evolved into the LC3/GABARAP protein family, consisting of 7 family proteins [LC3A (2 splice variants), LC3B, LC3C, GABARAP, GABARAPL1, and GABARAPL2]. LC3B, the most studied family protein, is associated with autophagosome development and maturation and is used to monitor autophagic activity. Given the high homology, the other LC3/GABARAP family proteins are often presumed to fulfill similar functions. Nevertheless, substantial evidence shows that the LC3/GABARAP family proteins are unique in function and important in autophagy-independent mechanisms. In this review, we discuss the current knowledge and functions of the LC3/GABARAP family proteins. We focus on processing of the individual family proteins and their role in autophagy initiation, cargo recognition, vesicle closure, and trafficking, a complex and tightly regulated process that requires selective presentation and recruitment of these family proteins. In addition, functions unrelated to autophagy of the LC3/GABARAP protein family members are discussed. © FASEB.
The Popeye Domain Containing Genes and Their Function in Striated Muscle
Schindler, Roland F. R.; Scotton, Chiara; French, Vanessa; Ferlini, Alessandra; Brand, Thomas
2016-01-01
The Popeye domain containing (POPDC) genes encode a novel class of cAMP effector proteins, which are abundantly expressed in heart and skeletal muscle. Here, we will review their role in striated muscle as deduced from work in cell and animal models and the recent analysis of patients carrying a missense mutation in POPDC1. Evidence suggests that POPDC proteins control membrane trafficking of interacting proteins. Furthermore, we will discuss the current catalogue of established protein-protein interactions. In recent years, the number of POPDC-interacting proteins has been rising and currently includes ion channels (TREK-1), sarcolemma-associated proteins serving functions in mechanical stability (dystrophin), compartmentalization (caveolin 3), scaffolding (ZO-1), trafficking (NDRG4, VAMP2/3) and repair (dysferlin) or acting as a guanine nucleotide exchange factor for Rho-family GTPases (GEFT). Recent evidence suggests that POPDC proteins might also control the cellular level of the nuclear proto-oncoprotein c-Myc. These data suggest that this family of cAMP-binding proteins probably serves multiple roles in striated muscle. PMID:27347491
MacKellar, Drew C; Vaughan, Ashley M; Aly, Ahmed S I; DeLeon, Sasha; Kappe, Stefan H I
2011-11-01
The early transcribed membrane proteins (ETRAMPs) are a family of small, highly charged transmembrane proteins unique to malaria parasites. Some members of the ETRAMP family have been localized to the parasitophorous vacuole membrane that separates the intracellular parasite from the host cell and thus presumably have a role in host-parasite interactions. Although it was previously shown that two ETRAMPs are critical for rodent malaria parasite liver-stage development, the importance of most ETRAMPs during the parasite life cycle remains unknown. Here, we comprehensively identify nine new etramps in the genome of the rodent malaria parasite Plasmodium yoelii, and elucidate their conservation in other malaria parasites. etramp expression profiles are diverse throughout the parasite life cycle as measured by RT-PCR. Epitope tagging of two ETRAMPs demonstrates protein expression in blood and liver stages, and reveals differences in both their timing of expression and their subcellular localization. Gene targeting studies of each of the nine uncharacterized etramps show that two are refractory to deletion and thus likely essential for blood-stage replication. Seven etramps are not essential for any life cycle stage. Systematic characterization of the members of the ETRAMP family reveals the diversity in importance of each family member at the interface between host and parasite throughout the developmental cycle of the malaria parasite. © 2011 Blackwell Publishing Ltd.
Gomi, Hiroshi; Kubota-Murata, Chisato; Yasui, Tadashi; Tsukise, Azuma; Torii, Seiji
2013-02-01
Islet-associated protein-2 (IA-2) and IA-2β (also known as phogrin) are unique neuroendocrine-specific protein tyrosine phosphatases (PTPs). The IA-2 family of PTPs was originally identified from insulinoma cells and discovered to be major autoantigens in type 1 diabetes. Despite its expression in the neural and canonical endocrine tissues, data on expression of the IA-2 family of PTPs in gastrointestinal endocrine cells (GECs) are limited. Therefore, we immunohistochemically investigated the expression of the IA-2 family of PTPs in the rat gastrointestinal tract. In the stomach, IA-2 and IA-2β were expressed in GECs that secrete serotonin, somatostatin, and cholecystokinin/gastrin-1. In addition to these hormones, secretin, gastric inhibitory polypeptide (also known as the glucose-dependent insulinotropic peptide), glucagon-like peptide-1, and glucagon, but not ghrelin were coexpressed with IA-2 or IA-2β in duodenal GECs. Pancreatic islet cells that secrete gut hormones expressed the IA-2 family of PTPs. The expression patterns of IA-2 and IA-2β were comparable. These results reveal that the IA-2 family of PTPs is expressed in a cell type-specific manner in rat GECs. The extensive expression of the IA-2 family of PTPs in pancreo-gastrointestinal endocrine cells and in the enteric plexus suggests their systemic contribution to nutritional control through a neuroendocrine signaling network.
Evolutionary and Expression Analyses of the Apple Basic Leucine Zipper Transcription Factor Family
Zhao, Jiao; Guo, Rongrong; Guo, Chunlei; Hou, Hongmin; Wang, Xiping; Gao, Hua
2016-01-01
Transcription factors (TFs) play essential roles in the regulatory networks controlling many developmental processes in plants. Members of the basic leucine (Leu) zipper (bZIP) TF family, which is unique to eukaryotes, are involved in regulating diverse processes, including flower and vascular development, seed maturation, stress signaling, and defense responses to pathogens. The bZIP proteins have a characteristic bZIP domain composed of a DNA-binding basic region and a Leu zipper dimerization region. In this study, we identified 112 apple (Malus domestica Borkh) bZIP TF-encoding genes, termed MdbZIP genes. Synteny analysis indicated that segmental and tandem duplication events, as well as whole genome duplication, have contributed to the expansion of the apple bZIP family. The family could be divided into 11 groups based on structural features of the encoded proteins, as well as on the phylogenetic relationship of the apple bZIP proteins to those of the model plant Arabidopsis thaliana (AtbZIP genes). Synteny analysis revealed that several paired MdbZIP genes and AtbZIP gene homologs were located in syntenic genomic regions. Furthermore, expression analyses of group A MdbZIP genes showed distinct expression levels in 10 different organs. Moreover, changes in these expression profiles in response to abiotic stress conditions and various hormone treatments identified MdbZIP genes that were responsive to high salinity and drought, as well as to different phytohormones. PMID:27066030
Genome-wide analysis of the WRKY gene family in physic nut (Jatropha curcas L.).
Xiong, Wangdan; Xu, Xueqin; Zhang, Lin; Wu, Pingzhi; Chen, Yaping; Li, Meiru; Jiang, Huawu; Wu, Guojiang
2013-07-25
The WRKY proteins, which contain highly conserved WRKYGQK amino acid sequences and zinc-finger-like motifs, constitute a large family of transcription factors in plants. They participate in diverse physiological and developmental processes. WRKY genes have been identified and characterized in a number of plant species. We identified a total of 58 WRKY genes (JcWRKY) in the genome of the physic nut (Jatropha curcas L.). On the basis of their conserved WRKY domain sequences, all of the JcWRKY proteins could be assigned to one of the previously defined groups, I-III. Phylogenetic analysis of JcWRKY genes with Arabidopsis and rice WRKY genes, and separately with castor bean WRKY genes, revealed no evidence of recent gene duplication in the JcWRKY gene family. The transcript abundance of JcWRKY gene products was analyzed in different tissues under normal growth conditions. In addition, 47 WRKY genes responded to at least one abiotic stress (drought, salinity, phosphate starvation and nitrogen starvation) in individual tissues (leaf, root and/or shoot cortex). Our study provides a useful reference data set as the basis for cloning and functional analysis of physic nut WRKY genes. Copyright © 2013 Elsevier B.V. All rights reserved.
Alvares, K; Carrillo, A; Yuan, P M; Kawano, H; Morimoto, R I; Reddy, J K
1990-01-01
Clofibrate and many of its structural analogues induce proliferation of peroxisomes in the hepatic parenchymal cells of rodents and certain nonrodent species including primates. This induction is tissue specific, occurring mainly in the liver parenchymal cells and to a lesser extent in the kidney cortical epithelium. The induction of peroxisomes is associated with a predictable pleiotropic response, characterized by hepatomegaly, and increased activities and mRNA levels of certain peroxisomal enzymes. Using affinity chromatography, we had previously isolated a protein that binds to clofibric acid. We now show that this protein is homologous with the heat shock protein HSP70 family by analysis of amino acid sequences of isolated peptides from trypsin-treated clofibric acid binding protein and by cross-reactivity with a monoclonal antibody raised against the conserved region of the 70-kDa heat shock proteins. The clofibric acid-Sepharose column could bind HSP70 proteins isolated from various species, which could then be eluted with either clofibric acid or ATP. Conversely, when a rat liver cytosol containing multiple members of the HSP70 family was passed through an ATP-agarose column, and eluted with clofibric acid, only P72 (HSC70) was eluted. These results suggest that clofibric acid, a peroxisome proliferator, preferentially interacts with P72 at or near the ATP binding site. PMID:2371272
NASA Astrophysics Data System (ADS)
Wang, Yu; Guo, Yanzhi; Kuang, Qifan; Pu, Xuemei; Ji, Yue; Zhang, Zhihang; Li, Menglong
2015-04-01
The assessment of binding affinity between ligands and the target proteins plays an essential role in the drug discovery and design process. As an alternative to widely used scoring approaches, machine learning methods have also been proposed for fast prediction of the binding affinity with promising results, but most of them were developed as all-purpose models despite the specific functions of different protein families, even though proteins from different functional families always have different structures and physicochemical features. In this study, we proposed a random forest method to predict the protein-ligand binding affinity based on a comprehensive feature set covering protein sequence, binding pocket, ligand structure and intermolecular interaction. Feature processing and compression were implemented separately for the different protein family datasets; the results indicate that different features contribute to different models, so an individual representation for each protein family is necessary. Three family-specific models were constructed for three important protein target families: HIV-1 protease, trypsin and carbonic anhydrase. As a comparison, two generic models including diverse protein families were also built. The evaluation results show that models trained on family-specific datasets outperform those trained on the generic datasets, and the Pearson and Spearman correlation coefficients (Rp and Rs) on the test sets are 0.740, 0.874, 0.735 and 0.697, 0.853, 0.723 for HIV-1 protease, trypsin and carbonic anhydrase respectively. Comparisons with the other methods further demonstrate that individual representation and model construction for each protein family is a more reasonable way of predicting the affinity of one particular protein family.
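The two evaluation metrics named in the abstract above, the Pearson (Rp) and Spearman (Rs) correlation coefficients, can be illustrated with a minimal sketch. This is not the authors' code: the function names and the affinity values in the usage note are hypothetical, and only the Python standard library is assumed.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient (Rp) between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks of the values; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation (Rs): Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))
```

Given hypothetical lists of measured and predicted affinities for one family's test set, for example `measured = [5.2, 6.8, 7.1, 8.4]` and `predicted = [5.5, 6.1, 7.4, 8.0]`, calling `pearson(measured, predicted)` and `spearman(measured, predicted)` would yield values comparable in kind to the Rp and Rs figures reported above. Rs depends only on rank order, so it is less sensitive to outliers than Rp.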
A new family of β-helix proteins with similarities to the polysaccharide lyases
Close, Devin W.; D'Angelo, Sara; Bradbury, Andrew R. M.
2014-09-27
Microorganisms that degrade biomass produce diverse assortments of carbohydrate-active enzymes and binding modules. Despite tremendous advances in the genomic sequencing of these organisms, many genes do not have an ascribed function owing to low sequence identity to genes that have been annotated. Consequently, biochemical and structural characterization of genes with unknown function is required to complement the rapidly growing pool of genomic sequencing data. A protein with previously unknown function (Cthe_2159) was recently isolated in a genome-wide screen using phage display to identify cellulose-binding protein domains from the biomass-degrading bacterium Clostridium thermocellum. Here, the crystal structure of Cthe_2159 is presented and it is shown that it is a unique right-handed parallel β-helix protein. Despite very low sequence identity to known β-helix or carbohydrate-active proteins, Cthe_2159 displays structural features that are very similar to those of polysaccharide lyase (PL) families 1, 3, 6 and 9. Cthe_2159 is conserved across bacteria and some archaea and is a member of the domain of unknown function family DUF4353. This suggests that Cthe_2159 is the first representative of a previously unknown family of cellulose and/or acid-sugar binding β-helix proteins that share structural similarities with PLs. More importantly, these results demonstrate how functional annotation by biochemical and structural analysis remains a critical tool in the characterization of new gene products.
A new family of β-helix proteins with similarities to the polysaccharide lyases
DOE Office of Scientific and Technical Information (OSTI.GOV)
Close, Devin W.; D'Angelo, Sara; Bradbury, Andrew R. M.
Microorganisms that degrade biomass produce diverse assortments of carbohydrate-active enzymes and binding modules. Despite tremendous advances in the genomic sequencing of these organisms, many genes do not have an ascribed function owing to low sequence identity to genes that have been annotated. Consequently, biochemical and structural characterization of genes with unknown function is required to complement the rapidly growing pool of genomic sequencing data. A protein with previously unknown function (Cthe_2159) was recently isolated in a genome-wide screen using phage display to identify cellulose-binding protein domains from the biomass-degrading bacterium Clostridium thermocellum. Here, the crystal structure of Cthe_2159 is presentedmore » and it is shown that it is a unique right-handed parallel β-helix protein. Despite very low sequence identity to known β-helix or carbohydrate-active proteins, Cthe_2159 displays structural features that are very similar to those of polysaccharide lyase (PL) families 1, 3, 6 and 9. Cthe_2159 is conserved across bacteria and some archaea and is a member of the domain of unknown function family DUF4353. This suggests that Cthe_2159 is the first representative of a previously unknown family of cellulose and/or acid-sugar binding β-helix proteins that share structural similarities with PLs. More importantly, these results demonstrate how functional annotation by biochemical and structural analysis remains a critical tool in the characterization of new gene products.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Simmons, Christopher W.; Reddy, Amitha P.; D’haeseleer, Patrik; ...
2014-12-31
New lignocellulolytic enzymes are needed that maintain optimal activity under the harsh conditions present during industrial enzymatic deconstruction of biomass, including high temperatures, the absence of free water, and the presence of inhibitors from the biomass. Enriching lignocellulolytic microbial communities under these conditions provides a source of microorganisms that may yield robust lignocellulolytic enzymes tolerant to the extreme conditions needed to improve the throughput and efficiency of biomass enzymatic deconstruction. Identification of promising enzymes from these systems is challenging due to complex substrate-enzyme interactions and requirements to assay for activity. In this study, metatranscriptomes from compost-derived microbial communities enriched on rice straw under thermophilic and mesophilic conditions were sequenced and analyzed to identify lignocellulolytic enzymes overexpressed under thermophilic conditions. To determine differential gene expression across mesophilic and thermophilic treatments, a method was developed which pooled gene expression by functional category, as indicated by Pfam annotations, since microbial communities performing similar tasks are likely to have overlapping functions even if they share no specific genes. Differential expression analysis identified enzymes from glycoside hydrolase family 48, carbohydrate binding module family 2, and carbohydrate binding module family 33 domains as significantly overexpressed in the thermophilic community. Overexpression of these protein families in the thermophilic community resulted from expression of a small number of genes not currently represented in any protein database. Genes in overexpressed protein families were predominantly expressed by a single Actinobacteria genus, Micromonospora.
In conclusion, coupling measurements of deconstructive activity with comparative analyses to identify overexpressed enzymes in lignocellulolytic communities provides a targeted approach for discovery of candidate enzymes for more efficient biomass deconstruction. Furthermore, glycoside hydrolase family 48 cellulases and carbohydrate binding module family 33 polysaccharide monooxygenases with carbohydrate binding module family 2 domains may improve saccharification of lignocellulosic biomass under high-temperature and low-moisture conditions relevant to industrial biofuel production.
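The idea of pooling gene expression by functional category before comparing treatments can be sketched in a few lines of Python. This is an illustrative sketch only: the abstract does not give the statistical procedure, so the log2 ratio with a pseudocount below is an assumption, as are the function names, the gene identifiers, and the rule that a multi-domain gene contributes its full count to each annotated family.

```python
from collections import defaultdict
from math import log2


def pool_by_family(gene_counts, gene_to_families):
    """Sum per-gene expression counts into per-family totals.

    gene_counts: {gene_id: count} for one treatment.
    gene_to_families: {gene_id: [family labels]} from domain annotation.
    Genes without any annotation are skipped (assumption).
    """
    pooled = defaultdict(float)
    for gene, count in gene_counts.items():
        for family in gene_to_families.get(gene, ()):
            pooled[family] += count
    return dict(pooled)


def log2_fold_change(treatment, control, pseudocount=1.0):
    """Per-family log2 ratio; the pseudocount avoids division by zero."""
    families = set(treatment) | set(control)
    return {
        family: log2(
            (treatment.get(family, 0.0) + pseudocount)
            / (control.get(family, 0.0) + pseudocount)
        )
        for family in families
    }
```

Because the comparison happens at the family level, two communities that share no genes can still be compared, which is the motivation the abstract gives for pooling. A real analysis would also need normalization for sequencing depth and a significance test, neither of which is shown here.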
Yang, S D; Yu, J S; Lee, T T; Ni, M H; Yang, C C; Ho, Y S; Tsen, T Z
1995-10-01
Computer analysis of protein phosphorylation-site sequences revealed that most transcription factors and viral oncoproteins are prime targets for regulation by proline-directed protein phosphorylation, suggesting an association of the proline-directed protein kinase (PDPK) family with neoplastic transformation and tumorigenesis. In this report, an immunoprecipitate activity assay of protein kinase FA/glycogen synthase kinase-3alpha (kinase FA/GSK-3alpha) (a particular member of the PDPK family) has been optimized for human cervical tissue and used to demonstrate for the first time significantly increased (P < 0.001) activity in poorly differentiated cervical carcinoma (82.8 ± 6.6 U/mg of protein), moderately differentiated carcinoma (36.2 ± 3.4 U/mg of protein), and well-differentiated carcinoma (18.3 ± 2.4 U/mg of protein) from 36 human cervical carcinoma samples when compared to 12 normal controls (4.9 ± 0.6 U/mg of protein). Immunoblotting analysis further revealed that the increased activity of kinase FA/GSK-3alpha in cervical carcinoma is due to overexpression of the kinase protein. Taken together, the results provide initial evidence that overexpression and increased cellular activity of kinase FA/GSK-3alpha may be involved in human cervical carcinoma dedifferentiation/progression, supporting an association of proline-directed protein kinase with neoplastic transformation and tumorigenesis. Since protein kinase FA/GSK-3alpha may function as a possible regulator of transcription factors/proto-oncogenes, the results further suggest that kinase FA/GSK-3alpha may play a potential role in human cervical carcinogenesis, especially in its dedifferentiation and progression.
Kubota, Daiki; Gocho, Kiyoko; Kikuchi, Sachiko; Akeo, Keiichiro; Miura, Masahiro; Yamaki, Kunihiko; Takahashi, Hiroshi; Kameya, Shuhei
2018-05-02
CEP250 encodes the C-Nap1 protein, which belongs to the CEP family of proteins. C-Nap1 has been reported to be expressed in the photoreceptor cilia and is known to interact with other ciliary proteins. Mutations of CEP250 cause atypical Usher syndrome, which is characterized by early-onset sensorineural hearing loss (SNHL) and a relatively mild retinitis pigmentosa. This study tested the hypothesis that the mild cone-rod dystrophy (CRD) and SNHL in a non-consanguineous Japanese family were caused by CEP250 mutations. Detailed ophthalmic and auditory examinations were performed on the proband and her family members. Whole exome sequencing (WES) was performed on DNA obtained from the proband. Electrophysiological analysis revealed a mild CRD in two family members. Adaptive optics (AO) imaging showed reduced cone density around the fovea. Auditory examinations showed a slight SNHL in both patients. WES of the proband identified compound heterozygous variants c.361C>T, p.R121*, and c.562C>T, p.R188* in CEP250. The variants were found to co-segregate with the disease in five members of the family. Both CEP250 variants are null variants, and according to the American College of Medical Genetics and Genomics (ACMG) standards and guidelines they fall into the very strong evidence category (PVS1). Both alleles therefore meet the criteria for pathogenicity. Our data indicate that mutations of CEP250 can cause mild CRD and SNHL in Japanese patients. Because the ophthalmological phenotypes were very mild, high-resolution retinal imaging analysis, such as AO, will be helpful in diagnosing CEP250-associated disease.
Singh, Anupama; Kushwaha, Hemant R.; Soni, Praveen; Gupta, Himanshu; Singla-Pareek, Sneh L.; Pareek, Ashwani
2015-01-01
The two-component system (TCS) is one of the key signal-sensing machineries that enable species to sense environmental stimuli. It essentially comprises three major components: sensory histidine kinase proteins (HKs), histidine phosphotransfer proteins (Hpts), and response regulator proteins (RRs). The members of the TCS family have already been identified in Arabidopsis and rice, but knowledge about their functional involvement during various abiotic stress conditions remains meager. The current study is an attempt to carry out a comprehensive analysis of the expression of TCS members in response to various abiotic stress conditions and in various plant tissues in Arabidopsis and rice using MPSS and publicly available microarray data. The analysis suggests that despite having a nearly similar number of genes, rice expresses a higher number of TCS members during various abiotic stress conditions than Arabidopsis. We found that the TCS machinery is regulated not only by various abiotic stresses but also by tissue specificity. Analysis of the expression of some representative members of the TCS gene family showed their regulation by the diurnal cycle in rice seedlings, thus bringing in another level of transcriptional control. Thus, we report a highly complex and tight regulatory network of TCS members, as influenced by the tissue, abiotic stress signal, and diurnal rhythm. The insights from the comparative expression analysis presented in this study may provide crucial leads toward dissection of the diverse role(s) of the various TCS family members in Arabidopsis and rice. PMID:26442025
Ichinose, Hitomi; Fujimoto, Zui; Honda, Mariko; Harazono, Koichi; Nishimoto, Yukifumi; Uzura, Atsuko; Kaneko, Satoshi
2009-09-11
Arabinogalactan proteins (AGPs) are a family of plant cell surface proteoglycans and are considered to be involved in plant growth and development. Because AGPs are very complex molecules, glycoside hydrolases capable of degrading AGPs are powerful tools for analyses of the AGPs. We previously reported such enzymes from Streptomyces avermitilis. Recently, a beta-l-arabinopyranosidase was purified from the culture supernatant of the bacterium, and its corresponding gene was identified. The primary structure of the protein revealed that the catalytic module was highly similar to that of glycoside hydrolase family 27 (GH27) alpha-d-galactosidases. The recombinant protein was successfully expressed as a secreted 64-kDa protein using a Streptomyces expression system. The specific activity toward p-nitrophenyl-beta-l-arabinopyranoside was 18 micromol of arabinose/min/mg, which was 67 times higher than that toward p-nitrophenyl-alpha-d-galactopyranoside. The enzyme could remove 0.1 and 45% l-arabinose from gum arabic or larch arabinogalactan, respectively. X-ray crystallographic analysis revealed that the protein has a GH27 catalytic domain, an antiparallel beta-domain containing Greek key motifs, another antiparallel beta-domain forming a jellyroll structure, and a carbohydrate-binding module family 13 domain. Comparison of the structure of this protein with that of alpha-d-galactosidase showed a single amino acid substitution (aspartic acid to glutamic acid) in the catalytic pocket of beta-l-arabinopyranosidase, and a space for the hydroxymethyl group on the C-5 carbon of d-galactose bound to alpha-galactosidase was changed in beta-l-arabinopyranosidase. A mutagenesis study revealed that the residue is critical for modulating the enzyme activity. This is the first report in which beta-l-arabinopyranosidase is classified as a new member of the GH27 family.
Evolution of the Max and Mlx networks in animals.
McFerrin, Lisa G; Atchley, William R
2011-01-01
Transcription factors (TFs) are essential for the regulation of gene expression and often form emergent complexes to perform vital roles in cellular processes. In this paper, we focus on the parallel Max and Mlx networks of TFs because of their critical involvement in cell cycle regulation, proliferation, growth, metabolism, and apoptosis. A basic-helix-loop-helix-zipper (bHLHZ) domain mediates the competitive protein dimerization and DNA binding among Max and Mlx network members to form a complex system of cell regulation. To understand the importance of these network interactions, we identified the bHLHZ domain of Max and Mlx network proteins across the animal kingdom and carried out several multivariate statistical analyses. The presence and conservation of Max and Mlx network proteins in animal lineages stemming from the divergence of Metazoa indicate that these networks have ancient and essential functions. Phylogenetic analysis of the bHLHZ domain identified clear relationships among protein families with distinct points of radiation and divergence. Multivariate discriminant analysis further isolated specific amino acid changes within the bHLHZ domain that classify proteins, families, and network configurations. These analyses on Max and Mlx network members provide a model for characterizing the evolution of TFs involved in essential networks.
Gcebe, Nomakorinte; Michel, Anita; Gey van Pittius, Nicolaas C.; Rutten, Victor
2016-01-01
The Esx and PE/PPE families of proteins are among the most immunodominant mycobacterial antigens and have thus been the focus of research to develop vaccines and immunological tests for diagnosis of bovine and human tuberculosis, mainly caused by Mycobacterium bovis and Mycobacterium tuberculosis, respectively. In non-tuberculous mycobacteria (NTM), multiple copies of genes encoding homologous proteins have mainly been identified in pathogenic Mycobacterium species phylogenetically related to Mycobacterium tuberculosis and Mycobacterium bovis. Only ancestral copies of these genes have been identified in nonpathogenic NTM species like Mycobacterium smegmatis, Mycobacterium sp. KMS, Mycobacterium sp. MCS, and Mycobacterium sp. JLS. In this study we elucidated the genomes of four nonpathogenic NTM species, viz. Mycobacterium komanii sp. nov., Mycobacterium malmesburii sp. nov., Mycobacterium nonchromogenicum, and Mycobacterium fortuitum ATCC 6841. These genomes were investigated for genes encoding the Esx and PE/PPE (situated in the esx cluster) families of proteins as well as adjacent genes situated in the ESX-1 to ESX-5 regions. To identify proteins actually expressed, comparative proteomic analyses of purified protein derivatives (PPDs) from three of the NTM as well as Mycobacterium kansasii ATCC 12478 and the commercially available purified protein derivatives from Mycobacterium bovis and Mycobacterium avium were performed. The genomic analysis revealed the occurrence, in each of the four NTM, of orthologs of the genes encoding the Esx, PE and PPE family proteins in M. bovis and M. tuberculosis. The identification of genes of the ESX-1, ESX-3, and ESX-4 regions, including esxA, esxB, ppe68, pe5, and pe35, adds to earlier reports of these genes in nonpathogenic NTM like M. smegmatis, Mycobacterium sp. JLS and Mycobacterium sp. KMS. This report is also the first to identify the esxN gene situated within the ESX-5 locus in M. nonchromogenicum.
Our proteomics analysis identified a total of 609 proteins in the six PPDs, and 22 of these were identified as shared between the PPD of M. bovis and one or more of the NTM PPDs. Previously characterized M. tuberculosis/M. bovis homologous immunogenic proteins detected in one or more of the nonpathogenic NTM in this study included CFP-10 (detected in M. malmesburii sp. nov. PPD), GroES (detected in all NTM PPDs but M. malmesburii sp. nov.), DnaK (detected in all NTM PPDs), and GroEL (detected in all NTM PPDs). This study confirms reports that the ESX-1, ESX-3, and ESX-4 regions are ancestral regions and thus found in the genomes of most mycobacteria. Identification of NTM homologs of immunogenic proteins warrants further investigation of their ability to cause cross-reactive immune responses with MTBC antigens. PMID:27375559
Structure-Based Phylogenetic Analysis of the Lipocalin Superfamily.
Lakshmi, Balasubramanian; Mishra, Madhulika; Srinivasan, Narayanaswamy; Archunan, Govindaraju
2015-01-01
Lipocalins constitute a superfamily of extracellular proteins that are found in all three kingdoms of life. Although very divergent in their sequences and functions, they show remarkable similarity in 3-D structures. Lipocalins bind and transport small hydrophobic molecules. Earlier sequence-based phylogenetic studies of lipocalins highlighted that they have a long evolutionary history. However, the molecular and structural basis of their functional diversity is not completely understood. The main objective of the present study is to understand the functional diversity of the lipocalins using a structure-based phylogenetic approach. The present study, with 39 protein domains from the lipocalin superfamily, suggests that the clusters of lipocalins obtained by structure-based phylogeny correspond well with the functional diversity. The detailed analysis of each of the clusters and sub-clusters reveals that the 39 lipocalin domains cluster according to their mode of ligand binding, even though the clustering was performed on the basis of gross domain structure. The outliers in the phylogenetic tree are often from single-member families. The structure-based phylogenetic approach has also provided pointers for assigning putative functions to the domains of unknown function in the lipocalin family. The approach employed in the present study can be used in the future for the functional identification of new lipocalin proteins and may be extended to other protein families whose members show poor sequence similarity but high structural similarity.
Nishimura, Agnes L.; Mitne-Neto, Miguel; Silva, Helga C. A.; Richieri-Costa, Antônio; Middleton, Susan; Cascio, Duilio; Kok, Fernando; Oliveira, João R. M.; Gillingwater, Tom; Webb, Jeanette; Skehel, Paul; Zatz, Mayana
2004-01-01
Motor neuron diseases (MNDs) are a group of neurodegenerative disorders with involvement of upper and/or lower motor neurons, such as amyotrophic lateral sclerosis (ALS), spinal muscular atrophy (SMA), progressive bulbar palsy, and primary lateral sclerosis. Recently, we have mapped a new locus for an atypical form of ALS/MND (atypical amyotrophic lateral sclerosis [ALS8]) at 20q13.3 in a large white Brazilian family. Here, we report the finding of a novel missense mutation in the vesicle-associated membrane protein/synaptobrevin-associated membrane protein B (VAPB) gene in patients from this family. Subsequently, the same mutation was identified in patients from six additional kindreds but with different clinical courses, such as ALS8, late-onset SMA, and typical severe ALS with rapid progression. Although it was not possible to link all these families, haplotype analysis suggests a founder effect. Members of the vesicle-associated proteins are intracellular membrane proteins that can associate with microtubules and that have been shown to have a function in membrane transport. These data suggest that clinically variable MNDs may be caused by a dysfunction in intracellular membrane trafficking. PMID:15372378
Ho, Vincent K.; Angelotti, Timothy
2013-01-01
Receptor expression enhancing proteins (REEPs) were identified by their ability to enhance cell surface expression of a subset of G protein-coupled receptors (GPCRs), specifically GPCRs that have proven difficult to express in heterologous cell systems. Further analysis revealed that they belong to the Yip (Ypt-interacting protein) family and that some REEP subtypes affect ER structure. Yip family comparisons have established other potential roles for REEPs, including regulation of ER-Golgi transport and processing/neuronal localization of cargo proteins. However, these other potential REEP functions and the mechanism by which they selectively enhance GPCR cell surface expression have not been clarified. By utilizing several REEP family members (REEP1, REEP2, and REEP6) and model GPCRs (α2A and α2C adrenergic receptors), we examined REEP regulation of GPCR plasma membrane expression, intracellular processing, and trafficking. Using a combination of immunolocalization and biochemical methods, we demonstrated that this REEP subset is localized primarily to the ER, but not to plasma membranes. Single cell analysis demonstrated that these REEPs do not specifically enhance surface expression of all GPCRs, but affect ER cargo capacity of specific GPCRs and thus their surface expression. REEP co-expression with α2 adrenergic receptors (ARs) revealed that this REEP subset interacts with and alters glycosidic processing of α2C, but not α2A ARs, demonstrating selective interaction with cargo proteins. Specifically, these REEPs enhanced expression of and interacted with minimally/non-glycosylated forms of α2C ARs. Most importantly, expression of a mutant REEP1 allele (hereditary spastic paraplegia SPG31) lacking the carboxyl terminus led to loss of this interaction. Thus specific REEP isoforms have additional intracellular functions besides altering ER structure, such as enhancing ER cargo capacity, regulating ER-Golgi processing, and interacting with select cargo proteins.
Therefore, some REEPs can be further described as ER membrane shaping adapter proteins. PMID:24098485
Genes encoding calmodulin-binding proteins in the Arabidopsis genome
NASA Technical Reports Server (NTRS)
Reddy, Vaka S.; Ali, Gul S.; Reddy, Anireddy S N.
2002-01-01
Analysis of the recently completed Arabidopsis genome sequence indicates that approximately 31% of the predicted genes could not be assigned to functional categories, as they do not show any sequence similarity with proteins of known function from other organisms. Calmodulin (CaM), a ubiquitous and multifunctional Ca(2+) sensor, interacts with a wide variety of cellular proteins and modulates their activity/function in regulating diverse cellular processes. However, the primary amino acid sequence of the CaM-binding domain in different CaM-binding proteins (CBPs) is not conserved. One way to identify most of the CBPs in the Arabidopsis genome is by protein-protein interaction-based screening of expression libraries with CaM. Here, using a mixture of radiolabeled CaM isoforms from Arabidopsis, we screened several expression libraries prepared from flower meristem, seedlings, or tissues treated with hormones, an elicitor, or a pathogen. Sequence analysis of 77 positive clones that interact with CaM in a Ca(2+)-dependent manner revealed 20 CBPs, including 14 previously unknown CBPs. In addition, by searching the Arabidopsis genome sequence with the newly identified and known plant or animal CBPs, we identified a total of 27 CBPs. Among these, 16 CBPs are represented by families with 2-20 members in each family. Gene expression analysis revealed that CBPs and CBP paralogs are expressed differentially. Our data suggest that Arabidopsis has a large number of CBPs including several plant-specific ones. Although CaM is highly conserved between plants and animals, only a few CBPs are common to both plants and animals. Analysis of Arabidopsis CBPs revealed the presence of a variety of interesting domains. Our analyses identified several hypothetical proteins in the Arabidopsis genome as CaM targets, suggesting their involvement in Ca(2+)-mediated signaling networks.
Bernkopf, Marie; Webersinke, Gerald; Tongsook, Chanakan; Koyani, Chintan N; Rafiq, Muhammad A; Ayaz, Muhammad; Müller, Doris; Enzinger, Christian; Aslam, Muhammad; Naeem, Farooq; Schmidt, Kurt; Gruber, Karl; Speicher, Michael R; Malle, Ernst; Macheroux, Peter; Ayub, Muhammad; Vincent, John B; Windpassinger, Christian; Duba, Hans-Christoph
2014-08-01
We describe the characterization of a gene for mild nonsyndromic autosomal recessive intellectual disability (ID) in two unrelated families, one from Austria, the other from Pakistan. Genome-wide single nucleotide polymorphism microarray analysis enabled us to define a region of homozygosity by descent on chromosome 17q25. Whole-exome sequencing and analysis of this region in an affected individual from the Austrian family identified a 5 bp frameshifting deletion in the METTL23 gene. By means of Sanger sequencing of METTL23, a nonsense mutation was detected in a consanguineous ID family from Pakistan for which homozygosity-by-descent mapping had identified a region on 17q25. Both changes lead to truncation of the putative METTL23 protein, which disrupts the predicted catalytic domain and alters the cellular localization. 3D-modelling of the protein indicates that METTL23 is strongly predicted to function as an S-adenosyl-methionine (SAM)-dependent methyltransferase. Expression analysis of METTL23 indicated a strong association with heat shock proteins, which suggests that these may act as a putative substrate for methylation by METTL23. A number of methyltransferases have been described recently in association with ID. Disruption of METTL23 presented here supports the importance of methylation processes for intact neuronal function and brain development. © The Author 2014. Published by Oxford University Press.
Gao, Chao; Sun, Jianlei; Wang, Chongqi; Dong, Yumei; Xiao, Shouhua; Wang, Xingjun; Jiao, Zigao
2017-01-01
The basic/helix-loop-helix (bHLH) proteins constitute a superfamily of transcription factors that are known to play a range of regulatory roles in eukaryotes. Over the past few decades, many bHLH family genes have been well characterized in model plants, such as Arabidopsis, rice and tomato. However, the bHLH protein family in peanuts has not yet been systematically identified and characterized. Here, 132 and 129 bHLH proteins were identified from the two wild ancestral diploid subgenomes of cultivated tetraploid peanuts, Arachis duranensis (AA) and Arachis ipaensis (BB), respectively. Phylogenetic analysis indicated that these bHLHs could be classified into 19 subfamilies. Distribution mapping results showed that peanut bHLH genes were randomly and unevenly distributed within the 10 AA chromosomes and 10 BB chromosomes. In addition, 120 bHLH gene pairs between the AA-subgenome and BB-subgenome were found to be orthologous, and 101 of these pairs were highly syntenic in AA and BB chromosomes. Furthermore, we confirmed that 184 bHLH genes were expressed in different tissues, 22 of which exhibited tissue-specific expression. Meanwhile, we identified 61 bHLH genes that may be potentially involved in peanut-specific subterranean. Our comprehensive genomic analysis provides a foundation for future functional dissection and understanding of the regulatory mechanisms of bHLH transcription factors in peanuts.
Upregulation of human heme oxygenase gene expression by Ets-family proteins.
Deramaudt, B M; Remy, P; Abraham, N G
1999-03-01
Overexpression of human heme oxygenase-1 has been shown to have the potential to promote EC proliferation and angiogenesis. Since Ets-family proteins have been shown to play an important role in angiogenesis, we investigated the presence of ETS binding sites (EBS), GGAA/T, and ETS proteins contributing to human HO-1 gene expression. Several chloramphenicol acetyltransferase constructs were examined in order to analyze the effect of ETS family proteins on the transduction of HO-1 in Xenopus oocytes and in microvessel endothelial cells. Heme oxygenase promoter activity was up-regulated by FLI-1, ERG, and ETS-1 protein(s). Chloramphenicol acetyltransferase (CAT) assays demonstrated that the promoter region (-1500 to +19) contains positive and negative control elements and that all three members of the ETS protein family were responsible for the up-regulation of HHO-1. Electrophoretic mobility shift assays (EMSA), performed with nuclear extracts from endothelial cells overexpressing the HHO-1 gene and specific HHO-1 oligonucleotide probes containing putative EBS, resulted in a specific and marked bandshift. Synergistic binding was observed in EMSA between AP-1 on the one hand and FLI-1, ERG, and ETS-1 protein on the other. Moreover, 5'-deletion analysis demonstrated the existence of a negative control element of HHO-1 expression located between positions -1500 and -120 on the HHO-1 promoter. The presence of regulatory sequences for transcription factors such as ETS-1, FLI-1, or ERG, whose activity is associated with cell proliferation, endothelial cell differentiation, and matrix metalloproteinase transduction, may be an indication of the important role that HO-1 may play in coronary collateral circulation, tumor growth, angiogenesis, and hemoglobin-induced endothelial cell injuries.
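The search for putative ETS binding sites described above can be illustrated with a small motif scan. This is only a sketch, not the authors' procedure: it assumes the core motif is GGA followed by A or T (the "GGAA/T" of the abstract), scans both strands, and uses a made-up toy sequence rather than the real HO-1 promoter. The function name find_ets_core_sites is hypothetical.

```python
import re

def find_ets_core_sites(seq):
    """Return sorted (position, strand) hits for the GGA(A/T) core motif.

    Positions are 0-based and refer to the leftmost base of the hit on the
    forward strand. A reverse-strand GGA(A/T) reads as (A/T)TCC on the
    forward strand, so both patterns are searched in the same string.
    Lookaheads are used so that overlapping hits are not skipped.
    """
    seq = seq.upper()
    hits = [(m.start(), "+") for m in re.finditer(r"(?=GGA[AT])", seq)]
    hits += [(m.start(), "-") for m in re.finditer(r"(?=[AT]TCC)", seq)]
    return sorted(hits)

# Toy sequence invented for illustration; not taken from the HO-1 promoter.
toy_promoter = "ACGGAAGTTTCCTAGGATC"
for position, strand in find_ets_core_sites(toy_promoter):
    print(position, strand)
```

A four-base core like this occurs frequently by chance, which is why candidate sites found this way still need experimental confirmation such as the EMSA and deletion analyses the abstract reports.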
Liu, Zhao-liang; Luo, Cong; Dong, Long; Van Toan, Can; Wei, Peng-xiao; He, Xin-hua
2014-04-25
The Rab family, the largest branch of Ras small GTPases, plays a crucial role in the vesicular transport in plants. The members of Rab family act as molecular switches that regulate the fusion of vesicles with target membranes through conformational changes. However, little is known about the Rab5 gene involved in fruit ripening and stress response. In this study, the MiRab5 gene was isolated from stress-induced Mangifera indica. The full-length cDNA sequence was 984bp and contained an open reading frame of 600bp, which encoded a 200 amino acid protein with a molecular weight of 21.83kDa and a theoretical isoelectric point of 6.99. The deduced amino acid sequence exhibited high homology with tomato (91% similarity) and contains all five characteristic Rab motifs. Real-time quantitative RT-PCR analysis demonstrated that MiRab5 was ubiquitously expressed in various mango tree tissues at different levels. The expression of MiRab5 was up-regulated during later stages of fruit ripening. Moreover, MiRab5 was generally up-regulated in response to various abiotic stresses (cold, salinity, and PEG treatments). Recombinant MiRab5 protein was successfully expressed and purified. SDS-PAGE and western blot analysis indicated that the expressed protein was recognized by the anti-6-His antibody. These results provide insights into the role of the MiRab5 gene family in fruit ripening and stress responses in the mango plant. Copyright © 2014 Elsevier B.V. All rights reserved.
Mohanta, Tapan Kumar; Kumar, Pradeep; Bae, Hanhong
2017-02-03
The Ca2+ ion is a versatile second messenger that operates in a wide range of cellular processes that impact nearly every aspect of life. Ca2+ regulates gene expression and biotic and abiotic stress responses in organisms ranging from unicellular algae to multicellular higher plants through cascades of calcium signaling processes. In this study, we deciphered the genomic and evolutionary aspects of the calcium signaling events of calmodulin (CaM) and calmodulin-like (CML) proteins. We studied the CaM and CML gene families of 41 different species across the plant lineages. Genomic analysis showed that plants encode more calmodulin-like proteins than calmodulins. Further analyses showed that the majority of CMLs were intronless, while CaMs were intron-rich. Multiple sequence alignment showed that the EF-hand domain of CaM contains four conserved D-x-D motifs, one in each EF-hand, while CMLs contain only one D-x-D-x-D motif, in the fourth EF-hand. Phylogenetic analysis revealed that the CMLs evolved earlier than CaM and later diversified. Gene expression analysis demonstrated that different CaM and CML genes were expressed differentially in different tissues in a spatio-temporal manner. In this study we provide a detailed genome-wide identification and characterization of the CaM and CML protein families, their phylogenetic relationships, and domain structure. Expression studies of CaM and CML genes were conducted in Glycine max and Phaseolus vulgaris. Our study provides a strong foundation for future functional research on the CaM and CML gene families in the plant kingdom.
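The D-x-D and D-x-D-x-D motifs mentioned above can be located in a protein sequence with simple pattern matching. The sketch below assumes "x" means any single residue and uses an invented toy peptide; it is not a real CaM or CML sequence, and the helper name motif_positions is hypothetical.

```python
import re

# "x" is taken to mean any one residue. Lookaheads allow overlapping matches.
DXD = re.compile(r"(?=D.D)")
DXDXD = re.compile(r"(?=D.D.D)")

def motif_positions(pattern, protein):
    """Return the 0-based start positions of every match of pattern."""
    return [m.start() for m in pattern.finditer(protein.upper())]

# Toy peptide invented for illustration only.
toy_peptide = "MKDADGNGTIDTDLDE"
print("D-x-D starts:", motif_positions(DXD, toy_peptide))
print("D-x-D-x-D starts:", motif_positions(DXDXD, toy_peptide))
```

Every D-x-D-x-D hit also contains two overlapping D-x-D hits, so the two counts should be interpreted together rather than independently.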
Oncodomains: A protein domain-centric framework for analyzing rare variants in tumor samples
Peterson, Thomas A.; Park, Junyong
2017-01-01
The fight against cancer is hindered by its highly heterogeneous nature. Genome-wide sequencing studies have shown that individual malignancies contain many mutations that range from those commonly found in tumor genomes to rare somatic variants present only in a small fraction of lesions. Such rare somatic variants dominate the landscape of genomic mutations in cancer, yet efforts to correlate somatic mutations found in one or few individuals with functional roles have been largely unsuccessful. Traditional methods for identifying somatic variants that drive cancer are ‘gene-centric’ in that they consider only somatic variants within a particular gene and make no comparison to other similar genes in the same family that may play a similar role in cancer. In this work, we present oncodomain hotspots, a new ‘domain-centric’ method for identifying clusters of somatic mutations across entire gene families using protein domain models. Our analysis confirms that our approach creates a framework for leveraging structural and functional information encapsulated by protein domains into the analysis of somatic variants in cancer, enabling the assessment of even rare somatic variants by comparison to similar genes. Our results reveal a vast landscape of somatic variants that act at the level of domain families altering pathways known to be involved with cancer such as protein phosphorylation, signaling, gene regulation, and cell metabolism. Due to oncodomain hotspots’ unique ability to assess rare variants, we expect our method to become an important tool for the analysis of sequenced tumor genomes, complementing existing methods. PMID:28426665
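The domain-centric idea above, pooling variants from different genes at equivalent positions of a shared domain, can be sketched as follows. This is a deliberately simplified illustration and not the oncodomain method itself: it assumes an ungapped domain, so that a domain position is just an offset from a per-gene start coordinate, whereas the abstract refers to protein domain models. The gene names, coordinates, variants, and the function domain_position_counts are all hypothetical.

```python
from collections import Counter

def domain_position_counts(variants, domain_starts):
    """Count variants per domain position across genes sharing one domain.

    variants      -- iterable of (gene, protein_position), 1-based positions
    domain_starts -- dict mapping gene to the 1-based start of the domain
    Variants in genes without the domain, or upstream of it, are skipped.
    Simplification: no domain end and no alignment gaps are modelled.
    """
    counts = Counter()
    for gene, position in variants:
        start = domain_starts.get(gene)
        if start is None or position < start:
            continue
        counts[position - start + 1] += 1
    return counts

# Hypothetical toy data: three genes carrying the same domain.
domain_starts = {"GENE_A": 100, "GENE_B": 250, "GENE_C": 40}
variants = [("GENE_A", 112), ("GENE_B", 262), ("GENE_C", 52),
            ("GENE_A", 150), ("GENE_D", 5)]

counts = domain_position_counts(variants, domain_starts)
print(counts.most_common(1))
```

In this toy data each gene carries only one or two variants, yet three of them fall on the same domain position, which is the kind of signal a gene-by-gene count would miss.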
Furukawa, Atsushi; Nakada-Tsukui, Kumiko
2013-01-01
Phagocytosis plays a pivotal role in nutrient acquisition and evasion from the host defense systems in Entamoeba histolytica, the intestinal protozoan parasite that causes amoebiasis. We previously reported that E. histolytica possesses a unique class of a hydrolase receptor family, designated the cysteine protease-binding protein family (CPBF), that is involved in trafficking of hydrolases to lysosomes and phagosomes, and we have also reported that CPBF1 and CPBF8 bind to cysteine proteases or β-hexosaminidase α-subunit and lysozymes, respectively. In this study, we showed by immunoprecipitation that CPBF6, one of the most highly expressed CPBF proteins, specifically binds to α-amylase and γ-amylase. We also found that CPBF6 is localized in lysosomes, based on immunofluorescence imaging. Immunoblot and proteome analyses of the isolated phagosomes showed that CPBF6 mediates transport of amylases to phagosomes. We also demonstrated that the carboxyl-terminal cytosolic region of CPBF6 is engaged in the regulation of the trafficking of CPBF6 to phagosomes. Our proteome analysis of phagosomes also revealed new potential phagosomal proteins. PMID:23509141
Tetreau, Guillaume; Dittmer, Neal T; Cao, Xiaolong; Agrawal, Sinu; Chen, Yun-Ru; Muthukrishnan, Subbaratnam; Haobo, Jiang; Blissard, Gary W; Kanost, Michael R; Wang, Ping
2015-07-01
In insects, chitin is a major structural component of the cuticle and the peritrophic membrane (PM). In nature, chitin is always associated with proteins among which chitin-binding proteins (CBPs) are the most important for forming, maintaining and regulating the functions of these extracellular structures. In this study, a genome-wide search for genes encoding proteins with ChtBD2-type (peritrophin A-type) chitin-binding domains (CBDs) was conducted. A total of 53 genes encoding 56 CBPs were identified, including 15 CPAP1s (cuticular proteins analogous to peritrophins with 1 CBD), 11 CPAP3s (CPAPs with 3 CBDs) and 17 PMPs (PM proteins) with a variable number of CBDs, which are structural components of cuticle or of the PM. CBDs were also identified in enzymes of chitin metabolism including 6 chitinases and 7 chitin deacetylases encoded by 6 and 5 genes, respectively. RNA-seq analysis confirmed that PMP and CPAP genes have differential spatial expression patterns. The expression of PMP genes is midgut-specific, while CPAP genes are widely expressed in different cuticle forming tissues. Phylogenetic analysis of CBDs of proteins in insects belonging to different orders revealed that CPAP1s from different species constitute a separate family with 16 different groups, including 6 new groups identified in this study. The CPAP3s are clustered into a separate family of 7 groups present in all insect orders. Altogether, they reveal that duplication events of CBDs in CPAP1s and CPAP3s occurred prior to the evolutionary radiation of insect species. In contrast to the CPAPs, all CBDs from individual PMPs are generally clustered and distinct from other PMPs in the same species in phylogenetic analyses, indicating that the duplication of CBDs in each of these PMPs occurred after divergence of insect species. 
Phylogenetic analysis of these three CBP families showed that the CBDs in CPAP1s form a clearly separate family, while those found in PMPs and CPAP3s were clustered together in the phylogenetic tree. For chitinases and chitin deacetylases, most of the phylogenetic analyses performed with the CBD sequences resulted in clustering similar to that obtained by using catalytic domain sequences alone, suggesting that CBDs were incorporated into these enzymes and evolved in tandem with the catalytic domains before the diversification of different insect orders. Based on these results, the evolution of CBDs in insect CBPs is discussed to provide a new insight into the CBD sequence structure and diversity, and their evolution and expression in insects. Copyright © 2014 Elsevier Ltd. All rights reserved.
Zainal Abidin, Syafiq Asnawi; Rajadurai, Pathmanathan; Chowdhury, Md Ezharul Hoque; Ahmad Rusmili, Muhamad Rusdi; Othman, Iekhsan; Naidu, Rakesh
2016-10-18
Tropidolaemus wagleri and Cryptelytrops purpureomaculatus are venomous pit viper species commonly found in Malaysia. Tandem mass spectrometry analysis of the crude venoms has detected different proteins in T. wagleri and C. purpureomaculatus. They were classified into 13 venom protein families consisting of enzymatic and nonenzymatic proteins. Enzymatic families detected in T. wagleri and C. purpureomaculatus venom were snake venom metalloproteinase, phospholipase A₂, ʟ-amino acid oxidase, serine proteases, 5'-nucleotidase, phosphodiesterase, and phospholipase B. In addition, glutaminyl cyclotransferase was detected in C. purpureomaculatus. C-type lectin-like proteins were common nonenzymatic components in both species. Waglerin was present and unique to T. wagleri; it was not in C. purpureomaculatus venom. In contrast, cysteine-rich secretory protein, bradykinin-potentiating peptide, and C-type natriuretic peptide were present in C. purpureomaculatus venom. Composition of the venom proteome of T. wagleri and C. purpureomaculatus provides useful information to guide production of effective antivenom and identification of proteins with potential therapeutic applications.
TrkA and TrkC neurotrophin receptor-like proteins in the lizard gut.
Lucini, C; de Girolamo, P; Lamanna, C; Botte, V; Vega, J A; Castaldo, L
2001-03-01
The tyrosine kinase proteins (Trk), encoded by the trk family of proto-oncogenes, mediate, in mammals, the action of neurotrophins, a family of growth factors acting on the development and maintenance of the nervous system. Neurotrophins and their specific receptors, TrkA, TrkB and TrkC, seem to be phylogenetically well preserved but, in reptiles, data regarding the occurrence of Trk-like proteins are very scarce, especially in non-nervous organs. Western blot analysis demonstrated that the lizard gut contains TrkA- and TrkC-like, but not TrkB-like, proteins. Consistently, TrkA- and TrkC-like immunoreactivity were both observed in neurons of the anterior intestine, whereas endocrine cells of the stomach and anterior intestine only displayed TrkA-like immunoreactivity. These results demonstrate for the first time the occurrence of Trk-like proteins in non-neuronal tissues of reptilians and provide further evidence for the evolutionary preservation of the molecular mass and cell distribution of Trk neurotrophin receptor-like proteins in the gut of vertebrates.
Regalado, A P; Ricardo, C P
1996-01-01
Proteins in the intercellular fluid (IF) of healthy Lupinus albus leaves were characterized. Silver staining of the proteins separated by sodium dodecyl sulfate-polyacrylamide gel electrophoresis revealed more than 30 polypeptides, with the major ones having a molecular mass lower than 36 kD. After amino-terminal amino acid sequence analysis, one of the major polypeptides, IF4, was shown to have no identity with any of the proteins present in the data bases. Two others, IF1 and IF3, showed identity with previously reported pathogenesis-related proteins, IF1 with an antifungal protein from Hordeum vulgare that belongs to the thaumatin family (PR-5 family), and IF3 with class III chitinase-lysozymes. IF3 was also present in the IF of stem and root and it represents the major polypeptide in the medium of L. albus cell-suspension cultures. The ubiquitous presence of this enzyme in healthy, nonstressed tissues of L. albus cannot be explained. PMID:8587984
Zainal Abidin, Syafiq Asnawi; Rajadurai, Pathmanathan; Chowdhury, Md Ezharul Hoque; Ahmad Rusmili, Muhamad Rusdi; Othman, Iekhsan; Naidu, Rakesh
2016-01-01
Tropidolaemus wagleri and Cryptelytrops purpureomaculatus are venomous pit viper species commonly found in Malaysia. Tandem mass spectrometry analysis of the crude venoms has detected different proteins in T. wagleri and C. purpureomaculatus. They were classified into 13 venom protein families consisting of enzymatic and nonenzymatic proteins. Enzymatic families detected in T. wagleri and C. purpureomaculatus venom were snake venom metalloproteinase, phospholipase A2, l-amino acid oxidase, serine proteases, 5′-nucleotidase, phosphodiesterase, and phospholipase B. In addition, glutaminyl cyclotransferase was detected in C. purpureomaculatus. C-type lectin-like proteins were common nonenzymatic components in both species. Waglerin was present and unique to T. wagleri—it was not in C. purpureomaculatus venom. In contrast, cysteine-rich secretory protein, bradykinin-potentiating peptide, and C-type natriuretic peptide were present in C. purpureomaculatus venom. Composition of the venom proteome of T. wagleri and C. purpureomaculatus provides useful information to guide production of effective antivenom and identification of proteins with potential therapeutic applications. PMID:27763534
Huang, Xian-De; Zhang, Hua; He, Mao-Xian
2017-12-30
The platelet-derived growth factor/vascular endothelial growth factor (PDGF/VEGF, PVF) family of proteins has been implicated in a wide range of biological functions in vertebrates, including cell proliferation, cell differentiation, cell migration, neural development and especially angiogenesis/vasculogenesis. In this study, a PVF gene, belonging to the PDGF/VEGF family, was cloned and characterized from Pinctada fucata. It contained an ORF of 1110 bp encoding a putative protein of 369 amino acids. The deduced amino acid sequence presented the typical structural features of PDGF family members and the N-terminal signal peptide for secretion. Comparative phylogenetic analysis revealed that PfPVF shows relatively high identity with other invertebrate PVF homologues. Furthermore, gene expression analysis revealed that PfPVF is involved not only in the nucleus grafting operation but also in the response to immune stimulation. The study may help to increase understanding of the functions of molluscan PVF. Copyright © 2017 Elsevier B.V. All rights reserved.
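The relationship between the ORF length and the protein length reported above is simple arithmetic: 1110 bp is 370 codons, and if the last codon is the stop codon, 369 codons remain to encode amino acids. The helper below is a sketch of that calculation; the function name and the includes_stop flag are illustrative assumptions, not part of the study.

```python
def orf_protein_length(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of orf_bp base pairs.

    Assumes the ORF length is a whole number of codons. When the quoted
    length includes the terminal stop codon, that codon is subtracted.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons

# PfPVF figures from the abstract: 1110 bp ORF, 369 amino acids.
print(orf_protein_length(1110))
```

Whether a quoted ORF length counts the stop codon varies between reports, which is why the flag is exposed rather than hard-coded.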
A duplicated PLP gene causing Pelizaeus-Merzbacher disease detected by comparative multiplex PCR
DOE Office of Scientific and Technical Information (OSTI.GOV)
Inoue, K.; Sugiyama, N.; Kawanishi, C.
1996-07-01
Pelizaeus-Merzbacher disease (PMD) is an X-linked dysmyelinating disorder caused by abnormalities in the proteolipid protein (PLP) gene, which is essential for oligodendrocyte differentiation and CNS myelin formation. Although linkage analysis has shown the homogeneity at the PLP locus in patients with PMD, exonic mutations in the PLP gene have been identified in only 10% - 25% of all cases, which suggests the presence of other genetic aberrations, including gene duplication. In this study, we examined five families with PMD not carrying exonic mutations in PLP gene, using comparative multiplex PCR (CM-PCR) as a semiquantitative assay of gene dosage. PLP gene duplications were identified in four families by CM-PCR and confirmed in three families by densitometric RFLP analysis. Because a homologous myelin protein gene, PMP22, is duplicated in the majority of patients with Charcot-Marie-Tooth 1A, PLP gene overdosage may be an important genetic abnormality in PMD and affect myelin formation. 38 ref., 5 figs., 2 tabs.
Cross-species Virus-host Protein-Protein Interactions Inhibiting Innate Immunity
2016-07-01
Distribution A: Approved for public release; distribution is unlimited. The single-stranded negative sense RNA...focused upon members of three negative-sense single-stranded RNA (ssRNA(-)) virus families with known or suspected histories of changes in host-species...however, the N and C-termini are disordered extended strands. In contrast, our covariance analysis mapped hotspots for protein interaction to the
Two dimensional Blue Native-/SDS-PAGE analysis of SLP family adaptor protein complexes.
Swamy, Mahima; Kulathu, Yogesh; Ernst, Sandra; Reth, Michael; Schamel, Wolfgang W A
2006-04-15
SH2 domain containing leukocyte protein (SLP) adaptor proteins serve a central role in the antigen-mediated activation of lymphocytes by organizing multiprotein signaling complexes. Here, we use two dimensional native-/SDS-gel electrophoresis to study the number, size and relative abundance of protein complexes containing SLP family proteins. In non-stimulated T cells all SLP-76 proteins are in an approximately 400 kDa complex with the small adaptor protein Grb2-like adaptor protein downstream of Shc (Gads), whereas half of Gads is monomeric. This constitutive SLP-76/Gads complex could be reconstituted in Drosophila S2 cells expressing both components, suggesting that it might not contain additional subunits. In contrast, in B cells SLP-65 exists in a 180 kDa complex as well as in monomeric form. Since the complex was not found in S2 cells expressing only SLP-65, it was not di/trimeric SLP-65. Upon antigen-stimulation only the complexed SLP-65 was phosphorylated. Surprisingly, stimulation-induced alteration of SLP complexes could not be detected, suggesting that active signaling complexes form only transiently, and are of low abundance.
Litholdo, Celso G.; Parker, Benjamin L.; Eamens, Andrew L.; Larsen, Martin R.; Cordwell, Stuart J.; Waterhouse, Peter M.
2016-01-01
Expression of the F-Box protein Leaf Curling Responsiveness (LCR) is regulated by the microRNA miR394, and alterations to this interplay in Arabidopsis thaliana produce defects in leaf polarity and shoot apical meristem organization. Although the miR394-LCR node has been documented in Arabidopsis, the identification of proteins targeted by the LCR F-box itself has proven problematic. Here, a proteomic analysis of shoot apices from plants with altered LCR levels identified a member of the Major Latex Protein (MLP) gene family as a potential LCR F-box target. Bioinformatic and molecular analyses also suggested that other MLP family members are likely to be targets for this post-translational regulation. Direct interaction between LCR F-Box and MLP423 was validated. Additional MLP members showed a reduction in protein accumulation, to varying degrees, mediated by LCR F-Box. Transgenic Arabidopsis lines, in which MLP28 expression was reduced through an artificial miRNA technology, displayed severe developmental defects, including changes in leaf patterning and morphology, shoot apex defects, and eventual premature death. These phenotypic characteristics resemble those of Arabidopsis plants modified to over-express LCR. Taken together, the results demonstrate that MLPs are driven to degradation by LCR, and indicate that the MLP gene family is a target of the miR394-LCR regulatory node, representing potential targets for direct post-translational regulation mediated by LCR F-Box. In addition, the MLP28 family member is associated with the LCR regulation that is critical for normal Arabidopsis development. PMID:27067051
PCR analysis of the cryI insecticidal crystal family genes from Bacillus thuringiensis.
Ceron, J; Covarrubias, L; Quintero, R; Ortiz, A; Ortiz, M; Aranda, E; Lina, L; Bravo, A
1994-01-01
A method allowing rapid and accurate identification of different subgroups within the insecticidal crystal CryI protein-producing family of Bacillus thuringiensis strains was established by using PCR technology. Thirteen highly homologous primers specific to regions within genes encoding seven different subgroups of B. thuringiensis CryI proteins were described. Differentiation among these strains was determined on the basis of the electrophoretic patterns of PCR products. B. thuringiensis strains, isolated from soil samples, were analyzed by PCR technology. Small amounts of bacterial lysates were assayed in two reaction mixtures containing six to eight primers. This method can be applied to rapidly detect the subgroups of CryI proteins that correspond with toxicity to various lepidopteran insects. PMID:8117089
Exploring metazoan evolution through dynamic and holistic changes in protein families and domains
2012-01-01
Background Proteins convey the majority of biochemical and cellular activities in organisms. Over the course of evolution, proteins undergo normal sequence mutations as well as large scale mutations involving domain duplication and/or domain shuffling. These events result in the generation of new proteins and protein families. Processes that affect proteome evolution drive species diversity and adaptation. Herein, change over the course of metazoan evolution, as defined by birth/death and duplication/deletion events within protein families and domains, was examined using the proteomes of 9 metazoan and two outgroup species. Results In studying members of the three major metazoan groups, the vertebrates, arthropods, and nematodes, we found that the number of protein families increased at the majority of lineages over the course of metazoan evolution where the magnitude of these increases was greatest at the lineages leading to mammals. In contrast, the number of protein domains decreased at most lineages and at all terminal lineages. This resulted in a weak correlation between protein family birth and domain birth; however, the correlation between domain birth and domain member duplication was quite strong. These data suggest that domain birth and protein family birth occur via different mechanisms, and that domain shuffling plays a role in the formation of protein families. The ratio of protein family birth to protein domain birth (domain shuffling index) suggests that shuffling had a more demonstrable effect on protein families in nematodes and arthropods than in vertebrates. Through the contrast of high and low domain shuffling indices at the lineages of Trichinella spiralis and Gallus gallus, we propose a link between protein redundancy and evolutionary changes controlled by domain shuffling; however, the speed of adaptation among the different lineages was relatively invariant. 
Evaluating the functions of protein families that appeared or disappeared at the last common ancestors (LCAs) of the three metazoan clades supports a correlation with organism adaptation. Furthermore, bursts of new protein families and domains in the LCAs of metazoans and vertebrates are consistent with whole genome duplications. Conclusion Metazoan speciation and adaptation were explored by birth/death and duplication/deletion events among protein families and domains. Our results provide insights into protein evolution and its bearing on metazoan evolution. PMID:22862991
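The domain shuffling index used above is defined in the abstract as the ratio of protein family birth to protein domain birth. A minimal sketch of that ratio follows; the lineage names and counts are invented placeholders, not values from the study, and the function name is hypothetical.

```python
def domain_shuffling_index(family_births, domain_births):
    """Ratio of protein family births to protein domain births on a lineage."""
    if domain_births <= 0:
        raise ValueError("domain_births must be positive")
    return family_births / domain_births

# Hypothetical counts for two lineages: (family births, domain births).
lineage_counts = {"lineage_A": (120, 10), "lineage_B": (45, 15)}

indices = {name: domain_shuffling_index(families, domains)
           for name, (families, domains) in lineage_counts.items()}
for name, value in indices.items():
    print(name, value)
```

A higher value means many new families arose per new domain, which the abstract interprets as a larger contribution from rearranging existing domains.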
Darbro, Benjamin W; Mahajan, Vinit B; Gakhar, Lokesh; Skeie, Jessica M; Campbell, Elizabeth; Wu, Shu; Bing, Xinyu; Millen, Kathleen J; Dobyns, William B; Kessler, John A; Jalali, Ali; Cremer, James; Segre, Alberto; Manak, J Robert; Aldinger, Kimerbly A; Suzuki, Satoshi; Natsume, Nagato; Ono, Maya; Hai, Huynh Dai; Viet, Le Thi; Loddo, Sara; Valente, Enza M; Bernardini, Laura; Ghonge, Nitin; Ferguson, Polly J; Bassuk, Alexander G
2013-08-01
We performed whole-exome sequencing of a family with autosomal dominant Dandy-Walker malformation and occipital cephaloceles and detected a mutation in the extracellular matrix (ECM) protein-encoding gene NID1. In a second family, protein interaction network analysis identified a mutation in LAMC1, which encodes a NID1-binding partner. Structural modeling of the NID1-LAMC1 complex demonstrated that each mutation disrupts the interaction. These findings implicate the ECM in the pathogenesis of Dandy-Walker spectrum disorders. © 2013 WILEY PERIODICALS, INC.
Genomic analysis of organismal complexity in the multicellular green alga Volvox carteri
DOE Office of Scientific and Technical Information (OSTI.GOV)
Prochnik, Simon E.; Umen, James; Nedelcu, Aurora
2010-07-01
Analysis of the Volvox carteri genome reveals that this green alga's increased organismal complexity and multicellularity are associated with modifications in protein families shared with its unicellular ancestor, and not with large-scale innovations in protein coding capacity. The multicellular green alga Volvox carteri and its morphologically diverse close relatives (the volvocine algae) are uniquely suited for investigating the evolution of multicellularity and development. We sequenced the 138 Mb genome of V. carteri and compared its ~14,500 predicted proteins to those of its unicellular relative, Chlamydomonas reinhardtii. Despite fundamental differences in organismal complexity and life history, the two species have similar protein-coding potentials, and few species-specific protein-coding gene predictions. Interestingly, volvocine algal-specific proteins are enriched in Volvox, including those associated with an expanded and highly compartmentalized extracellular matrix. Our analysis shows that increases in organismal complexity can be associated with modifications of lineage-specific proteins rather than large-scale invention of protein-coding capacity.
Sela, Noa; Lachman, Oded; Reingold, Victoria; Dombrovsky, Aviv
2013-10-01
A novel virus was detected in watermelon plants (Citrullus lanatus Thunb.) infected with Melon necrotic spot virus (MNSV) using SOLiD next-generation sequence analysis. In addition to the expected MNSV genome, two double-stranded RNA (dsRNA) segments of 1,312 and 1,118 bp were also identified and sequenced from the purified virus preparations. These two dsRNA segments encode two putative partitivirus-related proteins, an RNA-dependent RNA polymerase (RdRP) and a capsid protein, which were sequenced. Genomic-sequence analysis and analysis of phylogenetic relationships indicate that these two dsRNAs together make up the genome of a novel Partitivirus. This virus was found to be closely related to the Pepper cryptic virus 1 and Raphanus sativus cryptic virus. It is suggested that this novel virus putatively named Citrullus lanatus cryptic virus be considered as a new member of the family Partitiviridae.
Armstrong, Stuart D; Xia, Dong; Bah, Germanus S; Krishna, Ritesh; Ngangyung, Henrietta F; LaCourse, E James; McSorley, Henry J; Kengne-Ouafo, Jonas A; Chounna-Ndongmo, Patrick W; Wanji, Samuel; Enyong, Peter A; Taylor, David W; Blaxter, Mark L; Wastling, Jonathan M; Tanya, Vincent N; Makepeace, Benjamin L
2016-08-01
Despite 40 years of control efforts, onchocerciasis (river blindness) remains one of the most important neglected tropical diseases, with 17 million people affected. The etiological agent, Onchocerca volvulus, is a filarial nematode with a complex lifecycle involving several distinct stages in the definitive host and blackfly vector. The challenges of obtaining sufficient material have prevented high-throughput studies and the development of novel strategies for disease control and diagnosis. Here, we utilize the closest relative of O. volvulus, the bovine parasite Onchocerca ochengi, to compare stage-specific proteomes and host-parasite interactions within the secretome. We identified a total of 4260 unique O. ochengi proteins from adult males and females, infective larvae, intrauterine microfilariae, and fluid from intradermal nodules. In addition, 135 proteins were detected from the obligate Wolbachia symbiont. Observed protein families that were enriched in all whole body extracts relative to the complete search database included immunoglobulin-domain proteins, whereas redox and detoxification enzymes and proteins involved in intracellular transport displayed stage-specific overrepresentation. Unexpectedly, the larval stages exhibited enrichment for several mitochondrial-related protein families, including members of peptidase family M16 and proteins which mediate mitochondrial fission and fusion. Quantification of proteins across the lifecycle using the Hi-3 approach supported these qualitative analyses. In nodule fluid, we identified 94 O. ochengi secreted proteins, including homologs of transforming growth factor-β and a second member of a novel 6-ShK toxin domain family, which was originally described from a model filarial nematode (Litomosoides sigmodontis). Strikingly, the 498 bovine proteins identified in nodule fluid were strongly dominated by antimicrobial proteins, especially cathelicidins. This first high-throughput analysis of an Onchocerca spp. 
proteome across the lifecycle highlights its profound complexity and emphasizes the extremely close relationship between O. ochengi and O. volvulus. The insights presented here provide new candidates for vaccine development, drug targeting and diagnostic biomarkers. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
Armstrong, Stuart D.; Xia, Dong; Bah, Germanus S.; Krishna, Ritesh; Ngangyung, Henrietta F.; LaCourse, E. James; McSorley, Henry J.; Kengne-Ouafo, Jonas A.; Chounna-Ndongmo, Patrick W.; Wanji, Samuel; Enyong, Peter A.; Taylor, David W.; Blaxter, Mark L.; Wastling, Jonathan M.; Tanya, Vincent N.; Makepeace, Benjamin L.
2016-01-01
Despite 40 years of control efforts, onchocerciasis (river blindness) remains one of the most important neglected tropical diseases, with 17 million people affected. The etiological agent, Onchocerca volvulus, is a filarial nematode with a complex lifecycle involving several distinct stages in the definitive host and blackfly vector. The challenges of obtaining sufficient material have prevented high-throughput studies and the development of novel strategies for disease control and diagnosis. Here, we utilize the closest relative of O. volvulus, the bovine parasite Onchocerca ochengi, to compare stage-specific proteomes and host-parasite interactions within the secretome. We identified a total of 4260 unique O. ochengi proteins from adult males and females, infective larvae, intrauterine microfilariae, and fluid from intradermal nodules. In addition, 135 proteins were detected from the obligate Wolbachia symbiont. Observed protein families that were enriched in all whole body extracts relative to the complete search database included immunoglobulin-domain proteins, whereas redox and detoxification enzymes and proteins involved in intracellular transport displayed stage-specific overrepresentation. Unexpectedly, the larval stages exhibited enrichment for several mitochondrial-related protein families, including members of peptidase family M16 and proteins which mediate mitochondrial fission and fusion. Quantification of proteins across the lifecycle using the Hi-3 approach supported these qualitative analyses. In nodule fluid, we identified 94 O. ochengi secreted proteins, including homologs of transforming growth factor-β and a second member of a novel 6-ShK toxin domain family, which was originally described from a model filarial nematode (Litomosoides sigmodontis). Strikingly, the 498 bovine proteins identified in nodule fluid were strongly dominated by antimicrobial proteins, especially cathelicidins. This first high-throughput analysis of an Onchocerca spp. 
proteome across the lifecycle highlights its profound complexity and emphasizes the extremely close relationship between O. ochengi and O. volvulus. The insights presented here provide new candidates for vaccine development, drug targeting and diagnostic biomarkers. PMID:27226403
UBIAD1 Mutation Alters a Mitochondrial Prenyltransferase to Cause Schnyder Corneal Dystrophy
Nickerson, Michael L.; Kostiha, Brittany N.; Brandt, Wolfgang; Fredericks, William; Xu, Ke-Ping; Yu, Fu-Shin; Gold, Bert; Chodosh, James; Goldberg, Marc; Lu, Da Wen; Yamada, Masakazu; Tervo, Timo M.; Grutzmacher, Richard; Croasdale, Chris; Hoeltzenbein, Maria; Sutphin, John; Malkowicz, S. Bruce; Wessjohann, Ludger; Kruth, Howard S.; Dean, Michael; Weiss, Jayne S.
2010-01-01
Background Mutations in a novel gene, UBIAD1, were recently found to cause the autosomal dominant eye disease Schnyder corneal dystrophy (SCD). SCD is characterized by an abnormal deposition of cholesterol and phospholipids in the cornea resulting in progressive corneal opacification and visual loss. We characterized lesions in the UBIAD1 gene in new SCD families and examined protein homology, localization, and structure. Methodology/Principal Findings We characterized five novel mutations in the UBIAD1 gene in ten SCD families, including a first SCD family of Native American ethnicity. Examination of protein homology revealed that SCD altered amino acids which were highly conserved across species. Cell lines were established from patients including keratocytes obtained after corneal transplant surgery and lymphoblastoid cell lines from Epstein-Barr virus immortalized peripheral blood mononuclear cells. These were used to determine the subcellular localization of mutant and wild type protein, and to examine cholesterol metabolite ratios. Immunohistochemistry using antibodies specific for UBIAD1 protein in keratocytes revealed that both wild type and N102S protein were localized sub-cellularly to mitochondria. Analysis of cholesterol metabolites in patient cell line extracts showed no significant alteration in the presence of mutant protein indicating a potentially novel function of the UBIAD1 protein in cholesterol biochemistry. Molecular modeling was used to develop a model of human UBIAD1 protein in a membrane and revealed potentially critical roles for amino acids mutated in SCD. Potential primary and secondary substrate binding sites were identified and docking simulations indicated likely substrates including prenyl and phenolic molecules. Conclusions/Significance Accumulating evidence from the SCD familial mutation spectrum, protein homology across species, and molecular modeling suggest that protein function is likely down-regulated by SCD mutations. 
Mitochondrial UBIAD1 protein appears to have a highly conserved function that, at least in humans, is involved in cholesterol metabolism in a novel manner. PMID:20505825
Grapevine MLO candidates required for powdery mildew pathogenicity?
Feechan, Angela; Jermakow, Angelica M
2009-01-01
MLOs belong to the largest family of seven-transmembrane (7TM) domain proteins found in plants. The Arabidopsis and rice genomes contain 15 and 12 MLO family members, respectively. Although the biological function of most MLO family members remains elusive, a select group of MLO proteins have been demonstrated to negatively regulate defence responses to the obligate biotrophic pathogen, powdery mildew, thereby acting as “susceptibility” genes. Recently we identified a family of 17 putative VvMLO genes in the genome of the cultivated winegrape species, Vitis vinifera. Expression analysis indicated that the VvMLO family members respond differently to biotic and abiotic stimuli. Infection of V. vinifera by grape powdery mildew (Erysiphe necator) specifically upregulates four VvMLO genes that are orthologous to the Arabidopsis and tomato MLOs previously demonstrated to be required for powdery mildew susceptibility. We postulate that one or more of these E. necator responsive VvMLOs may have a role in the powdery mildew susceptibility of grapevine. PMID:19816131
Yang, Ming-ming; Ho, Mary; Lau, Henry H W; Tam, Pancy O S; Young, Alvin L; Pang, Chi Pui; Yip, Wilson W K; Chen, LiJia
2013-01-01
To determine the underlying genetic cause of Duane retraction syndrome (DRS) in a non-consanguineous Chinese Han family. Detailed ophthalmic and physical examinations were performed on all members from a pedigree with DRS. All exons and their adjacent splicing junctions of the sal-like 4 (SALL4) gene were amplified with polymerase chain reaction and analyzed with direct sequencing in all the recruited family members and 200 unrelated control subjects. Clinical examination revealed a broad spectrum of phenotypes in the DRS family. Mutation analysis of SALL4 identified a novel heterozygous duplication mutation, c.1919dupT, which was completely cosegregated with the disease in the family and absent in controls. This mutation was predicted to cause a frameshift, introducing a premature stop codon, when translated, resulting in a truncated SALL4 protein, i.e., p.Met640IlefsX25. Bioinformatics analysis showed that the affected region of SALL4 shared a highly conserved sequence across different species. Diversified clinical manifestations were observed in the c.1919dupT carriers of the family. We identified a novel truncating mutation in the SALL4 gene that leads to diversified clinical features of DRS in a Chinese family. This mutation is predicted to result in a truncated SALL4 protein affecting two functional domains and cause disease development due to haploinsufficiency through nonsense-mediated mRNA decay.
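The variant notation above can be checked arithmetically: a coding-DNA position maps to a codon number by integer division, and duplicating the middle base of a Met codon yields Ile. The following is a minimal Python sketch, not the authors' analysis; the helper names are hypothetical, and it assumes that coding-DNA numbering starts at the A of the start codon and that codon 640 is ATG (the only codon for Met).

```python
def codon_number(cdna_pos: int) -> int:
    """Return the 1-based codon number containing a 1-based coding-DNA position."""
    return (cdna_pos - 1) // 3 + 1


def position_in_codon(cdna_pos: int) -> int:
    """Return 1, 2 or 3: where the position falls inside its codon."""
    return (cdna_pos - 1) % 3 + 1


def duplicate_base(codon: str, pos_in_codon: int) -> str:
    """Duplicate one base of a codon and return the first three bases of the result."""
    i = pos_in_codon - 1
    mutated = codon[: i + 1] + codon[i] + codon[i + 1:]
    return mutated[:3]


# Only the two codons needed for this illustration (assumption: codon 640 is ATG).
CODON_TABLE = {"ATG": "Met", "ATT": "Ile"}

codon = codon_number(1919)           # codon hit by c.1919dupT
offset = position_in_codon(1919)     # base within that codon
new_codon = duplicate_base("ATG", offset)
print(codon, offset, new_codon, CODON_TABLE[new_codon])
```

Under these assumptions the duplicated T is the second base of codon 640, and ATG becomes ATT, which is consistent with the reported p.Met640Ile frameshift. The sketch does not model the downstream reading frame, so it says nothing about where the premature stop codon falls.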
Ma, Yibao; Zhao, Yong; Zhao, Ruiming; Zhang, Weiping; He, Yawen; Wu, Yingliang; Cao, Zhijian; Guo, Lin; Li, Wenxin
2010-07-01
Scorpion venoms contain a vast untapped reservoir of natural products, which have the potential for medicinal value in drug discovery. In this study, toxin components from the scorpion Heterometrus petersii venom were evaluated by transcriptome and proteome analysis. Ten known families of venom peptides and proteins were identified, which include: two families of potassium channel toxins, four families of antimicrobial and cytolytic peptides, and one family from each of the calcium channel toxins, La1-like peptides, phospholipase A2, and the serine proteases. In addition, we identified 12 atypical families, which include the acid phosphatases, diuretic peptides, and ten orphan families. From the data presented here, the extreme diversity and convergence of toxic components in scorpion venom were uncovered. Our work demonstrates the power of combining transcriptomic and proteomic approaches in the study of animal venoms.
Expansion of the receptor-like kinase/Pelle gene family and receptor-like proteins in Arabidopsis.
Shiu, Shin Han; Bleecker, Anthony B
2003-06-01
Receptor-like kinases (RLKs) are a family of transmembrane proteins with versatile N-terminal extracellular domains and C-terminal intracellular kinases. They control a wide range of physiological responses in plants and belong to one of the largest gene families in the Arabidopsis genome with more than 600 members. Interestingly, this gene family constitutes 60% of all kinases in Arabidopsis and accounts for nearly all transmembrane kinases in Arabidopsis. Analysis of four fungal, six metazoan, and two Plasmodium sp. genomes indicates that the family was represented in all but fungal genomes, indicating an ancient origin for the family with a more recent expansion only in the plant lineages. The RLK/Pelle family can be divided into several subfamilies based on three independent criteria: the phylogeny based on kinase domain sequences, the extracellular domain identities, and intron locations and phases. A large number of receptor-like proteins (RLPs) resembling the extracellular domains of RLKs are also found in the Arabidopsis genome. However, not all RLK subfamilies have corresponding RLPs. Several RLK/Pelle subfamilies have undergone differential expansions. More than 33% of the RLK/Pelle members are found in tandem clusters, substantially higher than the genome average. In addition, 470 of the RLK/Pelle family members are located within the segmentally duplicated regions in the Arabidopsis genome and 268 of them have a close relative in the corresponding regions. Therefore, tandem duplications and segmental/whole-genome duplications represent two of the major mechanisms for the expansion of the RLK/Pelle family in Arabidopsis.
Amino acid sequence analysis of the annexin super-gene family of proteins.
Barton, G J; Newman, R H; Freemont, P S; Crumpton, M J
1991-06-15
The annexins are a widespread family of calcium-dependent membrane-binding proteins. No common function has been identified for the family and, until recently, no crystallographic data existed for an annexin. In this paper we draw together 22 available annexin sequences consisting of 88 similar repeat units, and apply the techniques of multiple sequence alignment, pattern matching, secondary structure prediction and conservation analysis to the characterisation of the molecules. The analysis clearly shows that the repeats cluster into four distinct families and that greatest variation occurs within the repeat 3 units. Multiple alignment of the 88 repeats shows amino acids with conserved physicochemical properties at 22 positions, with only Gly at position 23 being absolutely conserved in all repeats. Secondary structure prediction techniques identify five conserved helices in each repeat unit and patterns of conserved hydrophobic amino acids are consistent with one face of a helix packing against the protein core in predicted helices a, c, d, e. Helix b is generally hydrophobic in all repeats, but contains a striking pattern of repeat-specific residue conservation at position 31, with Arg in repeats 4 and Glu in repeats 2, but unconserved amino acids in repeats 1 and 3. This suggests repeats 2 and 4 may interact via a buried saltbridge. The loop between predicted helices a and b of repeat 3 shows features distinct from the equivalent loop in repeats 1, 2 and 4, suggesting an important structural and/or functional role for this region. No compelling evidence emerges from this study for uteroglobin and the annexins sharing similar tertiary structures, or for uteroglobin representing a derivative of a primordial one-repeat structure that underwent duplication to give the present day annexins. The analyses performed in this paper are re-evaluated in the Appendix, in the light of the recently published X-ray structure for human annexin V. 
The structure confirms most of the predictions and shows the power of techniques for the determination of tertiary structural information from the amino acid sequences of an aligned protein family.
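The conservation analysis described above rests on a simple idea: scan each column of a multiple alignment and ask how uniform it is. A minimal Python sketch of that idea follows. It scores strict residue identity only, not the physicochemical-property conservation used in the paper, and the toy alignment and function names are invented for illustration; they are not annexin repeats.

```python
from collections import Counter


def column_conservation(alignment):
    """Per column, the fraction of sequences carrying the most common residue.

    Gaps ('-') are never counted as the most common residue, but they still
    count in the denominator, so a gapped column cannot score 1.0.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


def absolutely_conserved(alignment):
    """1-based positions where every sequence has the same residue."""
    return [i + 1 for i, s in enumerate(column_conservation(alignment)) if s == 1.0]


# Hypothetical toy alignment of four short sequences.
toy = ["KGLGTDE", "RGFGTNE", "KGAGCDE", "RGLGT-E"]
print(column_conservation(toy))
print(absolutely_conserved(toy))
```

In the toy data, columns 2, 4 and 7 are identical in every sequence, which is the kind of signal behind the abstract's statement that only one Gly is absolutely conserved across all 88 repeats.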
Oliveira, Alberto F; Folador, Edson L; Gomide, Anne C P; Goes-Neto, Aristóteles; Azevedo, Vasco A C; Wattam, Alice R
2018-02-15
The genus Corynebacterium includes species of great importance in medical, veterinary and biotechnological fields. The genus-specific families (PLfams) from PATRIC were used to identify conserved proteins associated with all species. Our results showed a large number of conserved proteins that are associated with the cell division process. Other proteins that interact with FtsZ, such as FtsA and ZapA, were not observed in our results. Our findings, explored by molecular docking, protein-protein interaction and sequence analysis, suggest that SepF overlaps in function with these proteins. Transcriptomic analysis showed that these two proteins (SepF and FtsZ) can be expressed together under different conditions. The work presents new findings on molecules participating in cell division, based on the interaction of FtsZ and SepF, as new therapeutic targets.
1999-07-01
patients with Ph’-positive leukemias also revealed loss of Abi proteins. We determined by RNase protection assay and reverse transcriptase polymerase...myelogenous leukemia. Abi protein levels also appeared unaltered by Western blot analysis of human lung, liver, colon, and breast carcinoma tissues as...generated in the presence of Bcr-Abl • Abi protein degradation was observed in Ph’+ leukemia-derived cells, but not in Ph1- leukemias or in human breast
Oates, A C; Wollberg, P; Achen, M G; Wilks, A F
1998-08-28
The polymerase chain reaction (PCR), with cDNA as template, has been widely used to identify members of protein families from many species. A major limitation of using cDNA in PCR is that detection of a family member is dependent on temporal and spatial patterns of gene expression. To circumvent this restriction, and in order to develop a technique that is broadly applicable, we have tested the use of genomic DNA as PCR template to identify members of protein families in an expression-independent manner. This test involved amplification of DNA encoding protein tyrosine kinase (PTK) genes from the genomes of three animal species that are well-known developmental models, namely the mouse Mus musculus, the fruit fly Drosophila melanogaster, and the nematode worm Caenorhabditis elegans. Ten PTK genes were identified from the mouse, 13 from the fruit fly, and 13 from the nematode worm. Among these kinases were 13 members of the PTK family that had not been reported previously. Selected PTKs from this screen were shown to be expressed during development, demonstrating that the amplified fragments did not arise from pseudogenes. This approach will be useful for the identification of many novel members of gene families in organisms of agricultural, medical, developmental and evolutionary significance and for analysis of gene families from any species or biological sample whose habitat precludes the isolation of mRNA. Furthermore, as a tool to hasten the discovery of members of gene families that are of particular interest, this method offers an opportunity to sample the genome for new members irrespective of their expression pattern.
Maršálová, Lucie; Vítámvás, Pavel; Hynek, Radovan; Prášil, Ilja T.; Kosová, Klára
2016-01-01
Response to a high salinity treatment of 300 mM NaCl was studied in a cultivated barley Hordeum vulgare Syrian cultivar Tadmor and in a halophytic wild barley H. marinum. Differential salinity tolerance of H. marinum and H. vulgare is underlain by qualitative and quantitative differences in proteins involved in a variety of biological processes. The major aim was to identify proteins underlying differential salinity tolerance between the two barley species. Analyses of plant water content, osmotic potential and accumulation of proline and dehydrin proteins under high salinity revealed a relatively higher water saturation deficit in H. marinum than in H. vulgare, while H. vulgare had lower osmotic potential corresponding with high levels of proline and dehydrins. Analysis of proteins soluble upon boiling isolated from control and salt-treated crown tissues revealed similarities as well as differences between H. marinum and H. vulgare. The similar salinity responses of both barley species lie in enhanced levels of stress-protective proteins such as defense-related proteins from the late-embryogenesis abundant family, several chaperones from the heat shock protein family, and others such as GrpE. However, significant differences were also found between the H. marinum and H. vulgare salinity responses, indicating active stress acclimation in H. marinum but stress damage in H. vulgare. Active acclimation to high salinity in H. marinum is underlined by enhanced levels of several stress-responsive transcription factors from the basic leucine zipper and nascent polypeptide-associated complex families. In salt-treated H. marinum, enhanced levels of proteins involved in energy metabolism such as glycolysis, ATP metabolism, and photosynthesis-related proteins indicate an active acclimation to enhanced energy requirements during the establishment of novel plant homeostasis. In contrast, changes at the proteome level in salt-treated H. vulgare indicate plant tissue damage as revealed by enhanced levels of proteins involved in proteasome-dependent protein degradation and proteins related to apoptosis. The results of proteomic analysis clearly indicate differential responses to high salinity and provide more profound insight into biological mechanisms underlying salinity response between two barley species with contrasting salinity tolerance. PMID:27536311
Identification and characterisation of seed storage protein transcripts from Lupinus angustifolius
2011-01-01
Background In legumes, seed storage proteins are important for the developing seedling and are an important source of protein for humans and animals. Lupinus angustifolius (L.), also known as narrow-leaf lupin (NLL), is a grain legume crop that is gaining recognition as a potential human health food as the grain is high in protein and dietary fibre, gluten-free and low in fat and starch. Results Genes encoding the seed storage proteins of NLL were characterised by sequencing cDNA clones derived from developing seeds. Four families of seed storage proteins were identified and comprised three unique α, seven β, two γ and four δ conglutins. This study added eleven new expressed storage protein genes for the species. A comparison of the deduced amino acid sequences of NLL conglutins with those available for the storage proteins of Lupinus albus (L.), Pisum sativum (L.), Medicago truncatula (L.), Arachis hypogaea (L.) and Glycine max (L.) permitted the analysis of phylogenetic relationships between proteins and demonstrated, in general, that the strongest conservation occurred within species. In the case of 7S globulin (β conglutins) and 2S sulphur-rich albumin (δ conglutins), the analysis suggests that gene duplication occurred after legume speciation. This contrasted with 11S globulin (α conglutin) and basic 7S (γ conglutin) sequences where some of these sequences appear to have diverged prior to speciation. The most abundant NLL conglutin family was β (56%), followed by α (24%), δ (15%) and γ (6%) and the transcript levels of these genes increased 10^3 to 10^6 fold during seed development. We used the 16 NLL conglutin sequences identified here to determine that for individuals specifically allergic to lupin, all seven members of the β conglutin family were potential allergens. Conclusion This study has characterised 16 seed storage protein genes in NLL including 11 newly-identified members. 
It has helped lay the foundation for efforts to use molecular breeding approaches to improve lupins, for example by reducing allergens or increasing the expression of specific seed storage protein(s) with desirable nutritional properties. PMID:21457583
Identification and characterisation of seed storage protein transcripts from Lupinus angustifolius.
Foley, Rhonda C; Gao, Ling-Ling; Spriggs, Andrew; Soo, Lena Y C; Goggin, Danica E; Smith, Penelope M C; Atkins, Craig A; Singh, Karam B
2011-04-04
In legumes, seed storage proteins are important for the developing seedling and are an important source of protein for humans and animals. Lupinus angustifolius (L.), also known as narrow-leaf lupin (NLL), is a grain legume crop that is gaining recognition as a potential human health food as the grain is high in protein and dietary fibre, gluten-free and low in fat and starch. Genes encoding the seed storage proteins of NLL were characterised by sequencing cDNA clones derived from developing seeds. Four families of seed storage proteins were identified and comprised three unique α, seven β, two γ and four δ conglutins. This study added eleven new expressed storage protein genes for the species. A comparison of the deduced amino acid sequences of NLL conglutins with those available for the storage proteins of Lupinus albus (L.), Pisum sativum (L.), Medicago truncatula (L.), Arachis hypogaea (L.) and Glycine max (L.) permitted the analysis of phylogenetic relationships between proteins and demonstrated, in general, that the strongest conservation occurred within species. In the case of 7S globulin (β conglutins) and 2S sulphur-rich albumin (δ conglutins), the analysis suggests that gene duplication occurred after legume speciation. This contrasted with 11S globulin (α conglutin) and basic 7S (γ conglutin) sequences where some of these sequences appear to have diverged prior to speciation. The most abundant NLL conglutin family was β (56%), followed by α (24%), δ (15%) and γ (6%) and the transcript levels of these genes increased 10^3 to 10^6 fold during seed development. We used the 16 NLL conglutin sequences identified here to determine that for individuals specifically allergic to lupin, all seven members of the β conglutin family were potential allergens. This study has characterised 16 seed storage protein genes in NLL including 11 newly-identified members. 
It has helped lay the foundation for efforts to use molecular breeding approaches to improve lupins, for example by reducing allergens or increasing the expression of specific seed storage protein(s) with desirable nutritional properties.
Protein family clustering for structural genomics.
Yan, Yongpan; Moult, John
2005-10-28
A major goal of structural genomics is the provision of a structural template for a large fraction of protein domains. The magnitude of this task depends on the number and nature of protein sequence families. With a large number of bacterial genomes now fully sequenced, it is possible to obtain improved estimates of the number and diversity of families in that kingdom. We have used an automated clustering procedure to group all sequences in a set of genomes into protein families. Bench-marking shows the clustering method is sensitive at detecting remote family members, and has a low level of false positives. This comprehensive protein family set has been used to address the following questions. (1) What is the structure coverage for currently known families? (2) How will the number of known apparent families grow as more genomes are sequenced? (3) What is a practical strategy for maximizing structure coverage in future? Our study indicates that approximately 20% of known families with three or more members currently have a representative structure. The study indicates also that the number of apparent protein families will be considerably larger than previously thought: We estimate that, by the criteria of this work, there will be about 250,000 protein families when 1000 microbial genomes have been sequenced. However, the vast majority of these families will be small, and it will be possible to obtain structural templates for 70-80% of protein domains with an achievable number of representative structures, by systematically sampling the larger families.
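Grouping sequences into families from pairwise similarity can be illustrated with single-linkage clustering. The abstract does not describe the authors' automated procedure, so the Python sketch below is a stand-in technique rather than their method: it merges any two sequences joined by a similarity hit using a union-find structure. The sequence identifiers, the hit list and the function name are hypothetical.

```python
def cluster_families(sequence_ids, similar_pairs):
    """Single-linkage clustering: sequences connected by any chain of hits share a family."""
    parent = {s: s for s in sequence_ids}

    def find(x):
        # Follow parent links to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    families = {}
    for s in sequence_ids:
        families.setdefault(find(s), []).append(s)
    # Largest families first; ties broken alphabetically for a stable order.
    return sorted((sorted(m) for m in families.values()), key=lambda m: (-len(m), m))


# Hypothetical input: six sequences and three pairwise similarity hits.
ids = ["p1", "p2", "p3", "p4", "p5", "p6"]
hits = [("p1", "p2"), ("p2", "p3"), ("p4", "p5")]

families = cluster_families(ids, hits)
large = [f for f in families if len(f) >= 3]  # families with three or more members
print(families)
print(large)
```

Single linkage is sensitive to remote members because one hit is enough to join a family, but for the same reason a single spurious hit can fuse unrelated families, which is why benchmarking of false positives matters for any procedure of this kind.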
Islam, Shiful; Rahman, Iffat Ara; Islam, Tahmina
2017-01-01
Glutathione S-transferase (GST) refers to one of the major detoxifying enzymes that play an important role in different abiotic and biotic stress modulation pathways of plants. The present study aimed at a comprehensive genome-wide functional characterization of GST genes and proteins in tomato (Solanum lycopersicum L.). The whole genome sequence analysis revealed the presence of 90 GST genes in tomato, the largest GST gene family reported to date. Eight segmental duplicated gene pairs might contribute significantly to the expansion of the SlGST gene family. Based on phylogenetic analysis of tomato, rice, and Arabidopsis GST proteins, GST family members could be further divided into ten classes. Members of each orthologous class showed high conservation among themselves. Tau and lambda are the major classes in tomato, while tau and phi are the major classes in rice and Arabidopsis. Chromosomal localization revealed highly uneven distribution of SlGST genes in 13 different chromosomes, where chromosome 9 possessed the highest number of genes. Based on publicly available microarray data, expression analysis of 30 available SlGST genes exhibited a differential pattern in all the analyzed tissues and developmental stages. Moreover, most of the members showed highly induced expression in response to multiple biotic and abiotic stress inducers that could be harmonized with the increase in total GST enzyme activity under several stress conditions. Activity of tomato GST could be enhanced further by using some positive modulators (safeners) that have been predicted through molecular docking of SlGSTU5 and ligands. Moreover, tomato GST proteins are predicted to interact with many other glutathione synthesizing and utilizing enzymes such as glutathione peroxidase, glutathione reductase, glutathione synthetase and γ-glutamyltransferase. 
This comprehensive genome-wide analysis and expression profiling would provide a rational platform and possibility to explore the versatile role of GST genes in crop engineering. PMID:29095889
DOE Office of Scientific and Technical Information (OSTI.GOV)
Corbin, Cyrielle; Drouet, Samantha; Markulin, Lucija
Identification of DIR encoding genes in flax genome. Analysis of phylogeny, gene/protein structures and evolution. Identification of new conserved motifs linked to biochemical functions. Investigation of spatio-temporal gene expression and response to stress. Dirigent proteins (DIRs) were discovered during 8-8' lignan biosynthesis studies, through identification of stereoselective coupling to afford either (+)- or (-)-pinoresinols from E-coniferyl alcohol. DIRs are also involved or potentially involved in terpenoid, allyl/propenyl phenol lignan, pterocarpan and lignin biosynthesis. DIRs have very large multigene families in different vascular plants including flax, with most still of unknown function. DIR studies typically focus on a small subset of genes and identification of biochemical/physiological functions. Herein, a genome-wide analysis and characterization of the predicted flax DIR 44-membered multigene family was performed, this species being a rich natural grain source of 8-8' linked secoisolariciresinol-derived lignan oligomers. All predicted DIR sequences, including their promoters, were analyzed together with their public gene expression datasets. Expression patterns of selected DIRs were examined using qPCR, as well as through clustering analysis of DIR gene expression. These analyses further implicated roles for specific DIRs in (-)-pinoresinol formation in seed-coats, as well as (+)-pinoresinol in vegetative organs and/or specific responses to stress. Phylogeny and gene expression analysis segregated flax DIRs into six distinct clusters with new cluster-specific motifs identified. We propose that these findings can serve as a foundation to further systematically determine functions of DIRs, i.e. other than those already known in lignan biosynthesis in flax and other species. 
Given the differential expression profiles and inducibility of the flax DIR family, we provisionally propose that some DIR genes of unknown function could be involved in different aspects of secondary cell wall biosynthesis and plant defense.
Corbin, Cyrielle; Drouet, Samantha; Markulin, Lucija; Auguin, Daniel; Lainé, Éric; Davin, Laurence B; Cort, John R; Lewis, Norman G; Hano, Christophe
2018-05-01
Identification of DIR encoding genes in flax genome. Analysis of phylogeny, gene/protein structures and evolution. Identification of new conserved motifs linked to biochemical functions. Investigation of spatio-temporal gene expression and response to stress. Dirigent proteins (DIRs) were discovered during 8-8' lignan biosynthesis studies, through identification of stereoselective coupling to afford either (+)- or (-)-pinoresinols from E-coniferyl alcohol. DIRs are also involved or potentially involved in terpenoid, allyl/propenyl phenol lignan, pterocarpan and lignin biosynthesis. DIRs have very large multigene families in different vascular plants including flax, with most still of unknown function. DIR studies typically focus on a small subset of genes and identification of biochemical/physiological functions. Herein, a genome-wide analysis and characterization of the predicted flax DIR 44-membered multigene family was performed, this species being a rich natural grain source of 8-8' linked secoisolariciresinol-derived lignan oligomers. All predicted DIR sequences, including their promoters, were analyzed together with their public gene expression datasets. Expression patterns of selected DIRs were examined using qPCR, as well as through clustering analysis of DIR gene expression. These analyses further implicated roles for specific DIRs in (-)-pinoresinol formation in seed-coats, as well as (+)-pinoresinol in vegetative organs and/or specific responses to stress. Phylogeny and gene expression analysis segregated flax DIRs into six distinct clusters with new cluster-specific motifs identified. We propose that these findings can serve as a foundation to further systematically determine functions of DIRs, i.e. other than those already known in lignan biosynthesis in flax and other species. 
Given the differential expression profiles and inducibility of the flax DIR family, we provisionally propose that some DIR genes of unknown function could be involved in different aspects of secondary cell wall biosynthesis and plant defense.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Yueyong; Xu, Yanhui; Zhu, Jieqing
2005-09-01
Single crystals of the central structure domains from mumps virus F protein have been obtained by the hanging-drop vapour-diffusion method. A diffraction data set has been collected to 2.2 Å resolution. Fusion of members of the Paramyxoviridae family involves two glycoproteins: the attachment protein and the fusion protein. Changes in the fusion-protein conformation were caused by binding of the attachment protein to the cellular receptor. In the membrane-fusion process, two highly conserved heptad-repeat (HR) regions, HR1 and HR2, are believed to form a stable six-helix coiled-coil bundle. However, no crystal structure has yet been determined for this state in the mumps virus (MuV, a member of the Paramyxoviridae family). In this study, a single-chain protein consisting of two HR regions connected by a flexible amino-acid linker (named 2-Helix) was expressed, purified and crystallized by the hanging-drop vapour-diffusion method. A complete X-ray data set was obtained in-house to 2.2 Å resolution from a single crystal. The crystal belongs to space group C2, with unit-cell parameters a = 161.2, b = 60.8, c = 40.1 Å, β = 98.4°. The crystal structure will help in understanding the molecular mechanism of Paramyxoviridae family membrane fusion.
microRNAs affect BCL-2 family proteins in the setting of cerebral ischemia
Ouyang, Yi-Bing; Giffard, Rona G.
2014-01-01
The BCL-2 family is centrally involved in the mechanism of cell death after cerebral ischemia. It is well known that the proteins of the BCL-2 family are key regulators of apoptosis through controlling mitochondrial outer membrane permeabilization. Recent findings suggest that many BCL-2 family members are also directly involved in controlling transmission of Ca2+ from the endoplasmic reticulum (ER) to mitochondria through a specialization called the mitochondria-associated ER membrane (MAM). Increasing evidence supports the involvement of microRNAs (miRNA), some of them targeting BCL-2 family proteins, in the regulation of cerebral ischemia. In this mini-review, after highlighting current knowledge about the multiple functions of BCL-2 family proteins and summarizing their relationship to outcome from cerebral ischemia, we focus on the regulation of BCL-2 family proteins by miRNAs, especially miR-29 which targets multiple BCL-2 family proteins. PMID:24373752
Whole-Genome Survey of the Putative ATP-Binding Cassette Transporter Family Genes in Vitis vinifera
Çakır, Birsen; Kılıçkaya, Ozan
2013-01-01
The ATP-binding cassette (ABC) protein superfamily constitutes one of the largest protein families known in plants. In this report, we performed a complete inventory of ABC protein genes in Vitis vinifera, the whole genome of which has been sequenced. By comparison with ABC protein members of Arabidopsis thaliana, we identified 135 putative ABC proteins with 1 or 2 NBDs in V. vinifera. Of these, 120 encode intrinsic membrane proteins, and 15 encode proteins missing TMDs. V. vinifera ABC proteins can be divided into 13 subfamilies with 79 “full-size,” 41 “half-size,” and 15 “soluble” putative ABC proteins. The main feature of the Vitis ABC superfamily is the presence of 2 large subfamilies, ABCG (pleiotropic drug resistance and white-brown complex homolog) and ABCC (multidrug resistance-associated protein). We identified orthologs of V. vinifera putative ABC transporters in different species. This work represents the first complete inventory of ABC transporters in V. vinifera. The identification of Vitis ABC transporters and their comparative analysis with the Arabidopsis counterparts revealed a strong conservation between the 2 species. This inventory could help elucidate the biological and physiological functions of these transporters in V. vinifera. PMID:24244377
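The size classes reported above follow from counting nucleotide-binding domains (NBDs) and transmembrane domains (TMDs) per protein. The Python sketch below shows one way such a rule could be written and checks that the reported class counts add up. The classification rule is the commonly used convention (two TMDs and two NBDs for full-size, one of each for half-size, no TMD for soluble) and is an assumption here, since the abstract does not spell it out; the function name is hypothetical.

```python
def classify_abc(n_tmd: int, n_nbd: int) -> str:
    """Assign an ABC protein to a size class from its domain counts (assumed convention)."""
    if n_nbd not in (1, 2):
        raise ValueError("expected 1 or 2 NBDs")
    if n_tmd == 0:
        return "soluble"      # no transmembrane domain
    if n_tmd >= 2 and n_nbd == 2:
        return "full-size"    # two TMDs and two NBDs
    return "half-size"        # one TMD paired with one NBD


# Class counts as reported in the abstract.
reported = {"full-size": 79, "half-size": 41, "soluble": 15}

total = sum(reported.values())
membrane = reported["full-size"] + reported["half-size"]
print(total, membrane)
print(classify_abc(2, 2), classify_abc(1, 1), classify_abc(0, 1))
```

The totals agree with the text: 79 + 41 + 15 gives the 135 putative ABC proteins, and the 79 full-size plus 41 half-size proteins give the 120 intrinsic membrane proteins.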
Zhang, Ningbo; Li, Ruimin; Shen, Wei; Jiao, Shuzhen; Zhang, Junxiang; Xu, Weirong
2018-04-27
The major latex protein/ripening-related protein (MLP/RRP) subfamily is known to be involved in a wide range of biological processes of plant development and various stress responses. However, the biological function of MLP/RRP proteins is still far from clear, and identifying them may provide important clues for understanding their roles. Here, we report a genome-wide evolutionary characterization and gene expression analysis of the MLP family in European Vitis species. A total of 14 members were found in the grape genome, all of which are located on chromosome 1, where they are predominantly arranged in tandem clusters. Most surprisingly, we noticed promoter-sharing by several non-identical but highly similar gene members to a greater extent than expected by chance. Synteny analysis between the grape and Arabidopsis thaliana genomes suggested that 3 grape MLP genes arose before the divergence of the two species. Phylogenetic analysis provided further insights into the evolutionary relationship between the genes, as well as their putative functions, and tissue-specific expression analysis suggested distinct biological roles for different members. Our expression data suggested a couple of candidate genes involved in abiotic stresses and phytohormone responses. The present work provides new insight into the evolution and regulation of Vitis MLP genes, which represent targets for future studies and inclusion in tolerance-related molecular breeding programs.
Tsuji, Akihiko; Kikuchi, Yayoi; Sato, Yukimi; Koide, Shizuyo; Yuasa, Keizo; Nagahama, Masami; Matsuda, Yoshiko
2006-01-01
SPCs (subtilisin-like proprotein convertases) are a family of seven structurally related serine endoproteases that are involved in the proteolytic activation of proproteins. In an effort to examine the substrate protein for PACE4 (paired basic amino-acid-cleaving enzyme-4), an SPC, a potent protein inhibitor of PACE4, an α1-antitrypsin RVRR (Arg-Val-Arg-Arg) variant, was expressed in GH4C1 cells. Ectopic expression of the RVRR variant caused accumulation of a 48 kDa protein in cells. Sequence analysis indicates that the 48 kDa protein is a putative Ca2+-binding protein, RCN-3 (reticulocalbin-3), which had previously been predicted by bioinformatic analysis of cDNA from the human hypothalamus. RCN-3 is a member of the CREC (Cab45/reticulocalbin/ERC45/calumenin) family of multiple EF-hand Ca2+-binding proteins localized to the secretory pathway. The most interesting feature of the RCN-3 sequence is the presence of five Arg-Xaa-Xaa-Arg motifs, which represent the target sequence of SPCs. Biosynthetic studies showed that RCN-3 is transiently associated with proPACE4, but not with mature PACE4. Inhibition of PACE4 maturation by a Ca2+ ionophore resulted in accumulation of the proPACE4–RCN-3 complex in cells. Furthermore, autoactivation and secretion of PACE4 were increased upon co-expression with RCN-3. Our findings suggest that selective and transient association of RCN-3 with the precursor of PACE4 plays an important role in the biosynthesis of PACE4. PMID:16433634
Yan, Yan; Wang, Lianzhe; Ding, Zehong; Tie, Weiwei; Ding, Xupo; Zeng, Changying; Wei, Yunxie; Zhao, Hongliang; Peng, Ming; Hu, Wei
2016-01-01
Mitogen-activated protein kinases (MAPKs) play central roles in plant developmental processes, hormone signaling transduction, and responses to abiotic stress. However, no data are currently available about the MAPK family in cassava, an important tropical crop. Herein, 21 MeMAPK genes were identified from cassava. Phylogenetic analysis indicated that MeMAPKs could be classified into four subfamilies. Gene structure analysis demonstrated that the number of introns in MeMAPK genes ranged from 1 to 10, suggesting large variation among cassava MAPK genes. Conserved motif analysis indicated that all MeMAPKs had typical protein kinase domains. Transcriptomic analysis suggested that MeMAPK genes showed differential expression patterns in distinct tissues and in response to drought stress between wild subspecies and cultivated varieties. Interaction networks and co-expression analyses revealed that crucial pathways controlled by MeMAPK networks may be involved in the differential response to drought stress in different accessions of cassava. Expression of nine selected MAPK genes showed that these genes could comprehensively respond to osmotic, salt, cold, oxidative stressors, and abscisic acid (ABA) signaling. These findings yield new insights into the transcriptional control of MAPK gene expression, provide an improved understanding of abiotic stress responses and signaling transduction in cassava, and lead to potential applications in the genetic improvement of cassava cultivars. PMID:27625666
Immunogenic proteins of Brucella abortus to minimize cross reactions in brucellosis diagnosis.
Ko, Kyung Yuk; Kim, Jong-Wan; Her, Moon; Kang, Sung-Il; Jung, Suk Chan; Cho, Dong Hee; Kim, Ji-Yeon
2012-05-04
To overcome the limitations of serological diagnosis, including false positive reactions caused by other pathogens, specific antigens other than LPS are required for the diagnosis of brucellosis. The present study was conducted to separate and identify immuno-dominant insoluble proteins of Brucella abortus against the antisera of cattle infected with B. abortus and/or Yersinia enterocolitica, or the sera of non-infected cattle. After separating insoluble proteins of B. abortus by two-dimensional electrophoresis (2-DE), their immuno-reactivity was determined by western blotting. A portion of the immunogenic spots against the positive antisera of B. abortus that have the potential for use as specific antigens were identified by MS/MS analysis. Overall, 18 immunogenic insoluble proteins of B. abortus 1119-3 showed immuno-reactivity against only the positive antisera of B. abortus, but failed to have immunogenicity toward both the positive sera of Y. enterocolitica and the negative sera of B. abortus. Identification of these proteins revealed the following: F0F1 ATP synthase subunit β, solute-binding family 5 protein, 28 kDa OMP, Leu/Ile/Val-binding family protein, Histidinol dehydrogenase, Hypothetical protein, Twin-arginine translocation pathway signal sequence domain-containing protein, Dihydroorotase, Serine protease family protein, β-hydroxyacyl-(acyl-carrier-protein) dehydratase FabA, Short-chain dehydrogenase-/reductase carbonic anhydrase, Ornithine carbamoyltransferase, Leucyl aminopeptidase, Cold shock DNA-binding domain-containing protein, Cu/Zn superoxide dismutase, and Methionine aminopeptidase. The 18 immunogenic proteins separated in the present study can be considered candidate antigens to minimize cross reaction in the diagnosis of brucellosis and useful sources for Brucella vaccine development. Copyright © 2011 Elsevier B.V. All rights reserved.
Defining the conserved internal architecture of a protein kinase.
Kornev, Alexandr P; Taylor, Susan S
2010-03-01
Protein kinases constitute a large protein family of important regulators in all eukaryotic cells. All of the protein kinases have a similar bilobal fold, and their key structural features have been well studied. However, the recent discovery of non-contiguous hydrophobic ensembles inside the protein kinase core has shed new light on the internal organization of these molecules. Two hydrophobic "spines" traverse both lobes of the protein kinase molecule, providing a firm but flexible connection between its key elements. The spine model introduces a useful framework for analysis of intramolecular communications, molecular dynamics, and drug design. Published by Elsevier B.V.
Konno, Kotaro; Shimura, Sachiko; Ueno, Chihiro; Arakawa, Toru; Nakamura, Masatoshi
2018-03-01
MLX56 family defense proteins, MLX56 and its close homolog LA-b, are chitin-binding defense proteins found in mulberry latex that strongly inhibit the growth of caterpillars when fed at concentrations as low as 0.01%. MLX56 family proteins contain a unique structure with an extensin domain surrounded by two hevein-like chitin-binding domains, but their defensive modes of action remain unclear. Here, we analyzed the effects of MLX56 family proteins on the peritrophic membrane (PM), a thin and soft membrane consisting of chitin that lines the midgut lumen of insects. We observed an abnormally thick (>1/5 the diameter of the midgut), hard, gel-like membrane consisting of chitin and MLX56 family proteins in the midgut of Eri silkworms, Samia ricini, fed a diet containing MLX56 and LA-b. When polyoxin AL, a chitin-synthesis inhibitor, was added to the diet containing MLX56 family proteins, the toxicity of MLX56 family proteins disappeared and the PM became thinner and fragmented. These results suggest that MLX56 family proteins bind to the chitin framework of the PM through their chitin-binding domains and then, through their extensin domain (a gum arabic-like structure), which functions as a swelling agent, expand the PM into an abnormally thick membrane that inhibits the growth of insects. This study shows that MLX56 family proteins are plant defense lectins with a totally unique mode of action, and reveals the functions of extensin domains and arabinogalactan proteins as swelling (gel-forming) agents of plants. Copyright © 2018 Elsevier Ltd. All rights reserved.
Motohashi, Hozumi; O'Connor, Tania; Katsuoka, Fumiki; Engel, James Douglas; Yamamoto, Masayuki
2002-07-10
Recent progress in the analysis of transcriptional regulation has revealed the presence of an exquisite functional network comprising the Maf and Cap 'n' collar (CNC) families of regulatory proteins, many of which have been isolated. Among Maf factors, large Maf proteins are important in the regulation of embryonic development and cell differentiation, whereas small Maf proteins serve as obligatory heterodimeric partner molecules for members of the CNC family. Both Maf homodimers and CNC-small Maf heterodimers bind to the Maf recognition element (MARE). Since the MARE contains a consensus TRE sequence recognized by AP-1, Jun and Fos family members may act to compete or interfere with the function of CNC-small Maf heterodimers. Overall then, the quantitative balance of transcription factors interacting with the MARE determines its transcriptional activity. Many putative MARE-dependent target genes such as those induced by antioxidants and oxidative stress are under concerted regulation by the CNC family member Nrf2, as clearly proven by mouse germline mutagenesis. Since these genes represent a vital aspect of the cellular defense mechanism against oxidative stress, Nrf2-null mutant mice are highly sensitive to xenobiotic and oxidative insults. Deciphering the molecular basis of the regulatory network composed of Maf and CNC families of transcription factors will undoubtedly lead to a new paradigm for the cooperative function of transcription factors.
A functional genomic analysis of Arabidopsis thaliana PP2C clade D
USDA-ARS's Scientific Manuscript database
In the reference dicot plant Arabidopsis thaliana, the PP2C family of P-protein phosphatases includes the products of 80 genes that have been separated into 10 multi-protein clades plus six singletons. Clade D includes the products of nine genes distributed among 3 chromosomes (PPD1, At3g12620; PPD2...
Tong, Xiangjun; Xia, Zhidan; Zu, Yao; Telfer, Helena; Hu, Jing; Yu, Jingyi; Liu, Huan; Zhang, Quan; Sodmergen; Lin, Shuo; Zhang, Bo
2013-01-01
The notochord is an important organ involved in embryonic patterning and locomotion. In zebrafish, the mature notochord consists of a single stack of fully differentiated, large vacuolated cells called chordocytes, surrounded by a single layer of less differentiated notochordal epithelial cells called chordoblasts. Through genetic analysis of zebrafish lines carrying pseudo-typed retroviral insertions, a mutant exhibiting a defective notochord with a granular appearance was isolated, and the corresponding gene was identified as ngs (notochord granular surface), which was specifically expressed in the notochord. In the mutants, the notochord started to degenerate from 32 hours post-fertilization, and the chordocytes were then gradually replaced by smaller cells derived from chordoblasts. The granular notochord phenotype was alleviated by anesthetizing the mutant embryos with tricaine to prevent muscle contraction and locomotion. Phylogenetic analysis showed that ngs encodes a new type of intermediate filament (IF) family protein, which we named chordostatin based on its function. Under transmission electron microscopy, bundles of 10-nm-thick IF-like filaments were enriched in the chordocytes of wild-type zebrafish embryos, whereas the chordocytes in ngs mutants lacked IF-like structures. Furthermore, chordostatin-enhanced GFP (EGFP) fusion protein assembled into a filamentous network specifically in chordocytes. Taken together, our work demonstrates that ngs encodes a novel type of IF protein and functions to maintain notochord integrity for larval development and locomotion. Our work sheds light on the mechanisms of notochord structural maintenance, as well as the evolution and biological function of IF family proteins. PMID:23132861
Cholinesterase Confabs and Cousins: Approaching Forty Years
Taylor, Palmer; De Jaco, Antonella; Comoletti, Davide; Miller, Meghan; Camp, Shelley
2013-01-01
In the past four decades of cholinesterase (ChE) research, we have seen substantive evolution of the field from one centered around substrate and inhibitor kinetic profiles and compound characterizations to the analysis of ChE structure, first through the gene families and then by x-ray crystallographic determinations of the free enzymes and their complexes and conjugates. Indeed, these endeavors have been facilitated by recombinant DNA technologies, structure determinations and parallel studies in related proteins in the α/β-hydrolase fold family. This approach has not only contributed to a fundamental understanding of structure and function of a large family of hydrolase-like proteins possessing functions other than catalysis, but also has been used to develop new practical strategies for scavenging and antidotal activity in cases of organophosphate insecticide or nerve agent exposure. PMID:23085121
Online interactive analysis of protein structure ensembles with Bio3D-web.
Skjærven, Lars; Jariwala, Shashank; Yao, Xin-Qiu; Grant, Barry J
2016-11-15
Bio3D-web is an online application for analyzing the sequence, structure and conformational heterogeneity of protein families. Major functionality is provided for identifying protein structure sets for analysis, their alignment and refined structure superposition, sequence and structure conservation analysis, mapping and clustering of conformations and the quantitative comparison of their predicted structural dynamics. Bio3D-web is based on the Bio3D and Shiny R packages. All major browsers are supported and full source code is available under a GPL2 license from http://thegrantlab.org/bio3d-web. Contact: bjgrant@umich.edu or lars.skjarven@uib.no. © The Author 2016. Published by Oxford University Press.
Nietzsche, Madlen; Schießl, Ingrid; Börnke, Frederik
2014-01-01
In plants, SNF1-related kinase (SnRK1) responds to the availability of carbohydrates as well as to environmental stresses by down-regulating ATP consuming biosynthetic processes, while stimulating energy-generating catabolic reactions through gene expression and post-transcriptional regulation. The functional SnRK1 complex is a heterotrimer where the catalytic α subunit associates with a regulatory β subunit and an activating γ subunit. Several different metabolites as well as the hormone abscisic acid (ABA) have been shown to modulate SnRK1 activity in a cell- and stimulus-type specific manner. It has been proposed that tissue- or stimulus-specific expression of adapter proteins mediating SnRK1 regulation can at least partly explain the differences observed in SnRK1 signaling. By using yeast two-hybrid and in planta bi-molecular fluorescence complementation assays we were able to demonstrate that proteins containing the domain of unknown function (DUF) 581 could interact with both isoforms of the SnRK1α subunit (AKIN10/11) of Arabidopsis. A structure/function analysis suggests that the DUF581 is a generic SnRK1 interaction module and co-expression with DUF581 proteins in plant cells leads to reallocation of the kinase to specific regions within the nucleus. Yeast two-hybrid analyses suggest that SnRK1 and DUF581 proteins share common interaction partners inside the nucleus. The analysis of available microarray data implies that expression of the 19 members of the DUF581 encoding gene family in Arabidopsis is differentially regulated by hormones and environmental cues, indicating specialized functions of individual family members. We hypothesize that DUF581 proteins could act as mediators conferring tissue- and stimulus-type specific differences in SnRK1 regulation. PMID:24600465
A novel mutation in PAX3 associated with Waardenburg syndrome type I in a Chinese family.
Xiao, Yun; Luo, Jianfen; Zhang, Fengguo; Li, Jianfeng; Han, Yuechen; Zhang, Daogong; Wang, Mingming; Ma, Yalin; Xu, Lei; Bai, Xiaohui; Wang, Haibo
2016-01-01
A novel compound heterozygous mutation in PAX3 was the key genetic cause of WS1 in this family, a finding that is useful for the molecular diagnosis of WS1. The study screened for pathogenic mutations in a four-generation Chinese family with Waardenburg syndrome type I (WS1). WS1 was diagnosed in a 4-year-old boy according to the Waardenburg syndrome Consortium criteria. The detailed family history revealed four affected members in the family. Routine clinical and audiological examinations and ophthalmologic evaluation were performed on four affected and 10 healthy members of this family. Genetic analysis was conducted, including targeted next-generation sequencing of 127 known deafness genes combined with Sanger sequencing, TA cloning and bioinformatic analysis. A novel compound heterozygous mutation c.[169_170insC;172_174delAAG] (p.His57ProfsX55) was identified in PAX3, which co-segregated with WS1 in the Chinese family. This mutation was absent in the unaffected family members and 200 ethnicity-matched controls. Phylogenetic analysis and three-dimensional (3D) modeling of the Pax3 protein further confirmed that the novel compound heterozygous mutation was pathogenic.
Freire, José E. C.; Vasconcelos, Ilka M.; Moreno, Frederico B. M. B.; Batista, Adelina B.; Lobo, Marina D. P.; Pereira, Mirella L.; Lima, João P. M. S.; Almeida, Ricardo V. M.; Sousa, Antônio J. S.; Monteiro-Moreira, Ana C. O.; Oliveira, José T. A.; Grangeiro, Thalles B.
2015-01-01
Mo-CBP3 is a chitin-binding protein from M. oleifera seeds that inhibits the germination and mycelial growth of phytopathogenic fungi. This protein is highly thermostable and resistant to pH changes, and therefore may be useful in the development of new antifungal drugs. However, the relationship of Mo-CBP3 with the known families of carbohydrate-binding domains has not been established. In the present study, full-length cDNAs encoding 4 isoforms of Mo-CBP3 (Mo-CBP3-1, Mo-CBP3-2, Mo-CBP3-3 and Mo-CBP3-4) were cloned from developing seeds. The polypeptides encoded by the Mo-CBP3 cDNAs were predicted to contain 160 (Mo-CBP3-3) and 163 amino acid residues (Mo-CBP3-1, Mo-CBP3-2 and Mo-CBP3-4) with a signal peptide of 20 residues at the N-terminal region. A comparative analysis of the deduced amino acid sequences revealed that Mo-CBP3 is a typical member of the 2S albumin family, as shown by the presence of an eight-cysteine motif, which is a characteristic feature of the prolamin superfamily. Furthermore, mass spectrometry analysis demonstrated that Mo-CBP3 is a mixture of isoforms that correspond to different mRNA products. The identification of Mo-CBP3 as a genuine member of the 2S albumin family reinforces the hypothesis that these seed storage proteins are involved in plant defense. Moreover, the chitin-binding ability of Mo-CBP3 reveals a novel functionality for a typical 2S albumin. PMID:25789746
EB-Family Proteins: Functions and Microtubule Interaction Mechanisms.
Mustyatsa, V V; Boyakhchyan, A V; Ataullakhanov, F I; Gudimchuk, N B
2017-07-01
Microtubules are polymers of the protein tubulin and one of the key components of the cytoskeleton. They are polar filaments whose plus-ends, usually oriented toward the cell periphery, are more dynamic than their minus-ends, which face the center of the cell. In cells, microtubules are organized into a network that is constantly rebuilt and renewed through stochastic switching of its individual filaments from growth to shrinkage and back. Because of these dynamics and their mechanical properties, microtubules take part in various essential processes, from intracellular transport to search and capture of chromosomes during mitosis. Microtubule dynamics are regulated by many proteins that are located on the plus-ends of these filaments. One of the most important and abundant groups of plus-end-interacting proteins is the EB family, whose members autonomously recognize structures of growing microtubule plus-ends, modulate their dynamics, and recruit multiple partner proteins with diverse functions onto the microtubule plus-ends. In this review, we summarize the published data about the properties and functions of EB-proteins, focusing on analysis of their mechanism of interaction with growing microtubule ends.
Jung, Du-Kyo; Lee, Youra; Park, Sung Goo; Park, Byoung Chul; Kim, Ghyung-Hwa; Rhee, Sangkee
2006-01-01
The ureide pathway, which produces ureides from uric acid, is an essential purine catabolic process for storing and transporting the nitrogen fixed in leguminous plants and some bacteria. PucM from Bacillus subtilis was recently characterized and found to catalyze the second reaction of the pathway, hydrolyzing 5-hydroxyisourate (HIU), a product of uricase in the first step. PucM has 121 amino acid residues and shows high sequence similarity to the functionally unrelated protein transthyretin (TTR), a thyroid hormone-binding protein. Therefore, PucM belongs to the TTR-related proteins (TRP) family. The crystal structures of PucM at 2.0 Å and its complexes with the substrate analogs 8-azaxanthine and 5,6-diaminouracil reveal that even with their overall structure similarity, homotetrameric PucM and TTR are completely different, both in their electrostatic potential and in the size of the active sites located at the dimeric interface. Nevertheless, the absolutely conserved residues across the TRP family, including His-14, Arg-49, His-105, and the C-terminal Tyr-118–Arg-119–Gly-120–Ser-121, indeed form the active site of PucM. Based on the results of site-directed mutagenesis of these residues, we propose a possible mechanism for HIU hydrolysis. The PucM structure determined for the TRP family leads to the conclusion that diverse members of the TRP family would function similarly to PucM as HIU hydrolase. PMID:16782815
Srinivasan, E; Rajasekaran, R
2017-07-25
The Cys146Arg substitution mutation in the SOD1 protein is predominantly found in the Japanese population suffering from familial amyotrophic lateral sclerosis (FALS). A complete study of the biophysical aspects of this particular missense mutation through conformational analysis and free energy landscapes could provide insight into the pathogenic mechanism of ALS. In this study, we utilized general molecular dynamics simulations along with computational predictions to assess the structural characterization of the protein as well as the conformational preferences of monomeric wild type and mutant SOD1. Our static analysis, accomplished through multiple programs, predicted the deleterious and destabilizing effect of mutant SOD1. Subsequently, comparative molecular dynamics studies performed on the wild type and mutant SOD1 indicated a loss in the protein conformational stability and flexibility. We observed the mutational consequences not only in local but also in long-range variations in the structural properties of the SOD1 protein. Long-range intramolecular protein interactions decrease upon mutation, resulting in less compact structures in the mutant protein than in the wild type, suggesting that the mutant structures are less stable than the wild type SOD1. We also presented the free energy landscape to study the collective motion of protein conformations through principal component analysis for the wild type and mutant SOD1. Overall, the study assisted in revealing the cause of the structural destabilization and protein misfolding via structural characterization, secondary structure composition and free energy landscapes. Hence, the computational framework in our study provides a valuable direction for the search for a cure for fatal FALS.
Liu, Zhong-Yuan; Wang, Yun; Lü, Guo-Dong; Wang, Xian-Lei; Zhang, Fu-Chun; Ma, Ji
2006-12-01
The partial cDNA sequence coding for the antifreeze proteins of Tenebrio molitor was obtained by RT-PCR. Sequence analysis revealed nine putative cDNAs with a high degree of homology to Tenebrio molitor antifreeze proteins. The recombinant plasmid pGEX-4T-1-tmafp-XJ430 was introduced into E. coli BL21, and expression of a GST fusion protein was induced with IPTG. SDS-PAGE of the fusion protein demonstrated that the antifreeze protein migrated at a size of 38 kDa. Immunization was performed by intramuscular injection of pCDNA3-tmafp-XJ430, and the antiserum was then tested by ELISA. The titer of the antibody was 1:2,000. Western blotting analysis showed that the antiserum was specific for the antifreeze protein. This finding could lead to further investigation of the properties and function of antifreeze proteins.
Alborghetti, Marcos R.; Furlan, Ariane S.; Kobarg, Jörg
2011-01-01
Background: The FEZ (fasciculation and elongation protein zeta) family designation was proposed by Bloom and Horvitz based on genetic analysis of C. elegans unc-76. Similar human sequences were identified in the expressed sequence tag database as FEZ1 and FEZ2. The unc-76 function is necessary for normal axon fasciculation and is required for axon-axon interactions. Indeed, the loss of UNC-76 function results in defects in axonal transport. The human FEZ1 protein has been shown to rescue defects caused by unc-76 mutations in nematodes, indicating that UNC-76 and FEZ1 are evolutionarily conserved in their function. To date, little is known about FEZ2 protein function. Methodology/Principal Findings: Using the yeast two-hybrid system, we demonstrate here conserved evolutionary features among orthologs and non-conserved features between paralogs of the FEZ family of proteins by comparing the interactome profiles of the C-termini of human FEZ1, FEZ2 and UNC-76 from C. elegans. Furthermore, we correlate our data with an analysis of the molecular evolution of the FEZ protein family in the animal kingdom. Conclusions/Significance: We found that FEZ2 interacted with 59 proteins and that of these only 40 interacted with FEZ1. Of the 40 FEZ1-interacting proteins, 36 (90%) also interacted with UNC-76, and none of the 19 FEZ2-specific proteins interacted with FEZ1 or UNC-76. This, together with the duplication of the unc-76 gene in the ancestral line of chordates, suggests that FEZ2 is in the process of acquiring additional functions. The results also provide an explanation for the dramatic difference between C. elegans and D. melanogaster unc-76 mutants on the one hand, which show serious defects in the nervous system, and FEZ1 -/- knockout mice on the other, which show no morphological and no strong behavioural phenotype. Likely, the ubiquitously expressed FEZ2 can completely compensate for the lack of neuronal FEZ1, since it can interact with all FEZ1-interacting proteins and an additional 19 proteins. PMID:21408165
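The overlap counts reported in this abstract can be checked with simple set arithmetic. The sketch below is only an illustration: the prey identifiers are placeholders invented here (the abstract does not name the interacting proteins), and only the set sizes and the nesting of the sets follow the text.

```python
# Placeholder prey identifiers (hypothetical); only the counts come from the abstract.
fez2_preys = {f"prey{i:02d}" for i in range(1, 60)}   # 59 FEZ2 interactors
fez1_preys = {f"prey{i:02d}" for i in range(1, 41)}   # 40 of them also bind FEZ1
unc76_preys = {f"prey{i:02d}" for i in range(1, 37)}  # 36 of those also bind UNC-76

shared_fez1_fez2 = fez1_preys & fez2_preys            # paralog overlap
fez2_specific = fez2_preys - fez1_preys               # bind FEZ2 only
conserved_with_unc76 = shared_fez1_fez2 & unc76_preys # ortholog overlap

percent_conserved = 100 * len(conserved_with_unc76) / len(shared_fez1_fez2)

print(len(shared_fez1_fez2), len(fez2_specific), percent_conserved)
```

With these sizes, the FEZ2-specific set has 19 members and the conserved fraction is 36 of 40, matching the 90% quoted above.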
Gao, Feng; Song, Weibo; Katz, Laura A
2014-08-01
In most lineages, diversity among gene family members results from gene duplication followed by sequence divergence. Because of the genome rearrangements during the development of somatic nuclei, gene family evolution in ciliates involves more complex processes. Previous work on the ciliate Chilodonella uncinata revealed that macronuclear β-tubulin gene family members are generated by alternative processing, in which germline regions are alternatively used in multiple macronuclear chromosomes. To further study genome evolution in this ciliate, we analyzed its transcriptome and found that (1) alternative processing is extensive among gene families; and (2) such gene families are likely to be C. uncinata specific. We characterized additional macronuclear and micronuclear copies of one candidate alternatively processed gene family, a protein kinase domain-containing protein (PKc), from two C. uncinata strains. Analysis of the PKc sequences reveals that (1) multiple PKc gene family members in the macronucleus share some identical regions flanked by divergent regions; and (2) the shared identical regions are processed from a single micronuclear chromosome. We discuss analogous processes in lineages across the eukaryotic tree of life to provide further insights into the impact of genome structure on gene family evolution in eukaryotes. © 2014 The Author(s). Evolution © 2014 The Society for the Study of Evolution.
Magwanga, Richard Odongo; Lu, Pu; Kirungu, Joy Nyangasi; Lu, Hejun; Wang, Xingxing; Cai, Xiaoyan; Zhou, Zhongli; Zhang, Zhenmei; Salih, Haron; Wang, Kunbo; Liu, Fang
2018-01-15
Late embryogenesis abundant (LEA) proteins are large groups of hydrophilic proteins with a major role in tolerance to drought and other abiotic stresses in plants. In-depth study and characterization of LEA protein families have been carried out in other plants, but not in upland cotton. The main aim of this research work was to characterize the LEA protein families and to carry out gene expression analysis to determine their potential role in drought stress tolerance in upland cotton. Increased cotton production in the face of declining precipitation and availability of fresh water for agricultural use is the focus for breeders, cotton being the backbone of textile industries and a cash crop for many countries globally. In this work, a total of 242, 136 and 142 LEA genes were identified in G. hirsutum, G. arboreum and G. raimondii, respectively. The identified genes were classified into eight groups based on their conserved domain and phylogenetic tree analysis. LEA 2 genes were the most abundant, which could be attributed to their hydrophobic character. Upland cotton LEA genes have fewer introns and are distributed across all chromosomes. The majority of the duplicated LEA genes were segmental duplicates. Syntenic analysis showed that a large proportion of LEA genes are conserved. Segmental gene duplication played a key role in the expansion of LEA genes. Sixty-three miRNAs, including miR164 and ghr-miR394, were found to target 89 genes. Gene ontology analysis revealed that LEA genes are involved in desiccation and defense responses. The promoters of almost all the LEA genes contained ABRE, MBS, W-Box and TAC-elements, which are functionally known to be involved in drought stress and other stress responses. The majority of the LEA genes were involved in secretory pathways. Expression profile analysis indicated that most of the LEA genes were highly expressed in the drought-tolerant Gossypium tomentosum as opposed to the drought-susceptible G. hirsutum. The tolerant genotypes have a greater ability to modulate genes under drought stress than the more susceptible upland cotton cultivars. The findings provide comprehensive information on LEA genes in upland cotton, G. hirsutum, and their possible function in plants under drought stress.
Hübner, Sebastian; Declerck, Nathalie; Diethmaier, Christine; Le Coq, Dominique; Aymerich, Stephane; Stülke, Jörg
2011-01-01
Each family of signal transduction systems requires specificity determinants that link individual signals to the correct regulatory output. In Bacillus subtilis, a family of four anti-terminator proteins controls the expression of genes for the utilisation of alternative sugars. These regulatory systems contain the anti-terminator proteins and an RNA structure, the RNA anti-terminator (RAT), that is bound by the anti-terminator proteins. We have studied three of these proteins (SacT, SacY, and LicT) to understand how they can transmit a specific signal in spite of their strong structural homology. A screen for random mutations that render SacT capable of binding an RNA structure recognized only by LicT revealed a substitution (P26S) at one of the few non-conserved residues that are in contact with the RNA. We have randomly modified this position in SacT together with another non-conserved RNA-contacting residue (Q31). Surprisingly, the mutant proteins could bind all RAT structures that are present in B. subtilis. In a complementary approach, reciprocal amino acid exchanges have been introduced in LicT and SacY at non-conserved positions of the RNA-binding site. This analysis revealed the key role of an arginine side-chain for both the high affinity and specificity of LicT for its cognate RAT. Introduction of this Arg at the equivalent position of SacY (A26) increased the RNA binding in vitro but also resulted in a relaxed specificity. Altogether, our results suggest that this family of anti-termination proteins has evolved to reach a compromise between RNA binding efficacy and specific interaction with individual target sequences. PMID:21278164
Ludwig, Yvonne; Zhang, Yanxiang; Hochholdinger, Frank
2013-01-01
The plant hormone auxin plays a key role in the coordination of many aspects of growth and development. AUXIN/INDOLE-3-ACETIC ACID (Aux/IAA) genes encode unstable primary auxin-responsive regulators of plant development that display a protein structure with four characteristic domains. In the present study, a comprehensive analysis of the 34 members of the maize Aux/IAA gene family was performed. Phylogenetic reconstructions revealed two classes of Aux/IAA proteins that can be distinguished by alterations in their domain III. Seven pairs of paralogous maize Aux/IAA proteins were discovered. Comprehensive root-type and tissue-specific expression profiling revealed unique expression patterns of the diverse members of the gene family. Remarkably, five of seven pairs of paralogous genes displayed highly correlated expression patterns in roots. All but one (ZmIAA23) of the tested maize Aux/IAA genes were auxin inducible, displaying two types of auxin induction within three hours of treatment. Moreover, 51 of 55 (93%) differential Aux/IAA expression patterns between different root-types followed the expression tendency: crown roots > seminal roots > primary roots > lateral roots. This pattern might imply root-type-specific regulation of Aux/IAA transcript abundance. In summary, the detailed analysis of the maize Aux/IAA gene family provides novel insights into the evolution and developmental regulation, and thus the function, of these genes in different root-types and tissues. PMID:24223858
Matas-Arroyo, Antonio J.; Caballero, José Luis; Muñoz-Blanco, Juan
2018-01-01
NAC proteins are a family of transcription factors which have a variety of important regulatory roles in plants. They present a very well conserved group of NAC subdomains in the N-terminal region and a highly variable domain at the C-terminus. Currently, knowledge concerning the NAC family in the strawberry plant remains very limited. In this work, we analyzed the NAC family of Fragaria vesca, and a total of 112 NAC proteins were identified after we curated the annotations from the version 4.0.a1 genome. They were placed into the linkage groups (pseudo-chromosomes), and their physicochemical and genetic features were described. A microarray transcriptomic analysis showed that six of them were expressed during the development and ripening of the Fragaria x ananassa fruit. Their expression patterns were studied in fruit (receptacle and achenes) at different stages of development and in vegetative tissues. The expression level under different hormonal treatments (auxins, ABA) and drought stress was also investigated. In addition, they were clustered with other NAC transcription factors with known functions related to growth and development, senescence, fruit ripening, stress response, and secondary cell wall and vascular development. Our results indicate that these six strawberry NAC proteins could play different important regulatory roles in the process of development and ripening of the fruit, providing the basis for further functional studies and the selection of NAC candidates suitable for biotechnological applications. PMID:29723301
Knutson, Stacy T.; Westwood, Brian M.; Leuthaeuser, Janelle B.; Turner, Brandon E.; Nguyendac, Don; Shea, Gabrielle; Kumar, Kiran; Hayden, Julia D.; Harper, Angela F.; Brown, Shoshana D.; Morris, John H.; Ferrin, Thomas E.; Babbitt, Patricia C.
2017-01-01
Protein function identification remains a significant problem. Solving this problem at the molecular functional level would allow identification of mechanistic determinants, the amino acids that distinguish details between functional families within a superfamily. Active site profiling was developed to identify mechanistic determinants. DASP and DASP2 were developed as tools to search sequence databases using active site profiling. Here, TuLIP (Two‐Level Iterative clustering Process) is introduced as an iterative, divisive clustering process that utilizes active site profiling to separate structurally characterized superfamily members into functionally relevant clusters. Underlying TuLIP is the observation that functionally relevant families (curated by the Structure‐Function Linkage Database, SFLD) self‐identify in DASP2 searches; clusters containing multiple functional families do not. Each TuLIP iteration produces candidate clusters, each evaluated to determine if it self‐identifies using DASP2. If so, it is deemed a functionally relevant group. Divisive clustering continues until each structure is either a functionally relevant group member or a singlet. TuLIP is validated on enolase and glutathione transferase structures, superfamilies well‐curated by SFLD. Correlation is strong; small numbers of structures prevent statistically significant analysis. TuLIP‐identified enolase clusters are used in DASP2 GenBank searches to identify sequences sharing functional site features. Analysis shows a true positive rate of 96%, false negative rate of 4%, and maximum false positive rate of 4%. F‐measure and performance analysis on the enolase search results and comparison to GEMMA and SCI‐PHY demonstrate that TuLIP avoids the over‐division problem of these methods. Mechanistic determinants for enolase families are evaluated and shown to correlate well with literature results. PMID:28054422
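The divisive loop described in this abstract (split, test each candidate cluster for self-identification, stop at functionally relevant groups or singlets) can be sketched generically. This is not the TuLIP implementation: the function names below are hypothetical, and the `split` and `self_identifies` callables stand in for active site profile clustering and the DASP2 search, neither of which is reproduced here.

```python
from typing import Callable, List, Sequence, Tuple

Cluster = Tuple[str, ...]


def divisive_clustering(
    structures: Sequence[str],
    split: Callable[[Cluster], List[Cluster]],
    self_identifies: Callable[[Cluster], bool],
) -> Tuple[List[Cluster], List[str]]:
    """Divide clusters until each one self-identifies or is a singlet."""
    groups: List[Cluster] = []
    singlets: List[str] = []
    pending: List[Cluster] = [tuple(structures)]
    while pending:
        cluster = pending.pop()
        if not cluster:
            continue
        if len(cluster) == 1:
            singlets.append(cluster[0])
        elif self_identifies(cluster):
            groups.append(cluster)  # accepted as a functionally relevant group
        else:
            children = [c for c in split(cluster) if c]
            if len(children) < 2:
                # Splitter could not divide further; fall back to singlets.
                singlets.extend(cluster)
            else:
                pending.extend(children)
    return groups, singlets


# Toy stand-ins (assumptions): the first letter plays the role of the family,
# and the splitter simply halves the cluster in order.
def toy_split(cluster: Cluster) -> List[Cluster]:
    mid = (len(cluster) + 1) // 2
    return [cluster[:mid], cluster[mid:]]


def toy_self_identifies(cluster: Cluster) -> bool:
    return len({name[0] for name in cluster}) == 1


groups, singlets = divisive_clustering(
    ["A1", "A2", "A3", "B1", "B2", "C1"], toy_split, toy_self_identifies
)
print(sorted(groups), singlets)
```

Passing the splitter and the acceptance test as callables keeps the stopping rule separate from how clusters are actually produced, which is the part the abstract leaves to DASP2.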
Analysis of protein interactions within the cytokinin-signaling pathway of Arabidopsis thaliana.
Dortay, Hakan; Mehnert, Nijuscha; Bürkle, Lukas; Schmülling, Thomas; Heyl, Alexander
2006-10-01
The signal of the plant hormone cytokinin is perceived by membrane-located sensor histidine kinases and transduced by other members of the plant two-component system. In Arabidopsis thaliana, 28 two-component system proteins (phosphotransmitters and response regulators) act downstream of three receptors, transmitting the signal from the membrane to the nucleus and modulating the cellular response. Although the principal signaling mechanism has been elucidated, redundancy in the system has made it difficult to understand which of the many components interact to control the downstream biological processes. Here, we present a large-scale interaction study comprising most members of the Arabidopsis cytokinin signaling pathway. Using the yeast two-hybrid system, we detected 42 new interactions, of which more than 90% were confirmed by in vitro coaffinity purification. There are distinct patterns of interaction between protein families, but only a few interactions between proteins of the same family. An interaction map of this signaling pathway shows the Arabidopsis histidine phosphotransfer proteins as hubs, which interact with members from all other protein families, mostly in a redundant fashion. Domain-mapping experiments revealed the interaction domains of the proteins of this pathway. Analyses of Arabidopsis histidine phosphotransfer protein 5 mutant proteins showed that the presence of the canonical phospho-accepting histidine residue is not required for the interactions. Interaction of A-type response regulators with Arabidopsis histidine phosphotransfer proteins but not with B-type response regulators suggests that their known activity in feedback regulation may be realized by interfering at the level of Arabidopsis histidine phosphotransfer protein-mediated signaling. This study contributes to our understanding of the protein interactions of the cytokinin-signaling system and provides a framework for further functional studies in planta.
Yamazaki, Yasuo; Hyodo, Fumiko; Morita, Takashi
2003-04-01
Cysteine-rich secretory proteins (CRISPs) are found in epididymis and granules of mammals, and they are thought to function in sperm maturation and in the immune system. Recently, we isolated and obtained clones for novel snake venom proteins that are classified as CRISP family proteins. To elucidate the distribution of snake venom CRISP family proteins, we evaluated a wide range of venoms for immuno-cross-reactivity. Then we isolated, characterized, and cloned genes for three novel CRISP family proteins (piscivorin, ophanin, and catrin) from the venom of eastern cottonmouth (Agkistrodon piscivorus piscivorus), king cobra (Ophiophagus hannah), and western diamondback rattlesnake (Crotalus atrox). Our results show the wide distribution of snake venom CRISP family proteins among Viperidae and Elapidae from different continents, indicating that CRISP family proteins compose a new group of snake venom proteins.
Phadtare, Sangita; Severinov, Konstantin
2009-11-01
In Escherichia coli, temperature downshift elicits the cold shock response, which is characterized by induction of cold shock proteins. CspA, the major cold shock protein of E. coli, helps cells to acclimatize to low temperature by melting the secondary structures in nucleic acids and acting as a transcription antiterminator. CspA and its homologues contain the cold shock domain and belong to the oligomer binding protein family, which also includes S1 domain proteins such as IF1. Structural similarity between IF1 and CspA homologues suggested a functional overlap between these proteins. Indeed, IF1 can melt secondary structures in RNA and acts as a transcription antiterminator in vivo and in vitro. Here, we show that in spite of having these critical activities, IF1 does not complement the cold sensitivity of a csp quadruple deletion strain. DNA microarray analysis shows that overproduction of IF1 and Csp leads to changes in expression of different sets of genes. Importantly, several genes which were previously shown to require Csp proteins for their expression at low temperature did not respond to IF1. Moreover, we show in vitro that a transcription terminator responsive to Csp does not respond to IF1. Our results suggest that Csp proteins and IF1 have different sets of target genes, as they may be suppressing the function of different types of transcription termination elements in specific genes.
Trujillo-Ocampo, Abel; Cázares-Raga, Febe Elena; Celestino-Montes, Antonio; Cortés-Martínez, Leticia; Rodríguez, Mario H; Hernández-Hernández, Fidel de la Cruz
2016-11-01
The 14-3-3 proteins are evolutionarily conserved acidic proteins that form a family with several isoforms in many cell types of plants and animals. In invertebrates, including dipteran and lepidopteran insects, only two isoforms have been reported. 14-3-3 proteins are scaffold molecules that form homo- or heterodimeric complexes, acting as molecular adaptors mediating phosphorylation-dependent interactions with signaling molecules involved in immunity, cell differentiation, cell cycle, proliferation, apoptosis, and cancer. Here, we describe the presence of two isoforms of 14-3-3 in the mosquito Aedes aegypti, the main vector of dengue, yellow fever, chikungunya, and Zika viruses. Both isoforms have the conserved characteristics of the family: two protein signatures (PS1 and PS2), an annexin domain, three serine residues that are targets for phosphorylation (positions 58, 184, and 233) and necessary for their function, and nine alpha helix-forming segments. By sequence alignment and phylogenetic analysis, we found that the molecules correspond to the ε and ζ isoforms (Aeae14-3-3ε and Aeae14-3-3ζ). The messengers and protein products were present in all stages of the mosquito life cycle and all the tissues analyzed, with a small predominance of Aeae14-3-3ζ except in the midgut and ovaries of adult females. The 14-3-3 proteins in female midgut epithelial cells were located in the cytoplasm. Our results may provide insights to further investigate the functions of these proteins in mosquitoes. © 2016 Wiley Periodicals, Inc.
Zhou, Yonghong; Wang, Qianqian; Chang, Yinlong; Wang, Beilei; Zheng, Jiemin; Zhang, Liming
2014-01-01
Thioredoxins (Trx proteins) are a family of small, highly conserved and ubiquitous proteins that play significant roles in resistance to oxidative damage. In this study, a homologue of Trx was identified from the cDNA library of the tentacle of the jellyfish Cyanea capillata and named CcTrx1. The full-length cDNA of CcTrx1 was 479 bp with a 312 bp open reading frame encoding 104 amino acids. Bioinformatics analysis revealed that the putative CcTrx1 protein harbored the evolutionarily conserved Trx active site 31CGPC34 and shared a high similarity with Trx1 proteins from the other organisms analyzed, indicating that CcTrx1 is a new member of the Trx1 sub-family. CcTrx1 mRNA was found to be constitutively expressed in tentacle, umbrella, oral arm and gonad, indicating a general role of CcTrx1 protein in various physiological processes. The recombinant CcTrx1 (rCcTrx1) protein was expressed in Escherichia coli BL21 (DE3) and then purified by affinity chromatography. The rCcTrx1 protein was demonstrated to possess the expected redox activity in enzymatic analysis and to protect supercoiled DNA against oxidative damage. These results indicate that CcTrx1 may function as an important antioxidant in C. capillata. To our knowledge, this is the first Trx protein characterized from a jellyfish species. PMID:24824597
Soybean kinome: functional classification and gene expression patterns
Liu, Jinyi; Chen, Nana; Grant, Joshua N.; Cheng, Zong-Ming (Max); Stewart, C. Neal; Hewezi, Tarek
2015-01-01
The protein kinase (PK) gene family is one of the largest and most highly conserved gene families in plants and plays a role in nearly all biological functions. While a large number of genes have been predicted to encode PKs in soybean, a comprehensive functional classification and global analysis of expression patterns of this large gene family is lacking. In this study, we identified the entire soybean PK repertoire or kinome, which comprised 2166 putative PK genes, representing 4.67% of all soybean protein-coding genes. The soybean kinome was classified into 19 groups, 81 families, and 122 subfamilies. The receptor-like kinase (RLK) group was remarkably large, containing 1418 genes. Collinearity analysis indicated that whole-genome segmental duplication events may have played a key role in the expansion of the soybean kinome, whereas tandem duplications might have contributed to the expansion of specific subfamilies. Gene structure, subcellular localization prediction, and gene expression patterns indicated extensive functional divergence of PK subfamilies. Global gene expression analysis of soybean PK subfamilies revealed tissue- and stress-specific expression patterns, implying regulatory functions over a wide range of developmental and physiological processes. In addition, tissue and stress co-expression network analysis uncovered specific subfamilies with narrow or wide interconnected relationships, indicative of their association with particular or broad signalling pathways, respectively. Taken together, our analyses provide a foundation for further functional studies to reveal the biological and molecular functions of PKs in soybean. PMID:25614662
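The proportions quoted in this abstract can be related with a short calculation. The sketch below uses only the numbers given above; the implied total of protein-coding genes is a back-calculation from the rounded 4.67% figure, not a value stated by the authors, and the variable names are ours.

```python
# Figures taken from the abstract.
kinome_genes = 2166        # putative PK genes
kinome_percent = 4.67      # share of all soybean protein-coding genes (rounded)
rlk_genes = 1418           # genes in the receptor-like kinase (RLK) group

# Back-calculated estimate (assumption: 4.67% is exact to two decimals).
implied_total_genes = kinome_genes / (kinome_percent / 100)

# Share of the kinome contributed by the RLK group.
rlk_share_percent = 100 * rlk_genes / kinome_genes

print(round(implied_total_genes), round(rlk_share_percent, 1))
```

Under that assumption the kinome figure implies roughly 46,400 protein-coding genes, and the RLK group accounts for about two thirds of the kinome.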
Wang, Rong-Kai; Zhang, Rui-Fen; Hao, Yu-Jin
2013-01-01
The MYB proteins comprise one of the largest families of transcription factors (TFs) in plants. Although several MYB genes have been characterized to play roles in secondary metabolism, the MYB family has not yet been identified in apple. In this study, 229 apple MYB genes were identified through a genome-wide analysis and divided into 45 subgroups. A computational analysis was conducted using the apple genomic database to yield a complete overview of the MYB family, including the intron-exon organizations, the sequence features of the MYB DNA-binding domains, the carboxy-terminal motifs, and the chromosomal locations. Subsequently, the expression of 18 MYB genes (12 chosen from stress-related subgroups and 6 from other subgroups) in response to various abiotic stresses was examined. It was found that several of these MYB genes, particularly MdoMYB121, were induced by multiple stresses. MdoMYB121 was then further functionally characterized. Its predicted protein was found to be localized in the nucleus. A transgenic analysis indicated that overexpression of the MdoMYB121 gene remarkably enhanced the tolerance to high salinity, drought, and cold stresses in transgenic tomato and apple plants. Our results indicate that the MYB genes are highly conserved in plant species and that MdoMYB121 can be used as a target gene in genetic engineering approaches to improve the tolerance of plants to multiple abiotic stresses. PMID:23950843
Protein Structure Determination using Metagenome sequence data
Ovchinnikov, Sergey; Park, Hahnbeom; Varghese, Neha; Huang, Po-Ssu; Pavlopoulos, Georgios A.; Kim, David E.; Kamisetty, Hetunandan; Kyrpides, Nikos C.; Baker, David
2017-01-01
Despite decades of work by structural biologists, there are still ~5200 protein families with unknown structure outside the range of comparative modeling. We show that Rosetta structure prediction guided by residue-residue contacts inferred from evolutionary information can accurately model proteins that belong to large families, and that metagenome sequence data more than triples the number of protein families with sufficient sequences for accurate modeling. We then integrate metagenome data, contact based structure matching and Rosetta structure calculations to generate models for 614 protein families with currently unknown structures; 206 are membrane proteins and 137 have folds not represented in the PDB. This approach provides the representative models for large protein families originally envisioned as the goal of the protein structure initiative at a fraction of the cost. PMID:28104891
Comparative bioinformatics analyses and profiling of lysosome-related organelle proteomes
NASA Astrophysics Data System (ADS)
Hu, Zhang-Zhi; Valencia, Julio C.; Huang, Hongzhan; Chi, An; Shabanowitz, Jeffrey; Hearing, Vincent J.; Appella, Ettore; Wu, Cathy
2007-01-01
Complete and accurate profiling of cellular organelle proteomes, while challenging, is important for the understanding of detailed cellular processes at the organelle level. Mass spectrometry technologies coupled with bioinformatics analysis provide an effective approach for protein identification and functional interpretation of organelle proteomes. In this study, we have compiled human organelle reference datasets from large-scale proteomic studies and protein databases for seven lysosome-related organelles (LROs), as well as the endoplasmic reticulum and mitochondria, for comparative organelle proteome analysis. Heterogeneous sources of human organelle proteins and rodent homologs are mapped to human UniProtKB protein entries based on ID and/or peptide mappings, followed by functional annotation and categorization using the iProXpress proteomic expression analysis system. Cataloging organelle proteomes allows close examination of both shared and unique proteins among various LROs and reveals their functional relevance. The proteomic comparisons show that LROs are a closely related family of organelles. The shared proteins indicate the dynamic and hybrid nature of LROs, while the unique transmembrane proteins may represent additional candidate marker proteins for LROs. This comparative analysis, therefore, provides a basis for hypothesis formulation and experimental validation of organelle proteins and their functional roles.
Using genome-wide measurements for computational prediction of SH2–peptide interactions
Wunderlich, Zeba; Mirny, Leonid A.
2009-01-01
Peptide-recognition modules (PRMs) are used throughout biology to mediate protein–protein interactions, and many PRMs are members of large protein domain families. Recent genome-wide measurements describe networks of peptide–PRM interactions. In these networks, very similar PRMs recognize distinct sets of peptides, raising the question of how peptide-recognition specificity is achieved using similar protein domains. The analysis of individual protein complex structures often gives answers that are not easily applicable to other members of the same PRM family. Bioinformatics-based approaches, on the other hand, may be difficult to interpret physically. Here we integrate structural information with a large, quantitative data set of SH2 domain–peptide interactions to study the physical origin of domain–peptide specificity. We develop an energy model, inspired by protein folding, based on interactions between the amino-acid positions in the domain and peptide. We use this model to successfully predict which SH2 domains and peptides interact and uncover the positions in each that are important for specificity. The energy model is general enough that it can be applied to other members of the SH2 family or to new peptides, and the cross-validation results suggest that these energy calculations will be useful for predicting binding interactions. It can also be adapted to study other PRM families, predict optimal peptides for a given SH2 domain, or study other biological interactions, e.g. protein–DNA interactions. PMID:19502496
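One common way to build an energy model from interactions between domain and peptide positions is an additive pairwise contact energy, with binding predicted when the total falls below a threshold. The abstract does not give the functional form, so the sketch below is an assumed illustration only: the contact list, the energy table, the threshold, and the toy sequences are all invented for the example.

```python
from typing import Dict, Iterable, Tuple

Contact = Tuple[int, int]                      # (domain index, peptide index)
PairEnergies = Dict[Tuple[str, str], float]    # (domain residue, peptide residue)


def binding_energy(
    domain: str,
    peptide: str,
    contacts: Iterable[Contact],
    pair_energy: PairEnergies,
    default: float = 0.0,
) -> float:
    """Sum pair energies over the position pairs assumed to be in contact."""
    return sum(
        pair_energy.get((domain[i], peptide[j]), default) for i, j in contacts
    )


def predicts_binding(energy: float, threshold: float = -1.0) -> bool:
    """Lower (more negative) energy is taken to mean binding (assumption)."""
    return energy < threshold


# Toy inputs (hypothetical): three domain positions contacting a four-residue peptide.
toy_contacts = [(0, 0), (1, 1), (2, 3)]
toy_energies = {("R", "Y"): -1.5, ("K", "E"): -0.8, ("S", "I"): -0.2}

e_match = binding_energy("RKS", "YEEI", toy_contacts, toy_energies)
e_mismatch = binding_energy("RKS", "AGGA", toy_contacts, toy_energies)
print(e_match, predicts_binding(e_match), predicts_binding(e_mismatch))
```

In a model of this kind the pair energies would be fitted to the measured interaction data, and the positions with the largest fitted contributions are the candidates for specificity determinants.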
Wang, Yukun; Qiao, Linyi; Bai, Jianfang; Wang, Peng; Duan, Wenjing; Yuan, Shaohua; Yuan, Guoliang; Zhang, Fengting; Zhang, Liping; Zhao, Changping
2017-02-13
The JASMONATE-ZIM DOMAIN (JAZ) repressor family proteins are jasmonate co-receptors and transcriptional repressors in the jasmonic acid (JA) signaling pathway, and they play important roles in regulating the growth and development of plants. Recently, a growing number of studies of the JAZ gene family have been reported in many plants. Although the genome sequencing of common wheat (Triticum aestivum L.) and its relatives is complete, our knowledge about this gene family in wheat remains limited. Fourteen JAZ genes were identified in the wheat genome. Structural analysis revealed that the TaJAZ proteins in wheat were as conserved as those in other plants, but had their own structural characteristics. By phylogenetic analysis, all JAZ proteins from wheat and other plants were clustered into 11 sub-groups (G1-G11), and TaJAZ proteins shared a high degree of similarity with some JAZ proteins from Aegilops tauschii, Brachypodium distachyon and Oryza sativa. The Ka/Ks ratios of TaJAZ genes ranged from 0.0016 to 0.6973, suggesting that the TaJAZ family had undergone purifying selection in wheat. Gene expression patterns obtained by quantitative real-time PCR (qRT-PCR) revealed differential temporal and spatial regulation of TaJAZ genes under multiple abiotic stress treatments (high salinity, drought and cold) and phytohormone treatment. Among these, TaJAZ7, 8 and 12 were specifically expressed in the anther tissues of the thermosensitive genic male sterile (TGMS) wheat line BS366 and the normal control wheat line Jing411. Compared with the gene expression patterns in the normal wheat line Jing411, TaJAZ7, 8 and 12 had different expression patterns in abnormally dehiscent anthers of BS366 at the heading stage 6, suggesting that specific up- or down-regulation of these genes might be associated with the abnormal anther dehiscence in the TGMS wheat line. This study analyzed the size and composition of the JAZ gene family in wheat, and investigated the stress-responsive and differential tissue-specific expression profiles of each TaJAZ gene in the TGMS wheat line BS366. In addition, we isolated 3 TaJAZ genes that are likely to be involved in the regulation of abnormal anther dehiscence in the TGMS wheat line. In conclusion, the results of this study contribute novel and detailed information about the JAZ gene family in wheat, and also provide 3 potential candidate genes for improving the TGMS wheat line.
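The inference from Ka/Ks ratios to purifying selection follows the usual convention: a ratio below 1 suggests purifying selection, a ratio near 1 neutral evolution, and a ratio above 1 positive selection. The sketch below applies that convention to the range reported above; the function name and the tolerance band around 1 are assumptions for illustration, not values from the study.

```python
def classify_selection(ka_ks: float, tolerance: float = 0.05) -> str:
    """Classify a Ka/Ks ratio by the conventional threshold of 1.

    The tolerance band around 1 is an arbitrary illustrative choice.
    """
    if ka_ks < 0:
        raise ValueError("Ka/Ks cannot be negative")
    if ka_ks < 1 - tolerance:
        return "purifying"
    if ka_ks > 1 + tolerance:
        return "positive"
    return "neutral"


# Extremes of the TaJAZ range reported in the abstract.
reported_range = (0.0016, 0.6973)
labels = [classify_selection(r) for r in reported_range]
print(labels)
```

Both ends of the reported range fall well below 1, which is why the whole family is described as being under purifying selection.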
Genomewide analysis of TCP transcription factor gene family in Malus domestica.
Xu, Ruirui; Sun, Peng; Jia, Fengjuan; Lu, Longtao; Li, Yuanyuan; Zhang, Shizhong; Huang, Jinguang
2014-12-01
Teosinte branched 1/cycloidea/proliferating cell factor 1 (TCP) proteins are a large family of transcriptional regulators in angiosperms. They are involved in various biological processes, including development and plant metabolism pathways. In this study, a total of 52 TCP genes were identified in the apple (Malus domestica) genome. Bioinformatic methods were employed to predict and analyse the gene classification, gene structure, chromosome location, sequence alignment and conserved domains of the MdTCP genes and their proteins. Expression analysis from microarray data showed that the expression levels of 28 and 51 MdTCP genes changed during the ripening and rootstock-scion interaction processes, respectively. The expression patterns of 12 selected MdTCP genes were analysed in different tissues and in response to abiotic stresses. All of the selected genes were detected in at least one of the tissues tested, and most of them were modulated by adverse treatments, indicating that the MdTCPs were involved in various developmental and physiological processes. To the best of our knowledge, this is the first genomewide analysis of the apple TCP gene family. These results provide valuable information for studies on the functions of the TCP transcription factor genes in apple.
Coral-Vazquez, Ramon M; Rosas-Vargas, Haydee; Meza-Espinosa, Pedro; Mendoza, Irma; Huicochea, Juan C; Ramon, Guillermo; Salamanca, Fabio
2003-01-01
The congenital muscular dystrophies (CMDs) are a heterogeneous group of autosomal recessive disorders. Approximately one half of cases diagnosed with classic CMD show primary deficiency of the laminin alpha2 chain of merosin. Complete absence of this protein is usually associated with a severe phenotype characterized by drastic muscle weakness and characteristic changes in white matter in cerebral magnetic resonance imaging (MRI). Here we report an 8-month-old Mexican female infant, from a consanguineous family, with classical CMD. Serum creatine kinase was elevated, muscle biopsy showed dystrophic changes, and there were abnormalities in brain MRI. Immunofluorescence analysis demonstrated the complete absence of laminin alpha2. In contrast, expression of alpha-, beta-, gamma-, and delta-sarcoglycans and dystrophin, all components of the dystrophin-glycoprotein complex, appeared normal. A homozygous C→T substitution at position 7781 that generated a stop codon in the G domain of the protein was identified by mutation analysis of the laminin alpha2 gene (LAMA2). Sequence analysis on available DNA samples of the family showed that parents and other relatives were carriers of the mutation.
An updated version of NPIDB includes new classifications of DNA–protein complexes and their families
Zanegina, Olga; Kirsanov, Dmitriy; Baulin, Eugene; Karyagina, Anna; Alexeevski, Andrei; Spirin, Sergey
2016-01-01
The recent upgrade of nucleic acid–protein interaction database (NPIDB, http://npidb.belozersky.msu.ru/) includes a newly elaborated classification of complexes of protein domains with double-stranded DNA and a classification of families of related complexes. Our classifications are based on contacting structural elements of both DNA: the major groove, the minor groove and the backbone; and protein: helices, beta-strands and unstructured segments. We took into account both hydrogen bonds and hydrophobic interaction. The analyzed material contains 1942 structures of protein domains from 748 PDB entries. We have identified 97 interaction modes of individual protein domain–DNA complexes and 17 DNA–protein interaction classes of protein domain families. We analyzed the sources of diversity of DNA–protein interaction modes in different complexes of one protein domain family. The observed interaction mode is sometimes influenced by artifacts of crystallization or diversity in secondary structure assignment. The interaction classes of domain families are more stable and thus possess more biological sense than a classification of single complexes. Integration of the classification into NPIDB allows the user to browse the database according to the interacting structural elements of DNA and protein molecules. For each family, we present average DNA shape parameters in contact zones with domains of the family. PMID:26656949
Shackleford, Gregory M; Ganguly, Amit; MacArthur, Craig A
2001-01-01
Background Studies suggest that the related proteins nucleoplasmin and nucleophosmin (also called B23, NO38 or numatrin) are nuclear chaperones that mediate the assembly of nucleosomes and ribosomes, respectively, and that these activities are accomplished through the binding of basic proteins via their acidic domains. Recently discovered and less well characterized members of this family of acidic phosphoproteins include mouse nucleophosmin/nucleoplasmin 3 (Npm3) and Xenopus NO29. Here we report the cloning and initial characterization of the human ortholog of Npm3. Results Human genomic and cDNA clones of NPM3 were isolated and sequenced. NPM3 lies 5.5 kb upstream of FGF8 and thus maps to chromosome 10q24-26. In addition to amino acid similarities, NPM3 shares many physical characteristics with the nucleophosmin/nucleoplasmin family, including an acidic domain, multiple potential phosphorylation sites and a putative nuclear localization signal. Comparative analyses of 14 members of this family from various metazoans suggest that Xenopus NO29 is a candidate ortholog of human and mouse NPM3, and they further group both proteins closer with the nucleoplasmins than with the nucleophosmins. Northern blot analysis revealed that NPM3 was strongly expressed in all 16 human tissues examined, with especially robust expression in pancreas and testis; lung displayed the lowest level of expression. An analysis of subcellular fractions of NIH3T3 cells expressing epitope-tagged NPM3 revealed that NPM3 protein was localized solely in the nucleus. Conclusions Human NPM3 is an abundant and widely expressed protein with primarily nuclear localization. These biological activities, together with its physical relationship to the chaperones nucleoplasmin and nucleophosmin, are consistent with the proposed function of NPM3 as a molecular chaperone functioning in the nucleus. PMID:11722795
Comparative analysis of genome-wide Mlo gene family in Cajanus cajan and Phaseolus vulgaris.
Deshmukh, Reena; Singh, V K; Singh, B D
2016-04-01
The Mlo gene was discovered in barley because the mutant 'mlo' allele conferred broad-spectrum, non-race-specific resistance to powdery mildew caused by Blumeria graminis f. sp. hordei. The Mlo genes also play important roles in growth and development of plants, and in responses to biotic and abiotic stresses. The Mlo gene family has been characterized in several crop species, but only a single legume species, soybean (Glycine max L.), has been investigated so far. The present report describes in silico identification of 18 CcMlo and 20 PvMlo genes in the important legume crops Cajanus cajan (L.) Millsp. and Phaseolus vulgaris L., respectively. In silico analysis of gene organization, protein properties and conserved domains revealed that the C. cajan and P. vulgaris Mlo gene paralogs are more divergent from each other than from their orthologous pairs. The comparative phylogenetic analysis classified CcMlo and PvMlo genes into three major clades. A comparative analysis of CcMlo and PvMlo proteins with the G. max Mlo proteins indicated close association of one CcMlo, one PvMlo with two GmMlo genes, indicating that there was no further expansion of the Mlo gene family after the separation of these species. Thus, most of the diploid species of eudicots might be expected to contain 15-20 Mlo genes. The genes CcMlo12 and 14, and PvMlo11 and 12 are predicted to participate in powdery mildew resistance. If this prediction were verified, these genes could be targeted by TILLING or CRISPR to isolate powdery mildew resistant mutants.
Li, Jun; Hou, Hongmin; Li, Xiaoqin; Xiang, Jiang; Yin, Xiangjing; Gao, Hua; Zheng, Yi; Bassett, Carole L; Wang, Xiping
2013-09-01
SQUAMOSA promoter binding protein (SBP)-box genes encode a family of plant-specific transcription factors and play many crucial roles in plant development. In this study, 27 SBP-box gene family members were identified in the apple (Malus × domestica Borkh.) genome, 15 of which were suggested to be putative targets of MdmiR156. Plant SBPs were classified into eight groups according to the phylogenetic analysis of SBP-domain proteins. Gene structure, gene chromosomal location and synteny analyses of MdSBP genes within the apple genome demonstrated that tandem and segmental duplications, as well as whole genome duplications, have likely contributed to the expansion and evolution of the SBP-box gene family in apple. Additionally, synteny analysis between apple and Arabidopsis indicated that several paired homologs of MdSBP and AtSPL genes were located in syntenic genomic regions. Tissue-specific expression analysis of MdSBP genes in apple demonstrated their diversified spatiotemporal expression patterns. Most MdmiR156-targeted MdSBP genes, which had relatively high transcript levels in stems, leaves, apical buds and some floral organs, exhibited a more differential expression pattern than most MdmiR156-nontargeted MdSBP genes. Finally, expression analysis of MdSBP genes in leaves upon various plant hormone treatments showed that many MdSBP genes were responsive to different plant hormones, indicating that MdSBP genes may be involved in responses to hormone signaling during stress or in apple development. Copyright © 2013 Elsevier Masson SAS. All rights reserved.
FunSimMat: a comprehensive functional similarity database
Schlicker, Andreas; Albrecht, Mario
2008-01-01
Functional similarity based on Gene Ontology (GO) annotation is used in diverse applications like gene clustering, gene expression data analysis, protein interaction prediction and evaluation. However, there exists no comprehensive resource of functional similarity values although such a database would facilitate the use of functional similarity measures in different applications. Here, we describe FunSimMat (Functional Similarity Matrix, http://funsimmat.bioinf.mpi-inf.mpg.de/), a large new database that provides several different semantic similarity measures for GO terms. It offers various precomputed functional similarity values for proteins contained in UniProtKB and for protein families in Pfam and SMART. The web interface allows users to efficiently perform both semantic similarity searches with GO terms and functional similarity searches with proteins or protein families. All results can be downloaded in tab-delimited files for use with other tools. An additional XML–RPC interface gives automatic online access to FunSimMat for programs and remote services. PMID:17932054
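As an illustration of what a semantic similarity measure for GO terms can look like, the sketch below computes Resnik's information-content similarity on a toy ontology. The abstract does not say which measures FunSimMat provides, so Resnik's measure is only an assumed example here; the term names, parent links, and annotation probabilities are all hypothetical, and nothing below reflects the FunSimMat web or XML-RPC interface.

```python
from math import log

# Hypothetical toy ontology: each term maps to the set of its parents.
PARENTS = {
    "root": set(),
    "binding": {"root"},
    "catalysis": {"root"},
    "dna_binding": {"binding"},
    "protein_binding": {"binding"},
}

# Hypothetical annotation probabilities: the fraction of proteins annotated
# with a term or with any of its descendants.
PROB = {
    "root": 1.0,
    "binding": 0.5,
    "catalysis": 0.5,
    "dna_binding": 0.125,
    "protein_binding": 0.25,
}


def ancestors(term):
    """Return the term itself plus every term reachable via parent links."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(term):
    """Rarely used terms carry more information: IC = -log p(term)."""
    return -log(PROB[term])


def resnik(term_a, term_b):
    """Resnik similarity: the highest IC among the shared ancestors."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(information_content(t) for t in common)


sibling_score = resnik("dna_binding", "protein_binding")  # shared ancestor: binding
distant_score = resnik("dna_binding", "catalysis")        # only the root is shared
```

Two terms that share a specific ancestor score higher than two terms that only meet at the root, which is the property that makes such measures usable for ranking functionally similar proteins.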
Molecular insights into the binding of phosphoinositides to the TH domain region of TIPE proteins.
Antony, Priya; Baby, Bincy; Vijayan, Ranjit
2016-11-01
Phosphatidylinositols and their phosphorylated derivatives, phosphoinositides, play a central role in regulating diverse cellular functions. These phospholipids have been shown to interact with the hydrophobic TH domain of the tumor necrosis factor (TNF)-α-induced protein 8 (TIPE) family of proteins. However, the precise mechanism of interaction of these lipids is unclear. Here we report the binding mode and interactions of these phospholipids in the TH domain, as elucidated using molecular docking and simulations. Results indicate that phosphoinositides bind to the TH domain in a similar way by inserting their lipid tails in the hydrophobic cavity. The exposed head group is stabilized by interactions with critical positively charged residues on the surface of these proteins. Further MD simulations confirmed the dynamic stability of these lipids in the TH domain. This computational analysis thus provides insight into the binding mode of phospholipids in the TH domain of the TIPE family of proteins. Graphical abstract A phosphoinositide (phosphatidylinositol 4-phosphate; PtdIns4P) docked to TIPE2.
Xia, Jun Hong; Li, Hong Lian; Zhang, Yong; Meng, Zi Ning; Lin, Hao Ran
2018-05-01
Fish species inhabiting seawater (SW) or freshwater (FW) habitats have to develop genetic adaptations to alternative environmental factors, especially salinity. Functional consequences of the protein variations associated with habitat environments in fish mitochondrial genomes have not yet received much attention. We analyzed 829 complete fish mitochondrial genomes and compared the amino acid differences of 13 mitochondrial protein families between FW and SW fish groups. We identified 47 specificity determining sites (SDS) that are associated with FW or SW environments from 12 mitochondrial protein families. Thirty-two (68%) of the SDS are hydrophobic, 13 (28%) are neutral, and the remaining sites are acidic or basic. Seven of those SDS from ND1, ND2 and ND5 were scored as probably damaging to the protein structures. Furthermore, phylogenetic tree based Bayes Empirical Bayes analysis also detected 63 positive sites associated with alternative habitat environments across ten mtDNA proteins. These signatures could be important for studying mitochondrial genetic variation relevant to fish physiology and ecology.
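The idea of a specificity determining site can be sketched with a deliberately strict rule: an alignment column counts only if it is invariant within each habitat group and differs between the groups. This is an assumed simplification for illustration, not the scoring method used in the study, and the sequences below are made up. The second half of the block recomputes the residue-class percentages from the counts given in the abstract.

```python
def specificity_sites(group_a, group_b):
    """Return 0-based alignment columns that are invariant within each
    group but carry different residues in the two groups."""
    length = len(group_a[0])
    if any(len(seq) != length for seq in group_a + group_b):
        raise ValueError("all sequences must be aligned to the same length")
    sites = []
    for col in range(length):
        residues_a = {seq[col] for seq in group_a}
        residues_b = {seq[col] for seq in group_b}
        if len(residues_a) == 1 and len(residues_b) == 1 and residues_a != residues_b:
            sites.append(col)
    return sites


# Made-up aligned fragments, not real mitochondrial sequences.
freshwater = ["MLIAV", "MLIAV", "MLITV"]
seawater = ["MFIAV", "MFISV", "MFIAV"]
sds = specificity_sites(freshwater, seawater)

# Residue classes of the 47 sites, using the counts from the abstract.
total = 47
hydrophobic, neutral = 32, 13
other = total - hydrophobic - neutral  # acidic or basic sites
shares = {
    name: round(100 * count / total)
    for name, count in [
        ("hydrophobic", hydrophobic),
        ("neutral", neutral),
        ("acidic or basic", other),
    ]
}
```

Only the second column qualifies in the toy alignment; the fourth column varies within both groups and is rejected. The counts leave two acidic or basic sites, about 4% of the total.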
Lovering, Andrew L; Capeness, Michael J; Lambert, Carey; Hobley, Laura; Sockett, R Elizabeth
2011-01-01
Cyclic-di-GMP is a near-ubiquitous bacterial second messenger that is important in localized signal transmission during the control of various processes, including virulence and switching between planktonic and biofilm-based lifestyles. Cyclic-di-GMP is synthesized by GGDEF diguanylate cyclases and hydrolyzed by EAL or HD-GYP phosphodiesterases, with each functional domain often appended to distinct sensory modules. HD-GYP domain proteins have resisted structural analysis, but here we present the first structural representative of this family (1.28 Å), obtained using the unusual Bd1817 HD-GYP protein from the predatory bacterium Bdellovibrio bacteriovorus. Bd1817 lacks the active-site tyrosine present in most HD-GYP family members yet remains an excellent model of their features, sharing 48% sequence similarity with the archetype RpfG. The protein structure is highly modular and thus provides a basis for delineating domain boundaries in other stimulus-dependent homologues. Conserved residues in the HD-GYP family cluster around a binuclear metal center, which is observed complexed to a molecule of phosphate, providing information on the mode of hydroxide ion attack on substrate. The fold and active site of the HD-GYP domain are different from those of EAL proteins, and restricted access to the active-site cleft is indicative of a different mode of activity regulation. The region encompassing the GYP motif has a novel conformation and is surface exposed and available for complexation with binding partners, including GGDEF proteins. It is becoming apparent that many bacteria use the signaling molecule cyclic-di-GMP to regulate a variety of processes, most notably, transitions between motility and sessility. Importantly, this regulation is central to several traits implicated in chronic disease (adhesion, biofilm formation, and virulence gene expression). 
The mechanisms of cyclic-di-GMP synthesis via GGDEF enzymes and hydrolysis via EAL enzymes have been suggested by the analysis of several crystal structures, but no information has been available to date for the unrelated HD-GYP class of hydrolases. Here we present the multidomain structure of an unusual member of the HD-GYP family from the predatory bacterium Bdellovibrio bacteriovorus and detail the features that distinguish it from the wider structural family of general HD fold hydrolases. The structure reveals how a binuclear iron center is formed from several conserved residues and provides a basis for understanding HD-GYP family sequence requirements for c-di-GMP hydrolysis.
Hassan, Mubashir; Shahzadi, Saba; Alashwal, Hany; Zaki, Nazar; Seo, Sung-Yum; Moustafa, Ahmed A
2018-05-22
Cas scaffolding protein family member 4 and protein tyrosine kinase 2 are signaling proteins that are involved in neuritic plaque burden, neurofibrillary tangles, and disruption of synaptic connections in Alzheimer's disease. In the current study, a computational approach was employed to explore the active binding sites of Cas scaffolding protein family member 4 and protein tyrosine kinase 2 and their role in the activation of downstream signaling pathways. Sequence and structural analyses were performed on Cas scaffolding protein family member 4 and protein tyrosine kinase 2 to identify their core active binding sites. Molecular docking servers were used to predict the common interacting residues in both Cas scaffolding protein family member 4 and protein tyrosine kinase 2 and their involvement in Alzheimer's disease-mediated pathways. Furthermore, the results of the molecular dynamics simulations show the stability of the targeted proteins. In addition, the generated root mean square deviation and fluctuation, solvent-accessible surface area, and radius of gyration graphs depict their backbone stability and compactness. A better understanding of CAS proteins and their interconnected signaling cascade may help provide a treatment for Alzheimer's disease. Further, Cas scaffolding protein family member 4 could be used as a novel target for the treatment of Alzheimer's disease by inhibiting the protein tyrosine kinase 2 pathway.
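Two of the stability measures named in this abstract have short textbook definitions, sketched below in plain Python. This is a minimal illustration under stated assumptions, not the authors' analysis pipeline: the RMSD function assumes the two structures are already superimposed, the radius of gyration defaults to equal atom masses, and the coordinates are invented.

```python
from math import sqrt


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two coordinate sets of equal
    size. No superposition is done; the sets are assumed pre-aligned."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equally sized")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return sqrt(squared / len(coords_a))


def radius_of_gyration(coords, masses=None):
    """Mass-weighted root mean square distance from the centre of mass."""
    if masses is None:
        masses = [1.0] * len(coords)  # assumption: all atoms weigh the same
    total_mass = sum(masses)
    center = [
        sum(m * c[i] for m, c in zip(masses, coords)) / total_mass
        for i in range(3)
    ]
    weighted = sum(
        m * sum((c[i] - center[i]) ** 2 for i in range(3))
        for m, c in zip(masses, coords)
    )
    return sqrt(weighted / total_mass)


# Invented coordinates: four atoms on a square, then the same square shifted.
reference = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 2.0, 0.0), (0.0, 2.0, 0.0)]
shifted = [(x + 1.0, y, z) for x, y, z in reference]

deviation = rmsd(reference, shifted)
compactness = radius_of_gyration(reference)
```

In a trajectory analysis these two functions would be applied frame by frame: a flat RMSD curve suggests a stable backbone, and a steady radius of gyration suggests the fold stays compact.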
Genome-Wide Analysis of bZIP-Encoding Genes in Maize
Wei, Kaifa; Chen, Juan; Wang, Yanmei; Chen, Yanhui; Chen, Shaoxiang; Lin, Yina; Pan, Si; Zhong, Xiaojun; Xie, Daoxin
2012-01-01
In plants, basic leucine zipper (bZIP) proteins regulate numerous biological processes such as seed maturation, flower and vascular development, stress signalling and pathogen defence. We have carried out a genome-wide identification and analysis of 125 bZIP genes that exist in the maize genome, encoding 170 distinct bZIP proteins. This family can be divided into 11 groups according to the phylogenetic relationship among the maize bZIP proteins and those in Arabidopsis and rice. Six kinds of intron patterns (a–f) within the basic and hinge regions are defined. The additional conserved motifs have been identified and present the group specificity. Detailed three-dimensional structure analysis has been done to display the sequence conservation and potential distribution of the bZIP domain. Further, we predict the DNA-binding pattern and the dimerization property on the basis of the characteristic features in the basic and hinge regions and the leucine zipper, respectively, which supports our classification greatly and helps to classify 26 distinct subfamilies. The chromosome distribution and the genetic analysis reveal that 58 ZmbZIP genes are located in the segmental duplicate regions in the maize genome, suggesting that the segment chromosomal duplications contribute greatly to the expansion of the maize bZIP family. Across the 60 different developmental stages of 11 organs, three apparent clusters formed represent three kinds of different expression patterns among the ZmbZIP gene family in maize development. A similar but slightly different expression pattern of bZIPs in two inbred lines displays that 22 detected ZmbZIP genes might be involved in drought stress. Thirteen pairs and 143 pairs of ZmbZIP genes show strongly negative and positive correlations in the four distinct fungal infections, respectively, based on the expression profile and Pearson's correlation coefficient analysis. PMID:23103471
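The pairwise comparison of expression profiles mentioned at the end of this abstract rests on Pearson's correlation coefficient. A minimal sketch follows; the expression values are hypothetical and the gene names are placeholders, not ZmbZIP measurements from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Hypothetical expression values for three genes across four treatments.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [2.1, 3.9, 6.2, 8.0]  # rises together with gene_a
gene_c = [8.0, 6.0, 4.0, 2.0]  # falls as gene_a rises

r_ab = pearson(gene_a, gene_b)
r_ac = pearson(gene_a, gene_c)
```

A coefficient near +1 marks a positively correlated gene pair and one near -1 a negatively correlated pair; the cutoff for calling a correlation strong is a choice the abstract does not state.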
Wang, Guo-Ming; Yin, Hao; Qiao, Xin; Tan, Xu; Gu, Chao; Wang, Bao-Hua; Cheng, Rui; Wang, Ying-Zhen; Zhang, Shao-Ling
2016-12-01
The F-box gene family, one of the largest gene families in plants, plays crucial roles in regulating plant development, reproduction, cellular protein degradation and responses to biotic and abiotic stresses. However, a comprehensive analysis of the F-box gene family in pear (Pyrus bretschneideri Rehd.) and other Rosaceae species has not been reported yet. Herein, we identified a total of 226 full-length F-box genes in pear for the first time. These genes were further divided into various subgroups based on specific domains and phylogenetic analysis. Intriguingly, we observed that whole-genome duplication and dispersed duplication have made a major contribution to F-box family expansion. Furthermore, the dynamic evolution for different modes of gene duplication was dissected. Interestingly, we found that dispersed and tandem duplicates have been evolving at a high rate. In addition, we found that F-box genes exhibited functional specificity based on GO analysis, and most of the F-box genes were significantly enriched in the protein binding (GO: 0005515) term, supporting that F-box genes might play a critical role for gene regulation in pear. Transcriptome and digital expression profiles revealed that F-box genes are involved in the development of multiple pear tissues. Overall, these results will set the stage for elaborating the biological role of F-box genes in pear and other plants. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
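A statement that a gene set is significantly enriched in a GO term is commonly backed by a hypergeometric (one-sided Fisher) test. The abstract does not say which test was used, so the sketch below is an assumed, generic version, and the numbers in the example are invented rather than taken from the pear data.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """Probability of seeing at least `hits` annotated genes when drawing
    `sample` genes without replacement from `population` genes, of which
    `annotated` carry the GO term (upper hypergeometric tail)."""
    if not (0 <= annotated <= population and 0 <= sample <= population):
        raise ValueError("annotated and sample must lie within the population")
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Invented example: 10 genes in the genome, 5 carry the term, and all 5
# genes in the family under study carry it as well.
p_all_hits = enrichment_p_value(population=10, annotated=5, sample=5, hits=5)
p_no_requirement = enrichment_p_value(population=10, annotated=5, sample=5, hits=0)
```

A small p-value means the overlap is unlikely under random sampling. Real analyses test many GO terms at once and therefore apply a multiple-testing correction on top of this per-term value.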
Screening for germline mutations in the neurofibromatosis type 2 (NF2) gene in NF2 patients
DOE Office of Scientific and Technical Information (OSTI.GOV)
Andermann, A.A.; Ruttledge, M.H.; Rangaratnam, A.
Neurofibromatosis type 2 (NF2) is an autosomal dominant disease with over 95% penetrance which predisposes gene carriers to develop multiple tumors of the central nervous system. The NF2 gene is a putative tumor suppressor gene which was previously mapped to the long arm of chromosome 22, and has recently been identified, using positional cloning techniques. The gene encodes a protein, schwannomin (SCH), which is highly homologous to the band 4.1 protein family. In an attempt to identify and characterize mutations which lead to the manifestation of the disease, we have used single strand conformation analysis (SSCA) to screen for germline mutations in all 17 exons of the NF2 gene in 59 unrelated NF2 patients, representing both familial and new mutations. A total of 27 migration abnormalities was found in 26 patients. Using direct sequencing analysis, the majority of these variants were found to result in nonsense, splice-site or frameshift mutations. Mutations identified in familial NF2 patients segregate in the family, and may prove to be useful tools for a simple and direct SSCA-based technique of presymptomatic or prenatal diagnosis in relatives of patients with NF2. This may be of particular importance in children of patients who have new mutations in the NF2 gene, where linkage analysis may not be feasible.
Evolution of the vertebrate claudin gene family: insights from a basal vertebrate, the sea lamprey.
Mukendi, Christian; Dean, Nicholas; Lala, Rushil; Smith, Jeramiah; Bronner, Marianne E; Nikitina, Natalya V
2016-01-01
Claudins are major constituents of tight junctions, contributing both to their intercellular sealing and selective permeability properties. While claudins and claudin-like molecules are present in some invertebrates, the association of claudins with tight junctions has been conclusively documented only in vertebrates. Here we report the sequencing, phylogenetic analysis and comprehensive spatiotemporal expression analysis of the entire claudin gene family in the basal extant vertebrate, the sea lamprey. Our results demonstrate that clear orthologues to about half of all mammalian claudins are present in the lamprey, suggesting that at least one round of whole genome duplication contributed to the diversification of this gene family. Expression analysis revealed that claudins are expressed in discrete and specific domains, many of which represent vertebrate-specific innovations, such as in cranial ectodermal placodes and the neural crest; whereas others represent structures characteristic of chordates, e.g. pronephros, notochord, somites, endostyle and pharyngeal arches. By comparing the embryonic expression of claudins in the lamprey to that of other vertebrates, we found that ancestral expression patterns were often preserved in higher vertebrates. Morpholino mediated loss of Cldn3b demonstrated a functional role for this protein in placode and pharyngeal arch morphogenesis. Taken together, our data provide novel insights into the origins and evolution of the claudin gene family and the significance of claudin proteins in the evolution of vertebrates.
Nonagonal cadherins: A new protein family found within the Stramenopiles.
Fletcher, Kyle I G; van West, Pieter; Gachon, Claire M M
2016-11-15
Cadherins, a group of molecules typically associated with planar cell polarity and Wnt signalling, have been little reported outside of the animal kingdom. Here, we identify a new family of cadherins in the Stramenopiles, termed Nonagonal after their 9 transmembrane passes, which contrast with the one or seven passes found in other known cadherin families. Manual curation and experimental validation reveal two subclasses of nonagonal cadherins, depending on the number of uninterrupted extracellular cadherin (EC) modules presented. Firstly, shorter mono-exonic, unimodular, protein models, with 3 to 12 EC domains occur as duplicate paralogs in the saprotrophic Labyrinthulomycetes Aurantiochytrium limanicum and Schizochytrium aggregatum, the gastrointestinal Blastocystis hominis (Blastocystae) and as a single copy gene in the autotrophic Pelagophyte Aureococcus anophagefferens. Larger, single copy, multi-exonal, tri-modular protein models, with up to 72 EC domains in total, are found in the Oomycete genera Albugo, Phytophthora, Pythium and Eurychasma. No homolog was found in the closely related autotrophic Phaeophyceae (brown algae) or Bacillariophyceae (diatoms), nor in several genera of plant and animal pathogenic oomycetes (Aphanomyces, Saprolegnia and Hyaloperonospora). This potential absence was further investigated by synteny analysis of the genome regions flanking the cadherin gene models, which are found to be highly variable. Novel to this new cadherin family is the presence of intercalated laminin and putative carbohydrate binding in tri-modular oomycete cadherins and at the N-terminus of thraustochytrid proteins. As we were unable to detect any homologs of proteins involved in signalling pathways where other cadherin families are involved, we present a conceptual hypothesis on the function of nonagonal cadherin based around the presence of putative carbohydrate binding domains. Copyright © 2016. Published by Elsevier B.V.
Evolutionary divergence of phytochrome protein function in Zea mays PIF3 signaling.
Kumar, Indrajit; Swaminathan, Kankshita; Hudson, Karen; Hudson, Matthew E
2016-07-01
Two maize phytochrome-interacting factor (PIF) basic helix-loop-helix (bHLH) family members, ZmPIF3.1 and ZmPIF3.2, were identified, cloned and expressed in vitro to investigate light-signaling interactions. A phylogenetic analysis of sequences of the maize bHLH transcription factor gene family revealed the extent of the PIF family, and a total of seven predicted PIF-encoding genes were identified from genes encoding bHLH family VIIa/b proteins in the maize genome. To investigate the role of maize PIFs in phytochrome signaling, full-length cDNAs for phytochromes PhyA2, PhyB1, PhyB2 and PhyC1 from maize were cloned and expressed in vitro as chromophorylated holophytochromes. We showed that ZmPIF3.1 and ZmPIF3.2 interact specifically with the Pfr form of maize holophytochrome B1 (ZmphyB1), showing no detectable affinity for the Pr form. Maize holophytochrome B2 (ZmphyB2) showed no detectable binding affinity for PIFs in either Pr or Pfr forms, but phyB Pfr from Arabidopsis interacted with ZmPIF3.1 similarly to ZmphyB1 Pfr. We conclude that subfunctionalization at the protein-protein interaction level has altered the role of phyB2 relative to that of phyB1 in maize. Since the phyB2 mutant shows photomorphogenic defects, we conclude that maize phyB2 is an active photoreceptor, without the binding of PIF3 seen in other phyB family proteins. © The Author 2016. Published by Oxford University Press on behalf of the Society for Experimental Biology.
A 14-3-3 Family Protein from Wild Soybean (Glycine Soja) Regulates ABA Sensitivity in Arabidopsis
Sun, Xiaoli; Sun, Mingzhe; Jia, Bowei; Chen, Chao; Qin, Zhiwei; Yang, Kejun; Shen, Yang; Meiping, Zhang; Mingyang, Cong; Zhu, Yanming
2015-01-01
It is widely accepted that the 14-3-3 family proteins are key regulators of multiple stress signal transduction cascades. By conducting genome-wide analysis, researchers have identified the soybean 14-3-3 family proteins; however, there is still no direct genetic evidence showing the involvement of soybean 14-3-3s in ABA responses. Hence, in this study, based on the latest Glycine max genome on Phytozome v10.3, we initially analyzed the evolutionary relationship, genome organization, gene structure and duplication, and three-dimensional structure of soybean 14-3-3 family proteins systematically. Our results suggested that the soybean 14-3-3 family was highly evolutionarily conserved and had undergone segmental duplication during evolution. Then, based on our previous functional characterization of a Glycine soja 14-3-3 protein GsGF14o in drought stress responses, we further investigated the expression characteristics of GsGF14o in detail, and demonstrated its positive roles in ABA sensitivity. Quantitative real-time PCR analyses in Glycine soja seedlings and GUS activity assays in PGsGF14O:GUS transgenic Arabidopsis showed that GsGF14o expression was moderately and rapidly induced by ABA treatment. As expected, GsGF14o overexpression in Arabidopsis augmented the ABA inhibition of seed germination and seedling growth, promoted the ABA induced stomata closure, and up-regulated the expression levels of ABA induced genes. Moreover, through yeast two hybrid analyses, we further demonstrated that GsGF14o physically interacted with the AREB/ABF transcription factors in yeast cells. Taken together, results presented in this study strongly suggested that GsGF14o played an important role in regulation of ABA sensitivity in Arabidopsis. PMID:26717241
Litholdo, Celso G; Parker, Benjamin L; Eamens, Andrew L; Larsen, Martin R; Cordwell, Stuart J; Waterhouse, Peter M
2016-06-01
Expression of the F-Box protein Leaf Curling Responsiveness (LCR) is regulated by microRNA, miR394, and alterations to this interplay in Arabidopsis thaliana produce defects in leaf polarity and shoot apical meristem organization. Although the miR394-LCR node has been documented in Arabidopsis, the identification of proteins targeted by LCR F-box itself has proven problematic. Here, a proteomic analysis of shoot apices from plants with altered LCR levels identified a member of the Latex Protein (MLP) family gene as a potential LCR F-box target. Bioinformatic and molecular analyses also suggested that other MLP family members are likely to be targets for this post-translational regulation. Direct interaction between LCR F-Box and MLP423 was validated. Additional MLP members had reduction in protein accumulation, in varying degrees, mediated by LCR F-Box. Transgenic Arabidopsis lines, in which MLP28 expression was reduced through an artificial miRNA technology, displayed severe developmental defects, including changes in leaf patterning and morphology, shoot apex defects, and eventual premature death. These phenotypic characteristics resemble those of Arabidopsis plants modified to over-express LCR. Taken together, the results demonstrate that MLPs are driven to degradation by LCR, and indicate that the MLP gene family is a target of the miR394-LCR regulatory node, representing potential targets for direct post-translational regulation mediated by LCR F-Box. In addition, the MLP28 family member is associated with the LCR regulation that is critical for normal Arabidopsis development. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
MicroRNAs affect BCL-2 family proteins in the setting of cerebral ischemia.
Ouyang, Yi-Bing; Giffard, Rona G
2014-11-01
The BCL-2 family is centrally involved in the mechanism of cell death after cerebral ischemia. It is well known that the proteins of the BCL-2 family are key regulators of apoptosis through controlling mitochondrial outer membrane permeabilization. Recent findings suggest that many BCL-2 family members are also directly involved in controlling transmission of Ca(2+) from the endoplasmic reticulum (ER) to mitochondria through a specialization called the mitochondria-associated ER membrane (MAM). Increasing evidence supports the involvement of microRNAs (miRNAs), some of them targeting BCL-2 family proteins, in the regulation of cerebral ischemia. In this mini-review, after highlighting current knowledge about the multiple functions of BCL-2 family proteins and summarizing their relationship to outcome from cerebral ischemia, we focus on the regulation of BCL-2 family proteins by miRNAs, especially miR-29 which targets multiple BCL-2 family proteins. Copyright © 2013 Elsevier Ltd. All rights reserved.
2014-01-01
Background Pectins are acidic sugar-containing polysaccharides that are universally conserved components of the primary cell walls of plants and modulate both tip and diffuse cell growth. However, many of their specific functions and the evolution of the genes responsible for producing and modifying them are incompletely understood. The moss Physcomitrella patens is emerging as a powerful model system for the study of plant cell walls. To identify deeply conserved pectin-related genes in Physcomitrella, we generated phylogenetic trees for 16 pectin-related gene families using sequences from ten plant genomes and analyzed the evolutionary relationships within these families. Results Contrary to our initial hypothesis that a single ancestral gene was present for each pectin-related gene family in the common ancestor of land plants, five of the 16 gene families, including homogalacturonan galacturonosyltransferases, polygalacturonases, pectin methylesterases, homogalacturonan methyltransferases, and pectate lyase-like proteins, show evidence of multiple members in the early land plant that gave rise to the mosses and vascular plants. Seven of the gene families, the UDP-rhamnose synthases, UDP-glucuronic acid epimerases, homogalacturonan galacturonosyltransferase-like proteins, β-1,4-galactan β-1,4-galactosyltransferases, rhamnogalacturonan II xylosyltransferases, and pectin acetylesterases appear to have had a single member in the common ancestor of land plants. We detected no Physcomitrella members in the xylogalacturonan xylosyltransferase, rhamnogalacturonan I arabinosyltransferase, pectin methylesterase inhibitor, or polygalacturonase inhibitor protein families. Conclusions Several gene families related to the production and modification of pectins in plants appear to have multiple members that are conserved as far back as the common ancestor of mosses and vascular plants. 
The presence of multiple members of these families even before the divergence of other important cell wall-related genes, such as cellulose synthases, suggests a more complex role than previously suspected for pectins in the evolution of land plants. The presence of relatively small pectin-related gene families in Physcomitrella as compared to Arabidopsis makes it an attractive target for analysis of the functions of pectins in cell walls. In contrast, the absence of genes in Physcomitrella for some families suggests that certain pectin modifications, such as homogalacturonan xylosylation, arose later during land plant evolution. PMID:24666997
Lan, Yi; Sun, Jin; Tian, Renmao; Bartlett, Douglas H; Li, Runsheng; Wong, Yue Him; Zhang, Weipeng; Qiu, Jian-Wen; Xu, Ting; He, Li-Sheng; Tabata, Harry G; Qian, Pei-Yuan
2017-07-01
The Challenger Deep in the Mariana Trench is the deepest point in the oceans of our planet. Understanding how animals adapt to this harsh environment characterized by high hydrostatic pressure, food limitation, darkness and cold is of great scientific interest. Of the animals dwelling in the Challenger Deep, amphipods have been captured using baited traps. In this study, we sequenced the transcriptome of the amphipod Hirondellea gigas collected at a depth of 10,929 m from the East Pond of the Challenger Deep. Assembly of these sequences resulted in 133,041 contigs and 22,046 translated proteins. Functional annotation of these contigs was made using the GO and KEGG databases. Comparison of these translated proteins with those of four shallow-water amphipods revealed 10,731 gene families, of which 5659 were single-copy orthologs. Base substitution analysis on these single-copy orthologs showed that 62 genes are positively selected in H. gigas, including genes related to β-alanine biosynthesis, energy metabolism and genetic information processing. For multiple-copy orthologous genes, gene family expansion analysis revealed that cold-inducible proteins (i.e., transcription factors II A and transcription elongation factor 1) as well as zinc finger domains are expanded in H. gigas. Overall, our results indicate that genetic adaptation to the hadal environment by H. gigas may be mediated by both gene family expansion and amino acid substitutions of specific proteins. © 2017 John Wiley & Sons Ltd.
Structural Characterization of the Predominant Family of Histidine Kinase Sensor Domains
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Z.; Hendrickson, W
2010-01-01
Histidine kinase (HK) receptors are used ubiquitously by bacteria to monitor environmental changes, and they are also prevalent in plants, fungi, and other protists. Typical HK receptors have an extracellular sensor portion that detects a signal, usually a chemical ligand, and an intracellular transmitter portion that includes both the kinase domain itself and the site for histidine phosphorylation. While kinase domains are highly conserved, sensor domains are diverse. HK receptors function as dimers, but the molecular mechanism for signal transduction across cell membranes remains obscure. In this study, eight crystal structures were determined from five sensor domains representative of the most populated family, family HK1, found in a bioinformatic analysis of predicted sensor domains from transmembrane HKs. Each structure contains an inserted repeat of PhoQ/DcuS/CitA (PDC) domains, and similarity between sequence and structure is correlated across these and other double-PDC sensor proteins. Three of the five sensors crystallize as dimers that appear to be physiologically relevant, and comparisons between ligated structures and apo-state structures provide insights into signal transmission. Some HK1 family proteins prove to be sensors for chemotaxis proteins or diguanylate cyclase receptors, implying a combinatorial molecular evolution.
Raman, Gurusamy; Park, SeonJoo
2015-01-01
Dianthus superbus var. longicalycinus is an economically important traditional Chinese medicinal plant that is also used for ornamental purposes. In this study, D. superbus was compared to its closely related family of Caryophyllaceae chloroplast (cp) genomes such as Lychnis chalcedonica and Spinacia oleracea. D. superbus had the longest large single copy (LSC) region (82,805 bp), with some variations in the inverted repeat region A (IRA)/LSC regions. The IRs underwent both expansion and constriction during evolution of the Caryophyllaceae family; however, intense variations were not identified. The pseudogene ribosomal protein subunit S19 (rps19) was identified at the IRA/LSC junction, but was not present in the cp genome of other Caryophyllaceae family members. The translation initiation factor IF-1 (infA) and ribosomal protein subunit L23 (rpl23) genes were absent from the Dianthus cp genome. When the cp genome of Dianthus was compared with 31 other angiosperm lineages, the infA gene was found to have been lost in most members of rosids, Solanales of asterids and Lychnis of Caryophyllales, whereas rpl23 gene loss or pseudogenization had occurred exclusively in Caryophyllales. Nevertheless, the cp genomes of Dianthus and Spinacia have two introns in the proteolytic subunit of ATP-dependent protease (clpP) gene, but Lychnis has lost introns from the clpP gene. Furthermore, phylogenetic analysis of individual protein-coding genes infA and rpl23 revealed that gene loss or pseudogenization occurred independently in the cp genome of Dianthus. Molecular phylogenetic analysis also demonstrated a sister relationship between Dianthus and Lychnis based on 78 protein-coding sequences. The results presented herein will contribute to studies of the evolution, molecular biology and genetic engineering of the medicinal and ornamental plant, D. superbus var. longicalycinus. PMID:26513163
Zouhar, Miloslav; Mazakova, Jana; Rysanek, Pavel
2014-01-01
Phoma stem canker (blackleg) is a disease of world-wide importance on oilseed rape (Brassica napus) and can cause serious losses for crops globally. The disease is caused by the dothideomycetous fungus Leptosphaeria maculans, which is highly virulent/aggressive. Cyclophilins (CYPs) and FK506-binding proteins (FKBPs) are ubiquitous proteins belonging to the peptidyl-prolyl cis/trans isomerase (PPIase) family. They are collectively referred to as immunophilins (IMMs). In the present study, the IMM genes (CYPs and FKBPs) in the genome of the haploid L. maculans strain v23.1.3 were identified and classified. Twelve CYPs and five FKBPs were determined in total. Domain architecture analysis revealed the presence of a conserved cyclophilin-like domain (CLD) in the case of CYPs and FKBP_C in the case of FKBPs. Interestingly, IMMs in L. maculans also subgrouped into single domain (SD) and multidomain (MD) proteins. They were primarily found to be localized in cytoplasm, nuclei, and mitochondria. Homologous and orthologous gene pairs were also determined by comparison with the model organism Saccharomyces cerevisiae. Remarkably, IMMs of L. maculans contain shorter introns in comparison to exons. Moreover, CYPs, in contrast with FKBPs, contain few exons. However, two CYPs were determined as being intronless. The expression profile of IMMs in both mycelium and infected primary leaves of B. napus demonstrated their potential role during infection. Secondary structure analysis revealed the presence of atypical eight β strands and two α helices fold architecture. Gene ontology analysis of IMMs predicted their significant role in protein folding and PPIase activity. Taken together, our findings for the first time present new prospects of this highly conserved gene family in phytopathogenic fungus. PMID:25259854
Zhou, Wenya; Du, Xiaoling; Song, Fengju; Zheng, Hong; Chen, Kexin; Zhang, Wei; Yang, Jilong
2016-04-19
Malignant peripheral nerve sheath tumors (MPNST) are rare, highly malignant, and poorly understood sarcomas. The often poor outcome of MPNST highlights the necessity of identifying prognostic predictors for this aggressive sarcoma. Here, we investigate the role of fibroblast growth factor receptor (FGFR) family members in human MPNSTs. aCGH and bioinformatics analysis identified frequent amplification of the FGFR1 gene. FISH analysis revealed that 26.9% of MPNST samples had amplification of FGFR1, with both focal and polysomy patterns observed. IHC identified that FGFR1 protein expression was positively correlated with FGFR1 gene amplification. High expression of FGFR1 protein was associated with better overall survival (OS) and was an independent prognostic predictor for OS of MPNST patients. Additionally, combined expression of FGFR1 and FGFR2 protein characterized a subtype of MPNST with better OS. FGFR4 protein was expressed in 82.3% of MPNST samples, and was associated with poor disease-free survival. We performed microarray-based comparative genomic hybridization (aCGH) profiling of two cohorts of primary MPNST tissue samples including 25 patients treated at The University of Texas MD Anderson Cancer Center and 26 patients from Tianjin Medical University Cancer Institute and Hospital. Fluorescence in situ hybridization (FISH) was used to validate the gene amplification detected by aCGH analysis. Another cohort of 63 formalin-fixed paraffin-embedded MPNST samples (including 52 samples for FISH assay) was obtained to explore FGFR1, 2, 3, and 4 protein expression by immunohistochemical (IHC) analysis. Our integrated genomic and molecular studies provide evidence that FGFRs play different prognostic roles in MPNST.
Baltoumas, Fotis A; Theodoropoulou, Margarita C; Hamodrakas, Stavros J
2013-06-01
G-protein coupled receptors (GPCRs) are one of the largest families of membrane receptors in eukaryotes. Heterotrimeric G-proteins, composed of α, β and γ subunits, are important molecular switches in the mediation of GPCR signaling. Receptor stimulation after the binding of a suitable ligand leads to G-protein heterotrimer activation and dissociation into the Gα subunit and Gβγ heterodimer. These subunits then interact with a large number of effectors, leading to several cell responses. We studied the interactions between Gα subunits and their binding partners, using information from structural, mutagenesis and bioinformatics studies, and conducted a series of comparisons of sequence, structure, electrostatic properties and intermolecular energies among different Gα families and subfamilies. We identified a number of Gα surfaces that may, on several occasions, participate in interactions with receptors as well as effectors. The study of Gα interacting surfaces in terms of sequence, structure and electrostatic potential reveals features that may account for the Gα subunit's behavior towards its interacting partners. The electrostatic properties of the Gα subunits, which in some cases differ greatly not only between families but also between subfamilies, as well as the G-protein interacting surfaces of effectors and regulators of G-protein signaling (RGS) suggest that electrostatic complementarity may be an important factor in G-protein interactions. Energy calculations also support this notion. This information may be useful in future studies of G-protein interactions with GPCRs and effectors. Copyright © 2013 Elsevier Inc. All rights reserved.
Functional and Evolutionary Analysis of the CASPARIAN STRIP MEMBRANE DOMAIN PROTEIN Family
Roppolo, Daniele; Boeckmann, Brigitte; Pfister, Alexandre; Boutet, Emmanuel; Rubio, Maria C.; Dénervaud-Tendon, Valérie; Vermeer, Joop E.M.; Gheyselinck, Jacqueline; Xenarios, Ioannis; Geldner, Niko
2014-01-01
CASPARIAN STRIP MEMBRANE DOMAIN PROTEINS (CASPs) are four-membrane-span proteins that mediate the deposition of Casparian strips in the endodermis by recruiting the lignin polymerization machinery. CASPs show high stability in their membrane domain, which presents all the hallmarks of a membrane scaffold. Here, we characterized the large family of CASP-like (CASPL) proteins. CASPLs were found in all major divisions of land plants as well as in green algae; homologs outside of the plant kingdom were identified as members of the MARVEL protein family. When ectopically expressed in the endodermis, most CASPLs were able to integrate the CASP membrane domain, which suggests that CASPLs share with CASPs the propensity to form transmembrane scaffolds. Extracellular loops are not necessary for generating the scaffold, since CASP1 was still able to localize correctly when either one of the extracellular loops was deleted. The CASP first extracellular loop was found conserved in euphyllophytes but absent in plants lacking Casparian strips, an observation that may contribute to the study of Casparian strip and root evolution. In Arabidopsis (Arabidopsis thaliana), CASPL showed specific expression in a variety of cell types, such as trichomes, abscission zone cells, peripheral root cap cells, and xylem pole pericycle cells. PMID:24920445
PINOID functions in root phototropism as a negative regulator
Haga, Ken; Sakai, Tatsuya
2015-01-01
The PINOID (PID) family, which belongs to AGCVIII kinases, is known to be involved in the regulation of auxin efflux transporter PIN-FORMED (PIN) proteins through changes in the phosphorylation status. Recently, we demonstrated that the PID family is necessary for phytochrome-mediated phototropic enhancement in Arabidopsis hypocotyls and that the downregulation of PID expression by red-light pretreatment results in the promotion of the PIN-mediated auxin gradient during phototropic responses. However, whether PID participates in root phototropism in Arabidopsis seedlings has not been well studied. Here, we demonstrated that negative root phototropic responses are enhanced in the pid quadruple mutant and are severely impaired in transgenic plants expressing PID constitutively. The results indicate that the PID family functions in a negative root phototropism as a negative regulator. On the other hand, analysis with PID fused to a yellow fluorescent protein, VENUS, showed that unilateral blue-light irradiation causes a lower accumulation of PID proteins on the shaded side than on the irradiated side. This result suggests that the blue-light-mediated asymmetrical distribution of PID proteins may be one of the critical responses in phototropin-mediated signals during a negative root phototropism. Alternatively, such a transverse gradient of PID proteins may result from gravitropic stimulation produced by phototropic bending. PMID:26039488
Malhotra, Sony; Sowdhamini, Ramanathan
2013-08-01
The interaction of proteins with their respective DNA targets is known to control many high-fidelity cellular processes. Performing a comprehensive survey of the sequenced genomes for DNA-binding proteins (DBPs) will help in understanding their distribution and the associated functions in a particular genome. Availability of fully sequenced genome of Arabidopsis thaliana enables the review of distribution of DBPs in this model plant genome. We used profiles of both structure and sequence-based DNA-binding families, derived from PDB and PFam databases, to perform the survey. This resulted in 4471 proteins, identified as DNA-binding in Arabidopsis genome, which are distributed across 300 different PFam families. Apart from several plant-specific DNA-binding families, certain RING fingers and leucine zippers also had high representation. Our search protocol helped to assign DNA-binding property to several proteins that were previously marked as unknown, putative or hypothetical in function. The distribution of Arabidopsis genes having a role in plant DNA repair were particularly studied and noted for their functional mapping. The functions observed to be overrepresented in the plant genome harbour DNA-3-methyladenine glycosylase activity, alkylbase DNA N-glycosylase activity and DNA-(apurinic or apyrimidinic site) lyase activity, suggesting their role in specialized functions such as gene regulation and DNA repair.
Functional binding interaction identified between the axonal CAM L1 and members of the ERM family
Dickson, Tracey C.; Mintz, C. David; Benson, Deanna L.; Salton, Stephen R.J.
2002-01-01
A yeast two-hybrid library was screened using the cytoplasmic domain of the axonal cell adhesion molecule L1 to identify binding partners that may be involved in the regulation of L1 function. The intracellular domain of L1 bound to ezrin, a member of the ezrin, radixin, and moesin (ERM) family of membrane–cytoskeleton linking proteins, at a site overlapping that for AP2, a clathrin adaptor. Binding of bacterial fusion proteins confirmed this interaction. To determine whether ERM proteins interact with L1 in vivo, extracellular antibodies to L1 were used to force cluster the protein on cultured hippocampal neurons and PC12 cells, which were then immunolabeled for ERM proteins. Confocal analysis revealed a precise pattern of codistribution between ERMs and L1 clusters in axons and PC12 neurites, whereas ERMs in dendrites and spectrin labeling remained evenly distributed. Transfection of hippocampal neurons grown on an L1 substrate with a dominant negative ERM construct resulted in extensive and abnormal elaboration of membrane protrusions and an increase in axon branching, highlighting the importance of the ERM–actin interaction in axon development. Together, our data indicate that L1 binds directly to members of the ERM family and suggest this association may coordinate aspects of axonal morphogenesis. PMID:12070130
Wytynck, Pieter; Rougé, Pierre; Van Damme, Els J M
2017-11-01
Ribosome-inactivating proteins (RIPs) are cytotoxic enzymes capable of halting protein synthesis by irreversible modification of ribosomes. Although RIPs are widespread they are not ubiquitous in the plant kingdom. The physiological importance of RIPs is not fully elucidated, but evidence suggests a role in the protection of the plant against biotic and abiotic stresses. Searches in the rice genome revealed a large and highly complex family of proteins with a RIP domain. A comparative analysis retrieved 38 RIP sequences from the genome sequence of Oryza sativa subspecies japonica and 34 sequences from the subspecies indica. The RIP sequences are scattered over different chromosomes but are mostly found on the third chromosome. The phylogenetic tree revealed the pairwise clustering of RIPs from japonica and indica. Molecular modeling and sequence analysis yielded information on the catalytic site of the enzyme, and suggested that a large part of RIP domains probably possess N-glycosidase activity. Several RIPs are differentially expressed in plant tissues and in response to specific abiotic stresses. This study provides an overview of RIP motifs in rice and will help to understand their biological role(s) and evolutionary relationships. Copyright © 2017 Elsevier Ltd. All rights reserved.
Suplatov, Dmitry; Kirilin, Eugeny; Arbatsky, Mikhail; Takhaveev, Vakil; Švedas, Vytas
2014-01-01
The new web-server pocketZebra implements the power of bioinformatics and geometry-based structural approaches to identify and rank subfamily-specific binding sites in proteins by functional significance, and select particular positions in the structure that determine selective accommodation of ligands. A new scoring function has been developed to annotate binding sites by the presence of the subfamily-specific positions in diverse protein families. pocketZebra web-server has multiple input modes to meet the needs of users with different experience in bioinformatics. The server provides on-site visualization of the results as well as off-line version of the output in annotated text format and as PyMol sessions ready for structural analysis. pocketZebra can be used to study structure–function relationship and regulation in large protein superfamilies, classify functionally important binding sites and annotate proteins with unknown function. The server can be used to engineer ligand-binding sites and allosteric regulation of enzymes, or implemented in a drug discovery process to search for potential molecular targets and novel selective inhibitors/effectors. The server, documentation and examples are freely available at http://biokinet.belozersky.msu.ru/pocketzebra and there are no login requirements. PMID:24852248
A structural analysis of the AAA+ domains in Saccharomyces cerevisiae cytoplasmic dynein
Gleave, Emma S.; Schmidt, Helgo; Carter, Andrew P.
2014-01-01
Dyneins are large protein complexes that act as microtubule-based molecular motors. The dynein heavy chain contains a motor domain which is a member of the AAA+ protein family (ATPases Associated with diverse cellular Activities). Proteins of the AAA+ family show a diverse range of functionalities, but share a related core AAA+ domain, which often assembles into hexameric rings. Dynein is unusual because it has all six AAA+ domains linked together, in one long polypeptide. The dynein motor domain generates movement by coupling ATP-driven conformational changes in the AAA+ ring to the swing of a motile element called the linker. Dynein binds to its microtubule track via a long antiparallel coiled-coil stalk that emanates from the AAA+ ring. Recently the first high-resolution structures of the dynein motor domain were published. Here we provide a detailed structural analysis of the six AAA+ domains using our Saccharomyces cerevisiae crystal structure. We describe how structural similarities in the dynein AAA+ domains suggest they share a common evolutionary origin. We analyse how the different AAA+ domains have diverged from each other. We discuss how this is related to the function of dynein as a motor protein and how the AAA+ domains of dynein compare to those of other AAA+ proteins. PMID:24680784
USDA-ARS's Scientific Manuscript database
A novel Babesia bovis gene family encoding proteins with similarities to the Plasmodium 6cys protein family was identified by TBLASTN searches of the Babesia bovis genome using the sequence of the P. falciparum PFS230 protein as query, and was termed Bbo-6cys gene family. The Bbo-cys6 gene family co...
Cloning and analysis of 16 Rab genes from macronuclear DNA of Euplotes octocarinatus.
Zhi, Hui; Wang, Wei; Li, Lingyan; Chai, Baofeng; Sun, Yonghua; Liang, Aihua
2005-08-01
Rab proteins belong to the largest family of the Ras superfamily of small GTPases and play an important role in intracellular vesicular traffic. So far, almost 60 members of the Rab family have been identified in mammalian cells. To further study the diversity and function of Rab proteins in evolution, the unicellular ciliate protozoan Euplotes octocarinatus was used in this study. Rab genes were screened by PCR from macronuclear DNA of E. octocarinatus, and sixteen Rab genes were obtained. They share 87.6-99.5% identity. Highly conserved GTP-binding domains were found, along with some hot regions that diverge sharply among these genes.
Parody, Nuria; Fuertes, Miguel Angel; Alonso, Carlos; Pico de Coaña, Yago
2013-01-01
The polcalcin family is one of the most epidemiologically relevant families of calcium-binding allergens. Polcalcins are potent plant allergens that contain one or several EF-hand motifs and their allergenicity is primarily associated with the Ca(2+)-bound form of the protein. Conformation, stability, as well as IgE recognition of calcium-binding allergens greatly depend on the presence of protein-bound calcium ions. We describe a protocol that uses three techniques (SDS-PAGE, circular dichroism spectroscopy, and ELISA) to describe the effects that calcium has on the structural changes in an allergen and its IgE binding properties.
Ariani, Andrea; Gepts, Paul
2015-10-01
Plant aquaporins are a large and diverse family of water channel proteins that are essential for several physiological processes in living organisms. Numerous studies have linked plant aquaporins with a plethora of processes, such as nutrient acquisition, CO2 transport, plant growth and development, and response to abiotic stresses. However, little is known about this protein family in common bean. Here, we present a genome-wide identification of the aquaporin gene family in common bean (Phaseolus vulgaris L.), a legume crop essential for human nutrition. We identified 41 full-length coding aquaporin sequences in the common bean genome, divided by phylogenetic analysis into five sub-families (PIPs, TIPs, NIPs, SIPs and XIPs). Residues determining substrate specificity of aquaporins (i.e., NPA motifs and ar/R selectivity filter) seem conserved between common bean and other plant species, allowing inference of substrate specificity for these proteins. Thanks to the availability of RNA-sequencing datasets, expression levels in different organs and in leaves of wild and domesticated bean accessions were evaluated. Three aquaporins (PvTIP1;1, PvPIP2;4 and PvPIP1;2) have the overall highest mean expressions, with PvTIP1;1 having the highest expression among all aquaporins. We performed an EST database mining to identify drought-responsive aquaporins in common bean. This analysis showed a significant increase in expression for PvTIP1;1 in drought stress conditions compared to well-watered environments. The pivotal role suggested for PvTIP1;1 in regulating water homeostasis and drought stress response in the common bean should be verified by further field experimentation under drought stress.
Unexpected Role for a Serine/Threonine-Rich Domain in the Candida albicans Iff Protein Family▿†
Boisramé, Anita; Cornu, Amandine; Da Costa, Grégory; Richard, Mathias L.
2011-01-01
Glycosylphosphatidylinositol (GPI)-anchored proteins are an important class of cell wall proteins in Candida albicans because of their localization and their function, even if more than half of them have no characterized homolog in the databases. In this study, we focused on the IFF protein family, investigating their exposure on the cell surface and the sequences that determine their subcellular localization. Protein localization and surface exposure were monitored by the addition of a V5 tag on all members of the family. The data obtained using the complete proteins showed for Iff3 (or -9), Iff5, Iff6, and Iff8 a covalent linkage to the β-1,6-glucan network but, remarkably, showed that Iff2/Hyr3 was linked through disulfide bridges or NaOH-labile bonds. However, since some proteins of the Iff family were undetectable, we designed chimeric constructions using the last 60 amino acids of these proteins to test the localization signal. These constructions showed a β-1,6-glucan linkage for Iff1/Rbr3, Iff2/Hyr3, Iff4 and Iff7/Hyr4 C-terminal–Iff5 fusion proteins, and a membrane localization for the Iff10/Flo9 C terminus-Iff5 fusion protein. Immunofluorescence analyses coupled to these cell fraction data confirmed the importance of the length of the central serine/threonine-rich region for cell surface exposure. Further analysis of the Iff2/Hyr3 linkage to the cell surface showed for the first time that a serine/threonine central region of a GPI-anchored protein may be responsible for the disulfide and the NaOH bonds to the glucan and glycoproteins network and may also override the signal of the proximal ω site region. PMID:21841123
Nepusz, Tamás; Sasidharan, Rajkumar; Paccanaro, Alberto
2010-03-09
An important problem in genomics is the automatic inference of groups of homologous proteins from pairwise sequence similarities. Several approaches have been proposed for this task which are "local" in the sense that they assign a protein to a cluster based only on the distances between that protein and the other proteins in the set. It was shown recently that global methods such as spectral clustering have better performance on a wide variety of datasets. However, currently available implementations of spectral clustering methods mostly consist of a few loosely coupled Matlab scripts that assume a fair amount of familiarity with Matlab programming and hence they are inaccessible for large parts of the research community. SCPS (Spectral Clustering of Protein Sequences) is an efficient and user-friendly implementation of a spectral method for inferring protein families. The method uses only pairwise sequence similarities, and is therefore practical when only sequence information is available. SCPS was tested on difficult sets of proteins whose relationships were extracted from the SCOP database, and its results were extensively compared with those obtained using other popular protein clustering algorithms such as TribeMCL, hierarchical clustering and connected component analysis. We show that SCPS is able to identify many of the family/superfamily relationships correctly and that the quality of the obtained clusters as indicated by their F-scores is consistently better than all the other methods we compared it with. We also demonstrate the scalability of SCPS by clustering the entire SCOP database (14,183 sequences) and the complete genome of the yeast Saccharomyces cerevisiae (6,690 sequences). 
Besides the spectral method, SCPS also implements connected component analysis and hierarchical clustering, it integrates TribeMCL, it provides different cluster quality tools, it can extract human-readable protein descriptions using GI numbers from NCBI, it interfaces with external tools such as BLAST and Cytoscape, and it can produce publication-quality graphical representations of the clusters obtained, thus constituting a comprehensive and effective tool for practical research in computational biology. Source code and precompiled executables for Windows, Linux and Mac OS X are freely available at http://www.paccanarolab.org/software/scps.
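Of the methods named in this abstract, connected component analysis is the simplest to illustrate. The sketch below is not SCPS code and not the spectral method; it is a minimal stand-alone example that groups proteins whenever a chain of pairwise similarities at or above a cutoff links them. The function name, the similarity scale, the cutoff, and the protein labels are all hypothetical.

```python
from collections import defaultdict


def connected_components(similarities, threshold):
    """Group proteins linked by any chain of similarities >= threshold.

    `similarities` maps (protein_a, protein_b) pairs to a score.
    Returns a sorted list of sorted clusters.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in similarities.items():
        root_a, root_b = find(a), find(b)  # also registers both proteins
        if score >= threshold and root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for protein in list(parent):
        groups[find(protein)].add(protein)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical pairwise similarities on a 0-1 scale.
toy = {
    ("P1", "P2"): 0.9,
    ("P2", "P3"): 0.8,
    ("P3", "P4"): 0.1,
    ("P4", "P5"): 0.7,
}
print(connected_components(toy, threshold=0.5))
```

Because a single strong link is enough to merge two groups, this approach depends only on local pairwise scores, which is the kind of limitation the abstract contrasts with global methods such as spectral clustering.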
Caridi, Gianluca; Malaventura, Cristina; Dagnino, Monica; Leonardi, Emanuela; Artifoni, Lina; Ghiggeri, Gian Marco; Tosatto, Silvio C.E.; Murer, Luisa
2010-01-01
Background and objectives: Wilms tumor-suppressor gene-1 (WT1) plays a key role in kidney development and function. WT1 mutations usually occur in exons 8 and 9 and are associated with Denys-Drash, or in intron 9 and are associated with Frasier syndrome. However, overlapping clinical and molecular features have been reported. Few familial cases have been described, with intrafamilial variability. Sporadic cases of WT1 mutations in isolated diffuse mesangial sclerosis or focal segmental glomerulosclerosis have also been reported. Design, setting, participants, & measurements: Molecular analysis of WT1 exons 8 and 9 was carried out in five members on three generations of a family with late-onset isolated proteinuria. The effect of the detected amino acid substitution on WT1 protein's structure was studied by bioinformatics tools. Results: Three family members reached end-stage renal disease in full adulthood. None had genital abnormalities or Wilms tumor. Histologic analysis in two subjects revealed focal segmental glomerulosclerosis. The novel sequence variant c.1208G>A in WT1 exon 9 was identified in all of the affected members of the family. Conclusions: The lack of Wilms tumor or other related phenotypes suggests the expansion of WT1 gene analysis in patients with focal segmental glomerulosclerosis, regardless of age or presence of typical Denys-Drash or Frasier syndrome clinical features. Structural analysis of the mutated protein revealed that the mutation hampers zinc finger-DNA interactions, impairing target gene transcription. This finding opens up new issues about WT1 function in the maintenance of the complex gene network that regulates normal podocyte function. PMID:20150449
Genome-wide identification and analysis of MAPK and MAPKK gene families in Brachypodium distachyon.
Chen, Lihong; Hu, Wei; Tan, Shenglong; Wang, Min; Ma, Zhanbing; Zhou, Shiyi; Deng, Xiaomin; Zhang, Yang; Huang, Chao; Yang, Guangxiao; He, Guangyuan
2012-01-01
MAPK cascades are universal signal transduction modules and play important roles in plant growth, development and in response to a variety of biotic and abiotic stresses. Although MAPKs and MAPKKs have been systematically investigated in several plant species including Arabidopsis, rice and poplar, no systematic analysis has been conducted in the emerging monocot model plant Brachypodium distachyon. In the present study, a total of 16 MAPK genes and 12 MAPKK genes were identified from B. distachyon. An analysis of the genomic evolution showed that both tandem and segmental duplications contributed significantly to the expansion of MAPK and MAPKK families. Evolutionary relationships within subfamilies were supported by exon-intron organizations and the architectures of conserved protein motifs. Synteny analysis between B. distachyon and two other plant species, rice and Arabidopsis, showed that only one homolog of B. distachyon MAPKs was found in the corresponding syntenic blocks of Arabidopsis, while 13 homologs of B. distachyon MAPKs and MAPKKs were found in those of rice, which is consistent with the speciation process of the three species. In addition, several interacting protein pairs between the two families in B. distachyon were found by yeast two-hybrid assay, whereas the orthologs of one pair in Arabidopsis and other plant species were not found to interact with each other. Finally, expression studies of closely related family members among B. distachyon, Arabidopsis and rice showed that even recently duplicated representatives may fulfill different functions and be involved in different signal pathways. Taken together, our data provide a foundation for evolutionary and functional characterization of MAPK and MAPKK gene families in B. distachyon and other plant species to unravel their biological roles.
Cartilage oligomeric matrix protein-deficient mice have normal skeletal development.
Svensson, Liz; Aszódi, Attila; Heinegård, Dick; Hunziker, Ernst B; Reinholt, Finn P; Fässler, Reinhard; Oldberg, Ake
2002-06-01
Cartilage oligomeric matrix protein (COMP) belongs to the thrombospondin family and is a homopentamer primarily expressed in cartilage. Mutations in the COMP gene result in the autosomal dominant chondrodysplasias pseudoachondroplasia (PSACH) and some types of multiple epiphyseal dysplasia (MED), which are characterized by mild to severe short-limb dwarfism and early-onset osteoarthritis. We have generated COMP-null mice to study the role of COMP in vivo. These mice show no anatomical, histological, or ultrastructural abnormalities and show none of the clinical signs of PSACH or MED. Northern blot analysis and immunohistochemical analysis of cartilage indicate that the lack of COMP is not compensated for by any other member of the thrombospondin family. The results also show that the phenotype in PSACH/MED cartilage disorders is not caused by the reduced amount of COMP.
Álvarez-Cervantes, Jorge; Díaz-Godínez, Gerardo; Mercado-Flores, Yuridia; Gupta, Vijai Kumar; Anducho-Reyes, Miguel Angel
2016-01-01
In this paper, the amino acid sequence of the β-xylanase SRXL1 of Sporisorium reilianum, a pathogenic fungus of maize, was used as a model protein to determine its phylogenetic relationship with other xylanases of Ascomycetes and Basidiomycetes; the information obtained made it possible to establish a hypothesis of monophyly and of biological role. A total of 84 amino acid sequences of β-xylanases obtained from the GenBank database were used. Higher-level grouping analysis in the Pfam database showed that the proteins under study were classified into the GH10 and GH11 families, based on highly conserved amino acid regions (233–318 and 180–193, respectively) in which glutamate residues are responsible for catalysis. PMID:27040368
Long, Xigui; Huang, Yanru; Tan, Hu; Li, Zhuo; Zhang, Rui; Linpeng, Siyuan; Lv, Weigang; Cao, Yingxi; Li, Haoxian; Liang, Desheng; Wu, Lingqian
2018-04-26
To detect the underlying pathogenesis of congenital cataract in a four-generation Chinese family. Whole-exome sequencing (WES) of family members (III:4, IV:4, and IV:6) was performed. Sanger sequencing and bioinformatics analysis were subsequently conducted. Full-length WT-MIP or K228fs-MIP fused to HA markers at the N-terminal was transfected into HeLa cells. Next, quantitative real-time PCR, western blotting and immunofluorescence confocal laser scanning were performed. Nonsyndromic cataracts in male patients had onset by 1 year of age, earlier than in female patients, who exhibited onset in adulthood. A novel c.682_683delAA (p.K228fs230X) mutation in major intrinsic protein (MIP) cosegregated with the cataract phenotype. Bioinformatics analysis predicted an increase in the instability index and unfolded states for truncated MIP. The mRNA transcription level of K228fs-MIP was reduced compared with that of WT-MIP, and K228fs-MIP protein expression was also lower than that of WT-MIP. Immunofluorescence images showed that WT-MIP principally localized to the plasma membrane, whereas the mutant protein was trapped in the cytoplasm. Our study generated genetic and primary functional evidence for a novel c.682_683delAA mutation in MIP that expands the variant spectrum of MIP and helps us better understand the molecular basis of cataract.
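The variant name c.682_683delAA (p.K228fs230X) links a coding-sequence position to a codon number. As a small worked check, assuming standard 1-based coding-sequence numbering with codon 1 covering nucleotides 1 to 3, the sketch below maps the two deleted positions to their codon. The helper names are hypothetical and the code is not taken from the study.

```python
def codon_number(cds_position: int) -> int:
    """Return the 1-based codon containing a 1-based coding-sequence position."""
    if cds_position < 1:
        raise ValueError("coding-sequence positions are 1-based")
    return (cds_position - 1) // 3 + 1


def position_in_codon(cds_position: int) -> int:
    """Return 1, 2 or 3: the place of the nucleotide within its codon."""
    if cds_position < 1:
        raise ValueError("coding-sequence positions are 1-based")
    return (cds_position - 1) % 3 + 1


deleted = [682, 683]
for pos in deleted:
    print(pos, codon_number(pos), position_in_codon(pos))

# A deletion whose length is not a multiple of 3 shifts the reading frame.
print(len(deleted) % 3 != 0)
```

Both deleted nucleotides fall in codon 228, at its first and second positions, which agrees with the K228 frameshift named in the abstract.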
Isolation, structural analysis, and expression characteristics of the maize TIFY gene family.
Zhang, Zhongbao; Li, Xianglong; Yu, Rong; Han, Meng; Wu, Zhongyi
2015-10-01
TIFY, previously known as ZIM, comprises a plant-specific family annotated as transcription factors that might play important roles in stress response. Although TIFY proteins have been reported in Arabidopsis and rice, a comprehensive and systematic survey of ZmTIFY genes has not yet been conducted. To investigate the functions of ZmTIFY genes in this family, we isolated and characterized 30 ZmTIFY (1 TIFY, 3 ZML, and 26 JAZ) genes in an analysis of the maize (Zea mays L.) genome in this study. The 30 ZmTIFY genes were distributed over eight chromosomes. Multiple alignment and motif display results indicated that all ZmTIFY proteins share two conserved TIFY and Jas domains. Phylogenetic analysis revealed that the ZmTIFY family could be divided into two groups. Putative cis-elements, involved in abiotic stress response, phytohormones, pollen grain, and seed development, were detected in the promoters of maize TIFY genes. Microarray data showed that the ZmTIFY genes had tissue-specific expression patterns in various maize developmental stages and in response to biotic and abiotic stresses. The results indicated that ZmTIFY4, 5, 8, 26, and 28 were induced, while ZmTIFY16, 13, 24, 27, 18, and 30 were suppressed, by drought stress in the maize inbred lines Han21 and Ye478. ZmTIFY1, 19, and 28 were upregulated after infection by three pathogens, whereas ZmTIFY4, 13, 21, 23, 24, and 26 were suppressed. These results indicate that the ZmTIFY family may have vital roles in response to abiotic and biotic stresses. The data presented in this work provide vital clues for further investigating the functions of the genes in the ZmTIFY family.
Palacios, Gustavo; Forrester, Naomi L; Savji, Nazir; Travassos da Rosa, Amelia P A; Guzman, Hilda; Detoy, Kelly; Popov, Vsevolod L; Walker, Peter J; Lipkin, W Ian; Vasilakis, Nikos; Tesh, Robert B
2013-07-01
Farmington virus (FARV) is a rhabdovirus that was isolated from a wild bird during an outbreak of epizootic eastern equine encephalitis on a pheasant farm in Connecticut, USA. Analysis of the nearly complete genome sequence of the prototype CT AN 114 strain indicates that it encodes the five canonical rhabdovirus structural proteins (N, P, M, G and L) with alternative ORFs (> 180 nt) in the N and G genes. Phenotypic and genetic characterization of FARV has confirmed that it is a novel rhabdovirus and probably represents a new species within the family Rhabdoviridae. In sum, our analysis indicates that FARV represents a new species within the family Rhabdoviridae.
Mechanical Network in Titin Immunoglobulin from Force Distribution Analysis
Wilmanns, Matthias; Gräter, Frauke
2009-01-01
The role of mechanical force in cellular processes is increasingly revealed by single molecule experiments and simulations of force-induced transitions in proteins. How the applied force propagates within proteins determines their mechanical behavior yet remains largely unknown. We present a new method based on molecular dynamics simulations to disclose the distribution of strain in protein structures, here for the newly determined high-resolution crystal structure of I27, a titin immunoglobulin (IG) domain. We obtain a sparse, spatially connected, and highly anisotropic mechanical network. This allows us to detect load-bearing motifs composed of interstrand hydrogen bonds and hydrophobic core interactions, including parts distal to the site to which force was applied. The role of the force distribution pattern for mechanical stability is tested by in silico unfolding of I27 mutants. We then compare the observed force pattern to the sparse network of coevolved residues found in this family. We find a remarkable overlap, suggesting the force distribution to reflect constraints for the evolutionary design of mechanical resistance in the IG family. The force distribution analysis provides a molecular interpretation of coevolution and opens the road to the study of the mechanism of signal propagation in proteins in general. PMID:19282960
Pruyne, David
2016-01-01
Formins are a widespread family of eukaryotic cytoskeleton-organizing proteins. Many species encode multiple formin isoforms, and for animals, much of this reflects the presence of multiple conserved subtypes. Earlier phylogenetic analyses identified seven major formin subtypes in animals (DAAM, DIAPH, FHOD, FMN, FMNL, INF, and GRID2IP/delphilin), but left a handful of formins, particularly from nematodes, unassigned. In this new analysis drawing from genomic data from a wider range of taxa, nine formin subtypes are identified that encompass all the animal formins analyzed here. Included in this analysis are Multiple Wing Hairs proteins (MWH), which bear homology to formin N-terminal domains. Originally identified in Drosophila melanogaster and other arthropods, MWH-related proteins are also identified here in some nematodes (including Caenorhabditis elegans), and are shown to be related to a novel MWH-related formin (MWHF) subtype. One surprising result of this work is the discovery that a family of pleckstrin homology domain-containing formins (PHCFs) is represented in many vertebrates, but is strikingly absent from placental mammals. Consistent with a relatively recent loss of this formin, the human genome retains fragments of a defunct homologous formin gene.
Hernández Torres, Jorge; Papandreou, Nikolaos; Chomilier, Jacques
2009-05-01
The co-chaperone Hop [heat shock protein (HSP) organising protein] is known to bind both Hsp70 and Hsp90. Hop comprises three repeats of a tetratricopeptide repeat (TPR) domain, each consisting of three TPR motifs. The first and last TPR domains are followed by a domain containing several dipeptide (DP) repeats called the DP domain. These analyses suggest that the hop genes result from successive recombination events of an ancestral TPR-DP module. From a hydrophobic cluster analysis of homologous Hop protein sequences derived from gene families, we can postulate that shifts in the open reading frames are at the origin of the present sequences. Moreover, these shifts can be related to the presence or absence of biological function. We propose to extend the family of Hop co-chaperones into the kingdom of bacteria, as several structurally related genes have been identified by hydrophobic cluster analysis. We also provide evidence of common structural characteristics between hop and hip genes, suggesting a shared precursor of ancestral TPR-DP domains.
Morais do Amaral, Alexandre; Antoniw, John; Rudd, Jason J.; Hammond-Kosack, Kim E.
2012-01-01
The Dothideomycete fungus Mycosphaerella graminicola is the causal agent of Septoria tritici blotch, a devastating disease of wheat leaves that causes dramatic decreases in yield. Infection involves an initial extended period of symptomless intercellular colonisation prior to the development of visible necrotic disease lesions. Previous functional genomics and gene expression profiling studies have implicated the production of secreted virulence effector proteins as key facilitators of the initial symptomless growth phase. In order to identify additional candidate virulence effectors, we re-analysed and catalogued the predicted protein secretome of M. graminicola isolate IPO323, which is currently regarded as the reference strain for this species. We combined several bioinformatic approaches in order to increase the probability of identifying truly secreted proteins with either a predicted enzymatic function or an as yet unknown function. An initial secretome of 970 proteins was predicted, whilst further stringent selection criteria predicted 492 proteins. Of these, 321 possess some functional annotation, the composition of which may reflect the strictly intercellular growth habit of this pathogen, leaving 171 with no functional annotation. This analysis identified a protein family encoding secreted peroxidases/chloroperoxidases (PF01328) which is expanded within all members of the family Mycosphaerellaceae. Further analyses were done on the non-annotated proteins for size and cysteine content (effector protein hallmarks), and then by studying the distribution of homologues in 17 other sequenced Dothideomycete fungi within an overall total of 91 predicted proteomes from fungal, oomycete and nematode species. This detailed M. graminicola secretome analysis provides the basis for further functional and comparative genomics studies. PMID:23236356
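The abstract describes screening non-annotated secreted proteins for size and cysteine content. A minimal sketch of those two measurements is shown below, together with a check that the reported counts add up. The function name and the toy sequence are hypothetical, and no cutoffs are applied because the abstract does not state the thresholds the authors used.

```python
def size_and_cysteine_content(sequence: str):
    """Return (length, cysteine count, cysteine fraction) for a protein sequence."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    cysteines = seq.count("C")
    return len(seq), cysteines, cysteines / len(seq)


# Toy sequence (hypothetical, not an M. graminicola protein).
length, n_cys, fraction = size_and_cysteine_content("MKCLLCAACGSC")
print(length, n_cys, round(fraction, 3))

# Counts reported in the abstract: the stringent secretome of 492 proteins
# splits into 321 with some functional annotation and 171 without.
print(321 + 171 == 492)
```

Ranking the 171 non-annotated proteins by length and cysteine fraction would be one simple way to prioritise candidates, though the actual selection criteria are not given in the abstract.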
Song, Aiping; Li, Peiling; Xin, Jingjing; Chen, Sumei; Zhao, Kunkun; Wu, Dan; Fan, Qingqing; Gao, Tianwei; Chen, Fadi; Guan, Zhiyong
2016-01-01
The homeodomain-leucine zipper (HD-Zip) transcription factor family is a key transcription factor family and unique to the plant kingdom. It consists of a homeodomain and a leucine zipper that serve in combination as a dimerization motif. The family can be classified into four subfamilies, and these subfamilies participate in the development of hormones and mediation of hormone action and are involved in plant responses to environmental conditions. However, limited information on this gene family is available for the important chrysanthemum ornamental species (Chrysanthemum morifolium). Here, we characterized 17 chrysanthemum HD-Zip genes based on transcriptome sequences. Phylogenetic analyses revealed that 17 CmHB genes were distributed in the HD-Zip subfamilies I and II and identified two pairs of putative orthologous proteins in Arabidopsis and chrysanthemum and four pairs of paralogous proteins in chrysanthemum. The software MEME was used to identify 7 putative motifs with E values less than 1e-3 in the chrysanthemum HD-Zip factors, and they can be clearly classified into two groups based on the composition of the motifs. A bioinformatics analysis predicted that 8 CmHB genes could be targeted by 10 miRNA families, and the expression of these 17 genes in response to phytohormone treatments and abiotic stresses was characterized. The results presented here will promote research on the various functions of the HD-Zip gene family members in plant hormones and stress responses. PMID:27196930
HOP family plays a major role in long-term acquired thermotolerance in Arabidopsis.
Fernández-Bautista, Nuria; Fernández-Calvino, Lourdes; Muñoz, Alfonso; Toribio, René; Mock, Hans P; Castellano, M Mar
2018-05-08
HSP70-HSP90 organizing protein (HOP) is a family of cytosolic cochaperones whose molecular role in thermotolerance is quite unknown in eukaryotes and unexplored in plants. In this article, we describe that the three members of the AtHOP family display different induction patterns under heat, with HOP3 being highly regulated during the challenge and the attenuation period. Although HOP3 is the most heat-regulated member, the analysis of the hop1 hop2 hop3 triple mutant demonstrates that the three HOP proteins act redundantly to promote long-term acquired thermotolerance in Arabidopsis. HOPs interact strongly with HSP90, and part of the bulk of HOPs shuttles from the cytoplasm to the nuclei and to cytoplasmic foci during the challenge. RNAseq analyses demonstrate that, although the expression of the Hsf targets is not generally affected, the transcriptional response to heat is drastically altered during the acclimation period in the hop1 hop2 hop3 triple mutant. This mutant also displays an unusually high accumulation of insoluble and ubiquitinated proteins under heat, which highlights the additional role of HOP in protein quality control. These data reveal that the HOP family is involved in different aspects of the response to heat, affecting the plant capacity to acclimate to high temperatures for long periods. © 2018 John Wiley & Sons Ltd.
Howe, Daniel K.; Gaji, Rajshekhar Y.; Mroz-Barrett, Meaghan; Gubbels, Marc-Jan; Striepen, Boris; Stamper, Shelby
2005-01-01
Sarcocystis neurona is a member of the Apicomplexa that causes myelitis and encephalitis in horses but normally cycles between the opossum and small mammals. Analysis of an S. neurona expressed sequence tag (EST) database revealed four paralogous proteins that exhibit clear homology to the family of surface antigens (SAGs) and SAG-related sequences of Toxoplasma gondii. The primary peptide sequences of the S. neurona proteins are consistent with the two-domain structure that has been described for the T. gondii SAGs, and each was predicted to have an amino-terminal signal peptide and a carboxyl-terminal glycolipid anchor addition site, suggesting surface localization. All four proteins were confirmed to be membrane associated and displayed on the surface of S. neurona merozoites. Due to their surface localization and homology to T. gondii surface antigens, these S. neurona proteins were designated SnSAG1, SnSAG2, SnSAG3, and SnSAG4. Consistent with their homology, the SnSAGs elicited a robust immune response in infected and immunized animals, and their conserved structure further suggests that the SnSAGs similarly serve as adhesins for attachment to host cells. Whether the S. neurona SAG family is as extensive as the T. gondii SAG family remains unresolved, but it is probable that additional SnSAGs will be revealed as more S. neurona ESTs are generated. The existence of an SnSAG family in S. neurona indicates that expression of multiple related surface antigens is not unique to the ubiquitous organism T. gondii. Instead, the SAG gene family is a common trait that presumably has an essential, conserved function(s). PMID:15664946
Basel‐Vanagaite, L; Attia, R; Yahav, M; Ferland, R J; Anteki, L; Walsh, C A; Olender, T; Straussberg, R; Magal, N; Taub, E; Drasinover, V; Alkelai, A; Bercovich, D; Rechavi, G; Simon, A J; Shohat, M
2006-01-01
Background: The molecular basis of autosomal recessive non‐syndromic mental retardation (NSMR) is poorly understood, mostly owing to heterogeneity and absence of clinical criteria for grouping families for linkage analysis. Only two autosomal genes, the PRSS12 gene on chromosome 4q26 and the CRBN on chromosome 3p26, have been shown to cause autosomal recessive NSMR, each gene in only one family. Objective: To identify the gene causing autosomal recessive NSMR on chromosome 19p13.12. Results: The candidate region established by homozygosity mapping was narrowed down from 2.4 Mb to 0.9 Mb on chromosome 19p13.12. A protein truncating mutation was identified in the gene CC2D1A in nine consanguineous families with severe autosomal recessive NSMR. The absence of the wild type protein in the lymphoblastoid cells of the patients was confirmed. CC2D1A is a member of a previously uncharacterised gene family that carries two conserved motifs, a C2 domain and a DM14 domain. The C2 domain is found in proteins which function in calcium dependent phospholipid binding; the DM14 domain is unique to the CC2D1A protein family and its role is unknown. CC2D1A is a putative signal transducer participating in positive regulation of I‐κB kinase/NFκB cascade. Expression of CC2D1A mRNA was shown in the embryonic ventricular zone and developing cortical plate in staged mouse embryos, persisting into adulthood, with highest expression in the cerebral cortex and hippocampus. Conclusions: A previously unknown signal transduction pathway is important in human cognitive development. PMID:16033914
2014-01-01
Background The Maternally expressed gene (Meg) family is a locally-duplicated gene family of maize which encodes cysteine-rich proteins (CRPs). The founding member of the family, Meg1, is required for normal development of the basal endosperm transfer cell layer (BETL) and is involved in the allocation of maternal nutrients to growing seeds. Despite the important roles of Meg1 in maize seed development, the evolutionary history of the Meg cluster and the activities of the duplicate genes are not understood. Results In maize, the Meg gene cluster resides in a 2.3 Mb-long genomic region that exhibits many features of non-centromeric heterochromatin. Using phylogenetic reconstruction and syntenic alignments, we identified the pedigree of the Meg family, in which 11 of its 13 members arose in maize after allotetraploidization ~4.8 mya. Phylogenetic and population-genetic analyses identified possible signatures suggesting recent positive selection in Meg homologs. Structural analyses of the Meg proteins indicated potentially adaptive changes in secondary structure from α-helix to β-strand during the expansion. Transcriptomic analysis of the maize endosperm indicated that 6 Meg genes are selectively activated in the BETL, and younger Meg genes are more active than older ones. In endosperms from B73 by Mo17 reciprocal crosses, most Meg genes did not display parent-specific expression patterns. Conclusions Recently-duplicated Meg genes have different protein secondary structures, and their expressions in the BETL dominate over those of older members. Together with the signs of positive selections in the young Meg genes, these results suggest that the expansion of the Meg family involves potentially adaptive transitions in which new members with novel functions prevailed over older members. PMID:25084677
Botha, M; Pesce, E-R; Blatch, G L
2007-01-01
Extensive structural and functional remodelling of Plasmodium falciparum (malaria)-infected erythrocytes follows the export of a range of proteins of parasite origin (exportome) across the parasitophorous vacuole into the host erythrocyte. The genome of P. falciparum encodes a diverse chaperone complement including at least 43 members of the heat shock protein 40kDa (Hsp40) family, and six members of the heat shock protein 70kDa (Hsp70) family. Nearly half of the Hsp40 proteins of P. falciparum are predicted to contain a PEXEL/HT (Plasmodium export element/host targeting signal) sequence motif, and hence are likely to be part of the exportome. In this review we critically evaluate the classification, sequence similarity and clustering, and possible interactors of the P. falciparum Hsp40 chaperone machinery. In addition to the types I, II and III Hsp40 proteins all exhibiting the signature J-domain, the P. falciparum genome also encodes a number of specialized Hsp40 proteins with a J-like domain, which we have categorized as type IV Hsp40 proteins. Analysis of the potential P. falciparum Hsp40 protein interaction network revealed connections predominantly with cytoskeletal and membrane proteins, transcriptional machinery, DNA repair and replication machinery, translational machinery, the proteasome and proteolytic enzymes, and enzymes involved in cellular physiology. Comparison of the Hsp40 proteins of P. falciparum to those of other apicomplexa reveals that most of the proteins (especially the PEXEL/HT-containing proteins) are unique to P. falciparum. Furthermore, very few of the P. falciparum Hsp40 proteins have human homologs, except for those proteins implicated in fundamental biological processes. Our analysis suggests that P. falciparum has evolved an expanded and specialized Hsp40 protein machinery to enable it successfully to invade and remodel the human erythrocyte, and we propose a model in which these proteins are involved in chaperone-mediated translocation, folding, assembly and regulation of parasite and host proteins.
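As a rough illustration of how a PEXEL/HT-like motif might be screened for in a protein sequence, the sketch below scans for the commonly cited simplified consensus R-x-L-x-[E/Q/D]. The consensus pattern, the function name `find_pexel_like`, and the toy sequence are assumptions made for illustration and are not taken from the review; real export predictions also weigh the signal sequence and the position of the motif, which this sketch ignores.

```python
import re

# Simplified PEXEL/HT-like consensus R-x-L-x-[E/Q/D] (an assumption for
# illustration; the abstract above does not state the pattern).
PEXEL_PATTERN = re.compile(r"R.L.[EQD]")


def find_pexel_like(seq):
    """Return (0-based start, matched pentamer) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in PEXEL_PATTERN.finditer(seq.upper())]


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real P. falciparum protein.
    toy = "MKKSNRSLAEHNNKK"
    print(find_pexel_like(toy))
```

A match found this way would only flag a candidate; it says nothing about whether the protein is actually exported.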
Van Holle, Sofie; Rougé, Pierre; Van Damme, Els J M
2017-03-01
The Nictaba family groups all proteins that show homology to Nictaba, the tobacco lectin. So far, Nictaba and an Arabidopsis thaliana homologue have been shown to be implicated in the plant stress response. The availability of more than 50 sequenced plant genomes provided the opportunity for a genome-wide identification of Nictaba-like genes in 15 species, representing members of the Fabaceae, Poaceae, Solanaceae, Musaceae, Arecaceae, Malvaceae and Rubiaceae. Additionally, phylogenetic relationships between the different species were explored. Furthermore, this study included domain organization analysis, searching for orthologous genes in the legume family and transcript profiling of the Nictaba-like lectin genes in soybean. Using a combination of BLASTp, InterPro analysis and hidden Markov models, the genomes of Medicago truncatula, Cicer arietinum, Lotus japonicus, Glycine max, Cajanus cajan, Phaseolus vulgaris, Theobroma cacao, Solanum lycopersicum, Solanum tuberosum, Coffea canephora, Oryza sativa, Zea mays, Sorghum bicolor, Musa acuminata and Elaeis guineensis were searched for Nictaba-like genes. Phylogenetic analysis was performed using RAxML and additional protein domains in the Nictaba-like sequences were identified using InterPro. Expression analysis of the soybean Nictaba-like genes was investigated using microarray data. Nictaba-like genes were identified in all studied species and analysis of the duplication events demonstrated that both tandem and segmental duplication contributed to the expansion of the Nictaba gene family in angiosperms. The single-domain Nictaba protein and the multi-domain F-box Nictaba architectures are ubiquitous among all analysed species and microarray analysis revealed differential expression patterns for all soybean Nictaba-like genes.
Taken together, the comparative genomics data contributes to our understanding of the Nictaba-like gene family in species for which the occurrence of Nictaba domains had not yet been investigated. Given the ubiquitous nature of these genes, they have probably acquired new functions over time and are expected to take on various roles in plant development and defence. © The Author 2017. Published by Oxford University Press on behalf of the Annals of Botany Company. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Jawallapersand, Poojah; Mashele, Samson Sitheni; Kovačič, Lidija; Stojan, Jure; Komel, Radovan; Pakala, Suresh Babu; Kraševec, Nada; Syed, Khajamohiddin
2014-01-01
Cytochrome P450 monooxygenases (CYPs/P450s) are heme-thiolate proteins whose role as a drug target against pathogenic microbes has been explored because of their stereo- and regio-specific oxidation activity. We aimed to assess the CYP53 family's role as a common alternative drug target against animal (including human) and plant pathogenic fungi and its role in fungal-mediated wood degradation. Genome-wide analysis of fungal species revealed the presence of CYP53 members in ascomycetes and basidiomycetes. Basidiomycetes had a higher number of CYP53 members in their genomes than ascomycetes. Only two CYP53 subfamilies were found in ascomycetes and six subfamilies in basidiomycetes, suggesting that during the divergence of phyla ascomycetes lost CYP53 P450s. According to phylogenetic and gene-structure analysis, enrichment of CYP53 P450s in basidiomycetes occurred due to the extensive duplication of CYP53 P450s in their genomes. Numerous amino acids (103) were found to be conserved in the ascomycetes CYP53 P450s, against only seven in basidiomycetes CYP53 P450s. 3D-modelling and active-site cavity mapping data revealed that the ascomycetes CYP53 P450s have a highly conserved protein structure whereby 78% amino acids in the active-site cavity were found to be conserved. Because of this rigid nature of ascomycetes CYP53 P450s' active site cavity, any inhibitor directed against this P450 family can serve as a common anti-fungal drug target, particularly toward pathogenic ascomycetes. The dynamic nature of basidiomycetes CYP53 P450s at a gene and protein level indicates that these P450s are destined to acquire novel functions. Functional analysis of CYP53 P450s strongly supported our hypothesis that the ascomycetes CYP53 P450s ability is limited for detoxification of toxic molecules, whereas basidiomycetes CYP53 P450s play an additional role, i.e. involvement in degradation of wood and its derived components. 
This study is the first report on genome-wide comparative structural (gene and protein structure-level) and evolutionary analysis of a fungal P450 family.
Secretome analysis of rat osteoblasts during icariin treatment induced osteogenesis
Qian, Weiqing; Su, Yan; Zhang, Yajie; Yao, Nianwei; Gu, Nin; Zhang, Xu; Yin, Hong
2018-01-01
Osteoporosis is a serious public health problem and icariin (ICA) is the active component of the Epimedium sagittatum, a traditional Chinese medicinal herb. The present study aimed to investigate the effects and underlying mechanisms of ICA as a potential therapy for osteoporosis. Calvaria osteoblasts were isolated from newborn rats and treated with ICA. Cell viability, apoptosis, alkaline phosphatase activity and calcium deposition were analyzed. Bioinformatics analyses were performed to identify differentially expressed proteins (DEPs) in response to ICA treatment. Western blot analysis was performed to validate the expression of DEPs. ICA administration promoted osteoblast viability, alkaline phosphatase activity, calcium deposition and inhibited osteoblast apoptosis. Secretome analysis of ICA-treated cells was performed using two-dimensional gel electrophoresis and matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. A total of 56 DEPs were identified, including serpin family F member 1 (PEDF), protein disulfide isomerase family A, member 3 (PDIA3), nuclear protein, co-activator of histone transcription (NPAT), c-Myc and heat shock protein 70 (HSP70). These proteins were associated with signaling pathways, including Fas and p53. Bioinformatics and western blot analyses confirmed that the expression levels of the six DEPs were upregulated following ICA treatment. These genes may be directly or indirectly involved in ICA-mediated osteogenic differentiation and osteogenesis. It was demonstrated that ICA treatment promoted osteogenesis by modulating the expression of PEDF, PDIA3, NPAT and HSP70 through signaling pathways, including Fas and p53. PMID:29532868
Zheng, Jianhua; Ren, Xianwen; Wei, Candong; Yang, Jian; Hu, Yongfeng; Liu, Liguo; Xu, Xingye; Wang, Jin; Jin, Qi
2013-01-01
Tuberculosis (TB) is an infectious bacterial disease that causes morbidity and mortality, especially in developing countries. Although its efficacy against TB has displayed a high degree of variability (0%–80%) in different trials, Mycobacterium bovis bacillus Calmette-Guérin (BCG) has been recognized as an important weapon for preventing TB worldwide for over 80 years. Because secreted proteins often play vital roles in the interaction between bacteria and host cells, the secretome of mycobacteria is considered to be an attractive reservoir of potential candidate antigens for the development of novel vaccines and diagnostic reagents. In this study, we performed a proteomic analysis of BCG culture filtrate proteins using SDS-PAGE and high-resolution Fourier transform mass spectrometry. In total, 239 proteins (1555 unique peptides) were identified, including 185 secreted proteins or lipoproteins. Furthermore, 17 novel protein products not annotated in the BCG database were detected and validated by means of RT-PCR at the transcriptional level. Additionally, the translational start sites of 52 proteins were confirmed, and 22 proteins were validated through extension of the translational start sites based on N-terminus-derived peptides. There are 103 secreted proteins that have not been reported in previous studies on the mycobacterial secretome and are unique to our study. The physicochemical characteristics of the secreted proteins were determined. Major components from the culture supernatant, including low-molecular-weight antigens, lipoproteins, Pro-Glu and Pro-Pro-Glu family proteins, and Mce family proteins, are discussed; some components represent potential predominant antigens in the humoral and cellular immune responses. PMID:23616670
Dittmer, Neal T; Hiromasa, Yasuaki; Tomich, John M; Lu, Nanyan; Beeman, Richard W; Kramer, Karl J; Kanost, Michael R
2012-01-01
The insect cuticle is a composite biomaterial made up primarily of chitin and proteins. The physical properties of the cuticle can vary greatly from hard and rigid to soft and flexible. Understanding how different cuticle types are assembled can aid in the development of novel biomimetic materials for use in medicine and technology. Toward this goal, we have taken a combined proteomics and transcriptomics approach with the red flour beetle, Tribolium castaneum, to examine the protein and gene expression profiles of the elytra and hindwings, appendages that contain rigid and soft cuticles, respectively. Two-dimensional gel electrophoresis analysis revealed distinct differences in the protein profiles between elytra and hindwings, with four highly abundant proteins dominating the elytral cuticle extract. MALDI/TOF mass spectrometry identified 19 proteins homologous to known or hypothesized cuticular proteins (CPs), including a novel low complexity protein enriched in charged residues. Microarray analysis identified 372 genes with a 10-fold or greater difference in transcript levels between elytra and hindwings. CP genes with higher expression in the elytra belonged to the Rebers and Riddiford family (CPR) type 2, or cuticular proteins of low complexity (CPLC) enriched in glycine or proline. In contrast, a majority of the CP genes with higher expression in hindwings were classified as CPR type 1, cuticular proteins analogous to peritrophins (CPAP), or members of the Tweedle family. This research shows that the elytra and hindwings, representatives of rigid and soft cuticles, have different protein and gene expression profiles for structural proteins that may influence the mechanical properties of these cuticles.
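The 10-fold criterion mentioned above can be sketched as a simple ratio filter. This is a minimal sketch and not the authors' pipeline: the function name `fold_change_hits`, the gene labels, and the expression values are all hypothetical, and real microarray analyses typically add normalization and statistical testing that are omitted here.

```python
def fold_change_hits(elytra, hindwing, threshold=10.0):
    """Return {gene: elytra/hindwing ratio} for genes differing by >= threshold-fold.

    Both arguments map gene names to positive expression values; genes missing
    from either tissue, or with non-positive values, are skipped.
    """
    hits = {}
    for gene in elytra.keys() & hindwing.keys():
        e, h = elytra[gene], hindwing[gene]
        if e <= 0 or h <= 0:
            continue
        ratio = e / h
        # Keep genes that are higher in elytra OR higher in hindwings.
        if ratio >= threshold or ratio <= 1.0 / threshold:
            hits[gene] = ratio
    return hits


if __name__ == "__main__":
    # Hypothetical expression values, for illustration only.
    elytra = {"geneA": 500.0, "geneB": 40.0, "geneC": 12.0}
    hindwing = {"geneA": 25.0, "geneB": 35.0, "geneC": 240.0}
    print(sorted(fold_change_hits(elytra, hindwing)))
```

Ratios above the threshold mark genes with higher expression in elytra, and ratios below its reciprocal mark genes with higher expression in hindwings.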
Proteomic analysis and food-grade enzymes of Moringa oleifera Lam. flower.
Shi, Yanan; Wang, Xuefeng; Huang, Aixiang
2018-08-01
Moringa oleifera Lam. flowers are rich in protein and functional nutrients. Although the species has been studied extensively, no proteomic information has been available for it. Shotgun 2DLC-MS/MS proteomic analysis of total flower protein identified 9443 peptides corresponding to 4004 high-confidence proteins using Proteome Discoverer™ Software 2.1. Most of these proteins had molecular masses between 40 and 70 kDa. Gene Ontology (GO) analysis indicated that the largest categories were cytoplasm (72.7%), catalytic activity (61.5%) and macromolecule metabolism (43.7%), and KEGG analysis revealed that the largest group, 129 proteins, was involved in the ribosome pathway, which directs protein synthesis (translation). Moreover, a number of commercially important food-grade enzymes were annotated: 261 proteins were annotated as carbohydrate-active enzymes, 16 as proteases, and 22 were assigned to the citrate cycle; the top proteins were assigned to the GH family, cysteine synthase and serine/threonine-protein phosphatase. These enzymes indicate that the flower is a new source with potential use in the fermentation and brewing industry, in fruit and vegetable storage, and in the development of functional peptides. Copyright © 2018 Elsevier B.V. All rights reserved.
Comprehensive comparison of two protein family of P-ATPases (13A1 and 13A3) in insects.
Seddigh, Samin
2017-06-01
The P-type ATPases (P-ATPases) are present in all living cells, where they mediate ion transport across membranes at the expense of ATP hydrolysis. The ions transported by these pumps include protons, calcium, sodium, potassium, and heavy metals such as manganese, iron, copper, and zinc. Maintenance of the proper gradients for essential ions across cellular membranes makes P-ATPases crucial for cell survival. In this study, the characteristics of two families of P-ATPases, P-ATPase 13A1 and P-ATPase 13A3, were compared in insect species from different orders. According to the conserved motifs found with MEME, nine motifs were shared by insects of the 13A1 family but eight in the 13A3 family. Seven different insect species from the 13A1 family and five from the 13A3 family were selected as representative samples for functional and structural analyses. The structural and functional analyses were performed with the ProtParam, SOPMA, SignalP 4.1, TMHMM 2.0, ProtScale and ProDom tools in the ExPASy database. The tertiary structures of the Bombus terrestris proteins, as a sample of each family, were predicted by the Phyre2 and TM-score servers and their similarities were verified by the SuperPose server. The tertiary structures were predicted via the "c3b9bA" model (PDB Accession Code: 3B9B) in the P-ATPase 13A1 family and the "c2zxeA" model (PDB Accession Code: 2ZXE) in the P-ATPase 13A3 family. A phylogenetic tree was constructed with MEGA 6.06 software using the Neighbor-joining method. According to the results, there was high identity between the P-ATPase families, suggesting that they derive from a common ancestor, although they belonged to separate groups. In protein-protein interaction analysis by STRING 10.0, six common enriched KEGG pathways were identified in B. terrestris in both families. The obtained data provide a background for bioinformatic studies of the function and evolution of other insects and organisms. Copyright © 2017 Elsevier Ltd. All rights reserved.
Romá-Mateo, Carlos; Sacristán-Reviriego, Almudena; Beresford, Nicola J; Caparrós-Martín, José Antonio; Culiáñez-Macià, Francisco A; Martín, Humberto; Molina, María; Tabernero, Lydia; Pulido, Rafael
2011-04-01
Dual-specificity phosphatases (DSPs) constitute a large protein tyrosine phosphatase (PTP) family, with examples in distant evolutive phyla. PFA-DSPs (Plant and Fungi Atypical DSPs) are a group of atypical DSPs present in plants, fungi, kinetoplastids, and slime molds, the members of which share structural similarity with atypical- and lipid phosphatase DSPs from mammals. The analysis of the PFA-DSPs from the plant Arabidopsis thaliana (AtPFA-DSPs) showed differential tissue mRNA expression, substrate specificity, and catalytic activity for these proteins, suggesting different functional roles among plant PFA-DSPs. Bioinformatic analysis revealed the existence of novel PFA-DSP-related proteins in fungi (Oca1, Oca2, Oca4 and Oca6 in Saccharomyces cerevisiae) and protozoa, which were segregated from plant PFA-DSPs. The closest yeast homolog for these proteins was the PFA-DSP from S. cerevisiae ScPFA-DSP1/Siw14/Oca3. Oca1, Oca2, Siw14/Oca3, Oca4, and Oca6 were involved in the yeast response to caffeine and rapamycin stresses. Siw14/Oca3 was an active phosphatase in vitro, whereas no phosphatase activity could be detected for Oca1. Remarkably, overexpression of Siw14/Oca3 suppressed the caffeine sensitivity of oca1, oca2, oca4, and oca6 deleted strains, indicating a genetic linkage and suggesting a functional relationship for these proteins. Functional studies on mutations targeting putative catalytic residues from the A. thaliana AtPFA-DSP1/At1g05000 protein indicated the absence of canonical amino acids acting as the general acid/base in the phosphor-ester hydrolysis, which suggests a specific mechanism of reaction for PFA-DSPs and related enzymes. Our studies demonstrate the existence of novel phosphatase protein families in fungi and protozoa, with active and inactive enzymes linked in common signaling pathways. 
This illustrates the catalytic and functional complexity of the expanding family of atypical dual-specificity phosphatases in non-metazoans, including parasite organisms responsible for infectious human diseases.
Ussery, David; Nielsen, Lene N.; Ingmer, Hanne
2015-01-01
The qac genes of Staphylococcus species encode multidrug efflux pumps: membrane proteins that export toxic molecules and thus increase tolerance to a variety of compounds such as disinfecting agents, including quaternary ammonium compounds (for which they are named), intercalating dyes and some antibiotics. In Staphylococcus species, six different plasmid-encoded Qac efflux pumps have been described, and they belong to two major protein families. QacA and QacB are members of the Major Facilitator Superfamily, while QacC, QacG, QacH, and QacJ all belong to the Small Multidrug Resistance (SMR) family. Not all SMR proteins are called Qac and the reverse is also true, which has caused confusion in the literature and in gene annotations. The discovery of qac genes and their presence in various staphylococcal populations is briefly reviewed. A sequence comparison revealed that some of the PCR primers described in the literature for qac detection may miss particular qac genes due to lack of DNA conservation. Despite their resemblance in substrate specificity, the Qac proteins belonging to the two protein families have little in common. QacA and QacB are highly conserved in Staphylococcus species, while qacA was also detected in Enterococcus faecalis, suggesting that these plasmid-borne genes have spread across bacterial genera. Nevertheless, these qacA and qacB genes are quite dissimilar to their closest homologues in other organisms. In contrast, SMR-type Qac proteins display considerable sequence variation, despite their short length, even within the Staphylococcus genus. Phylogenetic analysis of these genes identified similarity to a large number of other SMR members, found in staphylococci as well as in other genera. A number of phylogenetic trees of SMR Qac proteins are presented here, starting with genes present in S. aureus and S. epidermidis, and extending this to related genes found in other species of this genus, and finally to genes found in other genera.
PMID:25883793
Wassenaar, Trudy M; Ussery, David; Nielsen, Lene N; Ingmer, Hanne
2015-03-01
The qac genes of Staphylococcus species encode multidrug efflux pumps: membrane proteins that export toxic molecules and thus increase tolerance to a variety of compounds such as disinfecting agents, including quaternary ammonium compounds (for which they are named), intercalating dyes and some antibiotics. In Staphylococcus species, six different plasmid-encoded Qac efflux pumps have been described, and they belong to two major protein families. QacA and QacB are members of the Major Facilitator Superfamily, while QacC, QacG, QacH, and QacJ all belong to the Small Multidrug Resistance (SMR) family. Not all SMR proteins are called Qac and the reverse is also true, which has caused confusion in the literature and in gene annotations. The discovery of qac genes and their presence in various staphylococcal populations is briefly reviewed. A sequence comparison revealed that some of the PCR primers described in the literature for qac detection may miss particular qac genes due to lack of DNA conservation. Despite their resemblance in substrate specificity, the Qac proteins belonging to the two protein families have little in common. QacA and QacB are highly conserved in Staphylococcus species, while qacA was also detected in Enterococcus faecalis, suggesting that these plasmid-borne genes have spread across bacterial genera. Nevertheless, these qacA and qacB genes are quite dissimilar to their closest homologues in other organisms. In contrast, SMR-type Qac proteins display considerable sequence variation, despite their short length, even within the Staphylococcus genus. Phylogenetic analysis of these genes identified similarity to a large number of other SMR members, found in staphylococci as well as in other genera. A number of phylogenetic trees of SMR Qac proteins are presented here, starting with genes present in S. aureus and S. epidermidis, and extending this to related genes found in other species of this genus, and finally to genes found in other genera.
Iida, Aya; Ohnishi, Yasuo; Horinouchi, Sueharu
2008-01-01
Via N-acylhomoserine lactones, the GinI/GinR quorum-sensing system in Gluconacetobacter intermedius NCI1051, a gram-negative acetic acid bacterium, represses acetic acid and gluconic acid fermentation. Two-dimensional polyacrylamide gel electrophoretic analysis of protein profiles of strain NCI1051 and ginI and ginR mutants identified a protein that was produced in response to the GinI/GinR regulatory system. Cloning and nucleotide sequencing of the gene encoding this protein revealed that it encoded an OmpA family protein, named GmpA. gmpA was a member of the gene cluster containing three adjacent homologous genes, gmpA to gmpC, the organization of which appeared to be unique to vinegar producers, including “Gluconacetobacter polyoxogenes.” In addition, GmpA was unique among the OmpA family proteins in that its N-terminal membrane domain forming eight antiparallel transmembrane β-strands contained an extra sequence in one of the surface-exposed loops. Transcriptional analysis showed that only gmpA of the three adjacent gmp genes was activated by the GinI/GinR quorum-sensing system. However, gmpA was not controlled directly by GinR but was controlled by an 89-amino-acid protein, GinA, a target of this quorum-sensing system. A gmpA mutant grew more rapidly in the presence of 2% (vol/vol) ethanol and accumulated acetic acid and gluconic acid in greater final yields than strain NCI1051. Thus, GmpA plays a role in repressing oxidative fermentation, including acetic acid fermentation, which is unique to acetic acid bacteria and allows ATP synthesis via ethanol oxidation. Consistent with the involvement of gmpA in oxidative fermentation, its transcription was also enhanced by ethanol and acetic acid. PMID:18487322
A Review on Structures and Functions of Bcl-2 Family Proteins from Homo sapiens.
Sivakumar, Dakshinamurthy; Sivaraman, Thirunavukkarasu
2016-01-01
Cancer cells evade apoptosis, which is regulated by proteins of the Bcl-2 family in the intrinsic pathways. Numerous experimental three-dimensional (3D) structures of the apoptotic proteins and of the proteins bound with small chemical molecules/peptides/proteins have been reported in the literature. In this review article, the 3D structures of the Bcl-2 family proteins from Homo sapiens, as well as complex structures of the anti-apoptotic proteins bound with small molecular inhibitors reported in the literature to date, have been comprehensively listed and described in detail. Moreover, the molecular mechanisms by which the Bcl-2 family proteins modulate the apoptotic processes and strategies for designing antagonists to anti-apoptotic proteins have been concisely discussed.
2013-01-01
Background The widespread protozoan parasite Toxoplasma gondii interferes with host cell functions by exporting the contents of a unique apical organelle, the rhoptry. Among the mix of secreted proteins are an expanded, lineage-specific family of protein kinases termed rhoptry kinases (ROPKs), several of which have been shown to be key virulence factors, including the pseudokinase ROP5. The extent and details of the diversification of this protein family are poorly understood. Results In this study, we comprehensively catalogued the ROPK family in the genomes of Toxoplasma gondii, Neospora caninum and Eimeria tenella, as well as portions of the unfinished genome of Sarcocystis neurona, and classified the identified genes into 42 distinct subfamilies. We systematically compared the rhoptry kinase protein sequences and structures to each other and to the broader superfamily of eukaryotic protein kinases to study the patterns of diversification and neofunctionalization in the ROPK family and its subfamilies. We identified three ROPK sub-clades of particular interest: those bearing a structurally conserved N-terminal extension to the kinase domain (NTE), an E. tenella-specific expansion, and a basal cluster including ROP35 and BPK1 that we term ROPKL. Structural analysis in light of the solved structures ROP2, ROP5, ROP8 and in comparison to typical eukaryotic protein kinases revealed ROPK-specific conservation patterns in two key regions of the kinase domain, surrounding a ROPK-conserved insert in the kinase hinge region and a disulfide bridge in the kinase substrate-binding lobe. We also examined conservation patterns specific to the NTE-bearing clade. We discuss the possible functional consequences of each. Conclusions Our work sheds light on several important but previously unrecognized features shared among rhoptry kinases, as well as the essential differences between active and degenerate protein kinases. 
We identify the most distinctive ROPK-specific features conserved across both active kinases and pseudokinases, and discuss these in terms of sequence motifs, evolutionary context, structural impact and potential functional relevance. By characterizing the proteins that enable these parasites to invade the host cell and co-opt its signaling mechanisms, we provide guidance on potential therapeutic targets for the diseases caused by coccidian parasites. PMID:23742205
Mukherjee, Ashis K; Kalita, Bhargab; Mackessy, Stephen P
2016-07-20
To address the dearth of knowledge on the biochemical composition of Pakistan Russell's Viper (Daboia russelii russelii) venom (RVV), the venom proteome has been analyzed and several biochemical and pharmacological properties of the venom were investigated. SDS-PAGE (reduced) analysis indicated that proteins/peptides in the molecular mass range of ~56.0-105.0kDa, 31.6-51.0kDa, 15.6-30.0kDa, 9.0-14.2kDa and 5.6-7.2kDa contribute approximately 9.8%, 12.1%, 13.4%, 34.1% and 30.5%, respectively of Pakistan RVV. Proteomics analysis of gel-filtration peaks of RVV resulted in identification of 75 proteins/peptides which belong to 14 distinct snake venom protein families. Phospholipases A2 (32.8%), Kunitz type serine protease inhibitors (28.4%), and snake venom metalloproteases (21.8%) comprised the majority of Pakistan RVV proteins, while 11 additional families accounted for 6.5-0.2%. Occurrence of aminotransferase, endo-β-glycosidase, and disintegrins is reported for the first time in RVV. Several of RVV proteins/peptides share significant sequence homology across Viperidae subfamilies. Pakistan RVV was well recognized by both the polyvalent (PAV) and monovalent (MAV) antivenom manufactured in India; nonetheless, immunological cross-reactivity determined by ELISA and neutralization of pro-coagulant/anticoagulant activity of RVV and its fractions by MAV surpassed that of PAV. The study establishes the proteome profile of the Pakistan RVV, thereby indicating the presence of diverse proteins and peptides that play a significant role in the pathophysiology of RVV bite. Further, the proteomic findings will contribute to understand the variation in venom composition owing to different geographical location and identification of pharmacologically important proteins in Pakistan RVV. Copyright © 2016. Published by Elsevier B.V.
van Ooij, C; Snyder, R C; Paeper, B W; Duester, G
1992-01-01
The human class I alcohol dehydrogenase (ADH) gene family consists of ADH1, ADH2, and ADH3, which are sequentially activated in early fetal, late fetal, and postnatal liver, respectively. Analysis of ADH promoters revealed differential activation by several factors previously shown to control liver transcription. In cotransfection assays, the ADH1 promoter, but not the ADH2 or ADH3 promoter, was shown to respond to hepatocyte nuclear factor 1 (HNF-1), which has previously been shown to regulate transcription in early liver development. The ADH2 promoter, but not the ADH1 or ADH3 promoter, was shown to respond to CCAAT/enhancer-binding protein alpha (C/EBP alpha), a transcription factor particularly active during late fetal liver and early postnatal liver development. The ADH1, ADH2, and ADH3 promoters all responded to the liver transcription factors liver activator protein (LAP) and D-element-binding protein (DBP), which are most active in postnatal liver. For all three promoters, the activation by LAP or DBP was higher than that seen by HNF-1 or C/EBP alpha, and a significant synergism between C/EBP alpha and LAP was noticed for the ADH2 and ADH3 promoters when both factors were simultaneously cotransfected. A hierarchy of ADH promoter responsiveness to C/EBP alpha and LAP homo- and heterodimers is suggested. In all three ADH genes, LAP bound to the same four sites previously reported for C/EBP alpha (i.e., -160, -120, -40, and -20 bp), but DBP bound strongly only to the site located at -40 bp relative to the transcriptional start. Mutational analysis of ADH2 indicated that the -40 bp element accounts for most of the promoter regulation by the bZIP factors analyzed. These studies suggest that HNF-1 and C/EBP alpha help establish ADH gene family transcription in fetal liver and that LAP and DBP help maintain high-level ADH gene family transcription in postnatal liver. PMID:1620113
DOE Office of Scientific and Technical Information (OSTI.GOV)
Peisach,E.; Wang, L.; Burroughs, A.
2008-01-01
The haloacid dehalogenase (HAD) superfamily is a large family of proteins dominated by phosphotransferases. Thirty-three sequence families within the HAD superfamily (HADSF) have been identified to assist in function assignment. One such family includes the enzyme phosphonoacetaldehyde hydrolase (phosphonatase). Phosphonatase possesses the conserved Rossmannoid core domain and a C1-type cap domain. Other members of this family do not possess a cap domain and because the cap domain of phosphonatase plays an important role in active site desolvation and catalysis, the function of the capless family members must be unique. A representative of the capless subfamily, PSPTO_2114, from the plant pathogen Pseudomonas syringae, was targeted for catalytic activity and structure analyses. The X-ray structure of PSPTO_2114 reveals a capless homodimer that conserves some but not all of the intersubunit contacts contributed by the core domains of the phosphonatase homodimer. The region of PSPTO_2114 that corresponds to the catalytic scaffold of phosphonatase (and other HAD phosphotransferases) positions amino acid residues that are ill suited for Mg2+ cofactor binding and mediation of phosphoryl group transfer between donor and acceptor substrates. The absence of phosphotransferase activity in PSPTO_2114 was confirmed by kinetic assays. To explore PSPTO_2114 function, the conservation of sequence motifs extending outside of the HADSF catalytic scaffold was examined. The stringently conserved residues among PSPTO_2114 homologs were mapped onto the PSPTO_2114 three-dimensional structure to identify a surface region unique to the family members that do not possess a cap domain. The hypothesis that this region is used in protein-protein recognition is explored to define, for the first time, HADSF proteins which have acquired a function other than that of a catalyst. Proteins 2008.
Fafetine, J M; Domingos, A; Antunes, S; Esteves, A; Paweska, J T; Coetzer, J A W; Rutten, V P M G; Neves, L
2013-11-01
Due to the unpredictable and explosive nature of Rift Valley fever (RVF) outbreaks, rapid and accurate diagnostic assays for low-resource settings are urgently needed. To improve existing diagnostic assays, monoclonal antibodies (MAbs) specific for the nucleocapsid protein of RVF virus (RVFV) were produced and characterized. Four IgG2a MAbs showed specific binding to denatured nucleocapsid protein, both from a recombinant source and from inactivated RVFV, in Western blot analysis and in an enzyme-linked immunosorbent assay (ELISA). Cross-reactivity with genetically related and non-related arboviruses including Bunyamwera and Calovo viruses (Bunyaviridae family), West Nile and Dengue-2 viruses (Flaviviridae family), and Sindbis and Chikungunya viruses (Togaviridae family) was not detected. These MAbs represent a useful tool for the development of rapid diagnostic assays for early recognition of RVF. © 2013 Blackwell Verlag GmbH.
Ariyarathna, H A Chandima K; Oldach, Klaus H; Francki, Michael G
2016-01-19
Although HKT transporter genes account for some of the key determinants of crop salt tolerance mechanisms, the diversity and functional role of group II HKT genes are not clearly understood in bread wheat. Advanced knowledge of rice HKT genes and the whole-genome sequence was, therefore, used in comparative gene analysis to identify orthologous wheat group II HKT genes and their role in trait variation under different saline environments. Using the four group II HKTs in rice, two orthologous gene families were identified in bread wheat, including the known TaHKT2;1 gene family and a new, distinctly different gene family designated TaHKT2;2. A single copy of TaHKT2;2 was found on each homeologous chromosome arm 7AL, 7BL and 7DL, and each gene was expressed in leaf blade, sheath and root tissues under non-stressed and 200 mM salt-stressed conditions. The proteins encoded by genes of the TaHKT2;2 family showed more than 93% amino acid sequence identity to one another but ≤52% amino acid identity to the proteins encoded by the TaHKT2;1 family. Specifically, variations in known critical domains predicted functional differences between the two protein families. Similar to orthologous rice genes on chromosome 6L, TaHKT2;1 and TaHKT2;2 genes were located approximately 3 kb apart on wheat chromosomes 7AL, 7BL and 7DL, forming a static syntenic block in the two species. The chromosomal region on 7AL containing TaHKT2;1 7AL-1 co-located with QTL for shoot Na(+) concentration and yield in some saline environments. The differences in copy number, gene sequences and encoded proteins between TaHKT2;2 homeologous genes and other group II HKT gene families within and across species likely reflect functional diversity for ion selectivity and transport in plants. Evidence indicated that neither TaHKT2;2 nor TaHKT2;1 was associated with primary root Na(+) uptake, but TaHKT2;1 may be associated with trait variation for Na(+) exclusion and yield in some but not all saline environments.
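The identity figures quoted above (more than 93% within the TaHKT2;2 family, at most 52% against TaHKT2;1) are pairwise percent identities. A minimal sketch of how such a number can be computed from two already-aligned sequences follows. The function name, the toy sequences and the choice of denominator (columns where neither sequence has a gap) are assumptions made for illustration; the abstract does not state which convention or software the authors used.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns containing a gap in either sequence are ignored,
    so the denominator is the number of gap-free columns. Other
    conventions (e.g. dividing by alignment length) give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences have a residue.
    compared = [(x, y) for x, y in zip(aligned_a, aligned_b)
                if x != gap and y != gap]
    if not compared:
        return 0.0
    matches = sum(1 for x, y in compared if x == y)
    return 100.0 * matches / len(compared)


# Hypothetical toy alignment, not real HKT sequences.
identity = percent_identity("MKTAYL", "MKSAY-")
print(f"{identity:.1f}% identity")
```

Because the reported thresholds depend on the alignment and on the denominator, values from a sketch like this are only comparable when the same convention is applied to every pair.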
Masuda, Tokiha; Ling, Feng; Shibata, Takehiko; Mikawa, Tsutomu
2010-03-01
The Mhr1 protein is necessary for mtDNA homologous recombination in Saccharomyces cerevisiae. Homologous pairing (HP) is an essential reaction during homologous recombination, and is generally catalyzed by the RecA/Rad51 family of proteins in an ATP-dependent manner. Mhr1 catalyzes HP through a mechanism similar, at the DNA level, to that of the RecA/Rad51 proteins, but without utilizing ATP. However, it has no sequence homology with the RecA/Rad51 family proteins or with other ATP-independent HP proteins, and exhibits different requirements for DNA topology. We are interested in the structural features of the functional domains of Mhr1. In this study, we employed the native fluorescence of Mhr1's Trp residues to examine the energy transfer from the Trp residues to etheno-modified ssDNA bound to Mhr1. Our results showed that two of the seven Trp residues (Trp71 and Trp165) are spatially close to the bound DNA. A systematic analysis of mutant Mhr1 proteins revealed that Asp69 is involved in Mg(2+)-dependent DNA binding, and that multiple Lys and Arg residues located around Trp71 and Trp165 are involved in the DNA-binding activity of Mhr1. In addition, in vivo complementation analyses showed that a region around Trp165 is important for the maintenance of mtDNA. On the basis of these results, we discuss the function of the region surrounding Trp165.
Yang, S D; Yu, J S; Yang, C C; Lee, S C; Lee, T T; Ni, M H; Kuan, C Y; Chen, H C
1996-05-01
Computer analysis of protein phosphorylation site sequences revealed that transcription factors and viral oncoproteins are prime targets for regulation by proline-directed protein phosphorylation, suggesting an association of the proline-directed protein kinase (PDPK) family with neoplastic transformation and tumorigenesis. In this report, an immunoprecipitate activity assay of protein kinase FA/glycogen synthase kinase-3 alpha (kinase FA/GSK-3 alpha), a member of the PDPK family, has been optimized for human hepatoma and used to demonstrate for the first time significantly increased (P < 0.01) activity in poorly differentiated SK-Hep-1 hepatoma (24.2 ± 2.8 units/mg) and moderately differentiated Mahlavu hepatoma (14.5 ± 2.2 units/mg) when compared to well differentiated Hep 3B hepatoma (8.0 ± 2.4 units/mg). Immunoblotting analysis revealed that the increased activity of kinase FA/GSK-3 alpha is due to overexpression of the protein. Elevated kinase FA/GSK-3 alpha expression in human hepatoma biopsies relative to normal liver tissue was found to be even more profound. This kinase appeared to be fivefold overexpressed in well differentiated hepatoma and 13-fold overexpressed in poorly differentiated hepatoma when compared to normal liver tissue. Taken together, the results provide initial evidence that overexpression of kinase FA/GSK-3 alpha is involved in human hepatoma dedifferentiation/progression. Since kinase FA/GSK-3 alpha is a PDPK, the results further support a potential role of this kinase in human liver tumorigenesis, especially in its dedifferentiation/progression.
Hippophae rhamnoides N-glycoproteome analysis: a small step towards sea buckthorn proteome mining.
Sougrakpam, Yaiphabi; Deswal, Renu
2016-10-01
Hippophae rhamnoides is a hardy shrub capable of growing under extreme environmental conditions, namely high salt, drought and cold. Its ability to grow under extreme conditions and its wide application in the pharmaceutical and nutraceutical industries call for its in-depth analysis. N-glycoproteome mining by Con A affinity chromatography from seedlings was attempted. The glycoproteome was resolved by first- and second-dimension gel electrophoresis. A total of 48 spots were detected and 10 non-redundant proteins were identified by MALDI-TOF/TOF. Arabidopsis thaliana protein disulfide isomerase-like 1-4 (ATPDIL1-4) electron transporter, protein disulphide isomerase, calreticulin 1 (CRT1), glycosyl hydrolase family 38 (GH 38) protein, phantastica, maturase K, Arabidopsis trithorax related protein 6 (ATXR 6), and cysteine protease inhibitor were identified, of which ATXR 6, phantastica and the putative ATPDIL1-4 electron transporter are novel glycoproteins. The calcium-binding protein CRT1 was validated for calcium binding by Stains-all staining. GO analysis showed involvement of GH 38 and ATXR 6 in glycan and lysine degradation pathways. This is, to our knowledge, the first report of glycoproteome analysis for any Elaeagnaceae member.
Algorithm, applications and evaluation for protein comparison by Ramanujan Fourier transform.
Zhao, Jian; Wang, Jiasong; Hua, Wei; Ouyang, Pingkai
2015-12-01
The amino acid sequence of a protein determines its chemical properties, chain conformation and biological functions. Protein sequence comparison is of great importance to identify similarities of protein structures and infer their functions. Many properties of a protein correspond to the low-frequency signals within the sequence. Low frequency modes in protein sequences are linked to the secondary structures, membrane protein types, and sub-cellular localizations of the proteins. In this paper, we present Ramanujan Fourier transform (RFT) with a fast algorithm to analyze the low-frequency signals of protein sequences. The RFT method is applied to similarity analysis of protein sequences with the Resonant Recognition Model (RRM). The results show that the proposed fast RFT method on protein comparison is more efficient than commonly used discrete Fourier transform (DFT). RFT can detect common frequencies as significant feature for specific protein families, and the RFT spectrum heat-map of protein sequences demonstrates the information conservation in the sequence comparison. The proposed method offers a new tool for pattern recognition, feature extraction and structural analysis on protein sequences. Copyright © 2015 Elsevier Ltd. All rights reserved.
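The transform named in this abstract can be illustrated with a short sketch. It evaluates the textbook definition of the Ramanujan sum c_q(n) and a finite-length estimate of the Ramanujan Fourier coefficient, X(q) = (1 / (phi(q) N)) * sum over n = 1..N of x(n) c_q(n). This is the direct definition and not the fast algorithm the authors propose. The toy signal stands in for a numerically encoded protein sequence (the RRM encoding is not reproduced here), and all function names are illustrative assumptions.

```python
import math


def ramanujan_sum(q, n):
    """Ramanujan sum c_q(n): sum of cos(2*pi*k*n/q) over 1 <= k <= q, gcd(k, q) == 1."""
    total = sum(math.cos(2 * math.pi * k * n / q)
                for k in range(1, q + 1) if math.gcd(k, q) == 1)
    return round(total)  # c_q(n) is always an integer; rounding removes float noise


def totient(q):
    """Euler's totient phi(q), counted directly (fine for small q)."""
    return sum(1 for k in range(1, q + 1) if math.gcd(k, q) == 1)


def rft_coefficient(signal, q):
    """Finite-length estimate of the q-th Ramanujan Fourier coefficient.

    The signal is indexed from n = 1, matching the usual definition.
    """
    n_samples = len(signal)
    acc = sum(x * ramanujan_sum(q, n) for n, x in enumerate(signal, start=1))
    return acc / (totient(q) * n_samples)


# Hypothetical toy signal with period 3 (it is c_3(n) itself), 12 samples long.
toy_signal = [ramanujan_sum(3, n) for n in range(1, 13)]
spectrum = {q: rft_coefficient(toy_signal, q) for q in range(1, 5)}
print(spectrum)
```

For this toy input only the q = 3 coefficient is non-zero, which is the sense in which the transform isolates a periodic component. The direct evaluation costs on the order of N operations per q, which is the cost a fast algorithm would aim to reduce.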
Insights into the Shc Family of Adaptor Proteins
Prigent, Sally A.
2017-01-01
The Shc family of adaptor proteins is a group of proteins that lack intrinsic enzymatic activity. Instead, Shc proteins possess various domains that allow them to recruit different signalling molecules. Shc proteins help to transduce an extracellular signal into an intracellular signal, which is then translated into a biological response. The members of the Shc family share the same structural topography, CH2-PTB-CH1-SH2, which is found in more than one isoform of the Shc family proteins; this multidomain structure allows for the posttranslational modification of Shc proteins and increases their functional diversity. The deregulation of Shc proteins has been linked to different disease conditions, including cancer and Alzheimer's disease, which indicates their key roles in cellular functions. Accordingly, a question arises as to whether Shc proteins could be targeted therapeutically to correct their disturbance. To answer this question, thorough knowledge must be acquired; herein, we aim to shed light on the Shc family of adaptor proteins to understand their intracellular role in normal and disease states, which might later be applied to devise mechanisms to reverse the disease state.
Chen, Lin; Yang, Yang; Liu, Can; Zheng, Yanyan; Xu, Mingshuang; Wu, Na; Sheng, Jiping; Shen, Lin
2015-08-28
WRKY transcription factors play an important role in cold defense of plants. However, little information is available about the cold-responsive WRKYs in tomato (Solanum lycopersicum). In the present study, a complete characterization of this gene family was described. Eighty WRKY genes in the tomato genome were identified. Almost all WRKY genes contain putative stress-responsive cis-elements in their promoter regions. Segmental duplications contributed significantly to the expansion of the SlWRKY gene family. Transcriptional analysis revealed notable differential expression in tomato tissues and expression patterns under cold stress, which indicated wide functional divergence in this family. Ten WRKYs in tomato were strongly induced more than 2-fold during cold stress. These genes represented candidate genes for future functional analysis of WRKYs involved in the cold-related signal pathways. Our data provide valuable information about tomato WRKY proteins and form a foundation for future studies of these proteins, especially for those that play an important role in response to cold stress. Copyright © 2015 Elsevier Inc. All rights reserved.
Clinical heterogeneity and phenotype/genotype findings in 5 families with GYG1 deficiency
Ben Yaou, Rabah; Hubert, Aurélie; Nelson, Isabelle; Dahlqvist, Julia R.; Gaist, David; Streichenberger, Nathalie; Beuvin, Maud; Krahn, Martin; Petiot, Philippe; Parisot, Frédéric; Michel, Fabrice; Malfatti, Edoardo; Romero, Norma; Carlier, Robert Yves; Eymard, Bruno; Labrune, Philippe; Duno, Morten; Krag, Thomas; Cerino, Mathieu; Bartoli, Marc; Bonne, Gisèle; Vissing, John; Laforet, Pascal
2017-01-01
Objective: To describe the variability of muscle symptoms in patients carrying mutations in the GYG1 gene, encoding glycogenin-1, an enzyme involved in the biosynthesis of glycogen, and to discuss genotype-phenotype relations. Methods: We describe 9 patients from 5 families in whom muscle biopsies showed vacuoles with an abnormal accumulation of glycogen in muscle fibers, partially α-amylase resistant suggesting polyglucosan bodies. The patients had either progressive early-onset limb-girdle weakness or late-onset distal or scapuloperoneal muscle affection as shown by muscle imaging. No clear definite cardiac disease was found. Histologic and protein analysis investigations were performed on muscle. Results: Genetic analyses by direct or exome sequencing of the GYG1 gene revealed 6 different GYG1 mutations. Four of the mutations were novel. They were compound heterozygous in 3 families and homozygous in 2. Protein analysis revealed either the absence of glycogenin-1 or reduced glycogenin-1 expression with impaired glucosylation. Conclusions: Our report extends the genetic and clinical spectrum of glycogenin-1–related myopathies to include scapuloperoneal and distal affection with glycogen accumulation. PMID:29264399
Clinical heterogeneity and phenotype/genotype findings in 5 families with GYG1 deficiency.
Ben Yaou, Rabah; Hubert, Aurélie; Nelson, Isabelle; Dahlqvist, Julia R; Gaist, David; Streichenberger, Nathalie; Beuvin, Maud; Krahn, Martin; Petiot, Philippe; Parisot, Frédéric; Michel, Fabrice; Malfatti, Edoardo; Romero, Norma; Carlier, Robert Yves; Eymard, Bruno; Labrune, Philippe; Duno, Morten; Krag, Thomas; Cerino, Mathieu; Bartoli, Marc; Bonne, Gisèle; Vissing, John; Laforet, Pascal; Petit, François M
2017-12-01
To describe the variability of muscle symptoms in patients carrying mutations in the GYG1 gene, encoding glycogenin-1, an enzyme involved in the biosynthesis of glycogen, and to discuss genotype-phenotype relations. We describe 9 patients from 5 families in whom muscle biopsies showed vacuoles with an abnormal accumulation of glycogen in muscle fibers, partially α-amylase resistant suggesting polyglucosan bodies. The patients had either progressive early-onset limb-girdle weakness or late-onset distal or scapuloperoneal muscle affection as shown by muscle imaging. No clear definite cardiac disease was found. Histologic and protein analysis investigations were performed on muscle. Genetic analyses by direct or exome sequencing of the GYG1 gene revealed 6 different GYG1 mutations. Four of the mutations were novel. They were compound heterozygous in 3 families and homozygous in 2. Protein analysis revealed either the absence of glycogenin-1 or reduced glycogenin-1 expression with impaired glucosylation. Our report extends the genetic and clinical spectrum of glycogenin-1-related myopathies to include scapuloperoneal and distal affection with glycogen accumulation.
Germ line p53 mutations in a familial syndrome of breast cancer, sarcomas, and other neoplasms.
Malkin, D; Li, F P; Strong, L C; Fraumeni, J F; Nelson, C E; Kim, D H; Kassel, J; Gryka, M A; Bischoff, F Z; Tainsky, M A
1990-11-30
Familial cancer syndromes have helped to define the role of tumor suppressor genes in the development of cancer. The dominantly inherited Li-Fraumeni syndrome (LFS) is of particular interest because of the diversity of childhood and adult tumors that occur in affected individuals. The rarity and high mortality of LFS precluded formal linkage analysis. The alternative approach was to select the most plausible candidate gene. The tumor suppressor gene, p53, was studied because of previous indications that this gene is inactivated in the sporadic (nonfamilial) forms of most cancers that are associated with LFS. Germ line p53 mutations have been detected in all five LFS families analyzed. These mutations do not produce amounts of mutant p53 protein expected to exert a trans-dominant loss of function effect on wild-type p53 protein. The frequency of germ line p53 mutations can now be examined in additional families with LFS, and in other cancer patients and families with clinical features that might be attributed to the mutation.
Jiang, Yuanzhong; Duan, Yanjiao; Yin, Jia; Ye, Shenglong; Zhu, Jingru; Zhang, Faqi; Lu, Wanxiang; Fan, Di; Luo, Keming
2014-01-01
WRKY proteins are a large family of regulators involved in various developmental and physiological processes, especially in coping with diverse biotic and abiotic stresses. In this study, 100 putative PtrWRKY genes encoding proteins that contain the complete WRKY domain were identified in Populus. Phylogenetic analysis revealed that the members of this superfamily among poplar, Arabidopsis, and other species were divided into three groups with several subgroups based on the structures of the WRKY protein sequences. Various cis-acting elements related to stress and defence responses were found in the promoter regions of PtrWRKY genes by promoter analysis. High-throughput transcriptomic analyses identified that 61 of the PtrWRKY genes were induced by biotic and abiotic treatments, such as Marssonina brunnea, salicylic acid (SA), methyl jasmonate (MeJA), wounding, cold, and salinity. Among these PtrWRKY genes, transcripts of 46 selected genes were observed in different tissues, including roots, stems, and leaves. Quantitative RT-PCR analysis further confirmed the induced expression of 18 PtrWRKY genes by one or more stress treatments. The overexpression of an SA-inducible gene, PtrWRKY89, accelerated expression of PR protein genes and improved resistance to pathogens in transgenic poplar, suggesting that PtrWRKY89 is a regulator of an SA-dependent defence-signalling pathway in poplar. Taken together, our results provide significant information for improving the resistance and stress tolerance of woody plants. PMID:25249073
Nakano, Shogo; Asano, Yasuhisa
2015-02-03
Development of software and methods for design of complete sequences of functional proteins could contribute to studies of protein engineering and protein evolution. To this end, we developed the INTMSAlign software, and used it to design functional proteins and evaluate their usefulness. The software could assign both consensus and correlation residues of target proteins. We generated three protein sequences with S-selective hydroxynitrile lyase (S-HNL) activity, which we call designed S-HNLs; these proteins folded as efficiently as the native S-HNL. Sequence and biochemical analysis of the designed S-HNLs suggested that accumulation of neutral mutations occurs during the process of S-HNLs evolution from a low-activity form to a high-activity (native) form. Taken together, our results demonstrate that our software and the associated methods could be applied not only to design of complete sequences, but also to predictions of protein evolution, especially within families such as esterases and S-HNLs.
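The abstract says the software assigns consensus residues from an alignment. As a rough illustration of what a consensus residue is, the sketch below takes the most frequent residue in each column of a toy multiple sequence alignment. It is not the INTMSAlign algorithm (which also handles correlated residues and is not described in detail here). The function name, tie-breaking rule and toy alignment are assumptions made for illustration.

```python
from collections import Counter


def consensus_sequence(alignment, gap="-"):
    """Most frequent non-gap residue per column of an equal-length alignment.

    Assumptions: columns made up only of gaps are skipped, and ties are
    broken alphabetically so that the result is deterministic.
    """
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    consensus = []
    for column in zip(*alignment):
        counts = Counter(residue for residue in column if residue != gap)
        if not counts:
            continue  # gap-only column contributes nothing
        # sorted() orders residues alphabetically; max() keeps the first
        # residue that reaches the highest count.
        residue, _ = max(sorted(counts.items()), key=lambda item: item[1])
        consensus.append(residue)
    return "".join(consensus)


# Hypothetical toy alignment, not real S-HNL sequences.
toy_alignment = ["MKTAY", "MKSAY", "MRTA-", "MKTGY"]
print(consensus_sequence(toy_alignment))
```

A plain majority vote like this treats every column independently, which is exactly the limitation that assigning correlated residues, as the abstract mentions, is meant to address.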
NASA Astrophysics Data System (ADS)
Nakano, Shogo; Asano, Yasuhisa
2015-02-01
Development of software and methods for design of complete sequences of functional proteins could contribute to studies of protein engineering and protein evolution. To this end, we developed the INTMSAlign software, and used it to design functional proteins and evaluate their usefulness. The software could assign both consensus and correlation residues of target proteins. We generated three protein sequences with S-selective hydroxynitrile lyase (S-HNL) activity, which we call designed S-HNLs; these proteins folded as efficiently as the native S-HNL. Sequence and biochemical analysis of the designed S-HNLs suggested that accumulation of neutral mutations occurs during the process of S-HNLs evolution from a low-activity form to a high-activity (native) form. Taken together, our results demonstrate that our software and the associated methods could be applied not only to design of complete sequences, but also to predictions of protein evolution, especially within families such as esterases and S-HNLs.
González, Carolina; Lazcano, Marcelo; Valdés, Jorge; Holmes, David S.
2016-01-01
Using phylogenomic and gene compositional analyses, five highly conserved gene families have been detected in the core genome of the phylogenetically coherent genus Acidithiobacillus of the class Acidithiobacillia. These core gene families are absent in the closest extant genus Thermithiobacillus tepidarius that subtends the Acidithiobacillus genus and roots the deepest in this class. The predicted proteins encoded by these core gene families are not detected by a BLAST search in the NCBI non-redundant database of more than 90 million proteins using a relaxed cut-off of 1.0e−5. None of the five families has a clear functional prediction. However, bioinformatic scrutiny, using pI prediction, motif/domain searches, cellular location predictions, genomic context analyses, and chromosome topology studies together with previously published transcriptomic and proteomic data, suggests that some may have functions associated with membrane remodeling during cell division perhaps in response to pH stress. Despite the high level of amino acid sequence conservation within each family, there is sufficient nucleotide variation of the respective genes to permit the use of the DNA sequences to distinguish different species of Acidithiobacillus, making them useful additions to the armamentarium of tools for phylogenetic analysis. Since the protein families are unique to the Acidithiobacillus genus, they can also be leveraged as probes to detect the genus in environmental metagenomes and metatranscriptomes, including industrial biomining operations, and acid mine drainage (AMD). PMID:28082953
González, Carolina; Lazcano, Marcelo; Valdés, Jorge; Holmes, David S
2016-01-01
Using phylogenomic and gene compositional analyses, five highly conserved gene families have been detected in the core genome of the phylogenetically coherent genus Acidithiobacillus of the class Acidithiobacillia. These core gene families are absent in the closest extant genus Thermithiobacillus tepidarius that subtends the Acidithiobacillus genus and roots the deepest in this class. The predicted proteins encoded by these core gene families are not detected by a BLAST search in the NCBI non-redundant database of more than 90 million proteins using a relaxed cut-off of 1.0e−5. None of the five families has a clear functional prediction. However, bioinformatic scrutiny, using pI prediction, motif/domain searches, cellular location predictions, genomic context analyses, and chromosome topology studies together with previously published transcriptomic and proteomic data, suggests that some may have functions associated with membrane remodeling during cell division perhaps in response to pH stress. Despite the high level of amino acid sequence conservation within each family, there is sufficient nucleotide variation of the respective genes to permit the use of the DNA sequences to distinguish different species of Acidithiobacillus, making them useful additions to the armamentarium of tools for phylogenetic analysis. Since the protein families are unique to the Acidithiobacillus genus, they can also be leveraged as probes to detect the genus in environmental metagenomes and metatranscriptomes, including industrial biomining operations, and acid mine drainage (AMD).
Wulfkuhle, Julia D.; Berg, Daniela; Wolff, Claudia; Langer, Rupert; Tran, Kai; Illi, Julie; Espina, Virginia; Pierobon, Mariaelena; Deng, Jianghong; DeMichele, Angela; Walch, Axel; Bronger, Holger; Becker, Ingrid; Waldhör, Christine; Höfler, Heinz; Esserman, Laura; Liotta, Lance A.; Becker, Karl-Friedrich; Petricoin, Emanuel F.
2017-01-01
Purpose: Targeting of the HER2 protein in human breast cancer represents a major advance in oncology, but relies on measurements of total HER2 protein and not HER2 signaling network activation. We utilized reverse phase protein microarrays (RPMAs) to measure total and phosphorylated HER2 in the context of HER family signaling to understand correlations between phosphorylated and total levels of HER2 and downstream signaling activity. Experimental Design: Three independent study sets, comprising a total of 415 individual patient samples from flash frozen core biopsy samples and FFPE surgical and core samples, were analyzed via RPMA. The phosphorylation and total levels of the HER receptor family proteins and downstream signaling molecules were measured in laser capture microdissected (LCM) enriched tumor epithelium from 127 frozen pre-treatment core biopsy samples and whole tissue lysates from 288 FFPE samples, and these results were compared to FISH and IHC. Results: RPMA measurements of total HER2 were highly concordant (> 90% in all sets) with FISH and/or IHC data, as was phosphorylation of HER2 in the FISH/IHC+ population. Phosphorylation analysis of HER family signaling identified HER2 activation in some FISH/IHC- tumors and, identical to that seen with FISH/IHC+ tumors, the HER2 activation was concordant with EGFR and HER3 phosphorylation and downstream signaling endpoint activation. Conclusions: Molecular profiling of HER2 signaling of a large cohort of human breast cancer specimens using a quantitative and sensitive functional pathway activation mapping technique reveals IHC-/FISH-/pHER2+ tumors with HER2 pathway activation independent of total HER2 levels and functional signaling through HER3 and EGFR. PMID:23045247
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cardarelli, Lia; Lam, Robert; Tuite, Ashleigh
2010-08-17
The final step in the morphogenesis of long-tailed double-stranded DNA bacteriophages is the joining of the DNA-filled head to the tail. The connector is a specialized structure of the head that serves as the interface for tail attachment and the point of egress for DNA from the head during infection. Here, we report the determination of a 2.1 Å crystal structure of gp6 of bacteriophage HK97. Through structural comparisons, functional studies, and bioinformatic analysis, gp6 has been determined to be a component of the connector of phage HK97 that is evolutionarily related to gp15, a well-characterized connector component of bacteriophage SPP1. Whereas the structure of gp15 was solved in a monomeric form, gp6 crystallized as an oligomeric ring with the dimensions expected for a connector protein. Although this ring is composed of 13 subunits, which does not match the symmetry of the connector within the phage, sequence conservation and modeling of this structure into the cryo-electron microscopy density of the SPP1 connector indicate that this oligomeric structure represents the arrangement of gp6 subunits within the mature phage particle. Through sequence searches and genomic position analysis, we determined that gp6 is a member of a large family of connector proteins that are present in long-tailed phages. We have also identified gp7 of HK97 as a homologue of gp16 of phage SPP1, which is the second component of the connector of this phage. These proteins are members of another large protein family involved in connector assembly.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cardarelli, Lia; Lam, Robert; Tuite, Ashleigh
2011-11-23
The final step in the morphogenesis of long-tailed double-stranded DNA bacteriophages is the joining of the DNA-filled head to the tail. The connector is a specialized structure of the head that serves as the interface for tail attachment and the point of egress for DNA from the head during infection. Here, we report the determination of a 2.1 Å crystal structure of gp6 of bacteriophage HK97. Through structural comparisons, functional studies, and bioinformatic analysis, gp6 has been determined to be a component of the connector of phage HK97 that is evolutionarily related to gp15, a well-characterized connector component of bacteriophage SPP1. Whereas the structure of gp15 was solved in a monomeric form, gp6 crystallized as an oligomeric ring with the dimensions expected for a connector protein. Although this ring is composed of 13 subunits, which does not match the symmetry of the connector within the phage, sequence conservation and modeling of this structure into the cryo-electron microscopy density of the SPP1 connector indicate that this oligomeric structure represents the arrangement of gp6 subunits within the mature phage particle. Through sequence searches and genomic position analysis, we determined that gp6 is a member of a large family of connector proteins that are present in long-tailed phages. We have also identified gp7 of HK97 as a homologue of gp16 of phage SPP1, which is the second component of the connector of this phage. These proteins are members of another large protein family involved in connector assembly.
Rendón-Ramírez, Adela; Shukla, Manish; Oda, Masataka; Chakraborty, Sandeep; Minda, Renu; Dandekar, Abhaya M; Ásgeirsson, Bjarni; Goñi, Félix M; Rao, Basuthkar J
2013-01-01
Proteolytic enzymes have evolved several mechanisms to cleave peptide bonds. These distinct types have been systematically categorized in the MEROPS database. While a BLAST search on these proteases identifies homologous proteins, sequence alignment methods often fail to identify relationships arising from convergent evolution, exon shuffling, and modular reuse of catalytic units. We have previously established a computational method to detect functions in proteins based on the spatial and electrostatic properties of the catalytic residues (CLASP). CLASP identified a promiscuous serine protease scaffold in alkaline phosphatases (AP) and a scaffold recognizing a β-lactam (imipenem) in a cold-active Vibrio AP. Subsequently, we defined a methodology to quantify promiscuous activities in a wide range of proteins. Here, we assemble a module which encapsulates the multifarious motifs used by protease families listed in the MEROPS database. Since APs and proteases are an integral component of outer membrane vesicles (OMV), we sought to query other OMV proteins, like phospholipase C (PLC), using this search module. Our analysis indicated that phosphoinositide-specific PLC from Bacillus cereus is a serine protease. This was validated by protease assays, mass spectrometry and by inhibition of the native phospholipase activity of PI-PLC by the well-known serine protease inhibitor AEBSF (IC50 = 0.018 mM). Edman degradation analysis linked the specificity of the protease activity to a proline in the amino terminal, suggesting that the PI-PLC is a prolyl peptidase. Thus, we propose a computational method of extending protein families based on the spatial and electrostatic congruence of active site residues.
Murphy, Maureen E.
2013-01-01
The HSP70 family of heat shock proteins consists of molecular chaperones of approximately 70 kDa that serve critical roles in protein homeostasis. These adenosine triphosphatases unfold misfolded or denatured proteins and can keep these proteins in an unfolded, folding-competent state. They also protect nascently translating proteins, promote the cellular or organellar transport of proteins, reduce proteotoxic protein aggregates and serve general housekeeping roles in maintaining protein homeostasis. The HSP70 family is the most conserved in evolution, and all eukaryotes contain multiple members. Some members of this family serve organelle- or tissue-specific functions; however, in many cases, these members can function redundantly. Overall, the HSP70 family of proteins can be thought of as a potent buffering system for cellular stress, either from extrinsic (physiological, viral and environmental) or intrinsic (replicative or oncogenic) stimuli. As such, this family serves a critical survival function in the cell. Not surprisingly, cancer cells rely heavily on this buffering system for survival. The overwhelming majority of human tumors overexpress HSP70 family members, and expression of these proteins is typically a marker for poor prognosis. With the proof of principle that inhibitors of the HSP90 chaperone have emerged as important anticancer agents, intense focus has now been placed on the potential for HSP70 inhibitors to assume a role as a significant chemotherapeutic avenue. This review covers the history, regulation, mechanism of action and role in cancer of the HSP70 family. Additionally, the promise of pharmacologically targeting this protein for cancer therapy is addressed. PMID:23563090
De Moura, Dref C; Bryksa, Brian C; Yada, Rickey Y
2014-01-01
The plant-specific insert is an approximately 100-residue domain found exclusively within the C-terminal lobe of some plant aspartic proteases. Structurally, this domain is a member of the saposin-like protein family, and is involved in plant pathogen defense as well as vacuolar targeting of the parent protease molecule. Similar to other members of the saposin-like protein family, most notably saposins A and C, the recently resolved crystal structure of potato (Solanum tuberosum) plant-specific insert has been shown to exist in a substrate-bound open conformation in which the plant-specific insert oligomerizes to form homodimers. In addition to the open structure, a closed conformation also exists having the classic saposin fold of the saposin-like protein family as observed in the crystal structure of barley (Hordeum vulgare L.) plant-specific insert. In the present study, the mechanisms of tertiary and quaternary conformation changes of potato plant-specific insert were investigated in silico as a function of pH. Umbrella sampling and determination of the free energy change of dissociation of the plant-specific insert homodimer revealed that increasing the pH of the system to near physiological levels reduced the free energy barrier to dissociation. Furthermore, principal component analysis was used to characterize conformational changes at both acidic and neutral pH. The results indicated that the plant-specific insert may adopt a tertiary structure similar to the characteristic saposin fold and suggest a potential new structural motif among saposin-like proteins. To our knowledge, this acidified PSI structure presents the first example of an alternative saposin-fold motif for any member of the large and diverse SAPLIP family.
Demir, Hande; Donner, Iikki; Kivipelto, Leena; Kuismin, Outi; Schalin-Jäntti, Camilla; De Menis, Ernesto; Karhu, Auli
2014-01-01
Pituitary adenomas are neoplasms of the anterior pituitary lobe and account for 15-20% of all intracranial tumors. Although most pituitary tumors are benign, they can cause severe symptoms related to tumor size as well as hypopituitarism and/or hypersecretion of one or more pituitary hormones. Most pituitary adenomas are sporadic, but it has been estimated that 5% of patients have a familial background. Germline mutations of the tumor suppressor gene aryl hydrocarbon receptor-interacting protein (AIP) predispose to hereditary pituitary neoplasia. Recently, it has been demonstrated that AIP mutations predispose to pituitary tumorigenesis through defective inhibitory GTP binding protein (Gαi) signaling. This finding prompted us to examine whether germline loss-of-function mutations in inhibitory guanine nucleotide (GTP) binding protein alpha (GNAI) loci are involved in genetic predisposition of pituitary tumors. To our knowledge, this is the first time that GNAI genes have been sequenced to examine the occurrence of inactivating germline mutations. Thus far, only somatic gain-of-function hot-spot mutations have been studied in these loci. Here, we have analyzed the coding regions of GNAI1, GNAI2, and GNAI3 in a set of young sporadic somatotropinoma patients (n = 32; mean age at diagnosis 32 years) and familial index cases (n = 14), thus in patients with a disease phenotype similar to that observed in AIP mutation carriers. In addition, expression of Gαi proteins was studied in human growth hormone (GH), prolactin (PRL), adrenocorticotropic hormone (ACTH)-secreting and non-functional pituitary tumors. No pathogenic germline mutations affecting the Gαi proteins were detected. This result suggests that loss-of-function mutations of GNAI loci are rare or nonexistent in familial pituitary adenomas.