Gurung, Ratna B.; Purdie, Auriol C.; Begg, Douglas J.
2012-01-01
Johne's disease in ruminants is caused by Mycobacterium avium subsp. paratuberculosis. Diagnosis of M. avium subsp. paratuberculosis infection is difficult, especially in the early stages. To date, ideal antigen candidates are not available for efficient immunization or immunodiagnosis. This study reports the in silico selection and subsequent analysis of epitopes of M. avium subsp. paratuberculosis proteins that were found to be upregulated under stress conditions as a means to identify immunogenic candidate proteins. Previous studies have reported differential regulation of proteins when M. avium subsp. paratuberculosis is exposed to stressors which induce a response similar to dormancy. Dormancy may be involved in evading host defense mechanisms, and the host may also mount an immune response against these proteins. Twenty-five M. avium subsp. paratuberculosis proteins that were previously identified as being upregulated under in vitro stress conditions were analyzed for B and T cell epitopes by use of the prediction tools at the Immune Epitope Database and Analysis Resource. Major histocompatibility complex class I T cell epitopes were predicted using an artificial neural network method, and class II T cell epitopes were predicted using the consensus method. Conformational B cell epitopes were predicted from the relevant three-dimensional structure template for each protein. Based on the greatest number of predicted epitopes, eight proteins (MAP2698c [encoded by desA2], MAP2312c [encoded by fadE19], MAP3651c [encoded by fadE3_2], MAP2872c [encoded by fabG5_2], MAP3523c [encoded by oxcA], MAP0187c [encoded by sodA], and the hypothetical proteins MAP3567 and MAP1168c) were identified as potential candidates for study of antibody- and cell-mediated immune responses within infected hosts. PMID:22496492
Protein Secondary Structure Prediction Using AutoEncoder Network and Bayes Classifier
NASA Astrophysics Data System (ADS)
Wang, Leilei; Cheng, Jinyong
2018-03-01
Protein secondary structure prediction is an important research problem in bioinformatics. In this paper, we propose a new method for predicting protein secondary structure using a Bayes classifier and an autoencoder network. Our experiments cover several algorithmic steps, including the construction of the model and the classification of parameters. The data set is the standard CB513 protein data set. Accuracy is assessed by 3-fold cross-validation and reported as Q3 accuracy. The results indicate that the autoencoder network improved the prediction accuracy of protein secondary structure.
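For readers unfamiliar with the metric, Q3 accuracy is the fraction of residues whose three-state label (helix H, strand E, coil C) is predicted correctly. The following is a minimal sketch, assuming predictions and observations are given as equal-length label strings; the function name `q3_accuracy` and the example strings are illustrative and do not come from the paper.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose three-state label (H, E, C) matches."""
    if len(predicted) != len(observed):
        raise ValueError("label strings must have the same length")
    if not observed:
        raise ValueError("label strings must be non-empty")
    # Count positions where the predicted state equals the observed state.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical 8-residue example: 6 of 8 states agree.
example_q3 = q3_accuracy("HHHECCCC", "HHHHCCCE")
```

In a 3-fold cross-validation such as the one described, this value would be computed on each held-out fold and then averaged.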
A protein-dependent side-chain rotamer library.
Bhuyan, Md Shariful Islam; Gao, Xin
2011-12-14
Protein side-chain packing problem has remained one of the key open problems in bioinformatics. The three main components of protein side-chain prediction methods are a rotamer library, an energy function and a search algorithm. Rotamer libraries summarize the existing knowledge of the experimentally determined structures quantitatively. Depending on how much contextual information is encoded, there are backbone-independent rotamer libraries and backbone-dependent rotamer libraries. Backbone-independent libraries only encode sequential information, whereas backbone-dependent libraries encode both sequential and locally structural information. However, side-chain conformations are determined by spatially local information, rather than sequentially local information. Since in the side-chain prediction problem, the backbone structure is given, spatially local information should ideally be encoded into the rotamer libraries. In this paper, we propose a new type of backbone-dependent rotamer library, which encodes structural information of all the spatially neighboring residues. We call it protein-dependent rotamer libraries. Given any rotamer library and a protein backbone structure, we first model the protein structure as a Markov random field. Then the marginal distributions are estimated by the inference algorithms, without doing global optimization or search. The rotamers from the given library are then re-ranked and associated with the updated probabilities. Experimental results demonstrate that the proposed protein-dependent libraries significantly outperform the widely used backbone-dependent libraries in terms of the side-chain prediction accuracy and the rotamer ranking ability. Furthermore, without global optimization/search, the side-chain prediction power of the protein-dependent library is still comparable to the global-search-based side-chain prediction methods.
Transcriptomic analysis of Arabidopsis developing stems: a close-up on cell wall genes
Minic, Zoran; Jamet, Elisabeth; San-Clemente, Hélène; Pelletier, Sandra; Renou, Jean-Pierre; Rihouey, Christophe; Okinyo, Denis PO; Proux, Caroline; Lerouge, Patrice; Jouanin, Lise
2009-01-01
Background Different strategies (genetics, biochemistry, and proteomics) can be used to study proteins involved in cell biogenesis. The availability of the complete sequences of several plant genomes allowed the development of transcriptomic studies. Although the expression patterns of some Arabidopsis thaliana genes involved in cell wall biogenesis were identified at different physiological stages, detailed microarray analysis of plant cell wall genes has not been performed on any plant tissues. Using transcriptomic and bioinformatic tools, we studied the regulation of cell wall genes in Arabidopsis stems, i.e., genes encoding proteins involved in cell wall biogenesis and genes encoding secreted proteins. Results Transcriptomic analyses of stems were performed at three different developmental stages, i.e., young stems, intermediate stage, and mature stems. Many genes involved in the synthesis of cell wall components such as polysaccharides and monolignols were identified. A total of 345 genes encoding predicted secreted proteins with moderate or high level of transcripts were analyzed in detail. The encoded proteins were distributed into 8 classes, based on the presence of predicted functional domains. Proteins acting on carbohydrates and proteins of unknown function constituted the two most abundant classes. Other proteins were proteases, oxido-reductases, proteins with interacting domains, proteins involved in signalling, and structural proteins. Particularly high levels of expression were established for genes encoding pectin methylesterases, germin-like proteins, arabinogalactan proteins, fasciclin-like arabinogalactan proteins, and structural proteins. Finally, the results of these transcriptomic analyses were compared with those obtained through a cell wall proteomic analysis from the same material. Only a small proportion of genes identified by previous proteomic analyses were identified by transcriptomics.
Conversely, only a few proteins encoded by genes having moderate or high level of transcripts were identified by proteomics. Conclusion Analysis of the genes predicted to encode cell wall proteins revealed that about 345 genes had moderate or high levels of transcripts. Among them, we identified many new genes possibly involved in cell wall biogenesis. The discrepancies observed between results of this transcriptomic study and a previous proteomic study on the same material revealed post-transcriptional mechanisms of regulation of expression of genes encoding cell wall proteins. PMID:19149885
Molecular cloning and characterization of alpha-galactosidase gene from Glaciozyma antarctica
NASA Astrophysics Data System (ADS)
Moheer, Reyad Qaed Al; Bakar, Farah Diba Abu; Murad, Abdul Munir Abdul
2015-09-01
Psychrophilic enzymes are proteins produced by psychrophilic organisms and have recently come into the limelight for industrial applications. A gene encoding an α-galactosidase of glycoside hydrolase family 27 from the psychrophilic yeast Glaciozyma antarctica PI12 was isolated and analyzed using several bioinformatic tools. The 1,404-bp cDNA of the gene encodes a protein of 467 amino acid residues. The predicted molecular weight of the protein was 48.59 kDa, and hence we named the gene encoding α-galactosidase GAL48. We found that the predicted protein sequence possesses a signal peptide sequence and is highly conserved among fungal α-galactosidases.
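The reported cDNA and protein lengths can be checked against each other with simple codon arithmetic. The sketch below assumes that the 1,404 bp correspond to the coding sequence including one stop codon, which the abstract does not state explicitly; the function and variable names are illustrative.

```python
CODON_LENGTH = 3  # nucleotides per codon


def coding_length_bp(n_residues: int, include_stop: bool = True) -> int:
    """Length in bp of a coding sequence for a protein of n_residues."""
    codons = n_residues + (1 if include_stop else 0)
    return codons * CODON_LENGTH


# 467 residues plus one stop codon (assumed) gives the coding length in bp.
gal48_cds_bp = coding_length_bp(467)

# Average mass per residue implied by the predicted 48.59 kDa, in daltons.
avg_residue_mass_da = 48.59 * 1000 / 467
```

Under that assumption, (467 + 1) × 3 = 1,404 bp, which matches the reported cDNA size, and the implied average residue mass of roughly 104 Da is in the usual range for proteins.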
Sequence heuristics to encode phase behaviour in intrinsically disordered protein polymers
Quiroz, Felipe García; Chilkoti, Ashutosh
2015-01-01
Proteins and synthetic polymers that undergo aqueous phase transitions mediate self-assembly in nature and in man-made material systems. Yet little is known about how the phase behaviour of a protein is encoded in its amino acid sequence. Here, by synthesizing intrinsically disordered, repeat proteins to test motifs that we hypothesized would encode phase behaviour, we show that the proteins can be designed to exhibit tunable lower or upper critical solution temperature (LCST and UCST, respectively) transitions in physiological solutions. We also show that mutation of key residues at the repeat level abolishes phase behaviour or encodes an orthogonal transition. Furthermore, we provide heuristics to identify, at the proteome level, proteins that might exhibit phase behaviour and to design novel protein polymers consisting of biologically active peptide repeats that exhibit LCST or UCST transitions. These findings set the foundation for the prediction and encoding of phase behaviour at the sequence level. PMID:26390327
DNA Asymmetric Strand Bias Affects the Amino Acid Composition of Mitochondrial Proteins
Min, Xiang Jia; Hickey, Donal A.
2007-01-01
Abstract Variations in GC content between genomes have been extensively documented. Genomes with comparable GC contents can, however, still differ in the apportionment of the G and C nucleotides between the two DNA strands. This asymmetric strand bias is known as GC skew. Here, we have investigated the impact of differences in nucleotide skew on the amino acid composition of the encoded proteins. We compared orthologous genes between animal mitochondrial genomes that show large differences in GC and AT skews. Specifically, we compared the mitochondrial genomes of mammals, which are characterized by a negative GC skew and a positive AT skew, to those of flatworms, which show the opposite skews for both GC and AT base pairs. We found that the mammalian proteins are highly enriched in amino acids encoded by CA-rich codons (as predicted by their negative GC and positive AT skews), whereas their flatworm orthologs were enriched in amino acids encoded by GT-rich codons (also as predicted from their skews). We found that these differences in mitochondrial strand asymmetry (measured as GC and AT skews) can have very large, predictable effects on the composition of the encoded proteins. PMID:17974594
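Strand skews of this kind are commonly computed as (G − C)/(G + C) for GC skew and (A − T)/(A + T) for AT skew on a single strand. The following is a minimal sketch under that common definition; the abstract does not give its exact formula, and the toy sequence is invented for illustration rather than taken from any mitochondrial genome.

```python
def skew(sequence: str, first: str, second: str) -> float:
    """Return (first - second) / (first + second) for two base counts."""
    seq = sequence.upper()
    n_first = seq.count(first)
    n_second = seq.count(second)
    if n_first + n_second == 0:
        raise ValueError(f"sequence contains neither {first} nor {second}")
    return (n_first - n_second) / (n_first + n_second)


def gc_skew(sequence: str) -> float:
    return skew(sequence, "G", "C")


def at_skew(sequence: str) -> float:
    return skew(sequence, "A", "T")


# Invented CA-rich strand: 6 A, 7 C, 1 G, 1 T.
toy_strand = "ACCACAACCTCAGCA"
toy_gc_skew = gc_skew(toy_strand)  # (1 - 7) / 8
toy_at_skew = at_skew(toy_strand)  # (6 - 1) / 7
```

A strand with more C than G and more A than T, as in the toy example, has a negative GC skew and a positive AT skew, which is the pattern the abstract attributes to mammalian mitochondrial genomes.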
Root-Bernstein, Robert; Root-Bernstein, Meredith
2016-05-21
We have proposed that the ribosome may represent a missing link between prebiotic chemistries and the first cells. One of the predictions that follows from this hypothesis, which we test here, is that ribosomal RNA (rRNA) must have encoded the proteins necessary for ribosomal function. In other words, the rRNA also functioned pre-biotically as mRNA. Since these ribosome-binding proteins (rb-proteins) must bind to the rRNA, but the rRNA also functioned as mRNA, it follows that rb-proteins should bind to their own mRNA as well. This hypothesis can be contrasted to a "null" hypothesis in which rb-proteins evolved independently of the rRNA sequences and therefore there should be no necessary similarity between the rRNA to which rb-proteins bind and the mRNA that encodes the rb-protein. Five types of evidence reported here support the plausibility of the hypothesis that the mRNA encoding rb-proteins evolved from rRNA: (1) the ubiquity of rb-protein binding to their own mRNAs and autogenous control of their own translation; (2) the higher-than-expected incidence of Arginine-rich modules associated with RNA binding that occurs in rRNA-encoded proteins; (3) the fact that rRNA-binding regions of rb-proteins are homologous to their mRNA binding regions; (4) the higher than expected incidence of rb-protein sequences encoded in rRNA that are of a high degree of homology to their mRNA as compared with a random selection of other proteins; and (5) rRNA in modern prokaryotes and eukaryotes encodes functional proteins. None of these results can be explained by the null hypothesis that assumes independent evolution of rRNA and the mRNAs encoding ribosomal proteins. Also noteworthy is that very few proteins bind their own mRNAs that are not associated with ribosome function. 
Further tests of the hypothesis are suggested: (1) experimental testing of whether rRNA-encoded proteins bind to rRNA at their coding sites; (2) whether tRNA synthetases, which are also known to bind to their own mRNAs, are encoded by the tRNA sequences themselves; and (3) the prediction that archaeal and prokaryotic (DNA-based) genomes were built around rRNA "genes" so that rRNA-related sequences will be found to make up an unexpectedly high proportion of these genomes. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Norton, Jeanette M.; Klotz, Martin G; Stein, Lisa Y
2008-01-01
The complete genome of the ammonia-oxidizing bacterium, Nitrosospira multiformis (ATCC 25196T), consists of a circular chromosome and three small plasmids totaling 3,234,309 bp and encoding 2827 putative proteins. Of these, 2026 proteins have predicted functions and 801 are without conserved functional domains, yet 747 of these have similarity to other predicted proteins in databases. Gene homologs from Nitrosomonas europaea and N. eutropha were the best match for 42% of the predicted genes in N. multiformis. The genome contains three nearly identical copies of amo and hao gene clusters as large repeats. Distinguishing features compared to N. europaea include: the presence of gene clusters encoding urease and hydrogenase, a RuBisCO-encoding operon of distinctive structure and phylogeny, and a relatively small complement of genes related to Fe acquisition. Systems for synthesis of a pyoverdine-like siderophore and for acyl-homoserine lactone were unique to N. multiformis among the sequenced AOB genomes. Gene clusters encoding proteins associated with outer membrane and cell envelope functions including transporters, porins, exopolysaccharide synthesis, capsule formation and protein sorting/export were abundant. Numerous sensory transduction and response regulator gene systems directed towards sensing of the extracellular environment are described. Gene clusters for glycogen, polyphosphate and cyanophycin storage and utilization were identified providing mechanisms for meeting energy requirements under substrate-limited conditions. The genome of N. multiformis encodes the core pathways for chemolithoautotrophy along with adaptations for surface growth and survival in soil environments.
Three reasons protein disorder analysis makes more sense in the light of collagen
Oates, Matt E.; Tompa, Peter; Gough, Julian
2016-01-01
Abstract We have identified that the collagen helix has the potential to be disruptive to analyses of intrinsically disordered proteins. The collagen helix is an extended fibrous structure that is both promiscuous and repetitive. Whilst its sequence is predicted to be disordered, this type of protein structure is not typically considered as intrinsic disorder. Here, we show that collagen‐encoding proteins skew the distribution of exon lengths in genes. We find that previous results, demonstrating that exons encoding disordered regions are more likely to be symmetric, are due to the abundance of the collagen helix. Other related results, showing increased levels of alternative splicing in disorder‐encoding exons, still hold after considering collagen‐containing proteins. Aside from analyses of exons, we find that the set of proteins that contain collagen significantly alters the amino acid composition of regions predicted as disordered. We conclude that research in this area should be conducted in the light of the collagen helix. PMID:26941008
Qiu, T; Lu, R H; Zhang, J; Zhu, Z Y
2001-07-01
The complete nucleotide sequence of M6 gene of grass carp hemorrhage virus (GCHV) was determined. It is 2039 nucleotides in length and contains a single large open reading frame that could encode a protein of 648 amino acids with predicted molecular mass of 68.7 kDa. Amino acid sequence comparison revealed that the protein encoded by GCHV M6 is closely related to the protein mu1 of mammalian reovirus. The M6 gene, encoding the major outer-capsid protein, was expressed using the pET fusion protein vector in Escherichia coli and detected by Western blotting using chicken anti-GCHV immunoglobulin (IgY). The result indicates that the protein encoded by M6 may share a putative Asn-42-Pro-43 proteolytic cleavage site with mu1.
Genes encoding giant danio and golden shiner ependymin.
Adams, D S; Kiyokawa, M; Getman, M E; Shashoua, V E
1996-03-01
Ependymin (EPN) is a brain glycoprotein that functions as a neurotrophic factor in optic nerve regeneration and long-term memory consolidation in goldfish. To date, true epn genes have been characterized in one order of teleost fish, Cypriniformes. In the study presented here, polymerase chain reactions were used to analyze the complete epn genes, gd (1480 bp), and sh (2071 bp), from Cypriniformes giant danio and shiner, respectively. Southern hybridizations demonstrated the existence of one copy of each gene per corresponding haploid genome. Each gene was found to contain six exons and five introns. Gene gd encodes a predicted 218-amino acid (aa) protein GD 93 percent conserved to goldfish EPN, while sh encodes a predicted 214-aa protein SH 91 percent homologous to goldfish. Evidence is presented classifying proteins previously termed "EPNs" into two major categories: true EPNs and non-EPN cerebrospinal fluid glycoproteins. Proteins GD and SH contain all the hallmark features of true EPNs.
Gao, J; Naglich, J G; Laidlaw, J; Whaley, J M; Seizinger, B R; Kley, N
1995-02-15
The human von Hippel-Lindau disease (VHL) gene has recently been identified and, based on the nucleotide sequence of a partial cDNA clone, has been predicted to encode a novel protein with as yet unknown functions [F. Latif et al., Science (Washington DC), 260: 1317-1320, 1993]. The length of the encoded protein and the characteristics of the cellular expressed protein are as yet unclear. Here we report the cloning and characterization of a mouse gene (mVHLh1) that is widely expressed in different mouse tissues and shares high homology with the human VHL gene. It predicts a protein 181 residues long (and/or 162 amino acids, considering a potential alternative start codon), which across a core region of approximately 140 residues displays a high degree of sequence identity (98%) to the predicted human VHL protein. High stringency DNA and RNA hybridization experiments and protein expression analyses indicate that this gene is the most highly VHL-related mouse gene, suggesting that it represents the mouse VHL gene homologue rather than a related gene sharing a conserved functional domain. These findings provide new insights into the potential organization of the VHL gene and nature of its encoded protein.
Otsuki, Tetsuji; Ota, Toshio; Nishikawa, Tetsuo; Hayashi, Koji; Suzuki, Yutaka; Yamamoto, Jun-ichi; Wakamatsu, Ai; Kimura, Kouichi; Sakamoto, Katsuhiko; Hatano, Naoto; Kawai, Yuri; Ishii, Shizuko; Saito, Kaoru; Kojima, Shin-ichi; Sugiyama, Tomoyasu; Ono, Tetsuyoshi; Okano, Kazunori; Yoshikawa, Yoko; Aotsuka, Satoshi; Sasaki, Naokazu; Hattori, Atsushi; Okumura, Koji; Nagai, Keiichi; Sugano, Sumio; Isogai, Takao
2005-01-01
We have developed an in silico method of selection of human full-length cDNAs encoding secretion or membrane proteins from oligo-capped cDNA libraries. Fullness rates were increased to about 80% by combination of the oligo-capping method and ATGpr, software for prediction of translation start point and the coding potential. Then, using 5'-end single-pass sequences, cDNAs having the signal sequence were selected by PSORT ('signal sequence trap'). We also applied 'secretion or membrane protein-related keyword trap' based on the result of BLAST search against the SWISS-PROT database for the cDNAs which could not be selected by PSORT. Using the above procedures, 789 cDNAs were primarily selected and subjected to full-length sequencing, and 334 of these cDNAs were finally selected as novel. Most of the cDNAs (295 cDNAs: 88.3%) were predicted to encode secretion or membrane proteins. In particular, 165 (80.5%) of the 205 cDNAs selected by PSORT were predicted to have signal sequences, while 70 (54.2%) of the 129 cDNAs selected by 'keyword trap' preserved the secretion or membrane protein-related keywords. Many important cDNAs were obtained, including transporters, receptors, and ligands, involved in significant cellular functions. Thus, an efficient method of selecting secretion or membrane protein-encoding cDNAs was developed by combining the above four procedures.
Song, Jiangning; Yuan, Zheng; Tan, Hao; Huber, Thomas; Burrage, Kevin
2007-12-01
Disulfide bonds are primary covalent crosslinks between two cysteine residues in proteins that play critical roles in stabilizing the protein structures and are commonly found in extracytoplasmic or secreted proteins. In protein folding prediction, the localization of disulfide bonds can greatly reduce the search in conformational space. Therefore, there is a great need to develop computational methods capable of accurately predicting disulfide connectivity patterns in proteins that could have potentially important applications. We have developed a novel method to predict disulfide connectivity patterns from protein primary sequence, using a support vector regression (SVR) approach based on multiple sequence feature vectors and predicted secondary structure by the PSIPRED program. The results indicate that our method could achieve a prediction accuracy of 74.4% and 77.9%, respectively, when averaged on proteins with two to five disulfide bridges using 4-fold cross-validation, measured on the protein and cysteine pair on a well-defined non-homologous dataset. We assessed the effects of different sequence encoding schemes on the prediction performance of disulfide connectivity. It has been shown that the sequence encoding scheme based on multiple sequence feature vectors coupled with predicted secondary structure can significantly improve the prediction accuracy, thus enabling our method to outperform most other currently available predictors. Our work provides a complementary approach to the current algorithms that should be useful in computationally assigning disulfide connectivity patterns and helps in the annotation of protein sequences generated by large-scale whole-genome projects. The prediction web server and Supplementary Material are accessible at http://foo.maths.uq.edu.au/~huber/disulfide
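To give a sense of the size of the prediction problem, the number of possible connectivity patterns grows quickly with the number of bridges: a protein with n disulfide bonds formed by 2n cysteines has (2n − 1)!! possible pairings. The sketch below simply enumerates those pairings; it is not the authors' SVR method, and the function name is illustrative.

```python
from typing import Iterator, List, Tuple

Pairing = List[Tuple[int, int]]


def connectivity_patterns(cysteines: List[int]) -> Iterator[Pairing]:
    """Yield every way to pair up an even number of cysteine positions."""
    if not cysteines:
        yield []
        return
    first, rest = cysteines[0], cysteines[1:]
    # Bond the first cysteine to each possible partner, then recurse
    # on the cysteines that remain unpaired.
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in connectivity_patterns(remaining):
            yield [(first, partner)] + tail


# Number of candidate patterns for proteins with two to five bridges.
pattern_counts = {
    n: sum(1 for _ in connectivity_patterns(list(range(2 * n))))
    for n in range(2, 6)
}
```

For two to five bridges this gives 3, 15, 105 and 945 candidate patterns, so a uniformly random guess at the full pattern becomes very unlikely to be right as the number of bridges increases.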
Cloning and sequencing the genes encoding goldfish and carp ependymin.
Adams, D S; Shashoua, V E
1994-04-20
Ependymins (EPNs) are brain glycoproteins thought to function in optic nerve regeneration and long-term memory consolidation. To date, epn genes have been characterized in two orders of teleost fish. In this study, polymerase chain reactions (PCR) were used to amplify the complete 1.6-kb epn genes, gf-I and cc-I, from genomic DNA of Cypriniformes, goldfish and carp, respectively. Amplified bands were cloned and sequenced. Each gene consists of six exons and five introns. The exon portion of gf-I encodes a predicted 215-amino-acid (aa) protein previously characterized as GF-I, while cc-I encodes a predicted 215-aa protein 95% homologous to GF-I.
The prediction of biogenic magnetic nanoparticles biomineralization in human tissues and organs
NASA Astrophysics Data System (ADS)
Medviediev, O.; Gorobets, O. Yu; Gorobets, S. V.; Yadrykhins'ky, V. S.
2017-10-01
In this study, human homologs of magnetosome island proteins were identified on the basis of pairwise and multiple alignments of amino acid sequences. The expression levels of genes encoding magnetosome island proteins of M. gryphiswaldense MSR-1 cultured under oxygen-deficient and under microaerobic conditions were compared with the expression levels of the genes encoding the corresponding homologs in humans. On this basis, biomineralization of biogenic magnetic nanoparticles (BMN) was predicted to be possible in human tissues and organs in which BMN had not previously been found experimentally.
Rosemblat, S; Durham-Pierre, D; Gardner, J M; Nakatsu, Y; Brilliant, M H; Orlow, S J
1994-01-01
The pink-eyed dilution (p) locus in the mouse is critical to melanogenesis; mutations in the homologous locus in humans, P, are a cause of type II oculocutaneous albinism. Although a cDNA encoded by the p gene has recently been identified, nothing is known about the protein product of this gene. To characterize the protein encoded by the p gene, we performed immunoblot analysis of extracts of melanocytes cultured from wild-type mice with an antiserum from rabbits immunized with a peptide corresponding to amino acids 285-298 of the predicted protein product of the murine p gene. This antiserum recognized a 110-kDa protein. The protein was absent from extracts of melanocytes cultured from mice with two mutations (pcp and p) in which transcripts of the p gene are absent or greatly reduced. Introduction of the cDNA for the p gene into pcp melanocytes by electroporation resulted in expression of the 3.3-kb mRNA and the 110-kDa protein. Upon subcellular fractionation of cultured melanocytes, the 110-kDa protein was found to be present in melanosomes but absent from the vesicular fraction; phase separation performed with the nonionic detergent Triton X-114 confirmed the predicted hydrophobic nature of the protein. These results demonstrate that the p gene encodes a 110-kDa integral melanosomal membrane protein and establish a framework by which mutations at this locus, which diminish pigmentation, can be analyzed at the cellular and biochemical levels. Images PMID:7991586
Trabanino, Rene J; Vaidehi, Nagarajan; Hall, Spencer E; Goddard, William A; Floriano, Wely
2013-02-05
The invention provides computer-implemented methods and apparatus implementing a hierarchical protocol using multiscale molecular dynamics and molecular modeling methods to predict the presence of transmembrane regions in proteins, such as G-Protein Coupled Receptors (GPCR), and protein structural models generated according to the protocol. The protocol features a coarse grain sampling method, such as hydrophobicity analysis, to provide a fast and accurate procedure for predicting transmembrane regions. Methods and apparatus of the invention are useful to screen protein or polynucleotide databases for encoded proteins with transmembrane regions, such as GPCRs.
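A coarse hydrophobicity analysis of the kind mentioned above can be illustrated with a sliding-window hydropathy average. The sketch below uses the Kyte-Doolittle scale with a 19-residue window and a cutoff of 1.6, a widely used heuristic for flagging candidate transmembrane segments; the patent abstract does not specify which scale, window or cutoff its protocol uses, so all three are assumptions here, as are the function names and the toy sequence.

```python
from typing import List

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def windowed_hydropathy(sequence: str, window: int = 19) -> List[float]:
    """Mean hydropathy of every window; raises KeyError on unknown residues."""
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    if len(scores) < window:
        return []
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_tm_windows(sequence: str, window: int = 19,
                         threshold: float = 1.6) -> List[int]:
    """Start indices of windows whose mean hydropathy reaches the threshold."""
    averages = windowed_hydropathy(sequence, window)
    return [i for i, avg in enumerate(averages) if avg >= threshold]


# Invented sequence: charged flanks around a 19-residue hydrophobic stretch.
toy_sequence = "DEKRDEKRDE" + "LIVALLIVAFLLIVALLIV" + "DEKRDEKRDE"
toy_hits = candidate_tm_windows(toy_sequence)
```

A screen of this kind is fast enough to run over a whole protein database, after which only the flagged sequences would need the more expensive molecular dynamics and modeling steps.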
Livingston, B T; Shaw, R; Bailey, A; Wilt, F
1991-12-01
In order to investigate the role of proteins in the formation of mineralized tissues during development, we have isolated a cDNA that encodes a protein that is a component of the organic matrix of the skeletal spicule of the sea urchin, Lytechinus pictus. The expression of the RNA encoding this protein is regulated over development and is localized to the descendants of the micromere lineage. Comparison of the sequence of this cDNA to homologous cDNAs from other species of urchin reveals that the protein is basic and contains three conserved structural motifs: a signal peptide, a proline-rich region, and an unusual region composed of a series of direct repeats. Studies on the protein encoded by this cDNA confirm the predicted reading frame deduced from the nucleotide sequence and show that the protein is secreted and not glycosylated. Comparison of the amino acid sequence to databases reveals that the repeat domain is similar to proteins that form a unique beta-spiral supersecondary structure.
Gabe, Jeffrey D.; Dragon, Elizabeth; Chang, Ray-Jen; McCaman, Michael T.
1998-01-01
A tandem pair of nearly identical genes from Serpulina hyodysenteriae (B204) were cloned and sequenced. The full open reading frame of one gene and the partial open reading frame of the neighboring gene appear to encode secreted proteins which are homologous to, yet distinct from, the 39-kDa extracytoplasmic protein purified from the membrane fraction of S. hyodysenteriae. We have designated these newly identified genes vspA and vspB (for variable surface protein). PMID:9440540
Predicting Drug-Target Interaction Networks Based on Functional Groups and Biological Features
Shi, Xiao-He; Hu, Le-Le; Kong, Xiangyin; Cai, Yu-Dong; Chou, Kuo-Chen
2010-01-01
Background Study of drug-target interaction networks is an important topic for drug development. It is both time-consuming and costly to determine compound-protein interactions or potential drug-target interactions by experiments alone. As a complement, the in silico prediction methods can provide us with very useful information in a timely manner. Methods/Principal Findings To realize this, drug compounds are encoded with functional groups and proteins are encoded by biological features including biochemical and physicochemical properties. The optimal feature selection procedures are adopted by means of the mRMR (Maximum Relevance Minimum Redundancy) method. Instead of classifying the proteins as a whole family, target proteins are divided into four groups: enzymes, ion channels, G-protein-coupled receptors and nuclear receptors. Thus, four independent predictors are established using the Nearest Neighbor algorithm as their operation engine, with each to predict the interactions between drugs and one of the four protein groups. As a result, the overall success rates by the jackknife cross-validation tests achieved with the four predictors are 85.48%, 80.78%, 78.49%, and 85.66%, respectively. Conclusion/Significance Our results indicate that the network prediction system thus established is quite promising and encouraging. PMID:20300175
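The evaluation scheme described above, a nearest-neighbor predictor scored by a jackknife (leave-one-out) test, can be sketched in a few lines. This is only an illustration of the general technique: the Euclidean distance, the two-dimensional toy feature vectors and the function names are assumptions, and the paper's functional-group and protein-feature encodings and its mRMR feature selection are not reproduced.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (feature vector, class label)


def nearest_neighbor_label(query: Sequence[float],
                           training: List[Sample]) -> int:
    """Label of the training sample closest to the query (Euclidean)."""
    closest = min(training, key=lambda sample: math.dist(query, sample[0]))
    return closest[1]


def jackknife_success_rate(samples: List[Sample]) -> float:
    """Leave each sample out in turn and predict it from all the others."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        others = samples[:i] + samples[i + 1:]
        if nearest_neighbor_label(features, others) == label:
            correct += 1
    return correct / len(samples)


# Invented, well-separated data: label 1 = interacting pair, 0 = not.
toy_samples: List[Sample] = [
    ([0.0, 0.1], 0), ([0.1, 0.0], 0), ([0.2, 0.1], 0),
    ([1.0, 0.9], 1), ([0.9, 1.0], 1), ([1.1, 1.0], 1),
]
toy_rate = jackknife_success_rate(toy_samples)
```

Because every sample is predicted exactly once from all the others, the jackknife gives a single deterministic success rate for a given data set, unlike a randomly partitioned k-fold test.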
Draft Map of Human Proteome Published | Office of Cancer Clinical Proteomics Research
In a recently published article in the journal Nature, researchers have developed a draft map of the human proteome. Striving for the protein equivalent of the Human Genome Project, an international team of researchers has created an initial catalog of the human proteome. In total, using 30 different human tissues, the researchers identified proteins encoded by 17,294 genes, which is approximately 84 percent of all of the genes in the human genome predicted to encode proteins.
2011-01-01
The genomic DNA sequence of a novel enteric uncultured microphage, ΦCA82 from a turkey gastrointestinal system was determined utilizing metagenomics techniques. The entire circular, single-stranded nucleotide sequence of the genome was 5,514 nucleotides. The ΦCA82 genome is quite different from other microviruses as indicated by comparisons of nucleotide similarity, predicted protein similarity, and functional classifications. Only three genes showed significant similarity to microviral proteins as determined by local alignments using BLAST analysis. ORF1 encoded a predicted phage F capsid protein that was phylogenetically most similar to the Microviridae ΦMH2K member's major coat protein. The ΦCA82 genome also encoded a predicted minor capsid protein (ORF2) and putative replication initiation protein (ORF3) most similar to the microviral bacteriophage SpV4. The distant evolutionary relationship of ΦCA82 suggests that the divergence of this novel turkey microvirus from other microviruses may reflect unique evolutionary pressures encountered within the turkey gastrointestinal system. PMID:21714899
Abdelkader, E H; Feintuch, A; Yao, X; Adams, L A; Aurelio, L; Graham, B; Goldfarb, D; Otting, G
2015-11-14
Quantitative cysteine-independent ligation of a Gd(3+) tag to genetically encoded p-azido-L-phenylalanine via Cu(I)-catalyzed click chemistry is shown to deliver an exceptionally powerful tool for Gd(3+)-Gd(3+) distance measurements by double electron-electron resonance (DEER) experiments, as the position of the Gd(3+) ion relative to the protein can be predicted with high accuracy.
Hsu, Jack C-C; Reid, David W; Hoffman, Alyson M; Sarkar, Devanand; Nicchitta, Christopher V
2018-05-01
Astrocyte elevated gene-1 (AEG-1), an oncogene whose overexpression promotes tumor cell proliferation, angiogenesis, invasion, and enhanced chemoresistance, is thought to function primarily as a scaffolding protein, regulating PI3K/Akt and Wnt/β-catenin signaling pathways. Here we report that AEG-1 is an endoplasmic reticulum (ER) resident integral membrane RNA-binding protein (RBP). Examination of the AEG-1 RNA interactome by HITS-CLIP and PAR-CLIP methodologies revealed a high enrichment for endomembrane organelle-encoding transcripts, most prominently those encoding ER resident proteins, and within this cohort, for integral membrane protein-encoding RNAs. Cluster mapping of the AEG-1/RNA interaction sites demonstrated a normalized rank order interaction of coding sequence >5' untranslated region, with 3' untranslated region interactions only weakly represented. Intriguingly, AEG-1/membrane protein mRNA interaction sites clustered downstream from encoded transmembrane domains, suggestive of a role in membrane protein biogenesis. Secretory and cytosolic protein-encoding mRNAs were also represented in the AEG-1 RNA interactome, with the latter category notably enriched in genes functioning in mRNA localization, translational regulation, and RNA quality control. Bioinformatic analyses of RNA-binding motifs and predicted secondary structure characteristics indicate that AEG-1 lacks established RNA-binding sites though shares the property of high intrinsic disorder commonly seen in RBPs. These data implicate AEG-1 in the localization and regulation of secretory and membrane protein-encoding mRNAs and provide a framework for understanding AEG-1 function in health and disease. © 2018 Hsu et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
Valles, Steven M; Bell, Susanne; Firth, Andrew E
2014-01-01
Solenopsis invicta virus 3 (SINV-3) is a positive-sense single-stranded RNA virus that infects the red imported fire ant, Solenopsis invicta. We show that the second open reading frame (ORF) of the dicistronic genome is expressed via a frameshifting mechanism and that the sequences encoding the structural proteins map to both ORF2 and the 3' end of ORF1, downstream of the sequence that encodes the RNA-dependent RNA polymerase. The genome organization and structural protein expression strategy resemble those of Acyrthosiphon pisum virus (APV), an aphid virus. The capsid protein that is encoded by the 3' end of ORF1 in SINV-3 and APV is predicted to have a jelly-roll fold similar to the capsid proteins of picornaviruses and caliciviruses. The capsid-extension protein that is produced by frameshifting includes the jelly-roll fold domain encoded by ORF1 as its N-terminus, while the C-terminus encoded by the 5' half of ORF2 has no clear homology with other viral structural proteins. A third protein, encoded by the 3' half of ORF2, is associated with purified virions at sub-stoichiometric ratios. Although the structural proteins can be translated from the genomic RNA, we show that SINV-3 also produces a subgenomic RNA encoding the structural proteins. Circumstantial evidence suggests that APV may also produce such a subgenomic RNA. Both SINV-3 and APV are unclassified picorna-like viruses distantly related to members of the order Picornavirales and the family Caliciviridae. Within this grouping, features of the genome organization and capsid domain structure of SINV-3 and APV appear more similar to caliciviruses, perhaps suggesting the basis for a "Calicivirales" order.
RGAugury: a pipeline for genome-wide prediction of resistance gene analogs (RGAs) in plants.
Li, Pingchuan; Quan, Xiande; Jia, Gaofeng; Xiao, Jin; Cloutier, Sylvie; You, Frank M
2016-11-02
Resistance gene analogs (RGAs), such as NBS-encoding proteins, receptor-like protein kinases (RLKs) and receptor-like proteins (RLPs), are potential R-genes that contain specific conserved domains and motifs. Thus, RGAs can be predicted based on their conserved structural features using bioinformatics tools. Computer programs have been developed for the identification of individual domains and motifs from the protein sequences of RGAs but none offer a systematic assessment of the different types of RGAs. A user-friendly and efficient pipeline is needed for large-scale genome-wide RGA predictions of the growing number of sequenced plant genomes. An integrative pipeline, named RGAugury, was developed to automate RGA prediction. The pipeline first identifies RGA-related protein domains and motifs, namely nucleotide binding site (NB-ARC), leucine rich repeat (LRR), transmembrane (TM), serine/threonine and tyrosine kinase (STTK), lysin motif (LysM), coiled-coil (CC) and Toll/Interleukin-1 receptor (TIR). RGA candidates are identified and classified into four major families based on the presence of combinations of these RGA domains and motifs: NBS-encoding, TM-CC, and membrane associated RLP and RLK. All time-consuming analyses of the pipeline are parallelized to improve performance. The pipeline was evaluated using the well-annotated Arabidopsis genome. A total of 98.5, 85.2, and 100% of the reported NBS-encoding genes, membrane associated RLPs and RLKs were validated, respectively. The pipeline was also successfully applied to predict RGAs for 50 sequenced plant genomes. A user-friendly web interface was implemented to ease command line operations, facilitate visualization and simplify result management for multiple datasets. RGAugury is an efficient, integrative bioinformatics tool for large-scale genome-wide identification of RGAs. It is freely available at Bitbucket: https://bitbucket.org/yaanlpc/rgaugury .
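The classification step described above, in which candidates are assigned to a family from the combination of domains detected in them, can be sketched as a small rule table. The domain labels come from the abstract, but the specific rules and their order are an illustrative assumption, not RGAugury's actual decision logic, and `classify_rga` is a hypothetical name.

```python
def classify_rga(domains):
    """Assign a candidate protein to an RGA family from its detected domains.

    The rules below are assumed for illustration only:
      NB-ARC present                      -> NBS-encoding
      TM + kinase (STTK) + LRR or LysM    -> RLK
      TM + LRR or LysM, no kinase         -> RLP
      TM + CC                             -> TM-CC
    Returns None when no rule applies.
    """
    d = set(domains)
    has_ectodomain = bool(d & {"LRR", "LysM"})
    if "NB-ARC" in d:
        return "NBS-encoding"
    if "TM" in d and "STTK" in d and has_ectodomain:
        return "RLK"
    if "TM" in d and has_ectodomain:
        return "RLP"
    if "TM" in d and "CC" in d:
        return "TM-CC"
    return None


# Hypothetical domain calls for four candidate proteins
examples = {
    "cand1": ["TIR", "NB-ARC", "LRR"],
    "cand2": ["TM", "STTK", "LRR"],
    "cand3": ["TM", "LysM"],
    "cand4": ["TM", "CC"],
}
families = {name: classify_rga(doms) for name, doms in examples.items()}
```

Because every candidate is classified independently, a step like this parallelizes trivially across proteins, which fits the abstract's remark that the time-consuming analyses are run in parallel.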
Dunham, S P; Onions, D E
2001-06-21
A cDNA encoding feline granulocyte colony stimulating factor (fG-CSF) was cloned from alveolar macrophages using the reverse transcriptase-polymerase chain reaction. The cDNA is 949 bp in length and encodes a predicted mature protein of 174 amino acids. Recombinant fG-CSF was expressed as a glutathione S-transferase fusion and purified by affinity chromatography. Biological activity of the recombinant protein was demonstrated using the murine myeloblastic cell line GNFS-60, which showed an ED50 for fG-CSF of approximately 2 ng/ml. Copyright 2001 Academic Press.
Elrobh, Mohamed S.; Alanazi, Mohammad S.; Khan, Wajahatullah; Abduljaleel, Zainularifeen; Al-Amri, Abdullah; Bazzi, Mohammad D.
2011-01-01
Heat shock proteins are ubiquitous, induced under a number of environmental and metabolic stresses, with highly conserved DNA sequences among mammalian species. Camelus dromedarius (the Arabian camel), domesticated under semi-desert environments, is well adapted to tolerate and survive severe drought and high temperatures for extended periods. This is the first report of molecular cloning and characterization of a full-length cDNA encoding a putative stress-induced heat shock HSPA6 protein (also called HSP70B′) from the Arabian camel. A full-length cDNA (2417 bp) was obtained by rapid amplification of cDNA ends (RACE) and cloned in pET-b expression vector. The sequence analysis of the HSPA6 gene showed a 1932 bp-long open reading frame encoding 643 amino acids. The complete cDNA sequence of the Arabian camel HSPA6 gene was submitted to NCBI GenBank (accession number HQ214118.1). The BLAST analysis indicated that the C. dromedarius HSPA6 gene nucleotide sequence shared high similarity (77–91%) with heat shock gene nucleotide sequences of other mammals. The deduced 643 amino acid sequence (accession number ADO12067.1) showed that the predicted protein has an estimated molecular weight of 70.5 kDa with a predicted isoelectric point (pI) of 6.0. The comparative analyses of camel HSPA6 protein sequences with other mammalian heat shock proteins (HSPs) showed high identity (80–94%). The camel HSPA6 protein structure predicted using Protein 3D structural analysis showed high similarities with human and mouse HSPs. Taken together, this study indicates that the cDNA sequences of HSPA6 gene and its amino acid and protein structure from the Arabian camel are highly conserved and have similarities with other mammalian species. PMID:21845074
Walker, J; Tait, A
1997-11-01
A reverse-transcriptase polymerase chain reaction (PCR) procedure was used to isolate an Ostertagia circumcincta partial cDNA encoding a protein with general primary sequence features characteristic of members of the mitochondrial processing peptidase (MPP) subfamily of M16 metallopeptidases. The structural relationships of the predicted protein (Oc MPPX) with MPP subfamily proteins from other species (including the model free-living nematode Caenorhabditis elegans) were examined, and Northern analysis confirmed the expression of the Oc mppx gene in adult nematodes.
Construction and Screening of a Lentiviral Secretome Library.
Liu, Tao; Jia, Panpan; Ma, Huailei; Reed, Sean A; Luo, Xiaozhou; Larman, H Benjamin; Schultz, Peter G
2017-06-22
Over 2,000 human proteins are predicted to be secreted, but the biological function of many of these proteins is still unknown. Moreover, a number of these proteins may act as new therapeutic agents or be targets for the development of therapeutic antibodies. To further explore the extracellular proteome, we have developed a secretome-enriched open reading frame (ORF) library that can be readily screened for autocrine activity in cell-based phenotypic or reporter assays. Next-generation sequencing (NGS) and database analysis predict that the library contains approximately 900 ORFs encoding known secreted proteins (accounting for 77.8% of the library), as well as genes encoding potentially unknown secreted proteins. In a proof-of-principle study, human TF-1 cells were screened for proliferative factors, and the known cytokine GMCSF was identified as a dominant hit. This library offers a relatively low-cost and straightforward approach for functional autocrine screens of secreted proteins. Copyright © 2017 Elsevier Ltd. All rights reserved.
Parallel protein secondary structure prediction based on neural networks.
Zhong, Wei; Altun, Gulsah; Tian, Xinmin; Harrison, Robert; Tai, Phang C; Pan, Yi
2004-01-01
Protein secondary structure prediction has a fundamental influence on today's bioinformatics research. In this work, binary and tertiary classifiers of protein secondary structure prediction are implemented on Denoeux belief neural network (DBNN) architecture. Hydrophobicity matrix, orthogonal matrix, BLOSUM62 and PSSM (position specific scoring matrix) are tested separately as the encoding schemes for DBNN. The experimental results contribute to the design of new encoding schemes. The new binary classifier for Helix versus not Helix (~H) for DBNN produces a prediction accuracy of 87% when PSSM is used for the input profile. The performance of the DBNN binary classifier is comparable to that of the best other prediction methods. The good test results for binary classifiers open a new approach for protein structure prediction with neural networks. Because training the neural networks is time-consuming, Pthread and OpenMP are employed to parallelize DBNN in the hyperthreading-enabled Intel architecture. Speedup for 16 Pthreads is 4.9 and speedup for 16 OpenMP threads is 4 in the 4-processor shared memory architecture. The speedup of both OpenMP and Pthread is superior to that of other research. With the new parallel training algorithm, thousands of amino acids can be processed in a reasonable amount of time. Our research also shows that hyperthreading technology for Intel architecture is efficient for parallel biological algorithms.
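Of the encoding schemes listed above, the orthogonal matrix is the simplest to show: each residue becomes a 20-dimensional one-hot vector, and a sliding window of residues is flattened into one classifier input. The sketch below is a generic illustration of that scheme; the window size of 5, the '-' padding at the sequence ends, and the function names are assumptions not taken from the abstract.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one letter each


def one_hot(residue):
    """Return a 20-dimensional orthogonal vector; unknown symbols map to all zeros."""
    return [1 if residue == aa else 0 for aa in AMINO_ACIDS]


def encode_windows(sequence, window=5):
    """Yield one flattened input vector per residue, centred on that residue.

    The sequence ends are padded with '-' so every residue gets a full window.
    """
    half = window // 2
    padded = "-" * half + sequence + "-" * half
    for i in range(len(sequence)):
        vec = []
        for residue in padded[i:i + window]:
            vec.extend(one_hot(residue))
        yield vec


# Hypothetical short peptide used only to show the shapes involved
vectors = list(encode_windows("MKTAYIAK", window=5))
```

Each vector has `window * 20` entries. A BLOSUM62 or PSSM encoding would replace the one-hot row for each residue with the corresponding matrix row or profile column while keeping the same windowing.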
Gentry-Weeks, C R; Hultsch, A L; Kelly, S M; Keith, J M; Curtiss, R
1992-01-01
Three gene libraries of Bordetella avium 197 DNA were prepared in Escherichia coli LE392 by using the cosmid vectors pCP13 and pYA2329, a derivative of pCP13 specifying spectinomycin resistance. The cosmid libraries were screened with convalescent-phase anti-B. avium turkey sera and polyclonal rabbit antisera against B. avium 197 outer membrane proteins. One E. coli recombinant clone produced a 56-kDa protein which reacted with convalescent-phase serum from a turkey infected with B. avium 197. In addition, five E. coli recombinant clones were identified which produced B. avium outer membrane proteins with molecular masses of 21, 38, 40, 43, and 48 kDa. At least one of these E. coli clones, which encoded the 21-kDa protein, reacted with both convalescent-phase turkey sera and antibody against B. avium 197 outer membrane proteins. The gene for the 21-kDa outer membrane protein was localized by Tn5seq1 mutagenesis, and the nucleotide sequence was determined by dideoxy sequencing. DNA sequence analysis of the 21-kDa protein revealed an open reading frame of 582 bases that resulted in a predicted protein of 194 amino acids. Comparison of the predicted amino acid sequence of the gene encoding the 21-kDa outer membrane protein with protein sequences in the National Biomedical Research Foundation protein sequence data base indicated significant homology to the OmpA proteins of Shigella dysenteriae, Enterobacter aerogenes, E. coli, and Salmonella typhimurium and to Neisseria gonorrhoeae outer membrane protein III, Haemophilus influenzae protein P6, and Pseudomonas aeruginosa porin protein F. The gene (ompA) encoding the B. avium 21-kDa protein hybridized with 4.1-kb DNA fragments from EcoRI-digested, chromosomal DNA of Bordetella pertussis and Bordetella bronchiseptica and with 6.0- and 3.2-kb DNA fragments from EcoRI-digested, chromosomal DNA of B. avium and B. avium-like DNA, respectively. A 6.75-kb DNA fragment encoding the B. 
avium 21-kDa protein was subcloned into the Asd+ vector pYA292, and the construct was introduced into the avirulent delta cya delta crp delta asd S. typhimurium chi 3987 for oral immunization of birds. The gene encoding the 21-kDa protein was expressed equivalently in B. avium 197, delta asd E. coli chi 6097, and S. typhimurium chi 3987 and was localized primarily in the cytoplasmic membrane and outer membrane. In preliminary studies on oral inoculation of turkey poults with S. typhimurium chi 3987 expressing the gene encoding the B. avium 21-kDa protein, it was determined that a single dose of the recombinant Salmonella vaccine failed to elicit serum antibodies against the 21-kDa protein and challenge with wild-type B. avium 197 resulted in colonization of the trachea and thymus with B. avium 197. PMID:1447140
Li, You-Hai; Han, Wen-Jin; Gui, Xi-Wu; Wei, Tao; Tang, Shuang-Yan; Jin, Jian-Ming
2016-08-02
Tentoxin, a cyclic tetrapeptide produced by several Alternaria species, inhibits the F₁-ATPase activity of chloroplasts, resulting in chlorosis in sensitive plants. In this study, we report two clustered genes, encoding a putative non-ribosomal peptide synthetase (NRPS) TES and a cytochrome P450 protein TES1, that are required for tentoxin biosynthesis in Alternaria alternata strain ZJ33, which was isolated from blighted leaves of Eupatorium adenophorum. Using a pair of primers designed according to the consensus sequences of the adenylation domain of NRPSs, two fragments containing putative adenylation domains were amplified from A. alternata ZJ33, and subsequent PCR analyses demonstrated that these fragments belonged to the same NRPS coding sequence. With no introns, TES consists of a single 15,486 base pair open reading frame encoding a predicted 5161 amino acid protein. Meanwhile, the TES1 gene is predicted to contain five introns and encode a 506 amino acid protein. The TES protein is predicted to comprise four peptide synthase modules with two additional N-methylation domains, and the number and arrangement of the modules in TES are consistent with the number and arrangement of the amino acid residues of tentoxin. Notably, both TES and TES1 null mutants generated via homologous recombination failed to produce tentoxin. This study provides the first evidence concerning the biosynthesis of tentoxin in A. alternata.
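The relationship between the reported ORF length and protein length can be checked with simple codon arithmetic. The sketch below assumes that the quoted ORF length includes the terminal stop codon, which the abstract does not state explicitly; `protein_length` is a hypothetical helper name.

```python
def protein_length(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an intron-free ORF of `orf_bp` base pairs.

    Each codon is 3 bp; if the ORF length counts the stop codon (assumed here
    by default), that codon does not contribute a residue.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# TES: 15,486 bp -> 5162 codons -> 5161 residues plus one stop codon
tes_residues = protein_length(15486)
```

Under that assumption the 15,486 bp ORF gives 5161 residues, matching the predicted TES protein size reported above.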
Possenti, Andrea; Vendruscolo, Michele; Camilloni, Carlo; Tiana, Guido
2018-05-23
Proteins employ the information stored in the genetic code and translated into their sequences to carry out well-defined functions in the cellular environment. The possibility of encoding such functions is controlled by the balance between the amount of information supplied by the sequence and that left after the protein has folded into its structure. We study the amount of information necessary to specify the protein structure, providing an estimate that takes into account the thermodynamic properties of protein folding. We thus show that the information remaining in the protein sequence after encoding its structure (the 'information gap') is very close to what is needed to encode its function and interactions. Then, by predicting the information gap directly from the protein sequence, we show that it may be possible to use these insights from information theory to discriminate between ordered and disordered proteins, to identify unknown functions, and to optimize artificially-designed protein sequences. This article is protected by copyright. All rights reserved. © 2018 Wiley Periodicals, Inc.
Wise, C A; Chiang, L C; Paznekas, W A; Sharma, M; Musy, M M; Ashley, J A; Lovett, M; Jabs, E W
1997-04-01
Treacher Collins Syndrome (TCS) is the most common of the human mandibulofacial dysostosis disorders. Recently, a partial TCOF1 cDNA was identified and shown to contain mutations in TCS families. Here we present the entire exon/intron genomic structure and the complete coding sequence of TCOF1. TCOF1 encodes a low complexity protein of 1,411 amino acids, whose predicted protein structure reveals repeated motifs that mirror the organization of its exons. These motifs are shared with nucleolar trafficking proteins in other species and are predicted to be highly phosphorylated by casein kinase. Consistent with this, the full-length TCOF1 protein sequence also contains putative nuclear and nucleolar localization signals. Throughout the open reading frame, we detected an additional eight mutations in TCS families and several polymorphisms. We postulate that TCS results from defects in a nucleolar trafficking protein that is critically required during human craniofacial development.
Wise, Carol A.; Chiang, Lydia C.; Paznekas, William A.; Sharma, Mridula; Musy, Maurice M.; Ashley, Jennifer A.; Lovett, Michael; Jabs, Ethylin W.
1997-01-01
Treacher Collins Syndrome (TCS) is the most common of the human mandibulofacial dysostosis disorders. Recently, a partial TCOF1 cDNA was identified and shown to contain mutations in TCS families. Here we present the entire exon/intron genomic structure and the complete coding sequence of TCOF1. TCOF1 encodes a low complexity protein of 1,411 amino acids, whose predicted protein structure reveals repeated motifs that mirror the organization of its exons. These motifs are shared with nucleolar trafficking proteins in other species and are predicted to be highly phosphorylated by casein kinase. Consistent with this, the full-length TCOF1 protein sequence also contains putative nuclear and nucleolar localization signals. Throughout the open reading frame, we detected an additional eight mutations in TCS families and several polymorphisms. We postulate that TCS results from defects in a nucleolar trafficking protein that is critically required during human craniofacial development. PMID:9096354
Ansell, Brendan R E; Schnyder, Manuela; Deplazes, Peter; Korhonen, Pasi K; Young, Neil D; Hall, Ross S; Mangiola, Stefano; Boag, Peter R; Hofmann, Andreas; Sternberg, Paul W; Jex, Aaron R; Gasser, Robin B
2013-12-01
Angiostrongylus vasorum is a metastrongyloid nematode of dogs and other canids of major clinical importance in many countries. In order to gain first insights into the molecular biology of this worm, we conducted the first large-scale exploration of its transcriptome, and predicted essential molecules linked to metabolic and biological processes as well as host immune responses. We also predicted and prioritized drug targets and drug candidates. Following Illumina sequencing (RNA-seq), 52.3 million sequence reads representing adult A. vasorum were assembled and annotated. The assembly yielded 20,033 contigs, which encoded proteins with 11,505 homologues in Caenorhabditis elegans, and an additional 2252 homologues in various other parasitic helminths for which curated data sets were publicly available. Functional annotation was achieved for 11,752 (58.6%) proteins predicted for A. vasorum, including peptidases (4.5%) and peptidase inhibitors (1.6%), protein kinases (1.7%), G protein-coupled receptors (GPCRs) (1.5%) and phosphatases (1.2%). Contigs encoding excretory/secretory and immuno-modulatory proteins represented some of the most highly transcribed molecules, and encoded enzymes that digest haemoglobin were conserved between A. vasorum and other blood-feeding nematodes. Using an essentiality-based approach, drug targets, including neurotransmitter receptors, an important chemosensory ion channel and cysteine proteinase-3, were predicted in A. vasorum, as were associated small molecular inhibitors/activators. Future transcriptomic analyses of all developmental stages of A. vasorum should facilitate deep explorations of the molecular biology of this important parasitic nematode and support the sequencing of its genome. These advances will provide a foundation for exploring immuno-molecular aspects of angiostrongylosis and have the potential to underpin the discovery of new methods of intervention. © 2013.
Binding Affinity prediction with Property Encoded Shape Distribution signatures
Das, Sourav; Krein, Michael P.
2010-01-01
We report the use of the molecular signatures known as “Property-Encoded Shape Distributions” (PESD) together with standard Support Vector Machine (SVM) techniques to produce validated models that can predict the binding affinity of a large number of protein ligand complexes. This “PESD-SVM” method uses PESD signatures that encode molecular shapes and property distributions on protein and ligand surfaces as features to build SVM models that require no subjective feature selection. A simple protocol was employed for tuning the SVM models during their development, and the results were compared to SFCscore – a regression-based method that was previously shown to perform better than 14 other scoring functions. Although the PESD-SVM method is based on only two surface property maps, the overall results were comparable. For most complexes with a dominant enthalpic contribution to binding (ΔH/-TΔS > 3), a good correlation between true and predicted affinities was observed. Entropy and solvent were not considered in the present approach and further improvement in accuracy would require accounting for these components rigorously. PMID:20095526
A taxonomy of bacterial microcompartment loci constructed by a novel scoring method
Axen, Seth D.; Erbilgin, Onur; Kerfeld, Cheryl A.; ...
2014-10-23
Bacterial microcompartments (BMCs) are proteinaceous organelles involved in both autotrophic and heterotrophic metabolism. All BMCs share homologous shell proteins but differ in their complement of enzymes; these are typically encoded adjacent to shell protein genes in genetic loci, or operons. To enable the identification and prediction of functional (sub)types of BMCs, we developed LoClass, an algorithm that finds putative BMC loci and inventories, weights, and compares their constituent pfam domains to construct a locus similarity network and predict locus (sub)types. In addition to using LoClass to analyze sequences in the Non-redundant Protein Database, we compared predicted BMC loci found in seven candidate bacterial phyla (six from single-cell genomic studies) to the LoClass taxonomy. Together, these analyses resulted in the identification of 23 different types of BMCs encoded in 30 distinct locus (sub)types found in 23 bacterial phyla. These include the two carboxysome types and a divergent set of metabolosomes, BMCs that share a common catalytic core and process distinct substrates via specific signature enzymes. Furthermore, many Candidate BMCs were found that lack one or more core metabolosome components, including one that is predicted to represent an entirely new paradigm for BMC-associated metabolism, joining the carboxysome and metabolosome. By placing these results in a phylogenetic context, we provide a framework for understanding the horizontal transfer of these loci, a starting point for studies aimed at understanding the evolution of BMCs. This comprehensive taxonomy of BMC loci, based on their constituent protein domains, foregrounds the functional diversity of BMCs and provides a reference for interpreting the role of BMC gene clusters encoded in isolate, single cell, and metagenomic data. 
Many loci encode ancillary functions such as transporters or genes for cofactor assembly; this expanded vocabulary of BMC-related functions should be useful for design of genetic modules for introducing BMCs in bioengineering applications.
A Taxonomy of Bacterial Microcompartment Loci Constructed by a Novel Scoring Method
Kerfeld, Cheryl A.
2014-01-01
Bacterial microcompartments (BMCs) are proteinaceous organelles involved in both autotrophic and heterotrophic metabolism. All BMCs share homologous shell proteins but differ in their complement of enzymes; these are typically encoded adjacent to shell protein genes in genetic loci, or operons. To enable the identification and prediction of functional (sub)types of BMCs, we developed LoClass, an algorithm that finds putative BMC loci and inventories, weights, and compares their constituent pfam domains to construct a locus similarity network and predict locus (sub)types. In addition to using LoClass to analyze sequences in the Non-redundant Protein Database, we compared predicted BMC loci found in seven candidate bacterial phyla (six from single-cell genomic studies) to the LoClass taxonomy. Together, these analyses resulted in the identification of 23 different types of BMCs encoded in 30 distinct locus (sub)types found in 23 bacterial phyla. These include the two carboxysome types and a divergent set of metabolosomes, BMCs that share a common catalytic core and process distinct substrates via specific signature enzymes. Furthermore, many Candidate BMCs were found that lack one or more core metabolosome components, including one that is predicted to represent an entirely new paradigm for BMC-associated metabolism, joining the carboxysome and metabolosome. By placing these results in a phylogenetic context, we provide a framework for understanding the horizontal transfer of these loci, a starting point for studies aimed at understanding the evolution of BMCs. This comprehensive taxonomy of BMC loci, based on their constituent protein domains, foregrounds the functional diversity of BMCs and provides a reference for interpreting the role of BMC gene clusters encoded in isolate, single cell, and metagenomic data. 
Many loci encode ancillary functions such as transporters or genes for cofactor assembly; this expanded vocabulary of BMC-related functions should be useful for design of genetic modules for introducing BMCs in bioengineering applications. PMID:25340524
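The idea of weighting and comparing the domain content of loci to build a similarity network can be illustrated with a weighted Jaccard score. This is a generic stand-in chosen for illustration, not the LoClass scoring method, which the abstract does not specify; the domain labels, weights, threshold, and function names are all hypothetical.

```python
def weighted_jaccard(locus_a, locus_b, weights):
    """Weighted overlap of two domain sets; unlisted domains get weight 1.0."""
    a, b = set(locus_a), set(locus_b)
    union = a | b
    if not union:
        return 0.0
    shared = sum(weights.get(d, 1.0) for d in a & b)
    total = sum(weights.get(d, 1.0) for d in union)
    return shared / total


def similarity_network(loci, weights, threshold=0.5):
    """Return edges (locus_u, locus_v, score) for pairs scoring at least `threshold`."""
    names = sorted(loci)
    edges = []
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            score = weighted_jaccard(loci[u], loci[v], weights)
            if score >= threshold:
                edges.append((u, v, score))
    return edges


# Hypothetical loci described by placeholder domain labels
loci = {
    "locus1": ["shell", "enzA"],
    "locus2": ["shell", "enzA", "transporter"],
    "locus3": ["enzB"],
}
weights = {"shell": 1.0, "enzA": 2.0}  # assumed: signature enzymes weigh more
edges = similarity_network(loci, weights)
```

Clusters of connected loci in a network like this would then correspond to candidate locus (sub)types.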
Evolution and Structural Organization of the C Proteins of Paramyxovirinae
Karlin, David G.
2014-01-01
The phosphoprotein (P) gene of most Paramyxovirinae encodes several proteins in overlapping frames: P and V, which share a common N-terminus (PNT), and C, which overlaps PNT. Overlapping genes are of particular interest because they encode proteins originated de novo, some of which have unknown structural folds, challenging the notion that nature utilizes only a limited, well-mapped area of fold space. The C proteins cluster in three groups, comprising measles, Nipah, and Sendai virus. We predicted that all C proteins have a similar organization: a variable, disordered N-terminus and a conserved, α-helical C-terminus. We confirmed this predicted organization by biophysically characterizing recombinant C proteins from Tupaia paramyxovirus (measles group) and human parainfluenza virus 1 (Sendai group). We also found that the C of the measles and Nipah groups have statistically significant sequence similarity, indicating a common origin. Although the C of the Sendai group lack sequence similarity with them, we speculate that they also have a common origin, given their similar genomic location and structural organization. Since C is dispensable for viral replication, unlike PNT, we hypothesize that C may have originated de novo by overprinting PNT in the ancestor of Paramyxovirinae. Intriguingly, in measles virus and Nipah virus, PNT encodes STAT1-binding sites that overlap different regions of the C-terminus of C, indicating they have probably originated independently. This arrangement, in which the same genetic region encodes simultaneously a crucial functional motif (a STAT1-binding site) and a highly constrained region (the C-terminus of C), seems paradoxical, since it should severely reduce the ability of the virus to adapt. The fact that it originated twice suggests that it must be balanced by an evolutionary advantage, perhaps from reducing the size of the genetic region vulnerable to mutations. PMID:24587180
Palma-Guerrero, Javier; Zhao, Jiuhai; Gonçalves, A. Pedro; Starr, Trevor L.
2015-01-01
The molecular mechanisms of membrane merger during somatic cell fusion in eukaryotic species are poorly understood. In the filamentous fungus Neurospora crassa, somatic cell fusion occurs between genetically identical germinated asexual spores (germlings) and between hyphae to form the interconnected network characteristic of a filamentous fungal colony. In N. crassa, two proteins have been identified to function at the step of membrane fusion during somatic cell fusion: PRM1 and LFD-1. The absence of either one of these two proteins results in an increase of germling pairs arrested during cell fusion with tightly appressed plasma membranes and an increase in the frequency of cell lysis of adhered germlings. The level of cell lysis in ΔPrm1 or Δlfd-1 germlings is dependent on the extracellular calcium concentration. An available transcriptional profile data set was used to identify genes encoding predicted transmembrane proteins that showed reduced expression levels in germlings cultured in the absence of extracellular calcium. From these analyses, we identified a mutant (lfd-2, for late fusion defect-2) that showed a calcium-dependent cell lysis phenotype. lfd-2 encodes a protein with a Fringe domain and showed endoplasmic reticulum and Golgi membrane localization. The deletion of an additional gene predicted to encode a low-affinity calcium transporter, fig1, also resulted in a strain that showed a calcium-dependent cell lysis phenotype. Genetic analyses showed that LFD-2 and FIG1 likely function in separate pathways to regulate aspects of membrane merger and repair during cell fusion. PMID:25595444
Binding ligand prediction for proteins using partial matching of local surface patches.
Sael, Lee; Kihara, Daisuke
2010-01-01
Functional elucidation of uncharacterized protein structures is an important task in bioinformatics. We report our new approach for structure-based function prediction which captures local surface features of ligand binding pockets. Function of proteins, specifically, binding ligands of proteins, can be predicted by finding similar local surface regions of known proteins. To enable partial comparison of binding sites in proteins, a weighted bipartite matching algorithm is used to match pairs of surface patches. The surface patches are encoded with the 3D Zernike descriptors. Unlike existing methods, which compare global characteristics of the protein fold or the global pocket shape, the local surface patch method can find functional similarity between non-homologous proteins and binding pockets for flexible ligand molecules. The proposed method improves prediction results over the global pocket shape-based method which was previously developed by our group.
Binding Ligand Prediction for Proteins Using Partial Matching of Local Surface Patches
Sael, Lee; Kihara, Daisuke
2010-01-01
Functional elucidation of uncharacterized protein structures is an important task in bioinformatics. We report our new approach for structure-based function prediction which captures local surface features of ligand binding pockets. Function of proteins, specifically, binding ligands of proteins, can be predicted by finding similar local surface regions of known proteins. To enable partial comparison of binding sites in proteins, a weighted bipartite matching algorithm is used to match pairs of surface patches. The surface patches are encoded with the 3D Zernike descriptors. Unlike the existing methods which compare global characteristics of the protein fold or the global pocket shape, the local surface patch method can find functional similarity between non-homologous proteins and binding pockets for flexible ligand molecules. The proposed method improves prediction results over global pocket shape-based method which was previously developed by our group. PMID:21614188
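The partial-matching idea in this abstract can be illustrated with a small sketch. It assumes each surface patch has already been reduced to a numeric descriptor vector (the abstract uses 3D Zernike descriptors). The toy vectors, the Euclidean distance, and the name `match_patches` are illustrative assumptions, not the authors' implementation. The minimum-cost assignment is found by brute force, which is only practical for a handful of patches.

```python
from itertools import permutations
from math import dist

def match_patches(pocket_a, pocket_b):
    """Return (total_cost, pairs) for the cheapest one-to-one patch matching.

    pocket_a, pocket_b: lists of descriptor vectors (hypothetical stand-ins
    for per-patch 3D Zernike descriptors). Every patch of the smaller pocket
    is matched to a distinct patch of the larger one; leftover patches stay
    unmatched, which is what makes the comparison partial.
    """
    swapped = len(pocket_a) > len(pocket_b)
    small, large = (pocket_b, pocket_a) if swapped else (pocket_a, pocket_b)
    best_cost, best_pairs = float("inf"), []
    # Brute force over every injective assignment small -> large.
    for chosen in permutations(range(len(large)), len(small)):
        cost = sum(dist(small[i], large[j]) for i, j in enumerate(chosen))
        if cost < best_cost:
            best_cost = cost
            # Report pairs as (index in pocket_a, index in pocket_b).
            best_pairs = [(j, i) if swapped else (i, j)
                          for i, j in enumerate(chosen)]
    return best_cost, best_pairs

# Made-up 2-D descriptors, for illustration only.
query = [(0.0, 1.0), (2.0, 2.0)]
known = [(2.1, 2.0), (5.0, 5.0), (0.0, 0.9)]
cost, pairs = match_patches(query, known)
```

For realistic pocket sizes a polynomial-time assignment solver (for example the Hungarian algorithm) would replace the brute-force loop. The scoring here is deliberately simplistic and does not reproduce the weighting scheme of the published method.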
Meta-omic signatures of microbial metal and nitrogen cycling in marine oxygen minimum zones
Glass, Jennifer B.; Kretz, Cecilia B.; Ganesh, Sangita; Ranjan, Piyush; Seston, Sherry L.; Buck, Kristen N.; Landing, William M.; Morton, Peter L.; Moffett, James W.; Giovannoni, Stephen J.; Vergin, Kevin L.; Stewart, Frank J.
2015-01-01
Iron (Fe) and copper (Cu) are essential cofactors for microbial metalloenzymes, but little is known about the metalloenzyme inventory of anaerobic marine microbial communities despite their importance to the nitrogen cycle. We compared dissolved O2, NO3−, NO2−, Fe and Cu concentrations with nucleic acid sequences encoding Fe and Cu-binding proteins in 21 metagenomes and 9 metatranscriptomes from Eastern Tropical North and South Pacific oxygen minimum zones and 7 metagenomes from the Bermuda Atlantic Time-series Station. Dissolved Fe concentrations increased sharply at upper oxic-anoxic transition zones, with the highest Fe:Cu molar ratio (1.8) occurring at the anoxic core of the Eastern Tropical North Pacific oxygen minimum zone and matching the predicted maximum ratio based on data from diverse ocean sites. The relative abundance of genes encoding Fe-binding proteins was negatively correlated with O2, driven by significant increases in genes encoding Fe-proteins involved in dissimilatory nitrogen metabolisms under anoxia. Transcripts encoding cytochrome c oxidase, the Fe- and Cu-containing terminal reductase in aerobic respiration, were positively correlated with O2 content. A comparison of the taxonomy of genes encoding Fe- and Cu-binding vs. bulk proteins in OMZs revealed that Planctomycetes represented a higher percentage of Fe genes while Thaumarchaeota represented a higher percentage of Cu genes, particularly at oxyclines. These results are broadly consistent with higher relative abundance of genes encoding Fe-proteins in the genome of a marine planctomycete vs. higher relative abundance of genes encoding Cu-proteins in the genome of a marine thaumarchaeote. These findings highlight the importance of metalloenzymes for microbial processes in oxygen minimum zones and suggest preferential Cu use in oxic habitats with Cu > Fe vs. preferential Fe use in anoxic niches with Fe > Cu. PMID:26441925
Characterization of the Lymantria dispar nucleopolyhedrovirus 25K FP gene
David S. Bischoff; James M. Slavicek
1996-01-01
The Lymantria dispar nucleopolyhedrovirus (LdMNPV) gene encoding the 25K FP protein has been cloned and sequenced. The 25K FP gene codes for a 217 amino acid protein with a predicted molecular mass of 24870 Da. Expression of the 25K FP protein in a rabbit reticulocyte system generated a 27 kDa protein, in close agreement with the...
Sharma, Sandeep; Zaccaron, Alex Z; Ridenour, John B; Allen, Tom W; Conner, Kassie; Doyle, Vinson P; Price, Trey; Sikora, Edward; Singh, Raghuwinder; Spurlock, Terry; Tomaso-Peterson, Maria; Wilkerson, Tessie; Bluhm, Burton H
2018-04-01
The draft genome of Xylaria sp. isolate MSU_SB201401, causal agent of taproot decline of soybean in the southern U.S., is presented here. The genome assembly was 56.7 Mb in size with an L50 of 246. A total of 10,880 putative protein-encoding genes were predicted, including 647 genes encoding carbohydrate-active enzymes and 1053 genes encoding secreted proteins. This is the first draft genome of a plant-pathogenic Xylaria sp. associated with soybean. The draft genome of Xylaria sp. isolate MSU_SB201401 will provide an important resource for future experiments to determine the molecular basis of pathogenesis.
Li, You-Hai; Han, Wen-Jin; Gui, Xi-Wu; Wei, Tao; Tang, Shuang-Yan; Jin, Jian-Ming
2016-01-01
Tentoxin, a cyclic tetrapeptide produced by several Alternaria species, inhibits the F1-ATPase activity of chloroplasts, resulting in chlorosis in sensitive plants. In this study, we report two clustered genes, encoding a putative non-ribosome peptide synthetase (NRPS) TES and a cytochrome P450 protein TES1, that are required for tentoxin biosynthesis in Alternaria alternata strain ZJ33, which was isolated from blighted leaves of Eupatorium adenophorum. Using a pair of primers designed according to the consensus sequences of the adenylation domain of NRPSs, two fragments containing putative adenylation domains were amplified from A. alternata ZJ33, and subsequent PCR analyses demonstrated that these fragments belonged to the same NRPS coding sequence. With no introns, TES consists of a single 15,486 base pair open reading frame encoding a predicted 5161 amino acid protein. Meanwhile, the TES1 gene is predicted to contain five introns and encode a 506 amino acid protein. The TES protein is predicted to be comprised of four peptide synthase modules with two additional N-methylation domains, and the number and arrangement of the modules in TES were consistent with the number and arrangement of the amino acid residues of tentoxin, respectively. Notably, both TES and TES1 null mutants generated via homologous recombination failed to produce tentoxin. This study provides the first evidence concerning the biosynthesis of tentoxin in A. alternata. PMID:27490569
2015-01-01
Phytopathogenic fungi form intimate associations with host plant species and cause disease. To be successful, fungal pathogens communicate with a susceptible host through the secretion of proteinaceous effectors, hydrolytic enzymes and metabolites. Sclerotinia sclerotiorum and Botrytis cinerea are economically important necrotrophic fungal pathogens that cause disease on numerous crop species. Here, a powerful bioinformatics pipeline was used to predict the refined S. sclerotiorum and B. cinerea secretomes, identifying 432 and 499 proteins respectively. Analyses focusing on S. sclerotiorum revealed that 16% of the secretome encoding genes resided in small, sequence heterogeneous, gene clusters that were distributed over 13 of the 16 predicted chromosomes. Functional analyses highlighted the importance of plant cell hydrolysis, oxidation-reduction processes and the redox state to the S. sclerotiorum and B. cinerea secretomes and potentially host infection. Only 8% of the predicted proteins were distinct between the two secretomes. In contrast to S. sclerotiorum, the B. cinerea secretome lacked CFEM- or LysM-containing proteins. The 115 fungal and oomycete genome comparison identified 30 proteins specific to S. sclerotiorum and B. cinerea, plus 11 proteins specific to S. sclerotiorum and 32 proteins specific to B. cinerea. Expressed sequence tag (EST) and proteomic analyses showed that 246 S. sclerotiorum secretome encoding genes had EST support, including 101 which were only expressed in vitro and 49 which were only expressed in planta, whilst 42 predicted proteins were experimentally proven to be secreted. These detailed in silico analyses of two important necrotrophic pathogens will permit informed choices to be made when candidate effector proteins are selected for function analyses in planta. PMID:26107498
fRMSDPred: Predicting Local RMSD Between Structural Fragments Using Sequence Information
2007-04-04
machine learning approaches for estimating the RMSD value of a pair of protein fragments. These estimated fragment-level RMSD values can be used to construct the alignment, assess the quality of an alignment, and identify high-quality alignment segments. We present algorithms to solve this fragment-level RMSD prediction problem using a supervised learning framework based on support vector regression and classification that incorporates protein profiles, predicted secondary structure, effective information encoding schemes, and novel second-order pairwise exponential kernel
Carlson, Jonathan; Yan, Jiyu; Akinsiku, Olusimidele T.; Schaefer, Malinda; Sabbaj, Steffanie; Bet, Anne; Levy, David N.; Heath, Sonya; Tang, Jianming; Kaslow, Richard A.; Walker, Bruce D.; Ndung’u, Thumbi; Goulder, Philip J.; Heckerman, David; Hunter, Eric; Goepfert, Paul A.
2010-01-01
Retroviruses pack multiple genes into relatively small genomes by encoding several genes in the same genomic region with overlapping reading frames. Both sense and antisense HIV-1 transcripts contain open reading frames for known functional proteins as well as numerous alternative reading frames (ARFs). At least some ARFs have the potential to encode proteins of unknown function, and their antigenic properties can be considered as cryptic epitopes (CEs). To examine the extent of active immune response to virally encoded CEs, we analyzed human leukocyte antigen class I–associated polymorphisms in HIV-1 gag, pol, and nef genes from a large cohort of South Africans with chronic infection. In all, 391 CEs and 168 conventional epitopes were predicted, with the majority (307; 79%) of CEs derived from antisense transcripts. In further evaluation of CD8 T cell responses to a subset of the predicted CEs in patients with primary or chronic infection, both sense- and antisense-encoded CEs were immunogenic at both stages of infection. In addition, CEs often mutated during the first year of infection, which was consistent with immune selection for escape variants. These findings indicate that the HIV-1 genome might encode and deploy a large potential repertoire of unconventional epitopes to enhance vaccine-induced antiviral immunity. PMID:20065064
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mavromatis, K; Doyle, C Kuyler; Lykidis, A
2006-01-01
Ehrlichia canis, a small obligately intracellular, tick-transmitted, gram-negative, α-proteobacterium, is the primary etiologic agent of globally distributed canine monocytic ehrlichiosis. Complete genome sequencing revealed that the E. canis genome consists of a single circular chromosome of 1,315,030 bp predicted to encode 925 proteins, 40 stable RNA species, 17 putative pseudogenes, and a substantial proportion of noncoding sequence (27%). Interesting genome features include a large set of proteins with transmembrane helices and/or signal sequences and a unique serine-threonine bias associated with the potential for O glycosylation that was prominent in proteins associated with pathogen-host interactions. Furthermore, two paralogous protein families associated with immune evasion were identified, one of which contains poly(G-C) tracts, suggesting that they may play a role in phase variation and facilitation of persistent infections. Genes associated with pathogen-host interactions were identified, including a small group encoding proteins (n = 12) with tandem repeats and another group encoding proteins with eukaryote-like ankyrin domains (n = 7).
A highly divergent gene cluster in honey bees encodes a novel silk family.
Sutherland, Tara D; Campbell, Peter M; Weisman, Sarah; Trueman, Holly E; Sriskantha, Alagacone; Wanjura, Wolfgang J; Haritos, Victoria S
2006-11-01
The pupal cocoon of the domesticated silk moth Bombyx mori is the best known and most extensively studied insect silk. It is not widely known that Apis mellifera larvae also produce silk. We have used a combination of genomic and proteomic techniques to identify four honey bee fiber genes (AmelFibroin1-4) and two silk-associated genes (AmelSA1 and 2). The four fiber genes are small, comprise a single exon each, and are clustered on a short genomic region where the open reading frames are GC-rich amid low GC intergenic regions. The genes encode similar proteins that are highly helical and predicted to form unusually tight coiled coils. Despite the similarity in size, structure, and composition of the encoded proteins, the genes have low primary sequence identity. We propose that the four fiber genes have arisen from gene duplication events but have subsequently diverged significantly. The silk-associated genes encode proteins likely to act as a glue (AmelSA1) and involved in silk processing (AmelSA2). Although the silks of honey bees and silkmoths both originate in larval labial glands, the silk proteins are completely different in their primary, secondary, and tertiary structures as well as the genomic arrangement of the genes encoding them. This implies independent evolutionary origins for these functionally related proteins.
USDA-ARS's Scientific Manuscript database
The gene encoding SnTox1, a necrotrophic effector from Stagonospora nodorum that causes necrosis of wheat lines expressing Snn1, has been verified by heterologous expression in Pichia pastoris. SnTox1 encodes a 117 amino acid cysteine rich protein with the first 17 amino acids predicted as a signal ...
Li, Yanan; Zeng, Xiaobo; Zhou, Xuejuan; Li, Youguo
2016-12-04
The lipid transfer protein superfamily is involved in lipid transport and metabolism. This study aimed to construct mutants of three lipid transfer protein encoding genes in Mesorhizobium huakuii 7653R, and to study the phenotypes and function of the mutations during symbiosis with Astragalus sinicus. We used bioinformatics to predict structural characteristics and biological functions of lipid transfer proteins, and conducted semi-quantitative and fluorescent quantitative real-time PCR to analyze the expression levels of target genes in free-living and symbiotic conditions. Using pK19mob insertion mutagenesis to construct mutants, we carried out pot plant experiments to observe symbiotic phenotypes. The proteins encoded by the MCHK-5577, MCHK-2172 and MCHK-2779 genes belonged to the START/RHO alpha_C/PITP/Bet_v1/CoxG/CalC (SRPBCC) superfamily, were involved in lipid transport or metabolism, and were identical to those of M. loti at the 95% level. The relative transcription levels of all three genes increased compared to the free-living condition. We obtained three mutants. Compared with wild-type 7653R, the above-ground biomass of plants and the nodule nitrogenase activity induced by the three mutants decreased significantly. The results indicated that lipid transfer protein encoding genes of Mesorhizobium huakuii 7653R may play important roles in symbiotic nitrogen fixation, and that the mutations significantly affected the symbiotic phenotypes. The present work provides a basis for further study of the symbiotic function mechanism associated with lipid transfer proteins from rhizobia.
DrugECs: An Ensemble System with Feature Subspaces for Accurate Drug-Target Interaction Prediction
Jiang, Jinjian; Wang, Nian; Zhang, Jun
2017-01-01
Background: Drug-target interaction is key in drug discovery, especially in the design of new lead compounds. However, finding a new lead compound for a specific target is complicated and error-prone. Computational techniques are therefore commonly adopted in drug design, which can save time and costs to a significant extent. Results: To address the issue, a new prediction system is proposed in this work to identify drug-target interaction. First, drug-target pairs are encoded with a fragment technique and the software “PaDEL-Descriptor.” The fragment technique is for encoding target proteins: it divides each protein sequence into several fragments in order and encodes each fragment with several physicochemical properties of amino acids. The software “PaDEL-Descriptor” creates encoding vectors for drug molecules. Second, the dataset of drug-target pairs is resampled and several overlapped subsets are obtained, which are then input into a kNN (k-Nearest Neighbor) classifier to build an ensemble system. Conclusion: Experimental results on the drug-target dataset showed that our method performs better and runs faster than the state-of-the-art predictors. PMID:28744468
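A minimal sketch of the fragment technique described in this abstract is given below. It assumes a single physicochemical property (the Kyte-Doolittle hydropathy scale) and equal-length contiguous fragments. The abstract says several properties are used but does not name them, so the choice of scale, the averaging, and the name `encode_fragments` are assumptions made for illustration.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def encode_fragments(sequence, n_fragments=3):
    """Split a protein sequence into ordered fragments and return one
    mean-hydropathy feature per fragment.

    Non-standard residue codes (e.g. 'X') raise KeyError in this sketch.
    """
    seq = sequence.upper()
    if n_fragments < 1 or len(seq) < n_fragments:
        raise ValueError("sequence is shorter than the number of fragments")
    # Integer boundaries that keep the fragments contiguous and in order.
    bounds = [i * len(seq) // n_fragments for i in range(n_fragments + 1)]
    features = []
    for start, end in zip(bounds, bounds[1:]):
        fragment = seq[start:end]
        features.append(sum(KD_HYDROPATHY[aa] for aa in fragment) / len(fragment))
    return features

# Toy six-residue sequence split into three two-residue fragments.
vector = encode_fragments("AAILKK", n_fragments=3)
```

Every protein yields a vector of the same length regardless of sequence length, which is what allows it to be concatenated with a fixed-length drug descriptor and passed to a distance-based classifier such as kNN.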
NASA Astrophysics Data System (ADS)
Mahmood, Zakaria N.; Mahmuddin, Massudi; Mahmood, Mohammed Nooraldeen
Classifying proteins into their respective families and subfamilies from their amino acid sequences is an important research area. However, for a given protein, knowing the exact action, whether hormonal, enzymatic, transmembranal or that of a nuclear receptor, does not depend solely on the amino acid sequence but also on the way the amino acid chain folds. This study provides a prototype system that is able to predict protein tertiary structure. Several methods are used to develop and evaluate the system to produce better accuracy in protein 3D structure prediction. The Bees Optimization algorithm, which is inspired by the food-foraging behavior of honey bees, is used in the search phase. In this study, the experiment is conducted on short protein sequences that have been used in previous research with well-known tools. The proposed approach shows a promising result.
Ciok, Anna; Adamczuk, Marcin; Bartosik, Dariusz; Dziewit, Lukasz
2016-11-28
Pseudomonas strains isolated from the heavily contaminated Lubin copper mine and Zelazny Most post-flotation waste reservoir in Poland were screened for the presence of integrons. This analysis revealed that two strains carried homologous DNA regions composed of a gene encoding a DNA_BRE_C domain-containing tyrosine recombinase (with no significant sequence similarity to other integrases of integrons) plus a three-component array of putative integron gene cassettes. The predicted gene cassettes encode three putative polypeptides with homology to (i) transmembrane proteins, (ii) GCN5 family acetyltransferases, and (iii) hypothetical proteins of unknown function (homologous proteins are encoded by the gene cassettes of several class 1 integrons). Comparative sequence analyses identified three structural variants of these novel integron-like elements within the sequenced bacterial genomes. Analysis of their distribution revealed that they are found exclusively in strains of the genus Pseudomonas.
Sullivan, William J; Monroy, M Alexandra; Bohne, Wolfgang; Nallani, Karuna C; Chrivia, John; Yaciuk, Peter; Smith, Charles K; Queener, Sherry F
2003-05-01
We have identified and mapped a gene in Toxoplasma gondii that encodes a homologue of SRCAP (Snf2-related CBP activator protein), a member of the SNF/SWI family of chromatin remodeling factors. The genomic locus (TgSRCAP) is present as a single copy and contains 16 introns. The predicted cDNA contains an open reading frame of 8,775 bp and encodes a protein of 2,924 amino acids. We have identified additional SRCAP-like sequences in Apicomplexa for comparison by screening genomic databases. An analysis of SRCAP homologues between species reveals signature features that may be indicative of SRCAP members. Expression of mRNA encoding TgSRCAP is upregulated when tachyzoite (invasive form) parasites are induced to differentiate into bradyzoites (encysted form) in vitro. Recombinant TgSRCAP protein is functionally equivalent to the human homologue, being capable of increasing transcription mediated by CREB.
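The ORF and protein lengths quoted in this abstract are consistent if the ORF length includes the stop codon: 8,775 bp is 2,925 codons, of which 2,924 encode amino acids. The short check below makes that arithmetic explicit. The function name and the `includes_stop` flag are illustrative assumptions.

```python
def protein_length_from_orf(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of orf_bp base pairs.

    Assumes the ORF is an intact reading frame (length divisible by 3) and,
    by default, that its length counts the terminal stop codon.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons

tgsrcap_residues = protein_length_from_orf(8775)
```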
Chaturvedi, Navaneet; Kajsik, Michal; Forsythe, Stephen; Pandey, Paras Nath
2015-12-01
The recently annotated genome of the bacterium Cronobacter sakazakii BAA-894 suggests that the organism has the ability to bind heavy metals. This study demonstrates heavy metal tolerance in C. sakazakii, in which proteins that interact with heavy metals were identified by computational and experimental study. As a result, approximately one-fourth of proteins encoded on the plasmid pESA3 are proposed to have potential interaction with heavy metals. Interaction between heavy metals and predicted proteins was further corroborated using protein crystal structures from the Protein Data Bank and comparison of metal-binding ligands. In addition, a phylogenetic study was undertaken for the toxic heavy metals arsenic, cadmium, lead and mercury, which generated relatedness clustering for lead, cadmium and arsenic. Laboratory studies confirmed the organism's tolerance to tellurite, copper and silver. These experimental and computational data extend our understanding of the genes encoding proteins of this important neonatal pathogen and provide further insights into the genotypes associated with features that can contribute to its persistence in the environment. The information will be of value for future environmental protection from toxic heavy metals.
Jiménez, Diego Javier; Dini-Andreote, Francisco; Ottoni, Júlia Ronzella; de Oliveira, Valéria Maia; van Elsas, Jan Dirk; Andreote, Fernando Dini
2015-01-01
The occurrence of genes encoding biotechnologically relevant α/β-hydrolases in mangrove soil microbial communities was assessed using data obtained by whole-metagenome sequencing of four mangrove areas, denoted BrMgv01 to BrMgv04, in São Paulo, Brazil. The sequences (215 Mb in total) were filtered based on local amino acid alignments against the Lipase Engineering Database. In total, 5923 unassembled sequences were affiliated with 30 different α/β-hydrolase fold superfamilies. The most abundant predicted proteins encompassed cytosolic hydrolases (abH08; ∼ 23%), microsomal hydrolases (abH09; ∼ 12%) and Moraxella lipase-like proteins (abH04 and abH01; < 5%). Detailed analysis of the genes predicted to encode proteins of the abH08 superfamily revealed a high proportion related to epoxide hydrolases and haloalkane dehalogenases in polluted mangroves BrMgv01-02-03. This suggested selection and putative involvement in local degradation/detoxification of the pollutants. Seven sequences that were annotated as genes for putative epoxide hydrolases and five for putative haloalkane dehalogenases were found in a fosmid library generated from BrMgv02 DNA. The latter enzymes were predicted to belong to Actinobacteria, Deinococcus-Thermus, Planctomycetes and Proteobacteria. Our integrated approach thus identified 12 genes (complete and/or partial) that may encode hitherto undescribed enzymes. The low amino acid identity (< 60%) with already-described genes opens perspectives for both production in an expression host and genetic screening of metagenomes. PMID:25171437
Protein Structure Prediction by Protein Threading
NASA Astrophysics Data System (ADS)
Xu, Ying; Liu, Zhijie; Cai, Liming; Xu, Dong
The seminal work of Bowie, Lüthy, and Eisenberg (Bowie et al., 1991) on "the inverse protein folding problem" laid the foundation of protein structure prediction by protein threading. By using simple measures for fitness of different amino acid types to local structural environments defined in terms of solvent accessibility and protein secondary structure, the authors derived a simple and yet profoundly novel approach to assessing if a protein sequence fits well with a given protein structural fold. Their follow-up work (Elofsson et al., 1996; Fischer and Eisenberg, 1996; Fischer et al., 1996a,b) and the work by Jones, Taylor, and Thornton (Jones et al., 1992) on protein fold recognition led to the development of a new brand of powerful tools for protein structure prediction, which we now term "protein threading." These computational tools have played a key role in extending the utility of all the experimentally solved structures by X-ray crystallography and nuclear magnetic resonance (NMR), providing structural models and functional predictions for many of the proteins encoded in the hundreds of genomes that have been sequenced up to now.
Adhikari, Utpal Kumar; Rahman, M Mizanur
2017-04-01
The nirk gene encodes the copper-containing nitrite reductase (CuNiR), a key catalytic enzyme in the environmental denitrification process that helps to produce nitric oxide from nitrite. The molecular mechanism of the denitrification process is complex, and in this case a theoretical investigation was conducted to determine the sequence information and amino acid composition of the active site of the CuNiR enzyme using various bioinformatics tools. Ten FASTA-formatted sequences were retrieved from the NCBI database, and domain and disordered region identification and phylogenetic analyses were performed on these sequences. Comparative modeling of the proteins was performed with the Modeller 9v14 program and visualized with PyMOL. Validated protein models were deposited in the Protein Model Database (PMDB) (PMDB id: PM0080150 to PM0080159). Active sites of the nirk-encoded CuNiR enzyme were identified with the CASTp server. PROCHECK showed significant scores for four protein models in the most favored regions of the Ramachandran plot. Prediction of active sites and cavities showed that the amino acids glycine, alanine, histidine, aspartic acid, glutamic acid, threonine, and glutamine were common to the four predicted protein models. The present in silico study anticipates that the results of the active site analyses will pave the way for further research on the complex denitrification mechanism of the selected species in the experimental laboratory. Copyright © 2016. Published by Elsevier Ltd.
Du, Yu-Jie; Hou, Yi-Ling; Hou, Wan-Ru
2013-02-01
The Giant Panda is an endangered species and a valuable genetic resource, and its important functional gene POLR2H encodes an essential shared peptide H of RNA polymerases. The genomic DNA and cDNA sequences were cloned successfully for the first time from the Giant Panda (Ailuropoda melanoleuca) using touchdown-PCR and reverse transcription polymerase chain reaction (RT-PCR), respectively. The length of the genomic sequence of the Giant Panda is 3,285 bp, including five exons and four introns. The cDNA fragment cloned is 509 bp in length, containing an open reading frame of 453 bp encoding 150 amino acids. Alignment analysis indicated that both the cDNA and its deduced amino acid sequence were highly conserved. Protein structure prediction showed that there was one protein kinase C phosphorylation site, four casein kinase II phosphorylation sites and one amidation site in the POLR2H protein, further shaping advanced protein structure. The cloned cDNA was expressed in Escherichia coli, which indicated that POLR2H fused with the N-terminal His-tag brought about the accumulation of an expected 20.5 kDa polypeptide in line with the predicted protein. On the basis of what has already been achieved in this study, further in-depth research will be conducted, which has great theoretical value and practical significance.
Delcourt, Vivian; Lucier, Jean-François; Gagnon, Jules; Beaudoin, Maxime C; Vanderperre, Benoît; Breton, Marc-André; Motard, Julie; Jacques, Jean-François; Brunelle, Mylène; Gagnon-Arsenault, Isabelle; Fournier, Isabelle; Ouangraoua, Aida; Hunting, Darel J; Cohen, Alan A; Landry, Christian R; Scott, Michelle S
2017-01-01
Recent functional, proteomic and ribosome profiling studies in eukaryotes have concurrently demonstrated the translation of alternative open-reading frames (altORFs) in addition to annotated protein coding sequences (CDSs). We show that a large number of small proteins could in fact be coded by these altORFs. The putative alternative proteins translated from altORFs have orthologs in many species and contain functional domains. Evolutionary analyses indicate that altORFs often show more extreme conservation patterns than their CDSs. Thousands of alternative proteins are detected in proteomic datasets by reanalysis using a database containing predicted alternative proteins. This is illustrated with specific examples, including altMiD51, a 70 amino acid mitochondrial fission-promoting protein encoded in MiD51/Mief1/SMCR7L, a gene encoding an annotated protein promoting mitochondrial fission. Our results suggest that many genes are multicoding genes and code for a large protein and one or several small proteins. PMID:29083303
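To make the notion of an alternative ORF concrete, the sketch below scans the three forward reading frames of a transcript for ATG-to-stop ORFs. It is a simplified illustration under stated assumptions: only ATG start codons, only the forward strand, and a made-up 18-nucleotide sequence. The abstract does not give the authors' altORF definition or length threshold, so `find_orfs` and `min_codons` are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(transcript, min_codons=2):
    """Return (frame, start, end) for each ATG...stop ORF on the forward strand.

    start/end are 0-based, end-exclusive, and end includes the stop codon.
    min_codons counts codons from the ATG up to, but not including, the stop.
    """
    seq = transcript.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a new ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None                   # close it and keep scanning
    return orfs

# Toy transcript: a frame-0 ORF with a shorter ORF nested in frame 2.
orfs = find_orfs("ATGCCATGGGTTGACTAA")
```

In this toy case the frame-0 ORF plays the role of the annotated CDS and the overlapping frame-2 ORF plays the role of an altORF. Real analyses would also have to handle untranslated regions, non-ATG starts, and a biologically motivated minimum length.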
Poehlein, Anja; Heym, Daniel; Quitzke, Vivien; Fersch, Julia; Daniel, Rolf; Rother, Michael
2018-04-05
Methanococcus maripaludis type strain JJ (DSM 2067) is an important organism because it serves as a model for primary energy metabolism and hydrogenotrophic methanogenesis and is amenable to genetic manipulation. The complete genome (1.7 Mb) harbors 1,815 predicted protein-encoding genes, including 9 encoding selenoproteins. Copyright © 2018 Poehlein et al.
Poehlein, Anja; Daniel, Rolf
2017-01-01
Methanobrevibacter arboriphilus strain DH1 is an autotrophic methanogen that was isolated from the wetwood of methane-emitting trees. This species has been of considerable interest for its unusual oxygen tolerance and has been studied as a model organism for more than four decades. Strain DH1 is closely related to other host-associated Methanobrevibacter species from intestinal tracts of animals and the rumen, making this strain an interesting candidate for comparative analysis to identify factors important for colonizing intestinal environments. Here, the genome sequence of M. arboriphilus strain DH1 is reported. The draft genome is composed of 2,445,031 bp with an average GC content of 25.44% and is predicted to harbour 1964 protein-encoding genes. Among the predicted genes, there are also more than 50 putative genes for the so-called adhesin-like proteins (ALPs). The presence of ALP-encoding genes in the genome of this non-host-associated methanogen strongly suggests that target surfaces for ALPs other than host tissues also need to be considered as potential interaction partners. The high abundance of ALPs may also indicate that these types of proteins are more characteristic of specific phylogenetic groups of methanogens rather than being indicative of a particular environment the methanogens thrive in. PMID:28634433
Panina, Ekaterina M; Mironov, Andrey A; Gelfand, Mikhail S
2003-08-19
Zinc is an important component of many proteins, but in large concentrations it is poisonous to the cell. Thus its transport is regulated by zinc repressors ZUR of proteobacteria and Gram-positive bacteria from the Bacillus group and AdcR of bacteria from the Streptococcus group. Comparative computational analysis allowed us to identify binding signals of ZUR repressors GAAATGTTATANTATAACATTTC for gamma-proteobacteria, GTAATGTAATAACATTAC for the Agrobacterium group, GATATGTTATAACATATC for the Rhodococcus group, TAAATCGTAATNATTACGATTTA for Gram-positive bacteria, and TTAACYRGTTAA of the streptococcal AdcR repressor. In addition to known transporters and their paralogs, zinc regulons were predicted to contain a candidate component of the ATP binding cassette, zinT (b1995 in Escherichia coli and yrpE in Bacillus subtilis). Candidate AdcR-binding sites were identified upstream of genes encoding pneumococcal histidine triad (PHT) proteins from a number of pathogenic streptococci. Protein functional analysis of this family suggests that PHT proteins are involved in the invasion process. Finally, repression by zinc was predicted for genes encoding a variety of paralogs of ribosomal proteins. The original copies of all these proteins contain zinc-ribbon motifs and thus likely bind zinc, whereas these motifs are destroyed in zinc-regulated paralogs. We suggest that the induction of these paralogs in conditions of zinc starvation leads to their incorporation in a fraction of ribosomes instead of the original ribosomal proteins; the latter are then degraded with subsequent release of some zinc for the utilization by other proteins. Thus we predict a mechanism for maintaining zinc availability for essential enzymes.
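The binding signals quoted above use IUPAC ambiguity codes (N = any base, Y = C or T, R = A or G). As a minimal sketch of how such a consensus could be matched against an upstream region, the Python below expands the codes into a regular expression and also checks that a signal is its own reverse complement. Exact consensus matching is a simplification chosen here for illustration; the abstract does not say the authors searched this way, and the function names and the toy upstream sequence are hypothetical.

```python
import re

# IUPAC codes that occur in the signals quoted in the abstract.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A",
              "R": "Y", "Y": "R", "N": "N"}

def find_sites(motif, seq):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    body = "".join(IUPAC[base] for base in motif.upper())
    pattern = re.compile("(?=(" + body + "))")  # lookahead keeps overlaps
    return [m.start() for m in pattern.finditer(seq.upper())]

def reverse_complement(motif):
    """Reverse complement that also handles the ambiguity codes above."""
    return "".join(COMPLEMENT[base] for base in reversed(motif.upper()))

ADCR_SIGNAL = "TTAACYRGTTAA"          # streptococcal AdcR signal from the abstract
toy_upstream = "GGGTTAACCAGTTAACC"    # hypothetical upstream region
sites = find_sites(ADCR_SIGNAL, toy_upstream)
is_palindromic = reverse_complement(ADCR_SIGNAL) == ADCR_SIGNAL
```

Because the AdcR signal and the gamma-proteobacterial ZUR signal read the same on both strands, a forward-strand scan is sufficient for them under this simplification.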
MITOPRED: a web server for the prediction of mitochondrial proteins
Guda, Chittibabu; Guda, Purnima; Fahy, Eoin; Subramaniam, Shankar
2004-01-01
MITOPRED web server enables prediction of nucleus-encoded mitochondrial proteins in all eukaryotic species. Predictions are made using a new algorithm based primarily on Pfam domain occurrence patterns in mitochondrial and non-mitochondrial locations. Pre-calculated predictions are instantly accessible for proteomes of Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila, Homo sapiens, Mus musculus and Arabidopsis species as well as all the eukaryotic sequences in the Swiss-Prot and TrEMBL databases. Queries, at different confidence levels, can be made through four distinct options: (i) entering Swiss-Prot/TrEMBL accession numbers; (ii) uploading a local file with such accession numbers; (iii) entering protein sequences; (iv) uploading a local file containing protein sequences in FASTA format. Automated updates are scheduled for the pre-calculated prediction database so as to provide access to the most current data. The server, its documentation and the data are available from http://mitopred.sdsc.edu. PMID:15215413
USDA-ARS's Scientific Manuscript database
The cattle tick, Rhipicephalus (Boophilus) microplus, is a pest which causes multiple health complications in cattle. The G-protein coupled receptor (GPCR) super-family presents an interesting target for developing novel tick control methods. However, GPCRs share limited sequence similarity among or...
Weil, D; Levy, G; Sahly, I; Levi-Acobas, F; Blanchard, S; El-Amraoui, A; Crozet, F; Philippe, H; Abitbol, M; Petit, C
1996-04-16
The gene encoding human myosin VIIA is responsible for Usher syndrome type 1B (USH1B), a disease which associates profound congenital sensorineural deafness, vestibular dysfunction, and retinitis pigmentosa. The reconstituted cDNA sequence presented here predicts a 2215 amino acid protein with a typical unconventional myosin structure. This protein is expected to dimerize into a two-headed molecule. The C terminus of its tail shares homology with the membrane-binding domain of the band 4.1 protein superfamily. The gene consists of 48 coding exons. It encodes several alternatively spliced forms. In situ hybridization analysis in human embryos demonstrates that the myosin VIIA gene is expressed in the pigment epithelium and the photoreceptor cells of the retina, thus indicating that both cell types may be involved in the USH1B retinal degenerative process. In addition, the gene is expressed in the human embryonic cochlear and vestibular neuroepithelia. We suggest that deafness and vestibular dysfunction in USH1B patients result from a defect in the morphogenesis of the inner ear sensory cell stereocilia.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cool, D.E.; Tonks, N.K.; Charbonneau, H.
1989-07-01
A human peripheral T-cell cDNA library was screened with two labeled synthetic oligonucleotides encoding regions of a human placenta protein-tyrosine-phosphatase. One positive clone was isolated and the nucleotide sequence was determined. It contained 1,305 base pairs of open reading frame followed by a TAA stop codon and 978 base pairs of 3′ untranslated end, although a poly(A)+ tail was not found. An initiator methionine residue was predicted at position 61, which would result in a protein of 415 amino acid residues. This was supported by the synthesis of a Mr 48,000 protein in an in vitro reticulocyte lysate translation system using RNA transcribed from the cloned cDNA and T7 RNA polymerase. The deduced amino acid sequence was compared to other known proteins revealing 65% identity to the low Mr PTPase 1B isolated from placenta. In view of the high degree of similarity, the T-cell cDNA likely encodes a newly discovered protein-tyrosine-phosphatase, thus expanding this family of genes.
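The figures in this abstract are mutually consistent if "position 61" is read as a nucleotide position within the 1,305-bp open reading frame and the TAA stop codon is not counted in those 1,305 bp; both readings are assumptions made here, not statements from the source. A minimal Python sketch of the arithmetic, with a hypothetical function name:

```python
def residues_from_orf(orf_bp, met_position):
    """Number of residues encoded from the initiator ATG to the end of the ORF.

    orf_bp       -- length of the open reading frame in base pairs, stop codon excluded
    met_position -- 1-based nucleotide position of the A of the initiator ATG
    """
    coding_bp = orf_bp - (met_position - 1)
    if coding_bp % 3 != 0:
        raise ValueError("coding region is not a whole number of codons")
    return coding_bp // 3

n_residues = residues_from_orf(1305, 61)   # (1305 - 60) / 3
# Rule-of-thumb mass check, assuming roughly 110 Da per residue on average.
approx_mass_da = n_residues * 110
```

Under the 110 Da rule of thumb the estimate comes out somewhat below the reported Mr of 48,000, which is the kind of agreement one would expect from such a coarse approximation.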
Glutathione peroxidases of the potato cyst nematode Globodera rostochiensis.
Jones, J T; Reavy, B; Smant, G; Prior, A E
2004-01-07
We report the cloning and characterisation of full-length DNAs complementary to RNA (cDNAs) encoding two glutathione peroxidases (GpXs) from a plant parasitic nematode, the potato cyst nematode (PCN) Globodera rostochiensis. One protein has a functional signal peptide that targets the protein for secretion from animal cells while the other is predicted to be intracellular. Both genes are expressed in all parasite stages tested. The mRNA encoding the intracellular GpX is present throughout the nematode second stage juvenile and is particularly abundant in metabolically active tissues including the genital primordia. The mRNA encoding the secreted GpX is restricted to the hypodermis, the outermost cellular layer of the nematode, a location from which it is likely to be secreted to the parasite surface. Biochemical studies confirmed the secreted protein as a functional GpX and showed that, like secreted GpXs of other parasitic nematodes, it does not metabolise hydrogen peroxide but has a preference for larger hydroperoxide substrates. The intracellular protein is likely to have a role in metabolism of active oxygen species derived from internal body metabolism while the secreted protein may protect the parasite from host defences. Other functional roles for this protein are discussed.
Wang, Lei; You, Zhu-Hong; Chen, Xing; Yan, Xin; Liu, Gang; Zhang, Wei
2018-01-01
Identification of interactions between drugs and target proteins plays an important role in discovering new drug candidates. However, identifying drug-target interactions experimentally remains extremely time-consuming, expensive and challenging. Therefore, it is urgent to develop new computational methods to predict potential drug-target interactions (DTI). In this article, a novel computational model is developed for predicting potential drug-target interactions under the theory that each drug-target interaction pair can be represented by the structural properties of drugs and evolutionary information derived from proteins. Specifically, the protein sequences are encoded as a Position-Specific Scoring Matrix (PSSM) descriptor, which contains evolutionary information, and the drug molecules are encoded as a fingerprint feature vector, which represents the presence of certain functional groups or fragments. Four benchmark datasets involving enzymes, ion channels, GPCRs and nuclear receptors are independently used for establishing predictive models with the Rotation Forest (RF) model. The proposed method achieved prediction accuracies of 91.3%, 89.1%, 84.1% and 71.1% for the four datasets, respectively. To make our method more persuasive, we compared our classifier with the state-of-the-art Support Vector Machine (SVM) classifier. We also compared the proposed method with other excellent methods. Experimental results demonstrate that the proposed method is effective in the prediction of DTI, and can provide assistance for new drug research and development. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Yerrapragada, Shaila; Shukla, Animesh; Hallsworth-Pepin, Kymberlie; Choi, Kwangmin; Wollam, Aye; Clifton, Sandra; Qin, Xiang; Muzny, Donna; Raghuraman, Sriram; Ashki, Haleh; Uzman, Akif; Highlander, Sarah K.; Fryszczyn, Bartlomiej G.; Fox, George E.; Tirumalai, Madhan R.; Liu, Yamei; Kim, Sun
2015-01-01
Tolypothrix sp. PCC 7601 is a freshwater filamentous cyanobacterium with complex responses to environmental conditions. Here, we present its 9.96-Mbp draft genome sequence, containing 10,065 putative protein-coding sequences, including 305 predicted two-component system proteins and 27 putative phytochrome-class photoreceptors, the most such proteins in any sequenced genome. PMID:25953173
Christie, Andrew E.; Fontanilla, Tiana M.; Nesbit, Katherine T.; Lenz, Petra H.
2013-01-01
Diel vertical migration and seasonal diapause are critical life history events for the copepod Calanus finmarchicus. While much is known about these behaviors phenomenologically, little is known about their molecular underpinnings. Recent studies in insects suggest that some circadian genes/proteins also contribute to the establishment of seasonal diapause. Thus, it is possible that in Calanus these distinct timing regimes share some genetic components. To begin to address this possibility, we used the well-established Drosophila melanogaster circadian system as a reference for mining clock transcripts from a 200,000+ sequence Calanus transcriptome; the proteins encoded by the identified transcripts were also deduced and characterized. Sequences encoding homologs of the Drosophila core clock proteins CLOCK, CYCLE, PERIOD and TIMELESS were identified, as was one encoding CRYPTOCHROME 2, a core clock protein in ancestral insect systems, but absent in Drosophila. Calanus transcripts encoding proteins known to modulate the Drosophila core clock were also identified and characterized, e.g. CLOCKWORK ORANGE, DOUBLETIME, SHAGGY and VRILLE. Alignment and structural analyses of the deduced Calanus proteins with their Drosophila counterparts revealed extensive sequence conservation, particularly in functional domains. Interestingly, reverse BLAST analyses of these sequences against all arthropod proteins typically revealed non-Drosophila isoforms to be most similar to the Calanus queries. This, in combination with the presence of both CRYPTOCHROME 1 (a clock input pathway protein) and CRYPTOCHROME 2 in Calanus, suggests that the organization of the copepod circadian system is an ancestral one, more similar to that of insects like Danaus plexippus than to that of Drosophila. PMID:23727418
Using ensemble of classifiers for predicting HIV protease cleavage sites in proteins.
Nanni, Loris; Lumini, Alessandra
2009-03-01
The focus of this work is the use of ensembles of classifiers for predicting HIV protease cleavage sites in proteins. Due to the complex relationships in the biological data, several recent works show that often ensembles of learning algorithms outperform stand-alone methods. We show that the fusion of approaches based on different encoding models can be useful for improving the performance of this classification problem. In particular, in this work four different feature encodings for peptides are described and tested. An extensive evaluation on a large dataset according to a blind testing protocol is reported which demonstrates how different feature extraction methods and classifiers can be combined for obtaining a robust and reliable system. The comparison with other stand-alone approaches allows quantifying the performance improvement obtained by the ensembles proposed in this work.
Liu, Bin; Jin, Min; Zeng, Pan
2015-10-01
The identification of gene-phenotype relationships is very important for the treatment of human diseases. Studies have shown that genes causing the same or similar phenotypes tend to interact with each other in a protein-protein interaction (PPI) network. Thus, many identification methods based on the PPI network model have achieved good results. However, in the PPI network, some interactions between the proteins encoded by candidate genes and the proteins encoded by known disease genes are very weak. Therefore, some studies have combined the PPI network with other genomic information and reported good predictive performances. However, we believe that the results could be further improved. In this paper, we propose a new method that uses the semantic similarity between the candidate gene and known disease genes to set the initial probability vector of a random walk with restart algorithm in a human PPI network. The effectiveness of our method was demonstrated by leave-one-out cross-validation, and the experimental results indicated that our method outperformed other methods. Additionally, our method can predict new causative genes of multifactor diseases, including Parkinson's disease, breast cancer and obesity. The top predictions were good and consistent with the findings in the literature, which further illustrates the effectiveness of our method. Copyright © 2015 Elsevier Inc. All rights reserved.
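The random walk with restart mentioned above iterates p(t+1) = (1 - r) W p(t) + r p0, where W is the column-normalised adjacency matrix of the network, p0 is the initial probability vector and r is the restart probability. The pure-Python sketch below shows that generic iteration on a toy graph. It is an illustration under stated assumptions, not the authors' implementation: the restart value of 0.7, the four-node network and the seed weights standing in for semantic-similarity scores are all hypothetical.

```python
def random_walk_with_restart(adj, p0, restart=0.7, tol=1e-12, max_iter=10000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until the L1 change < tol.

    adj -- symmetric adjacency matrix as a list of lists (0/1 or edge weights)
    p0  -- non-negative seed weights; normalised to sum to 1 inside the function
    """
    n = len(adj)
    # Column sums for normalisation; isolated nodes get 1.0 to avoid dividing by zero.
    col_sums = [sum(adj[i][j] for i in range(n)) or 1.0 for j in range(n)]
    total = sum(p0)
    p0 = [x / total for x in p0]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(adj[i][j] / col_sums[j] * p[j] for j in range(n))
            + restart * p0[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p

# Toy PPI network: a path 0 - 1 - 2 - 3 (hypothetical).
toy_ppi = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
# Hypothetical seed weights, standing in for semantic similarity to a candidate gene.
seed_weights = [0.9, 0.0, 0.0, 0.3]
scores = random_walk_with_restart(toy_ppi, seed_weights)
```

Setting the seed weights from a similarity measure, rather than giving every known disease gene the same weight, is the design choice the abstract describes; everything else here is the standard iteration.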
Makeyev, A V; Liebhaber, S A
2000-08-01
We have identified two novel human genes encoding proteins with a high level of sequence identity to two previously characterized RNA-binding proteins, alphaCP-1 and alphaCP-2. Both of these novel genes, alphaCP-3 and alphaCP-4, are predicted to encode proteins with triplicated KH domains. The number and organization of the KH domains, their sequences, and the sequences of the contiguous regions are conserved among all four alphaCP proteins. The common evolutionary origin of these proteins is substantiated by conservation of exon-intron organization in the corresponding genes. The map positions of alphaCP-1 and alphaCP-2 (previously reported) and those of alphaCP-3 and alphaCP-4 (present report) reveal that the four alphaCP loci are dispersed in the human genome; alphaCP-3 and alphaCP-4 mapped to 21q22.3 and 3p21, and the respective mouse orthologues mapped to syntenic regions of the mouse genome, 10B5 and 9F1-F2, respectively. Two additional loci in the human genome were identified as alphaCP-2 processed pseudogenes (PCBP2P1, 21q22.3, and PCBP2P2, 8q21-q22). Although the overall levels of alphaCP-3 and alphaCP-4 mRNAs are substantially lower than those of alphaCP-1 and alphaCP-2, transcripts of alphaCP-3 and alphaCP-4 were found in all mouse tissues tested. These data establish a new subfamily of genes predicted to encode closely related KH-containing RNA-binding proteins with potential functions in posttranscriptional controls. Copyright 2000 Academic Press.
A Bacteriophage-Related Chimeric Marine Virus Infecting Abalone
Zhuang, Jun; Cai, Guiqin; Lin, Qiying; Wu, Zujian; Xie, Lianhui
2010-01-01
Marine viruses shape microbial communities with the most genetic diversity in the sea by multiple genetic exchanges and infect multiple marine organisms. Here we provide proof from experimental infection that abalone shriveling syndrome-associated virus (AbSV) can cause abalone shriveling syndrome. This malady produces histological necrosis and abnormally modified macromolecules (hemocyanin and ferritin). The AbSV genome is a 34.952-kilobase circular double-stranded DNA, containing putative genes with similarity to bacteriophages, eukaryotic viruses, bacteria and endosymbionts. Of the 28 predicted open reading frames (ORFs), eight ORF-encoded proteins have identifiable functional homologues. Four of these ORF products correspond to a predicted terminase large subunit and an endonuclease in bacteriophage, and both an integrase and an exonuclease from bacteria. The other four proteins are homologous to an endosymbiont-derived helicase, primase, single-stranded binding (SSB) protein, and thymidylate kinase, respectively. Additionally, AbSV exhibits a common gene arrangement similar to the majority of bacteriophages. Unique to AbSV, the viral genome also contains genes associated with bacterial outer membrane proteins and may lack the structural protein-encoding ORFs. Genomic characterization of AbSV indicates that it may represent a transitional form of microbial evolution from viruses to bacteria. PMID:21079776
Yang, Bingye; Pu, Fei; Qin, Ji; You, Weiwei; Ke, Caihuan
2014-03-10
During a large-scale screen of the larval transcriptome library of the Portuguese oyster, Crassostrea angulata, the oyster gene RACK, which encodes a receptor for activated protein kinase C, was isolated and characterized. The cDNA is 1,148 bp long and has a predicted open reading frame encoding 317 aa. The predicted protein shows high sequence identity to many RACK proteins of different organisms including molluscs, fish, amphibians and mammals, suggesting that it is conserved during evolution. The structural analysis of the Ca-RACK1 genomic sequence implies that the Ca-RACK1 gene has seven exons and six introns, extending approximately 6.5 kb in length. It is expressed ubiquitously in many oyster tissues as detected by RT-PCR analysis. Ca-RACK1 mRNA expression was markedly increased at larval metamorphosis, and was further increased along with Ca-RACK1 protein synthesis during epinephrine-induced metamorphosis. These results indicate that Ca-RACK1 plays an important role in tissue differentiation and/or in cell growth during larval metamorphosis in the oyster, C. angulata. Copyright © 2013 Elsevier B.V. All rights reserved.
How to kill the honey bee larva: genomic potential and virulence mechanisms of Paenibacillus larvae.
Djukic, Marvin; Brzuszkiewicz, Elzbieta; Fünfhaus, Anne; Voss, Jörn; Gollnow, Kathleen; Poppinga, Lena; Liesegang, Heiko; Garcia-Gonzalez, Eva; Genersch, Elke; Daniel, Rolf
2014-01-01
Paenibacillus larvae, a Gram-positive bacterial pathogen, causes American Foulbrood (AFB), which is the most serious infectious disease of honey bees. In order to investigate the genomic potential of P. larvae, two strains belonging to two different genotypes were sequenced and used for comparative genome analysis. The complete genome sequence of P. larvae strain DSM 25430 (genotype ERIC II) consisted of 4,056,006 bp and harbored 3,928 predicted protein-encoding genes. The draft genome sequence of P. larvae strain DSM 25719 (genotype ERIC I) comprised 4,579,589 bp and contained 4,868 protein-encoding genes. Both strains harbored a 9.7 kb plasmid and encoded a large number of virulence-associated proteins such as toxins and collagenases. In addition, genes encoding large multimodular enzymes producing nonribosomal peptides or polyketides were identified. In the genome of strain DSM 25719 seven toxin-associated loci were identified and analyzed. Five of them encoded putatively functional toxins. The genome of strain DSM 25430 harbored several toxin loci that showed similarity to corresponding loci in the genome of strain DSM 25719, but were non-functional due to point mutations or disruption by transposases. Although both strains cause AFB, significant differences between the genomes were observed including genome size, number and composition of transposases, insertion elements, predicted phage regions, and strain-specific island-like regions. Transposases, integrases and recombinases are important drivers for genome plasticity. A total of 390 and 273 mobile elements were found in strain DSM 25430 and strain DSM 25719, respectively. Comparative genomics of both strains revealed acquisition of virulence factors by horizontal gene transfer and provided insights into evolution and pathogenicity.
Molecular cloning of an inducible serine esterase gene from human cytotoxic lymphocytes.
Trapani, J A; Klein, J L; White, P C; Dupont, B
1988-01-01
A cDNA clone encoding a human serine esterase gene was isolated from a library constructed from poly(A)+ RNA of allogeneically stimulated, interleukin 2-expanded peripheral blood mononuclear cells. The clone, designated HSE26.1, represents a full-length copy of a 0.9-kilobase mRNA present in human cytotoxic cells but absent from a wide variety of noncytotoxic cell lines. Clone HSE26.1 contains an 892-base-pair sequence, including a single 741-base-pair open reading frame encoding a putative 247-residue polypeptide. The first 20 amino acids of the polypeptide form a leader sequence. The mature protein is predicted to have an unglycosylated Mr of approximately 26,000 and contains a single potential site for N-linked glycosylation. The nucleotide and predicted amino acid sequences of clone HSE26.1 are homologous with all murine and human serine esterases cloned thus far but are most similar to mouse granzyme B (70% nucleotide and 68% amino acid identity). HSE26.1 protein is expressed weakly in unstimulated peripheral blood mononuclear cells but is strongly induced within 6-hr incubation in medium containing phytohemagglutinin. The data suggest that the protein encoded by HSE26.1 plays a role in cell-mediated cytotoxicity. PMID:3261871
Computational Analysis of Uncharacterized Proteins of Environmental Bacterial Genome
NASA Astrophysics Data System (ADS)
Coxe, K. J.; Kumar, M.
2017-12-01
Betaproteobacteria strain CB is a gram-negative bacterium in the phylum Proteobacteria and is found naturally in soil and water. In this complex environment, bacteria play a key role in efficiently eliminating organic material and other pollutants from wastewater. To investigate the process of pollutant removal from wastewater using bacteria, it is important to characterize the proteins encoded by the bacterial genome. Our study combines a number of bioinformatics tools to predict the function of unassigned proteins in the bacterial genome. The genome of Betaproteobacteria strain CB encodes 2,112 proteins, of which 508 have no known function and are termed uncharacterized proteins (UPs). The localization of the UPs within the cell was determined and the structure of 38 UPs was accurately predicted. These UPs were predicted to belong to various classes of proteins such as enzymes, transporters, binding proteins, signal peptides, transmembrane proteins and other proteins. The outcome of this work will help to better understand wastewater treatment mechanisms.
The abundant extrachromosomal DNA content of the Spiroplasma citri GII3-3X genome
Saillard, Colette; Carle, Patricia; Duret-Nurbel, Sybille; Henri, Raphaël; Killiny, Nabil; Carrère, Sébastien; Gouzy, Jérome; Bové, Joseph-Marie; Renaudin, Joël; Foissac, Xavier
2008-01-01
Background Spiroplasma citri, the causal agent of citrus stubborn disease, is a bacterium of the class Mollicutes and is transmitted by phloem-feeding leafhopper vectors. In order to characterize candidate genes potentially involved in spiroplasma transmission and pathogenicity, the genome of S. citri strain GII3-3X is currently being deciphered. Results Assembling 20,000 sequencing reads generated seven circular contigs, none of which fit the 1.8 Mb chromosome map or carried chromosomal markers. These contigs correspond to seven plasmids: pSci1 to pSci6, with sizes ranging from 12.9 to 35.3 kbp and pSciA of 7.8 kbp. Plasmids pSci were detected as multiple copies in strain GII3-3X. Plasmid copy numbers of pSci1-6, as deduced from sequencing coverage, were estimated at 10 to 14 copies per spiroplasma cell, representing 1.6 Mb of extrachromosomal DNA. Genes encoding proteins of the TrsE-TraE, Mob, TraD-TraG, and Soj-ParA protein families were predicted in most of the pSci sequences, in addition to members of 14 protein families of unknown function. Plasmid pSci6 encodes protein P32, a marker of insect transmissibility. Plasmids pSci1-5 code for eight different S. citri adhesion-related proteins (ScARPs) that are homologous to the previously described protein P89 and the S. kunkelii SkARP1. Conserved signal peptides and C-terminal transmembrane alpha helices were predicted in all ScARPs. The predicted surface-exposed N-terminal region possesses the following elements: (i) 6 to 8 repeats of 39 to 42 amino acids each (sarpin repeats), (ii) a central conserved region of 330 amino acids followed by (iii) a more variable domain of about 110 amino acids. The C-terminus, predicted to be cytoplasmic, consists of a 27 amino acid stretch enriched in arginine and lysine (KR) and an optional 23 amino acid stretch enriched in lysine, aspartate and glutamate (KDE).
Plasmids pSci mainly present a linear increase of cumulative GC skew except in regions presenting conserved hairpin structures. Conclusion The genome of S. citri GII3-3X is characterized by abundant extrachromosomal elements. The pSci plasmids could not only be vertically inherited but also horizontally transmitted, as they encode proteins usually involved in DNA element partitioning and cell to cell DNA transfer. Because plasmids pSci1-5 encode surface proteins of the ScARP family and pSci6 was recently shown to confer insect transmissibility, diversity and abundance of S. citri plasmids may essentially aid the rapid adaptation of S. citri to more efficient transmission by different insect vectors and to various plant hosts. PMID:18442384
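The cumulative GC skew mentioned in this abstract is commonly computed by sliding along the sequence in windows, taking (G - C) / (G + C) in each window and keeping a running sum; a steady strand bias then shows up as a roughly linear trend. The Python sketch below follows that common definition. The window size, the function name and the toy sequences are assumptions for illustration, since the abstract does not give the authors' parameters.

```python
def cumulative_gc_skew(seq, window=1000):
    """Running sum of (G - C) / (G + C) over consecutive non-overlapping windows.

    A trailing partial window is ignored. Windows without G or C contribute 0.
    """
    seq = seq.upper()
    running = 0.0
    curve = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        skew = (g - c) / (g + c) if (g + c) else 0.0
        running += skew
        curve.append(running)
    return curve

# Toy sequences (hypothetical): a uniformly G-rich strand gives a straight rising line,
# and a C-rich stretch bends the curve downward.
rising = cumulative_gc_skew("GGGC" * 3, window=4)
bent = cumulative_gc_skew("GGGC" * 2 + "CCCG", window=4)
```

On real plasmid sequences the interesting features are the places where the curve departs from its linear trend, which the abstract associates with conserved hairpin structures.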
NASA Astrophysics Data System (ADS)
Yusof, Nik Yusnoraini; Bakar, Farah Diba Abu; Mahadi, Nor Muhammad; Raih, Mohd Firdaus; Murad, Abdul Munir Abdul
2015-09-01
A cDNA encoding an Fe(II) 2-oxoglutarate (2OG) dependent dioxygenase was isolated from the psychrophilic yeast Glaciozyma antarctica PI12. We have successfully amplified a 1,029 bp cDNA sequence that encodes 342 amino acids with a predicted molecular weight of 38 kDa. The predicted protein was analysed using various bioinformatics tools to explore its properties. Based on a BLAST search analysis, the Fe2OX amino acid sequence showed 61% identity to the sequence of oxoglutarate/iron-dependent oxygenase from Rhodosporidium toruloides NP11. SignalP prediction showed that the Fe2OX protein contains no putative signal peptide, which suggests that this enzyme is most probably localised intracellularly. The structure of Fe2OX was predicted by homology modelling using MODELLER9v11. The model with the lowest objective function was selected from one hundred models generated using MODELLER9v11. Analysis of the structure revealed a longer loop in Fe2OX from G. antarctica that might be responsible for the flexibility of the structure, which contributes to its adaptation to low temperatures. Fe2OX holds a highly conserved Fe(II) binding HXD/E…H triad motif. The binding site for 2-oxoglutarate was found to be conserved at Arg280 among reported studies; however, Phe268 was found to be different in Fe2OX.
Jiménez, Diego Javier; Dini-Andreote, Francisco; Ottoni, Júlia Ronzella; de Oliveira, Valéria Maia; van Elsas, Jan Dirk; Andreote, Fernando Dini
2015-05-01
The occurrence of genes encoding biotechnologically relevant α/β-hydrolases in mangrove soil microbial communities was assessed using data obtained by whole-metagenome sequencing of four mangrove areas, denoted BrMgv01 to BrMgv04, in São Paulo, Brazil. The sequences (215 Mb in total) were filtered based on local amino acid alignments against the Lipase Engineering Database. In total, 5923 unassembled sequences were affiliated with 30 different α/β-hydrolase fold superfamilies. The most abundant predicted proteins encompassed cytosolic hydrolases (abH08; ∼ 23%), microsomal hydrolases (abH09; ∼ 12%) and Moraxella lipase-like proteins (abH04 and abH01; < 5%). Detailed analysis of the genes predicted to encode proteins of the abH08 superfamily revealed a high proportion related to epoxide hydrolases and haloalkane dehalogenases in polluted mangroves BrMgv01-02-03. This suggested selection and putative involvement in local degradation/detoxification of the pollutants. Seven sequences that were annotated as genes for putative epoxide hydrolases and five for putative haloalkane dehalogenases were found in a fosmid library generated from BrMgv02 DNA. The latter enzymes were predicted to belong to Actinobacteria, Deinococcus-Thermus, Planctomycetes and Proteobacteria. Our integrated approach thus identified 12 genes (complete and/or partial) that may encode hitherto undescribed enzymes. The low amino acid identity (< 60%) with already-described genes opens perspectives for both production in an expression host and genetic screening of metagenomes. © 2014 The Authors. Microbial Biotechnology published by John Wiley & Sons Ltd and Society for Applied Microbiology.
Burton, Rachel A.; Johnson, Philip E.; Beckles, Diane M.; Fincher, Geoffrey B.; Jenner, Helen L.; Naldrett, Mike J.; Denyer, Kay
2002-01-01
In most species, the synthesis of ADP-glucose (Glc) by the enzyme ADP-Glc pyrophosphorylase (AGPase) occurs entirely within the plastids in all tissues so far examined. However, in the endosperm of many, if not all grasses, a second form of AGPase synthesizes ADP-Glc outside the plastid, presumably in the cytosol. In this paper, we show that in the endosperm of wheat (Triticum aestivum), the cytosolic form accounts for most of the AGPase activity. Using a combination of molecular and biochemical approaches to identify the cytosolic and plastidial protein components of wheat endosperm AGPase we show that the large and small subunits of the cytosolic enzyme are encoded by genes previously thought to encode plastidial subunits, and that a gene, Ta.AGP.S.1, which encodes the small subunit of the cytosolic form of AGPase, also gives rise to a second transcript by the use of an alternate first exon. This second transcript encodes an AGPase small subunit with a transit peptide. However, we could not find a plastidial small subunit protein corresponding to this transcript. The protein sequence of the purified plastidial small subunit does not match precisely to that encoded by Ta.AGP.S.1 or to the predicted sequences of any other known gene from wheat or barley (Hordeum vulgare). Instead, the protein sequence is most similar to those of the plastidial small subunits from chickpea (Cicer arietinum) and maize (Zea mays) and rice (Oryza sativa) seeds. These data suggest that the gene encoding the major plastidial small subunit of AGPase in wheat endosperm has yet to be identified. PMID:12428011
Maier, Uwe-G; Zauner, Stefan; Woehle, Christian; Bolte, Kathrin; Hempel, Franziska; Allen, John F.; Martin, William F.
2013-01-01
Plastid and mitochondrial genomes have undergone parallel evolution to encode the same functional set of genes. These encode conserved protein components of the electron transport chain in their respective bioenergetic membranes and genes for the ribosomes that express them. This highly convergent aspect of organelle genome evolution is partly explained by the redox regulation hypothesis, which predicts a separate plastid or mitochondrial location for genes encoding bioenergetic membrane proteins of either photosynthesis or respiration. Here we show that convergence in organelle genome evolution is far stronger than previously recognized, because the same set of genes for ribosomal proteins is independently retained by both plastid and mitochondrial genomes. A hitherto unrecognized selective pressure retains genes for the same ribosomal proteins in both organelles. On the Escherichia coli ribosome assembly map, the retained proteins are implicated in 30S and 50S ribosomal subunit assembly and initial rRNA binding. We suggest that ribosomal assembly imposes functional constraints that govern the retention of ribosomal protein coding genes in organelles. These constraints are subordinate to redox regulation for electron transport chain components, which anchor the ribosome to the organelle genome in the first place. As organelle genomes undergo reduction, the rRNAs also become smaller. Below size thresholds of approximately 1,300 nucleotides (16S rRNA) and 2,100 nucleotides (26S rRNA), all ribosomal protein coding genes are lost from organelles, while electron transport chain components remain organelle encoded as long as the organelles use redox chemistry to generate a proton motive force. PMID:24259312
Protein targeting and integration signal for the chloroplastic outer envelope membrane.
Li, H M; Chen, L J
1996-01-01
Most proteins in chloroplasts are encoded by the nuclear genome and synthesized in the cytosol. With the exception of most outer envelope membrane proteins, nuclear-encoded chloroplastic proteins are synthesized with N-terminal extensions that contain the chloroplast targeting information of these proteins. Most outer membrane proteins, however, are synthesized without extensions in the cytosol. Therefore, it is not clear where the chloroplastic outer membrane targeting information resides within these polypeptides. We have analyzed a chloroplastic outer membrane protein, OEP14 (outer envelope membrane protein of 14 kD, previously named OM14), and localized its outer membrane targeting and integration signal to the first 30 amino acids of the protein. This signal consists of a positively charged N-terminal portion followed by a hydrophobic core, bearing resemblance to the signal peptides of proteins targeted to the endoplasmic reticulum. However, a chimeric protein containing this signal fused to a passenger protein did not integrate into the endoplasmic reticulum membrane. Furthermore, membrane topology analysis indicated that the signal inserts into the chloroplastic outer membrane in an orientation opposite to that predicted by the "positive inside" rule. PMID:8953775
Niskanen, Einari A; Hytönen, Vesa P; Grapputo, Alessandro; Nordlund, Henri R; Kulomaa, Markku S; Laitinen, Olli H
2005-01-01
Background A chicken egg contains several biotin-binding proteins (BBPs), whose complete DNA and amino acid sequences are not known. In order to identify and characterise these genes and proteins we studied chicken cDNAs and genes available in the NCBI database and chicken genome database using the reported N-terminal amino acid sequences of chicken egg-yolk BBPs as search strings. Results Two separate hits showing significant homology for these N-terminal sequences were discovered. For one of these hits, the chromosomal location in the immediate proximity of the avidin gene family was found. Both of these hits encode proteins having high sequence similarity with avidin suggesting that chicken BBPs are paralogous to avidin family. In particular, almost all residues corresponding to biotin binding in avidin are conserved in these putative BBP proteins. One of the found DNA sequences, however, seems to encode a carboxy-terminal extension not present in avidin. Conclusion We describe here the predicted properties of the putative BBP genes and proteins. Our present observations link BBP genes together with avidin gene family and shed more light on the genetic arrangement and variability of this family. In addition, comparative modelling revealed the potential structural elements important for the functional and structural properties of the putative BBP proteins. PMID:15777476
de-Couet, H. G.; Fong, KSK.; Weeds, A. G.; McLaughlin, P. J.; Miklos, GLG.
1995-01-01
The flightless locus of Drosophila melanogaster has been analyzed at the genetic, molecular, ultrastructural and comparative crystallographic levels. The gene encodes a single transcript encoding a protein consisting of a leucine-rich amino-terminal half and a carboxy-terminal half with high sequence similarity to gelsolin. We determined the genomic sequence of the flightless landscape, the breakpoints of four chromosomal rearrangements, and the molecular lesions in two lethal and two viable alleles of the gene. The two alleles that lead to flight muscle abnormalities encode mutant proteins exhibiting amino acid replacements within the S1-like domain of their gelsolin-like region. Furthermore, the deduced intron-exon structure of the D. melanogaster gene has been compared with that of the Caenorhabditis elegans homologue. In addition, the sequence similarities of the flightless protein with gelsolin allow it to be evaluated in the context of the published crystallographic structure of the S1 domain of gelsolin. Amino acids considered essential for the structural integrity of the core are found to be highly conserved in the predicted flightless protein. Some of the residues considered essential for actin and calcium binding in gelsolin S1 and villin V1 are also well conserved. These data are discussed in light of the phenotypic characteristics of the mutants and the putative functions of the protein. PMID:8582612
David S. Bischoff; James M. Slavicek
1995-01-01
The Lymantria dispar multinucleocapsid nuclear polyhedrosis virus (LdMNPV) gene encoding G22 was cloned and sequenced. The G22 gene codes for a 191 amino acid protein with a predicted Mr of 22000. Expression of G22 in a rabbit reticulocyte system generated a protein with an M...
Yerrapragada, Shaila; Shukla, Animesh; Hallsworth-Pepin, Kymberlie; Choi, Kwangmin; Wollam, Aye; Clifton, Sandra; Qin, Xiang; Muzny, Donna; Raghuraman, Sriram; Ashki, Haleh; Uzman, Akif; Highlander, Sarah K; Fryszczyn, Bartlomiej G; Fox, George E; Tirumalai, Madhan R; Liu, Yamei; Kim, Sun; Kehoe, David M; Weinstock, George M
2015-05-07
Tolypothrix sp. PCC 7601 is a freshwater filamentous cyanobacterium with complex responses to environmental conditions. Here, we present its 9.96-Mbp draft genome sequence, containing 10,065 putative protein-coding sequences, including 305 predicted two-component system proteins and 27 putative phytochrome-class photoreceptors, the most such proteins in any sequenced genome. Copyright © 2015 Yerrapragada et al.
Meher, Prabina Kumar; Sahu, Tanmaya Kumar; Banchariya, Anjali; Rao, Atmakuri Ramakrishna
2017-03-24
Insecticide resistance is a major challenge for insect pest control programs in fields such as crop protection and human and animal health. Resistance to different insecticides is conferred by proteins encoded by certain classes of insect genes. To date, no computational tool is available to distinguish insecticide-resistant proteins from non-resistant proteins. Developing such a tool would help in predicting insecticide-resistant proteins, which can then be targeted when developing appropriate insecticides. Five feature sets, namely amino acid composition (AAC), di-peptide composition (DPC), pseudo amino acid composition (PAAC), composition-transition-distribution (CTD) and auto-correlation function (ACF), were used to map the protein sequences into numeric feature vectors. The encoded numeric vectors were then used as input to a support vector machine (SVM) for classification of insecticide-resistant and non-resistant proteins. Higher accuracies were obtained with the RBF kernel than with other kernels. Further, accuracies were higher for the DPC feature set than for the others. The proposed approach achieved an overall accuracy of >90% in discriminating resistant from non-resistant proteins. Further, the two classes of resistant proteins, i.e., detoxification-based and target-based, were discriminated from non-resistant proteins with >95% accuracy. In addition, >95% accuracy was observed for discrimination between proteins involved in detoxification-based and target-based resistance mechanisms. The proposed approach not only outperformed the Blastp, PSI-Blast and Delta-Blast algorithms, but also achieved >92% accuracy when assessed using an independent dataset of 75 insecticide-resistant proteins. This paper presents the first computational approach for discriminating insecticide-resistant proteins from non-resistant proteins.
Based on the proposed approach, an online prediction server, DIRProt, has also been developed for computational prediction of insecticide-resistant proteins; it is accessible at http://cabgrid.res.in:8080/dirprot/ . The proposed approach is expected to supplement wet-lab efforts to develop dynamic insecticides by targeting insecticide-resistant proteins.
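The two simplest feature sets named in this abstract, AAC and DPC, can be illustrated with a minimal sketch. It assumes the usual definitions (relative frequency of each of the 20 standard residues, and of each of the 400 ordered residue pairs); the function names and the handling of non-standard residues are illustrative choices, not taken from the DIRProt implementation.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order so vectors are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def aac(seq):
    """Amino acid composition: 20 relative frequencies (assumed definition)."""
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    n = len(residues)
    return [residues.count(a) / n if n else 0.0 for a in AMINO_ACIDS]


def dpc(seq):
    """Di-peptide composition: 400 relative frequencies of adjacent pairs.

    Pairs containing a non-standard residue are skipped (an assumption).
    """
    seq = seq.upper()
    counts = {d: 0 for d in DIPEPTIDES}
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    return [counts[d] / total if total else 0.0 for d in DIPEPTIDES]


if __name__ == "__main__":
    toy = "AAC"  # toy sequence, not a real protein
    print(len(aac(toy)), len(dpc(toy)))
```

Vectors of this kind would then be passed to a classifier such as an SVM; that step is omitted here because the abstract does not give the training details.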
Identifying metabolic enzymes with multiple types of association evidence
Kharchenko, Peter; Chen, Lifeng; Freund, Yoav; Vitkup, Dennis; Church, George M
2006-01-01
Background Existing large-scale metabolic models of sequenced organisms commonly include enzymatic functions which cannot be attributed to any gene in that organism. Existing computational strategies for identifying such missing genes rely primarily on sequence homology to known enzyme-encoding genes. Results We present a novel method for identifying genes encoding a specific metabolic function based on the local structure of the metabolic network and multiple types of functional association evidence, including clustering of genes on the chromosome, similarity of phylogenetic profiles, gene expression, protein fusion events and others. Using E. coli and S. cerevisiae metabolic networks, we illustrate the predictive ability of each individual type of association evidence and show that significantly better predictions can be obtained based on the combination of all data. In this way our method is able to predict 60% of enzyme-encoding genes of E. coli metabolism within the top 10 (out of 3551) candidates for their enzymatic function, and as the top candidate in 43% of the cases. Conclusion We illustrate that a combination of genome context and other functional association evidence is effective in predicting genes encoding metabolic enzymes. Our approach does not rely on direct sequence homology to known enzyme-encoding genes, and can be used in conjunction with traditional homology-based metabolic reconstruction methods. The method can also be used to target orphan metabolic activities. PMID:16571130
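The "within the top 10 candidates" and "top candidate" figures in this abstract are rank-based recovery rates. A minimal sketch of how such a rate could be computed is given below; the function name, the data layout and the toy gene labels are hypothetical and are not taken from the paper.

```python
def top_k_recovery(rankings, truth, k):
    """Fraction of functions whose true gene appears among the top k candidates.

    rankings: dict mapping a function label to its candidate genes, best first.
    truth:    dict mapping the same labels to the known enzyme-encoding gene.
    """
    if not rankings:
        return 0.0
    hits = sum(1 for fn, ranked in rankings.items() if truth[fn] in ranked[:k])
    return hits / len(rankings)


if __name__ == "__main__":
    # Toy data with made-up labels, for illustration only.
    rankings = {
        "fn_a": ["g3", "g1", "g2"],
        "fn_b": ["g2", "g5", "g4"],
        "fn_c": ["g9", "g8", "g7"],
    }
    truth = {"fn_a": "g1", "fn_b": "g2", "fn_c": "g7"}
    for k in (1, 2, 3):
        print(k, top_k_recovery(rankings, truth, k))
```

With k = 1 the function gives the "top candidate" rate, and with k = 10 it gives the "within the top 10" rate.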
EGASP: the human ENCODE Genome Annotation Assessment Project
Guigó, Roderic; Flicek, Paul; Abril, Josep F; Reymond, Alexandre; Lagarde, Julien; Denoeud, France; Antonarakis, Stylianos; Ashburner, Michael; Bajic, Vladimir B; Birney, Ewan; Castelo, Robert; Eyras, Eduardo; Ucla, Catherine; Gingeras, Thomas R; Harrow, Jennifer; Hubbard, Tim; Lewis, Suzanna E; Reese, Martin G
2006-01-01
Background We present the results of EGASP, a community experiment to assess the state-of-the-art in genome annotation within the ENCODE regions, which span 1% of the human genome sequence. The experiment had two major goals: the assessment of the accuracy of computational methods to predict protein coding genes; and the overall assessment of the completeness of the current human genome annotations as represented in the ENCODE regions. For the computational prediction assessment, eighteen groups contributed gene predictions. We evaluated these submissions against each other based on a 'reference set' of annotations generated as part of the GENCODE project. These annotations were not available to the prediction groups prior to the submission deadline, so that their predictions were blind and an external advisory committee could perform a fair assessment. Results The best methods had at least one gene transcript correctly predicted for close to 70% of the annotated genes. Nevertheless, the multiple transcript accuracy, taking into account alternative splicing, reached only approximately 40% to 50% accuracy. At the coding nucleotide level, the best programs reached an accuracy of 90% in both sensitivity and specificity. Programs relying on mRNA and protein sequences were the most accurate in reproducing the manually curated annotations. Experimental validation shows that only a very small percentage (3.2%) of the selected 221 computationally predicted exons outside of the existing annotation could be verified. Conclusion This is the first such experiment in human DNA, and we have followed the standards established in a similar experiment, GASP1, in Drosophila melanogaster. We believe the results presented here contribute to the value of ongoing large-scale annotation projects and should guide further experimental methods when being scaled up to the entire human genome sequence. PMID:16925836
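The nucleotide-level sensitivity and specificity quoted in this abstract can be illustrated with a small sketch. It assumes the convention common in gene-prediction assessment, where sensitivity is TP / (TP + FN) and specificity is TP / (TP + FP); the abstract itself does not spell out the definitions, and the helper names and toy coordinates are hypothetical.

```python
def positions(intervals):
    """Expand inclusive (start, end) coding intervals into a set of positions."""
    covered = set()
    for start, end in intervals:
        covered.update(range(start, end + 1))
    return covered


def nucleotide_accuracy(annotated, predicted):
    """Return (sensitivity, specificity) at the coding-nucleotide level.

    Assumed definitions: Sn = TP / (TP + FN), Sp = TP / (TP + FP).
    """
    tp = len(annotated & predicted)   # coding in both annotation and prediction
    fn = len(annotated - predicted)   # annotated coding, missed by the prediction
    fp = len(predicted - annotated)   # predicted coding, absent from the annotation
    sn = tp / (tp + fn) if tp + fn else 0.0
    sp = tp / (tp + fp) if tp + fp else 0.0
    return sn, sp


if __name__ == "__main__":
    # Toy example: a 100-nt annotated exon and a prediction shifted by 10 nt.
    annotated = positions([(1, 100)])
    predicted = positions([(11, 110)])
    print(nucleotide_accuracy(annotated, predicted))
```

In the toy case 90 of the 100 annotated positions are recovered and 90 of the 110-minus-10 predicted positions are correct, so both measures come out at 0.9.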
Discovery of new enzymes and metabolic pathways by using structure and genome context.
Zhao, Suwen; Kumar, Ritesh; Sakai, Ayano; Vetting, Matthew W; Wood, B McKay; Brown, Shoshana; Bonanno, Jeffery B; Hillerich, Brandan S; Seidel, Ronald D; Babbitt, Patricia C; Almo, Steven C; Sweedler, Jonathan V; Gerlt, John A; Cronan, John E; Jacobson, Matthew P
2013-10-31
Assigning valid functions to proteins identified in genome projects is challenging: overprediction and database annotation errors are the principal concerns. We and others are developing computation-guided strategies for functional discovery with 'metabolite docking' to experimentally derived or homology-based three-dimensional structures. Bacterial metabolic pathways often are encoded by 'genome neighbourhoods' (gene clusters and/or operons), which can provide important clues for functional assignment. We recently demonstrated the synergy of docking and pathway context by 'predicting' the intermediates in the glycolytic pathway in Escherichia coli. Metabolite docking to multiple binding proteins and enzymes in the same pathway increases the reliability of in silico predictions of substrate specificities because the pathway intermediates are structurally similar. Here we report that structure-guided approaches for predicting the substrate specificities of several enzymes encoded by a bacterial gene cluster allowed the correct prediction of the in vitro activity of a structurally characterized enzyme of unknown function (PDB 2PMQ), 2-epimerization of trans-4-hydroxy-L-proline betaine (tHyp-B) and cis-4-hydroxy-D-proline betaine (cHyp-B), and also the correct identification of the catabolic pathway in which Hyp-B 2-epimerase participates. The substrate-liganded pose predicted by virtual library screening (docking) was confirmed experimentally. The enzymatic activities in the predicted pathway were confirmed by in vitro assays and genetic analyses; the intermediates were identified by metabolomics; and repression of the genes encoding the pathway by high salt concentrations was established by transcriptomics, confirming the osmolyte role of tHyp-B. This study establishes the utility of structure-guided functional predictions to enable the discovery of new metabolic pathways.
Structure of adenovirus bound to cellular receptor car
Freimuth, Paul I.
2004-05-18
Disclosed is a mutant adenovirus which has a genome comprising one or more mutations in sequences which encode the fiber protein knob domain wherein the mutation causes the encoded viral particle to have significantly weakened binding affinity for CARD1 relative to wild-type adenovirus. Such mutations may be in sequences which encode either the AB loop, or the HI loop of the fiber protein knob domain. Specific residues and mutations are described. Also disclosed is a method for generating a mutant adenovirus which is characterized by a receptor binding affinity or specificity which differs substantially from wild type. In the method, residues of the adenovirus fiber protein knob domain which are predicted to alter D1 binding when mutated, are identified from the crystal structure coordinates of the AD12knob:CAR-D1 complex. A mutation which alters one or more of the identified residues is introduced into the genome of the adenovirus to generate a mutant adenovirus. Whether or not the mutant produced exhibits altered adenovirus-CAR binding properties is then determined.
Proteins of Unknown Biochemical Function: A Persistent Problem and a Roadmap to Help Overcome It.
Niehaus, Thomas D; Thamm, Antje M K; de Crécy-Lagard, Valérie; Hanson, Andrew D
2015-11-01
The number of sequenced genomes is rapidly increasing, but functional annotation of the genes in these genomes lags far behind. Even in Arabidopsis (Arabidopsis thaliana), only approximately 40% of enzyme- and transporter-encoding genes have credible functional annotations, and this number is even lower in nonmodel plants. Functional characterization of unknown genes is a challenge, but various databases (e.g. for protein localization and coexpression) can be mined to provide clues. If homologous microbial genes exist-and about one-half the genes encoding unknown enzymes and transporters in Arabidopsis have microbial homologs-cross-kingdom comparative genomics can powerfully complement plant-based data. Multiple lines of evidence can strengthen predictions and warrant experimental characterization. In some cases, relatively quick tests in genetically tractable microbes can determine whether a prediction merits biochemical validation, which is costly and demands specialized skills. © 2015 American Society of Plant Biologists. All Rights Reserved.
Molecular cloning and characterization of chitinase genes from Candida albicans.
McCreath, K J; Specht, C A; Robbins, P W
1995-03-28
Chitinase (EC 3.2.1.14) is an important enzyme for the remodeling of chitin in the cell wall of fungi. We have cloned three chitinase genes (CHT1, CHT2, and CHT3) from the dimorphic human pathogen Candida albicans. CHT2 and CHT3 have been sequenced in full and their primary structures have been analyzed: CHT2 encodes a protein of 583 aa with a predicted size of 60.8 kDa; CHT3 encodes a protein of 567 aa with a predicted size of 60 kDa. All three genes show striking similarity to other chitinase genes in the literature, especially in the proposed catalytic domain. Transcription of CHT2 and CHT3 was greater when C. albicans was grown in a yeast phase as compared to a mycelial phase. A transcript of CHT1 could not be detected in either growth condition.
Auffret, Pauline; Segura, Audrey; Klopp, Christophe; Bouchez, Olivier; Kérourédan, Monique; Bibbal, Delphine; Brugère, Hubert; Forano, Evelyne
2017-01-01
ABSTRACT Enterohemorrhagic Escherichia coli (EHEC) with serotype O157:H7 is a major foodborne pathogen. Here, we report the draft genome sequence of EHEC O157:H7 strain MC2 isolated from cattle in France. The assembly contains 5,400,376 bp that encoded 5,914 predicted genes (5,805 protein-encoding genes and 109 RNA genes). PMID:28983004
Staphylococcal SCCmec elements encode an active MCM-like helicase and thus may be replicative
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mir-Sanchis, Ignacio; Roman, Christina A.; Misiura, Agnieszka
2016-08-29
Methicillin-resistant Staphylococcus aureus (MRSA) is a public-health threat worldwide. Although the mobile genomic island responsible for this phenotype, staphylococcal cassette chromosome (SCC), has been thought to be nonreplicative, we predicted DNA-replication-related functions for some of the conserved proteins encoded by SCC. We show that one of these, Cch, is homologous to the self-loading initiator helicases of an unrelated family of genomic islands, that it is an active 3'-to-5' helicase and that the adjacent ORF encodes a single-stranded DNA–binding protein. Our 2.9-Å crystal structure of intact Cch shows that it forms a hexameric ring. Cch, like the archaeal and eukaryotic MCM-family replicative helicases, belongs to the pre–sensor II insert clade of AAA+ ATPases. Additionally, we found that SCC elements are part of a broader family of mobile elements, all of which encode a replication initiator upstream of their recombinases. Replication after excision would enhance the efficiency of horizontal gene transfer.
Zeng, Mu-Heng; Liu, Sheng-Hong; Yang, Miao-Xian; Zhang, Ya-Jun; Liang, Jia-Yong; Wan, Xiao-Rong; Liang, Hong
2013-01-01
Clathrin, a three-legged triskelion composed of three clathrin heavy chains (CHCs) and three light chains (CLCs), plays a critical role in clathrin-mediated endocytosis (CME) in eukaryotic cells. In this study, the genes ZmCHC1 and ZmCHC2 encoding clathrin heavy chain in maize were cloned and characterized for the first time in monocots. ZmCHC1 comprises 29 exons and 28 introns and encodes a 1693-amino-acid protein, and ZmCHC2 comprises 28 exons and 27 introns and encodes a 1746-amino-acid protein. The high similarities of gene structure, protein sequences and 3D models among ZmCHC1, and Arabidopsis AtCHC1 and AtCHC2 suggest their similar functions in CME. The ZmCHC1 gene is predominantly expressed in maize roots, whereas ZmCHC2 is ubiquitously expressed. Consistent with a typical predicted salicylic acid (SA)-responsive element and four predicted ABA-responsive elements (ABREs) in the promoter sequence of ZmCHC1, the expression of ZmCHC1, but not ZmCHC2, in maize roots is significantly up-regulated by SA or ABA, suggesting that the ZmCHC1 gene may be involved in the SA signaling pathway in maize defense responses. The expression of the ZmCHC1 and ZmCHC2 genes in maize is down-regulated by azide or cold treatment, further revealing the energy requirement of CME and suggesting that CME in plants is sensitive to low temperatures. PMID:23880865
β-Lactamase Genes of the Penicillin-Susceptible Bacillus anthracis Sterne Strain
Chen, Yahua; Succi, Janice; Tenover, Fred C.; Koehler, Theresa M.
2003-01-01
Susceptibility to penicillin and other β-lactam-containing compounds is a common trait of Bacillus anthracis. β-lactam agents, particularly penicillin, have been used worldwide to treat anthrax in humans. Nonetheless, surveys of clinical and soil-derived strains reveal penicillin G resistance in 2 to 16% of isolates tested. Bacterial resistance to β-lactam agents is often mediated by production of one or more types of β-lactamases that hydrolyze the β-lactam ring, inactivating the antimicrobial agent. Here, we report the presence of two β-lactamase (bla) genes in the penicillin-susceptible Sterne strain of B. anthracis. We identified bla1 by functional cloning with Escherichia coli. bla1 is a 927-nucleotide (nt) gene predicted to encode a protein with 93.8% identity to the type I β-lactamase gene of Bacillus cereus. A second gene, bla2, was identified by searching the unfinished B. anthracis chromosome sequence database of The Institute for Genome Research for open reading frames (ORFs) predicted to encode β-lactamases. We found a partial ORF predicted to encode a protein with significant similarity to the carboxy-terminal end of the type II β-lactamase of B. cereus. DNA adjacent to the 5′ end of the partial ORF was cloned using inverse PCR. bla2 is a 768-nt gene predicted to encode a protein with 92% identity to the B. cereus type II enzyme. The bla1 and bla2 genes confer ampicillin resistance to E. coli and Bacillus subtilis when cloned individually in these species. The MICs of various antimicrobial agents for the E. coli clones indicate that the two β-lactamase genes confer different susceptibility profiles to E. coli; bla1 is a penicillinase, while bla2 appears to be a cephalosporinase. The β-galactosidase activities of B. cereus group species harboring bla promoter-lacZ transcriptional fusions indicate that bla1 is poorly transcribed in B. anthracis, B. cereus, and B. thuringiensis. The bla2 gene is strongly expressed in B. cereus and B. 
thuringiensis and weakly expressed in B. anthracis. Taken together, these data indicate that the bla1 and bla2 genes of the B. anthracis Sterne strain encode functional β-lactamases of different types, but gene expression is usually not sufficient to confer resistance to β-lactam agents. PMID:12533457
In Silico Pattern-Based Analysis of the Human Cytomegalovirus Genome
Rigoutsos, Isidore; Novotny, Jiri; Huynh, Tien; Chin-Bow, Stephen T.; Parida, Laxmi; Platt, Daniel; Coleman, David; Shenk, Thomas
2003-01-01
More than 200 open reading frames (ORFs) from the human cytomegalovirus genome have been reported as potentially coding for proteins. We have used two pattern-based in silico approaches to analyze this set of putative viral genes. With the help of an objective annotation method that is based on the Bio-Dictionary, a comprehensive collection of amino acid patterns that describes the currently known natural sequence space of proteins, we have reannotated all of the previously reported putative genes of the human cytomegalovirus. Also, with the help of MUSCA, a pattern-based multiple sequence alignment algorithm, we have reexamined the original human cytomegalovirus gene family definitions. Our analysis of the genome shows that many of the coded proteins comprise amino acid combinations that are unique to either the human cytomegalovirus or the larger group of herpesviruses. We have confirmed that a surprisingly large portion of the analyzed ORFs encode membrane proteins, and we have discovered a significant number of previously uncharacterized proteins that are predicted to be G-protein-coupled receptor homologues. The analysis also indicates that many of the encoded proteins undergo posttranslational modifications such as hydroxylation, phosphorylation, and glycosylation. ORFs encoding proteins with similar functional behavior appear in neighboring regions of the human cytomegalovirus genome. All of the results of the present study can be found and interactively explored online (http://cbcsrv.watson.ibm.com/virus/). PMID:12634390
2012-01-01
Background Natrialba magadii is an aerobic chemoorganotrophic member of the Euryarchaeota and is a dual extremophile requiring alkaline conditions and hypersalinity for optimal growth. The genome sequence of Nab. magadii type strain ATCC 43099 was deciphered to obtain a comprehensive insight into the genetic content of this haloarchaeon and to understand the basis of some of the cellular functions necessary for its survival. Results The genome of Nab. magadii consists of four replicons with a total sequence of 4,443,643 bp and encodes 4,212 putative proteins, some of which contain peptide repeats of various lengths. Comparative genome analyses facilitated the identification of genes encoding putative proteins involved in adaptation to hypersalinity, stress response, glycosylation, and polysaccharide biosynthesis. A proton-driven ATP synthase and a variety of putative cytochromes and other proteins supporting aerobic respiration and electron transfer were encoded by one or more of Nab. magadii replicons. The genome encodes a number of putative proteases/peptidases as well as protein secretion functions. Genes encoding putative transcriptional regulators, basal transcription factors, signal perception/transduction proteins, and chemotaxis/phototaxis proteins were abundant in the genome. Pathways for the biosynthesis of thiamine, riboflavin, heme, cobalamin, coenzyme F420 and other essential co-factors were deduced by in depth sequence analyses. However, approximately 36% of Nab. magadii protein coding genes could not be assigned a function based on Blast analysis and have been annotated as encoding hypothetical or conserved hypothetical proteins. Furthermore, despite extensive comparative genomic analyses, genes necessary for survival in alkaline conditions could not be identified in Nab. magadii. Conclusions Based on genomic analyses, Nab. magadii is predicted to be metabolically versatile and it could use different carbon and energy sources to sustain growth. Nab. 
magadii has the genetic potential to adapt to its milieu by intracellular accumulation of inorganic cations and/or neutral organic compounds. The identification of Nab. magadii genes involved in coenzyme biosynthesis is a necessary step toward further reconstruction of the metabolic pathways in halophilic archaea and other extremophiles. The knowledge gained from the genome sequence of this haloalkaliphilic archaeon is highly valuable in advancing the applications of extremophiles and their enzymes. PMID:22559199
USDA-ARS's Scientific Manuscript database
A nitrogen-fixing alfalfa-nodulating microsymbiont, Sinorhizobium meliloti, has a genome consisting of a 3.5 Mbp circular chromosome and two megaplasmids totaling 3.0 Mbp, one a 1.3 Mbp pSymA carrying nonessential ‘accessory’ genes including nif, nod and others involved in plant interaction. Predict...
Residue-Specific Side-Chain Polymorphisms via Particle Belief Propagation.
Ghoraie, Laleh Soltan; Burkowski, Forbes; Li, Shuai Cheng; Zhu, Mu
2014-01-01
Protein side chains populate diverse conformational ensembles in crystals. Despite much evidence that there is widespread conformational polymorphism in protein side chains, most of the X-ray crystallography data are modeled by single conformations in the Protein Data Bank. The ability to extract or to predict these conformational polymorphisms is of crucial importance, as it facilitates deeper understanding of protein dynamics and functionality. In this paper, we describe a computational strategy capable of predicting side-chain polymorphisms. Our approach extends a particular class of algorithms for side-chain prediction by modeling the side-chain dihedral angles more appropriately as continuous rather than discrete variables. Employing a new inferential technique known as particle belief propagation, we predict residue-specific distributions that encode information about side-chain polymorphisms. Our predicted polymorphisms are in relatively close agreement with results from a state-of-the-art approach based on X-ray crystallography data, which characterizes the conformational polymorphisms of side chains using electron density information, and has successfully discovered previously unmodeled conformations.
Hopkins, Julia F.; Spencer, David F.; Laboissiere, Sylvie; Neilson, Jonathan A.D.; Eveleigh, Robert J.M.; Durnford, Dion G.; Gray, Michael W.; Archibald, John M.
2012-01-01
Chlorarachniophytes are unicellular marine algae with plastids (chloroplasts) of secondary endosymbiotic origin. Chlorarachniophyte cells retain the remnant nucleus (nucleomorph) and cytoplasm (periplastidial compartment, PPC) of the green algal endosymbiont from which their plastid was derived. To characterize the diversity of nucleus-encoded proteins targeted to the chlorarachniophyte plastid, nucleomorph, and PPC, we isolated plastid–nucleomorph complexes from the model chlorarachniophyte Bigelowiella natans and subjected them to high-pressure liquid chromatography-tandem mass spectrometry. Our proteomic analysis, the first of its kind for a nucleomorph-bearing alga, resulted in the identification of 324 proteins with 95% confidence. Approximately 50% of these proteins have predicted bipartite leader sequences at their amino termini. Nucleus-encoded proteins make up >90% of the proteins identified. With respect to biological function, plastid-localized light-harvesting proteins were well represented, as were proteins involved in chlorophyll biosynthesis. Phylogenetic analyses revealed that many, but by no means all, of the proteins identified in our proteomic screen are of apparent green algal ancestry, consistent with the inferred evolutionary origin of the plastid and nucleomorph in chlorarachniophytes. PMID:23221610
Cloning and bioinformatic analysis of lovastatin biosynthesis regulatory gene lovE.
Huang, Xin; Li, Hao-ming
2009-08-05
Lovastatin is an effective drug for treatment of hyperlipidemia. This study aimed to clone the lovastatin biosynthesis regulatory gene lovE and analyze the structure and function of its encoded protein. Based on the lovastatin synthase gene sequence from GenBank, primers were designed to amplify and clone the lovastatin biosynthesis regulatory gene lovE from Aspergillus terreus genomic DNA. Bioinformatic analysis of lovE and its encoded amino acid sequence was performed using internet resources and software such as DNAMAN. The target fragment lovE, almost 1500 bp in length, was amplified from Aspergillus terreus genomic DNA, and the secondary and three-dimensional structures of the LovE protein were predicted. In the lovastatin biosynthesis process, lovE is a regulatory gene and the LovE protein is a GAL4-like transcription factor.
Vandenbol, M; Jauniaux, J C; Grenson, M
1989-11-15
The complete nucleotide (nt) sequence of the PUT4 gene, whose product is required for high-affinity proline active transport in the yeast Saccharomyces cerevisiae, is presented. The sequence contains a single long open reading frame of 1881 nt, encoding a polypeptide with a calculated Mr of 68,795. The predicted protein is strongly hydrophobic and exhibits six potential glycosylation sites. Its hydropathy profile suggests the presence of twelve membrane-spanning regions flanked by hydrophilic N- and C-terminal domains. The N terminus does not resemble signal sequences found in secreted proteins. These features are characteristic of integral membrane proteins catalyzing translocation of ligands across cellular membranes. Protein sequence comparisons indicate strong resemblance to the arginine and histidine permeases of S. cerevisiae, but no marked sequence similarity to the proline permease of Escherichia coli or to other known prokaryotic or eukaryotic transport proteins. The strong similarity between the three yeast amino acid permeases suggests a common ancestor for the three proteins.
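As an illustration of how a hydropathy profile of the kind mentioned above can suggest membrane-spanning regions, the following minimal sketch averages per-residue hydropathy values over a sliding window and counts contiguous stretches above a cutoff. The abstract does not state which scale or parameters were used; the Kyte-Doolittle scale, the 19-residue window, the 1.6 cutoff, and the toy sequence are all assumptions made for this example.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the abstract names none).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along the sequence."""
    scores = [KD[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def count_hydrophobic_segments(profile, threshold=1.6):
    """Count contiguous runs of windows whose mean is at or above threshold."""
    count = 0
    inside = False
    for value in profile:
        if value >= threshold and not inside:
            count += 1
            inside = True
        elif value < threshold:
            inside = False
    return count


# Hypothetical toy sequence: two hydrophobic stretches separated and flanked
# by charged residues. It is not the PUT4 sequence.
toy = (
    "KDEKDEKDEK"
    + "LIVLIVLIVLIVLIVLIVLIVLIV"
    + "KDEKDEKDEKDEKDEKDEKDEKDE"
    + "LIVALIVALIVALIVALIVALIVA"
    + "KDEKDEKDEK"
)
profile = hydropathy_profile(toy)
print(count_hydrophobic_segments(profile))
```

A profile-based count of this kind is only suggestive; dedicated transmembrane predictors use additional information.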
Lazar Adler, Natalie R; Stevens, Mark P; Dean, Rachel E; Saint, Richard J; Pankhania, Depesh; Prior, Joann L; Atkins, Timothy P; Kessler, Bianca; Nithichanon, Arnone; Lertmemongkolchai, Ganjana; Galyov, Edouard E
2015-01-01
Burkholderia pseudomallei is the causative agent of the severe tropical disease melioidosis, which commonly presents as sepsis. The B. pseudomallei K96243 genome encodes eleven predicted autotransporters, a diverse family of secreted and outer membrane proteins often associated with virulence. In a systematic study of these autotransporters, we constructed insertion mutants in each gene predicted to encode an autotransporter and assessed them for three pathogenesis-associated phenotypes: virulence in the BALB/c intra-peritoneal mouse melioidosis model, net intracellular replication in J774.2 murine macrophage-like cells and survival in 45% (v/v) normal human serum. From the complete repertoire of eleven autotransporter mutants, we identified eight mutants which exhibited an increase in median lethal dose of 1 to 2-log10 compared to the isogenic parent strain (bcaA, boaA, boaB, bpaA, bpaC, bpaE, bpaF and bimA). Four mutants, all demonstrating attenuation for virulence, exhibited reduced net intracellular replication in J774.2 macrophage-like cells (bimA, boaB, bpaC and bpaE). A single mutant (bpaC) was identified that exhibited significantly reduced serum survival compared to wild-type. The bpaC mutant, which demonstrated attenuation for virulence and net intracellular replication, was sensitive to complement-mediated killing via the classical and/or lectin pathway. Serum resistance was rescued by in trans complementation. Subsequently, we expressed recombinant proteins of the passenger domain of four predicted autotransporters representing each of the phenotypic groups identified: those attenuated for virulence (BcaA), those attenuated for virulence and net intracellular replication (BpaE), the BpaC mutant with defects in virulence, net intracellular replication and serum resistance and those displaying wild-type phenotypes (BatA). 
Only BcaA and BpaE elicited a strong IFN-γ response in a restimulation assay using whole blood from seropositive donors and were recognised by seropositive human sera from the endemic area. To conclude, several predicted autotransporters contribute to B. pseudomallei virulence and BpaC may do so by conferring resistance against complement-mediated killing.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ramalho, T.O.; Figueira, A.R.; Sotero, A.J.
2014-09-15
The emergence of viruses in coffee (Coffea arabica and Coffea canephora), the most widely traded agricultural commodity in the world, is of critical concern. The RNA1 (6552 nt) of Coffee ringspot virus is organized into five open reading frames (ORFs) capable of encoding the viral nucleocapsid (ORF1p), phosphoprotein (ORF2p), putative cell-to-cell movement protein (ORF3p), matrix protein (ORF4p) and glycoprotein (ORF5p). Each ORF is separated by a conserved intergenic junction. RNA2 (5945 nt), which completes the bipartite genome, encodes a single protein (ORF6p) with homology to RNA-dependent RNA polymerases. Phylogenetic analysis of L protein sequences firmly establishes CoRSV as a member of the recently proposed Dichorhavirus genus. Predictive algorithms, in planta protein expression, and a yeast-based nuclear import assay were used to determine the nucleophilic character of five CoRSV proteins. Finally, the temperature-dependent ability of CoRSV to establish systemic infections in an initially local lesion host was quantified. Highlights: • We report genome sequence determination for Coffee ringspot virus (CoRSV). • CoRSV should be considered a member of the proposed Dichorhavirus genus. • We report temperature-dependent systemic infection of an initially local lesion host. • We report in planta protein and localization data for five CoRSV proteins. • In silico predictions of the CoRSV proteins were validated using in vivo assays.
Prediction of pi-turns in proteins using PSI-BLAST profiles and secondary structure information.
Wang, Yan; Xue, Zhi-Dong; Shi, Xiao-Hong; Xu, Jin
2006-09-01
Due to the structural and functional importance of tight turns, some methods have been proposed to predict gamma-turns, beta-turns, and alpha-turns in proteins. In the past, studies of pi-turns were made, but not a single prediction approach has been developed so far. It will be useful to develop a method for identifying pi-turns in a protein sequence. In this paper, the support vector machine (SVM) method has been introduced to predict pi-turns from the amino acid sequence. The training and testing of this approach is performed with a newly collected data set of 640 non-homologous protein chains containing 1931 pi-turns. Different sequence encoding schemes have been explored in order to investigate their effects on the prediction performance. With multiple sequence alignment and predicted secondary structure, the final SVM model yields a Matthews correlation coefficient (MCC) of 0.556 by a 7-fold cross-validation. A web server implementing the prediction method is available at the following URL: http://210.42.106.80/piturn/.
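For readers unfamiliar with the evaluation reported above, the following minimal sketch shows how a Matthews correlation coefficient is computed from confusion-matrix counts and how a data set can be partitioned into k folds for cross-validation. The function names and the simple interleaved fold assignment are assumptions for illustration; the abstract does not describe how the folds were built.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional fallback when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


def k_fold_indices(n_items, k):
    """Assign item indices 0..n_items-1 to k interleaved test folds."""
    return [list(range(i, n_items, k)) for i in range(k)]


# 640 chains and 7 folds mirror the numbers in the abstract; the
# interleaved assignment itself is an assumption.
folds = k_fold_indices(640, 7)
print([len(fold) for fold in folds])
print(mcc(tp=10, tn=10, fp=0, fn=0))
```

In each round one fold serves as the test set and the remaining six as training data, so every chain is tested exactly once.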
Yamada, Takashi; Onimatsu, Hideki; Van Etten, James L.
2007-01-01
Chlorella viruses or chloroviruses are large, icosahedral, plaque-forming, double-stranded-DNA-containing viruses that replicate in certain strains of the unicellular green alga Chlorella. DNA sequence analysis of the 330-kbp genome of Paramecium bursaria chlorella virus 1 (PBCV-1), the prototype of this virus family (Phycodnaviridae), predicts ∼366 protein-encoding genes and 11 tRNA genes. The predicted gene products of ∼50% of these genes resemble proteins of known function, including many that are completely unexpected for a virus. In addition, the chlorella viruses have several features and encode many gene products that distinguish them from most viruses. These features and products include: (1) multiple DNA methyltransferases and DNA site-specific endonucleases, (2) the enzymes required to glycosylate their proteins and synthesize polysaccharides such as hyaluronan and chitin, (3) a virus-encoded K+ channel (called Kcv) located in the internal membrane of the virions, (4) a SET domain-containing protein (referred to as vSET) that dimethylates Lys27 in histone 3, and (5) three types of introns in PBCV-1: a self-splicing intron, a spliceosomal processed intron, and a small tRNA intron. Accumulating evidence indicates that the chlorella viruses have a very long evolutionary history. This review mainly deals with research on the virion structure, genome rearrangements, gene expression, cell wall degradation, polysaccharide synthesis, and evolution of PBCV-1 as well as other related viruses. PMID:16877063
Developmental Regulation of Genes Encoding Universal Stress Proteins in Schistosoma mansoni
Isokpehi, Raphael D.; Mahmud, Ousman; Mbah, Andreas N.; Simmons, Shaneka S.; Avelar, Lívia; Rajnarayanan, Rajendram V.; Udensi, Udensi K.; Ayensu, Wellington K.; Cohly, Hari H.; Brown, Shyretha D.; Dates, Centdrika R.; Hentz, Sonya D.; Hughes, Shawntae J.; Smith-McInnis, Dominique R.; Patterson, Carvey O.; Sims, Jennifer N.; Turner, Kelisha T.; Williams, Baraka S.; Johnson, Matilda O.; Adubi, Taiwo; Mbuh, Judith V.; Anumudu, Chiaka I.; Adeoye, Grace O.; Thomas, Bolaji N.; Nashiru, Oyekanmi; Oliveira, Guilherme
2011-01-01
The draft nuclear genome sequence of the snail-transmitted, dimorphic, parasitic, platyhelminth Schistosoma mansoni revealed eight genes encoding proteins that contain the Universal Stress Protein (USP) domain. Schistosoma mansoni is a causative agent of human schistosomiasis, a severe and debilitating Neglected Tropical Disease (NTD) of poverty, which is endemic in at least 76 countries. The availability of the genome sequences of Schistosoma species presents opportunities for bioinformatics and genomics analyses of associated gene families that could be targets for understanding schistosomiasis ecology, intervention, prevention and control. Proteins with the USP domain are known to provide bacteria, archaea, fungi, protists and plants with the ability to respond to diverse environmental stresses. In this investigation, the functional annotations of the USP genes and predicted nucleotide and protein sequences were initially verified. Subsequently, sequence clusters and distinctive features of the sequences were determined. A total of twelve ligand binding sites were predicted based on alignment to the ATP-binding universal stress protein from Methanocaldococcus jannaschii. In addition, six USP sequences showed the presence of ATP-binding motif residues, indicating that they may be regulated by ATP. Public domain gene expression data and RT-PCR assays confirmed that all the S. mansoni USP genes were transcribed in at least one of the developmental life cycle stages of the helminth. Six of these genes were up-regulated in the miracidium, a free-swimming stage that is critical for transmission to the snail intermediate host. It is possible that during the intra-snail stages, S. mansoni gene transcripts for universal stress proteins are of low abundance and are induced to perform specialized functions triggered by environmental stressors such as oxidative stress due to hydrogen peroxide that is present in the snail hemocytes. 
This report serves to catalyze the formation of a network of researchers to understand the function and regulation of the universal stress proteins encoded in genomes of schistosomes and their snail intermediate hosts. PMID:22084571
Analysis of membrane protein genes in a Brazilian isolate of Anaplasma marginale.
G Junior, Daniel S; Araújo, Flábio R; Almeida Junior, Nalvo F; Adi, Said S; Cheung, Luciana M; Fragoso, Stenio P; Ramos, Carlos A N; Oliveira, Renato Henrique M de; Santos, Caroline S; Bacanelli, Gisele; Soares, Cleber O; Rosinha, Grácia M S; Fonseca, Adivaldo H
2010-11-01
The sequencing of the complete genome of Anaplasma marginale has enabled the identification of several genes that encode membrane proteins, thereby increasing the chances of identifying candidate immunogens. Little is known regarding the genetic variability of genes that encode membrane proteins in A. marginale isolates. The aim of the present study was to determine the degree of conservation of the predicted amino acid sequences of OMP1, OMP4, OMP5, OMP7, OMP8, OMP10, OMP14, OMP15, SODb, OPAG1, OPAG3, VirB3, VirB9-1, PepA, EF-Tu and AM854 proteins in a Brazilian isolate of A. marginale compared to other isolates. Hence, primers were used to amplify these genes: omp1, omp4, omp5, omp7, omp8, omp10, omp14, omp15, sodb, opag1, opag3, virb3, VirB9-1, pepA, ef-tu and am854. After polymerase chain reaction amplification, the products were cloned and sequenced using the Sanger method, and the predicted amino acid sequences were multi-aligned using the CLUSTALW and MEGA 4 programs, comparing the predicted sequences between the Brazilian, Saint Maries, Florida and A. marginale centrale isolates. With the exception of outer membrane protein (OMP) 7, all proteins exhibited 92-100% homology to the other A. marginale isolates. However, only OMP1, OMP5, EF-Tu, VirB3, SODb and VirB9-1 were selected as potential immunogens capable of promoting cross-protection between isolates due to the high degree of homology (over 72%) also found with A. (centrale) marginale.
Proteogenomic characterization of human colon and rectal cancer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Bing; Wang, Jing; Wang, Xiaojing
2014-09-18
We analyzed proteomes of colon and rectal tumors previously characterized by the Cancer Genome Atlas (TCGA) and performed integrated proteogenomic analyses. Protein sequence variants encoded by somatic genomic variations displayed reduced expression compared to protein variants encoded by germline variations. mRNA transcript abundance did not reliably predict protein expression differences between tumors. Proteomics identified five protein expression subtypes, two of which were associated with the TCGA "MSI/CIMP" transcriptional subtype, but had distinct mutation and methylation patterns and were associated with different clinical outcomes. Although CNAs showed strong cis- and trans-effects on mRNA expression, relatively few of these extend to the protein level. Thus, proteomics data enabled prioritization of candidate driver genes. Our analyses identified HNF4A, a novel candidate driver gene in tumors with chromosome 20q amplifications. Integrated proteogenomic analysis provides functional context to interpret genomic abnormalities and affords novel insights into cancer biology.
Deep-sea vent phage DNA polymerase specifically initiates DNA synthesis in the absence of primers.
Zhu, Bin; Wang, Longfei; Mitsunobu, Hitoshi; Lu, Xueling; Hernandez, Alfredo J; Yoshida-Takashima, Yukari; Nunoura, Takuro; Tabor, Stanley; Richardson, Charles C
2017-03-21
A DNA polymerase is encoded by the deep-sea vent phage NrS-1. NrS-1 has a unique genome organization containing genes that are predicted to encode a helicase and a single-stranded DNA (ssDNA)-binding protein. The gene for an unknown protein shares weak homology with the bifunctional primase-polymerases (prim-pols) from archaeal plasmids but is missing the zinc-binding domain typically found in primases. We show that this gene product has efficient DNA polymerase activity and is processive in DNA synthesis in the presence of the NrS-1 helicase and ssDNA-binding protein. Remarkably, this NrS-1 DNA polymerase initiates DNA synthesis from a specific template DNA sequence in the absence of any primer. The de novo DNA polymerase activity resides in the N-terminal domain of the protein, whereas the C-terminal domain enhances DNA binding.
Cellular and molecular biology of orphan G protein-coupled receptors.
Oh, Da Young; Kim, Kyungjin; Kwon, Hyuk Bang; Seong, Jae Young
2006-01-01
The superfamily of G protein-coupled receptors (GPCRs) is the largest and most diverse group of membrane-spanning proteins. It plays a variety of roles in pathophysiological processes by transmitting extracellular signals to cells via heterotrimeric G proteins. Completion of the human genome project revealed the presence of approximately 168 genes encoding established nonsensory GPCRs, as well as 207 genes predicted to encode novel GPCRs for which the natural ligands remained to be identified, the so-called orphan GPCRs. Eighty-six of these orphans have now been paired to novel or previously known molecules, and 121 remain to be deorphaned. A better understanding of the GPCR structures and classification; knowledge of the receptor activation mechanism, either dependent on or independent of an agonist; increased understanding of the control of GPCR-mediated signal transduction; and development of appropriate ligand screening systems may improve the probability of discovering novel ligands for the remaining orphan GPCRs.
Tsai, Keng-Chang; Jian, Jhih-Wei; Yang, Ei-Wen; Hsu, Po-Chiang; Peng, Hung-Pin; Chen, Ching-Tai; Chen, Jun-Bo; Chang, Jeng-Yih; Hsu, Wen-Lian; Yang, An-Suei
2012-01-01
Non-covalent protein-carbohydrate interactions mediate molecular targeting in many biological processes. Prediction of non-covalent carbohydrate binding sites on protein surfaces not only provides insights into the functions of the query proteins; information on key carbohydrate-binding residues could suggest site-directed mutagenesis experiments, design therapeutics targeting carbohydrate-binding proteins, and provide guidance in engineering protein-carbohydrate interactions. In this work, we show that non-covalent carbohydrate binding sites on protein surfaces can be predicted with relatively high accuracy when the query protein structures are known. The prediction capabilities were based on a novel encoding scheme of the three-dimensional probability density maps describing the distributions of 36 non-covalent interacting atom types around protein surfaces. One machine learning model was trained for each of the 30 protein atom types. The machine learning algorithms predicted tentative carbohydrate binding sites on query proteins by recognizing the characteristic interacting atom distribution patterns specific for carbohydrate binding sites from known protein structures. The prediction results for all protein atom types were integrated into surface patches as tentative carbohydrate binding sites based on normalized prediction confidence level. The prediction capabilities of the predictors were benchmarked by a 10-fold cross validation on 497 non-redundant proteins with known carbohydrate binding sites. The predictors were further tested on an independent test set with 108 proteins. The residue-based Matthews correlation coefficient (MCC) for the independent test was 0.45, with prediction precision and sensitivity (or recall) of 0.45 and 0.49 respectively. In addition, 111 unbound carbohydrate-binding protein structures for which the structures were determined in the absence of the carbohydrate ligands were predicted with the trained predictors. 
The overall prediction MCC was 0.49. Independent tests on anti-carbohydrate antibodies showed that the carbohydrate antigen binding sites were predicted with comparable accuracy. These results demonstrate that the predictors are among the best in carbohydrate binding site predictions to date. PMID:22848404
NASA Astrophysics Data System (ADS)
Yee, Chai Sin; Murad, Abdul Munir Abdul; Bakar, Farah Diba Abu
2013-11-01
Genes encoding endo-β-1,4-mannanases from Trichoderma virens UKM1 (manTV) and Aspergillus flavus UKM1 (manAF) were analysed with bioinformatic tools. In addition, the A. flavus NRRL 3357 genome database was screened for a β-mannosidase gene, which was also analysed (mndA-AF). These three genes were analysed to understand their properties. manTV and manAF consist of 1,332 and 1,386 nucleotides, encoding 443 and 461 amino acid residues, respectively. Both endo-β-1,4-mannanases belong to glycosyl hydrolase family 5 and contain a carbohydrate-binding module family 1 (CBM1). In contrast, mndA-AF is a 2,745-bp gene that encodes a protein of 914 amino acid residues. This β-mannosidase belongs to glycosyl hydrolase family 2. The predicted molecular weights of manTV, manAF and mndA-AF are 47.74 kDa, 49.71 kDa and 103 kDa, respectively. All three predicted protein sequences possess a signal peptide sequence and are highly conserved among other fungal β-mannanases and β-mannosidases.
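The gene lengths and residue counts quoted above can be cross-checked with simple codon arithmetic. The sketch below assumes that each quoted length is an uninterrupted coding sequence that includes one terminal stop codon; the function name is hypothetical.

```python
def residues_from_orf_length(nt_length, includes_stop=True):
    """Number of amino acid residues encoded by an ORF of nt_length nucleotides."""
    if nt_length % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    codons = nt_length // 3
    # The stop codon occupies one codon but encodes no residue.
    return codons - 1 if includes_stop else codons


# Lengths taken from the abstract (manTV, manAF, mndA-AF).
for name, length in [("manTV", 1332), ("manAF", 1386), ("mndA-AF", 2745)]:
    print(name, residues_from_orf_length(length))
```

Under that assumption the three lengths give 443, 461 and 914 residues, in agreement with the values stated in the abstract.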
Ma, Jiale; Pan, Zihao; Huang, Jinhu; Sun, Min; Lu, Chengping; Yao, Huochun
2017-01-01
The type VI secretion system (T6SS) is a widespread molecular weapon deployed by many bacterial species to target eukaryotic host cells or rival bacteria. Using a dynamic injection mechanism, diverse effectors can be delivered by T6SS directly into recipient cells. Here, we report a new family of T6SS effectors encoded by extended Hcps carrying diverse toxin domains. Bioinformatic analyses revealed that these Hcps with C-terminal extension toxins, designated as Hcp-ET, exist widely in the Enterobacteriaceae. To verify our findings, Hcp-ET1 was tested for its antibacterial effect, and showed effective inhibition of target cell growth via the predicted HNH-DNase activity by T6SS-dependent delivery. Further studies showed that Hcp-ET2 mediated interbacterial antagonism via a Tle1 phospholipase (encoded by DUF2235 domain) activity. Notably, comprehensive analyses of protein homology and genomic neighborhoods revealed that Hcp-ET3–4 is fused with 2 toxin domains (Pyocin S3 and Colicin-DNase) C-terminally, and its encoding gene is followed by 3 duplications of the cognate immunity genes. However, some bacteria encode a separate hcp-et3 gene and an orphan et4 (et4O1) gene, caused by a termination-codon mutation in the fusion region between the Pyocin S3 and Colicin-DNase encoding fragments. Our results demonstrated that both of these toxins had antibacterial effects. Further, all duplications of the cognate immunity protein contributed to neutralizing the DNase toxicity of Pyocin S3 and Colicin, which has not been reported previously. In conclusion, we propose that Hcp-ET proteins are polymorphic T6SS effectors, and thus present a novel encoding pattern of T6SS effectors. PMID:28060574
Lin, Wen-Hsien; Liu, Wei-Chung; Hwang, Ming-Jing
2009-03-11
Human cells of various tissue types differ greatly in morphology despite having the same set of genetic information. Some genes are expressed in all cell types to perform house-keeping functions, while some are selectively expressed to perform tissue-specific functions. In this study, we wished to elucidate how proteins encoded by human house-keeping genes and tissue-specific genes are organized in human protein-protein interaction networks. We constructed protein-protein interaction networks for different tissue types using two gene expression datasets and one protein-protein interaction database. We then calculated three network indices of topological importance, the degree, closeness, and betweenness centralities, to measure the network position of proteins encoded by house-keeping and tissue-specific genes, and quantified their local connectivity structure. Compared to a random selection of proteins, house-keeping gene-encoded proteins tended to have a greater number of directly interacting neighbors and occupy network positions in several shortest paths of interaction between protein pairs, whereas tissue-specific gene-encoded proteins did not. In addition, house-keeping gene-encoded proteins tended to connect with other house-keeping gene-encoded proteins in all tissue types, whereas tissue-specific gene-encoded proteins also tended to connect with other tissue-specific gene-encoded proteins, but only in approximately half of the tissue types examined. Our analysis showed that house-keeping gene-encoded proteins tend to occupy important network positions, while those encoded by tissue-specific genes do not. The biological implications of our findings were discussed and we proposed a hypothesis regarding how cells organize their protein tools in protein-protein interaction networks. 
Our results led us to speculate that house-keeping gene-encoded proteins might form a core in human protein-protein interaction networks, while clusters of tissue-specific gene-encoded proteins are attached to the core at more peripheral positions of the networks.
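The three network indices named in the abstract above (degree, closeness and betweenness centrality) can be illustrated on a small undirected graph. The sketch below is a plain-Python version using breadth-first search and Brandes' accumulation scheme for betweenness; the toy network and its node names are hypothetical, and the abstract does not say which software or normalization the authors used.

```python
from collections import deque


def degree_centrality(graph):
    """Number of directly interacting neighbors per node."""
    return {node: len(neighbors) for node, neighbors in graph.items()}


def closeness_centrality(graph):
    """Reachable nodes divided by the sum of shortest-path distances."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness_centrality(graph):
    """Unnormalized betweenness (Brandes' algorithm, unweighted, undirected)."""
    score = dict.fromkeys(graph, 0.0)
    for s in graph:
        order, dist = [], {s: 0}
        preds = {v: [] for v in graph}
        sigma = dict.fromkeys(graph, 0)  # number of shortest paths from s
        sigma[s] = 1
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: value / 2 for v, value in score.items()}


# Hypothetical toy network: "HK" stands in for a hub-like protein.
toy = {"HK": ["A", "B", "C"], "A": ["HK"], "B": ["HK"],
       "C": ["HK", "D"], "D": ["C"]}
print(degree_centrality(toy))
print(closeness_centrality(toy))
print(betweenness_centrality(toy))
```

In this toy network the hub has the most neighbors and lies on the most shortest paths, which is the kind of position the abstract attributes to house-keeping gene-encoded proteins.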
Isolation and characterization of a novel calmodulin-binding protein from potato
NASA Technical Reports Server (NTRS)
Reddy, Anireddy S N.; Day, Irene S.; Narasimhulu, S. B.; Safadi, Farida; Reddy, Vaka S.; Golovkin, Maxim; Harnly, Melissa J.
2002-01-01
Tuberization in potato is controlled by hormonal and environmental signals. Ca(2+), an important intracellular messenger, and calmodulin (CaM), one of the primary Ca(2+) sensors, have been implicated in controlling diverse cellular processes in plants including tuberization. The regulation of cellular processes by CaM involves its interaction with other proteins. To understand the role of Ca(2+)/CaM in tuberization, we have screened an expression library prepared from developing tubers with biotinylated CaM. This screening resulted in isolation of a cDNA encoding a novel CaM-binding protein (potato calmodulin-binding protein (PCBP)). Ca(2+)-dependent binding of the cDNA-encoded protein to CaM is confirmed by (35)S-labeled CaM. The full-length cDNA is 5 kb long and encodes a protein of 1309 amino acids. The deduced amino acid sequence showed significant similarity with a hypothetical protein from another plant, Arabidopsis. However, no homologs of PCBP are found in nonplant systems, suggesting that it is likely to be specific to plants. Using truncated versions of the protein and a synthetic peptide in CaM binding assays we mapped the CaM-binding region to a 20-amino acid stretch (residues 1216-1237). The bacterially expressed protein containing the CaM-binding domain interacted with three CaM isoforms (CaM2, CaM4, and CaM6). PCBP is encoded by a single gene and is expressed differentially in the tissues tested. The expression of CaM, PCBP, and another CaM-binding protein is similar in different tissues and organs. The predicted protein contained seven putative nuclear localization signals and several strong PEST motifs. Fusion of the N-terminal region of the protein containing six of the seven nuclear localization signals to the reporter gene beta-glucuronidase targeted the reporter gene to the nucleus, suggesting a nuclear role for PCBP.
Genomic sequence of mandarin fish rhabdovirus with an unusual small non-transcriptional ORF.
Tao, Jian-Jun; Zhou, Guang-Zhou; Gui, Jian-Fang; Zhang, Qi-Ya
2008-03-01
The complete genome of mandarin fish Siniperca chuatsi rhabdovirus (SCRV) was cloned and sequenced. It comprises 11,545 nucleotides and contains five genes encoding the nucleoprotein N, the phosphoprotein P, the matrix protein M, the glycoprotein G, and the RNA-dependent RNA polymerase protein L. At the 3' and 5' termini of the SCRV genome, leader and trailer sequences show inverse complementarity. The N, P, M and G proteins share the highest sequence identities (ranging from 14.8 to 41.5%) with the respective proteins of rhabdovirus 903/87, whereas the L protein has the highest identity with those of vesiculoviruses, especially with Chandipura virus (44.7%). Phylogenetic analysis of L proteins showed that SCRV clustered with spring viraemia of carp virus (SVCV) and was most closely related to viruses in the genus Vesiculovirus. In addition, an overlapping open reading frame (ORF) predicted to encode a protein similar to vesicular stomatitis virus C protein is present within the P gene of SCRV. Furthermore, a small non-overlapping ORF downstream of the M ORF within the M gene is predicted (tentatively called orf4). Therefore, the genomic organization of SCRV can be proposed as 3' leader-N-P/C-M-(orf4)-G-L-trailer 5'. Orf4 transcription or translation products could not be detected by northern or Western blot, respectively, though one mRNA band similar to M mRNA was found. This is the first report of a small non-overlapping ORF in the M gene of a fish rhabdovirus.
Wilson, Marlena M; Anderson, D Eric; Bernstein, Harris D
2015-01-01
Bacteroides fragilis is a widely distributed member of the human gut microbiome and an opportunistic pathogen. Cell surface molecules produced by this organism likely play important roles in colonization, communication with other microbes, and pathogenicity, but the protein composition of the outer membrane (OM) and the mechanisms used to transport polypeptides into the extracellular space are poorly characterized. Here we used LC-MS/MS to analyze the OM proteome and secretome of B. fragilis NCTC 9343 grown under laboratory conditions. Of the 229 OM proteins that we identified, 108 are predicted to be lipoproteins, and 61 are predicted to be TonB-dependent transporters. Based on their proximity to genes encoding TonB-dependent transporters, many of the lipoprotein genes likely encode proteins involved in nutrient or small molecule uptake. Interestingly, protease accessibility and biotinylation experiments indicated that an unusually large fraction of the lipoproteins are cell-surface exposed. We also identified three proteins that are members of a novel family of autotransporters, multiple potential type I protein secretion systems, and proteins that appear to be components of a type VI secretion apparatus. The secretome consisted of lipoproteins and other proteins that might be substrates of the putative type I or type VI secretion systems. Our proteomic studies show that B. fragilis differs considerably from well-studied Gram-negative bacteria such as Escherichia coli in both the spectrum of OM proteins that it produces and the range of secretion strategies that it utilizes.
Cloning and expression analysis of FaPR-1 gene in strawberry
NASA Astrophysics Data System (ADS)
Mo, Fan; Luo, Ya; Ge, Cong; Mo, Qin; Ling, Yajie; Luo, Shu; Tang, Haoru
2018-04-01
The FaPR-1 gene was cloned by RT-PCR from 'Benihoppe' strawberry and subjected to bioinformatic analysis. The results showed that the open reading frame was 483 bp, encoding 160 amino acids, and that the predicted protein had a molecular weight of 17854.17 Da and a theoretical isoelectric point of 8.72. Subcellular localization prediction indicated that the protein is located extracellularly. Comparison of strawberry FaPR-1 with pathogenesis-related proteins from other plants by homology analysis and phylogenetic tree construction showed that it is most closely related to those of grape and peach. Under treatment with ABA, sucrose, or a mixture of the two, the expression of FaPR-1 in strawberry fruit was significantly increased.
1991-01-01
We recently described the identification of BOS1 (Newman, A., J. Shim, and S. Ferro-Novick. 1990. Mol. Cell. Biol. 10:3405-3414.). BOS1 is a gene that in multiple copy suppresses the growth and secretion defect of bet1 and sec22, two mutants that disrupt transport from the ER to the Golgi complex in yeast. The ability of BOS1 to specifically suppress mutants blocked at a particular stage of the secretory pathway suggested that this gene encodes a protein that functions in this process. The experiments presented in this study support this hypothesis. Specifically, the BOS1 gene was found to be essential for cellular growth. Furthermore, cells depleted of the Bos1 protein fail to transport pro-alpha-factor and carboxypeptidase Y (CPY) to the Golgi apparatus. This defect in export leads to the accumulation of an extensive network of ER and small vesicles. DNA sequence analysis predicts that Bos1 is a 27-kD protein containing a putative membrane-spanning domain. This prediction is supported by differential centrifugation experiments. Thus, Bos1 appears to be a membrane protein that functions in conjunction with Bet1 and Sec22 to facilitate the transport of proteins at a step subsequent to translocation into the ER but before entry into the Golgi apparatus. PMID:2007627
In silico MCMV Silencing Concludes Potential Host-Derived miRNAs in Maize
Iqbal, Muhammad Shahzad; Jabbar, Basit; Sharif, Muhammad Nauman; Ali, Qurban; Husnain, Tayyab; Nasir, Idrees A.
2017-01-01
Maize Chlorotic Mottle Virus (MCMV) is a deleterious pathogen that causes Maize Lethal Necrosis Disease (MLND), which results in substantial yield loss in maize crops worldwide. The positive-sense RNA genome of MCMV (4.4 kb) encodes six proteins: P32 (32 kDa protein), the RNA-dependent RNA polymerases (P50 and P111), P31 (31 kDa protein), P7 (7 kDa protein), and the coat protein (25 kDa). P31, P7 and the coat protein are encoded by sgRNA1, located at the 3′ end of the genome, and sgRNA2 is located at the extremity of the 3′ end. The objective of this study is to locate the possible attachment sites of Zea mays derived miRNAs in the genome of MCMV using four diverse miRNA target prediction algorithms. In total, 321 mature miRNAs were retrieved from miRBase (miRNA database) and tested for hybridization with the MCMV genome. These algorithms considered the parameters of seed pairing, minimum free energy, target site accessibility, multiple target sites, pattern recognition and folding energy for attachment. Of the 321 miRNAs, only 10 maize miRNAs were predicted to silence the MCMV genome. The results of this study can hence act as a first step towards the development of MCMV-resistant transgenic maize plants through expression of the selected miRNAs. PMID:28400775
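Of the parameters listed above, seed pairing is the simplest to illustrate. The sketch below is not one of the four algorithms used in the study. It assumes a canonical 7-nt seed (miRNA positions 2 to 8) and perfect Watson-Crick complementarity, and both sequences are invented toy strings.

```python
from typing import List

# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string."""
    return rna.translate(COMPLEMENT)[::-1]


def seed_match_positions(mirna: str, target: str) -> List[int]:
    """Return 0-based start positions in `target` that pair perfectly
    with the miRNA seed (assumed here to be nucleotides 2-8)."""
    seed = mirna[1:8]
    site = reverse_complement(seed)
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)
    return positions


# Hypothetical sequences, for illustration only.
toy_mirna = "UGACGGAUCCAUUGCACGUUA"
toy_target = "GGGAUCCGUCAAAAUCCGUCGG"
print(seed_match_positions(toy_mirna, toy_target))  # [3, 13]
```

Real target predictors add the further criteria named in the abstract, such as minimum free energy and site accessibility, on top of a seed scan of this kind.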
Wang, S Y; Gudas, L J
1990-09-15
We have previously isolated several cDNA clones specific for mRNA species that increase in abundance during the retinoic acid-associated differentiation of F9 teratocarcinoma stem cells. One of these mRNAs, J6, encodes an approximately 40-kDa protein as assayed by hybrid selection and in vitro translation (Wang, S.-Y., LaRosa, G., and Gudas, L. J. (1985) Dev. Biol. 107, 75-86). The time course of J6 mRNA expression is similar to those of both laminin B1 and collagen IV (alpha 1) messages following retinoic acid addition. To address the functional role of this protein, we have isolated a full-length cDNA clone complementary to this approximately 40-kDa protein mRNA. Sequence analysis reveals an open reading frame of 406 amino acids (Mr 45,652). The carboxyl-terminal portion of this predicted protein contains a region that is homologous to the reactive sites found among members of the serpin (serine protease inhibitor) family. The predicted reactive site (P1-P1') of this J6 protein is Arg-Ser, which is the same as that of antithrombin III. Like ovalbumin and human monocyte-derived plasminogen activator inhibitor (mPAI-2), which are members of the serpin gene family, the J6 protein appears to have no typical amino-terminal signal sequence.
Braaksma, Machtelt; Martens-Uzunova, Elena S; Punt, Peter J; Schaap, Peter J
2010-10-19
The ecological niche occupied by a fungal species, its pathogenicity and its usefulness as a microbial cell factory to a large degree depends on its secretome. Protein secretion usually requires the presence of a N-terminal signal peptide (SP) and by scanning for this feature using available highly accurate SP-prediction tools, the fraction of potentially secreted proteins can be directly predicted. However, prediction of a SP does not guarantee that the protein is actually secreted and current in silico prediction methods suffer from gene-model errors introduced during genome annotation. A majority rule based classifier that also evaluates signal peptide predictions from the best homologs of three neighbouring Aspergillus species was developed to create an improved list of potential signal peptide containing proteins encoded by the Aspergillus niger genome. As a complement to these in silico predictions, the secretome associated with growth and upon carbon source depletion was determined using a shotgun proteomics approach. Overall, some 200 proteins with a predicted signal peptide were identified to be secreted proteins. Concordant changes in the secretome state were observed as a response to changes in growth/culture conditions. Additionally, two proteins secreted via a non-classical route operating in A. niger were identified. We were able to improve the in silico inventory of A. niger secretory proteins by combining different gene-model predictions from neighbouring Aspergilli and thereby avoiding prediction conflicts associated with inaccurate gene-models. The expected accuracy of signal peptide prediction for proteins that lack homologous sequences in the proteomes of related species is 85%. An experimental validation of the predicted proteome confirmed in silico predictions.
2010-01-01
Background The ecological niche occupied by a fungal species, its pathogenicity and its usefulness as a microbial cell factory to a large degree depends on its secretome. Protein secretion usually requires the presence of a N-terminal signal peptide (SP) and by scanning for this feature using available highly accurate SP-prediction tools, the fraction of potentially secreted proteins can be directly predicted. However, prediction of a SP does not guarantee that the protein is actually secreted and current in silico prediction methods suffer from gene-model errors introduced during genome annotation. Results A majority rule based classifier that also evaluates signal peptide predictions from the best homologs of three neighbouring Aspergillus species was developed to create an improved list of potential signal peptide containing proteins encoded by the Aspergillus niger genome. As a complement to these in silico predictions, the secretome associated with growth and upon carbon source depletion was determined using a shotgun proteomics approach. Overall, some 200 proteins with a predicted signal peptide were identified to be secreted proteins. Concordant changes in the secretome state were observed as a response to changes in growth/culture conditions. Additionally, two proteins secreted via a non-classical route operating in A. niger were identified. Conclusions We were able to improve the in silico inventory of A. niger secretory proteins by combining different gene-model predictions from neighbouring Aspergilli and thereby avoiding prediction conflicts associated with inaccurate gene-models. The expected accuracy of signal peptide prediction for proteins that lack homologous sequences in the proteomes of related species is 85%. An experimental validation of the predicted proteome confirmed in silico predictions. PMID:20959013
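The majority-rule idea described in the Results can be sketched as a simple vote. The abstract does not give the exact voting scheme, so the version below is an assumption: the query protein and its best homolog from each of three neighbouring species each contribute one yes/no signal peptide call, and a strict majority is required, so a 2-2 tie counts as negative.

```python
from typing import List


def majority_rule_sp(query_call: bool, homolog_calls: List[bool]) -> bool:
    """Return True if a strict majority of calls predict a signal peptide.

    `query_call` is the prediction for the A. niger gene model itself;
    `homolog_calls` are predictions for its best homologs in related species.
    Tie handling (ties count as negative) is an assumption of this sketch.
    """
    votes = [query_call] + list(homolog_calls)
    return sum(votes) > len(votes) / 2


# A gene model predicted as non-secreted is overruled by three positive homologs.
print(majority_rule_sp(False, [True, True, True]))  # True
```

A vote of this kind lets a correct prediction in related species compensate for an N-terminal gene-model error in the query sequence.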
Hoane, Jessica S; Carruthers, Vernon B; Striepen, Boris; Morrison, David P; Entzeroth, Rolf; Howe, Daniel K
2003-07-01
Sarcocystis neurona, an apicomplexan parasite, is the primary causative agent of equine protozoal myeloencephalitis. Like other members of the Apicomplexa, S. neurona zoites possess secretory organelles that contain proteins necessary for host cell invasion and intracellular survival. From a collection of S. neurona expressed sequence tags, we identified a sequence encoding a putative microneme protein based on similarity to Toxoplasma gondii MIC10 (TgMIC10). Pairwise sequence alignments of SnMIC10 to TgMIC10 and NcMIC10 from Neospora caninum revealed approximately 33% identity to both orthologues. The open reading frame of the S. neurona gene encodes a 255 amino acid protein with a predicted 39-residue signal peptide. Like TgMIC10 and NcMIC10, SnMIC10 is predicted to be hydrophilic, highly alpha-helical in structure, and devoid of identifiable adhesive domains. Antibodies raised against recombinant SnMIC10 recognised a protein band with an apparent molecular weight of 24 kDa in Western blots of S. neurona merozoites, consistent with the size predicted for SnMIC10. In vitro secretion assays demonstrated that this protein is secreted by extracellular merozoites in a temperature-dependent manner. Indirect immunofluorescence analysis of SnMIC10 showed a polar labelling pattern, which is consistent with the apical position of the micronemes, and immunoelectron microscopy provided definitive localisation of the protein to these secretory organelles. Further analysis of SnMIC10 in intracellular parasites revealed that expression of this protein is temporally regulated during endopolygeny, supporting the view that micronemes are only needed during host cell invasion. Collectively, the data indicate that SnMIC10 is a microneme protein that is part of the excreted/secreted antigen fraction of S. neurona. Identification and characterisation of additional S. neurona microneme antigens and comparisons to orthologues in other Apicomplexa could provide further insight into the functions that these proteins serve during invasion of host cells.
USDA-ARS's Scientific Manuscript database
Maize rayado fino virus (MRFV) possesses an open reading frame (ORF) encoding a protein with predicted mass of 43 kDa (ORF43) that has been postulated to be a viral movement protein. Using a clone of MRFV (pMRFV-US) from which infectious RNA can be produced, point mutations were introduced to eithe...
Isolation and characterization of polygalacturonase genes (pecA and pecB) from Aspergillus flavus.
Whitehead, M P; Shieh, M T; Cleveland, T E; Cary, J W; Dean, R A
1995-01-01
Two genes, pecA and pecB, encoding endopolygalacturonases were cloned from a highly aggressive strain of Aspergillus flavus. The pecA gene consisted of 1,228 bp encoding a protein of 363 amino acids with a predicted molecular mass of 37.6 kDa, interrupted by two introns of 58 and 81 bp in length. Accumulation of pecA mRNA in both pectin- or glucose-grown mycelia in the highly aggressive strain matched the activity profile of a pectinase previously identified as P2c. Transformants of a weakly aggressive strain containing a functional copy of the pecA gene produced P2c in vitro, confirming that pecA encodes P2c. The coding region of pecB was determined to be 1,217 bp in length interrupted by two introns of 65 and 54 bp in length. The predicted protein of 366 amino acids had an estimated molecular mass of 38 kDa. Transcripts of this gene accumulated in mycelia grown in medium containing pectin alone, never in mycelia grown in glucose-containing medium, for both highly and weakly aggressive strains. Thus, pecB encodes the activity previously identified as P1 or P3. pecA and pecB share a high degree of sequence identity with polygalacturonase genes from Aspergillus parasiticus and Aspergillus oryzae, further establishing the close relationships between members of the A. flavus group. Conservation of intron positions in these genes also indicates that they share a common ancestor with genes encoding endopolygalacturonases of Aspergillus niger. PMID:7574642
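The gene lengths, intron lengths and protein sizes quoted for pecA and pecB can be cross-checked by subtracting the introns and dividing by three. This is a minimal sketch; the function name is illustrative, and the observation about the stop codon is an inference from the arithmetic, not a statement from the abstract.

```python
from typing import Sequence


def spliced_codon_count(gene_length_bp: int, intron_lengths_bp: Sequence[int]) -> int:
    """Return the number of codons left after removing introns."""
    coding_bp = gene_length_bp - sum(intron_lengths_bp)
    if coding_bp % 3 != 0:
        raise ValueError("spliced length is not a multiple of 3")
    return coding_bp // 3


# pecA: 1,228 bp minus introns of 58 and 81 bp leaves 1,089 bp.
print(spliced_codon_count(1228, [58, 81]))  # 363
# pecB: 1,217 bp minus introns of 65 and 54 bp leaves 1,098 bp.
print(spliced_codon_count(1217, [65, 54]))  # 366
```

Both codon counts equal the reported protein lengths (363 and 366 amino acids), which suggests that the quoted gene lengths do not include the stop codon.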
Mayer-Jaekel, R E; Baumgartner, S; Bilbe, G; Ohkura, H; Glover, D M; Hemmings, B A
1992-01-01
cDNA clones encoding the catalytic subunit and the 65-kDa regulatory subunit of protein phosphatase 2A (PR65) from Drosophila melanogaster have been isolated by homology screening with the corresponding human cDNAs. The Drosophila clones were used to analyze the spatial and temporal expression of the transcripts encoding these two proteins. The Drosophila PR65 cDNA clones contained an open reading frame of 1773 nucleotides encoding a protein of 65.5 kDa. The predicted amino acid sequence showed 75 and 71% identity to the human PR65 alpha and beta isoforms, respectively. As previously reported for the mammalian PR65 isoforms, Drosophila PR65 is composed of 15 imperfect repeating units of approximately 39 amino acids. The residues contributing to this repeat structure also show the highest sequence conservation between species, indicating a functional importance for these repeats. The gene encoding Drosophila PR65 was located at 29B1,2 on the second chromosome. A major transcript of 2.8 kilobase (kb) encoding the PR65 subunit and two transcripts of 1.6 and 2.5 kb encoding the catalytic subunit could be detected throughout Drosophila development. All of these mRNAs were most abundant during early embryogenesis and were expressed at lower levels in larvae and adult flies. In situ hybridization of different developmental stages showed a colocalization of the PR65 and catalytic subunit transcripts. The mRNA expression is high in the nurse cells and oocytes, consistent with a high equally distributed expression in early embryos. In later embryonal development, the expression remains high in the nervous system and the gonads but the overall transcript levels decrease. In third instar larvae, high levels of mRNA could be observed in brain, imaginal discs, and in salivary glands. These results indicate that protein phosphatase 2A transcript levels change during development in a tissue- and time-specific manner. PMID:1320961
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kumar, Purnima; Gunalan, Vithiagaran; Liu Boping
Severe acute respiratory syndrome (SARS) coronavirus (SARS-CoV) caused a severe outbreak in several regions of the world in 2003. The SARS-CoV genome is predicted to contain 14 functional open reading frames (ORFs). The first ORF (1a and 1b) encodes a large polyprotein that is cleaved into nonstructural proteins (nsp). The other ORFs encode four structural proteins (spike, membrane, nucleocapsid and envelope) as well as eight SARS-CoV-specific accessory proteins (3a, 3b, 6, 7a, 7b, 8a, 8b and 9b). In this report we have cloned the predicted nsp8 gene and the ORF6 gene of the SARS-CoV and studied their abilities to interact with each other. We expressed the two proteins as fusion proteins in the yeast two-hybrid system to demonstrate protein-protein interactions and tested the same using a yeast genetic cross. Further, the strength of the interaction was measured by challenging growth of the positive interaction clones on increasing gradients of 2-amino triazole. The interaction was then verified by expressing both proteins separately in vitro in a coupled transcription-translation system and by coimmunoprecipitation in mammalian cells. Finally, colocalization experiments were performed in SARS-CoV infected Vero E6 mammalian cells to confirm the nsp8-ORF6 interaction. To the best of our knowledge, this is the first report of the interaction between a SARS-CoV accessory protein and nsp8 and our findings suggest that ORF6 protein may play a role in virus replication.
Cui, Jian; Liu, Jinghua; Li, Yuhua; Shi, Tieliu
2011-01-01
Mitochondria are major players in the production of energy, and host several key reactions involved in basic metabolism and biosynthesis of essential molecules. Currently, the majority of nucleus-encoded mitochondrial proteins are unknown, even for the model plant Arabidopsis. We report a computational framework for predicting Arabidopsis mitochondrial proteins based on a probabilistic model, called Naive Bayesian Network, which integrates disparate genomic data generated from eight bioinformatics tools, multiple orthologous mappings, protein domain properties and co-expression patterns using 1,027 microarray profiles. Through this approach, we predicted 2,311 candidate mitochondrial proteins with 84.67% accuracy and a 2.53% false-positive rate (FPR). Together with experimentally confirmed proteins, 2,585 mitochondrial proteins (named CoreMitoP) were identified. We explored the proteins of unknown function using a protein-protein interaction network (PIN) and annotated novel functions for 26.65% of the CoreMitoP proteins. Moreover, we found newly predicted mitochondrial proteins embedded in particular subnetworks of the PIN, mainly functioning in response to diverse environmental stresses, such as salt, drought, cold, and wounding. Candidate mitochondrial proteins involved in those physiological activities provide useful targets for further investigation. Assigned functions also provide comprehensive information for the Arabidopsis mitochondrial proteome. PMID:21297957
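The core of a naive Bayes integration is that independent lines of evidence multiply the prior odds. The sketch below shows only that step. It is not the authors' model, and the prior and likelihood ratios are invented numbers chosen for illustration.

```python
import math
from typing import Iterable


def naive_bayes_posterior(prior: float, likelihood_ratios: Iterable[float]) -> float:
    """Combine a prior probability with per-feature likelihood ratios,
    treating the features as conditionally independent (the naive Bayes assumption)."""
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios:
        # Each ratio is P(feature | mitochondrial) / P(feature | not mitochondrial).
        log_odds += math.log(ratio)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical protein: 10% prior, two supporting features and one opposing feature.
posterior = naive_bayes_posterior(0.10, [5.0, 3.0, 0.5])
print(round(posterior, 4))  # 0.4545
```

Working in log-odds keeps the calculation numerically stable when many features are combined, as they would be when integrating several prediction tools with orthology, domain and co-expression evidence.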
Deep convolutional neural networks for pan-specific peptide-MHC class I binding prediction.
Han, Youngmahn; Kim, Dongsup
2017-12-28
Computational scanning of peptide candidates that bind to a specific major histocompatibility complex (MHC) can speed up the peptide-based vaccine development process and therefore various methods are being actively developed. Recently, machine-learning-based methods have generated successful results by training large amounts of experimental data. However, many machine learning-based methods are generally less sensitive in recognizing locally-clustered interactions, which can synergistically stabilize peptide binding. Deep convolutional neural network (DCNN) is a deep learning method inspired by visual recognition process of animal brain and it is known to be able to capture meaningful local patterns from 2D images. Once the peptide-MHC interactions can be encoded into image-like array(ILA) data, DCNN can be employed to build a predictive model for peptide-MHC binding prediction. In this study, we demonstrated that DCNN is able to not only reliably predict peptide-MHC binding, but also sensitively detect locally-clustered interactions. Nonapeptide-HLA-A and -B binding data were encoded into ILA data. A DCNN, as a pan-specific prediction model, was trained on the ILA data. The DCNN showed higher performance than other prediction tools for the latest benchmark datasets, which consist of 43 datasets for 15 HLA-A alleles and 25 datasets for 10 HLA-B alleles. In particular, the DCNN outperformed other tools for alleles belonging to the HLA-A3 supertype. The F1 scores of the DCNN were 0.86, 0.94, and 0.67 for HLA-A*31:01, HLA-A*03:01, and HLA-A*68:01 alleles, respectively, which were significantly higher than those of other tools. We found that the DCNN was able to recognize locally-clustered interactions that could synergistically stabilize peptide binding. We developed ConvMHC, a web server to provide user-friendly web interfaces for peptide-MHC class I binding predictions using the DCNN. ConvMHC web server can be accessible via http://jumong.kaist.ac.kr:8080/convmhc . 
We developed a novel method for peptide-HLA-I binding predictions using DCNN trained on ILA data that encode peptide binding data and demonstrated the reliable performance of the DCNN in nonapeptide binding predictions through the independent evaluation on the latest IEDB benchmark datasets. Our approaches can be applied to characterize locally-clustered patterns in molecular interactions, such as protein/DNA, protein/RNA, and drug/protein interactions.
The SdiA-Regulated Gene srgE Encodes a Type III Secreted Effector
Habyarimana, Fabien; Sabag-Daigle, Anice
2014-01-01
Salmonella enterica serovar Typhimurium is a food-borne pathogen that causes severe gastroenteritis. The ability of Salmonella to cause disease depends on two type III secretion systems (T3SSs) encoded in two distinct Salmonella pathogenicity islands, 1 and 2 (SPI1 and SPI2, respectively). S. Typhimurium encodes a solo LuxR homolog, SdiA, which can detect the acyl-homoserine lactones (AHLs) produced by other bacteria and upregulate the rck operon and the srgE gene. SrgE is predicted to encode a protein of 488 residues with a coiled-coil domain between residues 345 and 382. In silico studies have provided conflicting predictions as to whether SrgE is a T3SS substrate. Therefore, in this work, we tested the hypothesis that SrgE is a T3SS effector by two methods, a β-lactamase activity assay and a split green fluorescent protein (GFP) complementation assay. SrgE with β-lactamase fused to residue 40, 100, 150, or 300 was indeed expressed and translocated into host cells, but SrgE with β-lactamase fused to residue 400 or 488 was not expressed, suggesting interference by the coiled-coil domain. Similarly, SrgE with GFP S11 fused to residue 300, but not to residue 488, was expressed and translocated into host cells. With both systems, translocation into host cells was dependent upon SPI2. A phylogenetic analysis indicated that srgE is found only within Salmonella enterica subspecies. It is found sporadically within both typhoidal and nontyphoidal serovars, although the SrgE protein sequences found within typhoidal serovars tend to cluster separately from those found in nontyphoidal serovars, suggesting functional diversification. PMID:24727228
Diop, Awa; Diop, Khoudia; Tomei, Enora; Raoult, Didier; Fenollar, Florence; Fournier, Pierre-Edouard
2018-03-01
We report here the draft genome sequence of Ezakiella peruensis strain M6.X2T. The draft genome is 1,672,788 bp long and harbors 1,589 predicted protein-encoding genes, including 26 antibiotic resistance genes with 1 gene encoding vancomycin resistance. The genome also exhibits 1 clustered regularly interspaced short palindromic repeat region and 333 genes acquired by horizontal gene transfer. Copyright © 2018 Diop et al.
Identification of functional elements and regulatory circuits by Drosophila modENCODE
DOE Office of Scientific and Technical Information (OSTI.GOV)
Roy, Sushmita; Ernst, Jason; Kharchenko, Peter V.
2010-12-22
To gain insight into how genomic information is translated into cellular and developmental programs, the Drosophila model organism Encyclopedia of DNA Elements (modENCODE) project is comprehensively mapping transcripts, histone modifications, chromosomal proteins, transcription factors, replication proteins and intermediates, and nucleosome properties across a developmental time course and in multiple cell lines. We have generated more than 700 data sets and discovered protein-coding, noncoding, RNA regulatory, replication, and chromatin elements, more than tripling the annotated portion of the Drosophila genome. Correlated activity patterns of these elements reveal a functional regulatory network, which predicts putative new functions for genes, reveals stage- and tissue-specific regulators, and enables gene-expression prediction. Our results provide a foundation for directed experimental and computational studies in Drosophila and related species and also a model for systematic data integration toward comprehensive genomic and functional annotation. Several years after the complete genetic sequencing of many species, it is still unclear how to translate genomic information into a functional map of cellular and developmental programs. The Encyclopedia of DNA Elements (ENCODE) (1) and model organism ENCODE (modENCODE) (2) projects use diverse genomic assays to comprehensively annotate the Homo sapiens (human), Drosophila melanogaster (fruit fly), and Caenorhabditis elegans (worm) genomes, through systematic generation and computational integration of functional genomic data sets. Previous genomic studies in flies have made seminal contributions to our understanding of basic biological mechanisms and genome functions, facilitated by genetic, experimental, computational, and manual annotation of the euchromatic and heterochromatic genome (3), small genome size, short life cycle, and a deep knowledge of development, gene function, and chromosome biology. 
The functions of ~40% of the protein and nonprotein-coding genes [FlyBase 5.12 (4)] have been determined from cDNA collections (5, 6), manual curation of gene models (7), gene mutations and comprehensive genome-wide RNA interference screens (8-10), and comparative genomic analyses (11, 12). The Drosophila modENCODE project has generated more than 700 data sets that profile transcripts, histone modifications and physical nucleosome properties, general and specific transcription factors (TFs), and replication programs in cell lines, isolated tissues, and whole organisms across several developmental stages (Fig. 1). Here, we computationally integrate these data sets and report (i) improved and additional genome annotations, including full-length protein-coding genes and peptides as short as 21 amino acids; (ii) noncoding transcripts, including 132 candidate structural RNAs and 1608 nonstructural transcripts; (iii) additional Argonaute (Ago)-associated small RNA genes and pathways, including new microRNAs (miRNAs) encoded within protein-coding exons and endogenous small interfering RNAs (siRNAs) from 3′ untranslated regions; (iv) chromatin 'states' defined by combinatorial patterns of 18 chromatin marks that are associated with distinct functions and properties; (v) regions of high TF occupancy and replication activity with likely epigenetic regulation; (vi) mixed TF and miRNA regulatory networks with hierarchical structure and enriched feed-forward loops; (vii) coexpression- and co-regulation-based functional annotations for nearly 3000 genes; (viii) stage- and tissue-specific regulators; and (ix) predictive models of gene expression levels and regulator function.
Murcha, Monika W.; Rudhe, Charlotta; Elhafez, Dina; Adams, Keith L.; Daley, Daniel O.; Whelan, James
2005-01-01
The minimal requirements to support protein import into mitochondria were investigated in the context of the phenomenon of ongoing gene transfer from the mitochondrion to the nucleus in plants. Ribosomal protein 10 of the small subunit is encoded in the mitochondrion in soybean and many other angiosperms, whereas in several other species it is nuclear encoded and thus must be imported into the mitochondrial matrix to function. When encoded by the nuclear genome, it has adopted different strategies for mitochondrial targeting and import. In lettuce (Lactuca sativa) and carrot (Daucus carota), Rps10 independently gained different N-terminal extensions from other genes, following transfer to the nucleus. (The designation of Rps10 follows this convention. The gene is indicated in italics. If encoded in the mitochondrion, it is rps10; if encoded in the nucleus, it is Rps10.) Here, we show that the N-terminal extensions of Rps10 in lettuce and carrot are both essential for mitochondrial import. In maize (Zea mays), Rps10 has not acquired an extension upon transfer but can be readily imported into mitochondria. Deletion analysis located the mitochondrial targeting region to the first 20 amino acids. Using site-directed mutagenesis, we changed residues in the first 20 amino acids of the mitochondrial encoded soybean (Glycine max) rps10 to the corresponding amino acids in the nuclear encoded maize Rps10 until import was achieved. Changes were required that altered charge, hydrophobicity, predicted ability to form an amphipathic α-helix, and generation of a binding motif for the outer mitochondrial membrane receptor, translocase of the outer membrane 20. In addition to defining the changes required to achieve mitochondrial localization, the results demonstrate that even proteins that do not present barriers to import can require substantial changes to acquire a mitochondrial targeting signal. PMID:16040655
Hayman, G T; Beck von Bodman, S; Kim, H; Jiang, P; Farrand, S K
1993-01-01
The acc region, subcloned from pTiC58 of classical nopaline and agrocinopine A and B Agrobacterium tumefaciens C58, allowed agrobacteria to grow using agrocinopine B as the sole source of carbon and energy. acc is approximately 6 kb in size. It consists of at least five genes, accA through accE, as defined by complementation analysis using subcloned fragments and transposon insertion mutations of acc carried on different plasmids within the same cell. All five regions are required for agrocin 84 sensitivity, and at least four are required for agrocinopine and agrocin 84 uptake. The complementation results are consistent with the hypothesis that each of the five regions is separately transcribed. Maxicell experiments showed that the first of these genes, accA, encodes a 60-kDa protein. Analysis of osmotic shock fractions showed this protein to be located in the periplasm. The DNA sequence of the accA region revealed an open reading frame encoding a predicted polypeptide of 59,147 Da. The amino acid sequence encoded by this open reading frame is similar to the periplasmic binding proteins OppA and DppA of Escherichia coli and Salmonella typhimurium and OppA of Bacillus subtilis. PMID:8366042
Complete nucleotide sequence of jasmine virus H, a new member of the family Tombusviridae.
Zhuo, Tao; Zhu, Li-Juan; Lu, Cheng-Cong; Jiang, Chao-Yang; Chen, Zi-Yin; Zhang, Guangzhi; Wang, Zong-Hua; Jovel, Juan; Han, Yan-Hong
2018-03-01
Jasmine virus H (JaVH) is a novel virus associated with symptoms of yellow mosaic on jasmine. The JaVH genome is 3,867 nt in length with five open reading frames (ORFs) encoding a 27-kDa protein (ORF 1), an 87-kDa replicase protein (ORF 2), two centrally located movement proteins (ORF 3 and 4), and a 37-kDa capsid protein (ORF 5). Based on genomic and phylogenetic analysis, JaVH is predicted to be a member of the genus Pelarspovirus in the family Tombusviridae.
Comparative Genomic Analyses of the Bacterial Phosphotransferase System
Barabote, Ravi D.; Saier, Milton H.
2005-01-01
We report analyses of 202 fully sequenced genomes for homologues of known protein constituents of the bacterial phosphoenolpyruvate-dependent phosphotransferase system (PTS). These included 174 bacterial, 19 archaeal, and 9 eukaryotic genomes. Homologues of PTS proteins were not identified in archaea or eukaryotes, showing that the horizontal transfer of genes encoding PTS proteins has not occurred between the three domains of life. Of the 174 bacterial genomes (136 bacterial species) analyzed, 30 diverse species have no PTS homologues, and 29 species have cytoplasmic PTS phosphoryl transfer protein homologues but lack recognizable PTS permeases. These soluble homologues presumably function in regulation. The remaining 77 species possess all PTS proteins required for the transport and phosphorylation of at least one sugar via the PTS. Up to 3.2% of the genes in a bacterium encode PTS proteins. These homologues were analyzed for family association, range of protein types, domain organization, and organismal distribution. Different strains of a single bacterial species often possess strikingly different complements of PTS proteins. Types of PTS protein domain fusions were analyzed, showing that certain types of domain fusions are common, while others are rare or prohibited. Select PTS proteins were analyzed from different phylogenetic standpoints, showing that PTS protein phylogeny often differs from organismal phylogeny. The results document the frequent gain and loss of PTS protein-encoding genes and suggest that the lateral transfer of these genes within the bacterial domain has played an important role in bacterial evolution. Our studies provide insight into the development of complex multicomponent enzyme systems and lead to predictions regarding the types of protein-protein interactions that promote efficient PTS-mediated phosphoryl transfer. PMID:16339738
Chen, Yan-Mei; Du, Zhong-Wei; Yao, Zhen
2005-12-01
Several putative Oct-4 downstream genes from mouse embryonic stem (ES) cells have been identified using the suppression-subtractive hybridization method. In this study, one of the novel genes encoding an ES cell and germ cell specific protein (ESGP) was cloned by rapid amplification of cDNA ends. ESGP contains 801 bp encoding an 84 amino acid small protein and has no significant homology to any known genes. There is a signal peptide at the N-terminus of the ESGP protein as predicted by SeqWeb (GCG) (SeqWeb version 2.0.2, http://gcg.biosino.org:8080/). The result of immunofluorescence assay suggested that ESGP might encode a secretory protein. The expression pattern of ESGP is consistent with the expression of Oct-4 during embryonic development. ESGP protein was detected in fertilized oocyte, from 3.5 day postcoital (dpc) blastocyst to 17.5 dpc embryo, and was only detected in testis and ovary tissues in adults. In vitro, ESGP was only expressed in pluripotent cell lines, such as embryonic stem cells, embryonic carcinoma cells and embryonic germ cells, but not in their differentiated progenies. Despite its specific expression, forced expression of ESGP is not indispensable for the effect of Oct-4 on ES cell self-renewal, and does not affect the differentiation to three germ layers.
Three new members of the RNP protein family in Xenopus.
Good, P J; Rebbert, M L; Dawid, I B
1993-01-01
Many RNP proteins contain one or more copies of the RNA recognition motif (RRM) and are thought to be involved in cellular RNA metabolism. We have previously characterized in Xenopus a nervous system specific gene, nrp1, that is more similar to the hnRNP A/B proteins than to other known proteins (K. Richter, P. J. Good, and I. B. Dawid (1990), New Biol. 2, 556-565). PCR amplification with degenerate primers was used to identify additional cDNAs encoding two RRMs in Xenopus. Three previously uncharacterized genes were identified. Two genes encode hnRNP A/B proteins with two RRMs and a glycine-rich domain. One of these is the Xenopus homolog of the human A2/B1 gene; the other, named hnRNP A3, is similar to both the A1 and A2 hnRNP genes. The Xenopus hnRNP A1, A2 and A3 genes are expressed throughout development and in all adult tissues. Multiple protein isoforms for the hnRNP A2 gene are predicted that differ by the insertion of short peptide sequences in the glycine-rich domain. The third newly isolated gene, named xrp1, encodes a protein that is related by sequence to the nrp1 protein but is expressed ubiquitously. Despite the similarity to nuclear RNP proteins, both the nrp1 and xrp1 proteins are localized to the cytoplasm in the Xenopus oocyte. The xrp1 gene may have a function in all cells that is similar to that executed by nrp1 specifically within the nervous system. PMID:8451200
Rella, Monika; Elliot, Joann L; Revett, Timothy J; Lanfear, Jerry; Phelan, Anne; Jackson, Richard M; Turner, Anthony J; Hooper, Nigel M
2007-01-01
Background Mammalian angiotensin converting enzyme (ACE) plays a key role in blood pressure regulation. Although multiple ACE-like proteins exist in non-mammalian organisms, to date only one other ACE homologue, ACE2, has been identified in mammals. Results Here we report the identification and characterisation of the gene encoding a third homologue of ACE, termed ACE3, in several mammalian genomes. The ACE3 gene is located on the same chromosome downstream of the ACE gene. Multiple sequence alignment and molecular modelling have been employed to characterise the predicted ACE3 protein. In mouse, rat, cow and dog, the predicted protein has mutations in some of the critical residues involved in catalysis, including the catalytic Glu in the HEXXH zinc binding motif which is Gln, and ESTs or reverse-transcription PCR indicate that the gene is expressed. In humans, the predicted ACE3 protein has an intact HEXXH motif, but there are other deletions and insertions in the gene and no ESTs have been identified. Conclusion In the genomes of several mammalian species there is a gene that encodes a novel, single domain ACE-like protein, ACE3. In mouse, rat, cow and dog ACE3, the catalytic Glu is replaced by Gln in the putative zinc binding motif, indicating that in these species ACE3 would lack catalytic activity as a zinc metalloprotease. In humans, no evidence was found that the ACE3 gene is expressed and the presence of deletions and insertions in the sequence indicate that ACE3 is a pseudogene. PMID:17597519
Wu, Yichao; Arumugam, Krithika; Tay, Martin Qi Xiang; Seshan, Hari; Mohanty, Anee; Cao, Bin
2015-04-01
Comamonas testosteroni is an important environmental bacterium capable of degrading a variety of toxic aromatic pollutants and has been demonstrated to be a promising biocatalyst for environmental decontamination. This organism is often found to be among the primary surface colonizers in various natural and engineered ecosystems, suggesting an extraordinary capability of this organism in environmental adaptation and biofilm formation. The goal of this study was to gain genetic insights into the adaption of C. testosteroni to versatile environments and the importance of a biofilm lifestyle. Specifically, a draft genome of C. testosteroni I2 was obtained. The draft genome is 5,778,710 bp in length and comprises 110 contigs. The average G+C content was 61.88 %. A total of 5365 genes with 5263 protein-coding genes were predicted, whereas 4324 (80.60 % of total genes) protein-encoding genes were associated with predicted functions. The catabolic genes responsible for biodegradation of steroid and other aromatic compounds on draft genome were identified. Plasmid pI2 was found to encode a complete pathway for aniline degradation and a partial catabolic pathway for chloroaniline. This organism was found to be equipped with a sophisticated signaling system which helps it find ideal niches and switch between planktonic and biofilm lifestyles. A large number of putative multi-drug-resistant genes coding for abundant outer membrane transporters, chaperones, and heat shock proteins for the protection of cellular function were identified in the genome of strain I2. In addition, the genome of strain I2 was predicted to encode several proteins involved in producing, secreting, and uptaking siderophores under iron-limiting conditions. The genome of strain I2 contains a number of genes responsible for the synthesis and secretion of exopolysaccharides, an extracellular component essential for biofilm formation. 
Overall, our results reveal the genomic features underlying the adaptation of C. testosteroni to versatile environments and highlight the importance of its biofilm lifestyle.
Firth, Andrew E; Atkins, John F
2009-01-01
Japanese encephalitis, West Nile, Usutu and Murray Valley encephalitis viruses form a tight subgroup within the larger Flavivirus genus. These viruses utilize a single-polyprotein expression strategy, resulting in ~10 mature proteins. Plotting the conservation at synonymous sites along the polyprotein coding sequence reveals strong conservation peaks at the very 5' end of the coding sequence, and also at the 5' end of the sequence encoding the NS2A protein. Such peaks are generally indicative of functionally important non-coding sequence elements. The second peak corresponds to a predicted stable pseudoknot structure whose biological importance is supported by compensatory mutations that preserve the structure. The pseudoknot is preceded by a conserved slippery heptanucleotide (Y CCU UUU), thus forming a classical stimulatory motif for -1 ribosomal frameshifting. We hypothesize, therefore, that the functional importance of the pseudoknot is to stimulate a portion of ribosomes to shift -1 nt into a short (45 codon), conserved, overlapping open reading frame, termed foo. Since cleavage at the NS1-NS2A boundary is known to require synthesis of NS2A in cis, the resulting transframe fusion protein is predicted to be NS1-NS2AN-term-FOO. We hypothesize that this may explain the origin of the previously identified NS1 'extension' protein in JEV-group flaviviruses, known as NS1'. PMID:19196463
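The frameshift signal described in the abstract above (a slippery heptanucleotide Y CCU UUU followed by a short overlapping reading frame) can be illustrated with a minimal sketch. The toy sequence, the function names, and the choice to start reading the -1 frame at the last nucleotide of the heptamer (the usual tandem-slippage model) are assumptions made for illustration; the abstract itself gives no sequence or code.

```python
import re

# Slippery heptanucleotide from the abstract: Y CCU UUU, with Y = pyrimidine (C or U).
SLIPPERY = re.compile(r"[CU]CCUUUU")
STOP_CODONS = {"UAA", "UAG", "UGA"}


def to_rna(seq):
    """Upper-case the sequence and convert DNA letters to RNA."""
    return seq.upper().replace("T", "U")


def find_slippery_sites(seq):
    """Return the 0-based start position of every Y CCU UUU match."""
    return [m.start() for m in SLIPPERY.finditer(to_rna(seq))]


def minus_one_codons(seq, site_start):
    """Codons read after a -1 shift on the heptamer starting at site_start.

    Assumption: for a heptamer X XXY YYZ, the first codon decoded in the
    -1 frame begins at the final nucleotide Z (offset 6). Reading stops at
    the first stop codon or at the end of the sequence.
    """
    rna = to_rna(seq)
    codons = []
    for i in range(site_start + 6, len(rna) - 2, 3):
        codon = rna[i:i + 3]
        if codon in STOP_CODONS:
            break
        codons.append(codon)
    return codons


if __name__ == "__main__":
    toy = "AAGUCCUUUUCAGGCUAAGG"  # hypothetical sequence, not from the paper
    for site in find_slippery_sites(toy):
        print(site, minus_one_codons(toy, site))
```

On a real genome one would additionally require that the heptamer sits in the correct register relative to the polyprotein reading frame and is followed by a predicted pseudoknot; neither check is attempted here.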
The ORF1 Protein Encoded by LINE-1: Structure and Function During L1 Retrotransposition
Martin, Sandra L.
2006-01-01
LINE-1, or L1, is an autonomous non-LTR retrotransposon in mammals. Retrotransposition requires the function of the two L1-encoded polypeptides, ORF1p and ORF2p. Early recognition of regions of homology between the predicted amino acid sequence of ORF2 and known endonuclease and reverse transcriptase enzymes led to testable hypotheses regarding the function of ORF2p in retrotransposition. As predicted, ORF2p has been demonstrated to have both endonuclease and reverse transcriptase activities. In contrast, no homologs of known function have contributed to our understanding of the function of ORF1p during retrotransposition. Nevertheless, significant advances have been made such that we now know that ORF1p is a high affinity RNA binding protein that forms a ribonucleoprotein particle together with L1 RNA. Furthermore, ORF1p is a nucleic acid chaperone and this nucleic acid chaperone activity is required for L1 retrotransposition. PMID:16877816
Pritham, Ellen J; Putliwala, Tasneem; Feschotte, Cédric
2007-04-01
We previously identified a group of atypical mobile elements designated Mavericks from the nematodes Caenorhabditis elegans and C. briggsae and the zebrafish Danio rerio. Here we present the results of comprehensive database searches of the genome sequences available, which reveal that Mavericks are widespread in invertebrates and non-mammalian vertebrates but show a patchy distribution in non-animal species, being present in the fungi Glomus intraradices and Phakopsora pachyrhizi and in several single-celled eukaryotes such as the ciliate Tetrahymena thermophila, the stramenopile Phytophthora infestans and the trichomonad Trichomonas vaginalis, but not detectable in plants. This distribution, together with comparative and phylogenetic analyses of Maverick-encoded proteins, is suggestive of an ancient origin of these elements in eukaryotes followed by lineage-specific losses and/or recurrent episodes of horizontal transmission. In addition, we report that Maverick elements have amplified recently to high copy numbers in T. vaginalis where they now occupy as much as 30% of the genome. Sequence analysis confirms that most Mavericks encode a retroviral-like integrase, but lack other open reading frames typically found in retroelements. Nevertheless, the length and conservation of the target site duplication created upon Maverick insertion (5- or 6-bp) is consistent with a role of the integrase-like protein in the integration of a double-stranded DNA transposition intermediate. Mavericks also display long terminal-inverted repeats but do not contain ORFs similar to proteins encoded by DNA transposons. Instead, Mavericks encode a conserved set of 5 to 9 genes (in addition to the integrase) that are predicted to encode proteins with homology to replication and packaging proteins of some bacteriophages and diverse eukaryotic double-stranded DNA viruses, including a DNA polymerase B homolog and putative capsid proteins. 
Based on these and other structural similarities, we speculate that Mavericks represent an evolutionary missing link between seemingly disparate invasive DNA elements that include bacteriophages, adenoviruses and eukaryotic linear plasmids.
Yan, Hong-Bin; Lou, Zhong-Zi; Li, Li; Brindley, Paul J; Zheng, Yadong; Luo, Xuenong; Hou, Junling; Guo, Aijiang; Jia, Wan-Zhong; Cai, Xuepeng
2014-06-04
Cysticercosis remains a major neglected tropical disease of humanity in many regions, especially in sub-Saharan Africa, Central America and elsewhere. Owing to the emerging drug resistance and the inability of current drugs to prevent re-infection, identification of novel vaccines and chemotherapeutic agents against Taenia solium and related helminth pathogens is a public health priority. The T. solium genome and the predicted proteome were reported recently, providing a wealth of information from which new interventional targets might be identified. In order to characterize and classify the entire repertoire of protease-encoding genes of T. solium, which play fundamental biological roles in all life processes, we analyzed the predicted proteins of this cestode through a combination of bioinformatics tools. Functional annotation was performed to yield insights into the signaling processes relevant to the complex developmental cycle of this tapeworm and to highlight a suite of the proteases as potential intervention targets. Within the genome of this helminth parasite, we identified 200 open reading frames encoding proteases from five clans, which correspond to 1.68% of the 11,902 protein-encoding genes predicted to be present in its genome. These proteases include calpains, cytosolic, mitochondrial signal peptidases, ubiquitylation related proteins, and others. Many not only show significant similarity to proteases in the Conserved Domain Database but have conserved active sites and catalytic domains. KEGG Automatic Annotation Server (KAAS) analysis indicated that ~60% of these proteases share strong sequence identities with proteins of the KEGG database, which are involved in human disease, metabolic pathways, genetic information processes, cellular processes, environmental information processes and organismal systems. Also, we identified signal peptides and transmembrane helices through comparative analysis with classes of important regulatory proteases.
Phylogenetic analysis using a Bayesian approach provided support for inferring functional divergence among regulatory cysteine and serine proteases. Numerous putative proteases were identified for the first time in T. solium, and important regulatory proteases have been predicted. This comprehensive analysis not only complements the growing knowledge base of proteolytic enzymes, but also provides a platform from which to expand knowledge of cestode proteases and to explore their biochemistry and potential as intervention targets.
Comparative genomic analysis of the multispecies probiotic-marketed product VSL#3.
Douillard, François P; Mora, Diego; Eijlander, Robyn T; Wels, Michiel; de Vos, Willem M
2018-01-01
Several probiotic-marketed formulations available for the consumers contain live lactic acid bacteria and/or bifidobacteria. The multispecies product commercialized as VSL#3 has been used for treating various gastro-intestinal disorders. However, like many other products, the bacterial strains present in VSL#3 have only been characterized to a limited extent and their efficacy as well as their predicted mode of action remain unclear, preventing further applications or comparative studies. In this work, the genomes of all eight bacterial strains present in VSL#3 were sequenced and characterized, to advance insights into the possible mode of action of this product and also to serve as a basis for future work and trials. Phylogenetic and genomic data analysis allowed us to identify the 7 species present in the VSL#3 product as specified by the manufacturer. The 8 strains present belong to the species Streptococcus thermophilus, Lactobacillus acidophilus, Lactobacillus paracasei, Lactobacillus plantarum, Lactobacillus helveticus, Bifidobacterium breve and B. animalis subsp. lactis (two distinct strains). Comparative genomics revealed that the draft genomes of the S. thermophilus and L. helveticus strains were predicted to encode most of the defence systems such as restriction modification and CRISPR-Cas systems. Genes associated with a variety of potential probiotic functions were also identified. Thus, in the three Bifidobacterium spp., gene clusters were predicted to encode tight adherence pili, known to promote bacteria-host interaction and intestinal barrier integrity, and to impact host cell development. Various repertoires of putative signalling proteins were predicted to be encoded by the genomes of the Lactobacillus spp., i.e. surface layer proteins, LPXTG-containing proteins, or sortase-dependent pili that may interact with the intestinal mucosa and dendritic cells. 
Taken altogether, the individual genomic characterization of the strains present in the VSL#3 product confirmed the product specifications, determined its coding capacity as well as identified potential probiotic functions. PMID:29451876
Kobayashi, Michie; Hiraka, Yukie; Abe, Akira; Yaegashi, Hiroki; Natsume, Satoshi; Kikuchi, Hideko; Takagi, Hiroki; Saitoh, Hiromasa; Win, Joe; Kamoun, Sophien; Terauchi, Ryohei
2017-11-22
Downy mildew, caused by the oomycete pathogen Sclerospora graminicola, is an economically important disease of Gramineae crops including foxtail millet (Setaria italica). Plants infected with S. graminicola are generally stunted and often undergo a transformation of flower organs into leaves (phyllody or witches' broom), resulting in serious yield loss. To establish the molecular basis of downy mildew disease in foxtail millet, we carried out whole-genome sequencing and an RNA-seq analysis of S. graminicola. Sequence reads were generated from S. graminicola using an Illumina sequencing platform and assembled de novo into a draft genome sequence comprising approximately 360 Mbp. Of this sequence, 73% comprised repetitive elements, and a total of 16,736 genes were predicted from the RNA-seq data. The predicted genes included those encoding effector-like proteins with high sequence similarity to those previously identified in other oomycete pathogens. Genes encoding jacalin-like lectin-domain-containing secreted proteins were enriched in S. graminicola compared to other oomycetes. Of a total of 1220 genes encoding putative secreted proteins, 91 significantly changed their expression levels during the infection of plant tissues compared to the sporangia and zoospore stages of the S. graminicola lifecycle. We established the draft genome sequence of a downy mildew pathogen that infects Gramineae plants. Based on this sequence and our transcriptome analysis, we generated a catalog of in planta-induced candidate effector genes, providing a solid foundation from which to identify the effectors causing phyllody.
Semova, Natalia; Kapanadze, Bagrat; Corcoran, Martin; Kutsenko, Alexei; Baranova, Ancha; Semov, Alexandre
2003-09-01
IRLB was originally identified as a partial cDNA clone, encoding a 191-aa protein binding the interferon-stimulated response element (ISRE) in the P2 promoter of human MYC. Here, we cloned the full-size IRLB using different bioinformatics tools and an RT-PCR approach. The full-size gene encompasses 131 kb within chromosome 15q22 and consists of 32 exons. IRLB is transcribed as a 6.6-kb mRNA encoding a protein of 1865 aa. IRLB is ubiquitously expressed and its expression is regulated in a growth- and cell cycle-dependent manner. In addition to the ISRE-binding domain IRLB contains a tripartite DENN domain, a nuclear localization signal, two PPRs, and a calmodulin-binding domain. The presence of DENN domains predicts possible interactions of IRLB with GTPases from the Rab family or regulation of growth-induced MAPKs. Strongly homologous proteins were identified in all available vertebrate genomes as well as in Caenorhabditis elegans and Drosophila melanogaster. In human and mouse a family of IRLB proteins exists, consisting of at least three members.
Bourai, Neema; Jacobs, William R; Narayanan, Sujatha
2012-02-01
The Mycobacterium tuberculosis genome encodes several high and low molecular mass penicillin binding proteins. One such low molecular mass protein is DacB2, encoded by open reading frame Rv2911 of M. tuberculosis, which is predicted to play a role in peptidoglycan synthesis. In this study we have tried to gain an insight into the role of this accessory cell division protein in mycobacterial physiology by performing overexpression and deletion studies. The overproduction of DacB2 in the non-pathogenic, fast-growing mycobacterium Mycobacterium smegmatis mc²155 resulted in reduced growth, an altered colony morphology, and defects in sliding motility and biofilm formation. A point mutant of DacB2 was made wherein the active site serine residue was mutated to cysteine to abolish the penicillin binding function of the protein. The overexpression of the mutant protein showed similar results, indicating that the effects produced were independent of the protein's penicillin binding function. The gene encoding DacB2 was deleted in M. tuberculosis by the specialized transduction method. The deletion mutant showed reduced growth in Sauton's medium under acidic and low-oxygen conditions. The in vitro infection studies with THP-1 cells showed increased intracellular survival of the dacB2 mutant compared to the parent and complemented strains. The colony morphology and antibiotic sensitivity of the mutant and wild-type strains were similar. Copyright © 2011 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Rauf, Muhammad; Saeed, Nasir A.; Habib, Imran; Ahmed, Moddassir; Shahzad, Khurram; Mansoor, Shahid; Ali, Rashid
2017-02-01
Structure prediction can provide information about the function and active sites of a protein, which helps in designing new functional proteins. H+-pyrophosphatase is a transmembrane protein involved in establishing the proton motive force for active transport of Na+ across the membrane by Na+/H+ antiporters. A full-length novel H+-pyrophosphatase gene was isolated from the halophytic grass Leptochloa fusca using RT-PCR and the RACE method. The full-length LfVP1 gene sequence of 2292 nucleotides encodes a protein of 764 amino acids. The DNA and protein sequences were characterized using bioinformatics tools. Various important potential sites were predicted by the PROSITE webserver. Primary structural analysis showed LfVP1 to be a stable protein, and the grand average of hydropathy (GRAVY) indicated that the LfVP1 protein has good hydrosolubility. Secondary structure analysis showed that the LfVP1 protein sequence contains a significant proportion of alpha helix and random coil. Protein membrane topology suggested the presence of 14 transmembrane domains and of a catalytic domain in TM3. The three-dimensional structure predicted from the LfVP1 protein sequence also indicated the presence of 14 transmembrane domains, and the hydrophobicity surface model showed amino acid hydrophobicity. The Ramachandran plot showed that 98% of amino acid residues were predicted in the favored region.
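The GRAVY score mentioned in the abstract above is, in its usual definition, the mean Kyte-Doolittle hydropathy value over all residues of a sequence; negative values indicate an overall hydrophilic protein. The sketch below shows that calculation. The LfVP1 sequence is not given in the abstract, so the peptides used here are hypothetical, and the function name is an illustrative choice rather than the authors' tool.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(protein):
    """Grand average of hydropathy: sum of residue values / sequence length."""
    protein = protein.upper()
    if not protein:
        raise ValueError("empty sequence")
    unknown = set(protein) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError("non-standard residues: " + "".join(sorted(unknown)))
    return sum(KYTE_DOOLITTLE[aa] for aa in protein) / len(protein)


if __name__ == "__main__":
    # Hypothetical peptides, used only to show the sign convention.
    for peptide in ("AAAA", "IR", "DEKR"):
        print(peptide, gravy(peptide))
```

A membrane protein with 14 predicted transmembrane helices would normally be expected to contain long hydrophobic stretches, so a per-window hydropathy profile is often more informative than the single averaged value.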
Defining Aggressive Prostate Cancer Using a 12-Gene Model
Riva, Alberto; Kim, Robert; Varambally, Sooryanarayana; He, Le; Kutok, Jeff; Aster, Jonathan C; Tang, Jeffery; Kuefer, Rainer; Hofer, Matthias D; Febbo, Phillip G; Chinnaiyan, Arul M; Rubin, Mark A
2006-01-01
The critical clinical question in prostate cancer research is: How do we develop means of distinguishing aggressive disease from indolent disease? Using a combination of proteomic and expression array data, we identified a set of 36 genes with concordant dysregulation of protein products that could be evaluated in situ by quantitative immunohistochemistry. Another five prostate cancer biomarkers were included. Using linear discriminant analysis, we determined that the optimal model used to predict prostate cancer progression consisted of 12 proteins. Using a separate patient population, transcriptional levels of the 12 genes encoding for these proteins predicted prostate-specific antigen failure in 79 men following surgery for clinically localized prostate cancer (P = .0015). This study demonstrates that cross-platform models can lead to predictive models with the possible advantage of being more robust through this selection process. PMID:16533427
Springfeld, Christoph; Darai, Gholamreza; Cattaneo, Roberto
2005-06-01
Rhabdoviruses are negative-stranded RNA viruses of the order Mononegavirales and have been isolated from vertebrates, insects, and plants. Members of the genus Lyssavirus cause the invariably fatal disease rabies, and a member of the genus Vesiculovirus, Chandipura virus, has recently been associated with acute encephalitis in children. We present here the complete genome sequence and transcription map of a rhabdovirus isolated from cultivated cells of hepatocellular carcinoma tissue from a moribund tree shrew. The negative-strand genome of tupaia rhabdovirus is composed of 11,440 nucleotides and encodes six genes that are separated by one or two intergenic nucleotides. In addition to the typical rhabdovirus genes in the order N-P-M-G-L, a gene encoding a small hydrophobic putative type I transmembrane protein of approximately 11 kDa was identified between the M and G genes, and the corresponding transcript was detected in infected cells. Similar to some Vesiculoviruses and many Paramyxovirinae, the P gene has a second overlapping reading frame that can be accessed by ribosomal choice and encodes a protein of 26 kDa, predicted to be the largest C protein of these virus families. Phylogenetic analyses of the tupaia rhabdovirus N and L genes show that the virus is distantly related to the Vesiculoviruses, Ephemeroviruses, and the recently characterized Flanders virus and Oita virus and further extends the sequence territory occupied by animal rhabdoviruses. PMID:15890917
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jackson, P.J.; Walthers, E.A.; Richmond, K.L.
1997-04-01
PCR analysis of 198 Bacillus anthracis isolates revealed a variable region of DNA sequence differing in length among the isolates. Five polymorphisms differed by the presence of two to six copies of the 12-bp tandem repeat 5'-CAATATCAACAA-3'. This variable-number tandem repeat (VNTR) region is located within a larger sequence containing one complete open reading frame that encodes a putative 30-kDa protein. Length variation did not change the reading frame of the encoded protein and only changed the copy number of a 4-amino-acid sequence (QYQQ) from 2 to 6. The structure of the VNTR region suggests that these multiple repeats are generated by recombination or polymerase slippage. Protein structures predicted from the reverse-translated DNA sequence suggest that any structural changes in the encoded protein are confined to the region encoded by the VNTR sequence. Copy number differences in the VNTR region were used to define five different B. anthracis alleles. Characterization of 198 isolates revealed allele frequencies of 6.1, 17.7, 59.6, 5.6, and 11.1% sequentially from shorter to longer alleles. The high degree of polymorphism in the VNTR region provides a criterion for assigning isolates to five allelic categories. There is a correlation between categories and geographic distribution. Such molecular markers can be used to monitor the epidemiology of anthrax outbreaks in domestic and native herbivore populations. 22 refs., 4 figs., 3 tabs.
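The relationship between the 12-bp repeat and the QYQQ peptide in the abstract above can be checked with a short sketch that translates the repeat with the standard genetic code. The repeat sequence, the copy-number range, and the reported percentages come from the abstract; the per-allele isolate counts are an assumption, back-calculated here so that they sum to 198 and reproduce the reported frequencies, and the function names are illustrative.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

REPEAT = "CAATATCAACAA"  # 12-bp tandem repeat from the abstract


def translate(dna):
    """Translate complete codons of a DNA string, read in frame 0."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def vntr_peptide(copies):
    """Peptide encoded by the given number of tandem repeat copies."""
    return translate(REPEAT * copies)


# ASSUMPTION: isolate counts per allele (keyed by repeat copy number) are
# inferred from the reported percentages; they are not stated in the abstract.
ASSUMED_COUNTS = {2: 12, 3: 35, 4: 118, 5: 11, 6: 22}


def allele_percentages(counts):
    """Percentage of isolates carrying each allele, rounded to one decimal."""
    total = sum(counts.values())
    return {copies: round(100 * n / total, 1) for copies, n in counts.items()}


if __name__ == "__main__":
    for copies in range(2, 7):
        print(copies, len(REPEAT) * copies, vntr_peptide(copies))
    print(allele_percentages(ASSUMED_COUNTS))
```

Because the repeat length (12 bp) is a multiple of three, adding or removing whole copies inserts or deletes whole codons, which is why the downstream reading frame is unaffected.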
Zhang, Jiaxin; Movahedi, Ali; Wang, Xiaoli; Wu, Xiaolong; Yin, Tongming; Zhuge, Qiang
2015-06-01
The increasing resistance of bacteria and fungi to currently available antibiotics is a major concern worldwide, leading to enormous efforts to develop new antibiotics with new modes of actions. In this paper, cDNA encoding cecropin A was amplified from drury (Hyphantria cunea) (dHC) pupa fatbody total RNA using RT-PCR. The full-length dHC-cecropin A cDNA encoded a protein of 63 amino acids with a predicted 26-amino acid signal peptide and a 37-amino acid functional domain. We synthesized the antibacterial peptide (ABP) from the 37-amino acid functional domain (ABP-dHC-cecropin A), and amidated it via the C-terminus. Time-of-flight mass spectrometry showed its molecular weight to be 4058.94. The ABP-dHC-cecropin A was assessed in terms of its protein structure using bioinformatics and CD spectroscopy. The protein's secondary structure was predicted to be α-helical. In an antibacterial activity analysis, the ABP-dHC-cecropin A exhibited strong antibacterial activity against E. coli K12D31 and Agrobacterium EHA105. Copyright © 2014 Elsevier Inc. All rights reserved.
Mares-Mares, Everardo; Gutiérrez-Vargas, Santiago; Pérez-Moreno, Luis; Ordoñez-Acevedo, Leandro G; Barboza-Corona, José E; León-Galván, Ma Fabiola
2017-01-01
The objective of this research was to identify and characterize the encoded peptides present in nut storage proteins of Carya illinoinensis. It was found, through in silico prediction, proteomic analysis, and MS spectrometry, that bioactive peptides were mainly found in albumin and glutelin fractions. Glutelin was the major fraction with ~53% of the nut storage proteins containing at least 21 peptides with different putative biological activities, including antihypertensives, antioxidants, immunomodulators, protease inhibitors, and inhibitors of cell cycle progression in cancer cells. Data showed that using 50 μg/mL tryptic digests of enriched peptides obtained from nut glutelins is able to induce up to 19% of apoptosis in both HeLa and CasKi cervical cancer cells. To our knowledge, this is the first report that shows the potential value of the nut-encoded peptides to be considered as adjuvants in cancer therapies. PMID:29279842
O’Keeffe, Triona; Hill, Colin; Ross, R. Paul
1999-01-01
Enterocin A is a small, heat-stable, antilisterial bacteriocin produced by Enterococcus faecium DPC1146. The sequence of a 10,879-bp chromosomal region containing at least 12 open reading frames (ORFs), 7 of which are predicted to play a role in enterocin biosynthesis, is presented. The genes entA, entI, and entF encode the enterocin A prepeptide, the putative immunity protein, and the induction factor prepeptide, respectively. The deduced proteins EntK and EntR resemble the histidine kinase and response regulator proteins of two-component signal transducing systems of the AgrC-AgrA type. The predicted proteins EntT and EntD are homologous to ABC (ATP-binding cassette) transporters and accessory factors, respectively, of several other bacteriocin systems and to proteins implicated in the signal-sequence-independent export of Escherichia coli hemolysin A. Immediately downstream of the entT and entD genes are two ORFs, the product of one of which, ORF4, is very similar to the product of the yteI gene of Bacillus subtilis and to E. coli protease IV, a signal peptide peptidase known to be involved in outer membrane lipoprotein export. Another potential bacteriocin is encoded in the opposite direction to the other genes in the enterocin cluster. This putative bacteriocin-like peptide is similar to LafX, one of the components of the lactacin F complex. A deletion which included one of two direct repeats upstream of the entA gene abolished enterocin A activity, immunity, and ability to induce bacteriocin production. Transposon insertion upstream of the entF gene also had the same effect, but this mutant could be complemented by exogenously supplied induction factor. The putative EntI peptide was shown to be involved in the immunity to enterocin A. 
Cloning of a 10.5-kb amplicon comprising all predicted ORFs and regulatory regions resulted in heterologous production of enterocin A and induction factor in Enterococcus faecalis, while a four-gene construct (entAITD) under the control of a constitutive promoter resulted in heterologous enterocin A production in both E. faecalis and Lactococcus lactis. PMID:10103244
Yu, Dongjun; Wu, Xiaowei; Shen, Hongbin; Yang, Jian; Tang, Zhenmin; Qi, Yong; Yang, Jingyu
2012-12-01
Membrane proteins are encoded by ~30% of the genome and perform important functions in living organisms. Previous studies have revealed that the structures and functions of membrane proteins show clear organelle-specific properties. Hence, given the extreme difficulty of wet-lab studies of membrane proteins, it is highly desirable to predict a membrane protein's subcellular location from its primary sequence. Although many models have been developed for predicting protein subcellular locations, only a few are specific to membrane proteins. Existing prediction approaches were built on statistical machine learning algorithms with serial combination of multi-view features, i.e., different feature vectors are simply concatenated to form a super feature vector. However, such simple combination also increases information redundancy, which can in turn degrade the final prediction accuracy. This is why prediction success rates in the serial super space were often found to be even lower than those in a single-view space. The purpose of this paper is to investigate a proper method for fusing multi-view protein sequence features for subcellular location prediction. Instead of the serial strategy, we propose a novel parallel framework for fusing multiple membrane protein multi-view attributes that represents protein samples in complex spaces. We also propose generalized principal component analysis (GPCA) for feature reduction in the complex geometry. Experimental results with different machine learning algorithms on benchmark membrane protein subcellular localization datasets demonstrate that the newly proposed parallel strategy outperforms the traditional serial approach. We also demonstrate the efficacy of the parallel strategy on a soluble protein subcellular localization dataset, indicating that the parallel technique is flexible enough to suit other computational biology problems. 
The software and datasets are available at: http://www.csbio.sjtu.edu.cn/bioinf/mpsp.
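The contrast between the serial and parallel strategies described in the abstract above can be illustrated with a minimal sketch. It assumes two real-valued feature views of one protein, pairs them as the real and imaginary parts of a complex vector, and zero-pads the shorter view. The function names, the padding choice, and the toy values are illustrative assumptions and are not taken from the paper or its software.

```python
def serial_fusion(view_a, view_b):
    """Serial strategy: concatenate the two views into one longer real vector."""
    return list(view_a) + list(view_b)


def parallel_fusion(view_a, view_b):
    """Parallel strategy (assumed form): view_a + i * view_b, zero-padding the shorter view."""
    n = max(len(view_a), len(view_b))
    a = list(view_a) + [0.0] * (n - len(view_a))
    b = list(view_b) + [0.0] * (n - len(view_b))
    return [complex(x, y) for x, y in zip(a, b)]


def hermitian_inner(u, v):
    """Inner product in the complex feature space (conjugate taken on the second argument)."""
    return sum(x * y.conjugate() for x, y in zip(u, v))


# Toy feature views for a single protein (hypothetical values).
view1 = [0.2, 0.5, 0.3]
view2 = [1.0, 0.0]

serial = serial_fusion(view1, view2)      # dimension grows to len(view1) + len(view2)
parallel = parallel_fusion(view1, view2)  # dimension stays at max(len(view1), len(view2))
```

The point of the sketch is only the dimensionality: serial fusion adds the lengths of the views, whereas the complex representation keeps the length of the longer view. Any subsequent reduction step would then have to work with Hermitian inner products such as `hermitian_inner`.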
Lugli, Gabriele Andrea; Mancino, Walter; Milani, Christian; Duranti, Sabrina; Turroni, Francesca; van Sinderen, Douwe; Ventura, Marco
2018-06-08
The repertoire of secreted proteins decoded by a microorganism represents proteins released from or associated with the cell's surface. In gut commensals, such as bifidobacteria, these proteins are perceived to be functionally relevant as they regulate the interaction with the gut environment. In the current study, we have screened the predicted proteome of over 300 bifidobacterial strains amongst the currently recognized bifidobacterial species to generate a comprehensive database encompassing bifidobacterial extracellular proteins. A glycobiome analysis of this predicted bifidobacterial secretome revealed that a correlation exists between particular bifidobacterial species and their capability to hydrolyze HMOs and intestinal glycoconjugates such as mucin. Furthermore, exploration of metatranscriptomic datasets of the infant gut microbiota allowed the evaluation of the expression of bifidobacterial genes encoding extracellular proteins, represented by ABC transporter substrate-binding proteins and glycoside hydrolase enzymes involved in the degradation of human milk oligosaccharides and mucin. Overall, this study provides insights into how bifidobacteria interact with their natural yet highly complex environment, the infant gut. Importance: The ecological success of bifidobacteria relies on the activity of extracellular proteins that are involved in the metabolism of nutrients and the interaction with the environment. To date, information on secreted proteins encoded by bifidobacteria is incomplete and limited to a few species. In this study, we reconstructed the bifidobacterial pan-secretome, revealing extracellular proteins that modulate the interaction of bifidobacteria with their natural environment. 
Furthermore, a survey of secretion systems among bifidobacterial genomes allowed the identification of a conserved Sec-dependent secretion machinery in all the analyzed genomes and the Tat protein translocation system in the chromosomes of 23 strains belonging to Bifidobacterium longum subsp. longum and Bifidobacterium aesculapii. Copyright © 2018 American Society for Microbiology.
Petrucco, S; Bolchi, A; Foroni, C; Percudani, R; Rossi, G L; Ottonello, S
1996-01-01
We isolated a novel gene that is selectively induced both in roots and shoots in response to sulfur starvation. This gene encodes a cytosolic, monomeric protein of 33 kD that selectively binds NADPH. The predicted polypeptide is highly homologous (>70%) to leguminous isoflavone reductases (IFRs), but the maize protein (IRL for isoflavone reductase-like) belongs to a novel family of proteins present in a variety of plants. Anti-IRL antibodies specifically recognize IFR polypeptides, yet the maize protein is unable to use various isoflavonoids as substrates. IRL expression is correlated closely to glutathione availability: it is persistently induced in seedlings whose glutathione content is about fourfold lower than controls, and it is down-regulated rapidly when control levels of glutathione are restored. This glutathione-dependent regulation indicates that maize IRL may play a crucial role in the establishment of a thiol-independent response to oxidative stress under glutathione shortage conditions. PMID:8597660
Primary structure, expression and chromosomal locus of a human homolog of rat ERK3.
Meloche, S; Beatty, B G; Pellerin, J
1996-10-03
We report the cloning and characterization of a human cDNA encoding a novel homolog of rat extracellular signal-regulated kinase 3 (ERK3). The cDNA encodes a predicted protein of 721 amino acids which shares 92% amino acid identity with rat ERK3 over their shared length. Interestingly, the human protein contains a unique extension of 178 amino acids at its carboxy-terminal extremity. The human ERK3 protein also displays various degrees of homology to other members of the MAP kinase family, but does not contain the typical TXY regulatory motif between subdomains VII and VIII. Northern blot analysis revealed that ERK3 mRNA is widely distributed in human tissues, with the highest expression detected in skeletal muscle. The human ERK3 gene was mapped by fluorescence in situ hybridization to chromosome 15q21, a region associated with chromosomal abnormalities in acute nonlymphoblastic leukemias. This information should prove valuable in designing studies to define the cellular function of the ERK3 protein kinase.
Lo, Miranda; Murray, Gerald L; Khoo, Chen Ai; Haake, David A; Zuerner, Richard L; Adler, Ben
2010-11-01
Leptospirosis is a globally significant zoonosis caused by Leptospira spp. Iron is essential for growth of most bacterial species. Since iron availability is low in the host, pathogens have evolved complex iron acquisition mechanisms to survive and establish infection. In many bacteria, expression of iron uptake and storage proteins is regulated by Fur. L. interrogans encodes four predicted Fur homologs; we have constructed a mutation in one of these, la1857. We conducted microarray analysis to identify iron-responsive genes and to study the effects of la1857 mutation on gene expression. Under iron-limiting conditions, 43 genes were upregulated and 49 genes were downregulated in the wild type. Genes encoding proteins with predicted involvement in inorganic ion transport and metabolism (including TonB-dependent proteins and outer membrane transport proteins) were overrepresented in the upregulated list, while 54% of differentially expressed genes had no known function. There were 16 upregulated genes of unknown function which are absent from the saprophyte L. biflexa and which therefore may encode virulence-associated factors. Expression of iron-responsive genes was not significantly affected by mutagenesis of la1857, indicating that LA1857 is not a global regulator of iron homeostasis. Upregulation of heme biosynthetic genes and a putative catalase in the mutant suggested that LA1857 is more similar to PerR, a regulator of the oxidative stress response. Indeed, the la1857 mutant was more resistant to peroxide stress than the wild type. Our results provide insights into the role of iron in leptospiral metabolism and regulation of the oxidative stress response, including genes likely to be important for virulence.
Bown, David P; Gatehouse, John A
2004-05-01
Carboxypeptidases were purified from guts of larvae of corn earworm (Helicoverpa armigera), a lepidopteran crop pest, by affinity chromatography on immobilized potato carboxypeptidase inhibitor, and characterized by N-terminal sequencing. A larval gut cDNA library was screened using probes based on these protein sequences. cDNA HaCA42 encoded a carboxypeptidase with sequence similarity to enzymes of clan MC [Barrett, A. J., Rawlings, N. D. & Woessner, J. F. (1998) Handbook of Proteolytic Enzymes. Academic Press, London.], but with a novel predicted specificity towards C-terminal acidic residues. This carboxypeptidase was expressed as a recombinant proprotein in the yeast Pichia pastoris. The expressed protein could be activated by treatment with bovine trypsin; degradation of bound pro-region, rather than cleavage of pro-region from mature protein, was the rate-limiting step in activation. Activated HaCA42 carboxypeptidase hydrolysed a synthetic substrate for glutamate carboxypeptidases (FAEE, C-terminal Glu), but did not hydrolyse substrates for carboxypeptidase A or B (FAPP or FAAK, C-terminal Phe or Lys) or methotrexate, cleaved by clan MH glutamate carboxypeptidases. The enzyme was highly specific for C-terminal glutamate in peptide substrates, with slow hydrolysis of C-terminal aspartate also observed. Glutamate carboxypeptidase activity was present in larval gut extract from H. armigera. The HaCA42 protein is the first glutamate-specific metallocarboxypeptidase from clan MC to be identified and characterized. The genome of Drosophila melanogaster contains genes encoding enzymes with similar sequences and predicted specificity, and a cDNA encoding a similar enzyme has been isolated from gut tissue in tsetse fly. We suggest that digestive carboxypeptidases with sequence similarity to the classical mammalian enzymes, but with specificity towards C-terminal glutamate, are widely distributed in insects.
Munteanu, Cristian R; Gonzalez-Diaz, Humberto; Garcia, Rafael; Loza, Mabel; Pazos, Alejandro
2015-01-01
Encoding molecular information into molecular descriptors is the first step of in silico chemoinformatics methods in drug design. Machine learning methods are a sophisticated means of finding prediction models for specific biological properties of molecules. These models connect molecular structure information, such as atom connectivity (molecular graphs) or the physical-chemical properties of an atom or group of atoms, to the molecular activity (Quantitative Structure-Activity Relationship, QSAR). Because proteins are complex, predicting their activity is a complicated task and interpreting the models is more difficult. The current review presents a series of 11 prediction models for proteins, implemented as free Web tools on an Artificial Intelligence Model Server in Biosciences, Bio-AIMS (http://bio-aims.udc.es/TargetPred.php). Six tools predict protein activity, two models evaluate drug-protein target interactions and the other three calculate protein-protein interactions. The input information is based on the protein 3D structure for nine models, 1D peptide amino acid sequence for three tools and drug SMILES formulas for two servers. The molecular graph descriptor-based machine learning models could be useful tools for in silico screening of new peptides/proteins as future drug targets for specific treatments.
Global analyses of Ceratocystis cacaofunesta mitochondria: from genome to proteome.
Ambrosio, Alinne Batista; do Nascimento, Leandro Costa; Oliveira, Bruno V; Teixeira, Paulo José P L; Tiburcio, Ricardo A; Toledo Thomazella, Daniela P; Leme, Adriana F P; Carazzolle, Marcelo F; Vidal, Ramon O; Mieczkowski, Piotr; Meinhardt, Lyndel W; Pereira, Gonçalo A G; Cabrera, Odalys G
2013-02-11
The ascomycete fungus Ceratocystis cacaofunesta is the causal agent of wilt disease in cacao, which results in significant economic losses in the affected producing areas. Despite the economic importance of the Ceratocystis complex of species, no genomic data are available for any of its members. Given that mitochondria play important roles in fungal virulence and the susceptibility/resistance of fungi to fungicides, we performed the first functional analysis of this organelle in Ceratocystis using integrated "omics" approaches. The C. cacaofunesta mitochondrial genome (mtDNA) consists of a single, 103,147-bp circular molecule, making this the second largest mtDNA among the Sordariomycetes. Bioinformatics analysis revealed the presence of 15 conserved genes and 37 intronic open reading frames in C. cacaofunesta mtDNA. Here, we predicted the mitochondrial proteome (mtProt) of C. cacaofunesta, which is comprised of 1,124 polypeptides - 52 proteins that are mitochondrially encoded and 1,072 that are nuclearly encoded. Transcriptome analysis revealed 33 probable novel genes. Comparisons among the Gene Ontology results of the predicted mtProt of C. cacaofunesta, Neurospora crassa and Saccharomyces cerevisiae revealed no significant differences. Moreover, C. cacaofunesta mitochondria were isolated, and the mtProt was subjected to mass spectrometric analysis. The experimental proteome validated 27% of the predicted mtProt. Our results confirmed the existence of 110 hypothetical proteins and 7 novel proteins of which 83 and 1, respectively, had putative mitochondrial localization. The present study provides the first partial genomic analysis of a species of the Ceratocystis genus and the first predicted mitochondrial protein inventory of a phytopathogenic fungus. 
In addition to the known mitochondrial role in pathogenicity, our results demonstrated that the global function analysis of this organelle is similar in pathogenic and non-pathogenic fungi, suggesting that its relevance in the lifestyle of these organisms should be based on a small number of specific proteins and/or with respect to differential gene regulation. In this regard, particular interest should be directed towards mitochondrial proteins with unknown function and the novel protein that might be specific to this species. Further functional characterization of these proteins could enhance our understanding of the role of mitochondria in phytopathogenicity. PMID:23394930
Zhao, Chaoyang; Shukle, Richard; Navarro-Escalante, Lucio; Chen, Mingshun; Richards, Stephen; Stuart, Jeffrey J
2016-01-01
The genetic tractability of the Hessian fly (HF, Mayetiola destructor) provides an opportunity to investigate the mechanisms insects use to induce plant gall formation. Here we demonstrate that capacity using the newly sequenced HF genome by identifying the gene (vH24) that elicits effector-triggered immunity in wheat (Triticum spp.) seedlings carrying HF resistance gene H24. vH24 was mapped within a 230-kb genomic fragment near the telomere of HF chromosome X1. That fragment contains only 21 putative genes. The best candidate vH24 gene in this region encodes a protein containing a secretion signal and a type-2 serine/threonine protein phosphatase (PP2C) domain. This gene has an H24-virulence associated insertion in its promoter that appears to silence transcription of the gene in H24-virulent larvae. Candidate vH24 is a member of a small family of genes that encode secretion signals and PP2C domains. It belongs to the fraction of genes in the HF genome previously predicted to encode effector proteins. Because PP2C proteins are not normally secreted, our results suggest that these are PP2C effectors that HF larvae inject into wheat cells to redirect, or interfere, with wheat signal transduction pathways. Copyright © 2015 Elsevier Ltd. All rights reserved.
Shinzato, Naoya; Enoki, Miho; Sato, Hiroaki; Nakamura, Kohei; Matsui, Toru; Kamagata, Yoichi
2008-10-01
Two methyl coenzyme M reductases (MCRs) encoded by the mcr and mrt operons of the hydrogenotrophic methanogen Methanothermobacter thermautotrophicus ΔH are expressed in response to H2 availability. In the present study, cis elements and trans-acting factors responsible for the gene expression of MCRs were investigated by using electrophoretic mobility shift assay (EMSA) and affinity particle purification. A survey of their operator regions by EMSA with protein extracts from mrt-expressing cultures restricted them to 46- and 41-bp-long mcr and mrt upstream regions, respectively. Affinity particle purification of DNA-binding proteins conjugated with putative operator regions resulted in the retrieval of a protein attributed to IMP dehydrogenase-related protein VII (IMPDH VII). IMPDH VII is predicted to have a winged helix-turn-helix DNA-binding motif and two cystathionine beta-synthase domains, and it has been suspected to be an energy-sensing module. EMSA with oligonucleotide probes with unusual sequences showed that the binding site of IMPDH VII mostly overlaps the factor B-responsible element-TATA box of the mcr operon. The results presented here suggest that IMPDH VII encoded by MTH126 is a plausible candidate for the transcriptional regulator of the mcr operon in this methanogen.
Rice Ribosomal Protein Large Subunit Genes and Their Spatio-temporal and Stress Regulation
Moin, Mazahar; Bakshi, Achala; Saha, Anusree; Dutta, Mouboni; Madhav, Sheshu M.; Kirti, P. B.
2016-01-01
Ribosomal proteins (RPs) are well-known for their role in mediating protein synthesis and maintaining the stability of the ribosomal complex, which includes small and large subunits. In the present investigation, in a genome-wide survey, we predicted that the large subunit of rice ribosomes is encoded by at least 123 genes including individual gene copies, distributed throughout the 12 chromosomes. We selected 34 candidate genes, each having 2–3 identical copies, for a detailed characterization of their gene structures, protein properties, cis-regulatory elements and comprehensive expression analysis. RPL proteins appear to be involved in interactions with other RP and non-RP proteins and their encoded RNAs have a higher content of alpha-helices in their predicted secondary structures. The majority of RPs have binding sites for metal and non-metal ligands. Native expression profiling of 34 ribosomal protein large (RPL) subunit genes in tissues covering the major stages of rice growth shows that they are predominantly expressed in vegetative tissues and seedlings followed by meiotically active tissues like flowers. The putative promoter regions of these genes also carry cis-elements that respond specifically to stress and signaling molecules. All the 34 genes responded differentially to the abiotic stress treatments. Phytohormone and cold treatments induced significant up-regulation of several RPL genes, while heat and H2O2 treatments down-regulated a majority of them. Furthermore, infection with a bacterial pathogen, Xanthomonas oryzae, which causes leaf blight also induced the expression of 80% of the RPL genes in leaves. Although the expression of RPL genes was detected in all the tissues studied, they are highly responsive to stress and signaling molecules indicating that their encoded proteins appear to have roles in stress amelioration besides house-keeping. 
This shows that the RPL gene family is a valuable resource for manipulation of stress tolerance in rice and other crops, which may be achieved by overexpressing and raising independent transgenic plants carrying the genes that became up-regulated significantly and instantaneously. PMID:27605933
cDNA sequence and expression of a cold-responsive gene in Citrus unshiu.
Hara, M; Wakasugi, Y; Ikoma, Y; Yano, M; Ogawa, K; Kuboi, T
1999-02-01
A cDNA clone encoding a protein (CuCOR19), the sequence of which is similar to Poncirus COR19, of the dehydrin family was isolated from the epicarp of Citrus unshiu. The molecular mass of the predicted protein was 18,980 daltons. CuCOR19 was highly hydrophilic and contained three repeating elements including Lys-rich motifs. The gene expression in leaves increased by cold stress.
The genome of Brucella melitensis.
DelVecchio, Vito G; Kapatral, Vinayak; Elzer, Philip; Patra, Guy; Mujer, Cesar V
2002-12-20
The genome of Brucella melitensis strain 16M was sequenced and contained 3,294,931 bp distributed over two circular chromosomes. Chromosome I comprised 2,117,144 bp and chromosome II 1,177,787 bp. A total of 3,198 ORFs were predicted. The origins of replication of the chromosomes are similar to each other and to those of other alpha-proteobacteria. Housekeeping genes, such as those involved in DNA replication, protein synthesis, core metabolism, and cell-wall biosynthesis, were found on both chromosomes. Genes encoding adhesins, invasins, and hemolysins were also identified.
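As a quick consistency check on the figures quoted in this abstract, the short sketch below sums the two chromosome sizes and derives an approximate ORF density. The variable names are illustrative, and the density is a derived figure that the abstract itself does not report.

```python
# Figures as quoted in the abstract.
chromosome_bp = {"I": 2_117_144, "II": 1_177_787}
reported_total_bp = 3_294_931
predicted_orfs = 3_198

# The two chromosomes should add up to the reported genome size.
total_bp = sum(chromosome_bp.values())
sizes_consistent = total_bp == reported_total_bp

# Derived, not reported: predicted ORFs per kilobase of genome.
orfs_per_kb = predicted_orfs / (total_bp / 1000)

print(f"total = {total_bp:,} bp, consistent with abstract: {sizes_consistent}")
print(f"ORF density = {orfs_per_kb:.2f} ORFs per kb")
```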
Improving prediction of heterodimeric protein complexes using combination with pairwise kernel.
Ruan, Peiying; Hayashida, Morihiro; Akutsu, Tatsuya; Vert, Jean-Philippe
2018-02-19
Since many proteins become functional only after they interact with their partner proteins and form protein complexes, it is essential to identify the sets of proteins that form complexes. Therefore, several computational methods have been proposed to predict complexes from the topology and structure of experimental protein-protein interaction (PPI) networks. These methods work well to predict complexes involving at least three proteins, but generally fail at identifying complexes involving only two different proteins, called heterodimeric complexes or heterodimers. There is, however, an urgent need for efficient methods to predict heterodimers, since the majority of known protein complexes are heterodimers. In this paper, we use three promising kernel functions: the Min kernel and two pairwise kernels, the Metric Learning Pairwise Kernel (MLPK) and the Tensor Product Pairwise Kernel (TPPK). We also consider the normalized form of the Min kernel. We then combine the Min kernel or its normalized form with one of the pairwise kernels by plugging. We applied kernels based on PPI, domain, phylogenetic profile, and subcellular localization properties to predicting heterodimers. We then evaluated our method by employing C-Support Vector Classification (C-SVC), carrying out 10-fold cross-validation, and calculating the average F-measures. The results suggest that the combination of the normalized Min kernel and MLPK leads to the best F-measure and improves on the performance of our previous work, which had been the best existing method so far. We propose new methods to predict heterodimers, using a machine learning-based approach. We train a support vector machine (SVM) to discriminate interacting vs non-interacting protein pairs, based on information extracted from PPI, domain, phylogenetic profiles and subcellular localization. We evaluate in detail new kernel functions to encode these data, and report prediction performance that outperforms the state of the art.
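The kernels named in this abstract can be sketched in a few lines. The sketch below uses the standard published definitions of TPPK and MLPK on top of an arbitrary base kernel, and it assumes the usual histogram-intersection form for the Min kernel (sum of component-wise minima). How the paper "plugs" the Min kernel into a pairwise kernel is not specified in the abstract, so here the base kernel is simply passed in as an argument; the feature vectors are made-up toy values.

```python
import math


def min_kernel(x, y):
    """Min kernel, assumed here to be the sum of component-wise minima."""
    return sum(min(a, b) for a, b in zip(x, y))


def normalized(kernel):
    """Return the cosine-style normalization k(x, y) / sqrt(k(x, x) * k(y, y))."""
    def k(x, y):
        denom = math.sqrt(kernel(x, x) * kernel(y, y))
        return kernel(x, y) / denom if denom > 0 else 0.0
    return k


def tppk(k, pair1, pair2):
    """Tensor Product Pairwise Kernel between protein pairs (a, b) and (c, d)."""
    (a, b), (c, d) = pair1, pair2
    return k(a, c) * k(b, d) + k(a, d) * k(b, c)


def mlpk(k, pair1, pair2):
    """Metric Learning Pairwise Kernel between protein pairs (a, b) and (c, d)."""
    (a, b), (c, d) = pair1, pair2
    return (k(a, c) - k(a, d) - k(b, c) + k(b, d)) ** 2


# Hypothetical per-protein feature vectors (for example, domain counts).
p, q, r, s = [1, 0, 2], [0, 1, 1], [1, 1, 0], [2, 0, 1]

tppk_value = tppk(min_kernel, (p, q), (r, s))
mlpk_value = mlpk(min_kernel, (p, q), (r, s))
norm_self = normalized(min_kernel)(p, p)
```

Both pairwise kernels are symmetric in the order of the two proteins within a pair, which is the property that makes them suitable for unordered candidate dimers. A matrix of such values could then be supplied to an SVM implementation that accepts precomputed kernels.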
Multi-Omics Driven Assembly and Annotation of the Sandalwood (Santalum album) Genome.
Mahesh, Hirehally Basavarajegowda; Subba, Pratigya; Advani, Jayshree; Shirke, Meghana Deepak; Loganathan, Ramya Malarini; Chandana, Shankara Lingu; Shilpa, Siddappa; Chatterjee, Oishi; Pinto, Sneha Maria; Prasad, Thottethodi Subrahmanya Keshava; Gowda, Malali
2018-04-01
Indian sandalwood (Santalum album) is an important tropical evergreen tree known for its fragrant heartwood-derived essential oil and its valuable carving wood. Here, we applied an integrated genomic, transcriptomic, and proteomic approach to assemble and annotate the Indian sandalwood genome. Our genome sequencing resulted in the establishment of a draft map of the smallest genome for any woody tree species to date (221 Mb). The genome annotation predicted 38,119 protein-coding genes and 27.42% repetitive DNA elements. In-depth proteome analysis revealed the identities of 72,325 unique peptides, which confirmed 10,076 of the predicted genes. The addition of transcriptomic and proteogenomic approaches resulted in the identification of 53 novel proteins and 34 gene-correction events that were missed by genomic approaches. Proteogenomic analysis also helped in reassigning 1,348 potential noncoding RNAs as bona fide protein-coding messenger RNAs. Gene expression patterns at the RNA and protein levels indicated that peptide sequencing was useful in capturing proteins encoded by nuclear and organellar genomes alike. Mass spectrometry-based proteomic evidence provided an unbiased approach toward the identification of proteins encoded by organellar genomes. Such proteins are often missed in transcriptome data sets due to the enrichment of only messenger RNAs that contain poly(A) tails. Overall, the use of integrated omic approaches enhanced the quality of the assembly and annotation of this nonmodel plant genome. The availability of genomic, transcriptomic, and proteomic data will enhance genomics-assisted breeding, germplasm characterization, and conservation of sandalwood trees. © 2018 American Society of Plant Biologists. All Rights Reserved.
Basse, Christoph W.; Kerschbamer, Christine; Brustmann, Markus; Altmann, Thomas; Kahmann, Regine
2002-01-01
We have identified a gene (udh1) in the basidiomycete Ustilago maydis that is induced during the parasitic interaction with its host plant maize (Zea mays). udh1 encodes a protein with high similarity to mammalian and plant 5α-steroid reductases. Udh1 differs from those of known 5α-steroid reductases by six additional domains, partially predicted to be membrane-spanning. A fusion protein of Udh1 and the green fluorescent protein provided evidence for endoplasmic reticulum localization in U. maydis. The function of the Udh1 protein was demonstrated by complementing Arabidopsis det2-1 mutants, which display a dwarf phenotype due to a mutation in the 5α-steroid reductase encoding DET2 gene. det2-1 mutant plants expressing either the udh1 or the DET2 gene controlled by the cauliflower mosaic virus 35S promoter differed from wild-type Columbia plants by accelerated stem growth, flower and seed development and a reduction in size and number of rosette leaves. The accelerated growth phenotype of udh1 transgenic plants was stably inherited and was favored under reduced light conditions. Truncation of the N-terminal 70 amino acids of the Udh1 protein abolished the ability to restore growth in det2-1 plants. Our results demonstrate the existence of a 5α-steroid reductase encoding gene in fungi and suggest a common ancestor between fungal, plant, and mammalian proteins. PMID:12068114
Basse, Christoph W; Kerschbamer, Christine; Brustmann, Markus; Altmann, Thomas; Kahmann, Regine
2002-06-01
We have identified a gene (udh1) in the basidiomycete Ustilago maydis that is induced during the parasitic interaction with its host plant maize (Zea mays). udh1 encodes a protein with high similarity to mammalian and plant 5α-steroid reductases. Udh1 differs from known 5α-steroid reductases by six additional domains, partially predicted to be membrane-spanning. A fusion protein of Udh1 and the green fluorescent protein provided evidence for endoplasmic reticulum localization in U. maydis. The function of the Udh1 protein was demonstrated by complementing Arabidopsis det2-1 mutants, which display a dwarf phenotype due to a mutation in the 5α-steroid reductase encoding DET2 gene. det2-1 mutant plants expressing either the udh1 or the DET2 gene controlled by the cauliflower mosaic virus 35S promoter differed from wild-type Columbia plants by accelerated stem growth, flower and seed development and a reduction in size and number of rosette leaves. The accelerated growth phenotype of udh1 transgenic plants was stably inherited and was favored under reduced light conditions. Truncation of the N-terminal 70 amino acids of the Udh1 protein abolished the ability to restore growth in det2-1 plants. Our results demonstrate the existence of a 5α-steroid reductase encoding gene in fungi and suggest a common ancestor between fungal, plant, and mammalian proteins.
Current Understanding of Usher Syndrome Type II
Yang, Jun; Wang, Le; Song, Hongman; Sokolov, Maxim
2012-01-01
Usher syndrome is the most common form of combined deafness-blindness caused by genetic mutations. To date, three genes have been identified underlying the most prevalent form of Usher syndrome, the type II form (USH2). The proteins encoded by these genes have been shown to form a complex in vivo. This complex is localized mainly at the periciliary membrane complex in photoreceptors and the ankle-link of the stereocilia in hair cells. Many proteins have been found to interact with USH2 proteins in vitro, suggesting that they are potential additional components of this USH2 complex and that the genes encoding these proteins may be candidate USH2 genes. However, further investigations are critical to establish their existence in the USH2 complex in vivo. Based on the predicted functional domains in USH2 proteins, their cellular localizations in photoreceptors and hair cells, the observed phenotypes in USH2 mutant mice, and current knowledge of diseases similar to USH2, putative biological functions of the USH2 complex have been proposed. Finally, therapeutic approaches for this group of diseases are now being actively explored. PMID:22201796
Two Membrane-Associated Tyrosine Phosphatase Homologs Potentiate C. elegans AKT-1/PKB Signaling
Hu, Patrick J; Xu, Jinling; Ruvkun, Gary
2006-01-01
Akt/protein kinase B (PKB) functions in conserved signaling cascades that regulate growth and metabolism. In humans, Akt/PKB is dysregulated in diabetes and cancer; in Caenorhabditis elegans, Akt/PKB functions in an insulin-like signaling pathway to regulate larval development. To identify molecules that modulate C. elegans Akt/PKB signaling, we performed a genetic screen for enhancers of the akt-1 mutant phenotype (eak). We report the analysis of three eak genes. eak-6 and eak-5/sdf-9 encode protein tyrosine phosphatase homologs; eak-4 encodes a novel protein with an N-myristoylation signal. All three genes are expressed primarily in the two endocrine XXX cells, and their predicted gene products localize to the plasma membrane. Genetic evidence indicates that these proteins function in parallel to AKT-1 to inhibit the FoxO transcription factor DAF-16. These results define two membrane-associated protein tyrosine phosphatase homologs that may potentiate C. elegans Akt/PKB signaling by cell autonomous and cell nonautonomous mechanisms. Similar molecules may modulate Akt/PKB signaling in human endocrine tissues. PMID:16839187
The Xylella fastidiosa PD1063 Protein Is Secreted in Association with Outer Membrane Vesicles
Pierce, Brittany K.; Voegel, Tanja; Kirkpatrick, Bruce C.
2014-01-01
Xylella fastidiosa is a gram-negative, xylem-limited plant pathogenic bacterium that causes disease in a variety of economically important agricultural crops including Pierce's disease of grapevines. Xylella fastidiosa biofilms formed in the xylem vessels of plants play a key role in early colonization and pathogenicity by providing a protected niche and enhanced cell survival. Here we investigate the role of Xylella fastidiosa PD1063, the predicted ortholog of Xanthomonas oryzae pv. oryzae PXO_03968, which encodes an outer membrane protein. To assess the function of the Xylella fastidiosa ortholog, we created Xylella fastidiosa mutants deleted for PD1063 and then assessed biofilm formation, cell-cell aggregation and cell growth in vitro. We also assessed disease severity and pathogen titers in grapevines mechanically inoculated with the Xylella fastidiosa PD1063 mutant. We found a significant decrease in cell-cell aggregation among PD1063 mutants but no differences in cell growth, biofilm formation, disease severity or titers in planta. Based on the demonstration that Xanthomonas oryzae pv. oryzae PXO_03968 encodes an outer membrane protein, secreted in association with outer membrane vesicles, we predicted that PD1063 would also be secreted in a similar manner. Using anti-PD1063 antibodies, we found PD1063 in the supernatant and secreted in association with outer membrane vesicles. PD1063 purified from the supernatant, outer membrane fractions and outer membrane vesicles was 19.2 kD, corresponding to the predicted size of the processed protein. Our findings suggest Xylella fastidiosa PD1063 is not essential for development of Pierce's disease in Vitis vinifera grapevines although further research is required to determine the function of the PD1063 outer membrane protein in Xylella fastidiosa. PMID:25426629
A putative regulatory genetic locus modulates virulence in the pathogen Leptospira interrogans.
Eshghi, Azad; Becam, Jérôme; Lambert, Ambroise; Sismeiro, Odile; Dillies, Marie-Agnès; Jagla, Bernd; Wunder, Elsio A; Ko, Albert I; Coppee, Jean-Yves; Goarant, Cyrille; Picardeau, Mathieu
2014-06-01
Limited research has been conducted on the role of transcriptional regulators in relation to virulence in Leptospira interrogans, the etiological agent of leptospirosis. Here, we identify an L. interrogans locus that encodes a sensor protein, an anti-sigma factor antagonist, and two genes encoding proteins of unknown function. Transposon insertion into the gene encoding the sensor protein led to dampened transcription of the other 3 genes in this locus. This lb139 insertion mutant (the lb139(-) mutant) displayed attenuated virulence in the hamster model of infection and reduced motility in vitro. Whole-transcriptome analyses using RNA sequencing revealed the downregulation of 115 genes and the upregulation of 28 genes, with an overrepresentation of gene products functioning in motility and signal transduction and numerous gene products with unknown functions, predicted to be localized to the extracellular space. Another significant finding encompassed suppressed expression of the majority of the genes previously demonstrated to be upregulated at physiological osmolarity, including the sphingomyelinase C precursor Sph2 and LigB. We provide insight into a possible requirement for transcriptional regulation as it relates to leptospiral virulence and suggest various biological processes that are affected due to the loss of native expression of this genetic locus.
ssrA (tmRNA) Plays a Role in Salmonella enterica Serovar Typhimurium Pathogenesis
Julio, Steven M.; Heithoff, Douglas M.; Mahan, Michael J.
2000-01-01
Escherichia coli ssrA encodes a small stable RNA molecule, tmRNA, that has many diverse functions, including tagging abnormal proteins for degradation, supporting phage growth, and modulating the activity of DNA binding proteins. Here we show that ssrA plays a role in Salmonella enterica serovar Typhimurium pathogenesis and in the expression of several genes known to be induced during infection. Moreover, the phage-like attachment site, attL, encoded within ssrA, serves as the site of integration of a region of Salmonella-specific sequence; adjacent to the 5′ end of ssrA is another region of Salmonella-specific sequence with extensive homology to predicted proteins encoded within the unlinked Salmonella pathogenicity island SPI4. S. enterica serovar Typhimurium ssrA mutants fail to support the growth of phage P22 and are delayed in their ability to form viable phage particles following induction of a phage P22 lysogen. These data indicate that ssrA plays a role in the pathogenesis of Salmonella, serves as an attachment site for Salmonella-specific sequences, and is required for the growth of phage P22. PMID:10692360
Bell, Andrew; Moreau, Carol; Chinoy, Catherine; Spanner, Rebecca; Dalmais, Marion; Le Signor, Christine; Bendahmane, Abdel; Klenell, Markus; Domoney, Claire
2015-12-01
Among a set of genes in pea (Pisum sativum L.) that were induced under drought-stress growth conditions, one encoded a protein with significant similarity to a regulator of chlorophyll catabolism, SGR. This gene, SGRL, is distinct from SGR in genomic location, encoded carboxy-terminal motif, and expression through plant and seed development. Divergence of the two encoded proteins is associated with a loss of similarity in intron/exon gene structure. Transient expression of SGRL in leaves of Nicotiana benthamiana promoted the degradation of chlorophyll, in a manner that was distinct from that shown by SGR. Removal of a predicted transmembrane domain from SGRL reduced its activity in transient expression assays, although variants with and without this domain reduced SGR-induced chlorophyll degradation, indicating that the effects of the two proteins are not additive. The combined data suggest that the function of SGRL during growth and development is in chlorophyll re-cycling, and its mode of action is distinct from that of SGR. Studies of pea sgrL mutants revealed that plants had significantly lower stature and yield, a likely consequence of reduced photosynthetic efficiencies in mutant compared with control plants under conditions of high light intensity.
Ferreira, Célia; van Voorst, Frank; Martins, António; Neves, Luisa; Oliveira, Rui; Kielland-Brandt, Morten C.; Lucas, Cândida; Brandt, Anders
2005-01-01
Glycerol and other polyols are used as osmoprotectants by many organisms. Several yeasts and other fungi can take up glycerol by proton symport. To identify genes involved in active glycerol uptake in Saccharomyces cerevisiae we screened a deletion mutant collection comprising 321 genes encoding proteins with 6 or more predicted transmembrane domains for impaired growth on glycerol medium. Deletion of STL1, which encodes a member of the sugar transporter family, eliminates active glycerol transport. Stl1p is present in the plasma membrane in S. cerevisiae during conditions where glycerol symport is functional. Both the Stl1 protein and the active glycerol transport are subject to glucose-induced inactivation, following identical patterns. Furthermore, the Stl1 protein and the glycerol symporter activity are strongly but transiently induced when cells are subjected to osmotic shock. STL1 was heterologously expressed in Schizosaccharomyces pombe, a yeast that does not contain its own active glycerol transport system. In S. pombe, STL1 conferred the ability to take up glycerol against a concentration gradient in a proton motive force-dependent manner. We conclude that the glycerol proton symporter in S. cerevisiae is encoded by STL1. PMID:15703210
Exploring Mouse Protein Function via Multiple Approaches.
Huang, Guohua; Chu, Chen; Huang, Tao; Kong, Xiangyin; Zhang, Yunhua; Zhang, Ning; Cai, Yu-Dong
2016-01-01
Although the number of available protein sequences is growing exponentially, functional protein annotations lag far behind. Therefore, accurate identification of protein functions remains one of the major challenges in molecular biology. In this study, we presented a novel approach to predict mouse protein functions. The approach was a sequential combination of a similarity-based approach, an interaction-based approach and a pseudo amino acid composition-based approach. The method achieved an accuracy of about 0.8450 for the 1st-order predictions in the leave-one-out and ten-fold cross-validations. For the results yielded by the leave-one-out cross-validation, although the similarity-based approach alone achieved an accuracy of 0.8756, it was unable to predict the functions of proteins with no homologues. Comparatively, the pseudo amino acid composition-based approach alone reached an accuracy of 0.6786. Although the accuracy was lower than that of the previous approach, it could predict the functions of almost all proteins, even proteins with no homologues. Therefore, the combined method balanced the advantages and disadvantages of both approaches to achieve efficient performance. Furthermore, the results yielded by the ten-fold cross-validation indicate that the combined method is still effective and stable when no close homologs are available. However, the accuracy of the predicted functions can only be determined according to known protein functions based on current knowledge. Many protein functions remain unknown. By exploring the functions of proteins for which the 1st-order predicted functions are wrong but the 2nd-order predicted functions are correct, the 1st-order wrongly predicted functions were shown to be closely associated with the genes encoding the proteins. The so-called wrongly predicted functions could also potentially be correct upon future experimental verification. Therefore, the accuracy of the presented method may be much higher in reality.
Exploring Mouse Protein Function via Multiple Approaches
Huang, Tao; Kong, Xiangyin; Zhang, Yunhua; Zhang, Ning
2016-01-01
Although the number of available protein sequences is growing exponentially, functional protein annotations lag far behind. Therefore, accurate identification of protein functions remains one of the major challenges in molecular biology. In this study, we presented a novel approach to predict mouse protein functions. The approach was a sequential combination of a similarity-based approach, an interaction-based approach and a pseudo amino acid composition-based approach. The method achieved an accuracy of about 0.8450 for the 1st-order predictions in the leave-one-out and ten-fold cross-validations. For the results yielded by the leave-one-out cross-validation, although the similarity-based approach alone achieved an accuracy of 0.8756, it was unable to predict the functions of proteins with no homologues. Comparatively, the pseudo amino acid composition-based approach alone reached an accuracy of 0.6786. Although the accuracy was lower than that of the previous approach, it could predict the functions of almost all proteins, even proteins with no homologues. Therefore, the combined method balanced the advantages and disadvantages of both approaches to achieve efficient performance. Furthermore, the results yielded by the ten-fold cross-validation indicate that the combined method is still effective and stable when no close homologs are available. However, the accuracy of the predicted functions can only be determined according to known protein functions based on current knowledge. Many protein functions remain unknown. By exploring the functions of proteins for which the 1st-order predicted functions are wrong but the 2nd-order predicted functions are correct, the 1st-order wrongly predicted functions were shown to be closely associated with the genes encoding the proteins. The so-called wrongly predicted functions could also potentially be correct upon future experimental verification. Therefore, the accuracy of the presented method may be much higher in reality. PMID:27846315
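The "sequential combination" described in this abstract can be read as a fallback chain: a later approach is consulted only when an earlier one returns no answer. The sketch below illustrates that reading only. The fallback interpretation, the function names, the lookup tables and the toy protein IDs are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Callable, Optional, Sequence

# A predictor maps a protein ID to a function label, or None if it cannot answer.
Predictor = Callable[[str], Optional[str]]


def predict_sequentially(protein_id: str, predictors: Sequence[Predictor]) -> Optional[str]:
    """Return the first non-None prediction, trying the predictors in order."""
    for predictor in predictors:
        label = predictor(protein_id)
        if label is not None:
            return label
    return None


# Hypothetical toy lookup tables standing in for real homology and interaction data.
HOMOLOG_FUNCTIONS = {"P1": "kinase"}
INTERACTION_FUNCTIONS = {"P2": "transporter"}


def by_similarity(protein_id: str) -> Optional[str]:
    # Answers only when a homologue with a known function exists.
    return HOMOLOG_FUNCTIONS.get(protein_id)


def by_interaction(protein_id: str) -> Optional[str]:
    # Answers only when an annotated interaction partner exists.
    return INTERACTION_FUNCTIONS.get(protein_id)


def by_composition(protein_id: str) -> Optional[str]:
    # Stand-in for a composition-based classifier, which can always give an answer.
    return "binding"


PIPELINE = [by_similarity, by_interaction, by_composition]

if __name__ == "__main__":
    for pid in ("P1", "P2", "P3"):
        print(pid, predict_sequentially(pid, PIPELINE))
```

Ordering the chain from the most accurate but least general predictor to the most general one mirrors the trade-off the abstract reports between the similarity-based and composition-based approaches.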
Kozak, Natalia A; Buss, Meghan; Lucas, Claressa E; Frace, Michael; Govil, Dhwani; Travis, Tatiana; Olsen-Rasmussen, Melissa; Benson, Robert F; Fields, Barry S
2010-02-01
Legionella longbeachae causes most cases of legionellosis in Australia and may be underreported worldwide due to the lack of L. longbeachae-specific diagnostic tests. L. longbeachae displays distinctive differences in intracellular trafficking, caspase 1 activation, and infection in mouse models compared to Legionella pneumophila, yet these two species have indistinguishable clinical presentations in humans. Unlike other legionellae, which inhabit freshwater systems, L. longbeachae is found predominantly in moist soil. In this study, we sequenced and annotated the genome of an L. longbeachae clinical isolate from Oregon, isolate D-4968, and compared it to the previously published genomes of L. pneumophila. The results revealed that the D-4968 genome is larger than the L. pneumophila genome and has a gene order that is different from that of the L. pneumophila genome. Genes encoding structural components of type II, type IV Lvh, and type IV Icm/Dot secretion systems are conserved. In contrast, only 42/140 homologs of genes encoding L. pneumophila Icm/Dot substrates have been found in the D-4968 genome. L. longbeachae encodes numerous proteins with eukaryotic motifs and eukaryote-like proteins unique to this species, including 16 ankyrin repeat-containing proteins and a novel U-box protein. We predict that these proteins are secreted by the L. longbeachae Icm/Dot secretion system. In contrast to the L. pneumophila genome, the L. longbeachae D-4968 genome does not contain flagellar biosynthesis genes, yet it contains a chemotaxis operon. The lack of a flagellum explains the failure of L. longbeachae to activate caspase 1 and trigger pyroptosis in murine macrophages. These unique features of L. longbeachae may reflect adaptation of this species to life in soil.
Frietze, Kathryn M.; Campos, Samuel K.; Kajon, Adriana E.
2010-01-01
Subspecies B1 human adenoviruses (HAdV-B1s) are important causative agents of acute respiratory disease, but the molecular bases of their distinct pathobiology are still poorly understood. Marked differences in genetic content between HAdV-B1s and the well-characterized HAdV-Cs that may contribute to distinct pathogenic properties map to the E3 region. Between the highly conserved E3-19K and E3-10.4K/RIDα open reading frames (ORFs), and in the same location as the HAdV-C ADP/E3-11.6K ORF, HAdV-B1s carry ORFs E3-20.1K and E3-20.5K and a polymorphic third ORF, designated E3-10.9K, that varies in the size of its predicted product among HAdV-B1 serotypes and genomic variants. As an initial effort to define the function of the E3-10.9K ORF, we carried out a biochemical characterization of E3-10.9K-encoded orthologous proteins and investigated their expression in infected cells. Sequence-based predictions suggested that E3-10.9K orthologs with a hydrophobic domain are integral membrane proteins. Ectopically expressed, C-terminally tagged (with enhanced green fluorescent protein [EGFP]) E3-10.9K and E3-9K localized primarily to the plasma membrane, while E3-7.7K localized primarily to a juxtanuclear compartment that could not be identified. EGFP fusion proteins with a hydrophobic domain were N and O glycosylated. EGFP-tagged E3-4.8K, which lacked the hydrophobic domain, displayed diffuse cellular localization similar to that of the EGFP control. E3-10.9K transcripts from the major late promoter were detected at late time points postinfection. A C-terminally hemagglutinin-tagged version of E3-9K was detected by immunoprecipitation at late times postinfection in the membrane fraction of mutant virus-infected cells. These data suggest a role for ORF E3-10.9K-encoded proteins at late stages of HAdV-B1 replication, with potentially important functional implications for the documented ORF polymorphism. PMID:20739542
Naqvi, Ahmad Abu Turab; Shahbaaz, Mohd; Ahmad, Faizan; Hassan, Md Imtaiyaz
2015-01-01
Syphilis is a globally occurring venereal disease, and its infection is propagated through sexual contact. The causative agent of syphilis, Treponema pallidum ssp. pallidum, a Gram-negative spirochaete, is an obligate human parasite. The genome of the T. pallidum ssp. pallidum SS14 strain (RefSeq NC_010741.1) encodes 1,027 proteins, of which 444 are known as hypothetical proteins (HPs), i.e., proteins of unknown function. Here, we performed functional annotation of the HPs of T. pallidum ssp. pallidum using various databases, domain architecture predictors, protein function annotators and clustering tools. We analyzed the sequences of the 444 HPs and predicted the function of 207 HPs with a high level of confidence; the functions of the remaining 237 HPs were predicted with lower accuracy. In the annotated group of HPs we found various enzymes, transporters and binding proteins that may facilitate survival of the pathogen and may therefore be possible molecular targets. Our comprehensive analysis helps to understand the mechanism of pathogenesis and may suggest novel potential therapeutic interventions.
Transgenic cells with increased plastoquinone levels and methods of use
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sayre, Richard T.; Subramanian, Sowmya; Cahoon, Edgar
Disclosed herein are transgenic cells expressing a heterologous nucleic acid encoding a prephenate dehydrogenase (PDH) protein, a heterologous nucleic acid encoding a homogentisate solanesyl transferase (HST) protein, a heterologous nucleic acid encoding a deoxyxylulose phosphate synthase (DXS) protein, or a combination of two or more thereof. In particular examples, the disclosed transgenic cells have increased plastoquinone levels. Also disclosed are methods of increasing cell growth rates or production of biomass by cultivating transgenic cells expressing a heterologous nucleic acid encoding a PDH protein, a heterologous nucleic acid encoding an HST protein, a heterologous nucleic acid encoding a DXS protein, or a combination of two or more thereof under conditions sufficient to produce cell growth or biomass.
Li, Jiying; Hu, Jianping; Bassham, Diane
2015-09-14
Peroxisomes are essential organelles that house a wide array of metabolic reactions important for plant growth and development. However, our knowledge regarding the role of peroxisomal proteins in various biological processes, including plant stress response, is still incomplete. Recent proteomic studies of plant peroxisomes significantly increased the number of known peroxisomal proteins and greatly facilitated the study of peroxisomes at the systems level. The objectives of this study were to determine whether genes that encode peroxisomal proteins with related functions are co-expressed in Arabidopsis and identify peroxisomal proteins involved in stress response using in silico analysis and mutant screens. Using microarray data from online databases, we performed hierarchical clustering analysis to generate a comprehensive view of transcript level changes for Arabidopsis peroxisomal genes during development and under abiotic and biotic stress conditions. Many genes involved in the same metabolic pathways exhibited co-expression, some genes known to be involved in stress response are regulated by the corresponding stress conditions, and function of some peroxisomal proteins could be predicted based on their coexpression pattern. Since drought caused expression changes to the highest number of genes that encode peroxisomal proteins, we subjected a subset of Arabidopsis peroxisomal mutants to a drought stress assay. Mutants of the LON2 protease and the photorespiratory enzyme hydroxypyruvate reductase 1 (HPR1) showed enhanced susceptibility to drought, suggesting the involvement of peroxisomal quality control and photorespiration in drought resistance. Lastly, our study provided a global view of how genes that encode peroxisomal proteins respond to developmental and environmental cues and began to reveal additional peroxisomal proteins involved in stress response, thus opening up new avenues to investigate the role of peroxisomes in plant adaptation to environmental stresses.
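The co-expression idea in this abstract can be illustrated with a much simpler stand-in for the hierarchical clustering the authors used: pairwise Pearson correlation between expression profiles, with highly correlated pairs flagged as co-expressed. This is a minimal sketch, not the study's method. The gene names, the expression values and the 0.9 threshold are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def coexpressed_pairs(
    profiles: Dict[str, Sequence[float]], threshold: float = 0.9
) -> List[Tuple[str, str]]:
    """List gene pairs whose profiles correlate at or above the threshold."""
    genes = sorted(profiles)
    pairs = []
    for i, first in enumerate(genes):
        for second in genes[i + 1:]:
            if pearson(profiles[first], profiles[second]) >= threshold:
                pairs.append((first, second))
    return pairs


# Hypothetical expression values across four conditions.
TOY_PROFILES = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.0, 6.0, 8.0],
    "gene_c": [4.0, 3.0, 2.0, 1.0],
}

if __name__ == "__main__":
    print(coexpressed_pairs(TOY_PROFILES))
```

A correlation matrix of this kind is one common input to hierarchical clustering, so the sketch shows the similarity measure rather than the clustering step itself.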
Mattow, J; Jungblut, P R; Schaible, U E; Mollenkopf, H J; Lamer, S; Zimny-Arndt, U; Hagens, K; Müller, E C; Kaufmann, S H
2001-08-01
A proteome approach, combining high-resolution two-dimensional electrophoresis (2-DE) with mass spectrometry, was used to compare the cellular protein composition of two virulent strains of Mycobacterium tuberculosis with two attenuated strains of Mycobacterium bovis Bacillus Calmette-Guerin (BCG), in order to identify unique proteins of these strains. Emphasis was given to the identification of M. tuberculosis specific proteins, because we consider these proteins to represent putative virulence factors and interesting candidates for vaccination and diagnosis of tuberculosis. The genome of M. tuberculosis strain H37Rv comprises nearly 4000 predicted open reading frames. In contrast, the separation of proteins from whole mycobacterial cells by 2-DE resulted in silver-stained patterns comprising about 1800 distinct protein spots. Amongst these, 96 spots were exclusively detected either in the virulent (56 spots) or in the attenuated (40 spots) mycobacterial strains. Fifty-three of these spots were analyzed by mass spectrometry, of which 41 were identified, including 32 M. tuberculosis specific spots. Twelve M. tuberculosis specific spots were identified as proteins, encoded by genes previously reported to be deleted in M. bovis BCG. The remaining 20 spots unique for M. tuberculosis were identified as proteins encoded by genes that are not known to be missing in M. bovis BCG.
Morais do Amaral, Alexandre; Antoniw, John; Rudd, Jason J.; Hammond-Kosack, Kim E.
2012-01-01
The Dothideomycete fungus Mycosphaerella graminicola is the causal agent of Septoria tritici blotch, a devastating disease of wheat leaves that causes dramatic decreases in yield. Infection involves an initial extended period of symptomless intercellular colonisation prior to the development of visible necrotic disease lesions. Previous functional genomics and gene expression profiling studies have implicated the production of secreted virulence effector proteins as key facilitators of the initial symptomless growth phase. In order to identify additional candidate virulence effectors, we re-analysed and catalogued the predicted protein secretome of M. graminicola isolate IPO323, which is currently regarded as the reference strain for this species. We combined several bioinformatic approaches in order to increase the probability of identifying truly secreted proteins with either a predicted enzymatic function or an as yet unknown function. An initial secretome of 970 proteins was predicted, whilst further stringent selection criteria predicted 492 proteins. Of these, 321 possess some functional annotation, the composition of which may reflect the strictly intercellular growth habit of this pathogen, leaving 171 with no functional annotation. This analysis identified a protein family encoding secreted peroxidases/chloroperoxidases (PF01328) which is expanded within all members of the family Mycosphaerellaceae. Further analyses were done on the non-annotated proteins for size and cysteine content (effector protein hallmarks), and then by studying the distribution of homologues in 17 other sequenced Dothideomycete fungi within an overall total of 91 predicted proteomes from fungal, oomycete and nematode species. This detailed M. graminicola secretome analysis provides the basis for further functional and comparative genomics studies. PMID:23236356
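The abstract names protein size and cysteine content as effector hallmarks used to screen the non-annotated secreted proteins. The sketch below shows one way such a screen could be expressed. The thresholds (at most 300 residues, at least 3% cysteine) and the example sequences are assumptions chosen for illustration, not the criteria used in the study.

```python
def cysteine_fraction(sequence: str) -> float:
    """Fraction of residues that are cysteine (C); 0.0 for an empty sequence."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    return sequence.count("C") / len(sequence)


def is_effector_like(
    sequence: str, max_length: int = 300, min_cys_fraction: float = 0.03
) -> bool:
    """Flag small, cysteine-rich proteins. Both thresholds are illustrative assumptions."""
    return 0 < len(sequence) <= max_length and cysteine_fraction(sequence) >= min_cys_fraction


if __name__ == "__main__":
    # Hypothetical sequences: one short and cysteine-rich, one long with no cysteine.
    small_cys_rich = "MCCKLCAA" * 5
    long_no_cys = "MKLA" * 100
    for name, seq in (("small_cys_rich", small_cys_rich), ("long_no_cys", long_no_cys)):
        print(name, len(seq), round(cysteine_fraction(seq), 3), is_effector_like(seq))
```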
NASA Technical Reports Server (NTRS)
Hsieh, H. L.; Tong, C. G.; Thomas, C.; Roux, S. J.
1996-01-01
A cDNA encoding a 47 kDa nucleoside triphosphatase (NTPase) that is associated with the chromatin of pea nuclei has been cloned and sequenced. The translated sequence of the cDNA includes several domains predicted by known biochemical properties of the enzyme, including five motifs characteristic of the ATP-binding domain of many proteins, several potential casein kinase II phosphorylation sites, a helix-turn-helix region characteristic of DNA-binding proteins, and a potential calmodulin-binding domain. The deduced primary structure also includes an N-terminal sequence that is a predicted signal peptide and an internal sequence that could serve as a bipartite-type nuclear localization signal. Both in situ immunocytochemistry of pea plumules and immunoblots of purified cell fractions indicate that most of the immunodetectable NTPase is within the nucleus, a compartment proteins typically reach through nuclear pores rather than through the endoplasmic reticulum pathway. The translated sequence has some similarity to that of human lamin C, but not high enough to account for the earlier observation that IgG against human lamin C binds to the NTPase in immunoblots. Northern blot analysis shows that the NTPase mRNA is strongly expressed in etiolated plumules, but only poorly or not at all in the leaf and stem tissues of light-grown plants. Accumulation of NTPase mRNA in etiolated seedlings is stimulated by brief treatments with both red and far-red light, as is characteristic of very low-fluence phytochrome responses. Southern blotting with pea genomic DNA indicates the NTPase is likely to be encoded by a single gene.
Kuan, Lisa; Schaffer, Jessica N.; Zouzias, Christos D.
2014-01-01
Proteus mirabilis is a Gram-negative enteric bacterium that causes complicated urinary tract infections, particularly in patients with indwelling catheters. Sequencing of clinical isolate P. mirabilis HI4320 revealed the presence of 17 predicted chaperone-usher fimbrial operons. We classified these fimbriae into three groups by their genetic relationship to other chaperone-usher fimbriae. Sixteen of these fimbriae are encoded by all seven currently sequenced P. mirabilis genomes. The predicted protein sequence of the major structural subunit for 14 of these fimbriae was highly conserved (≥95 % identity), whereas three other structural subunits (Fim3A, UcaA and Fim6A) were variable. Further examination of 58 clinical isolates showed that 14 of the 17 predicted major structural subunit genes of the fimbriae were present in most strains (>85 %). Transcription of the predicted major structural subunit genes for all 17 fimbriae was measured under different culture conditions designed to mimic conditions in the urinary tract. The majority of the fimbrial genes were induced during stationary phase, static culture or colony growth when compared to exponential-phase aerated culture. Major structural subunit proteins for six of these fimbriae were detected using MS of proteins sheared from the surface of broth-cultured P. mirabilis, demonstrating that this organism may produce multiple fimbriae within a single culture. The high degree of conservation of P. mirabilis fimbriae stands in contrast to uropathogenic Escherichia coli and Salmonella enterica, which exhibit greater variability in their fimbrial repertoires. These findings suggest there may be evolutionary pressure for P. mirabilis to maintain a large fimbrial arsenal. PMID:24809384
Paterno, Gary D; Ding, Zhihu; Lew, Yuan-Y; Nash, Gord W; Mercer, F Corinne; Gillespie, Laura L
2002-07-24
mi-er1 (previously called er1) is a fibroblast growth factor-inducible early response gene activated during mesoderm induction in Xenopus embryos and encoding a nuclear protein that functions as a transcriptional activator. The human orthologue of mi-er1 was shown to be upregulated in breast carcinoma cell lines and breast tumours when compared to normal breast cells. In this report, we investigate the structure of the human mi-er1 (hmi-er1) gene and characterize the alternatively spliced transcripts and protein isoforms. hmi-er1 is a single copy gene located at 1p31.2 and spanning 63 kb. It contains 17 exons and includes one skipped exon, a facultative intron and three polyadenylation signals to produce 12 transcripts encoding six distinct proteins. hmi-er1 transcripts were expressed at very low levels in most human adult tissues and the mRNA isoform pattern varied with the tissue. The 12 transcripts encode proteins containing a common internal sequence with variable N- and C-termini. Three distinct N- and two distinct C-termini were identified, giving rise to six protein isoforms. The two C-termini differ significantly in size and sequence and arise from alternate use of a facultative intron to produce hMI-ER1alpha and hMI-ER1beta. In all tissues except testis, transcripts encoding the beta isoform were predominant. hMI-ER1alpha lacks the predicted nuclear localization signal and transfection assays revealed that, unlike hMI-ER1beta, it is not a nuclear protein, but remains in the cytoplasm. Our results demonstrate that alternate use of a facultative intron regulates the subcellular localization of hMI-ER1 proteins and this may have important implications for hMI-ER1 function.
Delk, Nikkí A.; Johnson, Keith A.; Chowdhury, Naweed I.; Braam, Janet
2005-01-01
Changes in intracellular calcium (Ca2+) levels serve to signal responses to diverse stimuli. Ca2+ signals are likely perceived through proteins that bind Ca2+, undergo conformation changes following Ca2+ binding, and interact with target proteins. The 50-member calmodulin-like (CML) Arabidopsis (Arabidopsis thaliana) family encodes proteins containing the predicted Ca2+-binding EF-hand motif. The functions of virtually all these proteins are unknown. CML24, also known as TCH2, shares over 40% amino acid sequence identity with calmodulin, has four EF hands, and undergoes Ca2+-dependent changes in hydrophobic interaction chromatography and migration rate through denaturing gel electrophoresis, indicating that CML24 binds Ca2+ and, as a consequence, undergoes conformational changes. CML24 expression occurs in all major organs, and transcript levels are increased from 2- to 15-fold in plants subjected to touch, darkness, heat, cold, hydrogen peroxide, abscisic acid (ABA), and indole-3-acetic acid. However, CML24 protein accumulation changes were not detectable. The putative CML24 regulatory region confers reporter expression at sites of predicted mechanical stress; in regions undergoing growth; in vascular tissues and various floral organs; and in stomata, trichomes, and hydathodes. CML24-underexpressing transgenics are resistant to ABA inhibition of germination and seedling growth, are defective in long-day induction of flowering, and have enhanced tolerance to CoCl2, molybdic acid, ZnSO4, and MgCl2. MgCl2 tolerance is not due to reduced uptake or to elevated Ca2+ accumulation. Together, these data present evidence that CML24, a gene expressed in diverse organs and responsive to diverse stimuli, encodes a potential Ca2+ sensor that may function to enable responses to ABA, daylength, and presence of various salts. PMID:16113225
2009-01-01
Background MicroRNAs (miRNAs) are endogenous single-stranded small RNAs that regulate the expression of specific mRNAs involved in diverse biological processes. In plants, miRNAs are generally encoded as a single species in independent transcriptional units, referred to as MIRNA genes, in contrast to animal miRNAs, which are frequently clustered. Results We performed a comparative genomic analysis in three model plants (rice, poplar and Arabidopsis) and characterized miRNA clusters containing two to eight miRNA species. These clusters usually encode miRNAs of the same family and certain share a common evolutionary origin across monocot and dicot lineages. In addition, we identified miRNA clusters harboring miRNAs with unrelated sequences that are usually not evolutionarily conserved. Strikingly, non-homologous miRNAs from the same cluster were predicted to target transcripts encoding related proteins. At least four Arabidopsis non-homologous clusters were expressed as single transcriptional units. Overexpression of one of these polycistronic precursors, producing Ath-miR859 and Ath-miR774, led to the DCL1-dependent accumulation of both miRNAs and down-regulation of their different mRNA targets encoding F-box proteins. Conclusions In addition to polycistronic precursors carrying related miRNAs, plants also contain precursors allowing coordinated expression of non-homologous miRNAs to co-regulate functionally related target transcripts. This mechanism paves the way for using polycistronic MIRNA precursors as a new molecular tool for plant biologists to simultaneously control the expression of different genes. PMID:19951405
Kariithi, Henry M; Ince, Ikbal A; Boeren, Sjef; Abd-Alla, Adly M M; Parker, Andrew G; Aksoy, Serap; Vlak, Just M; Oers, Monique M van
2011-11-01
The competence of the tsetse fly Glossina pallidipes (Diptera; Glossinidae) to acquire salivary gland hypertrophy virus (SGHV), to support virus replication and successfully transmit the virus depends on complex interactions between Glossina and SGHV macromolecules. Critical requisites to SGHV transmission are its replication and secretion of mature virions into the fly's salivary gland (SG) lumen. However, secretion of host proteins is of equal importance for successful transmission and requires cataloging of G. pallidipes secretome proteins from hypertrophied and non-hypertrophied SGs. After electrophoretic profiling and in-gel trypsin digestion, saliva proteins were analyzed by nano-LC-MS/MS. MaxQuant/Andromeda search of the MS data against the non-redundant (nr) GenBank database and a G. morsitans morsitans SG EST database yielded a total of 521 hits, 31 of which were SGHV-encoded. At a false discovery rate limit of 1% and a detection threshold of at least 2 unique peptides per protein, the analysis resulted in 292 Glossina and 25 SGHV MS-supported proteins. When annotated by the Blast2GO suite, at least one gene ontology (GO) term could be assigned to 89.9% (285/317) of the detected proteins. Five (∼1.8%) Glossina and three (∼12%) SGHV proteins remained without a predicted function after blast searches against the nr database. Sixty-five of the 292 detected Glossina proteins contained an N-terminal signal/secretion peptide sequence. Eight of the SGHV proteins were predicted to be non-structural (NS), and fourteen are known structural (VP) proteins. SGHV alters the protein expression pattern in Glossina. The G. pallidipes SG secretome encompasses a spectrum of proteins that may be required during the SGHV infection cycle. These detected proteins have putative interactions with at least 21 of the 25 SGHV-encoded proteins.
Our findings open avenues for developing novel SGHV mitigation strategies to block SGHV infections in tsetse production facilities, such as using SGHV-specific antibodies and phage display-selected gut epithelia-binding peptides.
Bombyx mori Nucleopolyhedrovirus Encodes a DNA-Binding Protein Capable of Destabilizing Duplex DNA
Mikhailov, Victor S.; Mikhailova, Alla L.; Iwanaga, Masashi; Gomi, Sumiko; Maeda, Susumu
1998-01-01
A DNA-binding protein (designated DBP) with an apparent molecular mass of 38 kDa was purified to homogeneity from BmN cells (derived from Bombyx mori) infected with the B. mori nucleopolyhedrovirus (BmNPV). Six peptides obtained after digestion of the isolated protein with Achromobacter protease I were partially or completely sequenced. The determined amino acid sequences indicated that DBP was encoded by an open reading frame (ORF16) located at nucleotides (nt) 16189 to 17139 in the BmNPV genome (GenBank accession no. L33180). This ORF (designated dbp) is a homolog of Autographa californica multicapsid NPV ORF25, whose product has not been identified. BmNPV DBP is predicted to contain 317 amino acids (calculated molecular mass of 36.7 kDa) and to have an isoelectric point of 7.8. DBP showed a tendency to multimerization in the course of purification and was found to bind preferentially to single-stranded DNA. When bound to oligonucleotides, DBP protected them from hydrolysis by phage T4 DNA polymerase-associated 3′→5′ exonuclease. The sizes of the protected fragments indicated that a binding site size for DBP is about 30 nt per protein monomer. DBP, but not BmNPV LEF-3, was capable of unwinding partial DNA duplexes in an in vitro system. This helix-destabilizing ability is consistent with the prediction that DBP functions as a single-stranded DNA binding protein in virus replication. PMID:9525636
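The genome coordinates quoted for ORF16 can be checked against the predicted protein length with simple arithmetic. The following is a minimal sketch; the function name is illustrative, and whether the quoted span includes the stop codon is an assumption the abstract does not settle.

```python
def orf_codons(start_nt: int, end_nt: int) -> int:
    """Count complete codons in an ORF given 1-based, inclusive coordinates."""
    length = end_nt - start_nt + 1
    if length % 3 != 0:
        raise ValueError(f"ORF length {length} nt is not a multiple of 3")
    return length // 3


# Coordinates taken from the abstract: nt 16189 to 17139.
length_nt = 17139 - 16189 + 1
print(length_nt, orf_codons(16189, 17139))  # 951 317
```

The span covers 951 nt, or 317 codons, which matches the stated 317 amino acids if the quoted coordinates exclude the stop codon.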
Efficient search, mapping, and optimization of multi-protein genetic systems in diverse bacteria
Farasat, Iman; Kushwaha, Manish; Collens, Jason; Easterbrook, Michael; Guido, Matthew; Salis, Howard M
2014-01-01
Developing predictive models of multi-protein genetic systems to understand and optimize their behavior remains a combinatorial challenge, particularly when measurement throughput is limited. We developed a computational approach to build predictive models and identify optimal sequences and expression levels, while circumventing combinatorial explosion. Maximally informative genetic system variants were first designed by the RBS Library Calculator, an algorithm to design sequences for efficiently searching a multi-protein expression space across a > 10,000-fold range with tailored search parameters and well-predicted translation rates. We validated the algorithm's predictions by characterizing 646 genetic system variants, encoded in plasmids and genomes, expressed in six gram-positive and gram-negative bacterial hosts. We then combined the search algorithm with system-level kinetic modeling, requiring the construction and characterization of 73 variants to build a sequence-expression-activity map (SEAMAP) for a biosynthesis pathway. Using model predictions, we designed and characterized 47 additional pathway variants to navigate its activity space, find optimal expression regions with desired activity response curves, and relieve rate-limiting steps in metabolism. Creating sequence-expression-activity maps accelerates the optimization of many protein systems and allows previous measurements to quantitatively inform future designs. PMID:24952589
FSPP: A Tool for Genome-Wide Prediction of smORF-Encoded Peptides and Their Functions
Li, Hui; Xiao, Li; Zhang, Lili; Wu, Jiarui; Wei, Bin; Sun, Ninghui; Zhao, Yi
2018-01-01
smORFs are small open reading frames of less than 100 codons. Recent low-throughput experiments showed that many smORF-encoded peptides (SEPs) play crucial roles in processes such as regulation of transcription or translation, transport through membranes, and antimicrobial activity. In order to gather more functional SEPs, it is necessary to have access to genome-wide prediction tools that give direction to low-throughput experiments. In this study, we put forward a functional smORF-encoded peptides predictor (FSPP), which is designed to predict authentic SEPs and their functions in a high-throughput manner. FSPP used the overlap of detected SEPs from Ribo-seq and mass spectrometry as target objects. With the expression data on transcription and translation levels, FSPP built two co-expression networks. Combining co-location relations, FSPP constructed a compound network and then annotated SEPs with functions of adjacent nodes. Tested on 38 sequenced samples of 5 human cell lines, FSPP successfully predicted 856 out of 960 annotated proteins. Interestingly, FSPP also highlighted 568 functional SEPs from these samples. After comparison, the roles predicted by FSPP were consistent with known functions. These results suggest that FSPP is a reliable tool for the identification of functional small peptides. FSPP source code can be acquired at https://www.bioinfo.org/FSPP. PMID:29675032
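To make the smORF definition concrete, the sketch below scans a DNA string for ATG-initiated ORFs of fewer than 100 codons. It is not FSPP's method, which relies on Ribo-seq, mass spectrometry and network data. The function name, the forward-strand-only scan and the choice to leave the stop codon out of the codon count are all assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_smorfs(seq, max_codons=100):
    """Return (start, end, n_codons) for forward-strand ORFs shorter than max_codons.

    Coordinates are 0-based with an exclusive end that includes the stop codon;
    n_codons counts the start codon but not the stop codon (an assumption).
    """
    seq = seq.upper()
    hits = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] != "ATG":
                i += 3
                continue
            # Extend codon by codon until an in-frame stop is reached.
            j = i
            while j + 3 <= len(seq) and seq[j:j + 3] not in STOP_CODONS:
                j += 3
            if j + 3 > len(seq):
                break  # ran off the end without an in-frame stop
            n_codons = (j - i) // 3
            if n_codons < max_codons:
                hits.append((i, j + 3, n_codons))
            i = j + 3  # resume scanning after the stop codon
    return hits


print(find_smorfs("CCATGAAATTTTAGGG"))
```

A real genome-wide scan would also cover the reverse strand and non-ATG start codons; both are omitted here for brevity.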
Improving the annotation of the Heterorhabditis bacteriophora genome.
McLean, Florence; Berger, Duncan; Laetsch, Dominik R; Schwartz, Hillel T; Blaxter, Mark
2018-04-01
Genome assembly and annotation remain exacting tasks. As the tools available for these tasks improve, it is useful to return to data produced with earlier techniques to assess their credibility and correctness. The entomopathogenic nematode Heterorhabditis bacteriophora is widely used to control insect pests in horticulture. The genome sequence for this species was reported to encode an unusually high proportion of unique proteins and a paucity of secreted proteins compared to other related nematodes. We revisited the H. bacteriophora genome assembly and gene predictions to determine whether these unusual characteristics were biological or methodological in origin. We mapped an independent resequencing dataset to the genome and used the blobtools pipeline to identify potential contaminants. While present (0.2% of the genome span, 0.4% of predicted proteins), assembly contamination was not significant. Re-prediction of the gene set using BRAKER1 and published transcriptome data generated a predicted proteome that was very different from the published one. The new gene set had a much reduced complement of unique proteins, better completeness values that were in line with other related species' genomes, and an increased number of proteins predicted to be secreted. It is thus likely that methodological issues drove the apparent uniqueness of the initial H. bacteriophora genome annotation and that similar contamination and misannotation issues affect other published genome assemblies.
Holzman, L B; Marks, R M; Dixit, V M
1990-01-01
We have previously described the cloning of a group of novel cellular immediate-early response genes whose expression in human umbilical vein endothelial cells is induced by tumor necrosis factor alpha in the presence of cycloheximide. These genes are likely to participate in mediating the response of the vascular endothelium to proinflammatory cytokines. In this study, we further characterized one of these novel gene products named B61. Sequence analysis of cDNA clones encoding B61 revealed that its protein product has no significant homology to previously described proteins. Southern analysis suggested that B61 is an evolutionarily conserved single-copy gene. B61 is primarily a hydrophilic molecule but contains both a hydrophobic N-terminal and a hydrophobic C-terminal region. The N-terminal region is typical of a signal peptide, which is consistent with the secreted nature of the protein. The mature form of the predicted protein consists of 187 amino acid residues and has a molecular weight of 22,000. Immunoprecipitation of metabolically labeled human umbilical vein endothelial cell preparations revealed that B61 is a 25-kilodalton secreted protein which is markedly induced by tumor necrosis factor. PMID:2233719
Polypeptide p41 of a Norwalk-Like Virus Is a Nucleic Acid-Independent Nucleoside Triphosphatase
Pfister, Thomas; Wimmer, Eckard
2001-01-01
Southampton virus (SHV) is a member of the Norwalk-like viruses (NLVs), one of four genera of the family Caliciviridae. The genome of SHV contains three open reading frames (ORFs). ORF 1 encodes a polyprotein that is autocatalytically processed into six proteins, one of which is p41. p41 shares sequence motifs with protein 2C of picornaviruses and superfamily 3 helicases. We have expressed p41 of SHV in bacteria. Purified p41 exhibited nucleoside triphosphate (NTP)-binding and NTP hydrolysis activities. The NTPase activity was not stimulated by single-stranded nucleic acids. SHV p41 had no detectable helicase activity. Protein sequence comparison between the consensus sequences of NLV p41 and enterovirus protein 2C revealed regions of high similarity. According to secondary structure prediction, the conserved regions were located within a putative central domain of alpha helices and beta strands. This study reveals for the first time an NTPase activity associated with a calicivirus-encoded protein. Based on enzymatic properties and sequence information, a functional relationship between NLV p41 and enterovirus 2C is discussed in regard to the role of 2C-like proteins in virus replication. PMID:11160659
The arbuscular mycorrhizal fungal protein glomalin is a putative homolog of heat shock protein 60.
Gadkar, Vijay; Rillig, Matthias C
2006-10-01
Work on glomalin-related soil protein produced by arbuscular mycorrhizal (AM) fungi (AMF) has been limited because of the unknown identity of the protein. A protein band cross-reactive with the glomalin-specific antibody MAb32B11 from the AM fungus Glomus intraradices was partially sequenced using tandem liquid chromatography-mass spectrometry. A 17 amino acid sequence showing similarity to heat shock protein 60 (hsp 60) was obtained. Based on degenerate PCR, a full-length cDNA of 1773 bp length encoding the hsp 60 gene was isolated from a G. intraradices cDNA library. The ORF was predicted to encode a protein of 590 amino acids. The protein sequence had three N-terminal glycosylation sites and a string of GGM motifs at the C-terminal end. The GiHsp 60 ORF had three introns of 67, 76 and 131 bp length. The GiHsp 60 was expressed using an in vitro translation system, and the protein was purified using the 6xHis-tag system. A dot-blot assay on the purified protein showed that it was highly cross-reactive with the glomalin-specific antibody MAb32B11. The present work provides the first evidence for the identity of the glomalin protein in the model AMF G. intraradices, thus facilitating further characterization of this protein, which is of great interest in soil ecology.
Christie, Andrew E; Nolan, Daniel H; Garcia, Zachery A; McCoole, Matthew D; Harmon, Sarah M; Congdon-Jones, Benjamin; Ohno, Paul; Hartline, Niko; Congdon, Clare Bates; Baer, Kevin N; Lenz, Petra H
2011-02-01
The Onychophora, Priapulida and Tardigrada, along with the Arthropoda, Nematoda and several other small phyla, form the superphylum Ecdysozoa. Numerous peptidomic studies have been undertaken for both the arthropods and nematodes, resulting in the identification of many peptides from each group. In contrast, little is known about the peptides used as paracrines/hormones by species from the other ecdysozoan taxa. Here, transcriptome mining and bioinformatic peptide prediction were used to identify peptides in members of the Onychophora, Priapulida and Tardigrada, the only non-arthropod, non-nematode members of the Ecdysozoa for which there are publicly accessible expressed sequence tags (ESTs). The extant ESTs for each phylum were queried using 106 arthropod/nematode peptide precursors. Transcripts encoding calcitonin-like diuretic hormone and pigment-dispersing hormone (PDH) were identified for the onychophoran Peripatopsis sedgwicki, with transcripts encoding C-type allatostatin (C-AST) and FMRFamide-like peptide identified for the priapulid Priapulus caudatus. For the Tardigrada, transcripts encoding members of the A-type allatostatin, C-AST, insect kinin, orcokinin, PDH and tachykinin-related peptide families were identified, all but one from Hypsibius dujardini (the exception being a Milnesium tardigradum orcokinin-encoding transcript). The proteins deduced from these ESTs resulted in the prediction of 48 novel peptides, six onychophoran, eight priapulid and 34 tardigrade, which are the first described from these phyla. Copyright © 2010 Elsevier Inc. All rights reserved.
MicroRNA biogenesis and function in plants.
Chen, Xuemei
2005-10-31
A microRNA (miRNA) is a 21-24 nucleotide RNA product of a non-protein-coding gene. Plants, like animals, have a large number of miRNA-encoding genes in their genomes. The biogenesis of miRNAs in Arabidopsis is similar to that in animals in that miRNAs are processed from primary precursors by at least two steps mediated by RNase III-like enzymes and that the miRNAs are incorporated into a protein complex named RISC. However, the biogenesis of plant miRNAs consists of an additional step, i.e., the miRNAs are methylated on the ribose of the last nucleotide by the miRNA methyltransferase HEN1. The high degree of sequence complementarity between plant miRNAs and their target mRNAs has facilitated the bioinformatic prediction of miRNA targets, many of which have been subsequently validated. Plant miRNAs have been predicted or confirmed to regulate a variety of processes, such as development, metabolism, and stress responses. A large category of miRNA targets consists of genes encoding transcription factors that play important roles in patterning the plant form.
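The complementarity-based target prediction mentioned above can be illustrated with a short sketch. It slides the reverse complement of a miRNA along an mRNA and reports the window with the fewest mismatches. The function names, the toy sequence and the three-mismatch cutoff are assumptions for illustration only; G:U wobble pairs and position-specific penalties, which real plant target predictors typically consider, are ignored here.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA string (Watson-Crick pairs only)."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def best_site(mirna, mrna, max_mismatches=3):
    """Return (position, mismatches) of the best-matching site, or None.

    The cutoff of 3 mismatches is an arbitrary illustrative choice.
    """
    probe = reverse_complement(mirna)
    mrna = mrna.upper()
    best = None
    for pos in range(len(mrna) - len(probe) + 1):
        window = mrna[pos:pos + len(probe)]
        mismatches = sum(a != b for a, b in zip(window, probe))
        if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
            best = (pos, mismatches)
    return best


toy_mirna = "AAACCCGGGUUUACGUACGUA"  # made-up 21-nt sequence, not a real miRNA
toy_mrna = "GGGG" + reverse_complement(toy_mirna) + "CCCC"
print(best_site(toy_mirna, toy_mrna))  # (4, 0)
```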
Yi, Hai-Cheng; You, Zhu-Hong; Huang, De-Shuang; Li, Xiao; Jiang, Tong-Hai; Li, Li-Ping
2018-06-01
The interactions between non-coding RNAs (ncRNAs) and proteins play an important role in many biological processes, and their biological functions are primarily achieved by binding with a variety of proteins. High-throughput biological techniques are used to identify protein molecules bound with specific ncRNA, but they are usually expensive and time consuming. Deep learning provides a powerful solution to computationally predict RNA-protein interactions. In this work, we propose the RPI-SAN model by using the deep-learning stacked auto-encoder network to mine the hidden high-level features from RNA and protein sequences and feed them into a random forest (RF) model to predict ncRNA binding proteins. Stacked assembling is further used to improve the accuracy of the proposed method. Four benchmark datasets, including RPI2241, RPI488, RPI1807, and NPInter v2.0, were employed for the unbiased evaluation of five established prediction tools: RPI-Pred, IPMiner, RPISeq-RF, lncPro, and RPI-SAN. The experimental results show that our RPI-SAN model achieves much better performance than other methods, with accuracies of 90.77%, 89.7%, 96.1%, and 99.33%, respectively. It is anticipated that RPI-SAN can be used as an effective computational tool for future biomedical research and can accurately predict potential ncRNA-protein interacting pairs, which provides reliable guidance for biological research. Copyright © 2018 The Author(s). Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Langi, Gladys Emmanuella Putri; Moeis, Maelita R.; Ihsanawati; Giri-Rachman, Ernawati Arifin
2014-03-01
Mycobacterium tuberculosis (Mtb), the sole cause of tuberculosis (TB), is still a major global problem. The discovery of new anti-tubercular drugs is needed to face the increasing TB cases, especially to prevent the increase of cases with resistant Mtb. A potential novel drug target is the Mtb PhoR sensor domain protein, which is the histidine kinase extracellular domain for receiving environmental signals. This protein is the initial part of the two-component system PhoR-PhoP regulating 114 genes related to the virulence of Mtb. In this study, the gene encoding PhoR sensor domain (SensPhoR) was subcloned from pGEM-T SensPhoR from the previous study (Suwanto, 2012) to pColdII. The construct pColdII SensPhoR was confirmed through restriction analysis and sequencing. Using the construct, SensPhoR was overexpressed at 15°C using Escherichia coli BL21 (DE3). Low temperature was chosen because, according to the solubility prediction program of recombinant proteins from The University of Oklahoma, the PhoR sensor domain has a 79.8% chance of being expressed as insoluble protein in the Escherichia coli (E. coli) cytoplasm. This prediction is also supported by other similar programs: PROSO and PROSO II. The SDS PAGE result indicated that the PhoR sensor domain recombinant protein was overexpressed. For future studies, this protein will be purified and used for structure analysis which can be used to find potential drugs through rational drug design.
Walia, Rasna R; Caragea, Cornelia; Lewis, Benjamin A; Towfic, Fadi; Terribilini, Michael; El-Manzalawy, Yasser; Dobbs, Drena; Honavar, Vasant
2012-05-10
RNA molecules play diverse functional and structural roles in cells. They function as messengers for transferring genetic information from DNA to proteins, as the primary genetic material in many viruses, as catalysts (ribozymes) important for protein synthesis and RNA processing, and as essential and ubiquitous regulators of gene expression in living organisms. Many of these functions depend on precisely orchestrated interactions between RNA molecules and specific proteins in cells. Understanding the molecular mechanisms by which proteins recognize and bind RNA is essential for comprehending the functional implications of these interactions, but the recognition 'code' that mediates interactions between proteins and RNA is not yet understood. Success in deciphering this code would dramatically impact the development of new therapeutic strategies for intervening in devastating diseases such as AIDS and cancer. Because of the high cost of experimental determination of protein-RNA interfaces, there is an increasing reliance on statistical machine learning methods for training predictors of RNA-binding residues in proteins. However, because of differences in the choice of datasets, performance measures, and data representations used, it has been difficult to obtain an accurate assessment of the current state of the art in protein-RNA interface prediction. We provide a review of published approaches for predicting RNA-binding residues in proteins and a systematic comparison and critical assessment of protein-RNA interface residue predictors trained using these approaches on three carefully curated non-redundant datasets. We directly compare two widely used machine learning algorithms (Naïve Bayes (NB) and Support Vector Machine (SVM)) using three different data representations in which features are encoded using either sequence- or structure-based windows. 
Our results show that (i) Sequence-based classifiers that use a position-specific scoring matrix (PSSM)-based representation (PSSMSeq) outperform those that use an amino acid identity based representation (IDSeq) or a smoothed PSSM (SmoPSSMSeq); (ii) Structure-based classifiers that use smoothed PSSM representation (SmoPSSMStr) outperform those that use PSSM (PSSMStr) as well as sequence identity based representation (IDStr). PSSMSeq classifiers, when tested on an independent test set of 44 proteins, achieve performance that is comparable to that of three state-of-the-art structure-based predictors (including those that exploit geometric features) in terms of Matthews Correlation Coefficient (MCC), although the structure-based methods achieve substantially higher Specificity (albeit at the expense of Sensitivity) compared to sequence-based methods. We also find that the expected performance of the classifiers on a residue level can be markedly different from that on a protein level. Our experiments show that the classifiers trained on three different non-redundant protein-RNA interface datasets achieve comparable cross-validation performance. However, we find that the results are significantly affected by differences in the distance threshold used to define interface residues. Our results demonstrate that protein-RNA interface residue predictors that use a PSSM-based encoding of sequence windows outperform classifiers that use other encodings of sequence windows. While structure-based methods that exploit geometric features can yield significant increases in the Specificity of protein-RNA interface residue predictions, such increases are offset by decreases in Sensitivity. 
These results underscore the importance of comparing alternative methods using rigorous statistical procedures, multiple performance measures, and datasets that are constructed based on several alternative definitions of interface residues and redundancy cutoffs as well as including evaluations on independent test sets into the comparisons.
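The performance measures named in this abstract can be computed from a residue-level confusion matrix as sketched below. The function name and the example counts are made up for illustration. Because the abstract does not define Specificity, the sketch reports both precision, TP/(TP+FP), and the true-negative rate, TN/(TN+FP), without assuming which of the two the authors mean.

```python
import math


def interface_metrics(tp, fp, tn, fn):
    """Residue-level metrics for a binary interface / non-interface classifier."""
    def ratio(num, den):
        return num / den if den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": ratio(tp, tp + fn),          # recall on interface residues
        "precision": ratio(tp, tp + fp),            # one reading of "Specificity"
        "true_negative_rate": ratio(tn, tn + fp),   # the other common reading
        "mcc": ratio(tp * tn - fp * fn, mcc_den),   # Matthews correlation coefficient
    }


# Hypothetical counts, not taken from the study.
print(interface_metrics(tp=30, fp=20, tn=130, fn=20))
```

Reporting several measures together matters here because interface residues are usually a minority class, so a single number such as accuracy can hide a trade-off between sensitivity and precision.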
McBride, Ruth; Fielding, Burtram C.
2012-01-01
A respiratory disease caused by a novel coronavirus, termed the severe acute respiratory syndrome coronavirus (SARS-CoV), was first reported in China in late 2002. The subsequent efficient human-to-human transmission of this virus eventually affected more than 30 countries worldwide, resulting in a mortality rate of ~10% of infected individuals. The spread of the virus was ultimately controlled by isolation of infected individuals and there have been no infections reported since April 2004. However, the natural reservoir of the virus was never identified and it is not known if this virus will re-emerge and, therefore, research on this virus continues. The SARS-CoV genome is about 30 kb in length and is predicted to contain 14 functional open reading frames (ORFs). The genome encodes proteins that are homologous to known coronavirus proteins, such as the replicase proteins (ORFs 1a and 1b) and the four major structural proteins: nucleocapsid (N), spike (S), membrane (M) and envelope (E). SARS-CoV also encodes eight unique proteins, called accessory proteins, with no known homologues. This review will summarize the current knowledge on SARS-CoV accessory proteins and will include: (i) expression and processing; (ii) the effects on cellular processes; and (iii) functional studies. PMID:23202509
Protein structural similarity search by Ramachandran codes
Lo, Wei-Cheng; Huang, Po-Jung; Chang, Chih-Hung; Lyu, Ping-Chiang
2007-01-01
Background Protein structural data has increased exponentially, such that fast and accurate tools are necessary for structure similarity searches. To improve the search speed, several methods have been designed to reduce three-dimensional protein structures to one-dimensional text strings that are then analyzed by traditional sequence alignment methods; however, the accuracy is usually sacrificed and the speed is still unable to match sequence similarity search tools. Here, we aimed to improve the linear encoding methodology and develop efficient search tools that can rapidly retrieve structural homologs from large protein databases. Results We propose a new linear encoding method, SARST (Structural similarity search Aided by Ramachandran Sequential Transformation). SARST transforms protein structures into text strings through a Ramachandran map organized by nearest-neighbor clustering and uses a regenerative approach to produce substitution matrices. Then, classical sequence similarity search methods can be applied to the structural similarity search. Its accuracy is similar to that of Combinatorial Extension (CE), and it works over 243,000 times faster, searching 34,000 proteins in 0.34 sec with a 3.2-GHz CPU. SARST provides statistically meaningful expectation values to assess the retrieved information. It has been implemented into a web service and a stand-alone Java program that is able to run on many different platforms. Conclusion As a database search method, SARST can rapidly distinguish high from low similarities and efficiently retrieve homologous structures. It demonstrates that the easily accessible linear encoding methodology has the potential to serve as a foundation for efficient protein structural similarity search tools. These search tools should be applicable to automated and high-throughput functional annotations or predictions for the ever increasing number of published protein structures in this post-genomic era. PMID:17716377
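The idea of linear encoding, turning backbone dihedral angles into a text string, can be sketched as follows. This is not the SARST Ramachandran map: the three cluster centers, their letters and the function names are hypothetical placeholders, whereas the published method derives its own map by nearest-neighbor clustering and pairs it with purpose-built substitution matrices.

```python
import math

# Hypothetical cluster centers (phi, psi) in degrees; placeholders only.
CENTERS = {
    "H": (-60.0, -45.0),   # roughly the right-handed helical region
    "E": (-120.0, 130.0),  # roughly the extended / beta-strand region
    "L": (60.0, 45.0),     # roughly the left-handed helical region
}


def angular_diff(a, b):
    """Smallest absolute difference between two angles in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def encode(dihedrals):
    """Map a list of (phi, psi) pairs to a string of nearest-center letters."""
    letters = []
    for phi, psi in dihedrals:
        nearest = min(
            CENTERS,
            key=lambda k: math.hypot(
                angular_diff(phi, CENTERS[k][0]),
                angular_diff(psi, CENTERS[k][1]),
            ),
        )
        letters.append(nearest)
    return "".join(letters)


print(encode([(-57, -47), (-63, -41), (-119, 113), (-139, 135), (55, 40)]))
```

Once structures are reduced to strings like this, ordinary sequence search tools can compare them, which is the source of the speed-up the abstract describes.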
Drosophila Nora virus capsid proteins differ from those of other picorna-like viruses.
Ekström, Jens-Ola; Habayeb, Mazen S; Srivastava, Vaibhav; Kieselbach, Thomas; Wingsle, Gunnar; Hultmark, Dan
2011-09-01
The recently discovered Nora virus from Drosophila melanogaster is a single-stranded RNA virus. Its published genomic sequence encodes a typical picorna-like cassette of replicative enzymes, but no capsid proteins similar to those in other picorna-like viruses. We have now done additional sequencing at the termini of the viral genome, extending it by 455 nucleotides at the 5' end, but no more coding sequence was found. The completeness of the final 12,333-nucleotide sequence was verified by the production of infectious virus from the cloned genome. To identify the capsid proteins, we purified Nora virus particles and analyzed their proteins by mass spectrometry. Our results show that the capsid is built from three major proteins, VP4A, B and C, encoded in the fourth open reading frame of the viral genome. The viral particles also contain traces of a protein from the third open reading frame, VP3. VP4A and B are not closely related to other picorna-like virus capsid proteins in sequence, but may form similar jelly roll folds. VP4C differs from the others and is predicted to have an essentially α-helical conformation. In a related virus, identified from EST database sequences from Nasonia parasitoid wasps, VP4C is encoded in a separate open reading frame, separated from VP4A and B by a frame-shift. This opens a possibility that VP4C is produced in non-equimolar quantities. Altogether, our results suggest that the Nora virus capsid has a different protein organization compared to the order Picornavirales. Copyright © 2011 Elsevier B.V. All rights reserved.
Jelínek, Jan; Škoda, Petr; Hoksza, David
2017-12-06
Protein-protein interactions (PPI) play a key role in an investigation of various biochemical processes, and their identification is thus of great importance. Although computational prediction of which amino acids take part in a PPI has been an active field of research for some time, the quality of in-silico methods is still far from perfect. We have developed a novel prediction method called INSPiRE which benefits from a knowledge base built from data available in Protein Data Bank. All proteins involved in PPIs were converted into labeled graphs with nodes corresponding to amino acids and edges to pairs of neighboring amino acids. A structural neighborhood of each node was then encoded into a bit string and stored in the knowledge base. When predicting PPIs, INSPiRE labels amino acids of unknown proteins as interface or non-interface based on how often their structural neighborhood appears as interface or non-interface in the knowledge base. We evaluated INSPiRE's behavior with respect to different types and sizes of the structural neighborhood. Furthermore, we examined the suitability of several different features for labeling the nodes. Our evaluations showed that INSPiRE clearly outperforms existing methods with respect to Matthews correlation coefficient. In this paper we introduce a new knowledge-based method for identification of protein-protein interaction sites called INSPiRE. Its knowledge base utilizes structural patterns of known interaction sites in the Protein Data Bank which are then used for PPI prediction. Extensive experiments on several well-established datasets show that INSPiRE significantly surpasses existing PPI approaches.
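The knowledge-base lookup described in this abstract can be sketched as a frequency table. The fingerprint strings, the function names and the 0.5 threshold below are assumptions made for illustration; they are not taken from INSPiRE.

```python
from collections import defaultdict


def build_knowledge_base(labeled_fingerprints):
    """Count how often each neighborhood fingerprint is seen as interface.

    labeled_fingerprints: iterable of (fingerprint, is_interface) pairs,
    where fingerprint is a bit string such as "1011" (hypothetical encoding).
    """
    kb = defaultdict(lambda: [0, 0])  # fingerprint -> [interface, non-interface]
    for fingerprint, is_interface in labeled_fingerprints:
        kb[fingerprint][0 if is_interface else 1] += 1
    return kb


def predict_interface(kb, fingerprint, threshold=0.5):
    """Label a residue from the interface frequency of its fingerprint.

    Returns None when the fingerprint was never seen in the knowledge base.
    """
    counts = kb.get(fingerprint)
    if counts is None:
        return None
    interface, non_interface = counts
    return interface / (interface + non_interface) >= threshold
```

A residue whose neighborhood never occurs in the knowledge base gets no label here; how the published method handles that case is not stated in the abstract.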
NASA Astrophysics Data System (ADS)
Woon, J. S. K.; Murad, A. M. A.; Abu Bakar, F. D.
2015-09-01
A cellobiohydrolase B (CbhB) from Aspergillus niger ATCC 10574 was cloned and expressed in E. coli. CbhB has an open reading frame of 1611 bp encoding a putative polypeptide of 536 amino acids. Analysis of the encoded polypeptide predicted a molecular mass of 56.2 kDa, a cellulose binding module (CBM) and a catalytic module. In order to obtain the mRNA of cbhB, total RNA was extracted from A. niger cells induced by 1% Avicel. First strand cDNA was synthesized from total RNA via reverse transcription. The full length cDNA of cbhB was amplified by PCR and cloned into the cloning vector, pGEM-T Easy. A comparison between genomic DNA and cDNA sequences of cbhB revealed that the gene is intronless. Upon the removal of the signal peptide, the cDNA of cbhB was cloned into the expression vector pET-32b. However, the recombinant CbhB was expressed in Escherichia coli Origami DE3 as an insoluble protein. A homology model of CbhB predicted the presence of nine disulfide bonds in the protein structure which may have contributed to the improper folding of the protein and thus, resulting in inclusion bodies in E. coli.
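The relationship between the reported open reading frame length and polypeptide length is simple arithmetic, sketched below. The function name is hypothetical, and the sketch assumes the reported length includes the stop codon.

```python
def orf_to_residue_count(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an open reading frame of orf_bp bases."""
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    # The stop codon does not encode a residue.
    return codons - 1 if includes_stop else codons
```

For the 1611 bp frame reported here, 1611 / 3 = 537 codons, one of which is the stop codon, giving the stated 536 amino acids.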
Esteban-Torres, María; Reverón, Inés; Santamaría, Laura; Mancheño, José M.; de las Rivas, Blanca; Muñoz, Rosario
2016-01-01
Lactobacillus plantarum species is a good source of esterases since both lipolytic and esterase activities have been described for strains of this species. No fundamental biochemical difference exists among esterases and lipases since both share a common catalytic mechanism. L. plantarum WCFS1 possesses a protein, Lp_3561, which is 44% identical to a previously described lipase, Lp_3562. In contrast to Lp_3562, Lp_3561 was unable to degrade esters possessing a chain length higher than C4 and the triglyceride tributyrin. As in other L. plantarum esterases, the electrostatic potential surface around the active site in Lp_3561 is predicted to be basic, whereas it is essentially neutral in the Lp_3562 lipase. The fact that the genes encoding both proteins are located contiguously in the L. plantarum WCFS1 genome suggests that they originated by tandem duplication and are therefore paralogs, as new functions have arisen during evolution. The presence of the contiguous lp_3561 and lp_3562 genes was studied among L. plantarum strains. They are located in an 8,903 bp DNA fragment that encodes proteins involved in the catabolism of sialic acid and are predicted to increase bacterial adaptability under certain growth conditions. PMID:27486450
DOE Office of Scientific and Technical Information (OSTI.GOV)
Akileswaran, L.; Brock, B.J.; Cereghino, J.L.
1999-02-01
A cDNA clone encoding a quinone reductase (QR) from the white rot basidiomycete Phanerochaete chrysosporium was isolated and sequenced. The cDNA consisted of 1,007 nucleotides and a poly(A) tail and encoded a deduced protein containing 271 amino acids. The experimentally determined eight-amino-acid N-terminal sequence of the purified QR protein from P. chrysosporium matched amino acids 72 to 79 of the predicted translation product of the cDNA. The M_r of the predicted translation product, beginning with Pro-72, was essentially identical to the experimentally determined M_r of one monomer of the QR dimer, and this finding suggested that QR is synthesized as a proenzyme. The results of in vitro transcription-translation experiments suggested that QR is synthesized as a proenzyme with a 71-amino-acid leader sequence. This leader sequence contains two potential KEX2 cleavage sites and numerous potential cleavage sites for dipeptidyl aminopeptidase. The QR activity in cultures of P. chrysosporium increased following the addition of 2-dimethoxybenzoquinone, vanillic acid, or several other aromatic compounds. An immunoblot analysis indicated that induction resulted in an increase in the amount of QR protein, and a Northern blot analysis indicated that this regulation occurs at the level of the qr mRNA.
Mani, Chinnasamy; Selvakumari, Jeyaperumal; Han, YeonSoo; Jo, YongHun; Thirugnanasambantham, Krishnaraj; Sundarapandian, Somaiah; Poopathi, Subbiah
2018-04-01
A marine Bacillus cereus (VCRC B540) with mosquitocidal effect was recently reported from red snapper fish (Lutjanus sanguineous) gut, and its surface layer protein (S-layer protein, SLP) was reported to be the mosquito larvicidal factor. In the present study, the gene encoding the surface layer protein was amplified from the genomic DNA and functionally characterized. Amplification of the SLP-encoding gene yielded a 1,518 bp PCR product, and analysis of the sequence revealed the presence of a 1482 bp open reading frame with coding capacity for a polypeptide of 493 amino acids. Phylogenetic analysis revealed homology among closely related Bacillus cereus groups of organisms as well as Bacillus strains. Removal of the nucleotides encoding the signal peptide yielded a functional cloning fragment of 1398 bp. The theoretical molecular weight (51.7 kDa) and isoelectric point (5.99) of the deduced functional SLP protein were predicted using ProtParam. The amplified PCR product was cloned into a plasmid vector (pGEM-T), and the open reading frame without the signal peptide was subsequently cloned in pET-28a(+) and expressed in Escherichia coli BL21 (DE3). The isopropyl-β-D-thiogalactopyranoside (IPTG)-induced recombinant SLP was confirmed using western blotting, and the functional SLP showed mosquito larvicidal activity. Therefore, the major findings revealed that SLP is a factor responsible for mosquitocidal activity, and the molecular characterization of this toxin was extensively studied.
Quantitative Profiling of Brain Lipid Raft Proteome in a Mouse Model of Fragile X Syndrome
Kalinowska, Magdalena; Castillo, Catherine; Francesconi, Anna
2015-01-01
Fragile X Syndrome, a leading cause of inherited intellectual disability and autism, arises from transcriptional silencing of the FMR1 gene encoding an RNA-binding protein, Fragile X Mental Retardation Protein (FMRP). FMRP can regulate the expression of approximately 4% of brain transcripts through its role in regulation of mRNA transport, stability and translation, thus providing a molecular rationale for its potential pleiotropic effects on neuronal and brain circuitry function. Several intracellular signaling pathways are dysregulated in the absence of FMRP suggesting that cellular deficits may be broad and could result in homeostatic changes. Lipid rafts are specialized regions of the plasma membrane, enriched in cholesterol and glycosphingolipids, involved in regulation of intracellular signaling. Among transcripts targeted by FMRP, a subset encodes proteins involved in lipid biosynthesis and homeostasis, dysregulation of which could affect the integrity and function of lipid rafts. Using a quantitative mass spectrometry-based approach we analyzed the lipid raft proteome of Fmr1 knockout mice, an animal model of Fragile X syndrome, and identified candidate proteins that are differentially represented in Fmr1 knockout mice lipid rafts. Furthermore, network analysis of these candidate proteins reveals connectivity between them and predicts functional connectivity with genes encoding components of myelin sheath, axonal processes and growth cones. Our findings provide insight to aid identification of molecular and cellular dysfunctions arising from Fmr1 silencing and for uncovering shared pathologies between Fragile X syndrome and other autism spectrum disorders. PMID:25849048
Tian, Guilian; Zhou, Yun; Hajkova, Dagmar; Miyagi, Masaru; Dinculescu, Astra; Hauswirth, William W; Palczewski, Krzysztof; Geng, Ruishuang; Alagramam, Kumar N; Isosomppi, Juha; Sankila, Eeva-Marja; Flannery, John G; Imanishi, Yoshikazu
2009-07-10
Clarin-1 is the protein product encoded by the gene mutated in Usher syndrome III. Although the molecular function of clarin-1 is unknown, its primary structure predicts four transmembrane domains similar to a large family of membrane proteins that include tetraspanins. Here we investigated the role of clarin-1 by using heterologous expression and in vivo model systems. When expressed in HEK293 cells, clarin-1 localized to the plasma membrane and concentrated in low density compartments distinct from lipid rafts. Clarin-1 reorganized actin filament structures and induced lamellipodia. This actin-reorganizing function was absent in the modified protein encoded by the most prevalent North American Usher syndrome III mutation, the N48K form of clarin-1 deficient in N-linked glycosylation. Proteomics analyses revealed a number of clarin-1-interacting proteins involved in cell-cell adhesion, focal adhesions, cell migration, tight junctions, and regulation of the actin cytoskeleton. Consistent with the hypothesized role of clarin-1 in actin organization, F-actin-enriched stereocilia of auditory hair cells evidenced structural disorganization in Clrn1(-/-) mice. These observations suggest a possible role for clarin-1 in the regulation and homeostasis of actin filaments, and link clarin-1 to the interactive network of Usher syndrome gene products.
Pinto-Santini, Delia M.; Salama, Nina R.
2009-01-01
Helicobacter pylori strains harboring the cag pathogenicity island (PAI) have been associated with more severe gastric disease in infected humans. The cag PAI encodes a type IV secretion (T4S) system required for CagA translocation into host cells as well as induction of proinflammatory cytokines, such as interleukin-8 (IL-8). cag PAI genes sharing sequence similarity with T4S components from other bacteria are essential for Cag T4S function. Other cag PAI-encoded genes are also essential for Cag T4S, but lack of sequence-based or structural similarity with genes in existing databases has precluded a functional assignment for the encoded proteins. We have studied the role of one such protein, Cag3 (HP0522), in Cag T4S and determined Cag3 subcellular localization and protein interactions. Cag3 is membrane associated and copurifies with predicted inner and outer membrane Cag T4S components that are essential for Cag T4S as well as putative accessory factors. Coimmunoprecipitation and cross-linking experiments revealed specific interactions with HpVirB7 and CagM, suggesting Cag3 is a new component of the Cag T4S outer membrane subcomplex. Finally, lack of Cag3 lowers HpVirB7 steady-state levels, further indicating Cag3 makes a subcomplex with this protein. PMID:19801411
Ozers, M S; Friesen, P D
1996-12-15
TED is a 7.5-kbp member of the gypsy family of retrotransposons that was first identified by its integration within the baculovirus DNA genome. This lepidopteran (moth) transposon contains three retrovirus-like genes, including functional gag and pol that yield reverse transcriptase-containing virus-like particles. To identify and characterize the product(s) of the third env-like open reading frame, TED ORF3 was expressed in homologous lepidopteran cells by using a baculovirus vector, vENV. Immunoblots and immunoprecipitations with antiserum raised against a bacterial ORF3-fusion protein detected two ORF3-encoded proteins, p68env and gp75env. On the basis of selective incorporation of [3H]mannose and inhibition of modification by tunicamycin which blocks N-linked glycosylation, gp75env is a glycoprotein derived from core precursor p68env. As predicted by the presence of a transmembrane domain near the carboxyl terminus, both p68env and gp75env were associated with heavy membranes of vENV-infected cells. Thus, TED ORF3 encodes a membrane glycoprotein with properties characteristic of retroviral env proteins. These data are consistent with the hypothesis that TED is an invertebrate retrovirus. Moreover, TED integration within the baculovirus genome provides an example of retroelement-mediated acquisition of host genes that may contribute to virus evolution.
Mycobacterium ahvazicum sp. nov., the nineteenth species of the Mycobacterium simiae complex.
Bouam, Amar; Heidarieh, Parvin; Shahraki, Abodolrazagh Hashemi; Pourahmad, Fazel; Mirsaeidi, Mehdi; Hashemzadeh, Mohamad; Baptiste, Emeline; Armstrong, Nicholas; Levasseur, Anthony; Robert, Catherine; Drancourt, Michel
2018-03-07
Four slowly growing mycobacterial isolates were recovered from respiratory tract and soft tissue biopsies collected from four unrelated patients in Iran. Conventional phenotypic tests indicated that these four isolates were identical to Mycobacterium lentiflavum while 16S rRNA gene sequencing yielded a unique sequence separated from that of M. lentiflavum. One representative strain AFP-003T was characterized as comprising a 6,121,237-bp chromosome (66.24% guanosine-cytosine content) encoding 5,758 protein-coding genes, 50 tRNA and one complete rRNA operon. A total of 2,876 proteins were found to be associated with the mobilome, including 195 phage proteins. A total of 1,235 proteins were found to be associated with virulence and 96 with toxin/antitoxin systems. The genome of AFP-003T has the genetic potential to produce secondary metabolites, with 39 genes found to be associated with polyketide synthases and non-ribosomal peptide synthases and 11 genes encoding bacteriocins. Two regions encoding putative prophages and three OriC regions separated by the dnaA gene were predicted. The strain AFP-003T genome exhibits 86% average nucleotide identity with the Mycobacterium genavense genome. Genetic and genomic data indicate that strain AFP-003T is representative of a novel Mycobacterium species that we named Mycobacterium ahvazicum, the nineteenth species of the expanding Mycobacterium simiae complex.
Schaeffer, E; Sninsky, J J
1984-01-01
Proteins that are related evolutionarily may have diverged at the level of primary amino acid sequence while maintaining similar secondary structures. Computer analysis has been used to compare the open reading frames of the hepatitis B virus to those of the woodchuck hepatitis virus at the level of amino acid sequence, and to predict the relative hydrophilic character and the secondary structure of putative polypeptides. Similarity is seen at the levels of relative hydrophilicity and secondary structure, in the absence of sequence homology. These data reinforce the proposal that these open reading frames encode viral proteins. Computer analysis of this type can be more generally used to establish structural similarities between proteins that do not share obvious sequence homology as well as to assess whether an open reading frame is fortuitous or codes for a protein. PMID:6585835
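The kind of comparison described here, similarity in hydrophilic character without sequence homology, can be sketched with a sliding-window hydropathy profile. The Kyte-Doolittle scale and the window size are assumptions for illustration; the abstract does not say which scale or parameters were used, and the Pearson comparison below only applies to profiles of equal length.

```python
import math

# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 7):
    """Mean hydropathy over each window of consecutive residues."""
    scores = [KYTE_DOOLITTLE[aa] for aa in seq.upper()]
    if window <= 0 or window > len(scores):
        return []
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def pearson(x, y) -> float:
    """Pearson correlation of two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / math.sqrt(var_x * var_y)
```

Two toy peptides with no identical residues, such as LLLLKKKK and IIIIRRRR, give nearly identical profiles under this scale, which is the situation the abstract describes for the two viruses.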
Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana.
Mayer, K; Schüller, C; Wambutt, R; Murphy, G; Volckaert, G; Pohl, T; Düsterhöft, A; Stiekema, W; Entian, K D; Terryn, N; Harris, B; Ansorge, W; Brandt, P; Grivell, L; Rieger, M; Weichselgartner, M; de Simone, V; Obermaier, B; Mache, R; Müller, M; Kreis, M; Delseny, M; Puigdomenech, P; Watson, M; Schmidtheini, T; Reichert, B; Portatelle, D; Perez-Alonso, M; Boutry, M; Bancroft, I; Vos, P; Hoheisel, J; Zimmermann, W; Wedler, H; Ridley, P; Langham, S A; McCullagh, B; Bilham, L; Robben, J; Van der Schueren, J; Grymonprez, B; Chuang, Y J; Vandenbussche, F; Braeken, M; Weltjens, I; Voet, M; Bastiaens, I; Aert, R; Defoor, E; Weitzenegger, T; Bothe, G; Ramsperger, U; Hilbert, H; Braun, M; Holzer, E; Brandt, A; Peters, S; van Staveren, M; Dirske, W; Mooijman, P; Klein Lankhorst, R; Rose, M; Hauf, J; Kötter, P; Berneiser, S; Hempel, S; Feldpausch, M; Lamberth, S; Van den Daele, H; De Keyser, A; Buysshaert, C; Gielen, J; Villarroel, R; De Clercq, R; Van Montagu, M; Rogers, J; Cronin, A; Quail, M; Bray-Allen, S; Clark, L; Doggett, J; Hall, S; Kay, M; Lennard, N; McLay, K; Mayes, R; Pettett, A; Rajandream, M A; Lyne, M; Benes, V; Rechmann, S; Borkova, D; Blöcker, H; Scharfe, M; Grimm, M; Löhnert, T H; Dose, S; de Haan, M; Maarse, A; Schäfer, M; Müller-Auer, S; Gabel, C; Fuchs, M; Fartmann, B; Granderath, K; Dauner, D; Herzl, A; Neumann, S; Argiriou, A; Vitale, D; Liguori, R; Piravandi, E; Massenet, O; Quigley, F; Clabauld, G; Mündlein, A; Felber, R; Schnabl, S; Hiller, R; Schmidt, W; Lecharny, A; Aubourg, S; Chefdor, F; Cooke, R; Berger, C; Montfort, A; Casacuberta, E; Gibbons, T; Weber, N; Vandenbol, M; Bargues, M; Terol, J; Torres, A; Perez-Perez, A; Purnelle, B; Bent, E; Johnson, S; Tacon, D; Jesse, T; Heijnen, L; Schwarz, S; Scholler, P; Heber, S; Francs, P; Bielke, C; Frishman, D; Haase, D; Lemcke, K; Mewes, H W; Stocker, S; Zaccaria, P; Bevan, M; Wilson, R K; de la Bastide, M; Habermann, K; Parnell, L; Dedhia, N; Gnoj, L; Schutz, K; Huang, E; Spiegel, L; Sehkon, M; 
Murray, J; Sheet, P; Cordes, M; Abu-Threideh, J; Stoneking, T; Kalicki, J; Graves, T; Harmon, G; Edwards, J; Latreille, P; Courtney, L; Cloud, J; Abbott, A; Scott, K; Johnson, D; Minx, P; Bentley, D; Fulton, B; Miller, N; Greco, T; Kemp, K; Kramer, J; Fulton, L; Mardis, E; Dante, M; Pepin, K; Hillier, L; Nelson, J; Spieth, J; Ryan, E; Andrews, S; Geisel, C; Layman, D; Du, H; Ali, J; Berghoff, A; Jones, K; Drone, K; Cotton, M; Joshu, C; Antonoiu, B; Zidanic, M; Strong, C; Sun, H; Lamar, B; Yordan, C; Ma, P; Zhong, J; Preston, R; Vil, D; Shekher, M; Matero, A; Shah, R; Swaby, I K; O'Shaughnessy, A; Rodriguez, M; Hoffmann, J; Till, S; Granat, S; Shohdy, N; Hasegawa, A; Hameed, A; Lodhi, M; Johnson, A; Chen, E; Marra, M; Martienssen, R; McCombie, W R
1999-12-16
The higher plant Arabidopsis thaliana (Arabidopsis) is an important model for identifying plant genes and determining their function. To assist biological investigations and to define chromosome structure, a coordinated effort to sequence the Arabidopsis genome was initiated in late 1996. Here we report one of the first milestones of this project, the sequence of chromosome 4. Analysis of 17.38 megabases of unique sequence, representing about 17% of the genome, reveals 3,744 protein coding genes, 81 transfer RNAs and numerous repeat elements. Heterochromatic regions surrounding the putative centromere, which has not yet been completely sequenced, are characterized by an increased frequency of a variety of repeats, new repeats, reduced recombination, lowered gene density and lowered gene expression. Roughly 60% of the predicted protein-coding genes have been functionally characterized on the basis of their homology to known genes. Many genes encode predicted proteins that are homologous to human and Caenorhabditis elegans proteins.
Khunjan, Uraiwan; Ekchaweng, Kitiya; Panrat, Tanate; Tian, Miaoying; Churngchow, Nunta
2016-01-01
This is the first report to present a full-length cDNA (designated HbPR-1) encoding a putative basic HbPR-1 protein from rubber tree (Hevea brasiliensis) treated with salicylic acid. It was characterized and also expressed in Nicotiana benthamiana using an Agrobacterium-mediated transient gene expression system in order to investigate the role of the HbPR-1 gene in rubber tree against its oomycete pathogen Phytophthora palmivora and to produce recombinant HbPR-1 protein for a microbial inhibition test. The HbPR-1 cDNA was 647 bp long and contained an open reading frame of 492 nucleotides encoding 163 amino acid residues with a predicted molecular mass of 17,681 Da and an isoelectric point (pI) of 8.56, demonstrating that HbPR-1 protein belongs to the basic PR-1 type. The predicted 3D structure of HbPR-1 was composed of four α-helices, three β-sheets, seven strands, and one junction loop. Expression and purification of recombinant HbPR-1 protein were successful using Agrobacterium-mediated transient expression and one-step affinity chromatography. Heterologous expression of HbPR-1 in N. benthamiana reduced the necrotic areas that were inoculated with P. palmivora zoospores, indicating that the expressed HbPR-1 protein played an important role in plant resistance to pathogens. The purified recombinant HbPR-1 protein was found to inhibit 64% of P. palmivora zoospore germination on a water agar plate compared with the control, suggesting that it is an antimicrobial protein against P. palmivora. PMID:27337148
Graentzdoerffer, Andrea; Rauh, David; Pich, Andreas; Andreesen, Jan R
2003-01-01
Two gene clusters encoding similar formate dehydrogenases (FDH) were identified in Eubacterium acidaminophilum. Each cluster is composed of one gene coding for a catalytic subunit (fdhA-I, fdhA-II) and one for an electron-transferring subunit (fdhB-I, fdhB-II). Both fdhA genes contain a TGA codon for selenocysteine incorporation and the encoded proteins harbor five putative iron-sulfur clusters in their N-terminal region. Both FdhB subunits resemble the N-terminal region of FdhA on the amino acid level and contain five putative iron-sulfur clusters. Four genes thought to encode the subunits of an iron-only hydrogenase are located upstream of the FDH gene cluster I. By sequence comparison, HymA and HymB are predicted to contain one and four iron-sulfur clusters, respectively, the latter protein also containing binding sites for FMN and NAD(P). Thus, HymA and HymB seem to represent electron-transferring subunits, and HymC the putative catalytic subunit containing motifs for four iron-sulfur clusters and one H-cluster specific for Fe-only hydrogenases. HymD has six predicted transmembrane helices and might be an integral membrane protein. Viologen-dependent FDH activity was purified from serine-grown cells of E. acidaminophilum and the purified protein complex contained four subunits, FdhA and FdhB, encoded by FDH gene cluster II, and HymA and HymB, identified after determination of their N-terminal sequences. Thus, this complex might represent the simplest type of a formate hydrogen lyase. The purified formate dehydrogenase fraction contained iron, tungsten, a pterin cofactor, and zinc, but no molybdenum. FDH-II had a two-fold higher Km for formate (0.37 mM) than FDH-I and also catalyzed CO2 reduction to formate. Reverse transcription (RT)-PCR pointed to increased expression of FDH-II in serine-grown cells, supporting the isolation of this FDH isoform. The fdhA-I gene was expressed as inactive protein in Escherichia coli. The in-frame UGA codon for selenocysteine incorporation was read in the heterologous system only as a stop codon, although its potential SECIS element exhibited quite high similarity to that of E. coli FDH.
Barnard, G F; Staniunas, R J; Puder, M; Steele, G D; Chen, L B
1994-08-02
Ribosomal protein L37 mRNA is overexpressed in colon cancer. The nucleotide sequences of human L37 from several tumor and normal, colon and liver cDNA sources were determined to be identical. L37 mRNA was approximately 375 nucleotides long encoding 97 amino acids with M(r) = 11,070, pI = 12.6, multiple potential serine/threonine phosphorylation sites and a zinc-finger domain. The human sequence is compared to other species.
Smoot, L M; Smoot, J C; Graham, M R; Somerville, G A; Sturdevant, D E; Migliaccio, C A; Sylva, G L; Musser, J M
2001-08-28
Pathogens are exposed to different temperatures during an infection cycle and must regulate gene expression accordingly. However, the extent to which virulent bacteria alter gene expression in response to temperatures encountered in the host is unknown. Group A Streptococcus (GAS) is a human-specific pathogen that is responsible for illnesses ranging from superficial skin infections and pharyngitis to severe invasive infections such as necrotizing fasciitis and streptococcal toxic shock syndrome. GAS survives and multiplies at different temperatures during human infection. DNA microarray analysis was used to investigate the influence of temperature on global gene expression in a serotype M1 strain grown to exponential phase at 29 degrees C and 37 degrees C. Approximately 9% of genes were differentially expressed by at least 1.5-fold at 29 degrees C relative to 37 degrees C, including genes encoding transporter proteins, proteins involved in iron homeostasis, transcriptional regulators, phage-associated proteins, and proteins with no known homologue. Relatively few known virulence genes were differentially expressed at this threshold. However, transcription of 28 genes encoding proteins with predicted secretion signal sequences was altered, indicating that growth temperature substantially influences the extracellular proteome. TaqMan real-time reverse transcription-PCR assays confirmed the microarray data. We also discovered that transcription of genes encoding hemolysins, and proteins with inferred roles in iron regulation, transport, and homeostasis, was influenced by growth at 40 degrees C. Thus, GAS profoundly alters gene expression in response to temperature. The data delineate the spectrum of temperature-regulated gene expression in an important human pathogen and provide many unforeseen lines of pathogenesis investigation.
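The 1.5-fold criterion used in this abstract can be written as a small filter. The gene names, expression values and function name below are hypothetical, and the sketch ignores the replicate handling and statistics a real microarray analysis would need.

```python
def differentially_expressed(expr_29c, expr_37c, min_fold=1.5):
    """Return {gene: ratio} for genes changed by at least min_fold.

    expr_29c, expr_37c: dicts mapping gene name to a positive expression
    value at 29 and 37 degrees C. The ratio is 29 C relative to 37 C.
    """
    hits = {}
    for gene in expr_29c.keys() & expr_37c.keys():
        low_temp, high_temp = expr_29c[gene], expr_37c[gene]
        if low_temp <= 0 or high_temp <= 0:
            continue  # a ratio is undefined without positive signal
        ratio = low_temp / high_temp
        # Keep genes that are up OR down by at least min_fold.
        if ratio >= min_fold or ratio <= 1.0 / min_fold:
            hits[gene] = ratio
    return hits
```

Treating increases and decreases symmetrically means a gene at half its 37 degrees C level counts as a 2-fold change, the same as a gene at twice that level.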
Baumgartner, Desiree; Kopf, Matthias; Klähn, Stephan; Steglich, Claudia; Hess, Wolfgang R
2016-11-28
Despite their versatile functions in multimeric protein complexes, in the modification of enzymatic activities, intercellular communication or regulatory processes, proteins shorter than 80 amino acids (μ-proteins) are a systematically underestimated class of gene products in bacteria. Photosynthetic cyanobacteria provide a paradigm for small protein functions due to extensive work on the photosynthetic apparatus that led to the functional characterization of 19 small proteins of less than 50 amino acids. In analogy, previously unstudied small ORFs with similar degrees of conservation might encode small proteins of high relevance also in other functional contexts. Here we used comparative transcriptomic information available for two model cyanobacteria, Synechocystis sp. PCC 6803 and Synechocystis sp. PCC 6714, for the prediction of small ORFs. We found 293 transcriptional units containing candidate small ORFs ≤80 codons in Synechocystis sp. PCC 6803, also including the known mRNAs encoding small proteins of the photosynthetic apparatus. From these transcriptional units, 146 are shared between the two strains, 42 are shared with the higher plant Arabidopsis thaliana and 25 with E. coli. To verify the existence of the respective μ-proteins in vivo, we selected five genes as examples, added a FLAG tag sequence to each and re-introduced them into Synechocystis sp. PCC 6803. These were the previously annotated gene ssr1169, two newly defined genes norf1 and norf4, as well as nsiR6 (nitrogen stress-induced RNA 6) and hliR1 (high light-inducible RNA 1), which originally were considered non-coding. Upon activation of expression via the Cu2+-responsive petE promoter or from the native promoters, all five proteins were detected in Western blot experiments. The distribution and conservation of these five genes as well as their regulation of expression and the physico-chemical properties of the encoded proteins underline the likely great bandwidth of small protein functions in bacteria and make them attractive candidates for functional studies.
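A candidate list of small ORFs at or below 80 codons can be approximated with a simple scan, sketched below. This is not the transcriptome-guided procedure of the study: it assumes forward strand only, ATG as the sole start codon (bacteria also use alternative starts), and a hypothetical lower bound of 10 codons to suppress noise.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_small_orfs(seq: str, max_codons: int = 80, min_codons: int = 10):
    """Return (start, end, codons) for small ORFs on the forward strand.

    start and end are 0-based, end-exclusive and include the stop codon;
    codons counts the start codon but not the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] != "ATG":
                i += 3
                continue
            # Walk codon by codon until the first in-frame stop.
            j = i
            while j + 3 <= len(seq) and seq[j:j + 3] not in STOP_CODONS:
                j += 3
            if j + 3 > len(seq):
                break  # no in-frame stop remains in this frame
            codons = (j - i) // 3
            if min_codons <= codons <= max_codons:
                orfs.append((i, j + 3, codons))
            i = j + 3  # resume scanning after the stop codon
    return orfs
```

Each hit would still need experimental support of the kind described in the abstract, since a short in-frame stretch between ATG and a stop codon arises easily by chance.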
Characterisation of single domain ATP-binding cassette protein homologues of Theileria parva.
Kibe, M K; Macklin, M; Gobright, E; Bishop, R; Urakawa, T; ole-MoiYoi, O K
2001-09-01
Two distinct genes encoding single domain, ATP-binding cassette transport protein homologues of Theileria parva were cloned and sequenced. Neither of the genes is tandemly duplicated. One gene, TpABC1, encodes a predicted protein of 593 amino acids with an N-terminal hydrophobic domain containing six potential membrane-spanning segments. A single discontinuous ATP-binding element was located in the C-terminal region of TpABC1. The second gene, TpABC2, also contains a single C-terminal ATP-binding motif. Copies of TpABC2 were present at four loci in the T. parva genome on three different chromosomes. TpABC1 exhibited allelic polymorphism between stocks of the parasite. Comparison of cDNA and genomic sequences revealed that TpABC1 contained seven short introns, between 29 and 84 bp in length. The full-length TpABC1 protein was expressed in insect cells using the baculovirus system. Application of antibodies raised against the recombinant antigen to western blots of T. parva piroplasm lysates detected an 85 kDa protein in this life-cycle stage.
Gardenia jasminoides Encodes an Inhibitor-2 Protein for Protein Phosphatase Type 1
NASA Astrophysics Data System (ADS)
Gao, Lan; Li, Hao-Ming
2017-08-01
Protein phosphatase-1 (PP1) regulates diverse, essential cellular processes such as cell cycle progression, protein synthesis, muscle contraction, carbohydrate metabolism, transcription and neuronal signaling. Inhibitor-2 (I-2) can inhibit the activity of PP1 and has been found in diverse organisms. In this work, a Gardenia jasminoides fruit cDNA library was constructed, and the GjI-2 cDNA was isolated from the cDNA library by sequencing. The GjI-2 cDNA contains a predicted 543 bp open reading frame that encodes 180 amino acids. The bioinformatics analysis suggested that GjI-2 has a conserved PP1c-binding motif and contains a conserved phosphorylation site, which is important in the regulation of its activity. A three-dimensional model structure of GjI-2 was built; it is similar to the structure of I-2 from mouse. The results suggest that GjI-2 has relatively conserved RVxF, FxxR/KxR/K and HYNE motifs, and that these motifs are involved in the interaction with PP1.
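A motif scan of the kind implied by this analysis can be sketched with a regular expression. The pattern below is a commonly cited simplified RVxF consensus, [RK]-x(0,1)-[VI]-{P}-[FW], used here as an assumption; the abstract does not state which pattern or tool the authors applied, and the example sequence is made up.

```python
import re

# Simplified RVxF-type consensus: basic residue, optional spacer,
# V or I, any residue except proline, then F or W.
RVXF_PATTERN = re.compile(r"[RK].?[VI][^P][FW]")


def find_rvxf_like(seq: str):
    """Return (0-based start, matched text) for each RVxF-like site."""
    return [
        (match.start(), match.group())
        for match in RVXF_PATTERN.finditer(seq.upper())
    ]
```

Because the consensus is short and degenerate, matches in an arbitrary sequence are candidates only; known I-2 proteins also carry variants that a strict pattern like this one would miss.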
Mutations in CSPP1 lead to classical Joubert syndrome.
Akizu, Naiara; Silhavy, Jennifer L; Rosti, Rasim Ozgur; Scott, Eric; Fenstermaker, Ali G; Schroth, Jana; Zaki, Maha S; Sanchez, Henry; Gupta, Neerja; Kabra, Madhulika; Kara, Majdi; Ben-Omran, Tawfeg; Rosti, Basak; Guemez-Gamboa, Alicia; Spencer, Emily; Pan, Roger; Cai, Na; Abdellateef, Mostafa; Gabriel, Stacey; Halbritter, Jan; Hildebrandt, Friedhelm; van Bokhoven, Hans; Gunel, Murat; Gleeson, Joseph G
2014-01-02
Joubert syndrome and related disorders (JSRDs) are genetically heterogeneous and characterized by a distinctive mid-hindbrain malformation. Causative mutations lead to primary cilia dysfunction, which often results in variable involvement of other organs such as the liver, retina, and kidney. We identified predicted null mutations in CSPP1 in six individuals affected by classical JSRDs. CSPP1 encodes a protein localized to centrosomes and spindle poles, as well as to the primary cilium. Despite the known interaction between CSPP1 and nephronophthisis-associated proteins, none of the affected individuals in our cohort presented with kidney disease, and further, screening of a large cohort of individuals with nephronophthisis demonstrated no mutations. CSPP1 is broadly expressed in neural tissue, and its encoded protein localizes to the primary cilium in an in vitro model of human neurogenesis. Here, we show abrogated protein levels and ciliogenesis in affected fibroblasts. Our data thus suggest that CSPP1 is involved in neural-specific functions of primary cilia. Copyright © 2014 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Sun, Yan-Lin; Hong, Soon-Kwan
2012-08-01
Sea buckthorn (Hippophae rhamnoides L.) is naturally distributed from Asia to Europe. It has been widely planted as an ornamental shrub and is rich in nutritional and medicinal compounds. Fungal pathogens that cause diseases such as dried-shrink disease are threats to the production of this plant. In this study, we isolated the dried-shrink disease pathogen from bark and total chitinase protein from leaves of infected plants. The results of the Oxford Cup experiment suggested that chitinase protein inhibited the growth of this pathogen. To improve pathogen resistance, we cloned chitinase Class I and III genes in H. rhamnoides, designated Hrchi1 and Hrchi3. The full-length cDNA of the open reading frame region of Hrchi1 contained 903 bp encoding 300 amino acids and Hrchi3 contained 894 bp encoding 297 amino acids. Active domain analysis, protein types, and secondary and 3D structures were predicted using online software.
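The relationship between the reported ORF lengths and protein lengths can be checked with simple arithmetic. The sketch below assumes that each quoted ORF length includes the stop codon, which the abstract does not state explicitly; the function name is illustrative only.

```python
def orf_to_residues(orf_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by an ORF of orf_bp nucleotides.

    Assumption: when includes_stop is True, the last codon is a stop codon
    and does not contribute a residue.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# ORF lengths quoted in the abstract above.
for name, bp in [("Hrchi1", 903), ("Hrchi3", 894)]:
    print(name, orf_to_residues(bp))
```

Under that assumption the two ORFs give 300 and 297 residues, matching the values in the abstract.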
Molecular cloning and characterization of Aspergillus nidulans cyclophilin B.
Joseph, J D; Heitman, J; Means, A R
1999-06-01
Cyclophilins are an evolutionarily conserved family of proteins which serve as the intracellular receptors for the immunosuppressive drug cyclosporin A. Here we report the characterization of the first cyclophilin cloned from the filamentous fungus Aspergillus nidulans (CYPB). Sequence analysis of the cypB gene predicts an encoded protein with highest homology to the murine cyclophilin B protein. The sequence similarity includes an N-terminal sequence predicted to target the protein to the endoplasmic reticulum (ER) as well as a C-terminal sequence predicted to retain the mature protein in the ER. The bacterially expressed hexa-histidine tagged protein displays peptidyl-prolyl isomerase activity which is inhibited by cyclosporin A. In the presence of cyclosporin A, the expressed protein also inhibits purified calcineurin. When the endogenous cypB gene was disrupted and placed under the control of the regulatable alcohol dehydrogenase promoter, the strain demonstrated no detectable growth phenotype under conditions which induce or repress cypB transcription. Induction or repression of the cypB gene also did not affect sensitivity of A. nidulans to cyclosporin A. cypB mRNA levels were significantly elevated under severe heat shock conditions, indicating a possible role for the A. nidulans cyclophilin B protein during growth in high stress environments. Copyright 1999 Academic Press.
You, Zhu-Hong; Lei, Ying-Ke; Zhu, Lin; Xia, Junfeng; Wang, Bing
2013-01-01
Protein-protein interactions (PPIs) play crucial roles in the execution of various cellular processes and form the basis of biological mechanisms. Although a large amount of PPI data for different species has been generated by high-throughput experimental techniques, the PPI pairs obtained with experimental methods cover only a fraction of the complete PPI networks, and the experimental methods for identifying PPIs are both time-consuming and expensive. Hence, it is urgent and challenging to develop automated computational methods to predict PPIs efficiently and accurately. We present here a novel hierarchical PCA-EELM (principal component analysis-ensemble extreme learning machine) model to predict protein-protein interactions using only the information in protein sequences. In the proposed method, 11188 protein pairs retrieved from the DIP database were encoded into feature vectors by using four kinds of protein sequence information. For dimension reduction, PCA, an effective feature extraction method, was then employed to construct the most discriminative new feature set. Finally, multiple extreme learning machines were trained and then aggregated into a consensus classifier by majority voting. Ensembling the extreme learning machines removes the dependence of the results on the initial random weights and improves prediction performance. When applied to the PPI data of Saccharomyces cerevisiae, the proposed method achieved 87.00% prediction accuracy with 86.15% sensitivity at a precision of 87.59%. Extensive experiments were performed to compare our method with a state-of-the-art technique, the Support Vector Machine (SVM). Experimental results under 5-fold cross-validation demonstrate that the proposed PCA-EELM outperforms the SVM method. In addition, PCA-EELM runs faster than the PCA-SVM based method. Consequently, the proposed approach can be considered a promising and powerful new tool for predicting PPIs with excellent performance and less time.
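The aggregation step and the reported metrics can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the three "classifiers" below are hard-coded toy label lists standing in for trained extreme learning machines, and the function names are hypothetical. It only shows how binary votes are combined by majority and how accuracy, sensitivity and precision follow from the consensus labels.

```python
def majority_vote(predictions):
    """Combine binary (0/1) label lists, one per classifier, into a consensus.

    Assumes an odd number of classifiers so that ties cannot occur.
    """
    n_classifiers = len(predictions)
    return [1 if 2 * sum(votes) > n_classifiers else 0
            for votes in zip(*predictions)]


def binary_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, precision) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return accuracy, sensitivity, precision


# Toy data: 1 = interacting pair, 0 = non-interacting pair.
y_true = [1, 1, 1, 0, 0, 0]
clf_a = [1, 1, 0, 0, 0, 1]
clf_b = [1, 0, 1, 0, 1, 0]
clf_c = [1, 1, 1, 1, 0, 1]

consensus = majority_vote([clf_a, clf_b, clf_c])
accuracy, sensitivity, precision = binary_metrics(y_true, consensus)
print(consensus, accuracy, sensitivity, precision)
```

In this toy case the consensus makes one error, whereas each individual label list makes two, which is the general motivation for voting over several randomly initialized learners.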
Zhu, Chen; Ai, Lin; Wang, Li; Yin, Pingping; Liu, Chenglan; Li, Shanshan; Zeng, Huiming
2016-01-01
Zoysia japonica brown spot is caused by invasion of the necrotrophic fungus Rhizoctonia solani and leads to severe financial losses in city lawn and golf course maintenance. However, little is known about the molecular mechanism of R. solani pathogenicity in Z. japonica. In this study we examined the early-stage interaction between R. solani AG1 IA strain and roots of Z. japonica cultivar "Zenith" by cell ultrastructure analysis, pathogenesis-related protein assays and transcriptome analysis to explore molecular clues to AG1 IA strain pathogenicity in Z. japonica. No obvious cell structure damage was found in infected roots, and most pathogenesis-related protein activities showed a downward trend, especially at 36 h post inoculation, which indicates the stealthy invasion characteristic of the AG1 IA strain. According to Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) database classification, most differentially expressed genes (DEGs) in infected "Zenith" roots changed dynamically, especially in three aspects: signal transduction, gene translation, and protein synthesis. A total of 3422 unigenes of "Zenith" root were predicted and assigned to 14 classes of resistance (R) genes. Potential fungal resistance-related unigenes of "Zenith" root were involved in lignin biosynthesis, phytoalexin synthesis, oxidative burst, and wax biosynthesis, while two down-regulated unigenes encoding a leucine-rich repeat receptor protein kinase and a subtilisin-like protease might be important for host-derived signal perception of AG1 IA strain invasion. According to Pathogen Host Interaction (PHI) database annotation, 1508 unigenes of AG1 IA strain were predicted and classified into 37 known pathogen species; in addition, unigenes encoding virulence, signaling, host stress tolerance, and potential effector proteins were also predicted. This research uncovered transcriptional profiles during the early-phase interaction between R. solani AG1 IA strain and Z. japonica, and will greatly help identify key pathogenicity factors of the AG1 IA strain.
Schmitz-Esser, Stephan; Tischler, Patrick; Arnold, Roland; Montanaro, Jacqueline; Wagner, Michael; Rattei, Thomas; Horn, Matthias
2010-01-01
Protozoa play host for many intracellular bacteria and are important for the adaptation of pathogenic bacteria to eukaryotic cells. We analyzed the genome sequence of “Candidatus Amoebophilus asiaticus,” an obligate intracellular amoeba symbiont belonging to the Bacteroidetes. The genome has a size of 1.89 Mbp, encodes 1,557 proteins, and shows massive proliferation of IS elements (24% of all genes), although the genome seems to be evolutionarily relatively stable. The genome does not encode pathways for de novo biosynthesis of cofactors, nucleotides, and almost all amino acids. “Ca. Amoebophilus asiaticus” encodes a variety of proteins with predicted importance for host cell interaction; in particular, an arsenal of proteins with eukaryotic domains, including ankyrin-, TPR/SEL1-, and leucine-rich repeats, which is hitherto unmatched among prokaryotes, is remarkable. Unexpectedly, 26 proteins that can interfere with the host ubiquitin system were identified in the genome. These proteins include F- and U-box domain proteins and two ubiquitin-specific proteases of the CA clan C19 family, representing the first prokaryotic members of this protein family. Consequently, interference with the host ubiquitin system is an important host cell interaction mechanism of “Ca. Amoebophilus asiaticus”. More generally, we show that the eukaryotic domains identified in “Ca. Amoebophilus asiaticus” are also significantly enriched in the genomes of other amoeba-associated bacteria (including chlamydiae, Legionella pneumophila, Rickettsia bellii, Francisella tularensis, and Mycobacterium avium). This indicates that phylogenetically and ecologically diverse bacteria which thrive inside amoebae exploit common mechanisms for interaction with their hosts, and it provides further evidence for the role of amoebae as training grounds for bacterial pathogens of humans. PMID:20023027
Van Hove, B; Staudenmaier, H; Braun, V
1990-12-01
Citrate and iron have to enter only the periplasmic space in order to induce the citrate-dependent iron(III) transport system of Escherichia coli. The five transport genes fecABCDE form an operon and are transcribed from fecA to fecE. Two genes, termed fecI and fecR, that mediate induction by iron(III) dicitrate have been identified upstream of fecA. The fecI gene encodes a protein of 173 amino acids (molecular weight, 19,478); the fecR gene encodes a protein of 317 amino acids (molecular weight, 35,529). Chromosomal fecI::Mu d1 mutants were unable to grow with iron(III) dicitrate as the sole iron source and synthesized no FecA outer membrane receptor protein. Growth was restored by transformation with plasmids encoding fecI or fecI and fecR. FecA and beta-galactosidase syntheses under transcription control of the fecB gene (fecB::Mu d1) were constitutive in fecI transformants and were regulated by iron(III) dicitrate in fecI fecR transformants. The amino acid sequence of the FecI protein contains a region close to the carboxy-terminal end for which a helix-turn-helix motif is predicted, which is typical for DNA-binding regulatory proteins. The FecI protein was found in the membrane, and the FecR protein was found in the periplasmic fraction. It is proposed that the FecR protein is the sensor that recognizes iron(III) dicitrate in the periplasm. The FecI protein activates fec gene expression by binding to the fec operator region. In the absence of citrate, FecR inactivates FecI. The lack of sequence homologies to other transmembrane signaling proteins and the location of the two proteins suggest a new type of transmembrane control mechanism.
Poliovirus replication proteins: RNA sequence encoding P3-1b and the sites of proteolytic processing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Semler, B.L.; Anderson, C.W.; Kitamura, N.
1981-06-01
A partial amino-terminal amino acid sequence of each of the major proteins encoded by the replicase region of the poliovirus genome has been determined. A comparison of this sequence information with the amino acid sequence predicted from the RNA sequence that has been determined for the 3' region of the poliovirus genome has allowed us to locate precisely the proteolytic cleavage sites at which the initial polyprotein is processed to create the poliovirus products P3-1b (NCVP1b), P3-2 (NCVP2), P3-4b (NCVP4b), and P3-7c (NCVP7c). For each of these products, as well as for the small genome-linked protein VPg, proteolytic cleavage occurs between a glutamine and a glycine residue to create the amino terminus of each protein. This result suggests that a single proteinase may be responsible for all of these cleavages. The sequence data also allow the precise positioning of the genome-linked protein VPg within the precursor P3-1b just proximal to the amino terminus of polypeptide P3-2.
Ellard-Ivey, M; Hopkins, R B; White, T J; Lomax, T L
1999-01-01
We have isolated a full-length cDNA clone (CpCPK1) encoding a calcium-dependent protein kinase (CDPK) gene from zucchini (Cucurbita pepo L.). The predicted amino acid sequence of the cDNA shows a remarkably high degree of similarity to members of the CDPK gene family from Arabidopsis thaliana, especially AtCPK1 and AtCPK2. Northern analysis of steady-state mRNA levels for CpCPK1 in etiolated and light-grown zucchini seedlings shows that the transcript is most abundant in etiolated hypocotyls and overall expression is suppressed by light. As described for other members of the CDPK gene family from different species, the CpCPK1 clone has a putative N-terminal myristoylation sequence. In this study, site-directed mutagenesis and an in vitro coupled transcription/translation system were used to demonstrate that the protein encoded by this cDNA is specifically myristoylated by a plant N-myristoyl transferase. This is the first demonstration of myristoylation of a CDPK protein, which may contribute to the mechanism by which this protein is localized to the plasma membrane.
1990-01-01
The major histological components of the hair follicle are the hair cortex and cuticle. The hair cuticle cells encase and protect the cortex and undergo a different developmental program to that of the cortex. We report the molecular characterization of a set of evolutionarily conserved hair genes which are transcribed in the hair cuticle late in follicle development. Two genes were isolated and characterized, one expressed in the human follicle and one in the sheep follicle. Each gene encodes a small protein of 16 kD, containing greater than 50 cysteine residues, ranging from 31 to 36 mol% cysteine. Their high cysteine content and in vitro expression data identify them as ultra-high-sulfur (UHS) keratin proteins. The predicted proteins are composed almost entirely of cysteine-rich and glycine-rich repeats. Genomic blots reveal that the UHS keratin proteins are encoded by related multigene families in both the human and sheep genomes. Tissue in situ hybridization demonstrates that the expression of both genes is localized to the hair fiber cuticle and occurs at a late stage in fiber morphogenesis. PMID:1703541
Fox, Ellen M.; Gardiner, Donald M.; Keller, Nancy P.; Howlett, Barbara J.
2008-01-01
A gene, sirZ, encoding a Zn(II)2Cys6 DNA binding protein is present in a cluster of genes responsible for the biosynthesis of the epipolythiodioxopiperazine (ETP) toxin, sirodesmin PL in the ascomycete plant pathogen, Leptosphaeria maculans. RNA-mediated silencing of sirZ gives rise to transformants that produce only residual amounts of sirodesmin PL and display a decrease in the transcription of several sirodesmin PL biosynthetic genes. This indicates that SirZ is a major regulator of this gene cluster. Proteins similar to SirZ are encoded in the gliotoxin biosynthetic gene cluster of Aspergillus fumigatus (gliZ) and in an ETP-like cluster in Penicillium lilacinoechinulatum (PlgliZ). Despite its high level of sequence similarity to gliZ, PlgliZ is unable to complement the gliotoxin-deficiency of a mutant of gliZ in A. fumigatus. Putative binding sites for these regulatory proteins in the promoters of genes in these clusters were predicted using bioinformatic analysis. These sites are similar to those commonly bound by other proteins with Zn(II)2Cys6 DNA binding domains. PMID:18023597
Mahan, Kristina M.; Klingeman, Dawn Marie; Robert L. Hettich; ...
2016-01-21
Streptomyces vitaminophilus produces pyrrolomycins, which are halogenated polyketide antibiotics. Some of the pyrrolomycins contain a rare nitro group located on the pyrrole ring. In addition, the 6.5-Mbp genome encodes 5,941 predicted protein-coding sequences in 39 contigs with a 71.9% G+C content.
Klingeman, Dawn M.; Hettich, Robert L.; Parry, Ronald J.
2016-01-01
Streptomyces vitaminophilus produces pyrrolomycins, which are halogenated polyketide antibiotics. Some of the pyrrolomycins contain a rare nitro group located on the pyrrole ring. The 6.5-Mbp genome encodes 5,941 predicted protein-coding sequences in 39 contigs with a 71.9% G+C content. PMID:26798098
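A genome-wide G+C percentage such as the one quoted above is normally obtained by pooling base counts over all contigs rather than averaging per-contig values. The sketch below shows that calculation on two made-up toy contigs; the function name and the choice to ignore ambiguous bases such as N are assumptions for illustration, not details taken from the announcement.

```python
def genome_gc_percent(contigs):
    """Return pooled G+C content (percent) over an iterable of contig strings.

    Only unambiguous bases (A, C, G, T) are counted; anything else,
    such as N, is ignored.
    """
    gc = 0
    total = 0
    for contig in contigs:
        seq = contig.upper()
        g_and_c = seq.count("G") + seq.count("C")
        a_and_t = seq.count("A") + seq.count("T")
        gc += g_and_c
        total += g_and_c + a_and_t
    if total == 0:
        raise ValueError("no unambiguous bases found")
    return 100.0 * gc / total


# Two toy contigs (not real S. vitaminophilus sequence).
toy_contigs = ["GGCCGCTA", "GCGCAT"]
print(round(genome_gc_percent(toy_contigs), 2))
```

Pooling weights each contig by its length, so a short, atypical contig does not distort the genome-wide figure.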
The nop gene from Phanerochaete chrysosporium encodes a peroxidase with novel structural features
Luis F. Larrondo; Angel Gonzalez; Tomas Perez-Acle; Dan Cullen; Rafael Vicuna
2005-01-01
Inspection of the genome of the ligninolytic basidiomycete Phanerochaete chrysosporium revealed an unusual peroxidase-like sequence. The corresponding full length cDNA was sequenced and an archetypal secretion signal predicted. The deduced mature protein (NoP, novel peroxidase) contains 295 aa residues and is therefore considerably shorter than other Class II (fungal)...
Santona, Antonella; Carta, Franco; Fraghí, Peppinetta; Turrini, Franco
2002-01-01
As a first step toward the design of an epitope vaccine to prevent contagious agalactia, the strongly immunogenic 55-kDa protein of Mycoplasma agalactiae was studied and found to correspond to the AvgC protein encoded by the avgC gene. The avg genes of M. agalactiae, which encode four variable surface lipoproteins, display a significant homology to the vsp (variable membrane surface lipoproteins) genes of the bovine pathogen Mycoplasma bovis at their promoter region as well as their N-terminus-encoding regions. Some members of the Vsp family are known to be involved in cytoadhesion to host cells. In order to localize immunogenic peptides in the AvgC antigen, the protein sequence was submitted to epitope prediction analysis, and five sets of overlapping peptides, corresponding to five selected regions, were synthesized by Spot synthesis. Reactive peptides were selected by immunobinding assay with sera from infected sheep. The three most immunogenic epitopes were shown to be surface exposed by immunoprecipitation assays, and one of these was specifically recognized by all tested sera. Our study indicates that selected epitopes of the AvgC lipoprotein may be used to develop a peptide-based vaccine which is effective against M. agalactiae infection. PMID:11748179
Li, Na; Yan, Yunhuan; Zhang, Angke; Gao, Jiming; Zhang, Chong; Wang, Xue; Hou, Gaopeng; Zhang, Gaiping; Jia, Jinbu; Zhou, En-Min; Xiao, Shuqi
2016-12-13
Many viruses encode microRNAs (miRNAs) that are small non-coding single-stranded RNAs which play critical roles in virus-host interactions. Porcine reproductive and respiratory syndrome virus (PRRSV) is one of the most economically impactful viruses in the swine industry. The present study sought to determine whether PRRSV encodes miRNAs that could regulate PRRSV replication. Four viral small RNAs (vsRNAs) were mapped to the stem-loop structures in the ORF1a, ORF1b and GP2a regions of the PRRSV genome by bioinformatics prediction and experimental verification. Of these, the structures with the lowest minimum free energy (MFE) values predicted for PRRSV-vsRNA1 corresponded to typical stem-loop, hairpin structures. Inhibition of PRRSV-vsRNA1 function led to significant increases in viral replication. Transfection with PRRSV-vsRNA1 mimics significantly inhibited PRRSV replication in primary porcine alveolar macrophages (PAMs). The time-dependent increase in the abundance of PRRSV-vsRNA1 mirrored the gradual upregulation of PRRSV RNA expression. Knockdown of proteins associated with cellular miRNA biogenesis demonstrated that Drosha and Argonaute (Ago2) are involved in PRRSV-vsRNA1 biogenesis. Moreover, PRRSV-vsRNA1 bound specifically to the nonstructural protein 2 (NSP2)-coding sequence of PRRSV genome RNA. Collectively, the results reveal that PRRSV encodes a functional PRRSV-vsRNA1 which auto-regulates PRRSV replication by directly targeting and suppressing viral NSP2 gene expression. These findings not only provide new insights into the mechanism of the pathogenesis of PRRSV, but also explore a potential avenue for controlling PRRSV infection using viral small RNAs.
Jo, Yeong Deuk; Ha, Yeaseong; Lee, Joung-Ho; Park, Minkyu; Bergsma, Alex C; Choi, Hong-Il; Goritschnig, Sandra; Kloosterman, Bjorn; van Dijk, Peter J; Choi, Doil; Kang, Byoung-Cheorl
2016-10-01
Using fine mapping techniques, the genomic region co-segregating with Restorer-of-fertility (Rf) in pepper was delimited to a region of 821 kb in length. A PPR gene in this region, CaPPR6, was identified as a strong candidate for Rf based on expression pattern and characteristics of encoding sequence. Cytoplasmic-genic male sterility (CGMS) has been used for the efficient production of hybrid seeds in peppers (Capsicum annuum L.). Although the mitochondrial candidate genes that might be responsible for cytoplasmic male sterility (CMS) have been identified, the nuclear Restorer-of-fertility (Rf) gene has not been isolated. To identify the genomic region co-segregating with Rf in pepper, we performed fine mapping using an Rf-segregating population consisting of 1068 F2 individuals, based on BSA-AFLP and a comparative mapping approach. Through six cycles of chromosome walking, the co-segregating region harboring the Rf locus was delimited to be within 821 kb of sequence. Prediction of expressed genes in this region based on transcription analysis revealed four candidate genes. Among these, CaPPR6 encodes a pentatricopeptide repeat (PPR) protein with PPR motifs that are repeated 14 times. Characterization of the CaPPR6 protein sequence, based on alignment with other homologs, showed that CaPPR6 is a typical Rf-like (RFL) gene reported to have undergone diversifying selection during evolution. A marker developed from a sequence near CaPPR6 showed a higher prediction rate of the Rf phenotype than those of previously developed markers when applied to a panel of breeding lines of diverse origin. These results suggest that CaPPR6 is a strong candidate for the Rf gene in pepper.
Puranik, Swati; Bahadur, Ranjit Prasad; Srivastava, Prem S; Prasad, Manoj
2011-10-01
The plant-specific NAC (NAM, ATAF, and CUC) transcription factors have diverse roles in development and stress regulation. A transcript encoding a NAC protein, termed SiNAC, was identified from a salt stress subtractive cDNA library of S. italica seedlings (Puranik et al., J Plant Physiol 168:280-287, 2011). This single/low copy gene, containing four exons and four introns within the genomic sequence, encoded a protein of 462 amino acids. Structural analysis revealed that the highly divergent C terminus contains a transmembrane domain. The NAC domain consisted of a twisted antiparallel beta-sheet packing against an N-terminal alpha helix on one side and a shorter helix on the other side. The domain was predicted to homodimerize and control DNA-binding specificity. The physicochemical features of the SiNAC homodimer interface justified the dimeric form of the predicted model. A 1539 bp fragment upstream of the start codon of the SiNAC gene was cloned, and in silico analysis revealed several putative cis-acting regulatory elements within the promoter sequence. Transactivation analysis indicated that SiNAC activated expression of a reporter gene and that the activation domain lay at the C terminus. SiNAC:GFP was detected in the nucleus and cytoplasm, while SiNAC ΔC(1-158):GFP was nuclear localized in onion epidermal cells. SiNAC transcripts mostly accumulated in young spikes and were strongly induced by dehydration, salinity, ethephon, and methyl jasmonate. These results suggest that SiNAC encodes a membrane-associated NAC-domain protein that may function as a transcriptional activator in response to stress and developmental regulation in plants.
In silico analysis of subtilisin from Glaciozyma antarctica PI12
NASA Astrophysics Data System (ADS)
Mustafha, Siti Mardhiah; Murad, Abdul Munir Abdul; Mahadi, Nor Muhammad; Kamaruddin, Shazilah; Bakar, Farah Diba Abu
2015-09-01
Subtilisins are major industrial enzymes with a wide range of applications, especially in the detergent industry. In this study, a cDNA encoding a subtilisin (GaSUBT) was extracted from the psychrophilic yeast Glaciozyma antarctica PI12, PCR amplified and sequenced. Various bioinformatics tools were used to characterize GaSUBT. GaSUBT contains 1587 bp encoding 529 amino acids. The predicted molecular weight of the deduced protein is 55.34 kDa with an isoelectric point of 6.25. GaSUBT was predicted to possess a signal peptide and a pro-peptide consisting of a peptidase inhibitor I9 sequence. Sequence alignment of the deduced amino acid sequence with other subtilisins in the NCBI database showed that the sequences surrounding the catalytic triad that forms the catalytic domain are well conserved.
The Caulobacter crescentus phage phiCbK: genomics of a canonical phage
2012-01-01
Background The bacterium Caulobacter crescentus is a popular model for the study of cell cycle regulation and senescence. The large prolate siphophage phiCbK has been an important tool in C. crescentus biology, and has been studied in its own right as a model for viral morphogenesis. Although a system of some interest, to date little genomic information is available on phiCbK or its relatives. Results Five novel phiCbK-like C. crescentus bacteriophages, CcrMagneto, CcrSwift, CcrKarma, CcrRogue and CcrColossus, were isolated from the environment. The genomes of phage phiCbK and these five environmental phage isolates were obtained by 454 pyrosequencing. The phiCbK-like phage genomes range in size from 205 kb encoding 318 proteins (phiCbK) to 280 kb encoding 448 proteins (CcrColossus), and were found to contain nonpermuted terminal redundancies of 10 to 17 kb. A novel method of terminal ligation was developed to map genomic termini, which confirmed termini predicted by coverage analysis. This suggests that sequence coverage discontinuities may be useable as predictors of genomic termini in phage genomes. Genomic modules encoding virion morphogenesis, lysis and DNA replication proteins were identified. The phiCbK-like phages were also found to encode a number of intriguing proteins; all contain a clearly T7-like DNA polymerase, and five of the six encode a possible homolog of the C. crescentus cell cycle regulator GcrA, which may allow the phage to alter the host cell’s replicative state. The structural proteome of phage phiCbK was determined, identifying the portal, major and minor capsid proteins, the tail tape measure and possible tail fiber proteins. All six phage genomes are clearly related; phiCbK, CcrMagneto, CcrSwift, CcrKarma and CcrRogue form a group related at the DNA level, while CcrColossus is more diverged but retains significant similarity at the protein level. 
Conclusions Due to their lack of any apparent relationship to other described phages, this group is proposed as the founding cohort of a new phage type, the phiCbK-like phages. This work will serve as a foundation for future studies on morphogenesis, infection and phage-host interactions in C. crescentus. PMID:23050599
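The idea that coverage discontinuities can point to genomic termini can be sketched in a few lines. The abstract does not describe how the coverage analysis was done, so everything below is an assumption for illustration: the premise that a terminally redundant region shows up as a stretch of roughly doubled read depth in the assembly, the 1.6x threshold, the minimum run length and the function name.

```python
from statistics import median


def elevated_coverage_region(coverage, factor=1.6, min_len=3):
    """Return (start, end) of the longest run with depth >= factor * median.

    The interval is half-open: positions start .. end - 1 are elevated.
    Returns None when no run of at least min_len positions is found.
    """
    depths = list(coverage)
    threshold = factor * median(depths)
    best = None
    run_start = None
    # A sentinel below any threshold closes a run that reaches the last position.
    for i, depth in enumerate(depths + [float("-inf")]):
        if depth >= threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            run_len = i - run_start
            if run_len >= min_len and (best is None or run_len > best[1] - best[0]):
                best = (run_start, i)
            run_start = None
    return best


# Toy per-position read depth: a 4-position stretch at about twice the baseline.
toy_coverage = [50] * 10 + [100] * 4 + [50] * 10
print(elevated_coverage_region(toy_coverage))
```

On real data the depth profile is noisy, so smoothing over windows and confirming candidate boundaries experimentally, as the authors did with terminal ligation, would still be needed.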
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chan, Chai Ling; Yew, Su Mei; Ngeow, Yun Fong
2015-11-18
Background: Daldinia eschscholtzii is a wood-inhabiting fungus that causes wood decay under certain conditions. It has a broad host range and produces a large repertoire of potentially bioactive compounds. However, there is no extensive genome analysis on this fungal species. Results: Two fungal isolates (UM 1400 and UM 1020) from human specimens were identified as Daldinia eschscholtzii by morphological features and ITS-based phylogenetic analysis. Both genomes were similar in size with 10,822 predicted genes in UM 1400 (35.8 Mb) and 11,120 predicted genes in UM 1020 (35.5 Mb). A total of 751 gene families were shared among both UM isolates, including gene families associated with fungus-host interactions. In the CAZyme comparative analysis, both genomes were found to contain arrays of CAZymes related to plant cell wall degradation. Genes encoding secreted peptidases involved in the degradation of structural proteins in the plant cell wall were found in the genomes. In addition, arrays of secondary metabolite backbone genes were identified in both genomes, indicating their potential to produce bioactive secondary metabolites. Both genomes also contained an abundance of genes encoding signaling components, with three proposed MAPK cascades involved in cell wall integrity, osmoregulation, and mating/filamentation. Besides genomic evidence for degrading capability, both isolates also harbored an array of genes encoding stress response proteins that are potentially significant for adaptation to living in hostile environments. In conclusion: Our genomic studies provide further information for the biological understanding of D. eschscholtzii and suggest that these wood-decaying fungi are also equipped for adaptation to adverse environments in the human host.
Fanning, T; Singer, M
1987-01-01
Recent work suggests that one or more members of the highly repeated LINE-1 (L1) DNA family found in all mammals may encode one or more proteins. Here we report the sequence of a portion of an L1 cloned from the domestic cat (Felis catus). These data permit comparison of the L1 sequences in four mammalian orders (Carnivore, Lagomorph, Rodent and Primate) and the comparison supports the suggested coding potential. In two separate, noncontiguous regions in the carboxy terminal half of the proteins predicted from the DNA sequences, there are several strongly conserved segments. In one region, these share homology with known or suspected reverse transcriptases, as described by others in rodents and primates. In the second region, closer to the carboxy terminus, the strongly conserved segments are over 90% homologous among the four orders. One of the latter segments is cysteine rich and resembles the putative metal binding domains of nucleic acid binding proteins, including those of TFIIIA and retroviruses. PMID:3562227
Seligmann, Hervé
2013-05-07
GenBank's EST database includes RNAs matching exactly human mitochondrial sequences assuming systematic asymmetric nucleotide exchange-transcription along exchange rules: A→G→C→U/T→A (12 ESTs), A→U/T→C→G→A (4 ESTs), C→G→U/T→C (3 ESTs), and A→C→G→U/T→A (1 EST), no RNAs correspond to other potential asymmetric exchange rules. Hypothetical polypeptides translated from nucleotide-exchanged human mitochondrial protein coding genes align with numerous GenBank proteins, predicted secondary structures resemble their putative GenBank homologue's. Two independent methods designed to detect overlapping genes (one based on nucleotide contents analyses in relation to replicative deamination gradients at third codon positions, and circular code analyses of codon contents based on frame redundancy), confirm nucleotide-exchange-encrypted overlapping genes. Methods converge on which genes are most probably active, and which not, and this for the various exchange rules. Mean EST lengths produced by different nucleotide exchanges are proportional to (a) extents that various bioinformatics analyses confirm the protein coding status of putative overlapping genes; (b) known kinetic chemistry parameters of the corresponding nucleotide substitutions by the human mitochondrial DNA polymerase gamma (nucleotide DNA misinsertion rates); (c) stop codon densities in predicted overlapping genes (stop codon readthrough and exchanging polymerization regulate gene expression by counterbalancing each other). Numerous rarely expressed proteins seem encoded within regular mitochondrial genes through asymmetric nucleotide exchange, avoiding lengthening genomes. Intersecting evidence between several independent approaches confirms the working hypothesis status of gene encryption by systematic nucleotide exchanges. Copyright © 2013 Elsevier Ltd. All rights reserved.
Naqvi, Ahmad Abu Turab; Shahbaaz, Mohd; Ahmad, Faizan; Hassan, Md. Imtaiyaz
2015-01-01
Syphilis is a globally occurring venereal disease, and its infection is propagated through sexual contact. The causative agent of syphilis, Treponema pallidum ssp. pallidum, a Gram-negative spirochaete, is an obligate human parasite. The genome of T. pallidum ssp. pallidum SS14 strain (RefSeq NC_010741.1) encodes 1,027 proteins, of which 444 proteins are known as hypothetical proteins (HPs), i.e., proteins of unknown function. Here, we performed functional annotation of HPs of T. pallidum ssp. pallidum using various databases, domain architecture predictors, protein function annotators and clustering tools. We have analyzed the sequences of 444 HPs of T. pallidum ssp. pallidum and subsequently predicted the function of 207 HPs with a high level of confidence. However, functions of 237 HPs are predicted with less accuracy. We found various enzymes, transporters, and binding proteins in the annotated group of HPs that may be possible molecular targets and may facilitate the survival of the pathogen. Our comprehensive analysis helps to understand the mechanism of pathogenesis and to provide many novel potential therapeutic interventions. PMID:25894582
Griffith, Megan E.; Mayer, Ulrike; Capron, Arnaud; Ngo, Quy A.; Surendrarao, Anandkumar; McClinton, Regina; Jürgens, Gerd; Sundaresan, Venkatesan
2007-01-01
Embryogenesis in Arabidopsis thaliana is marked by a predictable sequence of oriented cell divisions, which precede cell fate determination. We show that mutation of the TORMOZ (TOZ) gene yields embryos with aberrant cell division planes and arrested embryos that appear not to have established normal patterning. The defects in toz mutants differ from previously described mutations that affect embryonic cell division patterns. Longitudinal division planes of the proembryo are frequently replaced by transverse divisions and less frequently by oblique divisions, while divisions of the suspensor cells, which divide only transversely, appear generally unaffected. Expression patterns of selected embryo patterning genes are altered in the mutant embryos, implying that the positional cues required for their proper expression are perturbed by the misoriented divisions. The TOZ gene encodes a nucleolar protein containing WD repeats. Putative TOZ orthologs exist in other eukaryotes including Saccharomyces cerevisiae, where the protein is predicted to function in 18S rRNA biogenesis. We find that disruption of the Sp TOZ gene results in cell division defects in Schizosaccharomyces pombe. Previous studies in yeast and animal cells have identified nucleolar proteins that regulate the exit from M phase and cytokinesis, including factors involved in pre-rRNA processing. Our study suggests that in plant cells, nucleolar functions might interact with the processes of regulated cell divisions and influence the selection of longitudinal division planes during embryogenesis. PMID:17616738
Distribution and Evolution of Yersinia Leucine-Rich Repeat Proteins
Hu, Yueming; Huang, He; Hui, Xinjie; Cheng, Xi; White, Aaron P.
2016-01-01
Leucine-rich repeat (LRR) proteins are widely distributed in bacteria, playing important roles in various protein-protein interaction processes. In Yersinia, the well-characterized type III secreted effector YopM also belongs to the LRR protein family and is encoded by virulence plasmids. However, little has been known about other LRR members encoded by Yersinia genomes or their evolution. In this study, the Yersinia LRR proteins were comprehensively screened, categorized, and compared. The LRR proteins encoded by chromosomes (LRR1 proteins) appeared to be more similar to each other and different from those encoded by plasmids (LRR2 proteins) with regard to repeat-unit length, amino acid composition profile, and gene expression regulation circuits. LRR1 proteins were also different from LRR2 proteins in that the LRR1 proteins contained an E3 ligase domain (NEL domain) in the C-terminal region or an NEL domain-encoding nucleotide relic in flanking genomic sequences. The LRR1 protein-encoding genes (LRR1 genes) varied dramatically and were categorized into 4 subgroups (a to d), with the LRR1a to -c genes evolving from the same ancestor and LRR1d genes evolving from another ancestor. The consensus and ancestor repeat-unit sequences were inferred for different LRR1 protein subgroups by use of a maximum parsimony modeling strategy. Structural modeling disclosed very similar repeat-unit structures between LRR1 and LRR2 proteins despite the different unit lengths and amino acid compositions. Structural constraints may serve as the driving force to explain the observed mutations in the LRR regions. This study suggests that there may be functional variation and lays the foundation for future experiments investigating the functions of the chromosomally encoded LRR proteins of Yersinia. PMID:27217422
Combining Physicochemical and Evolutionary Information for Protein Contact Prediction
Schneider, Michael; Brock, Oliver
2014-01-01
We introduce a novel contact prediction method that achieves high prediction accuracy by combining evolutionary and physicochemical information about native contacts. We obtain evolutionary information from multiple-sequence alignments and physicochemical information from predicted ab initio protein structures. These structures represent low-energy states in an energy landscape and thus capture the physicochemical information encoded in the energy function. Such low-energy structures are likely to contain native contacts, even if their overall fold is not native. To differentiate native from non-native contacts in those structures, we develop a graph-based representation of the structural context of contacts. We then use this representation to train a support vector machine classifier to identify the most likely native contacts in otherwise non-native structures. The resulting contact predictions are highly accurate. As a result of combining two sources of information (evolutionary and physicochemical), we maintain prediction accuracy even when only a few sequence homologs are present. We show that the predicted contacts help to improve ab initio structure prediction. A web service is available at http://compbio.robotics.tu-berlin.de/epc-map/. PMID:25338092
Gene family encoding the major toxins of lethal Amanita mushrooms
Hallen, Heather E.; Luo, Hong; Scott-Craig, John S.; Walton, Jonathan D.
2007-01-01
Amatoxins, the lethal constituents of poisonous mushrooms in the genus Amanita, are bicyclic octapeptides. Two genes in A. bisporigera, AMA1 and PHA1, directly encode α-amanitin, an amatoxin, and the related bicyclic heptapeptide phallacidin, a phallotoxin, indicating that these compounds are synthesized on ribosomes and not by nonribosomal peptide synthetases. α-Amanitin and phallacidin are synthesized as proproteins of 35 and 34 amino acids, respectively, from which they are predicted to be cleaved by a prolyl oligopeptidase. AMA1 and PHA1 are present in other toxic species of Amanita section Phalloidae but are absent from nontoxic species in other sections. The genomes of A. bisporigera and A. phalloides contain multiple sequences related to AMA1 and PHA1. The predicted protein products of this family of genes are characterized by a hypervariable “toxin” region capable of encoding a wide variety of peptides of 7–10 amino acids flanked by conserved sequences. Our results suggest that these fungi have a broad capacity to synthesize cyclic peptides on ribosomes. PMID:18025465
Prediction of type III secretion signals in genomes of gram-negative bacteria.
Löwer, Martin; Schneider, Gisbert
2009-06-15
Pathogenic bacteria infecting animals as well as plants use various mechanisms to transport virulence factors across their cell membranes and channel these proteins into the infected host cell. The type III secretion system represents such a mechanism. Proteins transported via this pathway ("effector proteins") have to be distinguished from all other proteins that are not exported from the bacterial cell. Although a special targeting signal at the N-terminal end of effector proteins has been proposed in the literature, its exact characteristics remain unknown. In this study, we demonstrate that the signals encoded in the sequences of type III secretion system effectors can be consistently recognized and predicted by machine learning techniques. Known protein effectors were compiled from the literature and sequence databases, and served as training data for artificial neural networks and support vector machine classifiers. Common sequence features were most pronounced in the first 30 amino acids of the effector sequences. Classification accuracy yielded a cross-validated Matthews correlation of 0.63 and allowed for genome-wide prediction of potential type III secretion system effectors in 705 proteobacterial genomes (12% predicted candidate proteins), their chromosomes (11%) and plasmids (13%), as well as 213 Firmicute genomes (7%). We present a signal prediction method together with a comprehensive survey of potential type III secretion system effectors extracted from 918 published bacterial genomes. Our study demonstrates that the analyzed signal features are common across a wide range of species, and provides a substantial basis for the identification of exported pathogenic proteins as targets for future therapeutic intervention. The prediction software is publicly accessible from our web server (www.modlab.org).
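The Matthews correlation quoted in the abstract above is a standard summary of binary classification quality. The following is a minimal sketch of how it is computed from confusion-matrix counts; the function name and the example counts are illustrative assumptions and are not taken from the study.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    tp/tn/fp/fn are true positives, true negatives, false positives and
    false negatives. Returns 0.0 when the denominator is zero, which is a
    common convention for the undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts for an effector / non-effector classifier.
    print(round(matthews_corrcoef(tp=80, tn=85, fp=15, fn=20), 3))
```

The coefficient ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), which is why it is often preferred over plain accuracy when the two classes are unbalanced.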
Chen, Lei; Pospíšilová, Petra; Strouhal, Michal; Qin, Xiang; Mikalová, Lenka; Norris, Steven J.; Muzny, Donna M.; Gibbs, Richard A.; Fulton, Lucinda L.; Sodergren, Erica; Weinstock, George M.; Šmajs, David
2012-01-01
Background The yaws treponemes, Treponema pallidum ssp. pertenue (TPE) strains, are closely related to syphilis causing strains of Treponema pallidum ssp. pallidum (TPA). Both yaws and syphilis are distinguished on the basis of epidemiological characteristics, clinical symptoms, and several genetic signatures of the corresponding causative agents. Methodology/Principal Findings To precisely define genetic differences between TPA and TPE, high-quality whole genome sequences of three TPE strains (Samoa D, CDC-2, Gauthier) were determined using next-generation sequencing techniques. TPE genome sequences were compared to four genomes of TPA strains (Nichols, DAL-1, SS14, Chicago). The genome structure was identical in all three TPE strains with similar length ranging between 1,139,330 bp and 1,139,744 bp. No major genome rearrangements were found when compared to the four TPA genomes. The whole genome nucleotide divergence (dA) between TPA and TPE subspecies was 4.7 and 4.8 times higher than the observed nucleotide diversity (π) among TPA and TPE strains, respectively, corresponding to 99.8% identity between TPA and TPE genomes. A set of 97 (9.9%) TPE genes encoded proteins containing two or more amino acid replacements or other major sequence changes. The TPE divergent genes were mostly from the group encoding potential virulence factors and genes encoding proteins with unknown function. Conclusions/Significance Hypothetical genes, with genetic differences, consistently found between TPE and TPA strains are candidates for syphilitic treponemes virulence factors. Seventeen TPE genes were predicted under positive selection, and eleven of them coded either for predicted exported proteins or membrane proteins suggesting their possible association with the cell surface. Sequence changes between TPE and TPA strains and changes specific to individual strains represent suitable targets for subspecies- and strain-specific molecular diagnostics. PMID:22292095
Complete genomic sequence of a Tobacco rattle virus isolate from Michigan-grown potatoes.
Crosslin, James M; Hamm, Philip B; Kirk, William W; Hammond, Rosemarie W
2010-04-01
Tobacco rattle virus (TRV) causes stem mottle on potato leaves and necrotic arcs and rings in potato tubers, known as corky ringspot disease. Recently, TRV was reported in Michigan potato tubers cv. FL1879 exhibiting corky ringspot disease. Sequence analysis of the RNA-1-encoded 16-kDa gene of the Michigan isolate, designated MI-1, revealed homology to TRV isolates from Florida and Washington. Here, we report the complete genomic sequence of RNA-1 (6,791 nt) and RNA-2 (3,685 nt) of TRV MI-1. RNA-1 is predicted to contain four open reading frames, and the genome structure and phylogenetic analyses of the RNA-1 nucleotide sequence revealed significant homologies to the known sequences of other TRV-1 isolates. The relationships based on the full-length nucleotide sequence were different from those based on the 16-kDa gene encoded on genomic RNA-1 and reflect sequence variation within a 20-25-aa residue region of the 16-kDa protein. MI-1 RNA-2 is predicted to contain three ORFs, encoding the coat protein (CP), a 37.6-kDa protein (ORF 2b), and a 33.6-kDa protein (ORF 2c). In addition, it contains a region of similarity to the 3' terminus of RNA-1, including a truncated portion of the 16-kDa cistron. Phylogenetic analysis of RNA-2, based on a comparison of nucleotide sequences with other members of the genus Tobravirus, indicates that TRV MI-1 and other North American isolates cluster as a distinct group. TRV MI-1 is only the second North American isolate for which there is a complete sequence of the genome, and it is distinct from the North American isolate TRV ORY. The relationship of the TRV MI-1 isolate to other tobravirus isolates is discussed.
Luisi-DeLuca, C; Clark, A J; Kolodner, R D
1988-01-01
Exonuclease VIII (exoVIII) of Escherichia coli has been purified from a strain carrying a plasmid-encoded recE gene by using a new procedure. This procedure yielded 30 times more protein per gram of cells, and the protein had a twofold higher specific activity than the enzyme purified by the previously published procedure (J. W. Joseph and R. Kolodner, J. Biol. Chem. 258:10411-10417, 1983). The sequence of the 12 N-terminal amino acids was also obtained and found to correspond to one of the open reading frames predicted from the nucleic acid sequence of the recE region of Rac (C. Chu, A. Templin, and A. J. Clark, manuscript in preparation). Polyclonal antibodies directed against purified exoVIII were also prepared. Cell-free extracts prepared from strains containing a wide range of chromosomal- or plasmid-encoded point, insertion, and deletion mutations which result in expression of exoVIII were examined by Western blot (immunoblot) analysis. This analysis showed that two point sbcA mutations (sbcA5 and sbcA23) and the sbc insertion mutations led to the synthesis of the 140-kilodalton (kDa) polypeptide of wild-type exoVIII. Plasmid-encoded partial deletion mutations of recE reduced the size of the cross-reacting protein(s) in direct proportion to the size of the deletion, even though exonuclease activity was still present. The analysis suggests that 39 kDa of the 140-kDa exoVIII subunit is all that is essential for exonuclease activity. One of the truncated but functional exonucleases (the pRAC3 exonuclease) has been purified and confirmed to be a 41-kDa polypeptide. The first 18 amino acids from the N terminus of the 41-kDa pRAC3 exonuclease were sequenced and found to correspond to one of the translational start signals predicted from the nucleotide sequence of radC (Chu et al., in preparation). PMID:3056915
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhou, En -Min; Murugapiran, Senthil K.; Mefferd, Chrisabelle C.; ...
2016-02-27
Thermus amyloliquefaciens type strain YIM 77409 T is a thermophilic, Gram-negative, non-motile and rod-shaped bacterium isolated from Niujie Hot Spring in Eryuan County, Yunnan Province, southwest China. In the present study we describe the features of strain YIM 77409 T together with its genome sequence and annotation. The genome is 2,160,855 bp long and consists of 6 scaffolds with 67.4 % average GC content. A total of 2,313 genes were predicted, comprising 2,257 protein-coding and 56 RNA genes. The genome is predicted to encode a complete glycolysis, pentose phosphate pathway, and tricarboxylic acid cycle. Additionally, a large number of transporters and enzymes for heterotrophy highlight the broad heterotrophic lifestyle of this organism. Furthermore, a denitrification gene cluster included genes predicted to encode enzymes for the sequential reduction of nitrate to nitrous oxide, consistent with the incomplete denitrification phenotype of this strain.
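The average GC content reported for a multi-scaffold assembly such as the one above is usually a pooled, length-weighted figure. The sketch below shows one plausible way to compute it; the function names and the toy scaffolds are assumptions for illustration and do not reproduce the authors' pipeline.

```python
def gc_content(sequence):
    """Percent G+C among unambiguous bases (A, C, G, T) of one sequence."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def pooled_gc_content(scaffolds):
    """Length-weighted GC percentage across all scaffolds of an assembly.

    Pooling the bases first means long scaffolds contribute in proportion
    to their length, unlike a plain mean of per-scaffold percentages.
    """
    return gc_content("".join(scaffolds))


if __name__ == "__main__":
    # Toy scaffolds, not real genome data.
    toy_scaffolds = ["GGCCGGCCAT", "ATGC"]
    print(round(pooled_gc_content(toy_scaffolds), 1))
```

Ambiguous characters such as N are excluded from both numerator and denominator here; other tools may count them in the denominator, so reported values can differ slightly.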
Tian, Tian; Salis, Howard M.
2015-01-01
Natural and engineered genetic systems require the coordinated expression of proteins. In bacteria, translational coupling provides a genetically encoded mechanism to control expression level ratios within multi-cistronic operons. We have developed a sequence-to-function biophysical model of translational coupling to predict expression level ratios in natural operons and to design synthetic operons with desired expression level ratios. To quantitatively measure ribosome re-initiation rates, we designed and characterized 22 bi-cistronic operon variants with systematically modified intergenic distances and upstream translation rates. We then derived a thermodynamic free energy model to calculate de novo initiation rates as a result of ribosome-assisted unfolding of intergenic RNA structures. The complete biophysical model has only five free parameters, but was able to accurately predict downstream translation rates for 120 synthetic bi-cistronic and tri-cistronic operons with rationally designed intergenic regions and systematically increased upstream translation rates. The biophysical model also accurately predicted the translation rates of the nine protein atp operon, compared to ribosome profiling measurements. Altogether, the biophysical model quantitatively predicts how translational coupling controls protein expression levels in synthetic and natural bacterial operons, providing a deeper understanding of an important post-transcriptional regulatory mechanism and offering the ability to rationally engineer operons with desired behaviors. PMID:26117546
Ghouzam, Yassine; Postic, Guillaume; Guerin, Pierre-Edouard; de Brevern, Alexandre G.; Gelly, Jean-Christophe
2016-01-01
Protein structure prediction based on comparative modeling is the most efficient way to produce structural models when it can be performed. ORION is a dedicated webserver based on a new strategy that performs this task. The identification by ORION of suitable templates is performed using an original profile-profile approach that combines sequence and structure evolution information. Structure evolution information is encoded into profiles using structural features, such as solvent accessibility and local conformation —with Protein Blocks—, which give an accurate description of the local protein structure. ORION has recently been improved, increasing by 5% the quality of its results. The ORION web server accepts a single protein sequence as input and searches homologous protein structures within minutes. Various databases such as PDB, SCOP and HOMSTRAD can be mined to find an appropriate structural template. For the modeling step, a protein 3D structure can be directly obtained from the selected template by MODELLER and displayed with global and local quality model estimation measures. The sequence and the predicted structure of 4 examples from the CAMEO server and a recent CASP11 target from the ‘Hard’ category (T0818-D1) are shown as pertinent examples. Our web server is accessible at http://www.dsimb.inserm.fr/ORION/. PMID:27319297
Ghouzam, Yassine; Postic, Guillaume; Guerin, Pierre-Edouard; de Brevern, Alexandre G; Gelly, Jean-Christophe
2016-06-20
Protein structure prediction based on comparative modeling is the most efficient way to produce structural models when it can be performed. ORION is a dedicated webserver based on a new strategy that performs this task. The identification by ORION of suitable templates is performed using an original profile-profile approach that combines sequence and structure evolution information. Structure evolution information is encoded into profiles using structural features, such as solvent accessibility and local conformation -with Protein Blocks-, which give an accurate description of the local protein structure. ORION has recently been improved, increasing by 5% the quality of its results. The ORION web server accepts a single protein sequence as input and searches homologous protein structures within minutes. Various databases such as PDB, SCOP and HOMSTRAD can be mined to find an appropriate structural template. For the modeling step, a protein 3D structure can be directly obtained from the selected template by MODELLER and displayed with global and local quality model estimation measures. The sequence and the predicted structure of 4 examples from the CAMEO server and a recent CASP11 target from the 'Hard' category (T0818-D1) are shown as pertinent examples. Our web server is accessible at http://www.dsimb.inserm.fr/ORION/.
Santos, Leonardo N; Silva, Eduardo S; Santos, André S; De Sá, Pablo H; Ramos, Rommel T; Silva, Artur; Cooper, Philip J; Barreto, Maurício L; Loureiro, Sebastião; Pinheiro, Carina S; Alcantara-Neves, Neuza M; Pacheco, Luis G C
2016-07-01
Infection with helminthic parasites, including the soil-transmitted helminth Trichuris trichiura (human whipworm), has been shown to modulate host immune responses and, consequently, to have an impact on the development and manifestation of chronic human inflammatory diseases. De novo derivation of helminth proteomes from sequencing of transcriptomes will provide valuable data to aid identification of parasite proteins that could be evaluated as potential immunotherapeutic molecules in the near future. Herein, we characterized the transcriptome of the adult stage of the human whipworm T. trichiura, using next-generation sequencing technology and a de novo assembly strategy. Nearly 17.6 million high-quality clean reads were assembled into 6414 contiguous sequences, with an N50 of 1606 bp. In total, 5673 protein-encoding sequences were confidently identified in the T. trichiura adult worm transcriptome; of these, 1013 sequences represent potential newly discovered proteins for the species, most of which have orthologs already annotated in the related species T. suis. A number of transcripts representing probable novel non-coding transcripts for the species T. trichiura were also identified. Among the most abundant transcripts, we found sequences that code for proteins involved in lipid transport, such as vitellogenins, and several chitin-binding proteins. Through a cross-species expression analysis of gene orthologs shared by T. trichiura and the closely related parasites T. suis and T. muris it was possible to find twenty-six protein-encoding genes that are consistently highly expressed in the adult stages of the three helminth species. Additionally, twenty transcripts could be identified that code for proteins previously detected by mass spectrometry analysis of protein fractions of the whipworm somatic extract that present immunomodulatory activities. Five of these transcripts were amongst the most highly expressed protein-encoding sequences in the T. trichiura adult worm. Besides, orthologs of proteins demonstrated to have potent immunomodulatory properties in related parasitic helminths were also predicted from the T. trichiura de novo assembled transcriptome. Copyright © 2016. Published by Elsevier B.V.
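The N50 statistic cited in the abstract above summarizes assembly contiguity: it is the contig length at which contigs of that length or longer contain at least half of the assembled bases. A minimal sketch follows, assuming the common definition; the function name and the toy contig lengths are illustrative and are not the study's data.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; N50 is the length of the
    contig at which the running total first reaches half of the assembly.
    """
    if not contig_lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Toy lengths: total is 54, so the half-way point is 27 bases.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Because N50 depends only on the length distribution, it says nothing about correctness of the assembly, so it is usually read alongside other measures such as the number of identified protein-encoding sequences.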
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bajaj, R. Alexandra; Arbing, Mark A.; Shin, Annie
The structure of Msmeg_6760, a protein of unknown function, has been determined. Biochemical and bioinformatics analyses determined that Msmeg_6760 interacts with a protein encoded in the same operon, Msmeg_6762, and predicted that the operon is a toxin–antitoxin (TA) system. Structural comparison of Msmeg_6760 with proteins of known function suggests that Msmeg_6760 binds a hydrophobic ligand in a buried cavity lined by large hydrophobic residues. Access to this cavity could be controlled by a gate–latch mechanism. The function of the Msmeg_6760 toxin is unknown, but structure-based predictions revealed that Msmeg_6760 and Msmeg_6762 are homologous to Rv2034 and Rv2035, a predicted novel TA system involved in Mycobacterium tuberculosis latency during macrophage infection. The Msmeg_6760 toxin fold has not been previously described for bacterial toxins and its unique structural features suggest that toxin activation is likely to be mediated by a novel mechanism.
Yan, J; Cheng, Q; Li, C B; Aksoy, S
2001-02-01
Serine proteases are major insect gut enzymes involved in digestion of dietary proteins, and in addition they have been implicated in the process of pathogen establishment in several vector insects. The medically important vector, tsetse fly (Diptera: Glossinidae), is involved in the transmission of African trypanosomes, which cause devastating diseases in animals and humans. Both the male and female tsetse can transmit trypanosomes and both are strict bloodfeeders throughout all stages of their development. Here, we describe the characterization of two putative serine protease-encoding genes, Glossina serine protease-1 (Gsp1) and Glossina serine protease-2 (Gsp2) from gut tissue. Both putative cDNA products represent prepro peptides with hydrophobic signal peptide sequences associated with their 5'-end terminus. The Gsp1 cDNA encodes a putative mature protein of 245 amino acids with a molecular mass of 26 428 Da, while the predicted size of the 228 amino acid mature peptide encoded by Gsp2 cDNA is 24 573 Da. Both deduced peptides contain the Asp/His/Ser catalytic triad and the conserved residues surrounding it which are characteristic of serine proteases. In addition, both proteins have the six conserved cysteine residues to form the three cysteine bonds typically present in invertebrate serine proteases. Based on the presence of substrate-specific residues, the Gsp1 gene encodes a chymotrypsin-like protease while the Gsp2 gene encodes a protein with trypsin-like activity. Both proteins are encoded by few loci in the tsetse genome, being present in one or two copies only. The mRNA expression levels for the genes do not vary extensively throughout the digestive cycle, and high levels of mRNAs can be readily detected in the gut tissue of newly emerged flies. The levels of trypsin and chymotrypsin activities in the gut lumen increase following blood feeding and change significantly in the gut cells throughout the digestion cycle. Hence, the regulation of expression for trypsin and chymotrypsin occurs at the post-transcriptional level in tsetse. Both the coding sequences and patterns of expression of Gsp1 and Gsp2 genes are similar to the serine proteases that have been reported from the bloodfeeding insect Stomoxys calcitrans.
FastRNABindR: Fast and Accurate Prediction of Protein-RNA Interface Residues.
El-Manzalawy, Yasser; Abbas, Mostafa; Malluhi, Qutaibah; Honavar, Vasant
2016-01-01
A wide range of biological processes, including regulation of gene expression, protein synthesis, and replication and assembly of many viruses, are mediated by RNA-protein interactions. However, experimental determination of the structures of protein-RNA complexes is expensive and technically challenging. Hence, a number of computational tools have been developed for predicting protein-RNA interfaces. Some of the state-of-the-art protein-RNA interface predictors rely on position-specific scoring matrix (PSSM)-based encoding of the protein sequences. The computational effort needed to generate PSSMs severely limits the practical utility of protein-RNA interface prediction servers. In this work, we experiment with two approaches, random sampling and sequence similarity reduction, for extracting a representative reference database of protein sequences from more than 50 million protein sequences in UniRef100. Our results suggest that randomly sampled databases produce better PSSM profiles (in terms of the number of hits used to generate the profile and the distance of the generated profile to the corresponding profile generated using the entire UniRef100 data as well as the accuracy of the machine learning classifier trained using these profiles). Based on our results, we developed FastRNABindR, an improved version of RNABindR for predicting protein-RNA interface residues using PSSM profiles generated using 1% of the UniRef100 sequences sampled uniformly at random. To the best of our knowledge, FastRNABindR is the only protein-RNA interface residue prediction online server that requires generation of PSSM profiles for query sequences and accepts hundreds of protein sequences per submission. Our approach for determining the optimal BLAST database for a protein-RNA interface residue classification task has the potential of substantially speeding up, and hence increasing the practical utility of, other amino acid sequence based predictors of protein-protein and protein-DNA interfaces.
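The random-sampling step described in this abstract can be illustrated with a minimal sketch. It is not the FastRNABindR pipeline: the helper names `read_fasta` and `sample_records` are hypothetical, and the sketch assumes independent per-record (Bernoulli) selection, which yields approximately, not exactly, the requested fraction of sequences.

```python
import random
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without ">", concatenated sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def sample_records(records: Iterable[Record], fraction: float, seed: int = 0) -> List[Record]:
    """Keep each record independently with probability `fraction`.

    Hypothetical helper: a fixed seed makes the subsample reproducible.
    """
    rng = random.Random(seed)
    return [rec for rec in records if rng.random() < fraction]


if __name__ == "__main__":
    demo = [">seq1", "MKVLAA", "GGS", ">seq2", "MSTNPK", ">seq3", "MAHHHH"]
    subset = sample_records(read_fasta(demo), fraction=0.01, seed=42)
    print(f"kept {len(subset)} of 3 records")
```

Because records are streamed one at a time, the same pattern would apply to a file too large to hold in memory; building a BLAST database from the subsample is outside the scope of this sketch.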
Music, Nedzad; Gagnon, Carl A
2010-12-01
Porcine reproductive and respiratory syndrome (PRRS) is an economically devastating viral disease affecting the swine industry worldwide. The etiological agent, PRRS virus (PRRSV), possesses an RNA viral genome with nine open reading frames (ORFs). The ORF1a and ORF1b replicase-associated genes encode the polyproteins pp1a and pp1ab, respectively. The pp1a is processed into nine non-structural proteins (nsps): nsp1α, nsp1β, and nsp2 to nsp8. Proteolytic cleavage of pp1ab generates products nsp9 to nsp12. The proteolytic pp1a cleavage products process and cleave pp1a and pp1ab into nsp products. The nsp9 to nsp12 are involved in virus genome transcription and replication. The 3' end of the viral genome encodes four minor and three major structural proteins. The GP₂ₐ, GP₃ and GP₄ (encoded by ORF2a, 3 and 4) are glycosylated membrane-associated minor structural proteins. The fourth minor structural protein, the E protein (encoded by ORF2b), is an unglycosylated membrane-associated protein. The viral envelope contains two major structural proteins: a glycosylated major envelope protein GP₅ (encoded by ORF5) and an unglycosylated membrane M protein (encoded by ORF6). The third major structural protein is the nucleocapsid N protein (encoded by ORF7). All PRRSV non-structural and structural proteins are essential for virus replication, and PRRSV infectivity is relatively intolerant to subtle changes within the structural proteins. PRRSV virulence is multigenic and resides in both the non-structural and structural viral proteins. This review discusses the molecular characteristics, biological and immunological functions of the PRRSV structural proteins and nsps and their involvement in the virus pathogenesis.
Borca, Manuel V; O'Donnell, Vivian; Holinka, Lauren G; Rai, Devendra K; Sanford, Brenton; Alfano, Marialexia; Carlson, Jolene; Azzinaro, Paul A; Alonso, Covadonga; Gladue, Douglas P
2016-09-02
African swine fever virus (ASFV) is the etiological agent of a contagious and often lethal disease of domestic pigs that has significant economic consequences for the swine industry. The viral genome encodes more than 150 genes, and only a select few of these genes have been studied in some detail. Here we report the characterization of open reading frame Ep152R, which has a predicted complement control module/SCR domain. This domain is found in Vaccinia virus proteins that are involved in blocking the immune response during viral infection. A recombinant ASFV harboring an HA-tagged version of the Ep152R protein was developed (ASFV-G-Ep152R-HA) and used to demonstrate that Ep152R is an early virus protein. Attempts to construct recombinant viruses having a deleted Ep152R gene were consistently unsuccessful, indicating that Ep152R is an essential gene. Interestingly, analysis of host-protein interactions for Ep152R using a yeast two-hybrid screen identified BAG6, a protein previously identified as being required for ASFV replication. Furthermore, fluorescence microscopy analysis confirms that the Ep152R-BAG6 interaction actually occurs in cells infected with ASFV. Published by Elsevier B.V.
Extensive Use of RNA-Binding Proteins in Drosophila Sensory Neuron Dendrite Morphogenesis
Olesnicky, Eugenia C.; Killian, Darrell J.; Garcia, Evelyn; Morton, Mary C.; Rathjen, Alan R.; Sola, Ismail E.; Gavis, Elizabeth R.
2013-01-01
The large number of RNA-binding proteins and translation factors encoded in the Drosophila and other metazoan genomes predicts widespread use of post-transcriptional regulation in cellular and developmental processes. Previous studies identified roles for several RNA-binding proteins in dendrite branching morphogenesis of Drosophila larval sensory neurons. To determine the larger contribution of post-transcriptional gene regulation to neuronal morphogenesis, we conducted an RNA interference screen to identify additional Drosophila proteins annotated as either RNA-binding proteins or translation factors that function in producing the complex dendritic trees of larval class IV dendritic arborization neurons. We identified 88 genes encoding such proteins whose knockdown resulted in aberrant dendritic morphology, including alterations in dendritic branch number, branch length, field size, and patterning of the dendritic tree. In particular, splicing and translation initiation factors were associated with distinct and characteristic phenotypes, suggesting that different morphogenetic events are best controlled at specific steps in post-transcriptional messenger RNA metabolism. Many of the factors identified in the screen have been implicated in controlling the subcellular distributions and translation of maternal messenger RNAs; thus, common post-transcriptional regulatory strategies may be used in neurogenesis and in the generation of asymmetry in the female germline and embryo. PMID:24347626
Zhao, Suwen; Sakai, Ayano; Zhang, Xinshuai; ...
2014-06-30
Metabolic pathways in eubacteria and archaea often are encoded by operons and/or gene clusters (genome neighborhoods) that provide important clues for assignment of both enzyme functions and metabolic pathways. We describe a bioinformatic approach (genome neighborhood network; GNN) that enables large scale prediction of the in vitro enzymatic activities and in vivo physiological functions (metabolic pathways) of uncharacterized enzymes in protein families. We demonstrate the utility of the GNN approach by predicting in vitro activities and in vivo functions in the proline racemase superfamily (PRS; InterPro IPR008794). The predictions were verified by measuring in vitro activities for 51 proteins in 12 families in the PRS that represent ~85% of the sequences; in vitro activities of pathway enzymes, carbon/nitrogen source phenotypes, and/or transcriptomic studies confirmed the predicted pathways. The synergistic use of sequence similarity networks and GNNs will facilitate the discovery of the components of novel, uncharacterized metabolic pathways in sequenced genomes.
McLaughlin, Margaret; Lockhart, Ben; Jordan, Ramon; Denton, Geoff; Mollov, Dimitre
2017-05-01
Clematis chlorotic mottle virus (ClCMV) is a previously undescribed virus associated with symptoms of yellow mottling and veining, chlorotic ring spots, line pattern mosaics, and flower distortion and discoloration on ornamental Clematis. The ClCMV genome is 3,880 nt in length with five open reading frames (ORFs) encoding a 27-kDa protein (ORF 1), an 87-kDa replicase protein (ORF 2), two centrally located movement proteins (ORF 3 and 4), and a 37-kDa capsid protein (ORF 5). Based on morphological, genomic, and phylogenetic analysis, ClCMV is predicted to be a member of the genus Pelarspovirus in the family Tombusviridae.
Seki, N; Muramatsu, M; Sugano, S; Suzuki, Y; Nakagawara, A; Ohhira, M; Hayashi, A; Hori, T; Saito, T
1998-01-01
Huntington disease (HD) is an inherited neurodegenerative disorder which is associated with CAG expansion in the coding region of the gene for huntingtin protein. Recently, a huntingtin interacting protein, HIP1, was isolated by the yeast two-hybrid system. Here we report the isolation of a cDNA clone for HIP1R (huntingtin interacting protein-1 related), which encodes a predicted protein product sharing a striking homology with HIP1. RT-PCR analysis showed that the messenger RNA was ubiquitously expressed in various human tissues. Based on PCR-assisted analysis of a radiation hybrid panel and fluorescence in situ hybridization, HIP1R was localized to the q24 region of chromosome 12.
Protein profiles associated with survival in lung adenocarcinoma
Chen, Guoan; Gharib, Tarek G; Wang, Hong; Huang, Chiang-Ching; Kuick, Rork; Thomas, Dafydd G.; Shedden, Kerby A.; Misek, David E.; Taylor, Jeremy M. G.; Giordano, Thomas J.; Kardia, Sharon L. R.; Iannettoni, Mark D.; Yee, John; Hogg, Philip J.; Orringer, Mark B.; Hanash, Samir M.; Beer, David G.
2003-01-01
Morphologic assessment of lung tumors is informative but insufficient to adequately predict patient outcome. We previously identified transcriptional profiles that predict patient survival, and here we identify proteins associated with patient survival in lung adenocarcinoma. A total of 682 individual protein spots were quantified in 90 lung adenocarcinomas by using quantitative two-dimensional polyacrylamide gel electrophoresis analysis. A leave-one-out cross-validation procedure using the top 20 survival-associated proteins identified by Cox modeling indicated that protein profiles as a whole can predict survival in stage I tumor patients (P = 0.01). Thirty-three of 46 survival-associated proteins were identified by using mass spectrometry. Expression of 12 candidate proteins was confirmed as tumor-derived with immunohistochemical analysis and tissue microarrays. Oligonucleotide microarray results from both the same tumors and from an independent study showed mRNAs associated with survival for 11 of 27 encoded genes. Combined analysis of protein and mRNA data revealed 11 components of the glycolysis pathway as associated with poor survival. Among these candidates, phosphoglycerate kinase 1 was associated with survival in the protein study, in both mRNA studies and in an independent validation set of 117 adenocarcinomas and squamous lung tumors using tissue microarrays. Elevated levels of phosphoglycerate kinase 1 in the serum were also significantly correlated with poor outcome in a validation set of 107 patients with lung adenocarcinomas using ELISA analysis. These studies identify new prognostic biomarkers and indicate that protein expression profiles can predict the outcome of patients with early-stage lung cancer. PMID:14573703
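The leave-one-out cross-validation procedure mentioned in this abstract follows a general pattern that can be sketched briefly. This is not the authors' analysis: the study used Cox modeling on the top 20 survival-associated proteins, whereas the stand-in model below is a hypothetical one-feature threshold classifier chosen only to keep the example self-contained. The point it illustrates is that the model is refit inside every fold, so the held-out sample never influences its own prediction.

```python
from typing import Callable, List, Sequence, TypeVar

M = TypeVar("M")  # fitted-model type


def leave_one_out(
    xs: Sequence[float],
    ys: Sequence[int],
    fit: Callable[[List[float], List[int]], M],
    predict: Callable[[M, float], int],
) -> List[int]:
    """Return one held-out prediction per sample."""
    predictions = []
    for i in range(len(xs)):
        # Train on every sample except the i-th one.
        train_x = list(xs[:i]) + list(xs[i + 1:])
        train_y = list(ys[:i]) + list(ys[i + 1:])
        model = fit(train_x, train_y)
        predictions.append(predict(model, xs[i]))
    return predictions


def fit_threshold(train_x: List[float], train_y: List[int]) -> float:
    """Stand-in model (assumption): midpoint between the two class means."""
    low = [x for x, y in zip(train_x, train_y) if y == 0]
    high = [x for x, y in zip(train_x, train_y) if y == 1]
    return (sum(low) / len(low) + sum(high) / len(high)) / 2


def predict_threshold(threshold: float, x: float) -> int:
    return 1 if x > threshold else 0


if __name__ == "__main__":
    # Toy data: one made-up measurement per sample and a binary label.
    xs = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
    ys = [0, 0, 0, 1, 1, 1]
    preds = leave_one_out(xs, ys, fit_threshold, predict_threshold)
    accuracy = sum(p == y for p, y in zip(preds, ys)) / len(ys)
    print(preds, accuracy)
```

In a setting like the one described, any feature selection (for example, ranking proteins by association with survival) would also belong inside `fit`, so that it is repeated on each training fold.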
Jiang, W; Woitach, J T; Gupta, D; Bhavanandan, V P
1998-10-20
Secreted epithelial mucins are extremely large and heterogeneous glycoproteins. We report the 5 kilobase DNA sequence of a second gene, BSM2, which encodes bovine submaxillary mucin. The determined nucleotide and deduced amino acid sequences of BSM2 are 95.2% and 92.2% identical, respectively, to those of the previously described BSM1 gene isolated from the same cow. Further, the five predicted protein domains of the two genes are 100%, 94%, 93%, 77%, and 88% identical. Based on the above results, we propose that expression of multiple homologous core proteins from a single animal is a factor in generating diversity of saccharides in mucins and in providing resistance of the molecules to proteolysis. In addition, this work raises several important issues in mucin cloning such as assembling sequences from seemingly overlapping clones and deducing consensus sequences for nearly identical tandem repeats. Copyright 1998 Academic Press.
Genome of the opportunistic pathogen Streptococcus sanguinis.
Xu, Ping; Alves, Joao M; Kitten, Todd; Brown, Arunsri; Chen, Zhenming; Ozaki, Luiz S; Manque, Patricio; Ge, Xiuchun; Serrano, Myrna G; Puiu, Daniela; Hendricks, Stephanie; Wang, Yingping; Chaplin, Michael D; Akan, Doruk; Paik, Sehmi; Peterson, Darrell L; Macrina, Francis L; Buck, Gregory A
2007-04-01
The genome of Streptococcus sanguinis is a circular DNA molecule consisting of 2,388,435 bp and is 177 to 590 kb larger than the other 21 streptococcal genomes that have been sequenced. The G+C content of the S. sanguinis genome is 43.4%, which is considerably higher than the G+C contents of other streptococci. The genome encodes 2,274 predicted proteins, 61 tRNAs, and four rRNA operons. A 70-kb region encoding pathways for vitamin B(12) biosynthesis and degradation of ethanolamine and propanediol was apparently acquired by horizontal gene transfer. The gene complement suggests new hypotheses for the pathogenesis and virulence of S. sanguinis and differs from the gene complements of other pathogenic and nonpathogenic streptococci. In particular, S. sanguinis possesses a remarkable abundance of putative surface proteins, which may permit it to be a primary colonizer of the oral cavity and agent of streptococcal endocarditis and infection in neutropenic patients.
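The G+C content quoted in this abstract (43.4%) is the percentage of G and C bases among all called bases. A minimal sketch of that calculation follows; the function name `gc_content` is hypothetical, the demo sequences are made up, and the sketch assumes that ambiguous symbols such as N are excluded from the denominator, which is one common convention and not necessarily the one used for the published figure.

```python
def gc_content(sequence: str) -> float:
    """Return the G+C percentage of a DNA sequence.

    Only unambiguous bases (A, C, G, T) are counted in the denominator
    (an assumption of this sketch).
    """
    seq = sequence.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / called


if __name__ == "__main__":
    print(gc_content("GGCCAATT"))  # half of the bases are G or C
    print(gc_content("ATGCNN"))    # the two N symbols are ignored
```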
Xu, Aishi; Li, Guang; Yang, Dong; Wu, Songfeng; Ouyang, Hongsheng; Xu, Ping; He, Fuchu
2015-12-04
Although the "missing protein" is a temporary concept in C-HPP, the biological reasons for these proteins being "missing" could be an important clue in evolutionary studies. Here we classified missing-protein-encoding genes into two groups, the genes encoding PE2 proteins (with transcript evidence) and the genes encoding PE3/4 proteins (with no transcript evidence). These missing-protein-encoding genes are distributed unevenly among different chromosomes, chromosomal regions, or gene clusters. In terms of evolutionary features, PE3/4 genes tend to be young, located in nonhomologous chromosomal regions, and evolving at higher rates. Interestingly, there is a higher proportion of singletons in PE3/4 genes than the proportion of singletons in all genes (background) and OTCSGs (organ, tissue, cell type-specific genes). More importantly, most of the paralogous PE3/4 genes belong to the newly duplicated members of the paralogous gene groups, which mainly contribute to special biological functions, such as "smell perception". These functions are largely restricted to specific cell types, tissues, or developmental stages, acting as the new functional requirements that facilitated the emergence of the missing-protein-encoding genes during evolution. In addition, criteria for proteins with extreme physicochemical properties were first set up based on the properties of PE2 proteins, and the evolutionary characteristics of those proteins were explored. Overall, the evolutionary analyses of missing-protein-encoding genes are expected to be highly instructive for proteomics and functional studies in the future.
Kinnear, Ekaterina; Caproni, Lisa J; Tregoning, John S
2015-01-01
DNA vaccines can be manufactured cheaply, easily and rapidly and have performed well in pre-clinical animal studies. However, clinical trials have so far been disappointing, failing to evoke a strong immune response, possibly due to poor antigen expression. To improve antigen expression, improved technology to monitor DNA vaccine transfection efficiency is required. In the current study, we compared plasmid-encoded tdTomato, mCherry, Katushka, tdKatushka2 and luciferase as reporter proteins for whole animal in vivo imaging. The intramuscular, subcutaneous and tattooing routes were compared and electroporation was used to enhance expression. We observed that overall, fluorescent proteins were not a good tool to assess expression from DNA plasmids, with a highly heterogeneous response between animals. Of the proteins used, intramuscular delivery of DNA encoding either tdTomato or luciferase gave the clearest signal, with some Katushka and tdKatushka2 signal observed. Signal following subcutaneous delivery was weakly visible, and nothing was observed following DNA tattooing. DNA encoding haemagglutinin was used to determine whether immune responses mirrored visible expression levels. A protective immune response against H1N1 influenza was induced by all routes, even after a single dose of DNA, though qualitative differences were observed, with tattooing leading to high antibody responses and subcutaneous DNA leading to high CD8 responses. We conclude that of the reporter proteins used, expression from DNA plasmids can best be assessed using tdTomato or luciferase. However, the disconnect between visible expression level and immunogenicity suggests that in vivo whole animal imaging of fluorescent proteins has limited utility for predicting DNA vaccine efficacy.
Naqvi, Ahmad Abu Turab; Ahmad, Faizan; Hassan, Md Imtaiyaz
2015-01-01
Mycobacterium leprae is an intracellular obligate parasite that causes leprosy in humans, and it leads to the destruction of peripheral nerves and skin deformation. Here, we report an extensive analysis of the hypothetical proteins (HPs) from M. leprae strain Br4923, assigning their functions to better understand the mechanism of pathogenesis and to search for potential therapeutic interventions. The genome of M. leprae encodes 1604 proteins, of which the functions of 632 are not known (HPs). In this paper, we predicted the probable functions of 312 HPs. First, we classified all HPs into families and subfamilies on the basis of sequence similarity, followed by domain assignment, which provides many clues for their possible function. However, the functions of 320 proteins were not predicted because of low sequence similarity with proteins of known function. Annotated HPs were categorized into enzymes, binding proteins, transporters, and proteins involved in cellular processes. We found several novel proteins whose functions were unknown for M. leprae. These proteins have a requisite association with bacterial virulence and pathogenicity. Finally, our sequence-based analysis will be helpful for further validation and the search for potential drug targets while developing effective drugs to cure leprosy.
A Plastidial Lysophosphatidic Acid Acyltransferase from Oilseed Rape
Bourgis, Fabienne; Kader, Jean-Claude; Barret, Pierre; Renard, Michel; Robinson, David; Robinson, Colin; Delseny, Michel; Roscoe, Thomas J.
1999-01-01
The biosynthesis of phosphatidic acid, a key intermediate in the biosynthesis of lipids, is controlled by lysophosphatidic acid (LPA, or 1-acyl-glycerol-3-P) acyltransferase (LPAAT, EC 2.3.1.51). We have isolated a cDNA encoding a novel LPAAT by functional complementation of the Escherichia coli mutant plsC with an immature embryo cDNA library of oilseed rape (Brassica napus). Transformation of the acyltransferase-deficient E. coli strain JC201 with the cDNA sequence BAT2 alleviated the temperature-sensitive phenotype of the plsC mutant and conferred a palmitoyl-coenzyme A-preferring acyltransferase activity to membrane fractions. The BAT2 cDNA encoded a protein of 351 amino acids with a predicted molecular mass of 38 kD and an isoelectric point of 9.7. Chloroplast-import experiments showed processing of a BAT2 precursor protein to a mature protein of approximately 32 kD, which was localized in the membrane fraction. BAT2 is encoded by a minimum of two genes that may be expressed ubiquitously. These data are consistent with the identity of BAT2 as the plastidial enzyme of the prokaryotic glycerol-3-P pathway that uses a palmitoyl-ACP to produce phosphatidic acid with a prokaryotic-type acyl composition. The homologies between the deduced protein sequence of BAT2 and prokaryotic and eukaryotic microsomal LPA acyltransferases suggest that seed microsomal forms may have evolved from the plastidial enzyme. PMID:10398728
Molecular cloning and expression of the CRISP family of proteins in the boar.
Vadnais, Melissa L; Foster, Douglas N; Roberts, Kenneth P
2008-12-01
The family of mammalian cysteine-rich secretory proteins (CRISP) has been well characterized in the rat, mouse, and human. Here we report the molecular cloning and expression analysis of CRISP1, CRISP2, and CRISP3 in the boar. A partial sequence published in the National Center for Biotechnology Information (NCBI) database was used to derive the full-length sequences for CRISP1 and CRISP2 using rapid amplification of cDNA ends. RT-PCR confirmed the expression of these mRNAs in the boar reproductive tract, and real-time RT-PCR showed CRISP1 to be highly expressed throughout the epididymis, with CRISP2 highly expressed in the testis. A search of the porcine genomic sequence in the NCBI database identified a BAC (CH242-199E6) encoding the CRISP1 gene. This BAC is derived from porcine Chromosome 7 and is syntenic with the regions of the mouse, rat, and human genomes encoding the CRISP gene family. This BAC was found to encode a third CRISP protein with a predicted amino acid sequence of high similarity to human CRISP3. Using RT-PCR we show that CRISP3 expression in the boar reproductive tract is confined to the prostate. Recombinant porcine (rp) CRISP2 protein was produced and purified. When incubated with capacitated boar sperm, rpCRISP2 induced an acrosome reaction, consistent with its demonstrated ability to alter the activity of calcium channels.
Kariithi, Henry M.; Ince, Ikbal A.; Boeren, Sjef; Abd-Alla, Adly M. M.; Parker, Andrew G.; Aksoy, Serap; Vlak, Just M.; van Oers, Monique M.
2011-01-01
Background The competence of the tsetse fly Glossina pallidipes (Diptera: Glossinidae) to acquire salivary gland hypertrophy virus (SGHV), to support virus replication and successfully transmit the virus depends on complex interactions between Glossina and SGHV macromolecules. Critical requisites to SGHV transmission are its replication and secretion of mature virions into the fly's salivary gland (SG) lumen. However, secretion of host proteins is of equal importance for successful transmission and requires cataloging of G. pallidipes secretome proteins from hypertrophied and non-hypertrophied SGs. Methodology/Principal Findings After electrophoretic profiling and in-gel trypsin digestion, saliva proteins were analyzed by nano-LC-MS/MS. MaxQuant/Andromeda search of the MS data against the non-redundant (nr) GenBank database and a G. morsitans morsitans SG EST database yielded a total of 521 hits, 31 of which were SGHV-encoded. At a false discovery rate limit of 1% and a detection threshold of at least 2 unique peptides per protein, the analysis resulted in 292 Glossina and 25 SGHV MS-supported proteins. When annotated by the Blast2GO suite, at least one gene ontology (GO) term could be assigned to 89.9% (285/317) of the detected proteins. Five (∼1.8%) Glossina and three (∼12%) SGHV proteins remained without a predicted function after blast searches against the nr database. Sixty-five of the 292 detected Glossina proteins contained an N-terminal signal/secretion peptide sequence. Eight of the SGHV proteins were predicted to be non-structural (NS), and fourteen are known structural (VP) proteins. Conclusions/Significance SGHV alters the protein expression pattern in Glossina. The G. pallidipes SG secretome encompasses a spectrum of proteins that may be required during the SGHV infection cycle. These detected proteins have putative interactions with at least 21 of the 25 SGHV-encoded proteins. Our findings open avenues for developing novel SGHV mitigation strategies to block SGHV infections in tsetse production facilities, such as using SGHV-specific antibodies and phage display-selected gut epithelia-binding peptides. PMID:22132244
Positive selection on human gamete-recognition genes
Stover, Daryn A.; Guerra, Vanessa; Mozaffari, Sahar V.; Ober, Carole; Mugal, Carina F.; Kaj, Ingemar
2018-01-01
Coevolution of genes that encode interacting proteins expressed on the surfaces of sperm and eggs can lead to variation in reproductive compatibility between mates and reproductive isolation between members of different species. Previous studies in mice and other mammals have focused in particular on evidence for positive or diversifying selection that shapes the evolution of genes that encode sperm-binding proteins expressed in the egg coat or zona pellucida (ZP). By fitting phylogenetic models of codon evolution to data from the 1000 Genomes Project, we identified candidate sites evolving under diversifying selection in the human genes ZP3 and ZP2. We also identified one candidate site under positive selection in C4BPA, which encodes a repetitive protein similar to the mouse protein ZP3R that is expressed in the sperm head and binds to the ZP at fertilization. Results from several additional analyses that applied population genetic models to the same data were consistent with the hypothesis of selection on those candidate sites leading to coevolution of sperm- and egg-expressed genes. By contrast, we found no candidate sites under selection in a fourth gene (ZP1) that encodes an egg coat structural protein not directly involved in sperm binding. Finally, we found that two of the candidate sites (in C4BPA and ZP2) were correlated with variation in family size and birth rate among Hutterite couples, and those two candidate sites were also in linkage disequilibrium in the same Hutterite study population. All of these lines of evidence are consistent with predictions from a previously proposed hypothesis of balancing selection on epistatic interactions between C4BPA and ZP3 at fertilization that lead to the evolution of co-adapted allele pairs. Such patterns also suggest specific molecular traits that may be associated with both natural reproductive variation and clinical infertility. PMID:29340252
Davidsson, Sabina; Carlsson, Jessica; Mölling, Paula; Gashi, Natyra; Andrén, Ove; Andersson, Swen-Olof; Brzuszkiewicz, Elzbieta; Poehlein, Anja; Al-Zeer, Munir A.; Brinkmann, Volker; Scavenius, Carsten; Nazipi, Seven; Söderquist, Bo; Brüggemann, Holger
2017-01-01
Inflammation is one of the hallmarks of prostate cancer. The origin of inflammation is unknown, but microbial infections are suspected to play a role. In previous studies, the Gram-positive, low-virulence bacterium Cutibacterium (formerly Propionibacterium) acnes was frequently isolated from prostatic tissue. It is unclear if the presence of the bacterium represents a true infection or a contamination. Here we investigated Cutibacterium acnes type II, also called subspecies defendens, which is the most prevalent type among prostatic C. acnes isolates. Genome sequencing of type II isolates identified large plasmids in several genomes. The plasmids are highly similar to previously identified linear plasmids of type I C. acnes strains associated with acne vulgaris. A PCR-based analysis revealed that 28.4% (21 out of 74) of all type II strains isolated from cancerous prostates carry a plasmid. The plasmid shows signatures for conjugative transfer. In addition, it contains a gene locus for tight adherence (tad) that is predicted to encode adhesive Flp (fimbrial low-molecular weight protein) pili. In subsequent experiments a tad locus-encoded putative pilin subunit was identified in the surface-exposed protein fraction of plasmid-positive C. acnes type II strains by mass spectrometry, indicating that the tad locus is functional. Additional plasmid-encoded proteins were detected in the secreted protein fraction, including two signal peptide-harboring proteins; the corresponding genes are specific for type II C. acnes and thus absent from plasmid-positive type I C. acnes strains. Further support for the presence of Flp pili in C. acnes type II was provided by electron microscopy, revealing cell appendages in tad locus-positive strains. Our study provides new insight into the most prevalent prostatic subspecies of C. acnes, subsp. defendens, and indicates the existence of Flp pili in plasmid-positive strains. Such pili may support colonization and persistent infection of human prostates by C. acnes. PMID:29201018
De Coi, Niccolò; Feuermann, Marc; Schmid-Siegert, Emanuel; Băguţ, Elena-Tatiana; Mignon, Bernard; Waridel, Patrice; Peter, Corinne; Pradervand, Sylvain
2016-01-01
ABSTRACT Dermatophytes are the most common agents of superficial mycoses in humans and animals. The aim of the present investigation was to systematically identify the extracellular, possibly secreted, proteins that are putative virulence factors and antigenic molecules of dermatophytes. A complete gene expression profile of Arthroderma benhamiae was obtained during infection of its natural host (guinea pig) using RNA sequencing (RNA-seq) technology. This profile was completed with those of the fungus cultivated in vitro in two media containing either keratin or soy meal protein as the sole source of nitrogen and in Sabouraud medium. More than 60% of transcripts deduced from RNA-seq data differ from those previously deposited for A. benhamiae. Using these RNA-seq data along with an automatic gene annotation procedure, followed by manual curation, we produced a new annotation of the A. benhamiae genome. This annotation comprised 7,405 coding sequences (CDSs), among which only 2,662 were identical to the currently available annotation, 383 were newly identified, and 15 secreted proteins were manually corrected. The expression profile of genes encoding proteins with a signal peptide in infected guinea pigs was found to be very different from that during in vitro growth when using keratin as the substrate. Especially, the sets of the 12 most highly expressed genes encoding proteases with a signal sequence had only the putative vacuolar aspartic protease gene PEP2 in common, during infection and in keratin medium. The most upregulated gene encoding a secreted protease during infection was that encoding subtilisin SUB6, which is a known major allergen in the related dermatophyte Trichophyton rubrum. IMPORTANCE Dermatophytoses (ringworm, jock itch, athlete’s foot, and nail infections) are the most common fungal infections, but their virulence mechanisms are poorly understood. 
Combining transcriptomic data obtained from growth under various culture conditions with data obtained during infection led to a significantly improved genome annotation. About 65% of the protein-encoding genes predicted with our protocol did not match the existing annotation for A. benhamiae. Comparing gene expression during infection on guinea pigs with keratin degradation in vitro, which is supposed to mimic the host environment, revealed the critical importance of using real in vivo conditions for investigating virulence mechanisms. The analysis of genes expressed in vivo, encoding cell surface and secreted proteins, particularly proteases, led to the identification of new allergen and virulence factor candidates. PMID:27822542
Tran, Van Du T; De Coi, Niccolò; Feuermann, Marc; Schmid-Siegert, Emanuel; Băguţ, Elena-Tatiana; Mignon, Bernard; Waridel, Patrice; Peter, Corinne; Pradervand, Sylvain; Pagni, Marco; Monod, Michel
2016-01-01
Dermatophytes are the most common agents of superficial mycoses in humans and animals. The aim of the present investigation was to systematically identify the extracellular, possibly secreted, proteins that are putative virulence factors and antigenic molecules of dermatophytes. A complete gene expression profile of Arthroderma benhamiae was obtained during infection of its natural host (guinea pig) using RNA sequencing (RNA-seq) technology. This profile was completed with those of the fungus cultivated in vitro in two media containing either keratin or soy meal protein as the sole source of nitrogen and in Sabouraud medium. More than 60% of transcripts deduced from RNA-seq data differ from those previously deposited for A. benhamiae. Using these RNA-seq data along with an automatic gene annotation procedure, followed by manual curation, we produced a new annotation of the A. benhamiae genome. This annotation comprised 7,405 coding sequences (CDSs), among which only 2,662 were identical to the currently available annotation, 383 were newly identified, and 15 secreted proteins were manually corrected. The expression profile of genes encoding proteins with a signal peptide in infected guinea pigs was found to be very different from that during in vitro growth when using keratin as the substrate. Especially, the sets of the 12 most highly expressed genes encoding proteases with a signal sequence had only the putative vacuolar aspartic protease gene PEP2 in common, during infection and in keratin medium. The most upregulated gene encoding a secreted protease during infection was that encoding subtilisin SUB6, which is a known major allergen in the related dermatophyte Trichophyton rubrum. IMPORTANCE Dermatophytoses (ringworm, jock itch, athlete's foot, and nail infections) are the most common fungal infections, but their virulence mechanisms are poorly understood.
Combining transcriptomic data obtained from growth under various culture conditions with data obtained during infection led to a significantly improved genome annotation. About 65% of the protein-encoding genes predicted with our protocol did not match the existing annotation for A. benhamiae. Comparing gene expression during infection on guinea pigs with keratin degradation in vitro, which is supposed to mimic the host environment, revealed the critical importance of using real in vivo conditions for investigating virulence mechanisms. The analysis of genes expressed in vivo, encoding cell surface and secreted proteins, particularly proteases, led to the identification of new allergen and virulence factor candidates.
Mahan, Kristina M; Klingeman, Dawn M; Hettich, Robert L; Parry, Ronald J; Graham, David E
2016-01-21
Streptomyces vitaminophilus produces pyrrolomycins, which are halogenated polyketide antibiotics. Some of the pyrrolomycins contain a rare nitro group located on the pyrrole ring. The 6.5-Mbp genome encodes 5,941 predicted protein-coding sequences in 39 contigs with a 71.9% G+C content. Copyright © 2016 Mahan et al.
Chen, Gao; Murdoch, Robert W.; Mack, E. Erin; ...
2017-09-14
Dehalobacterium formicoaceticum utilizes dichloromethane as the sole energy source in defined anoxic bicarbonate-buffered mineral salt medium. The products are formate, acetate, inorganic chloride, and biomass. The bacterium’s genome was sequenced using PacBio, assembled, and annotated. The complete genome consists of one 3.77-Mb circular chromosome harboring 3,935 predicted protein-encoding genes.
Draft Genome Sequence of Hafnia paralvei Strain GTA-HAF03.
Kohlman, Melissa E; Carrillo, Catherine D; Wong, Alex
2015-02-19
Hafnia paralvei is a Gram-negative member of the Enterobacteriaceae family, closely related to the opportunistic pathogen Hafnia alvei. We report here the first draft genome sequence of H. paralvei, from the beef trim isolate GTA-HAF03, consisting of a 5.0-Mbp assembly encoding 4,382 proteins and 90 predicted RNAs. Copyright © 2015 Kohlman et al.
Kopylov, Artur T; Ilgisonis, Ekaterina V; Moysa, Alexander A; Tikhonova, Olga V; Zavialova, Maria G; Novikova, Svetlana E; Lisitsa, Andrey V; Ponomarenko, Elena A; Moshkovskii, Sergei A; Markin, Andrey A; Grigoriev, Anatoly I; Zgoda, Victor G; Archakov, Alexander I
2016-11-04
This work was aimed at estimating the concentrations of proteins encoded by human chromosome 18 (Chr 18) in plasma samples of 54 healthy male volunteers (aged 20-47). These young persons have been certified by the medical evaluation board as healthy subjects ready for space flight training. Over 260 stable isotope-labeled peptide standards (SIS) were synthesized to perform the measurements of proteins encoded by Chr 18. Selected reaction monitoring (SRM) with SIS allowed an estimate of the levels of 84 of 276 proteins encoded by Chr 18. These proteins were quantified in whole and depleted plasma samples. Concentration of the proteins detected varied from 10^-6 M (transthyretin, P02766) to 10^-11 M (P4-ATPase, O43861). A minor part of the proteins (mostly representing intracellular proteins) was characterized by extremely high interindividual variations. The results provide a background for studies of a potential biomarker in plasma among proteins encoded by Chr 18. The SRM raw data are available in ProteomeXchange repository (PXD004374).
Melendrez, Melanie C.; Lange, Rachel K.; Cohan, Frederick M.; Ward, David M.
2011-01-01
Previous research has shown that sequences of 16S rRNA genes and 16S-23S rRNA internal transcribed spacer regions may not have enough genetic resolution to define all ecologically distinct Synechococcus populations (ecotypes) inhabiting alkaline, siliceous hot spring microbial mats. To achieve higher molecular resolution, we studied sequence variation in three protein-encoding loci sampled by PCR from 60°C and 65°C sites in the Mushroom Spring mat (Yellowstone National Park, WY). Sequences were analyzed using the ecotype simulation (ES) and AdaptML algorithms to identify putative ecotypes. Between 4 and 14 times more putative ecotypes were predicted from variation in protein-encoding locus sequences than from variation in 16S rRNA and 16S-23S rRNA internal transcribed spacer sequences. The number of putative ecotypes predicted depended on the number of sequences sampled and the molecular resolution of the locus. Chao estimates of diversity indicated that few rare ecotypes were missed. Many ecotypes hypothesized by sequence analyses were different in their habitat specificities, suggesting different adaptations to temperature or other parameters that vary along the flow channel. PMID:21169433
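The abstract above refers to "Chao estimates of diversity" without naming the exact estimator. As a rough illustration only, the sketch below assumes the bias-corrected Chao1 form, S_obs + F1(F1 - 1) / (2(F2 + 1)), where F1 and F2 are the numbers of ecotypes seen exactly once and exactly twice. The function name `chao1` and the sample counts are hypothetical and are not taken from the study.

```python
from typing import Iterable


def chao1(counts: Iterable[int]) -> float:
    """Bias-corrected Chao1 richness estimate (assumed estimator, see lead-in).

    counts: number of sequences assigned to each putative ecotype.
    """
    observed = [c for c in counts if c > 0]      # ignore empty categories
    s_obs = len(observed)                        # ecotypes actually seen
    f1 = sum(1 for c in observed if c == 1)      # singletons
    f2 = sum(1 for c in observed if c == 2)      # doubletons
    # The bias-corrected form stays defined even when there are no doubletons.
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


if __name__ == "__main__":
    # Hypothetical sample: 7 observed ecotypes, 3 singletons, 2 doubletons.
    print(chao1([5, 3, 1, 1, 2, 2, 1]))
```

When the estimate is close to the observed count, few rare categories are likely to have been missed, which is the kind of conclusion the abstract draws.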
Jia, Fan; Gampala, Srinivas S.L.; Mittal, Amandeep; Luo, Qingjun; Rock, Christopher D.
2009-01-01
The 14,200 available full length Arabidopsis thaliana cDNAs in the Universal Plasmid System (UPS) donor vector pUNI51 should be applied broadly and efficiently to leverage a “functional map-space” of homologous plant genes. We have engineered Cre-lox UPS host acceptor vectors (pCR701-705) with N-terminal epitope tags in frame with the loxH site and downstream from the maize Ubiquitin promoter for use in transient protoplast expression assays and particle bombardment transformation of monocots. As an example of the utility of these vectors, we recombined them with several Arabidopsis cDNAs encoding Ser/Thr protein phosphatase type 2C (PP2Cs) known from genetic studies or predicted by hierarchical clustering meta-analysis to be involved in ABA and stress responses. Our functional results in Zea mays mesophyll protoplasts on ABA-inducible expression effects on the Late Embryogenesis Abundant promoter ProEm:GUS reporter were consistent with predictions and resulted in identification of novel activities of some PP2Cs. Deployment of these vectors can facilitate functional genomics and proteomics and identification of novel gene activities. PMID:19499346
Sillanpää, Jouko; Nallapareddy, Sreedhar R.; Prakash, Vittal P.; Qin, Xiang; Hook, Magnus; Weinstock, George M.; Murray, Barbara E.
2009-01-01
SUMMARY Attention has recently been drawn to Enterococcus faecium because of an increasing number of nosocomial infections caused by this species and its resistance to multiple antibacterial agents. However, relatively little is known about pathogenic determinants of this organism. We have previously identified a cell wall anchored collagen adhesin, Acm, produced by some isolates of E. faecium, and a secreted antigen, SagA, exhibiting broad spectrum binding to extracellular matrix proteins. Here, we analyzed the draft genome of strain TX0016 for potential MSCRAMMs (microbial surface component recognizing adhesive matrix molecules). Genome-based bioinformatics identified 22 predicted cell wall anchored E. faecium surface proteins (Fms) of which 15 (including Acm) have typical characteristics of MSCRAMMs including predicted folding into a modular architecture with multiple immunoglobulin-like domains. Functional characterization of one (Fms10, redesignated Scm for second collagen adhesin of E. faecium) revealed that recombinant Scm65 (A- and B-domains) and Scm36 (A-domain) bound efficiently to collagen type V in a concentration dependent manner, bound considerably less to collagen type I and fibrinogen, and differed from Acm in their binding specificities to collagen types IV and V. Results from far-UV circular dichroism of recombinant Scm36 and of Acm37 indicated that these proteins are rich in β-sheets, supporting our folding predictions. Whole-cell ELISA and FACS analyses unambiguously demonstrated surface expression of Scm in most E. faecium isolates. Strikingly, 11 of the 15 predicted MSCRAMMs clustered in four loci, each with a class C sortase gene; 9 of these showed similarity to Enterococcus faecalis Ebp pilus subunits and also contained motifs essential for pilus assembly.
Antibodies against one of the predicted major pilus proteins, Fms9 (redesignated as EbpCfm), detected a “ladder” pattern of high-molecular weight protein bands in a Western blot analysis of cell surface extracts from E. faecium, suggesting that EbpCfm is polymerized into a pilus structure. Further analysis of the transcripts of the corresponding gene cluster indicated that fms1 (ebpAfm), fms5 (ebpBfm) and ebpCfm are co-transcribed, consistent with pilus-encoding gene clusters of other gram-positive bacteria. All 15 genes occurred frequently in 30 clinically-derived diverse E. faecium isolates tested. The common occurrence of MSCRAMM and pilus-encoding genes and the presence of a second collagen-binding protein may have important implications for our understanding of this emerging pathogen. PMID:18832325
DOE Office of Scientific and Technical Information (OSTI.GOV)
Woon, J. S. K., E-mail: jameswoon@siswa.ukm.edu.my; Murad, A. M. A., E-mail: munir@ukm.edu.my; Abu Bakar, F. D., E-mail: fabyff@ukm.edu.my
A cellobiohydrolase B (CbhB) from Aspergillus niger ATCC 10574 was cloned and expressed in E. coli. CbhB has an open reading frame of 1611 bp encoding a putative polypeptide of 536 amino acids. Analysis of the encoded polypeptide predicted a molecular mass of 56.2 kDa, a cellulose binding module (CBM) and a catalytic module. In order to obtain the mRNA of cbhB, total RNA was extracted from A. niger cells induced by 1% Avicel. First strand cDNA was synthesized from total RNA via reverse transcription. The full length cDNA of cbhB was amplified by PCR and cloned into the cloning vector, pGEM-T Easy. A comparison between genomic DNA and cDNA sequences of cbhB revealed that the gene is intronless. Upon the removal of the signal peptide, the cDNA of cbhB was cloned into the expression vector pET-32b. However, the recombinant CbhB was expressed in Escherichia coli Origami DE3 as an insoluble protein. A homology model of CbhB predicted the presence of nine disulfide bonds in the protein structure which may have contributed to the improper folding of the protein and thus, resulting in inclusion bodies in E. coli.
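The reported ORF and polypeptide lengths can be cross-checked with simple codon arithmetic. The sketch below assumes the quoted ORF length includes the stop codon, which is consistent with 1611 bp encoding 536 amino acids. The helper name `protein_length_from_orf` is hypothetical.

```python
def protein_length_from_orf(orf_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by an ORF of orf_bp base pairs.

    Assumes one codon is three bases and, by default, that the final codon
    is a stop codon that contributes no residue.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    # cbhB: 1611 bp -> 537 codons -> 536 residues plus one stop codon.
    print(protein_length_from_orf(1611))
```

The same check can be applied to any ORF length quoted alongside a residue count. A mismatch would suggest either a typo or a length that excludes the stop codon.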
Liu, Tingli; Ye, Wenwu; Ru, Yanyan; Yang, Xinyu; Gu, Biao; Tao, Kai; Lu, Shan; Dong, Suomeng; Zheng, Xiaobo; Shan, Weixing; Wang, Yuanchao; Dou, Daolong
2011-01-01
Phytophthora sojae encodes hundreds of putative host cytoplasmic effectors with conserved FLAK motifs following signal peptides, termed crinkling- and necrosis-inducing proteins (CRN) or Crinkler. Their functions and mechanisms in pathogenesis are mostly unknown. Here, we identify a group of five P. sojae-specific CRN-like genes with high levels of sequence similarity, of which three are putative pseudogenes. Functional analysis shows that the two functional genes encode proteins with predicted nuclear localization signals that induce contrasting responses when expressed in Nicotiana benthamiana and soybean (Glycine max). PsCRN63 induces cell death, while PsCRN115 suppresses cell death elicited by the P. sojae necrosis-inducing protein (PsojNIP) or PsCRN63. Expression of CRN fragments with deleted signal peptides and FLAK motifs demonstrates that the carboxyl-terminal portions of PsCRN63 or PsCRN115 are sufficient for their activities. However, the predicted nuclear localization signal is required for PsCRN63 to induce cell death but not for PsCRN115 to suppress cell death. Furthermore, silencing of the PsCRN63 and PsCRN115 genes in P. sojae stable transformants leads to a reduction of virulence on soybean. Intriguingly, the silenced transformants lose the ability to suppress host cell death and callose deposition on inoculated plants. These results suggest a role for CRN effectors in the suppression of host defense responses.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Long, C.M.; Rohrmann, G.F.; Merrill, G.F., E-mail: merrillg@onid.orst.ed
2009-06-05
Open reading frame 92 of the Autographa californica baculovirus (Ac92) is one of about 30 core genes present in all sequenced baculovirus genomes. Computer analyses predicted that the Ac92 encoded protein (called p33) and several of its baculovirus orthologs were related to a family of flavin adenine dinucleotide (FAD)-linked sulfhydryl oxidases. Alignment of these proteins indicated that, although they were highly diverse, a number of amino acids in common with the Erv1p/Alrp family of sulfhydryl oxidases are present. Some of these conserved amino acids are predicted to stack against the isoalloxazine and adenine components of FAD, whereas others are involved in electron transfer. To investigate this relationship, Ac92 was expressed in bacteria as a His-tagged fusion protein, purified, and characterized both spectrophotometrically and for its enzymatic activity. The purified protein was found to have the color (yellow) and absorption spectrum consistent with it being a FAD-containing protein. Furthermore, it was demonstrated to have sulfhydryl oxidase activity using dithiothreitol and thioredoxin as substrates.
Long, C M; Rohrmann, G F; Merrill, G F
2009-06-05
Open reading frame 92 of the Autographa californica baculovirus (Ac92) is one of about 30 core genes present in all sequenced baculovirus genomes. Computer analyses predicted that the Ac92 encoded protein (called p33) and several of its baculovirus orthologs were related to a family of flavin adenine dinucleotide (FAD)-linked sulfhydryl oxidases. Alignment of these proteins indicated that, although they were highly diverse, a number of amino acids in common with the Erv1p/Alrp family of sulfhydryl oxidases are present. Some of these conserved amino acids are predicted to stack against the isoalloxazine and adenine components of FAD, whereas others are involved in electron transfer. To investigate this relationship, Ac92 was expressed in bacteria as a His-tagged fusion protein, purified, and characterized both spectrophotometrically and for its enzymatic activity. The purified protein was found to have the color (yellow) and absorption spectrum consistent with it being a FAD-containing protein. Furthermore, it was demonstrated to have sulfhydryl oxidase activity using dithiothreitol and thioredoxin as substrates.
Mitotic Cortical Waves Predict Future Division Sites by Encoding Positional and Size Information.
Xiao, Shengping; Tong, Cheesan; Yang, Yang; Wu, Min
2017-11-20
Dynamic spatial patterns such as traveling waves could theoretically encode spatial information, but little is known about whether or how they are employed by biological systems, especially higher eukaryotes. Here, we show that concentric target or spiral waves of active Cdc42 and the F-BAR protein FBP17 are invoked in adherent cells at the onset of mitosis. These waves predict the future sites of cell divisions and represent the earliest known spatial cues for furrow assembly. Unlike interphase waves, the frequencies and wavelengths of the mitotic waves display size-dependent scaling properties. While the positioning role of the metaphase waves requires microtubule dynamics, spindle and microtubule-independent inhibitory signals are propagated by the mitotic waves to ensure the singularity of furrow formation. Taken together, we propose that metaphase cortical waves integrate positional and cell size information for division-plane specification in adhesion-dependent cytokinesis. Copyright © 2017 Elsevier Inc. All rights reserved.
Complete Sequence of a 184-Kilobase Catabolic Plasmid from Sphingomonas aromaticivorans F199†
Romine, Margaret F.; Stillwell, Lisa C.; Wong, Kwong-Kwok; Thurston, Sarah J.; Sisk, Ellen C.; Sensen, Christoph; Gaasterland, Terry; Fredrickson, Jim K.; Saffer, Jeffrey D.
1999-01-01
The complete 184,457-bp sequence of the aromatic catabolic plasmid, pNL1, from Sphingomonas aromaticivorans F199 has been determined. A total of 186 open reading frames (ORFs) are predicted to encode proteins, of which 79 are likely directly associated with catabolism or transport of aromatic compounds. Genes that encode enzymes associated with the degradation of biphenyl, naphthalene, m-xylene, and p-cresol are predicted to be distributed among 15 gene clusters. The unusual coclustering of genes associated with different pathways appears to have evolved in response to similarities in biochemical mechanisms required for the degradation of intermediates in different pathways. A putative efflux pump and several hypothetical membrane-associated proteins were identified and predicted to be involved in the transport of aromatic compounds and/or intermediates in catabolism across the cell wall. Several genes associated with integration and recombination, including two group II intron-associated maturases, were identified in the replication region, suggesting that pNL1 is able to undergo integration and excision events with the chromosome and/or other portions of the plasmid. Conjugative transfer of pNL1 to another Sphingomonas sp. was demonstrated, and genes associated with this function were found in two large clusters. Approximately one-third of the ORFs (59 of them) have no obvious homology to known genes. PMID:10049392
Molecular mechanisms for protein-encoded inheritance
Wiltzius, Jed J. W.; Landau, Meytal; Nelson, Rebecca; Sawaya, Michael R.; Apostol, Marcin I.; Goldschmidt, Lukasz; Soriaga, Angela B.; Cascio, Duilio; Rajashankar, Kanagalaghatta; Eisenberg, David
2013-01-01
Strains are phenotypic variants, encoded by nucleic acid sequences in chromosomal inheritance and by protein “conformations” in prion inheritance and transmission. But how is a protein “conformation” stable enough to endure transmission between cells or organisms? Here new polymorphic crystal structures of segments of prion and other amyloid proteins offer structural mechanisms for prion strains. In packing polymorphism, prion strains are encoded by alternative packings (polymorphs) of β-sheets formed by the same segment of a protein; in a second mechanism, segmental polymorphism, prion strains are encoded by distinct β-sheets built from different segments of a protein. Both forms of polymorphism can produce enduring “conformations,” capable of encoding strains. These molecular mechanisms for transfer of information into prion strains share features with the familiar mechanism for transfer of information by nucleic acid inheritance, including sequence specificity and recognition by non-covalent bonds. PMID:19684598
Nanoparticles-cell association predicted by protein corona fingerprints
NASA Astrophysics Data System (ADS)
Palchetti, S.; Digiacomo, L.; Pozzi, D.; Peruzzi, G.; Micarelli, E.; Mahmoudi, M.; Caracciolo, G.
2016-06-01
In a physiological environment (e.g., blood and interstitial fluids) nanoparticles (NPs) will bind proteins shaping a "protein corona" layer. The long-lived protein layer tightly bound to the NP surface is referred to as the hard corona (HC) and encodes information that controls NP bioactivity (e.g. cellular association, cellular signaling pathways, biodistribution, and toxicity). Decrypting this complex code has become a priority to predict the NP biological outcomes. Here, we use a library of 16 lipid NPs of varying size (Ø ~ 100-250 nm) and surface chemistry (unmodified and PEGylated) to investigate the relationships between NP physicochemical properties (nanoparticle size, aggregation state and surface charge), protein corona fingerprints (PCFs), and NP-cell association. We found that none of the NPs' physicochemical properties alone was able to account for association with the human cervical cancer cell line (HeLa). For the entire library of NPs, a total of 436 distinct serum proteins were detected. We developed a predictive-validation modeling that provides a means of assessing the relative significance of the identified corona proteins. Interestingly, a minor fraction of the HC, consisting of only 8 PCFs, was identified as the main promoter of NP association with HeLa cells. Remarkably, identified PCFs have several receptors with high level of expression on the plasma membrane of HeLa cells. Electronic supplementary information (ESI) available: Table S1. Cell viability (%) and cell association of the different nanoparticles used. Table S2. Total number of identified proteins on the different nanoparticles used. Tables S3-S18. Top 25 most abundant corona proteins identified in the protein corona of nanoparticles NP2-NP16 following 1 hour incubation with HP. Table S19. List of descriptors used. Table S20. Potential targets of protein corona fingerprints with its own interaction score (mentha) and the expression median value in HeLa cells. Fig. S1 and S2. Effect of exposure to human plasma on size and zeta potential of NPs. Fig. S3. Predictive modeling of nanoparticle-cell association. See DOI: 10.1039/c6nr03898k
Campuzano, Susana; Serra, Beatriz; Llull, Daniel; García, José L; García, Pedro
2009-09-01
A Streptococcus mitis genomic DNA fragment carrying the SMT1224 gene encoding a putative beta-galactosidase was identified, cloned, and expressed in Escherichia coli. This gene encodes a protein 2,411 amino acids long with a predicted molecular mass of 268 kDa. The deduced protein contains an N-terminal signal peptide and a C-terminal choline-binding domain consisting of five consensus repeats, which facilitates the anchoring of the secreted enzyme to the cell wall. The choline-binding capacity of the protein facilitates its purification using DEAE-cellulose affinity chromatography, although its complete purification was achieved by constructing a His-tagged fusion protein. The recombinant protein was characterized as a monomeric beta-galactosidase showing a specific activity of around 2,500 U/mg of protein, with optimum temperature and pH ranges of 30 to 40 degrees C and 6.0 to 6.5, respectively. Enzyme activity is not inhibited by glucose, even at 200 mM, and remains highly stable in solution or immobilized at room temperature in the absence of protein stabilizers. In S. mitis, the enzyme was located attached to the cell surface, but a significant activity was also detected in the culture medium. This novel enzyme represents the first beta-galactosidase having a modular structure with a choline-binding domain, a peculiar property that can also be useful for some biotechnological applications.
Hoter, Abdullah; Amiri, Mahdi; Warda, Mohamad; Naim, Hassan Y
2018-05-27
Endoplasmin, or GRP94, is an ER-located stress inducible molecular chaperone implicated in the folding and assembly of many proteins. The Arabian one-humped camel lives in an environment of thermal stress, but is nevertheless able to counter the risk of misfolded proteins. Here, the cDNA encoding camel GRP94 was isolated by rapid amplification of cDNA ends. The isolated cDNA contained an open reading frame of 2412 bp encoding a protein of 803 amino acids with predicted molecular mass of 92.5 kDa. Nucleotide and protein BLAST analysis of cGRP94 revealed strong conservation between camel and other domestic mammals. Overexpression of cGRP94 in COS-1 cells revealed multiple isoforms including one N-glycosylated species. Immunofluorescence colocalized cGRP94 with the ER resident protein calnexin. Interestingly, none of the cGRP94 isoforms expressed in CHO cells was N-glycosylated, presumably due to folding determinants that mask the N-glycosylation sites as proposed by in silico modelling. Surprisingly, isoforms of cGRP94 were detected in the culture media of transfected cells indicating that the protein, although an ER resident, also is trafficked and secreted into the exterior milieu. The overall striking structural homologies of GRP94s among mammals reflect their pivotal role in the ER quality control and protein homeostasis. Copyright © 2017. Published by Elsevier B.V.
Zhou, Jiyun; Lu, Qin; Xu, Ruifeng; He, Yulan; Wang, Hongpeng
2017-08-29
Prediction of DNA-binding residues is important for understanding the protein-DNA recognition mechanism. Many computational methods have been proposed for the prediction, but most of them do not consider the relationships of evolutionary information between residues. In this paper, we first propose a novel residue encoding method, referred to as the Position Specific Score Matrix (PSSM) Relation Transformation (PSSM-RT), to encode residues by utilizing the relationships of evolutionary information between residues. PDNA-62 and PDNA-224 are used to evaluate PSSM-RT and two existing PSSM encoding methods by five-fold cross-validation. Performance evaluations indicate that PSSM-RT is more effective than previous methods. This validates the point that the relationship of evolutionary information between residues is indeed useful in DNA-binding residue prediction. An ensemble learning classifier (EL_PSSM-RT) is also proposed by combining an ensemble learning model and PSSM-RT to better handle the imbalance between binding and non-binding residues in datasets. EL_PSSM-RT is evaluated by five-fold cross-validation using PDNA-62 and PDNA-224 as well as two independent datasets TS-72 and TS-61. Performance comparisons with existing predictors on the four datasets demonstrate that EL_PSSM-RT is the best-performing method among all the predicting methods, with improvements of 0.02-0.07 for MCC, 4.18-21.47% for ST and 0.013-0.131 for AUC. Furthermore, we analyze the importance of the pair-relationships extracted by PSSM-RT and the results validate the usefulness of PSSM-RT for encoding DNA-binding residues. We propose a novel prediction method for the prediction of DNA-binding residues with the inclusion of the relationship of evolutionary information and ensemble learning.
Performance evaluation shows that the relationship of evolutionary information between residues is indeed useful in DNA-binding residue prediction and ensemble learning can be used to address the data imbalance issue between binding and non-binding residues. A web service of EL_PSSM-RT ( http://hlt.hitsz.edu.cn:8080/PSSM-RT_SVM/ ) is provided for free access to the biological research community.
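The abstract above reports gains in MCC (Matthews correlation coefficient), a metric often preferred when binding and non-binding residues are heavily imbalanced. The sketch below shows how MCC is computed from a confusion matrix. The function name `mcc` and the example counts are illustrative assumptions and are not values from the paper.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / math.sqrt(denom)


if __name__ == "__main__":
    # Hypothetical imbalanced case: few binding residues, many non-binding.
    print(round(mcc(tp=40, tn=900, fp=50, fn=10), 3))
```

Unlike plain accuracy, MCC stays near zero for a predictor that labels every residue as non-binding, which is why it suits imbalanced residue-level datasets.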
Thionin-D4E1 chimeric protein protects plants against bacterial infections
Stover, Eddie W; Gupta, Goutam; Hao, Guixia
2017-08-08
The generation of a chimeric protein containing a first domain encoding either a pro-thionin or thionin, a second domain encoding D4E1 or pro-D4E1, and a third domain encoding a peptide linker located between the first domain and second domain is described. Either the first domain or the second domain is located at the amino terminal of the chimeric protein and the other domain (second domain or first domain, respectively) is located at the carboxyl terminal. The chimeric protein has antibacterial activity. Genetically altered plants and their progeny expressing a polynucleotide encoding the chimeric protein resist diseases caused by bacteria.
Vibrio Phage KVP40 Encodes a Functional NAD+ Salvage Pathway.
Lee, Jae Yun; Li, Zhiqun; Miller, Eric S
2017-05-01
The genome of T4-type Vibrio bacteriophage KVP40 has five genes predicted to encode proteins of pyridine nucleotide metabolism, of which two, nadV and natV, would suffice for an NAD+ salvage pathway. NadV is an apparent nicotinamide phosphoribosyltransferase (NAmPRTase), and NatV is an apparent bifunctional nicotinamide mononucleotide adenylyltransferase (NMNATase) and nicotinamide-adenine dinucleotide pyrophosphatase (Nudix hydrolase). Genes encoding the predicted salvage pathway were cloned and expressed in Escherichia coli, the proteins were purified, and their enzymatic properties were examined. KVP40 NadV NAmPRTase is active in vitro, and a clone complements a Salmonella mutant defective in both the bacterial de novo and salvage pathways. Similar to other NAmPRTases, the KVP40 enzyme displayed ATPase activity indicative of energy coupling in the reaction mechanism. The NatV NMNATase activity was measured in a coupled reaction system demonstrating NAD+ biosynthesis from nicotinamide, phosphoribosyl pyrophosphate, and ATP. The NatV Nudix hydrolase domain was also shown to be active, with preferred substrates of ADP-ribose, NAD+, and NADH. Expression analysis using reverse transcription-quantitative PCR (qRT-PCR) and enzyme assays of infected Vibrio parahaemolyticus cells demonstrated nadV and natV transcription during the early and delayed-early periods of infection when other KVP40 genes of nucleotide precursor metabolism are expressed. The distribution and phylogeny of NadV and NatV proteins among several large double-stranded DNA (dsDNA) myophages, and also those from some very large siphophages, suggest broad relevance of pyridine nucleotide scavenging in virus-infected cells. NAD+ biosynthesis presents another important metabolic resource control point by large, rapidly replicating dsDNA bacteriophages.
IMPORTANCE T4-type bacteriophages enhance DNA precursor synthesis through reductive reactions that use NADH/NADPH as the electron donor and NAD+ for ADP-ribosylation of proteins involved in transcribing and translating the phage genome. We show here that phage KVP40 encodes a functional pyridine nucleotide scavenging pathway that is expressed during the metabolic period of the infection cycle. The pathway is conserved in other large, dsDNA phages in which the two genes, nadV and natV, share an evolutionary history in their respective phage-host group. Copyright © 2017 American Society for Microbiology.
Pasion, S G; Hines, J C; Aebersold, R; Ray, D S
1992-01-01
A type II DNA topoisomerase, topoIImt, was shown previously to be associated with the kinetoplast DNA of the trypanosomatid Crithidia fasciculata. The gene encoding this kinetoplast-associated topoisomerase has been cloned by immunological screening of a Crithidia genomic expression library with monoclonal antibodies raised against the purified enzyme. The gene CfaTOP2 is a single copy gene and is expressed as a 4.8-kb polyadenylated transcript. The nucleotide sequence of CfaTOP2 has been determined and encodes a predicted polypeptide of 1239 amino acids with a molecular mass of 138,445. The identification of the cloned gene is supported by immunoblot analysis of the beta-galactosidase-CfaTOP2 fusion protein expressed in Escherichia coli and by analysis of tryptic peptide sequences derived from purified topoIImt. CfaTOP2 shares significant homology with nuclear type II DNA topoisomerases of other eukaryotes suggesting that in Crithidia both nuclear and mitochondrial forms of topoisomerase II are encoded by the same gene.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sun Wei; Huang Youhua; Zhao Zhe
2006-12-08
The 3β-hydroxysteroid dehydrogenase (3β-HSD) isoenzymes play a key role in cellular steroid hormone synthesis. Here, a 3β-HSD gene homolog was cloned from Rana grylio virus (RGV), a member of family Iridoviridae. RGV 3β-HSD gene has 1068 bp, encoding a 355 aa predicted protein. Transcription analyses showed that RGV 3β-HSD gene was transcribed immediate-early during infection from an initiation site 19 nucleotides upstream of the translation start site. Confocal microscopy revealed that the 3β-HSD-EGFP fusion protein was exclusively colocalized with the mitochondria marker (pDsRed2-Mito) in EPC cells. Upon morphological observation and MTT assay, it was revealed that overexpression of RGV 3β-HSD in EPC cells could apparently suppress RGV-induced cytopathic effect (CPE). The present studies indicate that the RGV immediate-early 3β-HSD gene encodes a mitochondria-localized protein, which has a novel role in suppressing virus-induced CPE. All these suggest that RGV 3β-HSD might be a protein involved in host-virus interaction.
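The gene and protein lengths quoted above are mutually consistent if the 1068 bp figure includes the stop codon; the abstract does not say so, so that is an assumption here. A minimal Python sketch of the arithmetic, using a hypothetical helper name:

```python
def protein_length_from_orf(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an uninterrupted ORF of orf_bp nucleotides.

    Assumes no introns and, by default, that the count includes one stop codon.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # The stop codon occupies one codon but adds no residue.
    return codons - 1 if includes_stop else codons


print(protein_length_from_orf(1068))  # 1068 / 3 = 356 codons, minus stop = 355
```

Under that assumption, 356 codons minus one stop codon gives the 355 aa stated in the abstract.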
Fuchs, W; Ziemann, K; Teifke, J P; Werner, O; Mettenleiter, T C
2000-03-01
The DNA sequence of the infectious laryngotracheitis virus (ILTV) UL50, UL51 and UL52 gene homologues was determined. Although the deduced UL50 protein lacks the first of five conserved domains of the corresponding proteins of mammalian alphaherpesviruses, the ILTV gene product was also shown to possess dUTPase activity. The generation of UL50-negative ILTV mutants was facilitated by recombination plasmids encoding green fluorescent protein (GFP), and expression constructs of predicted transactivator proteins of ILTV (alphaTIF, ICP4) were successfully used to increase the infectivity of viral genomic DNA. A GFP-expressing UL50-deletion mutant of ILTV showed reduced cell-to-cell spread in vitro, and was attenuated in vivo. A similar deletion mutant without the foreign gene, however, propagated like wild-type ILTV in cell culture and was pathogenic in chickens. We conclude that the viral dUTPase is not required for efficient replication of ILTV in the respiratory tract of infected animals. The replication defect of the GFP-expressing ILTV recombinant is most likely caused by toxic effects of the reporter gene product, since spontaneously occurring inactivation mutants exhibited wild-type-like growth.
Sequencing proteins with transverse ionic transport in nanochannels.
Boynton, Paul; Di Ventra, Massimiliano
2016-05-03
De novo protein sequencing is essential for understanding cellular processes that govern the function of living organisms and all sequence modifications that occur after a protein has been constructed from its corresponding DNA code. By obtaining the order of the amino acids that compose a given protein one can then determine both its secondary and tertiary structures through structure prediction, which is used to create models for protein aggregation diseases such as Alzheimer's Disease. Here, we propose a new technique for de novo protein sequencing that involves translocating a polypeptide through a synthetic nanochannel and measuring the ionic current of each amino acid through an intersecting perpendicular nanochannel. We find that the distribution of ionic currents for each of the 20 proteinogenic amino acids encoded by eukaryotic genes is statistically distinct, showing this technique's potential for de novo protein sequencing.
Tetrahymena thermophila acidic ribosomal protein L37 contains an archaebacterial type of C-terminus.
Hansen, T S; Andreasen, P H; Dreisig, H; Højrup, P; Nielsen, H; Engberg, J; Kristiansen, K
1991-09-15
We have cloned and characterized a Tetrahymena thermophila macronuclear gene (L37) encoding the acidic ribosomal protein (A-protein) L37. The gene contains a single intron located in the 3'-part of the coding region. Two major and three minor transcription start points (tsp) were mapped 39 to 63 nucleotides upstream from the translational start codon. The uppermost tsp mapped to the first T in a putative T. thermophila RNA polymerase II initiator element, TATAA. The coding region of L37 predicts a protein of 109 amino acid (aa) residues. A substantial part of the deduced aa sequence was verified by protein sequencing. The T. thermophila L37 clearly belongs to the P1-type family of eukaryotic A-proteins, but the C-terminal region has the hallmarks of archaebacterial A-proteins.
Novel RepA-MCM proteins encoded in plasmids pTAU4, pORA1 and pTIK4 from Sulfolobus neozealandicus
Greve, Bo; Jensen, Susanne; Phan, Hoa; Brügger, Kim; Zillig, Wolfram; She, Qunxin; Garrett, Roger A.
2005-01-01
Three plasmids isolated from the crenarchaeal thermoacidophile Sulfolobus neozealandicus were characterized. Plasmids pTAU4 (7,192 bp), pORA1 (9,689 bp) and pTIK4 (13,638 bp) show unusual properties that distinguish them from previously characterized cryptic plasmids of the genus Sulfolobus. Plasmids pORA1 and pTIK4 encode RepA proteins, only the former of which carries the novel polymerase–primase domain of other known Sulfolobus plasmids. Plasmid pTAU4 encodes a mini-chromosome maintenance protein homolog and no RepA protein; the implications for DNA replication are considered. Plasmid pORA1 is the first Sulfolobus plasmid to be characterized that does not encode the otherwise highly conserved DNA-binding PlrA protein. Another encoded protein appears to be specific for the New Zealand plasmids. The three plasmids should provide useful model systems for functional studies of these important crenarchaeal proteins. PMID:15876565
Fast and Forceful Refolding of Stretched α-Helical Solenoid Proteins
Kim, Minkyu; Abdi, Khadar; Lee, Gwangrog; Rabbi, Mahir; Lee, Whasil; Yang, Ming; Schofield, Christopher J.; Bennett, Vann; Marszalek, Piotr E.
2010-01-01
Anfinsen's thermodynamic hypothesis implies that proteins can encode for stretching through reversible loss of structure. However, large in vitro extensions of proteins that occur through a progressive unfolding of their domains typically dissipate a significant amount of energy, and therefore are not thermodynamically reversible. Some coiled-coil proteins have been found to stretch nearly reversibly, although their extension is typically limited to 2.5 times their folded length. Here, we report investigations on the mechanical properties of individual molecules of ankyrin-R, β-catenin, and clathrin, which are representative examples of over 800 predicted human proteins composed of tightly packed α-helical repeats (termed ANK, ARM, or HEAT repeats, respectively) that form spiral-shaped protein domains. Using atomic force spectroscopy, we find that these polypeptides possess unprecedented stretch ratios on the order of 10–15, exceeding that of other proteins studied so far, and their extension and relaxation occurs with minimal energy dissipation. Their sequence-encoded elasticity is governed by stepwise unfolding of small repeats, which upon relaxation of the stretching force rapidly and forcefully refold, minimizing the hysteresis between the stretching and relaxing parts of the cycle. Thus, we identify a new class of proteins that behave as highly reversible nanosprings that have the potential to function as mechanosensors in cells and as building blocks in springy nanostructures. Our physical view of the protein component of cells as being comprised of predominantly inextensible structural elements under tension may need revision to incorporate springs. PMID:20550922
Zheng, Linli; Ge, Yumei; Hu, Weilin; Yan, Jie
2013-03-01
To determine expression changes of major outer membrane protein (OMP) antigens of Leptospira interrogans serogroup Icterohaemorrhagiae serovar Lai strain Lai during infection of human macrophages and its mechanism. OmpR-encoding genes and the OmpR-related histidine kinase (HK)-encoding gene of L. interrogans strain Lai and their functional domains were predicted using bioinformatics techniques. mRNA level changes of the leptospiral major OMP-encoding genes before and after infection of human THP-1 macrophages were detected by real-time fluorescence quantitative RT-PCR. Effects of the OmpR-encoding genes and HK-encoding gene on the expression of leptospiral OMPs during infection were determined by HK-peptide antiserum block assay and closantel inhibitive assays. The bioinformatics analysis indicated that LB015 and LB333 were OmpR-encoding genes of the spirochete, while LB014 might act as an OmpR-related HK-encoding gene. After the spirochete infected THP-1 cells, mRNA levels of leptospiral lipL21, lipL32 and lipL41 genes were rapidly and persistently down-regulated (P<0.01), whereas mRNA levels of leptospiral groEL, mce, loa22 and ligB genes were rapidly but transiently up-regulated (P<0.01). The treatment with closantel and HK-peptide antiserum partly reversed the infection-based down-regulated mRNA levels of lipL21 and lipL48 genes (P<0.01). Moreover, closantel caused a decrease of the infection-based up-regulated mRNA levels of groEL, mce, loa22 and ligB genes (P<0.01). Expression levels of L. interrogans strain Lai major OMP antigens present notable changes during infection of human macrophages. There is a group of OmpR- and HK-encoding genes which may play a major role in down-regulation of expression levels of partial OMP antigens during infection.
Turkowski, Kari L; Tester, David J; Bos, J Martijn; Haugaa, Kristina H; Ackerman, Michael J
2017-03-01
Arrhythmogenic cardiomyopathy (ACM) is a heritable disease characterized by fibrofatty replacement of cardiomyocytes, has a prevalence of approximately 1 in 5000 individuals, and accounts for approximately 20% of sudden cardiac death in the young (≤35 years). ACM is most often inherited as an autosomal dominant trait with incomplete penetrance and variable expression. While mutations in several genes that encode key desmosomal proteins underlie about half of all ACM, the remainder is elusive genetically. Here, whole exome sequencing (WES) was performed with genomic triangulation in an effort to identify a novel explanation for a phenotype-positive, genotype-negative multi-generational pedigree with a presumed autosomal dominant, maternal inheritance of ACM. WES and genomic triangulation was performed on a symptomatic 14-year-old female proband, her affected mother and affected sister, and her unaffected father to elucidate a novel ACM-susceptibility gene for this pedigree. Following variant filtering using Ingenuity® Variant Analysis, gene priority ranking was performed on the candidate genes using ToppGene and Endeavour. The phylogenetic and physiochemical properties of candidate mutations were assessed further by 6 in silico prediction tools. Species alignment and amino acid conservation analysis was performed using the Uniprot Consortium. Tissue expression data was abstracted from Expression Atlas. Following WES and genomic triangulation, CDH2 emerged as a novel, autosomal dominant, ACM-susceptibility gene. The CDH2-encoded N-cadherin is a cell-cell adhesion protein predominately expressed in the heart. Cardiac dysfunction has been demonstrated in prior CDH2 knockout and over-expression animal studies. Further in silico mutation prediction, species conservation, and protein expression analysis supported the ultra-rare (minor allele frequency <0.005%) p.Asp407Asn-CDH2 variant as a likely pathogenic variant. 
Herein, it is demonstrated that genetic mutations in CDH2-encoded N-cadherin may represent a novel pathogenetic basis for ACM in humans. The prevalence of CDH2-mediated ACM in heretofore genetically elusive ACM remains to be determined. © 2017 Wiley Periodicals, Inc.
Cytochrome P460 Genes from the Methanotroph Methylococcus capsulatus Bath
Bergmann, David J.; Zahn, James A.; Hooper, Alan B.; DiSpirito, Alan A.
1998-01-01
P460 cytochromes catalyze the oxidation of hydroxylamine to nitrite. They have been isolated from the ammonia-oxidizing bacterium Nitrosomonas europaea (R. H. Erickson and A. B. Hooper, Biochim. Biophys. Acta 275:231–244, 1972) and the methane-oxidizing bacterium Methylococcus capsulatus Bath (J. A. Zahn et al., J. Bacteriol. 176:5879–5887, 1994). A degenerate oligonucleotide probe was synthesized based on the N-terminal amino acid sequence of cytochrome P460 and used to identify a DNA fragment from M. capsulatus Bath that contains cyp, the gene encoding cytochrome P460. cyp is part of a gene cluster that contains three open reading frames (ORFs), the first predicted to encode a 59,000-Da membrane-bound polypeptide, the second predicted to encode a 12,000-Da periplasmic protein, and the third (cyp) encoding cytochrome P460. The products of the first two ORFs have no apparent similarity to any proteins in the GenBank database. The overall sequence similarity of the P460 cytochromes from M. capsulatus Bath and N. europaea was low (24.3% of residues identical), although short regions of conserved residues are present in the two proteins. Both cytochromes have a C-terminal, c-heme binding motif (CXXCH) and a conserved lysine residue (K61) that may provide an additional covalent cross-link to the heme (D. M. Arciero and A. B. Hooper, FEBS Lett. 410:457–460, 1997). Gene probing using cyp indicated that a cytochrome P460 similar to that from M. capsulatus Bath may be present in the type II methanotrophs Methylosinus trichosporium OB3b and Methylocystis parvus OBBP but not in the type I methanotrophs Methylobacter marinus A45, Methylomicrobium albus BG8, and Methylomonas sp. strains MN and MM2. Immunoblot analysis with antibodies against cytochrome P460 from M. capsulatus Bath indicated that the expression level of cytochrome P460 was not affected either by expression of the two different methane monooxygenases or by addition of ammonia to the culture medium. PMID:9851984
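The c-heme binding motif mentioned above (CXXCH, where X is any residue) can be located in a protein sequence with a simple pattern search. The sketch below is illustrative only: the function name and the toy sequence are hypothetical, and it is not the method used in the study.

```python
import re

# Lookahead so that overlapping occurrences of the motif are also reported.
CXXCH = re.compile(r"(?=(C[A-Z]{2}CH))")


def find_heme_c_motifs(seq: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched motif) for every CXXCH site."""
    return [(m.start() + 1, m.group(1)) for m in CXXCH.finditer(seq.upper())]


# Toy sequence, not taken from cytochrome P460.
print(find_heme_c_motifs("MKAACSGCHKL"))  # [(5, 'CSGCH')]
```

Positions are reported 1-based to match the residue numbering convention used in the abstract (for example K61).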
Ndah, Elvis; Jonckheere, Veronique
2017-01-01
Proteogenomics is an emerging research field yet lacking a uniform method of analysis. Proteogenomic studies in which N-terminal proteomics and ribosome profiling are combined, suggest that a high number of protein start sites are currently missing in genome annotations. We constructed a proteogenomic pipeline specific for the analysis of N-terminal proteomics data, with the aim of discovering novel translational start sites outside annotated protein coding regions. In summary, unidentified MS/MS spectra were matched to a specific N-terminal peptide library encompassing protein N termini encoded in the Arabidopsis thaliana genome. After a stringent false discovery rate filtering, 117 protein N termini compliant with N-terminal methionine excision specificity and indicative of translation initiation were found. These include N-terminal protein extensions and translation from transposable elements and pseudogenes. Gene prediction provided supporting protein-coding models for approximately half of the protein N termini. Besides the prediction of functional domains (partially) contained within the newly predicted ORFs, further supporting evidence of translation was found in the recently released Araport11 genome re-annotation of Arabidopsis and computational translations of sequences stored in public repositories. Most interestingly, complementary evidence by ribosome profiling was found for 23 protein N termini. Finally, by analyzing protein N-terminal peptides, an in silico analysis demonstrates the applicability of our N-terminal proteogenomics strategy in revealing protein-coding potential in species with well- and poorly-annotated genomes. PMID:28432195
Willems, Patrick; Ndah, Elvis; Jonckheere, Veronique; Stael, Simon; Sticker, Adriaan; Martens, Lennart; Van Breusegem, Frank; Gevaert, Kris; Van Damme, Petra
2017-06-01
Proteogenomics is an emerging research field yet lacking a uniform method of analysis. Proteogenomic studies in which N-terminal proteomics and ribosome profiling are combined, suggest that a high number of protein start sites are currently missing in genome annotations. We constructed a proteogenomic pipeline specific for the analysis of N-terminal proteomics data, with the aim of discovering novel translational start sites outside annotated protein coding regions. In summary, unidentified MS/MS spectra were matched to a specific N-terminal peptide library encompassing protein N termini encoded in the Arabidopsis thaliana genome. After a stringent false discovery rate filtering, 117 protein N termini compliant with N-terminal methionine excision specificity and indicative of translation initiation were found. These include N-terminal protein extensions and translation from transposable elements and pseudogenes. Gene prediction provided supporting protein-coding models for approximately half of the protein N termini. Besides the prediction of functional domains (partially) contained within the newly predicted ORFs, further supporting evidence of translation was found in the recently released Araport11 genome re-annotation of Arabidopsis and computational translations of sequences stored in public repositories. Most interestingly, complementary evidence by ribosome profiling was found for 23 protein N termini. Finally, by analyzing protein N-terminal peptides, an in silico analysis demonstrates the applicability of our N-terminal proteogenomics strategy in revealing protein-coding potential in species with well- and poorly-annotated genomes. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
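The filter on N-terminal methionine excision can be illustrated with the commonly cited rule that the initiator methionine is removed when the second residue is small (Ala, Cys, Gly, Pro, Ser, Thr or Val). Whether the pipeline applies exactly this residue set is an assumption; the helper below is a hypothetical sketch, not the authors' implementation.

```python
# Residues after which the initiator Met is typically removed (assumed rule set).
SMALL_RESIDUES = set("ACGPSTV")


def mature_n_terminus(protein: str, length: int = 5) -> str:
    """Return the first `length` residues of the expected mature N terminus."""
    seq = protein.upper()
    # Remove the initiator Met only when it is followed by a small residue.
    if len(seq) >= 2 and seq[0] == "M" and seq[1] in SMALL_RESIDUES:
        seq = seq[1:]
    return seq[:length]


print(mature_n_terminus("MASKLV"))  # ASKLV (Met removed, Ala exposed)
print(mature_n_terminus("MKTLLA"))  # MKTLL (Met retained before Lys)
```

An observed N-terminal peptide would be considered compliant in this sketch if it starts with the residues returned for the candidate open reading frame.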
Robust enzyme design: bioinformatic tools for improved protein stability.
Suplatov, Dmitry; Voevodin, Vladimir; Švedas, Vytas
2015-03-01
The ability of proteins and enzymes to maintain a functionally active conformation under adverse environmental conditions is an important feature of biocatalysts, vaccines, and biopharmaceutical proteins. From an evolutionary perspective, robust stability of proteins improves their biological fitness and allows for further optimization. Viewed from an industrial perspective, enzyme stability is crucial for the practical application of enzymes under the required reaction conditions. In this review, we analyze bioinformatic-driven strategies that are used to predict structural changes that can be applied to wild type proteins in order to produce more stable variants. The most commonly employed techniques can be classified into stochastic approaches, empirical or systematic rational design strategies, and design of chimeric proteins. We conclude that bioinformatic analysis can be efficiently used to study large protein superfamilies systematically as well as to predict particular structural changes which increase enzyme stability. Evolution has created a diversity of protein properties that are encoded in genomic sequences and structural data. Bioinformatics has the power to uncover this evolutionary code and provide a reproducible selection of hotspots - key residues to be mutated in order to produce more stable and functionally diverse proteins and enzymes. Further development of systematic bioinformatic procedures is needed to organize and analyze sequences and structures of proteins within large superfamilies and to link them to function, as well as to provide knowledge-based predictions for experimental evaluation. Copyright © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Interspecific and host-related gene expression patterns in nematode-trapping fungi.
Andersson, Karl-Magnus; Kumar, Dharmendra; Bentzer, Johan; Friman, Eva; Ahrén, Dag; Tunlid, Anders
2014-11-11
Nematode-trapping fungi are soil-living fungi that capture and kill nematodes using special hyphal structures called traps. They display a large diversity of trapping mechanisms and differ in their host preferences. To provide insights into the genetic basis for this variation, we compared the transcriptome expressed by three species of nematode-trapping fungi (Arthrobotrys oligospora, Monacrosporium cionopagum and Arthrobotrys dactyloides, which use adhesive nets, adhesive branches or constricting rings, respectively, to trap nematodes) during infection of two different plant-pathogenic nematode hosts (the root knot nematode Meloidogyne hapla and the sugar beet cyst nematode Heterodera schachtii). The divergence in gene expression between the fungi was significantly larger than that related to the nematode species being infected. Transcripts predicted to encode secreted proteins and proteins with unknown function (orphans) were overrepresented among the highly expressed transcripts in all fungi. Genes that were highly expressed in all fungi encoded endopeptidases, such as subtilisins and aspartic proteases; cell-surface proteins containing the carbohydrate-binding domain WSC; stress response proteins; membrane transporters; transcription factors; and transcripts containing the Ricin-B lectin domain. Differentially expressed transcripts among the fungal species encoded various lectins, such as the fungal fruit-body lectin and the D-mannose binding lectin; transcription factors; cell-signaling components; proteins containing a WSC domain; and proteins containing a DUF3129 domain. A small set of transcripts were differentially expressed in infections of different host nematodes, including peptidases, WSC domain proteins, tyrosinases, and small secreted proteins with unknown function. This is the first study on the variation of infection-related gene expression patterns in nematode-trapping fungi infecting different host species. 
A better understanding of these patterns will facilitate the improvements of these fungi in biological control programs, by providing molecular markers for screening programs and candidates for genetic manipulations of virulence and host preferences.
USDA-ARS's Scientific Manuscript database
Plant resistance (R) genes typically encode proteins with nucleotide binding site-leucine rich repeat (NLR) domains. We identified a novel, broad-spectrum rice blast R gene, Ptr, encoding a non-NLR protein with four Armadillo repeats. Ptr was originally identified by fast neutron mutagenesis as a ...
Alternative intronic promoters in development and disease.
Vacik, Tomas; Raska, Ivan
2017-05-01
Approximately 20,000 mammalian genes are estimated to encode between 250 thousand and 1 million different proteins. This enormous diversity of the mammalian proteome is caused by the ability of a single-gene locus to encode multiple protein isoforms. Protein isoforms encoded by one gene locus can be functionally distinct, and they can even have antagonistic functions. One of the mechanisms involved in creating this proteome complexity is alternative promoter usage. Alternative intronic promoters are located downstream from their canonical counterparts and drive the expression of alternative RNA isoforms that lack upstream exons. These upstream exons can encode some important functional domains, and proteins encoded by alternative mRNA isoforms can be thus functionally distinct from the full-length protein encoded by canonical mRNA isoforms. Since any misbalance of functionally distinct protein isoforms is likely to have detrimental consequences for the cell and the whole organism, their expression must be precisely regulated. Misregulation of alternative intronic promoters is frequently associated with various developmental defects and diseases including cancer, and it is becoming increasingly clear that this phenomenon deserves more attention.
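The figures in the first sentence imply an average number of protein isoforms per gene. A short Python calculation, using only the numbers quoted in the abstract (the variable names are illustrative):

```python
genes = 20_000
proteins_low, proteins_high = 250_000, 1_000_000

# Average isoforms per gene at each end of the estimated range.
print(proteins_low / genes)   # 12.5
print(proteins_high / genes)  # 50.0
```

These are averages only; the abstract does not describe how isoforms are distributed across individual loci.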
Palmer, J E; Dikeman, D A; Fujinuma, T; Kim, B; Jones, J I; Denda, M; Martínez-Zapater, J M; Cruz-Alvarez, M
2001-04-01
The species Brassica oleracea includes several agricultural varieties characterized by the proliferation of different types of meristems. Using a combination of subtractive hybridization and PCR (polymerase chain reaction) techniques we have identified several genes which are expressed in the reproductive meristems of the cauliflower curd (B. oleracea var. botrytis) but not in the vegetative meristems of Brussels sprouts (B. oleracea var. gemmifera) axillary buds. One of the cloned genes, termed CCE1 (CAULIFLOWER CURD EXPRESSION 1) shows specific expression in the botrytis variety. Preferential expression takes place in this variety in the meristems of the curd and in the stem throughout the vegetative and reproductive stages of plant growth. CCE1 transcripts are not detected in any of the organs of other B. oleracea varieties analyzed. Based on the nucleotide sequence of a cDNA encompassing the complete coding region, we predict that this gene encodes a transmembrane protein, with three transmembrane domains. The deduced amino acid sequence includes motifs conserved in G-protein-coupled receptors (GPCRs) from yeast and animal species. Our results suggest that the cloned gene encodes a protein belonging to a new, so far unidentified, family of transmembrane receptors in plants. The expression pattern of the gene suggests that the receptor may be involved in the control of meristem development/arrest that takes place in cauliflower.
Transcriptomic analysis of the autophagy machinery in crustaceans.
Suwansa-Ard, Saowaros; Kankuan, Wilairat; Thongbuakaew, Tipsuda; Saetan, Jirawat; Kornthong, Napamanee; Kruangkum, Thanapong; Khornchatri, Kanjana; Cummins, Scott F; Isidoro, Ciro; Sobhon, Prasert
2016-08-09
The giant freshwater prawn, Macrobrachium rosenbergii, is a decapod crustacean that is commercially important as a food source. Farming of commercial crustaceans requires an efficient management strategy because the animals are easily subjected to stress and diseases during culture. Autophagy, a stress response process, is well-documented and conserved in most animals, yet it is poorly studied in crustaceans. In this study, we have performed an in silico search for transcripts encoding autophagy-related (Atg) proteins within various tissue transcriptomes of M. rosenbergii. Basic Local Alignment Search Tool (BLAST) search using previously known Atg proteins as queries revealed 41 transcripts encoding homologous M. rosenbergii Atg proteins. Among these Atg proteins, we selected commonly used autophagy markers, including Beclin 1, vacuolar protein sorting (Vps) 34, microtubule-associated proteins 1A/1B light chain 3B (MAP1LC3B), p62/sequestosome 1 (SQSTM1), and lysosomal-associated membrane protein 1 (Lamp-1) for further sequence analyses using comparative alignment and protein structural prediction. We found that crustacean autophagy marker proteins contain conserved motifs typical of other animal Atg proteins. Western blotting using commercial antibodies raised against human Atg marker proteins indicated their presence in various M. rosenbergii tissues, while immunohistochemistry localized Atg marker proteins within ovarian tissue, specifically late stage oocytes. This study demonstrates that the molecular components of the autophagic process are conserved in crustaceans and are comparable to those of the autophagic process in mammals. Furthermore, it provides a foundation for further studies of autophagy in crustaceans that may lead to a better understanding of reproduction- and stress-related autophagy, which will enable efficient aquaculture practices.
A Screen for Modifiers of Hedgehog Signaling in Drosophila melanogaster Identifies swm and mts
Casso, David J.; Liu, Songmei; Iwaki, D. David; Ogden, Stacey K.; Kornberg, Thomas B.
2008-01-01
Signaling by Hedgehog (Hh) proteins shapes most tissues and organs in both vertebrates and invertebrates, and its misregulation has been implicated in many human diseases. Although components of the signaling pathway have been identified, key aspects of the signaling mechanism and downstream targets remain to be elucidated. We performed an enhancer/suppressor screen in Drosophila to identify novel components of the pathway and identified 26 autosomal regions that modify a phenotypic readout of Hh signaling. Three of the regions include genes that contribute constituents to the pathway—patched, engrailed, and hh. One of the other regions includes the gene microtubule star (mts) that encodes a subunit of protein phosphatase 2A. We show that mts is necessary for full activation of Hh signaling. A second region includes the gene second mitotic wave missing (swm). swm is recessive lethal and is predicted to encode an evolutionarily conserved protein with RNA binding and Zn+ finger domains. Characterization of newly isolated alleles indicates that swm is a negative regulator of Hh signaling and is essential for cell polarity. PMID:18245841
Moodley, Yoshan; Uhr, Markus; Stamer, Christiana; Vauterin, Marc; Suerbaum, Sebastian; Achtman, Mark
2010-01-01
The Helicobacter pylori cag pathogenicity island (cagPAI) encodes a type IV secretion system. Humans infected with cagPAI–carrying H. pylori are at increased risk for sequelae such as gastric cancer. Housekeeping genes in H. pylori show considerable genetic diversity; but the diversity of virulence factors such as the cagPAI, which transports the bacterial oncogene CagA into host cells, has not been systematically investigated. Here we compared the complete cagPAI sequences for 38 representative isolates from all known H. pylori biogeographic populations. Their gene content and gene order were highly conserved. The phylogeny of most cagPAI genes was similar to that of housekeeping genes, indicating that the cagPAI was probably acquired only once by H. pylori, and its genetic diversity reflects the isolation by distance that has shaped this bacterial species since modern humans migrated out of Africa. Most isolates induced IL-8 release in gastric epithelial cells, indicating that the function of the Cag secretion system has been conserved despite some genetic rearrangements. More than one third of cagPAI genes, in particular those encoding cell-surface exposed proteins, showed signatures of diversifying (Darwinian) selection at more than 5% of codons. Several unknown gene products predicted to be under Darwinian selection are also likely to be secreted proteins (e.g. HP0522, HP0535). One of these, HP0535, is predicted to code for either a new secreted candidate effector protein or a protein which interacts with CagA because it contains two genetic lineages, similar to cagA. Our study provides a resource that can guide future research on the biological roles and host interactions of cagPAI proteins, including several whose function is still unknown. PMID:20808891
Olbermann, Patrick; Josenhans, Christine; Moodley, Yoshan; Uhr, Markus; Stamer, Christiana; Vauterin, Marc; Suerbaum, Sebastian; Achtman, Mark; Linz, Bodo
2010-08-19
The Helicobacter pylori cag pathogenicity island (cagPAI) encodes a type IV secretion system. Humans infected with cagPAI-carrying H. pylori are at increased risk for sequelae such as gastric cancer. Housekeeping genes in H. pylori show considerable genetic diversity; but the diversity of virulence factors such as the cagPAI, which transports the bacterial oncogene CagA into host cells, has not been systematically investigated. Here we compared the complete cagPAI sequences for 38 representative isolates from all known H. pylori biogeographic populations. Their gene content and gene order were highly conserved. The phylogeny of most cagPAI genes was similar to that of housekeeping genes, indicating that the cagPAI was probably acquired only once by H. pylori, and its genetic diversity reflects the isolation by distance that has shaped this bacterial species since modern humans migrated out of Africa. Most isolates induced IL-8 release in gastric epithelial cells, indicating that the function of the Cag secretion system has been conserved despite some genetic rearrangements. More than one third of cagPAI genes, in particular those encoding cell-surface exposed proteins, showed signatures of diversifying (Darwinian) selection at more than 5% of codons. Several unknown gene products predicted to be under Darwinian selection are also likely to be secreted proteins (e.g. HP0522, HP0535). One of these, HP0535, is predicted to code for either a new secreted candidate effector protein or a protein which interacts with CagA because it contains two genetic lineages, similar to cagA. Our study provides a resource that can guide future research on the biological roles and host interactions of cagPAI proteins, including several whose function is still unknown.
Judelson, Howard S; Ah-Fong, Audrey M V; Aux, George; Avrova, Anna O; Bruce, Catherine; Cakir, Cahid; da Cunha, Luis; Grenville-Briggs, Laura; Latijnhouwers, Maita; Ligterink, Wilco; Meijer, Harold J G; Roberts, Samuel; Thurber, Carrie S; Whisson, Stephen C; Birch, Paul R J; Govers, Francine; Kamoun, Sophien; van West, Pieter; Windass, John
2008-04-01
Much of the pathogenic success of Phytophthora infestans, the potato and tomato late blight agent, relies on its ability to generate from mycelia large amounts of sporangia, which release zoospores that encyst and form infection structures. To better understand these stages, Affymetrix GeneChips based on 15,650 unigenes were designed and used to profile the life cycle. Approximately half of P. infestans genes were found to exhibit significant differential expression between developmental transitions, with approximately one-tenth being stage-specific and most changes occurring during zoosporogenesis. Quantitative reverse-transcription polymerase chain reaction assays confirmed the robustness of the array results and showed that similar patterns of differential expression were obtained regardless of whether hyphae were from laboratory media or infected tomato. Differentially expressed genes encode potential cellular regulators, especially protein kinases; metabolic enzymes such as those involved in glycolysis, gluconeogenesis, or the biosynthesis of amino acids or lipids; regulators of DNA synthesis; structural proteins, including predicted flagellar proteins; and pathogenicity factors, including cell-wall-degrading enzymes, RXLR effector proteins, and enzymes protecting against plant defense responses. Curiously, some stage-specific transcripts do not appear to encode functional proteins. These findings reveal many new aspects of oomycete biology, as well as potential targets for crop protection chemicals.
Molecular Characterization of Bombyx mori Cytoplasmic Polyhedrosis Virus Genome Segment 4
Ikeda, Keiko; Nagaoka, Sumiharu; Winkler, Stefan; Kotani, Kumiko; Yagi, Hiroaki; Nakanishi, Kae; Miyajima, Shigetoshi; Kobayashi, Jun; Mori, Hajime
2001-01-01
The complete nucleotide sequence of the genome segment 4 (S4) of Bombyx mori cytoplasmic polyhedrosis virus (BmCPV) was determined. The 3,259-nucleotide sequence contains a single long open reading frame which spans nucleotides 14 to 3187 and which is predicted to encode a protein with a molecular mass of about 130 kDa. Western blot analysis showed that S4 encodes BmCPV protein VP3, which is one of the outer components of the BmCPV virion. Sequence analysis of the deduced amino acid sequence of BmCPV VP3 revealed possible sequence homology with proteins from rice ragged stunt virus (RRSV) S2, Nilaparvata lugens reovirus S4, and Fiji disease fijivirus S4. This may suggest that plant reoviruses originated from insect viruses and that RRSV emerged more recently than other plant reoviruses. A chimeric protein consisting of BmCPV VP3 and green fluorescent protein (GFP) was constructed and expressed with BmCPV polyhedrin using a baculovirus expression vector. The VP3-GFP chimera was incorporated into BmCPV polyhedra and released under alkaline conditions. The results indicate that specific interactions occur between BmCPV polyhedrin and VP3 which might facilitate BmCPV virion occlusion into the polyhedra. PMID:11134312
Expression of Fungal diacylglycerol acyltransferase2 Genes to Increase Kernel Oil in Maize
Oakes, Janette; Brackenridge, Doug; Colletti, Ron; Daley, Maureen; Hawkins, Deborah J.; Xiong, Hui; Mai, Jennifer; Screen, Steve E.; Val, Dale; Lardizabal, Kathryn; Gruys, Ken; Deikman, Jill
2011-01-01
Maize (Zea mays) oil has high value but is only about 4% of the grain by weight. To increase kernel oil content, fungal diacylglycerol acyltransferase2 (DGAT2) genes from Umbelopsis (formerly Mortierella) ramanniana and Neurospora crassa were introduced into maize using an embryo-enhanced promoter. The protein encoded by the N. crassa gene was longer than that of U. ramanniana. It included 353 amino acids that aligned to the U. ramanniana DGAT2A protein and a 243-amino acid sequence at the amino terminus that was unique to the N. crassa DGAT2 protein. Two forms of N. crassa DGAT2 were tested: the predicted full-length protein (L-NcDGAT2) and a shorter form (S-NcDGAT2) that encoded just the sequences that share homology with the U. ramanniana protein. Expression of all three transgenes in maize resulted in small but statistically significant increases in kernel oil. S-NcDGAT2 had the biggest impact on kernel oil, with a 26% (relative) increase in oil in kernels of the best events (inbred). Increases in kernel oil were also obtained in both conventional and high-oil hybrids, and grain yield was not affected by expression of these fungal DGAT2 transgenes. PMID:21245192
NASA Astrophysics Data System (ADS)
Basyuni, M.; Sulistiyono, N.; Wati, R.; Sumardi; Oku, H.; Baba, S.; Sagami, H.
2018-03-01
The cloning of the Kandelia obovata (previously known as Kandelia candel) KcCAS gene and the Rhizophora stylosa RsCAS gene, both of which encode cycloartenol synthases, has already been reported. In this study, the predicted KcCAS and RsCAS proteins were analyzed using the online tools Phyre2 and Swiss-model. Protein modelling of the KcCAS and RsCAS cycloartenol synthases with Phyre2 gave similar results, with slightly different sequence identities. By contrast, the Swiss-model result for KcCAS had a slightly higher sequence identity (47.31%) and Qmean (0.70) than that for RsCAS. No difference was found between the two cycloartenol synthases in the ligand binding sites, which are considered modulators. The predicted protein models covered amino acid residues 91-757, with a sequence coverage of 0.86, based on the template model of human lanosterol synthase. Homology modelling revealed that 706 residues (93% of the amino acid sequence) were modelled with 100.0% confidence by the single highest-scoring template for both KcCAS and RsCAS using Phyre2. This coverage was higher than that predicted by Swiss-model (86%). The present study suggests that both genes are responsible for the genesis of cycloartenol in these mangrove plants.
Constraint Logic Programming approach to protein structure prediction.
Dal Palù, Alessandro; Dovier, Agostino; Fogolari, Federico
2004-11-30
The protein structure prediction problem is one of the most challenging problems in biological sciences. Many approaches have been proposed using database information and/or simplified protein models. The protein structure prediction problem can be cast in the form of an optimization problem. Notwithstanding its importance, the problem has very seldom been tackled by Constraint Logic Programming, a declarative programming paradigm suitable for solving combinatorial optimization problems. Constraint Logic Programming techniques have been applied to the protein structure prediction problem on the face-centered cube lattice model. Molecular dynamics techniques, endowed with the notion of constraint, have also been exploited. Even using a very simplified model, Constraint Logic Programming on the face-centered cube lattice model allowed us to obtain acceptable results for a few small proteins. As a test implementation, their (known) secondary structure and the presence of disulfide bridges are used as constraints. Simplified structures obtained in this way have been converted to all-atom models with plausible structure. Results have been compared with a similar approach using a well-established technique such as molecular dynamics. The results obtained on small proteins show that Constraint Logic Programming techniques can be employed for studying simplified protein models, which can be converted into realistic all-atom models. The advantage of Constraint Logic Programming over other, much more explored, methodologies resides in the rapid software prototyping, in the easy way of encoding heuristics, and in exploiting all the advances made in this research area, e.g. in constraint propagation and its use for pruning the huge search space.
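To make the lattice model concrete, the sketch below enumerates the 12 nearest-neighbour steps of the face-centered cubic lattice and checks the two basic constraints that any lattice conformation must satisfy: consecutive residues sit on adjacent lattice points, and no point is occupied twice. It is a plain Python illustration rather than the authors' Constraint Logic Programming code, and the names `FCC_STEPS` and `is_valid_conformation` are hypothetical.

```python
from itertools import product

# On the FCC lattice (integer points with an even coordinate sum), each point
# has 12 nearest neighbours: steps with two coordinates of +/-1 and one zero.
FCC_STEPS = [v for v in product((-1, 0, 1), repeat=3)
             if sum(abs(c) for c in v) == 2]


def is_valid_conformation(points):
    """Return True if consecutive points are FCC neighbours and none repeats."""
    steps = set(FCC_STEPS)
    for a, b in zip(points, points[1:]):
        # Each chain bond must be one of the 12 allowed lattice steps.
        if tuple(q - p for p, q in zip(a, b)) not in steps:
            return False
    # Self-avoidance: every residue occupies a distinct lattice point.
    return len(set(points)) == len(points)


chain = [(0, 0, 0), (1, 1, 0), (2, 1, 1)]
print(len(FCC_STEPS), is_valid_conformation(chain))
```

A constraint solver would treat these two conditions as hard constraints and search only over the remaining degrees of freedom, which is where constraint propagation prunes the search space.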
Marathe, Ashish; Krishnan, Veda; Mahajan, Mahesh M; Thimmegowda, Vinutha; Dahuja, Anil; Jolly, Monica; Praveen, Shelly; Sachdev, Archana
2018-01-01
The soybean genome encodes a family of four inositol 1,3,4 trisphosphate 5/6 kinases which belong to the ATP-GRASP group of proteins. Inositol 1,3,4 trisphosphate kinase-2 (GmItpk2), catalyzing the ATP-dependent phosphorylation of inositol 1,3,4 trisphosphate (IP3) to inositol 1,3,4,5 tetraphosphate or inositol 1,3,4,6 tetraphosphate, is a key enzyme diverting the flux of the inositol phosphate pool towards phytate biosynthesis. Although considerable research on characterizing genes involved in phytate biosynthesis has been accomplished at the genomic and transcript level, characterization of the proteins is yet to be explored. In the present study, we report the isolation and expression of the single-copy Itpk2 (948 bp) from Glycine max cv Pusa-16, predicted to encode a 315 amino acid protein with an isoelectric point of 5.9. Sequence analysis revealed that GmITPK2 shared the highest similarity (80%) with Phaseolus vulgaris. The predicted 3D model confirmed 12 α helices and 14 β barrel sheets, with the ATP-binding site close to a β sheet towards the C-terminus of the protein molecule. Spatio-temporal transcript profiling showed GmItpk2 to be seed specific, with higher transcript levels in the early stage of seed development. The present study, using various molecular and bio-computational tools, could therefore help improve our understanding of this key enzyme, which may prove to be a potential target for generating the low phytate trait in a nutritionally rich crop like soybean.
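As a quick consistency check on the reported sizes, a 948 bp open reading frame holds 316 codons. If the last codon is a stop codon, the encoded protein has 315 residues, which matches the stated length. That the 948 bp figure includes the stop codon is an assumption here, since the abstract does not say so. The helper name below is illustrative only.

```python
def protein_length_from_orf(orf_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by an ORF of orf_bp nucleotides."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # The stop codon ends translation and does not add a residue.
    return codons - 1 if includes_stop else codons


print(protein_length_from_orf(948))
```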
1996-01-01
Mutations in the Caenorhabditis elegans gene unc-89 result in nematodes having disorganized muscle structure in which thick filaments are not organized into A-bands, and there are no M-lines. Beginning with a partial cDNA from the C. elegans sequencing project, we have cloned and sequenced the unc-89 gene. An unc-89 allele, st515, was found to contain an 84-bp deletion and a 10-bp duplication, resulting in an in-frame stop codon within predicted unc-89 coding sequence. Analysis of the complete coding sequence for unc-89 predicts a novel 6,632 amino acid polypeptide consisting of sequence motifs which have been implicated in protein-protein interactions. UNC-89 begins with 67 residues of unique sequences, SH3, dbl/CDC24, and PH domains, 7 immunoglobulin (Ig) domains, a putative KSP-containing multiphosphorylation domain, and ends with 46 Ig domains. A polyclonal antiserum raised to a portion of unc-89 encoded sequence reacts to a twitchin-sized polypeptide from wild type, but truncated polypeptides from st515 and from the amber allele e2338. By immunofluorescent microscopy, this antiserum localizes to the middle of A-bands, consistent with UNC-89 being a structural component of the M-line. Previous studies indicate that myofilament lattice assembly begins with positional cues laid down in the basement membrane and muscle cell membrane. We propose that the intracellular protein UNC-89 responds to these signals, localizes, and then participates in assembling an M-line. PMID:8603916
USDA-ARS's Scientific Manuscript database
The cattle tick of Australia, Rhipicephalus australis, is a vector for microbial parasites that cause serious bovine diseases. The Haller's organ, located in the tick's forelegs, is crucial for host detection and mating. To facilitate the development of new technologies for better control of this ag...
Bioinformatic Analysis of Strawberry GSTF12 Gene
NASA Astrophysics Data System (ADS)
Wang, Xiran; Jiang, Leiyu; Tang, Haoru
2018-01-01
GSTF12 is known as a key factor in proanthocyanidin accumulation in the plant testa. Bioinformatic analysis of the nucleotide and encoded protein sequences of GSTF12 benefits the study of genes in the anthocyanin biosynthesis and accumulation pathway. We therefore chose the GSTF12 genes of 11 species as the research object, downloaded their nucleotide and protein sequences from NCBI, identified the strawberry GSTF12 gene by bioinformatic analysis, and constructed a phylogenetic tree. We also analysed the physical and chemical properties of the strawberry GSTF12 gene and the structure of its protein. The phylogenetic tree showed that strawberry and petunia were the closest relatives. Protein prediction indicated that the protein has one signal peptide and no obvious transmembrane regions.
The Popeye Domain Containing Genes and Their Function as cAMP Effector Proteins in Striated Muscle.
Brand, Thomas
2018-03-13
The Popeye domain containing (POPDC) genes encode transmembrane proteins, which are abundantly expressed in striated muscle cells. Hallmarks of the POPDC proteins are the presence of three transmembrane domains and the Popeye domain, which makes up a large part of the cytoplasmic portion of the protein and functions as a cAMP-binding domain. Interestingly, despite the prediction of structural similarity between the Popeye domain and other cAMP-binding domains, at the protein sequence level they differ strongly from each other, suggesting an independent evolutionary origin of POPDC proteins. Loss-of-function experiments in zebrafish and mouse established an important role of POPDC proteins for cardiac conduction and heart rate adaptation after stress. Loss-of-function mutations in patients have been associated with limb-girdle muscular dystrophy and AV-block. These data suggest an important role of these proteins in the maintenance of structure and function of striated muscle cells.
Optimizing expression of the pregnancy malaria vaccine candidate, VAR2CSA in Pichia pastoris.
Avril, Marion; Hathaway, Marianne J; Cartwright, Megan M; Gose, Severin O; Narum, David L; Smith, Joseph D
2009-06-29
VAR2CSA is the main candidate for a vaccine against pregnancy-associated malaria, but vaccine development is complicated by the large size and complex disulfide bonding pattern of the protein. Recent X-ray crystallographic information suggests that domain boundaries of VAR2CSA Duffy binding-like (DBL) domains may be larger than previously predicted and include two additional cysteine residues. This study investigated whether longer constructs would improve VAR2CSA recombinant protein secretion from Pichia pastoris and if domain boundaries were applicable across different VAR2CSA alleles. VAR2CSA sequences were bioinformatically analysed to identify the predicted C11 and C12 cysteine residues at the C-termini of DBL domains and revised N- and C-terminal domain boundaries were predicted in VAR2CSA. Multiple construct boundaries were systematically evaluated for protein secretion in P. pastoris and secreted proteins were tested as immunogens. From a total of 42 different VAR2CSA constructs, 15 proteins (36%) were secreted. Longer construct boundaries, including the predicted C11 and C12 cysteine residues, generally improved expression of poorly or non-secreted domains and permitted expression of all six VAR2CSA DBL domains. However, protein secretion was still highly empiric and affected by subtle differences in domain boundaries and allelic variation between VAR2CSA sequences. Eleven of the secreted proteins were used to immunize rabbits. Antibodies reacted with CSA-binding infected erythrocytes, indicating that P. pastoris recombinant proteins possessed native protein epitopes. These findings strengthen emerging data for a revision of DBL domain boundaries in var-encoded proteins and may facilitate pregnancy malaria vaccine development.
Optimizing expression of the pregnancy malaria vaccine candidate, VAR2CSA in Pichia pastoris
Avril, Marion; Hathaway, Marianne J; Cartwright, Megan M; Gose, Severin O; Narum, David L; Smith, Joseph D
2009-01-01
Background VAR2CSA is the main candidate for a vaccine against pregnancy-associated malaria, but vaccine development is complicated by the large size and complex disulfide bonding pattern of the protein. Recent X-ray crystallographic information suggests that domain boundaries of VAR2CSA Duffy binding-like (DBL) domains may be larger than previously predicted and include two additional cysteine residues. This study investigated whether longer constructs would improve VAR2CSA recombinant protein secretion from Pichia pastoris and if domain boundaries were applicable across different VAR2CSA alleles. Methods VAR2CSA sequences were bioinformatically analysed to identify the predicted C11 and C12 cysteine residues at the C-termini of DBL domains and revised N- and C-terminal domain boundaries were predicted in VAR2CSA. Multiple construct boundaries were systematically evaluated for protein secretion in P. pastoris and secreted proteins were tested as immunogens. Results From a total of 42 different VAR2CSA constructs, 15 proteins (36%) were secreted. Longer construct boundaries, including the predicted C11 and C12 cysteine residues, generally improved expression of poorly or non-secreted domains and permitted expression of all six VAR2CSA DBL domains. However, protein secretion was still highly empiric and affected by subtle differences in domain boundaries and allelic variation between VAR2CSA sequences. Eleven of the secreted proteins were used to immunize rabbits. Antibodies reacted with CSA-binding infected erythrocytes, indicating that P. pastoris recombinant proteins possessed native protein epitopes. Conclusion These findings strengthen emerging data for a revision of DBL domain boundaries in var-encoded proteins and may facilitate pregnancy malaria vaccine development. PMID:19563628
2013-01-01
Background Protein-protein interactions (PPIs) play crucial roles in the execution of various cellular processes and form the basis of biological mechanisms. Although a large amount of PPI data for different species has been generated by high-throughput experimental techniques, the PPI pairs obtained with experimental methods cover only a fraction of the complete PPI networks, and the experimental methods for identifying PPIs are both time-consuming and expensive. Hence, it is urgent and challenging to develop automated computational methods to efficiently and accurately predict PPIs. Results We present here a novel hierarchical PCA-EELM (principal component analysis-ensemble extreme learning machine) model to predict protein-protein interactions using only the information of protein sequences. In the proposed method, 11188 protein pairs retrieved from the DIP database were encoded into feature vectors by using four kinds of protein sequence information. Focusing on dimension reduction, an effective feature extraction method, PCA, was then employed to construct the most discriminative new feature set. Finally, multiple extreme learning machines were trained and then aggregated into a consensus classifier by majority voting. The ensembling of extreme learning machines removes the dependence of results on initial random weights and improves the prediction performance. Conclusions When performed on the PPI data of Saccharomyces cerevisiae, the proposed method achieved 87.00% prediction accuracy with 86.15% sensitivity at the precision of 87.59%. Extensive experiments were performed to compare our method with the state-of-the-art Support Vector Machine (SVM) technique. Experimental results demonstrate that the proposed PCA-EELM outperforms the SVM method under 5-fold cross-validation. In addition, PCA-EELM runs faster than the PCA-SVM based method. Consequently, the proposed approach can be considered a new, promising and powerful tool for predicting PPIs with excellent performance and less time. PMID:23815620
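The pipeline described above (PCA for dimension reduction, several extreme learning machines with independent random hidden weights, then a majority vote) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the tanh activation, the hidden-layer size, the number of ensemble members, and the synthetic two-class data are all chosen here for demonstration, and the names `pca_fit`, `ELM`, and `ensemble_predict` are hypothetical.

```python
import numpy as np


def pca_fit(X, n_components):
    """Return the training mean and the top principal directions (as columns)."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components].T


class ELM:
    """Single-hidden-layer network: random fixed hidden weights, least-squares output."""

    def __init__(self, n_hidden, rng):
        self.n_hidden = n_hidden
        self.rng = rng

    def fit(self, X, y):
        # Hidden weights are drawn once at random and never trained.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = np.tanh(X @ self.W + self.b)
        # Output weights: least-squares fit to targets coded as -1 / +1.
        self.beta = np.linalg.pinv(H) @ (2.0 * y - 1.0)
        return self

    def predict(self, X):
        H = np.tanh(X @ self.W + self.b)
        return (H @ self.beta > 0).astype(int)


def ensemble_predict(models, X):
    """Majority vote over the 0/1 predictions of all ensemble members."""
    votes = np.stack([m.predict(X) for m in models])
    return (2 * votes.sum(axis=0) > len(models)).astype(int)


# Synthetic stand-in for encoded protein-pair feature vectors (assumption).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, (100, 10)), rng.normal(2.0, 1.0, (100, 10))])
y = np.repeat([0, 1], 100)
idx = rng.permutation(200)
train, test = idx[:150], idx[150:]

mean, comps = pca_fit(X[train], 3)
Z_train = (X[train] - mean) @ comps
Z_test = (X[test] - mean) @ comps

# An odd number of members avoids tied votes.
models = [ELM(20, rng).fit(Z_train, y[train]) for _ in range(5)]
accuracy = (ensemble_predict(models, Z_test) == y[test]).mean()
print(accuracy)
```

Because each member draws its own hidden weights, a single unlucky initialization affects only one vote, which is the effect the abstract attributes to ensembling.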
Sunderasan, E; Bahari, A; Arif, S A M; Zainal, Z; Hamilton, R G; Yeang, H Y
2005-11-01
Hev b 4 is an allergenic natural rubber latex (NRL) protein complex that is reactive in skin prick tests and in vitro immunoassays. On SDS-polyacrylamide gel electrophoresis (SDS-PAGE), Hev b 4 is discerned predominantly at 53-55 kDa together with a 57 kDa minor component previously identified as a cyanogenic glucosidase. Of the 13 NRL allergens recognized by the International Union of Immunological Societies, the 53-55 kDa Hev b 4 major protein is the only candidate that lacks complete cDNA and protein sequence information. We sought to clone the transcript encoding the Hev b 4 major protein, and characterize the native protein and its recombinant form in relation to IgE binding. The 5'/3' rapid amplification of cDNA ends method was employed to obtain the complete cDNA of the Hev b 4 major protein. A recombinant form of the protein was over-expressed in Escherichia coli. The native Hev b 4 major protein was deglycosylated by trifluoromethane sulphonic acid. Western immunoblots of the native, deglycosylated and recombinant proteins were performed using both polyclonal antibodies and sera from latex-allergic patients. The cDNA encoding the Hev b 4 major protein was cloned. Its open reading frame matched lecithinases in the conserved domain database and contained 10 predicted glycosylation sites. Detection of glycans on the Hev b 4 lecithinase homologue confirmed it to be a glycoprotein. The deglycosylated lecithinase homologue was discerned at 40 kDa on SDS-PAGE, this being comparable to the 38.53 kDa mass predicted by its cDNA. Deglycosylation of the lecithinase homologue resulted in the loss of IgE recognition, although reactivity to polyclonal rabbit anti-Hev b 4 was retained. IgE from latex-allergic patients also failed to recognize the non-glycosylated E. coli recombinant lecithinase homologue. 
The IgE epitopes of the Hev b 4 lecithinase homologue reside mainly in its carbohydrate moiety, which also accounts for the discrepancy between the observed molecular weight of the protein and the value calculated from its cDNA.
Galloway-Peña, Jessica R.; Liang, Xiaowen; Singh, Kavindra V.; Yadav, Puja; Chang, Chungyu; La Rosa, Sabina Leanti; Shelburne, Samuel; Ton-That, Hung; Höök, Magnus
2014-01-01
The WxL domain recently has been identified as a novel cell wall binding domain found in numerous predicted proteins within multiple Gram-positive bacterial species. However, little is known about the function of proteins containing this novel domain. Here, we identify and characterize 6 Enterococcus faecium proteins containing the WxL domain which, by reverse transcription-PCR (RT-PCR) and genomic analyses, are located in three similarly organized operons, deemed WxL loci A, B, and C. Western blotting, electron microscopy, and enzyme-linked immunosorbent assays (ELISAs) determined that genes of WxL loci A and C encode antigenic, cell surface proteins exposed at higher levels in clinical isolates than in commensal isolates. Secondary structural analyses of locus A recombinant WxL domain-containing proteins found they are rich in β-sheet structure and disordered segments. Using Biacore analyses, we discovered that recombinant WxL proteins from locus A bind human extracellular matrix proteins, specifically type I collagen and fibronectin. Proteins encoded by locus A also were found to bind to each other, suggesting a novel cell surface complex. Furthermore, bile salt survival assays and animal models using a mutant from which all three WxL loci were deleted revealed the involvement of WxL operons in bile salt stress and endocarditis pathogenesis. In summary, these studies extend our understanding of proteins containing the WxL domain and their potential impact on colonization and virulence in E. faecium and possibly other Gram-positive bacterial species. PMID:25512313
Encoding of contextual fear memory requires de novo proteins in the prelimbic cortex
Rizzo, Valerio; Touzani, Khalid; Raveendra, Bindu L.; Swarnkar, Supriya; Lora, Joan; Kadakkuzha, Beena M.; Liu, Xin-An; Zhang, Chao; Betel, Doron; Stackman, Robert W.; Puthanveettil, Sathyanarayanan V.
2016-01-01
Background Despite our understanding of the significance of the prefrontal cortex in the consolidation of long-term memories (LTM), its role in the encoding of LTM remains elusive. Here we investigated the role of new protein synthesis in the mouse medial prefrontal cortex (mPFC) in encoding contextual fear memory. Methods Because a change in the association of mRNAs to polyribosomes is an indicator of new protein synthesis, we assessed the changes in polyribosome-associated mRNAs in the mPFC following contextual fear conditioning (CFC) in the mouse. Differential gene expression in mPFC was identified by polyribosome profiling (n = 18). The role of new protein synthesis in mPFC was determined by focal inhibition of protein synthesis (n = 131) and by intra-prelimbic cortex manipulation (n = 56) of Homer 3, a candidate identified from polyribosome profiling. Results We identified several mRNAs that are differentially and temporally recruited to polyribosomes in the mPFC following CFC. Inhibition of protein synthesis in the prelimbic (PL), but not in the anterior cingulate cortex (ACC) region of the mPFC immediately after CFC disrupted encoding of contextual fear memory. Intriguingly, inhibition of new protein synthesis in the PL 6 hours after CFC did not impair encoding. Furthermore, expression of Homer 3, an mRNA enriched in polyribosomes following CFC, in the PL constrained encoding of contextual fear memory. Conclusions Our studies identify several molecular substrates of new protein synthesis in the mPFC and establish that encoding of contextual fear memories requires new protein synthesis in the PL subregion of the mPFC. PMID:28503670
Hellberg, M E; Moy, G W; Vacquier, V D
2000-03-01
Male-specific proteins have increasingly been reported as targets of positive selection and are of special interest because of the role they may play in the evolution of reproductive isolation. We report the rapid interspecific divergence of cDNA encoding a major acrosomal protein of unknown function (TMAP) of sperm from five species of teguline gastropods. A mitochondrial DNA clock (calibrated by congeneric species divided by the Isthmus of Panama) estimates that these five species diverged 2-10 MYA. Inferred amino acid sequences reveal a propeptide that has diverged rapidly between species. The mature protein has diverged faster still due to high nonsynonymous substitution rates (> 25 nonsynonymous substitutions per site per 10^9 years). cDNA encoding the mature protein (89-100 residues) shows evidence of positive selection (Dn/Ds > 1) for 4 of 10 pairwise species comparisons. cDNA and predicted secondary-structure comparisons suggest that TMAP is neither orthologous nor paralogous to abalone lysin, and thus marks a second, phylogenetically independent, protein subject to strong positive selection in free-spawning marine gastropods. In addition, an internal repeat in one species (Tegula aureotincta) produces a duplicated cleavage site which results in two alternatively processed mature proteins differing by nine amino acid residues. Such alternative processing may provide a mechanism for introducing novel amino acid sequence variation at the amino-termini of proteins. Highly divergent TMAP N-termini from two other tegulines (Tegula regina and Norrisia norrisii) may have originated by such a mechanism.
The organisation and interviral homologies of genes at the 3' end of tobacco rattle virus RNA1
Boccara, Martine; Hamilton, William D. O.; Baulcombe, David C.
1986-01-01
The RNA1 of tobacco rattle virus (TRV) has been cloned as cDNA and the nucleotide sequence determined of 2 kb from the 3'-terminal region. The sequence contains three long open reading frames. One of these starts 5' of the cDNA and probably corresponds to the carboxy-terminal sequence of a 170-K protein encoded on RNA1. The deduced protein sequence from this reading frame shows homology with the putative replicases of tobacco mosaic virus (TMV) and tricornaviruses. The location of the second open reading frame, which encodes a 29-K polypeptide, was shown by Northern blot analysis to coincide with a 1.6-kb subgenomic RNA. The validity of this reading frame was confirmed by showing that the cDNA extending over this region could be transcribed and translated in vitro to produce a polypeptide of the predicted size which co-migrates in electrophoresis with a translation product of authentic viral RNA. The sequence of this 29-K polypeptide showed homology with two regions in the 30-K protein of TMV. This homology includes positions in the TMV 30-K protein where mutations have been identified which affect the transport of virus between cells. The third open reading frame encodes a potential 16-K protein and was shown by Northern blot hybridisation to be contained within the region of a 0.7-kb subgenomic RNA which is found in cellular RNA of infected cells but not virus particles. The many similarities between TRV and TMV in viral morphology, gene organisation and sequence suggest that these two viral groups may share a common viral ancestor. PMID:16453668
2008-10-13
Furthermore, the encoded protein of this gene is only 30 kDa. A potential GTG start codon at position 625 also encodes a protein that is too small...horizontal bar and putative alternate translation initiation sites (ATG, GTG , and TTG) are indicated. The sizes and locations of the proteins encoded... gray line with rounded rectangles showing sequence features and motifs, including the Ala- and Pro-rich N-terminal region and the C-terminal Cys and
Le, Tra M; Wong, Hui H; Tay, Felicia P L; Fang, Shouguo; Keng, Choong-Tat; Tan, Yee J; Liu, Ding X
2007-08-01
The most striking difference between the subgenomic mRNA8 of severe acute respiratory syndrome coronavirus isolated from human and some animal species is the deletion of 29 nucleotides, resulting in splitting of a single ORF (ORF8) into two ORFs (ORF8a and ORF8b). ORF8a and ORF8b are predicted to encode two small proteins, 8a and 8b, and ORF8 a single protein, 8ab (a fusion form of 8a and 8b). To understand the functions of these proteins, we cloned cDNA fragments covering these ORFs into expression plasmids, and expressed the constructs in both in vitro and in vivo systems. Expression of a construct containing ORF8a and ORF8b generated only a single protein, 8a; no 8b protein expression was obtained. Expression of a construct containing ORF8 generated the 8ab fusion protein. Site-directed mutagenesis and enzymatic treatment revealed that protein 8ab is modified by N-linked glycosylation on the N81 residue and by ubiquitination. In the absence of the 8a region, protein 8b undergoes rapid degradation by proteasomes, and addition of proteasome inhibitors inhibits the degradation of protein 8b as well as the protein 8b-induced rapid degradation of the severe acute respiratory syndrome coronavirus E protein. Glycosylation could also stabilize protein 8ab. More interestingly, the two proteins could bind to monoubiquitin and polyubiquitin, suggesting the potential involvement of these proteins in the pathogenesis of severe acute respiratory syndrome coronavirus.
Iyer, Lakshminarayan M; Tahiliani, Mamta; Rao, Anjana; Aravind, L
2009-06-01
Modified bases in nucleic acids present a layer of information that directs biological function over and beyond the coding capacity of the conventional bases. While a large number of modified bases have been identified, many of the enzymes generating them still remain to be discovered. Recently, members of the 2-oxoglutarate- and iron(II)-dependent dioxygenase super-family, which modify diverse substrates from small molecules to biopolymers, were predicted and subsequently confirmed to catalyze oxidative modification of bases in nucleic acids. Of these, two distinct families, namely the AlkB and the kinetoplastid base J binding proteins (JBP) catalyze in situ hydroxylation of bases in nucleic acids. Using sensitive computational analysis of sequences, structures and contextual information from genomic structure and protein domain architectures, we report five distinct families of 2-oxoglutarate- and iron(II)-dependent dioxygenase that we predict to be involved in nucleic acid modifications. Among the DNA-modifying families, we show that the dioxygenase domains of the kinetoplastid base J-binding proteins belong to a larger family that includes the Tet proteins, prototyped by the human oncogene Tet1, and proteins from basidiomycete fungi, chlorophyte algae, heterolobosean amoeboflagellates and bacteriophages. We present evidence that some of these proteins are likely to be involved in oxidative modification of the 5-methyl group of cytosine leading to the formation of 5-hydroxymethylcytosine. The Tet/JBP homologs from basidiomycete fungi such as Laccaria and Coprinopsis show large lineage-specific expansions and a tight linkage with genes encoding a novel and distinct family of predicted transposases, and a member of the Maelstrom-like HMG family. We propose that these fungal members are part of a mobile transposon. 
To the best of our knowledge, this is the first report of a eukaryotic transposable element that encodes its own DNA-modification enzyme with a potential regulatory role. Through a wider analysis of other poorly characterized DNA-modifying enzymes we also show that the phage Mu Mom-like proteins, which catalyze the N6-carbamoylmethylation of adenines, are also linked to diverse families of bacterial transposases, suggesting that DNA modification by transposable elements might have a more general presence than previously appreciated. Among the other families of 2-oxoglutarate- and iron(II)-dependent dioxygenases identified in this study, one, which is found in algae, is predicted to consist mainly of RNA-modifying enzymes and shows a striking diversity in protein domain architectures, suggesting the presence of RNA modifications with possibly unique adaptive roles. The results presented here are likely to provide the means for future investigation of unexpected epigenetic modifications, such as hydroxymethyl cytosine, that could profoundly impact our understanding of gene regulation and processes such as DNA demethylation.
A newly identified protein of Leptospira interrogans mediates binding to laminin.
Longhi, Mariana T; Oliveira, Tatiane R; Romero, Eliete C; Gonçales, Amane P; de Morais, Zenaide M; Vasconcellos, Silvio A; Nascimento, Ana L T O
2009-10-01
Pathogenic Leptospira is the aetiological agent of leptospirosis, a life-threatening disease that affects populations worldwide. The search for novel antigens that could be relevant in host-pathogen interactions is being pursued. These antigens have the potential to elicit several activities, including adhesion. This study focused on a hypothetical predicted lipoprotein of Leptospira, encoded by the gene LIC12895, thought to mediate attachment to extracellular matrix (ECM) components. The gene was cloned and expressed in Escherichia coli BL21 Star (DE3)pLys by using the expression vector pAE. The recombinant protein tagged with N-terminal hexahistidine was purified by metal-charged chromatography and characterized by circular dichroism spectroscopy. The capacity of the protein to mediate attachment to ECM components was evaluated by binding assays. The leptospiral protein encoded by LIC12895, named Lsa27 (leptospiral surface adhesin, 27 kDa), bound strongly to laminin in a dose-dependent and saturable fashion. Moreover, Lsa27 was recognized by antibodies from serum samples of confirmed leptospirosis specimens in both the initial and the convalescent phases of the disease. Lsa27 is most likely a surface protein of Leptospira as revealed in liquid-phase immunofluorescence assays with living organisms. Taken together, these data indicate that this newly identified membrane protein is expressed during natural infection and may play a role in mediating adhesion of L. interrogans to its host.
Shen, W C; Selvakumar, D; Stanford, D R; Hopper, A K
1993-09-15
Mutations of the Saccharomyces cerevisiae LOS1 gene cause the accumulation of end-matured intron-containing pre-tRNAs at elevated temperatures. In an effort to decipher the role of the LOS1 protein in pre-tRNA splicing, we have analyzed the LOS1 gene and its protein product. The LOS1 gene is located on the left arm of chromosome XI and the order of genes in this area of the chromosome is .... URA1 ... SAC1 TRP3 UBA1 STE6 LOS1 .... FAS1..... The LOS1 open reading frame encodes a putative protein of 1100 amino acids that shows no significant homology to other genes. The LOS1 open reading frame was tagged with the influenza virus hemagglutinin epitope recognized by the 12CA5 antibody. The 12CA5 antibody recognizes an epitope-tagged protein of the size predicted by the LOS1 open reading frame. Using this antibody for indirect immunofluorescence and cell fractionation studies, we show that the LOS1 protein is located in nuclei. Los1p cannot be extracted from nuclei by treatment with nucleases, salts, or Triton X-100. This insolubility suggests that Los1p is a component of the nucleoskeleton. We propose that LOS1 mutations may affect pre-tRNA processing via alteration of the nuclear matrix.
PIOX, a new pathogen-induced oxygenase with homology to animal cyclooxygenase.
Sanz, A; Moreno, J I; Castresana, C
1998-09-01
Changes in gene expression induced in tobacco leaves by the harpin HrpN protein elicitor were examined, and a new cDNA, piox (for pathogen-induced oxygenase), with homology to genes encoding cyclooxygenase or prostaglandin endoperoxide synthase (PGHS), was identified. In addition to the amino acid identity determined, the protein encoded by piox is predicted to have a structural core similar to that of ovine PGHS-1. Moreover, studies of protein functionality demonstrate that the PIOX recombinant protein possesses at least one of the two enzymatic activities of PGHSs, that of catalyzing the oxygenation of polyunsaturated fatty acids. piox transcripts accumulated after protein elicitor treatment or inoculation with bacteria. Expression of piox was induced in tissues responding to inoculation with both incompatible and compatible bacteria, but RNA and protein accumulation differed for both types of interactions. We show that expression of piox is rapidly induced in response to various cellular signals mediating plant responses to pathogen infection and that activation of piox expression is most likely related to the oxidative burst that takes place during the cell death processes examined. Cyclooxygenase catalyzes the first committed step in the formation of prostaglandins and thromboxanes, which are lipid-derived signal molecules that mediate many cellular processes, including the immune response in vertebrates. The finding of tobacco PIOX suggests that more similarities than hitherto expected will be found between the lipid-based responses for plant and animal systems.
Wenger, Yvan; Galliot, Brigitte
2013-03-25
Evolutionary studies benefit from deep sequencing technologies that generate genomic and transcriptomic sequences from a variety of organisms. Genome sequencing and RNAseq have complementary strengths. In this study, we present the assembly of the most complete Hydra transcriptome to date along with a comparative analysis of the specific features of RNAseq and genome-predicted transcriptomes currently available in the freshwater hydrozoan Hydra vulgaris. To produce an accurate and extensive Hydra transcriptome, we combined Illumina and 454 Titanium reads, giving the primacy to Illumina over 454 reads to correct homopolymer errors. This strategy yielded an RNAseq transcriptome that contains 48'909 unique sequences including splice variants, representing approximately 24'450 distinct genes. Comparative analysis to the available genome-predicted transcriptomes identified 10'597 novel Hydra transcripts that encode 529 evolutionarily-conserved proteins. The annotation of 170 human orthologs points to critical functions in protein biosynthesis, FGF and TOR signaling, vesicle transport, immunity, cell cycle regulation, cell death, mitochondrial metabolism, transcription and chromatin regulation. However, a majority of these novel transcripts encodes short ORFs, at least 767 of them corresponding to pseudogenes. This RNAseq transcriptome also lacks 11'270 predicted transcripts that correspond either to silent genes or to genes expressed below the detection level of this study. We established a simple and powerful strategy to combine Illumina and 454 reads and we produced, with genome assistance, an extensive and accurate Hydra transcriptome. The comparative analysis of the RNAseq transcriptome with genome-predicted transcriptomes led to the identification of large populations of novel as well as missing transcripts that might reflect Hydra-specific evolutionary events. PMID:23530871
Plant, Ewan P; Rakauskaite, Rasa; Taylor, Deborah R; Dinman, Jonathan D
2010-05-01
In retroviruses and the double-stranded RNA totiviruses, the efficiency of programmed -1 ribosomal frameshifting is critical for ensuring the proper ratios of upstream-encoded capsid proteins to downstream-encoded replicase enzymes. The genomic organizations of many other frameshifting viruses, including the coronaviruses, are very different, in that their upstream open reading frames encode nonstructural proteins, the frameshift-dependent downstream open reading frames encode enzymes involved in transcription and replication, and their structural proteins are encoded by subgenomic mRNAs. The biological significance of frameshifting efficiency and how the relative ratios of proteins encoded by the upstream and downstream open reading frames affect virus propagation has not been explored before. Here, three different strategies were employed to test the hypothesis that the -1 PRF signals of coronaviruses have evolved to produce the correct ratios of upstream- to downstream-encoded proteins. Specifically, infectious clones of the severe acute respiratory syndrome (SARS)-associated coronavirus harboring mutations that lower frameshift efficiency decreased infectivity by >4 orders of magnitude. Second, a series of frameshift-promoting mRNA pseudoknot mutants was employed to demonstrate that the frameshift signals of the SARS-associated coronavirus and mouse hepatitis virus have evolved to promote optimal frameshift efficiencies. Finally, we show that a previously described frameshift attenuator element does not actually affect frameshifting per se but rather serves to limit the fraction of ribosomes available for frameshifting. The findings of these analyses all support a "golden mean" model in which viruses use both programmed ribosomal frameshifting and translational attenuation to control the relative ratios of their encoded proteins.
Ashworth, Justin; Plaisier, Christopher L.; Lo, Fang Yin; Reiss, David J.; Baliga, Nitin S.
2014-01-01
Widespread microbial genome sequencing presents an opportunity to understand the gene regulatory networks of non-model organisms. This requires knowledge of the binding sites for transcription factors whose DNA-binding properties are unknown or difficult to infer. We adapted a protein structure-based method to predict the specificities and putative regulons of homologous transcription factors across diverse species. As a proof-of-concept we predicted the specificities and transcriptional target genes of divergent archaeal feast/famine regulatory proteins, several of which are encoded in the genome of Halobacterium salinarum. This was validated by comparison to experimentally determined specificities for transcription factors in distantly related extremophiles, chromatin immunoprecipitation experiments, and cis-regulatory sequence conservation across eighteen related species of halobacteria. Through this analysis we were able to infer that Halobacterium salinarum employs a divergent local trans-regulatory strategy to regulate genes (carA and carB) involved in arginine and pyrimidine metabolism, whereas Escherichia coli employs an operon. The prediction of gene regulatory binding sites using structure-based methods is useful for the inference of gene regulatory relationships in new species that are otherwise difficult to infer. PMID:25255272
Identification of a Novel Mucin Gene HCG22 Associated With Steroid-Induced Ocular Hypertension
Jeong, Shinwu; Patel, Nitin; Edlund, Christopher K.; Hartiala, Jaana; Hazelett, Dennis J.; Itakura, Tatsuo; Wu, Pei-Chang; Avery, Robert L.; Davis, Janet L.; Flynn, Harry W.; Lalwani, Geeta; Puliafito, Carmen A.; Wafapoor, Hussein; Hijikata, Minako; Keicho, Naoto; Gao, Xiaoyi; Argüeso, Pablo; Allayee, Hooman; Coetzee, Gerhard A.; Pletcher, Mathew T.; Conti, David V.; Schwartz, Stephen G.; Eaton, Alexander M.; Fini, M. Elizabeth
2015-01-01
Purpose. The pathophysiology of ocular hypertension (OH) leading to primary open-angle glaucoma shares many features with a secondary form of OH caused by treatment with glucocorticoids, but also exhibits distinct differences. In this study, a pharmacogenomics approach was taken to discover candidate genes for this disorder. Methods. A genome-wide association study was performed, followed by an independent candidate gene study, using a cohort enrolled from patients treated with off-label intravitreal triamcinolone, and handling change in IOP as a quantitative trait. Results. An intergenic quantitative trait locus (QTL) was identified at chromosome 6p21.33 near the 5′ end of HCG22 that attained the accepted statistical threshold for genome-level significance. The HCG22 transcript, encoding a novel mucin protein, was expressed in trabecular meshwork cells, and expression was stimulated by IL-1, and inhibited by triamcinolone acetate and TGF-β. Bioinformatic analysis defined the QTL as an approximately 4 kilobase (kb) linkage disequilibrium block containing 10 common single nucleotide polymorphisms (SNPs). Four of these SNPs were identified in the National Center for Biotechnology Information (NCBI) GTEx eQTL browser as modifiers of HCG22 expression. Most are predicted to disrupt or improve motifs for transcription factor binding, the most relevant being disruption of the glucocorticoid receptor binding motif. A second QTL was identified within the predicted signal peptide of the HCG22 encoded protein that could affect its secretion. Translation, O-glycosylation, and secretion of the predicted HCG22 protein was verified in cultured trabecular meshwork cells. Conclusions. Identification of two independent QTLs that could affect expression of the HCG22 mucin gene product via two different mechanisms (transcription or secretion) is highly suggestive of a role in steroid-induced OH. PMID:25813999
Genome sequence of the model medicinal mushroom Ganoderma lucidum
Chen, Shilin; Xu, Jiang; Liu, Chang; Zhu, Yingjie; Nelson, David R.; Zhou, Shiguo; Li, Chunfang; Wang, Lizhi; Guo, Xu; Sun, Yongzhen; Luo, Hongmei; Li, Ying; Song, Jingyuan; Henrissat, Bernard; Levasseur, Anthony; Qian, Jun; Li, Jianqin; Luo, Xiang; Shi, Linchun; He, Liu; Xiang, Li; Xu, Xiaolan; Niu, Yunyun; Li, Qiushi; Han, Mira V.; Yan, Haixia; Zhang, Jin; Chen, Haimei; Lv, Aiping; Wang, Zhen; Liu, Mingzhu; Schwartz, David C.; Sun, Chao
2012-01-01
Ganoderma lucidum is a widely used medicinal macrofungus in traditional Chinese medicine that creates a diverse set of bioactive compounds. Here we report its 43.3-Mb genome, encoding 16,113 predicted genes, obtained using next-generation sequencing and optical mapping approaches. The sequence analysis reveals an impressive array of genes encoding cytochrome P450s (CYPs), transporters and regulatory proteins that cooperate in secondary metabolism. The genome also encodes one of the richest sets of wood degradation enzymes among all of the sequenced basidiomycetes. In all, 24 physical CYP gene clusters are identified. Moreover, 78 CYP genes are coexpressed with lanosterol synthase, and 16 of these show high similarity to fungal CYPs that specifically hydroxylate testosterone, suggesting their possible roles in triterpenoid biosynthesis. The elucidation of the G. lucidum genome makes this organism a potential model system for the study of secondary metabolic pathways and their regulation in medicinal fungi. PMID:22735441
Voelker, R; Mendel-Hartvig, J; Barkan, A
1997-02-01
A nuclear mutant of maize, tha1, which exhibited defects in the translocation of proteins across the thylakoid membrane, was described previously. A transposon insertion at the tha1 locus facilitated the cloning of portions of the tha1 gene. Strong sequence similarity with secA genes from bacteria, pea and spinach indicates that tha1 encodes a SecA homologue (cp-SecA). The tha1-ref allele is either null or nearly so, in that tha1 mRNA is undetectable in mutant leaves and cp-SecA accumulation is reduced ≥40-fold. These results, in conjunction with the mutant phenotype described previously, demonstrate that cp-SecA functions in vivo to facilitate the translocation of OEC33, PSI-F and plastocyanin but does not function in the translocation of OEC23 and OEC16. Our results confirm predictions for cp-SecA function made from the results of in vitro experiments and establish several new functions for cp-SecA, including roles in the targeting of a chloroplast-encoded protein, cytochrome f, and in protein targeting in the etioplast, a nonphotosynthetic plastid type. Our finding that the accumulation of properly targeted plastocyanin and cytochrome f in tha1-ref thylakoid membranes is reduced only a few-fold despite the near or complete absence of cp-SecA suggests that cp-SecA facilitates but is not essential in vivo for their translocation across the membrane.
Schilhabel, Anke; Studenik, Sandra; Vödisch, Martin; Kreher, Sandra; Schlott, Bernhard; Pierik, Antonio Y.; Diekert, Gabriele
2009-01-01
Anaerobic O-demethylases are inducible multicomponent enzymes which mediate the cleavage of the ether bond of phenyl methyl ethers and the transfer of the methyl group to tetrahydrofolate. The genes of all components (methyltransferases I and II, CP, and activating enzyme [AE]) of the vanillate- and veratrol-O-demethylases of Acetobacterium dehalogenans were sequenced and analyzed. In A. dehalogenans, the genes for methyltransferase I, CP, and methyltransferase II of both O-demethylases are clustered. The single-copy gene for AE is not included in the O-demethylase gene clusters. It was found that AE grouped with COG3894 proteins, the function of which was unknown so far. Genes encoding COG3894 proteins with 20 to 41% amino acid sequence identity with AE are present in numerous genomes of anaerobic microorganisms. Inspection of the domain structure and genetic context of these orthologs predicts that these are also reductive activases for corrinoid enzymes (RACEs), such as carbon monoxide dehydrogenase/acetyl coenzyme A synthases or anaerobic methyltransferases. The genes encoding the O-demethylase components were heterologously expressed with a C-terminal Strep-tag in Escherichia coli, and the recombinant proteins methyltransferase I, CP, and AE were characterized. Gel shift experiments showed that the AE comigrated with the CP. The formation of other protein complexes with the O-demethylase components was not observed under the conditions used. The results point to a strong interaction of the AE with the CP. This is the first report on the functional heterologous expression of acetogenic phenyl methyl ether-cleaving O-demethylases. PMID:19011025
Inhibition of stress-inducible HSP70 impairs mitochondrial proteostasis and function.
Leu, Julia I-Ju; Barnoud, Thibaut; Zhang, Gao; Tian, Tian; Wei, Zhi; Herlyn, Meenhard; Murphy, Maureen E; George, Donna L
2017-07-11
Protein quality control is an important component of survival for all cells. The use of proteasome inhibitors for cancer therapy derives from the fact that tumor cells generally exhibit greater levels of proteotoxic stress than do normal cells, and thus cancer cells tend to be more sensitive to proteasome inhibition. However, this approach has been limited in some cases by toxicity to normal cells. Recently, the concept of inhibiting proteostasis in organelles for cancer therapy has been advanced, in part because it is predicted to have reduced toxicity for normal cells. Here we demonstrate that a fraction of the major stress-induced chaperone HSP70 (also called HSPA1A or HSP72, but hereafter HSP70) is abundantly present in mitochondria of tumor cells, but is expressed at quite low or undetectable levels in mitochondria of most normal tissues and non-tumor cell lines. We show that treatment of tumor cells with HSP70 inhibitors causes a marked change in mitochondrial protein quality control, loss of mitochondrial membrane potential, reduced oxygen consumption rate, and loss of ATP production. We identify several nuclear-encoded mitochondrial proteins, including polyadenylate binding protein-1 (PABPC1), which exhibit decreased abundance in mitochondria following treatment with HSP70 inhibitors. We also show that targeting HSP70 function leads to reduced levels of several mitochondrial-encoded RNA species that encode components of the electron transport chain. Our data indicate that small molecule inhibitors of HSP70 represent a new class of organelle proteostasis inhibitors that impair mitochondrial function in cancer cells, and therefore constitute novel therapeutics.
Protein synthesis in sperm: dialog between mitochondria and cytoplasm.
Gur, Yael; Breitbart, Haim
2008-01-30
Ejaculated sperm are capable of using mRNA transcripts for protein translation during the final maturation steps before fertilization. In a capacitation-dependent process, nuclear-encoded mRNAs are translated by mitochondrial-type ribosomes while the cytoplasmic translation machinery is not involved. Our findings suggest that new proteins are synthesized to replace degraded proteins while swimming and waiting in the female reproductive tract before fertilization, or produced due to the specific needs of the capacitating spermatozoa. In addition, a growing number of articles have reported evidence for the correlation of nuclear-encoded mRNA and protein synthesis in somatic mitochondria. It is known that all of the proteins necessary for the replication, transcription and translation of the genes encoded in mtDNA are now encoded in the nuclear genome. This genetic investment is far out of proportion to the number of proteins involved, as there have been multiple movements and duplications of genes. However, the evolutionary retention (or secondary uptake) of the mitochondrial machinery for translation of nuclear-encoded mRNAs may shed light on this paradox.
Storbeck, Sonja; Rolfes, Sarah; Raux-Deery, Evelyne; Warren, Martin J; Jahn, Dieter; Layer, Gunhild
2010-12-13
Heme is an essential prosthetic group for many proteins involved in fundamental biological processes in all three domains of life. In Eukaryota and Bacteria heme is formed via a conserved and well-studied biosynthetic pathway. Surprisingly, in Archaea heme biosynthesis proceeds via an alternative route which is poorly understood. In order to formulate a working hypothesis for this novel pathway, we searched 59 completely sequenced archaeal genomes for the presence of gene clusters consisting of established heme biosynthetic genes and colocalized conserved candidate genes. Within the majority of archaeal genomes it was possible to identify such heme biosynthesis gene clusters. From this analysis we have been able to identify several novel heme biosynthesis genes that are restricted to archaea. Intriguingly, several of the encoded proteins display similarity to enzymes involved in heme d1 biosynthesis. To initiate an experimental verification of our proposals two Methanosarcina barkeri proteins predicted to catalyze the initial steps of archaeal heme biosynthesis were recombinantly produced, purified, and their predicted enzymatic functions verified. PMID:21197080
Analysis of Ribosome Stalling and Translation Elongation Dynamics by Deep Learning.
Zhang, Sai; Hu, Hailin; Zhou, Jingtian; He, Xuan; Jiang, Tao; Zeng, Jianyang
2017-09-27
Ribosome stalling is manifested by the local accumulation of ribosomes at specific codon positions of mRNAs. Here, we present ROSE, a deep learning framework to analyze high-throughput ribosome profiling data and estimate the probability of a ribosome stalling event occurring at each genomic location. Extensive validation tests on independent data demonstrated that ROSE possessed higher prediction accuracy than conventional prediction models, with an increase in the area under the receiver operating characteristic curve by up to 18.4%. In addition, genome-wide statistical analyses showed that ROSE predictions can be well correlated with diverse putative regulatory factors of ribosome stalling. Moreover, the genome-wide ribosome stalling landscapes of both human and yeast computed by ROSE recovered the functional interplays between ribosome stalling and cotranslational events in protein biogenesis, including protein targeting by the signal recognition particles and protein secondary structure formation. Overall, our study provides a novel method to complement the ribosome profiling techniques and further decipher the complex regulatory mechanisms underlying translation elongation dynamics encoded in the mRNA sequence. Copyright © 2017 Elsevier Inc. All rights reserved.
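For readers unfamiliar with the metric quoted above, the area under the receiver operating characteristic curve (AUC) equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The following minimal Python sketch computes it that way. The labels and scores are made-up numbers, not ROSE outputs, and the function name `roc_auc` is a hypothetical helper, not part of any published ROSE code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 (1 = stalling site, 0 = non-stalling site)
    scores: iterable of predicted probabilities, same length as labels
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))

# Toy data (assumption, for illustration only).
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_auc)
```

This pairwise form is quadratic in the number of examples, which is fine for a small illustration; rank-based formulas give the same value more efficiently on genome-scale data.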
Human AZU-1 gene, variants thereof and expressed gene products
Chen, Huei-Mei; Bissell, Mina
2004-06-22
A human AZU-1 gene, mutants, variants and fragments thereof. Protein products encoded by the AZU-1 gene and homologs encoded by the variants of the AZU-1 gene acting as tumor suppressors or markers of malignancy progression and tumorigenicity reversion. Identification, isolation and characterization of AZU-1 and AZU-2 genes localized to a tumor suppressive locus at chromosome 10q26, highly expressed in nonmalignant and premalignant cells derived from a human breast tumor progression model. Recombinant full-length protein sequences encoded by the AZU-1 gene and nucleotide sequences of the AZU-1 and AZU-2 genes and variants and fragments thereof. Monoclonal or polyclonal antibodies specific to AZU-1- or AZU-2-encoded proteins and to homologs of AZU-1- or AZU-2-encoded proteins.
USDA-ARS's Scientific Manuscript database
Bean pod mottle virus (BPMV) is a bipartite, positive sense (+) RNA plant virus in the Secoviridae family. Its RNA1 encodes proteins required for genome replication, whereas RNA2 primarily encodes proteins needed for virion assembly and cell-to-cell movement. However, the function of a 58 kilo-dalto...
USDA-ARS's Scientific Manuscript database
The members of Capillovirus genus encode two overlapping open reading frames (ORFs): ORF1 encodes a large polyprotein containing the domains of replication-associated proteins plus a coat protein (CP), and ORF2 encodes a movement protein, located within ORF1 in a different reading frame. Organizatio...
Lieutaud, Philippe; Uversky, Alexey V.; Uversky, Vladimir N.; Longhi, Sonia
2016-01-01
In the last 2 decades it has become increasingly evident that a large number of proteins are either fully or partially disordered. Intrinsically disordered proteins lack a stable 3D structure, are ubiquitous and fulfill essential biological functions. Their conformational heterogeneity is encoded in their amino acid sequences, thereby allowing intrinsically disordered proteins or regions to be recognized based on properties of these sequences. The identification of disordered regions facilitates the functional annotation of proteins and is instrumental for delineating boundaries of protein domains amenable to structural determination with X-ray crystallization. This article discusses a comprehensive selection of databases and methods currently employed to disseminate experimental and putative annotations of disorder, predict disorder and identify regions involved in induced folding. It also provides a set of detailed instructions that should be followed to perform computational analysis of disorder. PMID:28232901
NASA Astrophysics Data System (ADS)
Drillien, Robert; Spehner, Daniele; Kirn, Andre; Giraudon, Pascale; Buckland, Robin; Wild, Fabian; Lecocq, Jean-Pierre
1988-02-01
Vaccinia virus recombinants encoding the hemagglutinin or fusion protein of measles virus have been constructed. Infection of cell cultures with the recombinants led to the synthesis of authentic measles proteins as judged by their electrophoretic mobility, recognition by antibodies, glycosylation, proteolytic cleavage, and presentation on the cell surface. Mice vaccinated with a single dose of the recombinant encoding the hemagglutinin protein developed antibodies capable of both inhibiting hemagglutination activity and neutralizing measles virus, whereas animals vaccinated with the recombinant encoding the fusion protein developed measles neutralizing antibodies. Mice vaccinated with either of the recombinants resisted a normally lethal intracerebral inoculation of a cell-associated measles virus subacute sclerosing panencephalitis strain.
Cook, W B; Walker, J C
1992-01-01
A cDNA encoding a nuclear-encoded chloroplast nucleic acid-binding protein (NBP) has been isolated from maize. Identified as an in vitro DNA-binding activity, NBP belongs to a family of nuclear-encoded chloroplast proteins which share a common domain structure and are thought to be involved in posttranscriptional regulation of chloroplast gene expression. NBP contains an N-terminal chloroplast transit peptide, a highly acidic domain and a pair of ribonucleoprotein consensus sequence domains. NBP is expressed in a light-dependent, organ-specific manner which is consistent with its involvement in chloroplast biogenesis. The relationship of NBP to the other members of this protein family and their possible regulatory functions are discussed. PMID:1346929
Predicting residue-wise contact orders in proteins by support vector regression.
Song, Jiangning; Burrage, Kevin
2006-10-03
The residue-wise contact order (RWCO) describes the sequence separations between a residue of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure that represents the extent of long-range contacts and is considered a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable information for reconstructing the protein three-dimensional structure from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and give deep insights into protein sequence-structure relationships. We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance, including local sequence in the form of PSI-BLAST profiles, local sequence plus amino acid composition, local sequence plus molecular weight, local sequence plus secondary structure predicted by PSIPRED, local sequence plus molecular weight and amino acid composition, local sequence plus molecular weight and predicted secondary structure, and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55, and root mean square error (RMSE) of 0.82, based on a well-defined dataset with 680 protein sequences.
Moreover, by incorporating global features such as molecular weight and amino acid composition we could further improve the prediction performance to a CC of 0.57 and an RMSE of 0.79. In addition, adding the secondary structure predicted by PSIPRED was found to significantly improve the prediction performance and yielded the best prediction accuracy, with a CC of 0.60 and RMSE of 0.78, which is at least comparable to the performance of other existing methods. The SVR method shows a prediction performance competitive with or at least comparable to the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the fact that support vector regression is a powerful tool in extracting the protein sequence-structure relationship and in estimating the protein structural profiles from amino acid sequences.
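To make the quantities in this abstract concrete, the sketch below computes a residue-wise contact order from 3D coordinates and the two evaluation metrics quoted (Pearson CC and RMSE). It is an illustration under stated assumptions, not the authors' implementation: the 8 Å contact cutoff, the minimum sequence separation, the normalization by chain length, and all function names are hypothetical choices.

```python
import math


def rwco(coords, cutoff=8.0, min_sep=1):
    """Residue-wise contact order for each residue.

    coords: list of (x, y, z) positions, one per residue (e.g. C-alpha atoms).
    cutoff, min_sep and the 1/L normalization are illustrative assumptions.
    """
    n = len(coords)
    values = []
    for i in range(n):
        total = 0
        for j in range(n):
            sep = abs(i - j)
            if sep >= min_sep and math.dist(coords[i], coords[j]) < cutoff:
                total += sep  # add the sequence separation of each contact
        values.append(total / n)
    return values


def pearson_cc(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def rmse(xs, ys):
    """Root mean square error between predicted and observed values."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Toy chain: four residues on a straight line, 3.8 Angstrom apart (made-up data).
toy = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
toy_rwco = rwco(toy)
```

In a regression setting such as the one described, `pearson_cc` and `rmse` would be applied to predicted versus observed RWCO profiles; here they are only defined, since no predictions are available in the abstract.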
Cloning and Expression of the Benzoate Dioxygenase Genes from Rhodococcus sp. Strain 19070
Haddad, Sandra; Eby, D. Matthew; Neidle, Ellen L.
2001-01-01
The bopXYZ genes from the gram-positive bacterium Rhodococcus sp. strain 19070 encode a broad-substrate-specific benzoate dioxygenase. Expression of the BopXY terminal oxygenase enabled Escherichia coli to convert benzoate or anthranilate (2-aminobenzoate) to a nonaromatic cis-diol or catechol, respectively. This expression system also rapidly transformed m-toluate (3-methylbenzoate) to an unidentified product. In contrast, 2-chlorobenzoate was not a good substrate. The BopXYZ dioxygenase was homologous to the chromosomally encoded benzoate dioxygenase (BenABC) and the plasmid-encoded toluate dioxygenase (XylXYZ) of gram-negative acinetobacters and pseudomonads. Pulsed-field gel electrophoresis failed to identify any plasmid in Rhodococcus sp. strain 19070. Catechol 1,2- and 2,3-dioxygenase activity indicated that strain 19070 possesses both meta- and ortho-cleavage degradative pathways, which are associated in pseudomonads with the xyl and ben genes, respectively. Open reading frames downstream of bopXYZ, designated bopL and bopK, resembled genes encoding cis-diol dehydrogenases and benzoate transporters, respectively. The bop genes were in the same order as the chromosomal ben genes of P. putida PRS2000. The deduced sequences of BopXY were 50 to 60% identical to the corresponding proteins of benzoate and toluate dioxygenases. The reductase components of these latter dioxygenases, BenC and XylZ, are 201 residues shorter than the deduced BopZ sequence. As predicted from the sequence, expression of BopZ in E. coli yielded an approximately 60-kDa protein whose presence corresponded to increased cytochrome c reductase activity. While the N-terminal region of BopZ was approximately 50% identical in sequence to the entire BenC or XylZ reductases, the C terminus was unlike other known protein sequences. PMID:11375157
de Vries, Ronald P.; van den Broeck, Hetty C.; Dekkers, Ester; Manzanares, Paloma; de Graaff, Leo H.; Visser, Jaap
1999-01-01
A gene encoding a third α-galactosidase (AglB) from Aspergillus niger has been cloned and sequenced. The gene consists of an open reading frame of 1,750 bp containing six introns. The gene encodes a protein of 443 amino acids which contains a eukaryotic signal sequence of 16 amino acids and seven putative N-glycosylation sites. The mature protein has a calculated molecular mass of 48,835 Da and a predicted pI of 4.6. An alignment of the AglB amino acid sequence with those of other α-galactosidases revealed that it belongs to a subfamily of α-galactosidases that also includes A. niger AglA. A. niger AglC belongs to a different subfamily that consists mainly of prokaryotic α-galactosidases. The expression of aglA, aglB, aglC, and lacA, the latter of which encodes an A. niger β-galactosidase, has been studied by using a number of monomeric, oligomeric, and polymeric compounds as growth substrates. Expression of aglA is only detected on galactose and galactose-containing oligomers and polymers. The aglB gene is expressed on all of the carbon sources tested, including glucose. Elevated expression was observed on xylan, which could be assigned to regulation via XlnR, the xylanolytic transcriptional activator. Expression of aglC was only observed on glucose, fructose, and combinations of glucose with xylose and galactose. High expression of lacA was detected on arabinose, xylose, xylan, and pectin. Similar to aglB, the expression on xylose and xylan can be assigned to regulation via XlnR. All four genes have distinct expression patterns which seem to mirror the natural substrates of the encoded proteins. PMID:10347026
Butts, Carter T.; Bierma, Jan C.; Martin, Rachel W.
2016-01-01
In his 1875 monograph on insectivorous plants, Darwin described the feeding reactions of Drosera flypaper traps and predicted that their secretions contained a “ferment” similar to mammalian pepsin, an aspartic protease. Here we report a high-quality draft genome sequence for the cape sundew, Drosera capensis, the first genome of a carnivorous plant from order Caryophyllales, which also includes the Venus flytrap (Dionaea) and the tropical pitcher plants (Nepenthes). This species was selected in part for its hardiness and ease of cultivation, making it an excellent model organism for further investigations of plant carnivory. Analysis of predicted protein sequences yields genes encoding proteases homologous to those found in other plants, some of which display sequence and structural features that suggest novel functionalities. Because the sequence similarity to proteins of known structure is in most cases too low for traditional homology modeling, 3D structures of representative proteases are predicted using comparative modeling with all-atom refinement. Although the overall folds and active residues for these proteins are conserved, we find structural and sequence differences consistent with a diversity of substrate recognition patterns. Finally, we predict differences in substrate specificities using in silico experiments, providing targets for structure/function studies of novel enzymes with biological and technological significance. PMID:27353064
Cytochrome b5 gene and protein of Candida tropicalis and methods relating thereto
Craft, David L.; Madduri, Krishna M.; Loper, John C.
2003-01-01
A novel gene has been isolated which encodes cytochrome b5 (CYTb5) protein of the ω-hydroxylase complex of C. tropicalis 20336. Vectors including this gene, and transformed host cells are provided. Methods of increasing the production of a CYTb5 protein are also provided which involve transforming a host cell with a gene encoding this protein and culturing the cells. Methods of increasing the production of a dicarboxylic acid are also provided which involve increasing in the host cell the number of genes encoding this protein.
Lsa63, a newly identified surface protein of Leptospira interrogans binds laminin and collagen IV.
Vieira, Monica L; de Morais, Zenaide M; Gonçales, Amane P; Romero, Eliete C; Vasconcellos, Silvio A; Nascimento, Ana L T O
2010-01-01
Leptospira interrogans is the etiological agent of leptospirosis, a zoonotic disease that affects populations worldwide. We have identified in proteomic studies a protein that is encoded by the gene LIC10314 and expressed in a virulent strain of L. interrogans serovar Pomona. This protein was predicted to be surface exposed by the PSORT program and contains a p83/100 domain identified by BLAST analysis that is conserved in protein antigens of several strains of Borrelia and Treponema spp. The proteins containing this domain have been proposed as antigen candidates for serodiagnosis of Lyme borreliosis. Thus, we have cloned LIC10314 and expressed the protein in Escherichia coli BL21-SI strain by using the expression vector pAE. The recombinant protein tagged with N-terminal hexahistidine was purified by metal-charged chromatography and characterized by circular dichroism spectroscopy. This protein is conserved among several species of pathogenic Leptospira and absent in the saprophytic strain L. biflexa. We confirm by liquid-phase immunofluorescence assays with living organisms that this protein is most likely a new surface leptospiral protein. The ability of the protein to mediate attachment to ECM components was evaluated by binding assays. The leptospiral protein encoded by LIC10314, named Lsa63 (Leptospiral surface adhesin of 63 kDa), binds strongly to laminin and collagen IV in a dose-dependent and saturable fashion. In addition, Lsa63 is probably expressed during infection since it was recognized by antibodies in serum samples from patients with confirmed leptospirosis in the convalescent phase of the disease. Altogether, the data suggest that this newly identified surface protein may be involved in leptospiral pathogenesis. 2009 The British Infection Society. Published by Elsevier Ltd. All rights reserved.
Stevenson, G; Andrianopoulos, K; Hobbs, M; Reeves, P R
1996-01-01
Colanic acid (CA) is an extracellular polysaccharide produced by most Escherichia coli strains as well as by other species of the family Enterobacteriaceae. We have determined the sequence of a 23-kb segment of the E. coli K-12 chromosome which includes the cluster of genes necessary for production of CA. The CA cluster comprises 19 genes. Two other sequenced genes (orf1.3 and galF), which are situated between the CA cluster and the O-antigen cluster, were shown to be unnecessary for CA production. The CA cluster includes genes for synthesis of GDP-L-fucose, one of the precursors of CA, and the gene for one of the enzymes in this pathway (GDP-D-mannose 4,6-dehydratase) was identified by biochemical assay. Six of the inferred proteins show sequence similarity to glycosyl transferases, and two others have sequence similarity to acetyl transferases. Another gene (wzx) is predicted to encode a protein with multiple transmembrane segments and may function in export of the CA repeat unit from the cytoplasm into the periplasm in a process analogous to O-unit export. The first three genes of the cluster are predicted to encode an outer membrane lipoprotein, a phosphatase, and an inner membrane protein with an ATP-binding domain. Since homologs of these genes are found in other extracellular polysaccharide gene clusters, they may have a common function, such as export of polysaccharide from the cell. PMID:8759852
Hammond, John P.; Broadley, Martin R.; Bowen, Helen C.; Spracklen, William P.; Hayden, Rory M.; White, Philip J.
2011-01-01
Background There are compelling economic and environmental reasons to reduce our reliance on inorganic phosphate (Pi) fertilisers. Better management of Pi fertiliser applications is one option to improve the efficiency of Pi fertiliser use, whilst maintaining crop yields. Application rates of Pi fertilisers are traditionally determined from analyses of soil or plant tissues. Alternatively, diagnostic genes with altered expression under Pi-limiting conditions that suggest a physiological requirement for Pi fertilisation could be used to manage Pi fertiliser applications, and might be more precise than indirect measurements of soil or tissue samples. Results We grew potato (Solanum tuberosum L.) plants hydroponically, under glasshouse conditions, to control their nutrient status accurately. Samples of total leaf RNA taken periodically after Pi was removed from the nutrient solution were labelled and hybridised to potato oligonucleotide arrays. A total of 1,659 genes were significantly differentially expressed following Pi withdrawal. These included genes that encode proteins involved in lipid, protein, and carbohydrate metabolism, characteristic of Pi-deficient leaves, and included potential novel roles for genes encoding patatin-like proteins in potatoes. The array data were analysed using a support vector machine algorithm to identify groups of genes that could predict the Pi status of the crop. These groups of diagnostic genes were tested using field-grown potatoes that had either been fertilised or unfertilised. A group of 200 genes could correctly predict the Pi status of field-grown potatoes. Conclusions This paper provides a proof-of-concept demonstration for using microarrays and class prediction tools to predict the Pi status of a field-grown potato crop. There is potential to develop this technology for other biotic and abiotic stresses in field-grown crops.
Ultimately, a better understanding of crop stresses may improve our management of the crop, improving the sustainability of agriculture. PMID:21935429
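The class-prediction step described in this abstract (a support vector machine trained on expression values to call Pi status) can be sketched with a small linear SVM. The code below is a toy illustration only: it uses full-batch subgradient descent on the regularized hinge loss, two made-up "marker gene" expression values per sample, and hypothetical hyperparameters. It does not reproduce the study's data, gene set, or software.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit w, b by subgradient descent on the L2-regularized hinge loss.

    X: list of feature vectors; y: list of labels in {+1, -1}.
    lam, lr and epochs are illustrative values, not tuned settings.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wk for wk in w]  # gradient of the regularizer
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wk * xk for wk, xk in zip(w, xi)) + b)
            if margin < 1:  # sample violates the margin: hinge term is active
                for k in range(d):
                    grad_w[k] -= yi * xi[k] / n
                grad_b -= yi / n
        w = [wk - lr * gk for wk, gk in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wk * xk for wk, xk in zip(w, x)) + b >= 0 else -1


# Made-up expression values for two hypothetical marker genes.
# Label +1 = Pi-deficient, -1 = Pi-replete (labels chosen for illustration).
X = [[2.0, 0.5], [2.5, 0.3], [1.8, 0.7], [0.4, 2.1], [0.6, 1.9], [0.2, 2.4]]
y = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(X, y)
```

A classifier of this kind would then be evaluated on independent samples, in the way the abstract describes testing the diagnostic gene groups on field-grown plants.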
Computational intelligence techniques for biological data mining: An overview
NASA Astrophysics Data System (ADS)
Faye, Ibrahima; Iqbal, Muhammad Javed; Said, Abas Md; Samir, Brahim Belhaouari
2014-10-01
Computational techniques have been successfully utilized for highly accurate analysis and modeling of multifaceted, raw biological data gathered from various genome sequencing projects. These techniques are proving much more effective at overcoming the limitations of traditional in vitro experiments on the constantly increasing sequence data. The most critical problems that have caught the attention of researchers include, but are not limited to, the following: accurate structure and function prediction of unknown proteins, protein subcellular localization prediction, finding protein-protein interactions, protein fold recognition, and analysis of microarray gene expression data. To solve these problems, various classification and clustering techniques using machine learning have been extensively used in the published literature. These techniques include neural network algorithms, genetic algorithms, fuzzy ARTMAP, K-Means, K-NN, SVM, rough set classifiers, decision trees and HMM-based algorithms. Major difficulties in applying the above algorithms include the limitations of previous feature encoding and selection methods in extracting the best features, increasing classification accuracy and decreasing the running time overheads of the learning algorithms. This research is potentially useful in drug design and in the diagnosis of some diseases. This paper presents a concise overview of the well-known protein classification techniques.
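As a small illustration of the feature-encoding and classification steps this overview refers to, the sketch below encodes each protein sequence as its amino acid composition (a 20-dimensional vector) and classifies a query with K-NN. The sequences, class labels, and function names are hypothetical, and composition is only one of many possible encodings.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Encode a sequence as the fraction of each of the 20 standard residues."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def knn_predict(train, query, k=3):
    """Majority vote among the k training sequences nearest in composition space."""
    q = composition(query)
    dists = sorted((math.dist(composition(s), q), label) for s, label in train)
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Made-up sequences: hydrophobic-rich versus charged, labelled for illustration.
train = [
    ("LLIVAVLLIAGV", "membrane"),
    ("AVLIIVLLGFAL", "membrane"),
    ("IVLLAFLVIAVG", "membrane"),
    ("DEKKRDEEKRDS", "soluble"),
    ("KRDEESKDERKD", "soluble"),
    ("EDKRSDEKKRED", "soluble"),
]
prediction = knn_predict(train, "LLVIAALIVGLF")
```

Composition discards residue order, which is one reason the overview highlights feature encoding as a limiting factor for classifier accuracy.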
DNA encoding a DNA repair protein
Petrini, John H.; Morgan, William Francis; Maser, Richard Scott; Carney, James Patrick
2006-08-15
An isolated and purified DNA molecule encoding a DNA repair protein, p95, is provided, as is isolated and purified p95. Also provided are methods of detecting p95 and DNA encoding p95. The invention further provides p95 knock-out mice.
Zhang, Shujian; Chakrabarty, Pranjib K.; Fleites, Laura A.; Rayside, Patricia A.; Hopkins, Donald L.; Gabriel, Dean W.
2015-01-01
Xylella fastidiosa (X. fastidiosa) infects a wide range of plant hosts and causes economically serious diseases, including Pierce's Disease (PD) of grapevines. X. fastidiosa biocontrol strain EB92-1 was isolated from elderberry and is infectious and persistent in grapevines but causes only very slight symptoms under ideal conditions. The draft genome of EB92-1 revealed that it appeared to be missing genes encoding 10 potential PD pathogenicity effectors found in Temecula1. Subsequent PCR and sequencing analyses confirmed that EB92-1 was missing the following predicted effectors found in Temecula1: two type II secreted enzymes, including a lipase (LipA; PD1703) and a serine protease (PD0956); two identical genes encoding proteins similar to Zonula occludens toxins (Zot; PD0915 and PD0928), and at least one relatively short, hemagglutinin-like protein (PD0986). Leaves of tobacco and citrus inoculated with cell-free, crude protein extracts of E. coli BL21(DE3) overexpressing PD1703 exhibited a hypersensitive response (HR) in less than 24 hours. When cloned into shuttle vector pBBR1MCS-5, PD1703 conferred strong secreted lipase activity to Xanthomonas citri, E. coli and X. fastidiosa EB92-1 in plate assays. EB92-1/PD1703 transformants also showed significantly increased disease symptoms on grapevines, characteristic of PD. Genes predicted to encode PD0928 (Zot) and a PD0986 (hemagglutinin) were also cloned into pBBR1MCS-5 and moved into EB92-1; both transformants also showed significantly increased symptoms on V. vinifera vines, characteristic of PD. Together, these results reveal that PD effectors include at least a lipase, two Zot-like toxins and a possibly redundant hemagglutinin, none of which are necessary for parasitic survival of X. fastidiosa populations in grapevines or elderberry. PMID:26218423
Figueiredo, Luisa M.; Rocha, Eduardo P. C.; Mancio-Silva, Liliana; Prevost, Christine; Hernandez-Verdun, Danièle; Scherf, Artur
2005-01-01
Telomerase replicates chromosome ends, a function necessary for maintaining genome integrity. We have identified the gene that encodes the catalytic reverse transcriptase (RT) component of this enzyme in the malaria parasite Plasmodium falciparum (PfTERT) as well as the orthologous genes from two rodent and one simian malaria species. PfTERT is predicted to encode a basic protein that contains the major sequence motifs previously identified in known telomerase RTs (TERTs). At ∼2500 amino acids, PfTERT is three times larger than other characterized TERTs. We observed remarkable sequence diversity between TERT proteins of different Plasmodial species, with conserved domains alternating with hypervariable regions. Immunofluorescence analysis revealed that PfTERT is expressed in asexual blood stage parasites that have begun DNA synthesis. Surprisingly, rather than at telomere clusters, PfTERT typically localizes into a discrete nuclear compartment. We further demonstrate that this compartment is associated with the nucleolus, hereby defined for the first time in P. falciparum. PMID:15722485
Johnson, Jeremiah G.; Murphy, Caitlin N.; Sippy, Jean; Johnson, Tylor J.; Clegg, Steven
2011-01-01
Klebsiella pneumoniae is an opportunistic pathogen which frequently causes hospital-acquired urinary and respiratory tract infections. K. pneumoniae may establish these infections in vivo following adherence, using the type 3 fimbriae, to indwelling devices coated with extracellular matrix components. Using a colony immunoblot screen, we identified transposon insertion mutants which were deficient for type 3 fimbrial surface production. One of these mutants possessed a transposon insertion within a gene, designated mrkI, encoding a putative transcriptional regulator. A site-directed mutant of this gene was constructed and shown to be deficient for fimbrial surface expression under aerobic conditions. MrkI mutants have a significantly decreased ability to form biofilms on both abiotic and extracellular matrix-coated surfaces. This gene was found to be cotranscribed with a gene predicted to encode a PilZ domain-containing protein, designated MrkH. This protein was found to bind cyclic-di-GMP (c-di-GMP) and regulate type 3 fimbrial expression. PMID:21571997
Berger, Philipp; Sirkowski, Erich E; Scherer, Steven S; Suter, Ueli
2004-11-01
Mutations in the gene encoding N-myc downstream-regulated gene-1 (NDRG1) lead to truncations of the encoded protein and are associated with an autosomal recessive demyelinating neuropathy--hereditary motor and sensory neuropathy-Lom. NDRG1 protein is highly expressed in peripheral nerve and is localized in the cytoplasm of myelinating Schwann cells, including the paranodes and Schmidt-Lanterman incisures. In contrast, sensory and motor neurons as well as their axons lack NDRG1. NDRG1 mRNA levels in developing and injured adult sciatic nerves parallel those of myelin-related genes, indicating that the expression of NDRG1 in myelinating Schwann cells is regulated by axonal interactions. Oligodendrocytes also express NDRG1, and the subtle CNS deficits of affected patients may result from a lack of NDRG1 in these cells. Our data predict that the loss of NDRG1 leads to a Schwann cell autonomous phenotype resulting in demyelination, with secondary axonal loss.
Tan, Yung-Chie; Ang, Cheng-Liang; Wong, Mui-Yun; Ho, Chai-Ling
2016-01-01
Plant defensins are plant defence peptides that have many different biological activities, including antifungal, antimicrobial, and insecticidal activities. A cDNA (EgDFS) encoding defensin was isolated from Elaeis guineensis. The open reading frame of EgDFS contained 231 nucleotides encoding a 71-amino acid protein with a predicted molecular weight of 8.69 kDa and a potential signal peptide. The eight highly conserved cysteine sites in plant defensins were also conserved in EgDFS. The EgDFS sequence lacking 30 amino acid residues at its N-terminus (EgDFSm) was cloned into Escherichia coli BL21 (DE3) pLysS and successfully expressed as a soluble recombinant protein. The recombinant EgDFSm was found to be a thermally stable peptide that inhibited the growth of G. boninense, possibly by inhibiting starch assimilation. The role of EgDFSm in the oil palm defence system against infection by the pathogen G. boninense is discussed.
Mechanisms of Adaptation to Nitrosative Stress in Bacillus subtilis
Rogstam, Annika; Larsson, Jonas T.; Kjelgaard, Peter; von Wachenfeldt, Claes
2007-01-01
Bacteria use a number of mechanisms for coping with the toxic effects exerted by nitric oxide (NO) and its derivatives. Here we show that the flavohemoglobin encoded by the hmp gene has a vital role in an adaptive response to protect the soil bacterium Bacillus subtilis from nitrosative stress. We further show that nitrosative stress induced by the nitrosonium cation donor sodium nitroprusside (SNP) leads to deactivation of the transcriptional repressor NsrR, resulting in derepression of hmp. Nitrosative stress induces the sigma B-controlled general stress regulon. However, a sigB null mutant did not show increased sensitivity to SNP, suggesting that the sigma B-dependent stress proteins are involved in a nonspecific protection against stress whereas the Hmp flavohemoglobin plays a central role in detoxification. Mutations in the yjbIH operon, which encodes a truncated hemoglobin (YjbI) and a predicted 34-kDa cytosolic protein of unknown function (YjbH), rendered B. subtilis hypersensitive to SNP, suggesting roles in nitrosative stress management. PMID:17293416
Peoples, R J; Cisco, M J; Kaplan, P; Francke, U
1998-01-01
We have identified a novel gene (WBSCR9) within the common Williams-Beuren syndrome (WBS) deletion by interspecies sequence conservation. The WBSCR9 gene encodes a roughly 7-kb transcript with an open reading frame of 1483 amino acids and a predicted protein product size of 170.8 kDa. WBSCR9 is comprised of at least 20 exons extending over 60 kb. The transcript is expressed ubiquitously throughout development and is subject to alternative splicing. Functional motifs identified by sequence homology searches include a bromodomain; a PHD, or C4HC3, finger; several putative nuclear localization signals; four nuclear receptor binding motifs; a polyglutamate stretch and two PEST sequences. Bromodomains, PHD motifs and nuclear receptor binding motifs are cardinal features of proteins that are involved in chromatin remodeling and modulation of transcription. Haploinsufficiency for WBSCR9 gene products may contribute to the complex phenotype of WBS by interacting with tissue-specific regulatory factors during development.
Plett, Jonathan M.; Yin, Hengfu; Mewalal, Ritesh; ...
2017-03-23
During symbiosis, organisms use a range of metabolic and protein-based signals to communicate. Of these protein signals, one class is defined as ‘effectors’, i.e., small secreted proteins (SSPs) that cause phenotypical and physiological changes in another organism. To date, protein-based effectors have been described in aphids, nematodes, fungi and bacteria. Using RNA sequencing of Populus trichocarpa roots in mutualistic symbiosis with the ectomycorrhizal fungus Laccaria bicolor, we sought to determine if host plants also contain genes encoding effector-like proteins. We identified 417 plant-encoded putative SSPs that were significantly regulated during this interaction, including 161 SSPs specific to P. trichocarpa and 15 SSPs exhibiting expansion in Populus and closely related lineages. We demonstrate that a subset of these SSPs can enter L. bicolor hyphae, localize to the nucleus and affect hyphal growth and morphology. Finally, we conclude that plants encode proteins that appear to function as effector proteins that may regulate symbiotic associations.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Plett, Jonathan M.; Yin, Hengfu; Mewalal, Ritesh
During symbiosis, organisms use a range of metabolic and protein-based signals to communicate. Of these protein signals, one class is defined as ‘effectors’, i.e., small secreted proteins (SSPs) that cause phenotypical and physiological changes in another organism. To date, protein-based effectors have been described in aphids, nematodes, fungi and bacteria. Using RNA sequencing of Populus trichocarpa roots in mutualistic symbiosis with the ectomycorrhizal fungus Laccaria bicolor, we sought to determine if host plants also contain genes encoding effector-like proteins. We identified 417 plant-encoded putative SSPs that were significantly regulated during this interaction, including 161 SSPs specific to P. trichocarpa and 15 SSPs exhibiting expansion in Populus and closely related lineages. We demonstrate that a subset of these SSPs can enter L. bicolor hyphae, localize to the nucleus and affect hyphal growth and morphology. Finally, we conclude that plants encode proteins that appear to function as effector proteins that may regulate symbiotic associations.
Actions of plant Argonautes: predictable or unpredictable?
Ma, Zeyang; Zhang, Xiuren
2018-05-29
Argonaute (AGO) proteins are the key effector of RNA-induced silencing complex (RISC). Land plants typically encode numerous AGO proteins, and they can be typically divided into two major functional groups based on the species of their housed small RNAs (sRNAs). One group of AGOs, guided by 24-nucleotide (nt) sRNAs, canonically function in nuclei to implement transcriptional gene silencing (TGS), whereas the other group of AGOs, guided by 21-nt sRNAs, act in the cytoplasm to fulfill posttranscriptional gene silencing (PTGS). Many new discoveries have been recently made on functions and mechanisms of AGO proteins in plants, and some of the findings change our views on the conventional classification and roles of AGO proteins. In this review, we summarize our current knowledge of AGO proteins in plants. Copyright © 2018 Elsevier Ltd. All rights reserved.
Palmer, M. J.; Mergner, V. A.; Richman, R.; Manning, J. E.; Kuroda, M. I.; Lucchesi, J. C.
1993-01-01
male-specific lethal-one (msl-1) is one of four genes that are required for dosage compensation in Drosophila males. To determine the molecular basis of msl-1 regulation of dosage compensation, we have cloned the gene and characterized its products. The predicted msl-1 protein (MSL-1) has no significant similarity to proteins in the current data bases but contains an acidic N terminus characteristic of proteins involved in transcription and chromatin modeling. We present evidence that the msl-1 protein is associated with hundreds of sites along the length of the X chromosome in male, but not in female, nuclei. Our findings support the hypothesis that msl-1 plays a direct role in increasing the level of X-linked gene transcription in male nuclei. PMID:8325488
Prasad, Bhagwat; Evers, Raymond; Gupta, Anshul; Hop, Cornelis E. C. A.; Salphati, Laurent; Shukla, Suneet; Ambudkar, Suresh V.
2014-01-01
Interindividual variability in protein expression of organic anion-transporting polypeptides (OATPs) OATP1B1, OATP1B3, OATP2B1, and multidrug resistance-linked P-glycoprotein (P-gp) or ABCB1 was quantified in frozen human livers (n = 64) and cryopreserved human hepatocytes (n = 12) by a validated liquid chromatography tandem mass spectroscopy (LC-MS/MS) method. Membrane isolation, sample workup, and LC-MS/MS analyses were as described before by our laboratory. Briefly, total native membrane proteins, isolated from the liver tissue and cryopreserved hepatocytes, were trypsin digested and quantified by LC-MS/MS using signature peptide(s) unique to each transporter. The mean ± S.D. (maximum/minimum range in parentheses) protein expression (fmol/µg of membrane protein) in human liver tissue was OATP1B1- 2.0 ± 0.9 (7), OATP1B3- 1.1 ± 0.5 (8), OATP2B1- 1.7 ± 0.6 (5), and P-gp- 0.4 ± 0.2 (8). Transporter expression in the liver tissue was comparable to that in the cryopreserved hepatocytes. Most important is that livers with SLCO1B1 (encoding OATP1B1) haplotypes *14/*14 and *14/*1a [i.e., representing single nucleotide polymorphisms (SNPs), c.388A > G, and c.463C > A] had significantly higher (P < 0.0001) protein expression than the reference haplotype (*1a/*1a). Based on these genotype-dependent protein expression data, we predicted (using Simcyp) an up to ∼40% decrease in the mean area under the curve of rosuvastatin or repaglinide in subjects harboring these variant alleles compared with those harboring the reference alleles. SLCO1B3 (encoding OATP1B3) SNPs did not significantly affect protein expression. Age and sex were not associated with transporter protein expression. These data will facilitate the prediction of population-based human transporter-mediated drug disposition, drug-drug interactions, and interindividual variability through physiologically based pharmacokinetic modeling. PMID:24122874
Vasala, A; Dupont, L; Baumann, M; Ritzenthaler, P; Alatossava, T
1993-01-01
Virulent phage LL-H and temperate phage mv4 are two related bacteriophages of Lactobacillus delbrueckii. The gene clusters encoding structural proteins of these two phages have been sequenced and further analyzed. Six open reading frames (ORF-1 to ORF-6) were detected. Protein sequencing and Western immunoblotting experiments confirmed that ORF-3 (g34) encoded the main capsid protein Gp34. The presence of a putative late promoter in front of the phage LL-H g34 gene was suggested by primer extension experiments. Comparative sequence analysis between phage LL-H and phage mv4 revealed striking similarities in the structure and organization of this gene cluster, suggesting that the genes encoding phage structural proteins belong to a highly conservative module. PMID:8497043
van der Ley, P
1988-11-01
Gonococci express a family of related outer membrane proteins designated protein II (P.II). These surface proteins are subject to both phase variation and antigenic variation. The P.II gene repertoire of Neisseria gonorrhoeae strain JS3 was found to consist of at least ten genes, eight of which were cloned. Sequence analysis and DNA hybridization studies revealed that one particular P.II-encoding sequence is present in three distinct, but almost identical, copies in the JS3 genome. These genes encode the P.II protein that was previously identified as P.IIc. Comparison of their sequences shows that the multiple copies of this P.IIc-encoding gene might have been generated by both gene conversion and gene duplication.
NASA Astrophysics Data System (ADS)
Ng, Siuk-Mun; Lee, Xin-Wei; Wan, Kiew-Lian; Firdaus-Raih, Mohd
2015-09-01
Regulation of functional nucleus-encoded proteins targeting the plastidial functions was comparatively studied for a plant parasite, Rafflesia cantleyi versus a photosynthetic plant, Arabidopsis thaliana. This study involved two species of different feeding modes and different developmental stages. A total of 30 nucleus-encoded proteins were found to be differentially-regulated during two stages in the parasite; whereas 17 nucleus-encoded proteins were differentially-expressed during two developmental stages in Arabidopsis thaliana. One notable finding observed for the two plants was the identification of genes involved in the regulation of photosynthesis-related processes where these processes, as expected, seem to be present only in the autotroph.
Vera-Otarola, Jorge; Solis, Loretto; Soto-Rifo, Ricardo; Ricci, Emiliano P; Pino, Karla; Tischler, Nicole D; Ohlmann, Théophile; Darlix, Jean-Luc; López-Lastra, Marcelo
2012-02-01
The small mRNA (SmRNA) of all Bunyaviridae encodes the nucleocapsid (N) protein. In 4 out of 5 genera in the Bunyaviridae, the SmRNA encodes an additional nonstructural protein denominated NSs. In this study, we show that Andes hantavirus (ANDV) SmRNA encodes an NSs protein. Data show that the NSs protein is expressed in the context of an ANDV infection. Additionally, our results suggest that translation initiation from the NSs initiation codon is mediated by ribosomal subunits that have bypassed the upstream N protein initiation codon through a leaky scanning mechanism.
Vera-Otarola, Jorge; Solis, Loretto; Soto-Rifo, Ricardo; Ricci, Emiliano P.; Pino, Karla; Tischler, Nicole D.; Ohlmann, Théophile; Darlix, Jean-Luc
2012-01-01
The small mRNA (SmRNA) of all Bunyaviridae encodes the nucleocapsid (N) protein. In 4 out of 5 genera in the Bunyaviridae, the SmRNA encodes an additional nonstructural protein denominated NSs. In this study, we show that Andes hantavirus (ANDV) SmRNA encodes an NSs protein. Data show that the NSs protein is expressed in the context of an ANDV infection. Additionally, our results suggest that translation initiation from the NSs initiation codon is mediated by ribosomal subunits that have bypassed the upstream N protein initiation codon through a leaky scanning mechanism. PMID:22156529
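The leaky scanning idea in this abstract (ribosomal subunits bypass an upstream initiation codon and start at a downstream one) can be illustrated with a small sketch. The function below is a minimal, hypothetical helper that lists every AUG in an mRNA string and the length of the open reading frame each one would open. The toy sequence is invented for illustration and is not ANDV sequence, and the abstract does not state the relative reading frames of N and NSs.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def orfs_from_start_codons(mrna):
    """Return (start_index, frame, codon_count) for each AUG.

    Each ORF is read from its AUG to the first in-frame stop codon
    (or to the end of the sequence if no stop is found).
    """
    results = []
    for start in range(len(mrna) - 2):
        if mrna[start:start + 3] != "AUG":
            continue
        codons = 0
        for pos in range(start, len(mrna) - 2, 3):
            if mrna[pos:pos + 3] in STOP_CODONS:
                break
            codons += 1
        results.append((start, start % 3, codons))
    return results

# Hypothetical toy mRNA: an upstream AUG at index 0 and a second AUG at
# index 5 in a different frame. A ribosome that skips the first AUG could
# initiate at the second one and make a different, shorter product.
toy_mrna = "AUGGCAUGGCCUAAGCUGUGA"
for start, frame, codons in orfs_from_start_codons(toy_mrna):
    print(f"AUG at {start}, frame {frame}, {codons} codons")
```

Real initiation efficiency also depends on the sequence context around each AUG, which this sketch deliberately ignores.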
Fusagene vectors: a novel strategy for the expression of multiple genes from a single cistron.
Gäken, J; Jiang, J; Daniel, K; van Berkel, E; Hughes, C; Kuiper, M; Darling, D; Tavassoli, M; Galea-Lauri, J; Ford, K; Kemeny, M; Russell, S; Farzaneh, F
2000-12-01
Transduction of cells with multiple genes, allowing their stable and co-ordinated expression, is difficult with the available methodologies. A method has been developed for expression of multiple gene products, as fusion proteins, from a single cistron. The encoded proteins are post-synthetically cleaved and processed into each of their constituent proteins as individual, biologically active factors. Specifically, linkers encoding cleavage sites for the Golgi expressed endoprotease, furin, have been incorporated between in-frame cDNA sequences encoding different secreted or membrane bound proteins. With this strategy we have developed expression vectors encoding multiple proteins (IL-2 and B7.1, IL-4 and B7.1, IL-4 and IL-2, IL-12 p40 and p35, and IL-12 p40, p35 and IL-2). Transduction and analysis of over 100 individual clones, derived from murine and human tumour cell lines, demonstrate the efficient expression and biological activity of each of the encoded proteins. Fusagene vectors enable the co-ordinated expression of multiple gene products from a single, monocistronic, expression cassette.
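As a rough illustration of how a single fusion protein could be split into its constituents at furin sites, the sketch below scans a protein string for the commonly cited furin consensus R-X-(K/R)-R and cuts after the final arginine. The actual linker sequences used in the fusagene vectors are not given in the abstract, so the motif choice, the function names and the toy sequence are all assumptions.

```python
import re

# Lookahead pattern so that overlapping motifs are all reported.
# R-X-(K/R)-R is the commonly cited furin consensus (assumed here).
FURIN_MOTIF = re.compile(r"(?=R.[KR]R)")

def furin_cleavage_positions(protein):
    """Return 0-based indices of the residue just after each motif."""
    return [m.start() + 4 for m in FURIN_MOTIF.finditer(protein)]

def split_at_furin_sites(protein):
    """Cut the protein after every motif and return the pieces in order."""
    pieces, last = [], 0
    for cut in furin_cleavage_positions(protein):
        pieces.append(protein[last:cut])
        last = cut
    pieces.append(protein[last:])
    return pieces

# Hypothetical fusion: two short "proteins" joined by an RVKR linker.
print(split_at_furin_sites("MAAARVKRSGGG"))
```

In this toy model the linker stays attached to the upstream piece; whether and how the residual linker residues are trimmed in cells is outside the scope of the sketch.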
Arambula, Diego; Wong, Wenge; Medhekar, Bob A; Guo, Huatao; Gingery, Mari; Czornyj, Elizabeth; Liu, Minghsun; Dey, Sanghamitra; Ghosh, Partho; Miller, Jeff F
2013-05-14
Diversity-generating retroelements (DGRs) are a unique family of retroelements that confer selective advantages to their hosts by facilitating localized DNA sequence evolution through a specialized error-prone reverse transcription process. We characterized a DGR in Legionella pneumophila, an opportunistic human pathogen that causes Legionnaires' disease. The L. pneumophila DGR is found within a horizontally acquired genomic island, and it can theoretically generate 10^26 unique nucleotide sequences in its target gene, legionella determinent target A (ldtA), creating a repertoire of 10^19 distinct proteins. Expression of the L. pneumophila DGR resulted in transfer of DNA sequence information from a template repeat to a variable repeat (VR) accompanied by adenine-specific mutagenesis of progeny VRs at the 3' end of ldtA. ldtA encodes a twin-arginine translocated lipoprotein that is anchored in the outer leaflet of the outer membrane, with its C-terminal variable region surface exposed. Related DGRs were identified in L. pneumophila clinical isolates that encode unique target proteins with homologous VRs, demonstrating the adaptability of DGR components. This work characterizes a DGR that diversifies a bacterial protein and confirms the hypothesis that DGR-mediated mutagenic homing occurs through a conserved mechanism. Comparative bioinformatics predicts that surface display of massively variable proteins is a defining feature of a subset of bacterial DGRs.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Trindade, Inês B.; Fonseca, Bruno M.; Matias, Pedro M.
The gene encoding a putative siderophore-interacting protein from the marine bacterium S. frigidimarina was successfully cloned, followed by expression and purification of the gene product. Optimized crystals diffracted to 1.35 Å resolution and preliminary crystallographic analysis is promising with respect to structure determination and increased insight into the poorly understood molecular mechanisms underlying iron acquisition. Siderophore-binding proteins (SIPs) perform a key role in iron acquisition in multiple organisms. In the genome of the marine bacterium Shewanella frigidimarina NCIMB 400, the gene tagged as SFRI-RS12295 encodes a protein from this family. Here, the cloning, expression, purification and crystallization of this protein are reported, together with its preliminary X-ray crystallographic analysis to 1.35 Å resolution. The SIP crystals belonged to the monoclinic space group P2₁, with unit-cell parameters a = 48.04, b = 78.31, c = 67.71 Å, α = 90, β = 99.94, γ = 90°, and are predicted to contain two molecules per asymmetric unit. Structure determination by molecular replacement and the use of previously determined ∼2 Å resolution SIP structures with ∼30% sequence identity as templates are ongoing.
Identification of ADAM 31: a protein expressed in Leydig cells and specialized epithelia.
Liu, L; Smith, J W
2000-06-01
A family of proteins containing a disintegrin and metalloproteinase domain (ADAMs) has been identified recently. Here, we report the identification of a novel member of the ADAM protein family from mouse. This protein is designated ADAM 31. The complementary DNA sequence of ADAM 31 predicts a transmembrane protein with metalloproteinase, disintegrin, cysteine-rich, and cytoplasmic domains. Messenger RNA encoding ADAM 31 was most abundant in testes, but was also detected in many other tissues. More significantly, the antibodies raised against ADAM 31 reveal that the protein has a unique and restricted expression pattern. ADAM 31 is expressed in Leydig cells of the testes, but unlike many other ADAMs, it is not found on developing sperm. Furthermore, ADAM 31 is highly expressed on four types of specialized epithelia: the cauda epididymidis, the vas deferens, the convoluted tubules of the kidney, and the parietal cells of the stomach.
Breast Reference Set Application: Karen Anderson-ASU (2014) — EDRN Public Portal
In order to increase the predictive value of tumor-specific antibodies for use as immunodiagnostics, our EDRN BDL has developed a novel protein microarray technology, termed Nucleic Acid Protein Programmable Array (NAPPA), which circumvents many of the limitations of traditional protein microarrays. NAPPA arrays are generated by printing full-length cDNAs encoding the target proteins at each feature of the array. The proteins are then transcribed and translated by a cell-free system and immobilized in situ using epitope tags fused to the proteins. Sera are added, and bound IgG is detected by standard secondary reagents. Using a sequential screening strategy to select AAb from 4,988 candidate tumor antigens, we have identified 28 potential AAb biomarkers for the early detection of breast cancer, and here we propose to evaluate these biomarkers using the EDRN Breast Cancer Reference Set.
NASA Astrophysics Data System (ADS)
Wu, Ming-Chya; Forbes, Jeffrey G.; Wang, Kuan
2016-06-01
Nebulin is an about 1 μm long intrinsically disordered scaffold for the thin filaments of skeletal muscle sarcomere. It is a multifunctional elastic protein that wraps around actin filament, stabilizes thin filaments, and regulates Ca-dependent actomyosin interactions. This study investigates whether the disorder profile of nebulin might encode guidelines for thin and thick filament interactions in the sarcomere of the skeletal muscle. The question was addressed computationally by analyzing the predicted disorder profile of human nebulin (6669 residues, ~200 actin-binding repeats) by PONDR and the periodicity of the A-band stripes (reflecting the locations of myosin-associated proteins) in the electron micrographs of the sarcomere. Using the detrended fluctuation analysis, a scale factor for the A-band stripe image data with respect to the nebulin disorder profile was determined to make the thin and thick filaments aligned to have maximum correlation. The empirical mode decomposition method was then applied to identify hidden periodicities in both the nebulin disorder profile and the rescaled A-band data. The decomposition reveals three characteristic length scales (45 nm, 100 nm, and 200 nm) that are relevant for correlational analysis. The dynamical cross-correlation analyses with moving windows at various sarcomere lengths depict a vernier-like design for both periodicities, thus enabling nebulin to sense position and fine tune sarcomere overlap. This shows that the disorder profile of scaffolding proteins may encode a guideline for cellular architecture.
Chakrabarti, Manohar; Liu, Xiaoxi; Wang, Yanping; Ramos, Alexis
2017-01-01
Increases in fruit weight of cultivated vegetables and fruits accompanied the domestication of these crops. Here we report on the positional cloning of a quantitative trait locus (QTL) controlling fruit weight in tomato. The derived allele of Cell Size Regulator (CSR-D) increases fruit weight predominantly through enlargement of the pericarp areas. The expanded pericarp tissues result from increased mesocarp cell size and not from increased number of cell layers. The effect of CSR on fruit weight and cell size is found across different genetic backgrounds implying a consistent impact of the locus on the trait. In fruits, CSR expression is undetectable early in development from floral meristems to the rapid cell proliferation stage after anthesis. Expression is low but detectable in growing fruit tissues and in or around vascular bundles coinciding with the cell enlargement stage of the fruit maturation process. CSR encodes an uncharacterized protein whose clade has expanded in the Solanaceae family. The mutant allele is predicted to encode a shorter protein due to a 1.4 kb deletion resulting in a 194 amino-acid truncation. Co-expression analyses and GO term enrichment analyses suggest association of CSR with cell differentiation in fruit tissues and vascular bundles. The derived allele arose in Solanum lycopersicum var cerasiforme and appears completely fixed in many cultivated tomato’s market classes. This finding suggests that the selection of this allele was critical to the full domestication of tomato from its intermediate ancestors. PMID:28817560
Asamizu, Erika; Nakamura, Yasukazu; Sato, Shusei; Tabata, Satoshi
2004-02-01
To perform a comprehensive analysis of genes expressed in a model legume, Lotus japonicus, a total of 74472 3'-end expressed sequence tags (EST) were generated from cDNA libraries produced from six different organs. Clustering of sequences was performed with an identity criterion of 95% for 50 bases, and a total of 20457 non-redundant sequences, 8503 contigs and 11954 singletons were generated. EST sequence coverage was analyzed by using the annotated L. japonicus genomic sequence and 1093 of the 1889 predicted protein-encoding genes (57.9%) were hit by the EST sequence(s). Gene content was compared to several plant species. Among the 8503 contigs, 471 were identified as sequences conserved only in leguminous species and these included several disease resistance-related genes. This suggested that in legumes, these genes may have evolved specifically to resist pathogen attack. The rate of gene sequence divergence was assessed by comparing similarity level and functional category based on the Gene Ontology (GO) annotation of Arabidopsis genes. This revealed that genes encoding ribosomal proteins, as well as those related to translation, photosynthesis, and cellular structure were more abundantly represented in the highly conserved class, and that genes encoding transcription factors and receptor protein kinases were abundantly represented in the less conserved class. To make the sequence information and the cDNA clones available to the research community, a Web database with useful services was created at http://www.kazusa.or.jp/en/plant/lotus/EST/.
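The counts quoted in this abstract are internally consistent, which can be checked with a few lines of arithmetic. The sketch below uses only the figures stated above; the variable names are illustrative.

```python
# Figures quoted in the abstract.
contigs = 8503
singletons = 11954
non_redundant = 20457
genes_hit = 1093
predicted_genes = 1889

# Contigs plus singletons should account for all non-redundant sequences.
total = contigs + singletons

# Share of predicted protein-encoding genes matched by at least one EST.
coverage_pct = round(100 * genes_hit / predicted_genes, 1)

print(total == non_redundant, coverage_pct)
```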
NASA Astrophysics Data System (ADS)
Zhang, Hao; Liu, Haijun; Blankenship, Robert E.; Gross, Michael L.
2016-01-01
We report an isotope-encoding method coupled with carboxyl-group footprinting to monitor protein conformational changes. The carboxyl groups of aspartic/glutamic acids and of the C-terminus of proteins can serve as reporters for protein conformational changes when labeled with glycine ethyl ester (GEE) mediated by carbodiimide. In the new development, isotope-encoded "heavy" and "light" GEE are used to label separately the two states of the orange carotenoid protein (OCP) from cyanobacteria. Two samples are mixed (1:1 ratio) and analyzed by a single LC-MS/MS experiment. The differences in labeling extent between the two states are represented by the ratio of the "heavy" and "light" peptides, providing information about protein conformational changes. Combining isotope-encoded MS quantitative analysis and carboxyl-group footprinting reduces the time of MS analysis and improves the sensitivity of GEE and other footprinting.
Zhang, Hao; Liu, Haijun; Blankenship, Robert E.; Gross, Michael L.
2015-01-01
We report an isotope-encoding method coupled with carboxyl-group footprinting to monitor protein conformational changes. The carboxyl groups of aspartic/glutamic acids and of the C-terminus of proteins can serve as reporters for protein conformational changes when labeled with glycine ethyl ester (GEE) mediated by carbodiimide. In the new development, isotope-encoded “heavy” and “light” GEE are used to label separately the two states of the Orange Carotenoid Protein (OCP) from cyanobacteria. Two samples are mixed (1:1 ratio) and analyzed by a single LC-MS/MS experiment. The differences in labeling extent between the two states are represented by the ratio of the “heavy” and “light” peptides, providing information about protein conformational changes. Combining isotope-encoded MS quantitative analysis and carboxyl-group footprinting reduces the time of MS analysis and improves the sensitivity of GEE and other footprinting. PMID:26384685
Zhang, Hao; Liu, Haijun; Blankenship, Robert E.; ...
2015-09-18
Here, we report an isotope-encoding method coupled with carboxyl-group footprinting to monitor protein conformational changes. The carboxyl groups of aspartic/glutamic acids and of the C-terminus of proteins can serve as reporters for protein conformational changes when labeled with glycine ethyl ester (GEE) mediated by carbodiimide. In the new development, isotope-encoded “heavy” and “light” GEE are used to label separately the two states of the orange carotenoid protein (OCP) from cyanobacteria. Two samples are mixed (1:1 ratio) and analyzed by a single LC-MS/MS experiment. The differences in labeling extent between the two states are represented by the ratio of the “heavy” and “light” peptides, providing information about protein conformational changes. Combining isotope-encoded MS quantitative analysis and carboxyl-group footprinting reduces the time of MS analysis and improves the sensitivity of GEE and other footprinting.
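The readout described here, a heavy-to-light ratio per labeled peptide from one mixed sample, can be sketched as follows. This is a minimal illustration and not the authors' data-processing pipeline; the peptide names and intensity values are invented placeholders.

```python
def heavy_light_ratios(peptides):
    """Map peptide name -> heavy/light intensity ratio.

    `peptides` maps a name to a (heavy_intensity, light_intensity) pair.
    Peptides with no light signal are skipped to avoid division by zero.
    """
    return {
        name: heavy / light
        for name, (heavy, light) in peptides.items()
        if light > 0
    }

# Hypothetical intensities for GEE-labeled peptides from the 1:1 mixture.
example = {
    "peptide_A": (2.0e6, 1.0e6),  # more labeling in the heavy-labeled state
    "peptide_B": (1.0e6, 1.0e6),  # similar labeling in both states
}
print(heavy_light_ratios(example))
```

Under this reading, a ratio near 1 suggests a carboxyl group that is similarly exposed in both states, while a ratio far from 1 points to a region whose exposure differs between them.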
Shi, Huazhong; Kim, YongSig; Guo, Yan; Stevenson, Becky; Zhu, Jian-Kang
2003-01-01
Cell surface proteoglycans have been implicated in many aspects of plant growth and development, but genetic evidence supporting their function has been lacking. Here, we report that the Salt Overly Sensitive5 (SOS5) gene encodes a putative cell surface adhesion protein and is required for normal cell expansion. The sos5 mutant was isolated in a screen for Arabidopsis salt-hypersensitive mutants. Under salt stress, the root tips of sos5 mutant plants swell and root growth is arrested. The root-swelling phenotype is caused by abnormal expansion of epidermal, cortical, and endodermal cells. The SOS5 gene was isolated through map-based cloning. The predicted SOS5 protein contains an N-terminal signal sequence for plasma membrane localization, two arabinogalactan protein–like domains, two fasciclin-like domains, and a C-terminal glycosylphosphatidylinositol lipid anchor signal sequence. The presence of fasciclin-like domains, which typically are found in animal cell adhesion proteins, suggests a role for SOS5 in cell-to-cell adhesion in plants. The SOS5 protein was present at the outer surface of the plasma membrane. The cell walls are thinner in the sos5 mutant, and those between neighboring epidermal and cortical cells in sos5 roots appear less organized. SOS5 is expressed ubiquitously in all plant organs and tissues, including guard cells in the leaf. PMID:12509519
Song, Jiangning; Burrage, Kevin; Yuan, Zheng; Huber, Thomas
2006-03-09
The majority of peptide bonds in proteins are found to occur in the trans conformation. However, for proline residues, a considerable fraction of Prolyl peptide bonds adopt the cis form. Proline cis/trans isomerization is known to play a critical role in protein folding, splicing, cell signaling and transmembrane active transport. Accurate prediction of proline cis/trans isomerization in proteins would have many important applications towards the understanding of protein structure and function. In this paper, we propose a new approach to predict the proline cis/trans isomerization in proteins using support vector machine (SVM). The preliminary results indicated that using Radial Basis Function (RBF) kernels could lead to better prediction performance than that of polynomial and linear kernel functions. We used single sequence information of different local window sizes, amino acid compositions of different local sequences, multiple sequence alignment obtained from PSI-BLAST and the secondary structure information predicted by PSIPRED. We explored these different sequence encoding schemes in order to investigate their effects on the prediction performance. The training and testing of this approach was performed on a newly enlarged dataset of 2424 non-homologous proteins determined by X-Ray diffraction method using 5-fold cross-validation. Selecting the window size 11 provided the best performance for determining the proline cis/trans isomerization based on the single amino acid sequence. It was found that using multiple sequence alignments in the form of PSI-BLAST profiles could significantly improve the prediction performance: the prediction accuracy increased from 62.8% with single sequence to 69.8% and Matthews Correlation Coefficient (MCC) improved from 0.26 with single local sequence to 0.40. Furthermore, if coupled with the predicted secondary structure information by PSIPRED, our method yielded a prediction accuracy of 71.5% and MCC of 0.43, 9% and 0.17 higher than the accuracy achieved based on the single sequence information, respectively. A new method has been developed to predict the proline cis/trans isomerization in proteins based on support vector machine, which used the single amino acid sequence with different local window sizes, the amino acid compositions of local sequence flanking centered proline residues, the position-specific scoring matrices (PSSMs) extracted by PSI-BLAST and the predicted secondary structures generated by PSIPRED. The successful application of SVM approach in this study reinforced that SVM is a powerful tool in predicting proline cis/trans isomerization in proteins and biological sequence analysis.
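Two ingredients of the approach summarized above lend themselves to a short sketch: extracting a fixed-size local window centered on each proline, and scoring predictions with the Matthews Correlation Coefficient. The code below is an illustration under stated assumptions, not the authors' implementation. The one-hot encoding, the 'X' padding at sequence ends and all function names are hypothetical choices, and the SVM training step itself is omitted.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def proline_windows(sequence, window=11):
    """Yield (position, window_string) for every proline in the sequence.

    Ends are padded with 'X' so every window has the same length and the
    proline sits at the center.
    """
    half = window // 2
    padded = "X" * half + sequence + "X" * half
    for i, residue in enumerate(sequence):
        if residue == "P":
            yield i, padded[i:i + window]

def one_hot(window_string):
    """Flatten a window into 20 binary features per residue.

    Padding or unknown residues become all-zero blocks.
    """
    vector = []
    for residue in window_string:
        vector.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return vector

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Toy sequence (hypothetical): one proline, window size 11 as in the abstract.
for position, win in proline_windows("ACPGK"):
    print(position, win, len(one_hot(win)))
```

Feature vectors of this kind could then be passed to any SVM library with an RBF kernel; the profile-based (PSSM) and secondary-structure features mentioned in the abstract would be appended as further columns.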
Barret, Matthieu; Egan, Frank; Fargier, Emilie; Morrissey, John P; O'Gara, Fergal
2011-06-01
Bacteria encode multiple protein secretion systems that are crucial for interaction with the environment and with hosts. In recent years, attention has focused on type VI secretion systems (T6SSs), which are specialized transporters widely encoded in Proteobacteria. The myriad of processes associated with these secretion systems could be explained by subclasses of T6SS, each involved in specialized functions. To assess diversity and predict function associated with different T6SSs, comparative genomic analysis of 34 Pseudomonas genomes was performed. This identified 70 T6SSs, with at least one locus in every strain, except for Pseudomonas stutzeri A1501. By comparing 11 core genes of the T6SS, it was possible to identify five main Pseudomonas phylogenetic clusters, with strains typically carrying T6SSs from more than one clade. In addition, most strains encode additional vgrG and hcp genes, which encode extracellular structural components of the secretion apparatus. Using a combination of phylogenetic and meta-analysis of transcriptome datasets it was possible to associate specific subsets of VgrG and Hcp proteins with each Pseudomonas T6SS clade. Moreover, a closer examination of the genomic context of vgrG genes in multiple strains highlights a number of additional genes associated with these regions. It is proposed that these genes may play a role in secretion or alternatively could be new T6S effectors.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zelinka, L.; McCann, S.; Budde, J.
2011-08-05
Highlights: Affinity purification of the autoimmune rippling muscle disease immunogenic domain of titin. Partial sequence analysis confirms that the peptide is in the I band region of titin. This region of human titin shows a high degree of homology to mouse titin N2-A. Abstract: Autoimmune rippling muscle disease (ARMD) is an autoimmune neuromuscular disease associated with myasthenia gravis (MG). Past studies in our laboratory recognized a very high molecular weight skeletal muscle protein antigen identified by ARMD patient antisera as the titin isoform. These past studies used antisera from ARMD and MG patients as probes to screen a human skeletal muscle cDNA library and several pBluescript clones revealed supporting expression of immunoreactive peptides. This study characterizes the products of subcloning the titin immunoreactive domain into pGEX-3X and the subsequent fusion protein. Sequence analysis of the fusion gene indicates the cloned titin domain (GenBank ID: EU428784) is in frame and is derived from a sequence of N2-A spanning exons 248-250, an area that encodes the fibronectin III domain. PCR and EcoRI restriction mapping studies have demonstrated that the inserted cDNA is of a size that is predicted by bioinformatics analysis of the subclone. Expression of the fusion protein results in the isolation of a polypeptide of 52 kDa consistent with the predicted inferred amino acid sequence. Immunoblot experiments of the fusion protein, using rippling muscle/myasthenia gravis antisera, demonstrate that only the titin domain is immunoreactive.
Elmogy, Mohamed; Mohamed, Amr A; Tufail, Muhammad; Uno, Tomohide; Takeda, Makio
2017-05-26
The small Rab GTPases are key regulators of membrane vesicle trafficking. Ovaries of Periplaneta americana (Linnaeus) (Blattodea: Blattidae) have small molecular weight GTP/ATP-binding proteins during early and late vitellogenic periods of oogenesis. However, the identification and characterization of the detected proteins have not yet been reported. Herein, we cloned a cDNA encoding Rab5 (PamRab5) from the ovaries of the American cockroach, P. americana. It comprises 796 bp, encoding a protein of 213 amino acid residues with a predicted molecular weight of 23.5 kDa. PamRab5 exists as a single-copy gene in the P. americana genome, as revealed by Southern blot analysis. An ovarian mRNA of approximately 2.6 kb was transcribed at especially high levels in the previtellogenic ovaries, as detected by Northern blot analysis. The muscle and head tissues also showed high levels of PamRab5 transcript. PamRab5 protein was localized, via immunofluorescence labeling, to germline-derived cells of the oocytes, very early during oocyte differentiation. Immunoblotting of different tissues, including ovary, muscles and head, detected a ∼25 kDa signal as a membrane-associated form, revealed after application of detergent in the extraction buffer, and a 23 kDa signal as a cytosolic form consistent with the molecular weight predicted from the amino acid sequence. PamRab5 is required during late vitellogenic periods to regulate the endocytotic machinery during oogenesis in this cockroach. This is the first report on Rab5 from a hemimetabolan, and presents an inaugural step in probing the molecular premises of insect oocyte endocytotic trafficking important for oogenesis and embryonic development. © 2017 Institute of Zoology, Chinese Academy of Sciences.
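As a rough plausibility check on the predicted mass quoted above (213 residues, 23.5 kDa), a minimal sketch follows. It uses the common rule of thumb of about 110 Da per amino acid residue; that average, the function name, and the constant are assumptions for illustration and are not taken from the abstract. An exact prediction would sum residue-specific masses from the actual sequence.

```python
# Rule-of-thumb estimate only: ~110 Da per residue is an assumed average,
# not a value reported in the abstract.
AVG_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues: int, avg_residue_da: float = AVG_RESIDUE_MASS_DA) -> float:
    """Approximate protein mass in kDa from its length in residues."""
    if n_residues < 0:
        raise ValueError("n_residues must be non-negative")
    return n_residues * avg_residue_da / 1000.0


if __name__ == "__main__":
    # 213 residues is the PamRab5 length given in the abstract.
    print(f"{estimate_mass_kda(213):.1f} kDa")
```

The estimate lands close to the reported 23.5 kDa, which is all a length-based approximation can be expected to show.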
Paul, Catherine J; Twine, Susan M; Tam, Kevin J; Mullen, James A; Kelly, John F; Austin, John W; Logan, Susan M
2007-05-01
Strains of Clostridium botulinum are traditionally identified by botulinum neurotoxin type; however, identification of an additional target for typing would improve differentiation. Isolation of flagellar filaments and analysis by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) showed that C. botulinum produced multiple flagellin proteins. Nano-liquid chromatography-tandem mass spectrometry (nLC-MS/MS) analysis of in-gel tryptic digests identified peptides in all flagellin bands that matched two homologous tandem flagellin genes identified in the C. botulinum Hall A genome. Designated flaA1 and flaA2, these open reading frames encode the major structural flagellins of C. botulinum. Colony PCR and sequencing of flaA1/A2 variable regions classified 80 environmental and clinical strains into group I or group II and clustered isolates into 12 flagellar types. Flagellar type was distinct from neurotoxin type, and epidemiologically related isolates clustered together. Sequencing a larger PCR product, obtained during amplification of flaA1/A2 from type E strain Bennett identified a second flagellin gene, flaB. LC-MS analysis confirmed that flaB encoded a large type E-specific flagellin protein, and the predicted molecular mass for FlaB matched that observed by SDS-PAGE. In contrast, the molecular mass of FlaA was 2 to 12 kDa larger than the mass predicted by the flaA1/A2 sequence of a given strain, suggesting that FlaA is posttranslationally modified. While identification of FlaB, and the observation by SDS-PAGE of different masses of the FlaA proteins, showed the flagellin proteins of C. botulinum to be diverse, the presence of the flaA1/A2 gene in all strains examined facilitates single locus sequence typing of C. botulinum using the flagellin variable region.
Expression Analysis of the Theileria parva Subtelomere-Encoded Variable Secreted Protein Gene Family
Schmied, Stéfanie; Affentranger, Sarah; Parvanova, Iana; Kang'a, Simon; Nene, Vishvanath; Katzer, Frank; McKeever, Declan; Müller, Joachim; Bishop, Richard; Pain, Arnab; Dobbelaere, Dirk A. E.
2009-01-01
Background: The intracellular protozoan parasite Theileria parva transforms bovine lymphocytes inducing uncontrolled proliferation. Proteins released from the parasite are assumed to contribute to phenotypic changes of the host cell and parasite persistence. With 85 members, genes encoding subtelomeric variable secreted proteins (SVSPs) form the largest gene family in T. parva. The majority of SVSPs contain predicted signal peptides, suggesting secretion into the host cell cytoplasm. Methodology/Principal Findings: We analysed SVSP expression in T. parva-transformed cell lines established in vitro by infection of T or B lymphocytes with cloned T. parva parasites. Microarray and quantitative real-time PCR analysis revealed mRNA expression for a wide range of SVSP genes. The pattern of mRNA expression was largely defined by the parasite genotype and not by host background or cell type, and found to be relatively stable in vitro over a period of two months. Interestingly, immunofluorescence analysis carried out on cell lines established from a cloned parasite showed that expression of a single SVSP encoded by TP03_0882 is limited to only a small percentage of parasites. Epitope-tagged TP03_0882 expressed in mammalian cells was found to translocate into the nucleus, a process that could be attributed to two different nuclear localisation signals. Conclusions: Our analysis reveals a complex pattern of Theileria SVSP mRNA expression, which depends on the parasite genotype. Whereas in cell lines established from a cloned parasite transcripts can be found corresponding to a wide range of SVSP genes, only a minority of parasites appear to express a particular SVSP protein. The fact that a number of SVSPs contain functional nuclear localisation signals suggests that proteins released from the parasite could contribute to phenotypic changes of the host cell. This initial characterisation will facilitate future studies on the regulation of SVSP gene expression and the potential biological role of these enigmatic proteins. PMID:19325907
Predicting the Pathogenicity of Aminoacyl-tRNA Synthetase Mutations
Oprescu, Stephanie N.; Griffin, Laurie B.; Beg, Asim A.; Antonellis, Anthony
2016-01-01
Aminoacyl-tRNA synthetases (ARSs) are ubiquitously expressed, essential enzymes responsible for charging tRNA with cognate amino acids, the first step in protein synthesis. ARSs are required for protein translation in the cytoplasm and mitochondria of all cells. Surprisingly, mutations in 28 of the 37 nuclear-encoded human ARS genes have been linked to a variety of recessive and dominant tissue-specific disorders. Current data indicate that impaired enzyme function is a robust predictor of the pathogenicity of ARS mutations. However, experimental model systems that distinguish between pathogenic and non-pathogenic ARS variants are required for implicating newly identified ARS mutations in disease. Here, we outline strategies to assist in predicting the pathogenicity of ARS variants and urge cautious evaluation of genetic and functional data prior to linking an ARS mutation to a human disease phenotype. PMID:27876679
Nopaline-type Ti plasmid of Agrobacterium encodes a VirF-like functional F-box protein.
Lacroix, Benoît; Citovsky, Vitaly
2015-11-20
During Agrobacterium-mediated genetic transformation of plants, several bacterial virulence (Vir) proteins are translocated into the host cell to facilitate infection. One of the most important of such translocated factors is VirF, an F-box protein produced by octopine strains of Agrobacterium, which presumably facilitates proteasomal uncoating of the invading T-DNA from its associated proteins. The presence of VirF also is thought to be involved in differences in host specificity between octopine and nopaline strains of Agrobacterium, with the current dogma being that no functional VirF is encoded by nopaline strains. Here, we show that a protein with homology to octopine VirF is encoded by the Ti plasmid of the nopaline C58 strain of Agrobacterium. This protein, C58VirF, possesses the hallmarks of functional F-box proteins: it contains an active F-box domain and specifically interacts, via its F-box domain, with SKP1-like (ASK) protein components of the plant ubiquitin/proteasome system. Thus, our data suggest that nopaline strains of Agrobacterium have evolved to encode a functional F-box protein VirF.
Turner, Lauren Senty; Kanamoto, Taisei; Unoki, Takeshi; Munro, Cindy L.; Wu, Hui; Kitten, Todd
2009-01-01
Streptococcus sanguinis is a member of the viridans group of streptococci and a leading cause of the life-threatening endovascular disease infective endocarditis. Initial contact with the cardiac infection site is likely mediated by S. sanguinis surface proteins. In an attempt to identify the proteins required for this crucial step in pathogenesis, we searched for surface-exposed, cell wall-anchored proteins encoded by S. sanguinis and then used a targeted signature-tagged mutagenesis (STM) approach to evaluate their contributions to virulence. Thirty-three predicted cell wall-anchored proteins were identified—a number much larger than those found in related species. The requirement of each cell wall-anchored protein for infective endocarditis was assessed in the rabbit model. It was found that no single cell wall-anchored protein was essential for the development of early infective endocarditis. STM screening was also employed for the evaluation of three predicted sortase transpeptidase enzymes, which mediate the cell surface presentation of cell wall-anchored proteins. The sortase A mutant exhibited a modest (∼2-fold) reduction in competitiveness, while the other two sortase mutants were indistinguishable from the parental strain. The combined results suggest that while cell wall-anchored proteins may play a role in S. sanguinis infective endocarditis, strategies designed to interfere with individual cell wall-anchored proteins or sortases would not be effective for disease prevention. PMID:19703977
Function and specificity of synthetic Hox transcription factors in vivo
Papadopoulos, Dimitrios K.; Vukojević, Vladana; Adachi, Yoshitsugu; Terenius, Lars; Rigler, Rudolf; Gehring, Walter J.
2010-01-01
Homeotic (Hox) genes encode transcription factors that confer segmental identity along the anteroposterior axis of the embryo. However, the molecular mechanisms underlying Hox-mediated transcription and the differential requirements for specificity in the regulation of the vast number of Hox-target genes remain ill-defined. Here we show that synthetic Sex combs reduced (Scr) genes that encode the Scr C terminus containing the homeodomain (HD) and YPWM motif (Scr-HD) are functional in vivo. Synthetic Scr-HD peptides can induce ectopic salivary glands in the embryo and homeotic transformations in the adult fly, act as transcriptional activators and repressors during development, and participate in protein-protein interactions. Their transformation capacity was found to be enhanced over that of their full-length counterpart, and mutations known to transform the full-length protein into constitutively active or inactive variants behaved accordingly in the synthetic peptides. Our results show that synthetic Scr-HD genes are sufficient for homeotic function in Drosophila and suggest that the N terminus of Scr has a role in transcriptional potency, rather than specificity. We also demonstrate that synthetic peptides behave largely in a predictable way, by exhibiting Scr-specific phenotypes throughout development, which makes them an important tool for synthetic biology. PMID:20147626
Identification of the initiation site of poliovirus polyprotein synthesis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dorner, A.J.; Dorner, L.F.; Larsen, G.R.
1982-06-01
The complete nucleotide sequence of poliovirus RNA has a long open reading frame capable of encoding the precursor polyprotein NCVP00. The first AUG codon in this reading frame is located 743 nucleotides from the 5' end of the RNA and is preceded by eight AUG codons in all three reading frames. Because all proteins that map at the amino terminus of the polyprotein (P1-1a, VP0, and VP4) are blocked at their amino termini and previous studies of ribosome binding have been inconclusive, direct identification of the initiation site of protein synthesis was difficult. We separated and identified all of the tryptic peptides of capsid protein VP4 and correlated these peptides with the amino acid sequence predicted to follow the AUG codon at nucleotide 743. Our data indicate that VP4 begins with a blocked glycine that is encoded immediately after the AUG codon at nucleotide 743. An S1 nuclease analysis of poliovirus mRNA failed to reveal a splice in the 5' region. We concluded that synthesis of poliovirus polyprotein is initiated at nucleotide 743, the first AUG codon in the long open reading frame.
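The situation described above, where the initiating AUG of the long open reading frame is preceded by other AUG codons in various frames, can be illustrated with a small sketch. The function below scans all three forward frames and reports the start of the longest AUG-to-stop reading frame; the function name and the toy sequence are hypothetical, and this is a generic ORF scan rather than the experimental peptide-mapping approach the authors used.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def longest_orf_start(rna: str):
    """Return (start_index, length_nt) of the longest AUG-initiated ORF.

    Indices are 0-based; the length includes the stop codon. Only the three
    forward frames are scanned, and reading frames that run off the end of
    the sequence without a stop codon are ignored. Returns None if no
    complete ORF is found.
    """
    rna = rna.upper().replace("T", "U")
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(rna) - 2, 3):
            codon = rna[i:i + 3]
            if start is None and codon == "AUG":
                start = i  # first AUG of this open stretch
            elif start is not None and codon in STOP_CODONS:
                length = i + 3 - start
                if best is None or length > best[1]:
                    best = (start, length)
                start = None
    return best


if __name__ == "__main__":
    # Toy sequence (hypothetical): an upstream AUG at index 2 sits in a
    # different frame from the AUG at index 12 that opens the longest ORF.
    toy = "CCAUGCCUAACCAUGAAACCCGGGUUUUAGCC"
    print(longest_orf_start(toy))
```

A scan like this only proposes a candidate start site; as the abstract notes, confirming the actual initiation codon required protein-level evidence.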
Nishimura, Yuki; Kamikawa, Ryoma; Hashimoto, Tetsuo; Inagaki, Yuji
2014-01-01
Mitochondrial (mt) genome sequences, which often bear introns, have been sampled from phylogenetically diverse eukaryotes. Thus, we can anticipate novel insights into intron evolution from previously unstudied mt genomes. We here investigated the origins and evolution of three introns in the mt genome of the haptophyte Chrysochromulina sp. NIES-1333, which was sequenced completely in this study. All three introns were characterized as group II, on the basis of predicted secondary structure and the conserved sequence motifs at the 5′ and 3′ termini. Our comparative studies on diverse mt genomes prompt us to propose that the Chrysochromulina mt genome laterally acquired the introns from mt genomes in distantly related eukaryotes. Many group II introns harbor intronic open reading frames for proteins (intron-encoded proteins or IEPs), which likely facilitate the splicing of their host introns. However, we propose that a “free-standing,” IEP-like protein, which is not encoded within any intron in the Chrysochromulina mt genome, is involved in the splicing of the first cox1 intron, which lacks any open reading frame. PMID:25054084
Sperschneider, Jana; Gardiner, Donald M.; Thatcher, Louise F.; Lyons, Rebecca; Singh, Karam B.; Manners, John M.; Taylor, Jennifer M.
2015-01-01
Pathogens and hosts are in an ongoing arms race and genes involved in host–pathogen interactions are likely to undergo diversifying selection. Fusarium plant pathogens have evolved diverse infection strategies, but how they interact with their hosts in the biotrophic infection stage remains puzzling. To address this, we analyzed the genomes of three Fusarium plant pathogens for genes that are under diversifying selection. We found a two-speed genome structure both on the chromosome and gene group level. Diversifying selection acts strongly on the dispensable chromosomes in Fusarium oxysporum f. sp. lycopersici and on distinct core chromosome regions in Fusarium graminearum, all of which have associations with virulence. Members of two gene groups evolve rapidly, namely those that encode proteins with an N-terminal [SG]-P-C-[KR]-P sequence motif and proteins that are conserved predominantly in pathogens. Specifically, 29 F. graminearum genes are rapidly evolving, in planta induced and encode secreted proteins, strongly pointing toward effector function. In summary, diversifying selection in Fusarium is strongly reflected as genomic footprints and can be used to predict a small gene set likely to be involved in host–pathogen interactions for experimental verification. PMID:25994930
Kaydamov, C; Tewes, A; Adler, K; Manteuffel, R
2000-04-25
We have isolated cDNA sequences encoding alpha and beta subunits of potential G proteins from a cDNA library prepared from somatic embryos of Nicotiana plumbaginifolia Viv. at early developmental stages. The predicted NPGPA1 and NPGPB1 gene products are 75-98% identical to the known respective plant alpha and beta subunits. Southern hybridizations indicate that NPGPA1 is probably a single-copy gene, whereas at least two copies of NPGPB1 exist in the N. plumbaginifolia genome. Northern analyses reveal that both NPGPA1 and NPGPB1 mRNA are expressed in all embryogenic stages and plant tissues examined and their expression is obviously regulated by the plant hormone auxin. Immunohistological localization of NPGPalpha1 and NPGPbeta1 preferentially on plasma and endoplasmic reticulum membranes and their immunochemical detection exclusively in microsomal cell fractions implicate membrane association of both proteins. The temporal and spatial expression patterns of NPGPA1 and NPGPB1 show conformity as well as differences. This could account for not only cooperative, but also individual activities of both subunits during embryogenesis and plant development.
Radial spoke proteins of Chlamydomonas flagella
Yang, Pinfen; Diener, Dennis R.; Yang, Chun; Kohno, Takahiro; Pazour, Gregory J.; Dienes, Jennifer M.; Agrin, Nathan S.; King, Stephen M.; Sale, Winfield S.; Kamiya, Ritsu; Rosenbaum, Joel L.; Witman, George B.
2007-01-01
Summary The radial spoke is a ubiquitous component of ‘9+2’ cilia and flagella, and plays an essential role in the control of dynein arm activity by relaying signals from the central pair of microtubules to the arms. The Chlamydomonas reinhardtii radial spoke contains at least 23 proteins, only 8 of which have been characterized at the molecular level. Here, we use mass spectrometry to identify 10 additional radial spoke proteins. Many of the newly identified proteins in the spoke stalk are predicted to contain domains associated with signal transduction, including Ca2+-, AKAP- and nucleotide-binding domains. This suggests that the spoke stalk is both a scaffold for signaling molecules and itself a transducer of signals. Moreover, in addition to the recently described HSP40 family member, a second spoke stalk protein is predicted to be a molecular chaperone, implying that there is a sophisticated mechanism for the assembly of this large complex. Among the 18 spoke proteins identified to date, at least 12 have apparent homologs in humans, indicating that the radial spoke has been conserved throughout evolution. The human genes encoding these proteins are candidates for causing primary ciliary dyskinesia, a severe inherited disease involving missing or defective axonemal structures, including the radial spokes. PMID:16507594
Grohmann, L; Brennicke, A; Schuster, W
1992-01-01
The Oenothera mitochondrial genome contains only a gene fragment for ribosomal protein S12 (rps12), while other plants encode a functional gene in the mitochondrion. The complete Oenothera rps12 gene is located in the nucleus. The transit sequence necessary to target this protein to the mitochondrion is encoded by a 5'-extension of the open reading frame. Comparison of the amino acid sequence encoded by the nuclear gene with the polypeptides encoded by edited mitochondrial cDNA and genomic sequences of other plants suggests that gene transfer between mitochondrion and nucleus started from edited mitochondrial RNA molecules. Mechanisms and requirements of gene transfer and activation are discussed. PMID:1454526
Zhu, Ruo-Lin; Lei, Xiao-Ying; Ke, Fei; Yuan, Xiu-Ping; Zhang, Qi-Ya
2011-02-01
The genomic sequence of Scophthalmus maximus rhabdovirus (SMRV) isolated from diseased turbot has been characterized. The complete genome of SMRV comprises 11,492 nucleotides and encodes five typical rhabdovirus genes: N, P, M, G and L. In addition, two open reading frames (ORFs) are predicted to overlap with the P gene: one upstream of P and smaller than P (temporarily called Ps), and another within the P gene, which may encode a protein similar to the vesicular stomatitis virus C protein. The C ORF is contained within the P ORF. The five typical proteins share the highest sequence identities (48.9%) with the corresponding proteins of rhabdoviruses in the genus Vesiculovirus. Phylogenetic analysis of partial L protein sequence indicates that SMRV is close to the genus Vesiculovirus. The first 13 nucleotides at the two ends of the SMRV genome are perfectly inverse complementary. The gene junctions between the five genes show a conserved polyadenylation signal (CATGA(7)) and intergenic dinucleotide (CT) followed by a putative transcription initiation sequence A(A/G)(C/G)A(A/G/T), which differ from those of known rhabdoviruses. The entire Ps ORF was cloned and expressed, and used to generate polyclonal antibody in mice. One obvious band could be detected in SMRV-infected carp leucocyte cells (CLCs) by anti-Ps/C serum via Western blot, and the Ps-GFP fusion protein showed a cytoplasmic distribution as multiple punctate or doughnut-shaped foci of uneven size. Copyright © 2010 Elsevier B.V. All rights reserved.
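The statement that the terminal 13 nucleotides of the SMRV genome are inverse complementary can be checked computationally for any sequence. The sketch below counts how many leading nucleotides pair with the trailing nucleotides when the 3' end is read inward; the function names and the toy sequence are hypothetical, the sequence is treated as DNA-style A/C/G/T, and the SMRV genome itself is not reproduced here.

```python
# DNA-style complement table; assumes the sequence is written with A/C/G/T.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA-style sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def complementary_terminal_length(genome: str) -> int:
    """Count leading nucleotides that are complementary to the trailing ones.

    Position i from the 5' end is compared with the complement of position i
    from the 3' end. The count is capped at half the sequence length so the
    two termini never overlap.
    """
    genome = genome.upper()
    rc = reverse_complement(genome)
    n = 0
    for a, b in zip(genome, rc):
        if a != b:
            break
        n += 1
    return min(n, len(genome) // 2)


if __name__ == "__main__":
    # Toy sequence (hypothetical): six complementary terminal nucleotides
    # flanking an unrelated middle segment.
    toy = "ACGAAG" + "TTTTTTT" + "CTTCGT"
    print(complementary_terminal_length(toy))
```

Applied to a full genome sequence, a result of 13 would correspond to the terminal complementarity reported in the abstract.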
Kim, Hyung-Sae; Lee, Jee Hyun; Kim, Jae Joon; Kim, Chang-Hoon; Jun, Sung-Soo; Hong, Young-Nam
2005-01-03
We used differential screening to isolate a full-length dehydration-responsive cDNA clone encoding a hydrophobic late embryogenesis abundant (LEA)-like protein from PEG-treated hot pepper leaves. Named CaLEA6 (for Capsicum annuum LEA), this gene belongs to the atypical hydrophobic LEA Group 6. The full-length CaLEA6 is 709 bp long with an open reading frame encoding 164 amino acids. It is predicted to produce a highly hydrophobic, but cytoplasmic, protein. The putative M(r) of CaLEA6 protein is 18 kDa, with a theoretical pI of 4.63. Based on our Southern blot analysis, CaLEA6 appears to exist as a small gene family. CaLEA6 was not expressed prior to any treatment, but its transcript level increased rapidly and greatly following treatment with PEG, ABA, and NaCl. Chilling also induced it rapidly, but to a much lesser extent. Accumulation of CaLEA6 protein occurred soon after NaCl application, but was considerably delayed after treatment with PEG. Tobacco plants that overexpressed CaLEA6 showed enhanced tolerance to dehydration and NaCl but not to chilling, as defined by their leaf fresh weights, Chl contents, and the general health status of the leaves. Therefore, we suggest that CaLEA6 protein plays a potentially protective role when water deficit is induced by dehydration and high salinity, but not by low temperature.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Weiwen; Culley, David E.; Gritsenko, Marina A.
2006-11-03
Abstract: In a previous study, the whole-genome gene expression profiles of D. vulgaris in response to oxidative stress and heat shock were determined. The results showed that 24-28% of the responsive genes encoded hypothetical proteins that have not been experimentally characterized or whose function cannot be deduced by simple sequence comparison. To further explore the protective mechanisms employed by D. vulgaris against oxidative stress and heat shock, an attempt was made in this study to infer functions of these hypothetical proteins by phylogenomic profiling along with detailed sequence comparison against various publicly available databases. By this approach we were able to assign possible functions to 25 responsive hypothetical proteins. The findings included that DVU0725, induced by oxidative stress, may be involved in lipopolysaccharide biosynthesis, implying that the alteration of lipopolysaccharide on the cell surface might serve as a mechanism against oxidative stress in D. vulgaris. In addition, two responsive proteins, DVU0024, a putative transcriptional regulator, and DVU1670, a predicted redox protein, shared co-evolution patterns with rubrerythrin in Archaeoglobus fulgidus and Clostridium perfringens, respectively, implying that they might be part of the stress response and protective systems in D. vulgaris. The study demonstrated that phylogenomic profiling is a useful tool in the interpretation of experimental genomics data, and also provided further insight into the cellular response to oxidative stress and heat shock in D. vulgaris.
The cDNA sequence of a neutral horseradish peroxidase.
Bartonek-Roxå, E; Eriksson, H; Mattiasson, B
1991-02-16
A cDNA clone encoding a horseradish (Armoracia rusticana) peroxidase has been isolated and characterized. The cDNA contains 1378 nucleotides excluding the poly(A) tail, and the deduced protein contains 327 amino acids, including a 28 amino acid leader sequence. The predicted amino acid sequence is nine amino acids shorter than the major isoenzyme belonging to the horseradish peroxidase C group (HRP-C), and the sequence shows 53.7% identity with this isoenzyme. The described clone encodes nine cysteines, of which eight correspond well with the cysteines found in HRP-C. Five potential N-glycosylation sites with the general sequence Asn-X-Thr/Ser are present in the deduced sequence. This is three fewer glycosylation sites than in the earlier described HRP-C. The shorter sequence and fewer N-glycosylation sites give the native isoenzyme a molecular weight several thousand lower than that of the horseradish peroxidase C isoenzymes. Comparison with the net charge value of HRP-C indicates that the described cDNA clone encodes a peroxidase which has either the same or a slightly less basic pI value, depending on whether the encoded protein is N-terminally blocked or not. This excludes the possibility that HRP-n could belong to either the HRP-A, -D or -E groups. The low sequence identity (53.7%) with HRP-C indicates that the described clone does not belong to the HRP-C isoenzyme group, and comparison of the total amino acid composition with the HRP-B group does not place the described clone within this isoenzyme group. Our conclusion is that the described cDNA clone encodes a neutral horseradish peroxidase which belongs to a new, not earlier described, horseradish peroxidase group.
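The Asn-X-Thr/Ser pattern mentioned above is straightforward to search for in a deduced protein sequence. The following is a minimal sketch using a regular expression with a lookahead so that overlapping sites are all reported. Excluding proline at position X is a commonly applied refinement that the abstract does not state, so it is an assumption here, as are the function name and the toy sequence.

```python
import re

# Asn-X-Thr/Ser in one-letter code. Excluding Pro at X is an assumption
# (a common convention), not something stated in the abstract.
# The lookahead makes matches zero-width, so overlapping sites are found.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of the Asn in each potential N-glycosylation site."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    # Toy sequence (hypothetical), not the horseradish peroxidase sequence.
    toy = "MNATNPSANNSSK"
    print(find_sequons(toy))
```

Sites found this way are only potential glycosylation sites; whether each is actually modified has to be established experimentally.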
Network-based function prediction and interactomics: the case for metabolic enzymes.
Janga, S C; Díaz-Mejía, J Javier; Moreno-Hagelsieb, G
2011-01-01
As sequencing technologies increase in power, determining the functions of unknown proteins encoded by the DNA sequences so produced becomes a major challenge. Functional annotation is commonly done on the basis of amino-acid sequence similarity alone. Long after sequence similarity becomes undetectable by pair-wise comparison, profile-based identification of homologs can often succeed due to the conservation of position-specific patterns, important for a protein's three-dimensional folding and function. Nevertheless, prediction of protein function from homology-driven approaches is not without problems. Homologous proteins might evolve different functions, and the power of homology detection has already started to reach its maximum. Computational methods for inferring protein function, which exploit the context of a protein in cellular networks, have come to be built on top of homology-based approaches. These network-based functional inference techniques provide a first-hand hint of a protein's functional role and offer insights complementary to traditional methods for understanding the function of uncharacterized proteins. Most recent network-based approaches aim to integrate diverse kinds of functional interactions to boost both coverage and confidence level. These techniques not only promise to solve the moonlighting aspect of proteins by annotating proteins with multiple functions, but also increase our understanding of the interplay between different functional classes in a cell. In this article we review the state of the art in network-based function prediction and describe some of the underlying difficulties and successes. Given the volume of high-throughput data being reported, the time is ripe to employ these network-based approaches, which can be used to unravel the functions of the uncharacterized proteins accumulating in the genomic databases. © 2010 Elsevier Inc. All rights reserved.
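One of the simplest forms of network-based function inference is neighbor voting ("guilt by association"): an uncharacterized protein is assigned the functions most common among its annotated interaction partners. The sketch below illustrates that idea only; it is not a method described in the review, and the function name, protein identifiers, and annotations are all hypothetical.

```python
from collections import Counter


def predict_functions(protein, interactions, annotations):
    """Rank candidate functions for `protein` by votes from its network neighbors.

    interactions: iterable of (protein_a, protein_b) undirected edges.
    annotations:  dict mapping a protein to a list of known functions.
    Returns functions ordered from most to least supported.
    """
    votes = Counter()
    for a, b in interactions:
        if protein in (a, b):
            neighbor = b if a == protein else a
            for function in annotations.get(neighbor, ()):
                votes[function] += 1
    return [function for function, _ in votes.most_common()]


if __name__ == "__main__":
    # Hypothetical toy network: X is uncharacterized, A/B/C are annotated.
    edges = [("X", "A"), ("X", "B"), ("C", "X"), ("A", "B")]
    known = {"A": ["glycolysis"], "B": ["glycolysis"], "C": ["transport"]}
    print(predict_functions("X", edges, known))
```

Returning a ranked list rather than a single label leaves room for proteins with more than one function, which the abstract raises as the moonlighting problem.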
Wang, Yukun; Yuan, Guoliang; Yuan, Shaohua; Duan, Wenjing; Wang, Peng; Bai, Jianfang; Zhang, Fengting; Gao, Shiqing; Zhang, Liping; Zhao, Changping
2016-01-29
The 12-oxo-phytodienoic acid reductases (OPRs) are involved in various processes of growth and development in plants, and are classified into the OPRⅠ and OPRⅡ subgroups. In higher plants, only OPRⅡ subgroup genes take part in the biosynthesis of endogenous jasmonic acid. In this study, we isolated a novel OPRⅡ subgroup gene named TaOPR2 (GenBank accession: KM216389) from the thermo-sensitive genic male sterile (TGMS) wheat cultivar BS366. TaOPR2 was predicted to encode a protein of 390 amino acids. The encoded protein contained the typical oxidored_FMN domain, the C-terminal peroxisomal-targeting signal peptide, and conserved FMN-binding sites. TaOPR2 was mapped to wheat chromosome 7B and localized to the peroxisome. Protein evolution analysis revealed that TaOPR2 belongs to the OPRⅡ subgroup and shares a high degree of identity with other higher plant OPR proteins. The quantitative real-time PCR results indicated that the expression of TaOPR2 is inhibited by abscisic acid (ABA), salicylic acid (SA), gibberellic acid (GA3), low temperatures and high salinity. In contrast, the expression of TaOPR2 can be induced by wounding, drought and methyl jasmonate (MeJA). Furthermore, the transcription level of TaOPR2 increased after infection with Puccinia striiformis f. sp. tritici and Puccinia recondita f. sp. tritici. TaOPR2 has NADPH-dependent oxidoreductase activity. In addition, the constitutive expression of TaOPR2 can rescue the male sterility phenotype of the Arabidopsis mutant opr3. These results suggest that TaOPR2 is involved in the biosynthesis of jasmonic acid (JA) in wheat. Copyright © 2016 Elsevier Inc. All rights reserved.
Johnson, Amanda N.; Weil, P. Anthony
2017-01-01
Repressor activator protein 1 (Rap1) performs multiple vital cellular functions in the budding yeast Saccharomyces cerevisiae. These include regulation of telomere length, transcriptional repression of both telomere-proximal genes and the silent mating type loci, and transcriptional activation of hundreds of mRNA-encoding genes, including the highly transcribed ribosomal protein- and glycolytic enzyme-encoding genes. Studies of the contributions of Rap1 to telomere length regulation and transcriptional repression have yielded significant mechanistic insights. However, the mechanism of Rap1 transcriptional activation remains poorly understood because Rap1 is encoded by a single copy essential gene and is involved in many disparate and essential cellular functions, preventing easy interpretation of attempts to directly dissect Rap1 structure-function relationships. Moreover, conflicting reports on the ability of Rap1-heterologous DNA-binding domain fusion proteins to serve as chimeric transcriptional activators challenge use of this approach to study Rap1. Described here is the development of an altered DNA-binding specificity variant of Rap1 (Rap1AS). We used Rap1AS to map and characterize a 41-amino acid activation domain (AD) within the Rap1 C terminus. We found that this AD is required for transcription of both chimeric reporter genes and authentic chromosomal Rap1 enhancer-containing target genes. Finally, as predicted for a bona fide AD, mutation of this newly identified AD reduced the efficiency of Rap1 binding to a known transcriptional coactivator TFIID-binding target, Taf5. In summary, we show here that Rap1 contains an AD required for Rap1-dependent gene transcription. The Rap1AS variant will likely also be useful for studies of the functions of Rap1 in other biological pathways. PMID:28196871
Smith, Steven D.; Bridou, Romain; Johs, Alexander; ...
2015-02-27
Methylmercury is a potent neurotoxin that is produced by anaerobic microorganisms from inorganic mercury by a recently discovered pathway. A two-gene cluster, consisting of hgcA and hgcB, encodes two of the proteins essential for this activity. hgcA encodes a corrinoid protein with a strictly conserved cysteine proposed to be the ligand for cobalt in the corrinoid cofactor, whereas hgcB encodes a ferredoxin-like protein thought to be an electron donor to HgcA. Deletion of either gene eliminates mercury methylation by the methylator Desulfovibrio desulfuricans ND132. Here, site-directed mutants of HgcA and HgcB were constructed to determine amino acid residues essential for mercury methylation. Mutations of the strictly conserved residue Cys93 in HgcA, the proposed ligand for the corrinoid cobalt, to Ala or Thr completely abolished the methylation capacity, but a His substitution produced measurable methylmercury. Mutations of conserved amino acids near Cys93 had various impacts on the methylation capacity but showed that the structure of the putative “cap helix” region harboring Cys93 is crucial for methylation function. In the ferredoxin-like protein HgcB, only one of two conserved cysteines found at the C terminus was necessary for methylation, but either cysteine sufficed. An additional, strictly conserved cysteine, Cys73, was also determined to be essential for methylation. Ultimately, this study supports the previously predicted importance of Cys93 in HgcA for methylation of mercury and reveals additional residues in HgcA and HgcB that facilitate the production of this neurotoxin.
Chen, Minjie; Li, Yanjun; Zhang, Li; Wang, Jianying; Zheng, Chunli; Zhang, Xuefeng
2015-02-01
Acidithiobacillus ferrooxidans plays a critical role in metal solubilization in the biomining industry, and occupies an ecological niche characterized by high acidity and high concentrations of toxic heavy metal ions. In order to investigate the possible metal resistance mechanism, the cellular distribution of cadmium was tested. The result indicated that Cd(2+) entered the cells upon initial exposure resulting in increased intracellular concentrations, followed by its excretion from the cells during subsequent growth and adaptation. Sequence homology analyses were used to identify 10 genes predicted to participate in heavy metal homeostasis, and the expression of these genes was investigated in cells cultured in the presence of increasing concentrations of toxic divalent cadmium (Cd(2+)). The results suggested that one gene (cmtR A.f) encoded a putative Cd(2+)/Pb(2+)-responsive transcriptional regulator; four genes (czcA1 A.f, czcA2 A.f, czcB1 A.f, and czcC1 A.f) encoded heavy metal efflux proteins for Cd(2+); and two genes (cadA1 A.f and cadB1 A.f) encoded putative cation channel proteins related to the transport of Cd(2+). According to real-time polymerase chain reaction, no significant enhancement of gene expression was observed at a low concentration of Cd(2+) (5 mM), whereas at higher concentrations (15 and 30 mM) most of the putative metal resistance genes, except cmtR A.f, cadB3 A.f, and czcB1 A.f, were up-regulated. A model was developed for the mechanism of resistance to cadmium ions based on homology analyses of the predicted genes, the transcription of putative Cd(2+) resistance genes, and previous work.
Tomazetto, Geizecler; Hahnke, Sarah; Wibberg, Daniel; Pühler, Alfred; Klocke, Michael; Schlüter, Andreas
2018-06-01
Proteiniphilum saccharofermentans str. M3/6T is a recently described species within the family Porphyromonadaceae (phylum Bacteroidetes), which was isolated from a mesophilic laboratory-scale biogas reactor. The genome of the strain was completely sequenced and manually annotated to reconstruct its metabolic potential regarding biomass degradation and fermentation pathways. The P. saccharofermentans str. M3/6T genome consists of a 4,414,963 bp chromosome featuring an average GC-content of 43.63%. Genome analyses revealed that the strain possesses 3396 protein-coding sequences. Among them are 158 genes assigned to the carbohydrate-active-enzyme families as defined by the CAZy database, including 116 genes encoding glycosyl hydrolases (GHs) involved in pectin, arabinogalactan, hemicellulose (arabinan, xylan, mannan, β-glucans), starch, fructan and chitin degradation. The strain also features several transporter genes, some of which are located in polysaccharide utilization loci (PUL). PUL gene products are involved in glycan binding, transport and utilization at the cell surface. In the genome of strain M3/6T, 64 PUL are present, most of them in association with genes encoding carbohydrate-active enzymes. Accordingly, the strain was predicted to metabolize several sugars yielding carbon dioxide, hydrogen, acetate, formate, propionate and isovalerate as end-products of the fermentation process. Moreover, P. saccharofermentans str. M3/6T encodes extracellular and intracellular proteases and transporters predicted to be involved in protein and oligopeptide degradation. Comparative analyses between P. saccharofermentans str. M3/6T and its closest described relative P. acetatigenes str. DSM 18083T indicate that both strains share a similar metabolism regarding decomposition of complex carbohydrates and fermentation of sugars.
Antalis, T M; Clark, M A; Barnes, T; Lehrbach, P R; Devine, P L; Schevzov, G; Goss, N H; Stephens, R W; Tolstoshev, P
1988-02-01
Human monocyte-derived plasminogen activator inhibitor (mPAI-2) was purified to homogeneity from the U937 cell line and partially sequenced. Oligonucleotide probes derived from this sequence were used to screen a cDNA library prepared from U937 cells. One positive clone was sequenced and contained most of the coding sequence as well as a long incomplete 3' untranslated region (1112 base pairs). This cDNA sequence was shown to encode mPAI-2 by hybrid-select translation. A cDNA clone encoding the remainder of the mPAI-2 mRNA was obtained by primer extension of U937 poly(A)+ RNA using a probe complementary to the mPAI-2 coding region. The coding sequence for mPAI-2 was placed under the control of the lambda PL promoter, and the protein expressed in Escherichia coli formed a complex with urokinase that could be detected immunologically. By nucleotide sequence analysis, mPAI-2 cDNA encodes a protein containing 415 amino acids with a predicted unglycosylated Mr of 46,543. The predicted amino acid sequence of mPAI-2 is very similar to placental PAI-2 (3 amino acid differences) and shows extensive homology with members of the serine protease inhibitor (serpin) superfamily. mPAI-2 was found to be more homologous to ovalbumin (37%) than the endothelial plasminogen activator inhibitor, PAI-1 (26%). Like ovalbumin, mPAI-2 appears to have no typical amino-terminal signal sequence. The 3' untranslated region of the mPAI-2 cDNA contains a putative regulatory sequence that has been associated with the inflammatory mediators.
Antalis, T M; Clark, M A; Barnes, T; Lehrbach, P R; Devine, P L; Schevzov, G; Goss, N H; Stephens, R W; Tolstoshev, P
1988-01-01
Human monocyte-derived plasminogen activator inhibitor (mPAI-2) was purified to homogeneity from the U937 cell line and partially sequenced. Oligonucleotide probes derived from this sequence were used to screen a cDNA library prepared from U937 cells. One positive clone was sequenced and contained most of the coding sequence as well as a long incomplete 3' untranslated region (1112 base pairs). This cDNA sequence was shown to encode mPAI-2 by hybrid-select translation. A cDNA clone encoding the remainder of the mPAI-2 mRNA was obtained by primer extension of U937 poly(A)+ RNA using a probe complementary to the mPAI-2 coding region. The coding sequence for mPAI-2 was placed under the control of the lambda PL promoter, and the protein expressed in Escherichia coli formed a complex with urokinase that could be detected immunologically. By nucleotide sequence analysis, mPAI-2 cDNA encodes a protein containing 415 amino acids with a predicted unglycosylated Mr of 46,543. The predicted amino acid sequence of mPAI-2 is very similar to placental PAI-2 (3 amino acid differences) and shows extensive homology with members of the serine protease inhibitor (serpin) superfamily. mPAI-2 was found to be more homologous to ovalbumin (37%) than the endothelial plasminogen activator inhibitor, PAI-1 (26%). Like ovalbumin, mPAI-2 appears to have no typical amino-terminal signal sequence. The 3' untranslated region of the mPAI-2 cDNA contains a putative regulatory sequence that has been associated with the inflammatory mediators. PMID:3257578
Manolson, M F; Proteau, D; Preston, R A; Stenbit, A; Roberts, B T; Hoyt, M A; Preuss, D; Mulholland, J; Botstein, D; Jones, E W
1992-07-15
Yeast vacuolar acidification-defective (vph) mutants were identified using the pH-sensitive fluorescence of 6-carboxyfluorescein diacetate (Preston, R. A., Murphy, R. F., and Jones, E. W. (1989) Proc. Natl. Acad. Sci. U.S.A. 86, 7027-7031). Vacuoles purified from yeast bearing the vph1-1 mutation had no detectable bafilomycin-sensitive ATPase activity or ATP-dependent proton pumping. The peripherally bound nucleotide-binding subunits of the vacuolar H(+)-ATPase (60 and 69 kDa) were no longer associated with vacuolar membranes yet were present in wild type levels in yeast whole cell extracts. The VPH1 gene was cloned by complementation of the vph1-1 mutation and independently cloned by screening a lambda gt11 expression library with antibodies directed against a 95-kDa vacuolar integral membrane protein. Deletion disruption of the VPH1 gene revealed that the VPH1 gene is not essential for viability but is required for vacuolar H(+)-ATPase assembly and vacuolar acidification. VPH1 encodes a predicted polypeptide of 840 amino acid residues (molecular mass 95.6 kDa) and contains six putative membrane-spanning regions. Cell fractionation and immunodetection demonstrate that Vph1p is a vacuolar integral membrane protein that co-purifies with vacuolar H(+)-ATPase activity. Multiple sequence alignments show extensive homology over the entire lengths of the following four polypeptides: Vph1p, the 116-kDa polypeptide of the rat clathrin-coated vesicles/synaptic vesicle proton pump, the predicted polypeptide encoded by the yeast gene STV1 (Similar To VPH1, identified as an open reading frame next to the BUB2 gene), and the TJ6 mouse immune suppressor factor.
Chen, Saihua; Yang, Yi; Shi, Weiwei; Ji, Qing; He, Fei; Zhang, Ziding; Cheng, Zhukuan; Liu, Xiangnong; Xu, Mingliang
2008-01-01
In rice (Oryza sativa), the presence of a dominant Badh2 allele encoding betaine aldehyde dehydrogenase (BADH2) inhibits the synthesis of 2-acetyl-1-pyrroline (2AP), a potent flavor component in rice fragrance. By contrast, its two recessive alleles, badh2-E2 and badh2-E7, induce 2AP formation. Badh2 was found to be transcribed in all tissues tested except for roots, and the transcript was detected at higher abundance in young, healthy leaves than in other tissues. Multiple Badh2 transcript lengths were detected, and the complete, full-length Badh2 transcript was much less abundant than partial Badh2 transcripts. 2AP levels were significantly reduced in cauliflower mosaic virus 35S-driven transgenic lines expressing the complete, but not the partial, Badh2 coding sequences. In accordance, the intact, full-length BADH2 protein (503 residues) appeared exclusively in nonfragrant transgenic lines and rice varieties. These results indicate that the full-length BADH2 protein encoded by Badh2 renders rice nonfragrant by inhibiting 2AP biosynthesis. The BADH2 enzyme was predicted to contain three domains: NAD binding, substrate binding, and oligomerization domains. BADH2 was distributed throughout the cytoplasm, where it is predicted to catalyze the oxidization of betaine aldehyde, 4-aminobutyraldehyde (AB-ald), and 3-aminopropionaldehyde. The presence of null badh2 alleles resulted in AB-ald accumulation and enhanced 2AP biosynthesis. In summary, these data support the hypothesis that BADH2 inhibits 2AP biosynthesis by exhausting AB-ald, a presumed 2AP precursor. PMID:18599581
Identification and correction of abnormal, incomplete and mispredicted proteins in public databases.
Nagy, Alinda; Hegyi, Hédi; Farkas, Krisztina; Tordai, Hedvig; Kozma, Evelin; Bányai, László; Patthy, László
2008-08-27
Despite significant improvements in computational annotation of genomes, sequences of abnormal, incomplete or incorrectly predicted genes and proteins remain abundant in public databases. Since the majority of incomplete, abnormal or mispredicted entries are not annotated as such, these errors seriously affect the reliability of these databases. Here we describe the MisPred approach that may provide an efficient means for the quality control of databases. The current version of the MisPred approach uses five distinct routines for identifying abnormal, incomplete or mispredicted entries based on the principle that a sequence is likely to be incorrect if some of its features conflict with our current knowledge about protein-coding genes and proteins: (i) conflict between the predicted subcellular localization of proteins and the absence of the corresponding sequence signals; (ii) presence of extracellular and cytoplasmic domains and the absence of transmembrane segments; (iii) co-occurrence of extracellular and nuclear domains; (iv) violation of domain integrity; (v) chimeras encoded by two or more genes located on different chromosomes. Analyses of predicted EnsEMBL protein sequences of nine deuterostome (Homo sapiens, Mus musculus, Rattus norvegicus, Monodelphis domestica, Gallus gallus, Xenopus tropicalis, Fugu rubripes, Danio rerio and Ciona intestinalis) and two protostome species (Caenorhabditis elegans and Drosophila melanogaster) have revealed that the absence of expected signal peptides and violation of domain integrity account for the majority of mispredictions. Analyses of sequences predicted by NCBI's GNOMON annotation pipeline show that the rates of mispredictions are comparable to those of EnsEMBL. Interestingly, even the manually curated UniProtKB/Swiss-Prot dataset is contaminated with mispredicted or abnormal proteins, although to a much lesser extent than UniProtKB/TrEMBL or the EnsEMBL or GNOMON-predicted entries. 
MisPred works efficiently in identifying errors in predictions generated by the most reliable gene prediction tools such as the EnsEMBL and NCBI's GNOMON pipelines and also guides the correction of errors. We suggest that application of the MisPred approach will significantly improve the quality of gene predictions and the associated databases.
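The consistency rules listed in the abstract above lend themselves to a small illustration. The following is a minimal sketch of rules (ii) and (iii) only, assuming a hypothetical input format in which each predicted protein is described by the localization classes of its detected domains plus a flag for transmembrane segments. The function name, labels, and input shape are illustrative assumptions and are not taken from the MisPred software.

```python
def mispred_flags(domain_locations, has_transmembrane_segment):
    """Return the names of the consistency rules that a protein entry violates.

    domain_locations: iterable of localization classes of detected domains,
        e.g. {"extracellular", "cytoplasmic"} (hypothetical labels).
    has_transmembrane_segment: True if a transmembrane segment was predicted.
    """
    locations = set(domain_locations)
    flags = []
    # Rule (ii): extracellular and cytoplasmic domains in one protein
    # normally require a membrane-spanning segment between them.
    if {"extracellular", "cytoplasmic"} <= locations and not has_transmembrane_segment:
        flags.append("rule_ii")
    # Rule (iii): extracellular and nuclear domains are not expected together.
    if {"extracellular", "nuclear"} <= locations:
        flags.append("rule_iii")
    return flags
```

An entry that returns an empty list passes these two checks; it may still fail the other three routines, which need sequence-signal, domain-boundary, and chromosomal-location data that this sketch does not model.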
Rigoutsos, Isidore; Riek, Peter; Graham, Robert M.; Novotny, Jiri
2003-01-01
One of the promising methods of protein structure prediction involves the use of amino acid sequence-derived patterns. Here we report on the creation of non-degenerate motif descriptors derived through data mining of training sets of residues taken from the transmembrane-spanning segments of polytopic proteins. These residues correspond to short regions in which there is a deviation from the regular α-helical character (i.e. π-helices, 3₁₀-helices and kinks). A ‘search engine’ derived from these motif descriptors correctly identifies, and discriminates amongst instances of the above ‘non-canonical’ helical motifs contained in the SwissProt/TrEMBL database of protein primary structures. Our results suggest that deviations from α-helicity are encoded locally in sequence patterns only about 7–9 residues long and can be determined in silico directly from the amino acid sequence. Delineation of such variations in helical habit is critical to understanding the complex structure–function relationships of polytopic proteins and for drug discovery. The success of our current methodology foretells development of similar prediction tools capable of identifying other structural motifs from sequence alone. The method described here has been implemented and is available on the World Wide Web at http://cbcsrv.watson.ibm.com/Ttkw.html. PMID:12888523
Rigoutsos, Isidore; Riek, Peter; Graham, Robert M; Novotny, Jiri
2003-08-01
One of the promising methods of protein structure prediction involves the use of amino acid sequence-derived patterns. Here we report on the creation of non-degenerate motif descriptors derived through data mining of training sets of residues taken from the transmembrane-spanning segments of polytopic proteins. These residues correspond to short regions in which there is a deviation from the regular alpha-helical character (i.e. pi-helices, 3(10)-helices and kinks). A 'search engine' derived from these motif descriptors correctly identifies, and discriminates amongst instances of the above 'non-canonical' helical motifs contained in the SwissProt/TrEMBL database of protein primary structures. Our results suggest that deviations from alpha-helicity are encoded locally in sequence patterns only about 7-9 residues long and can be determined in silico directly from the amino acid sequence. Delineation of such variations in helical habit is critical to understanding the complex structure-function relationships of polytopic proteins and for drug discovery. The success of our current methodology foretells development of similar prediction tools capable of identifying other structural motifs from sequence alone. The method described here has been implemented and is available on the World Wide Web at http://cbcsrv.watson.ibm.com/Ttkw.html.
Zhang, Jian; Gao, Bo; Chai, Haiting; Ma, Zhiqiang; Yang, Guifu
2016-08-26
DNA-binding proteins (DBPs) play fundamental roles in many biological processes. Therefore, the development of effective computational tools for identifying DBPs is becoming highly desirable. In this study, we proposed an accurate method for the prediction of DBPs. Firstly, we focused on the challenge of improving DBP prediction accuracy with information solely from the sequence. Secondly, we used multiple informative features to encode the protein. These features included evolutionary conservation profile, secondary structure motifs, and physicochemical properties. Thirdly, we introduced a novel improved Binary Firefly Algorithm (BFA) to remove redundant or noisy features as well as select optimal parameters for the classifier. On two benchmark datasets, our predictor outperformed many state-of-the-art predictors, which demonstrated the effectiveness of our method. The promising prediction performance on a newly compiled independent testing dataset from PDB and a large-scale dataset from UniProt demonstrated the good generalization ability of our method. In addition, the BFA developed in this research has great potential for practical applications in optimization fields, especially in feature selection problems. A highly accurate method was proposed for the identification of DBPs. A user-friendly web-server named iDbP (identification of DNA-binding Proteins) was constructed and provided for academic use.
Sequence similarity is more relevant than species specificity in probabilistic backtranslation.
Ferro, Alfredo; Giugno, Rosalba; Pigola, Giuseppe; Pulvirenti, Alfredo; Di Pietro, Cinzia; Purrello, Michele; Ragusa, Marco
2007-02-21
Backtranslation is the process of decoding a sequence of amino acids into the corresponding codons. All synthetic gene design systems include a backtranslation module. The degeneracy of the genetic code makes backtranslation potentially ambiguous since most amino acids are encoded by multiple codons. The common approach to overcome this difficulty is based on imitation of codon usage within the target species. This paper describes EasyBack, a new parameter-free, fully-automated software for backtranslation using Hidden Markov Models. EasyBack is not based on imitation of codon usage within the target species, but instead uses a sequence-similarity criterion. The model is trained with a set of proteins with known cDNA coding sequences, constructed from the input protein by querying the NCBI databases with BLAST. Unlike existing software, the proposed method allows the quality of prediction to be estimated. When tested on a group of proteins that show different degrees of sequence conservation, EasyBack outperforms other published methods in terms of precision. The prediction quality of a protein backtranslation method is markedly increased by replacing the criterion of most used codon in the same species with a Hidden Markov Model trained with a set of most similar sequences from all species. Moreover, the proposed method allows the quality of prediction to be estimated probabilistically.
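To make the ambiguity concrete, the sketch below implements the "most used codon" baseline that the abstract contrasts with EasyBack; it is not the EasyBack Hidden Markov Model. The codon-to-amino-acid assignments follow the standard genetic code, but the usage counts are illustrative placeholders rather than measured data, the table covers only four amino acids, and the function names are hypothetical.

```python
# Toy codon-usage counts: placeholder numbers, NOT measured frequencies.
# Only four amino acids are covered to keep the example short.
CODON_USAGE = {
    "M": {"ATG": 100},
    "K": {"AAA": 70, "AAG": 30},
    "F": {"TTT": 40, "TTC": 60},
    "W": {"TGG": 100},
}


def backtranslate_most_used(protein, usage=CODON_USAGE):
    """Backtranslate by picking the most frequent codon for each residue."""
    codons = []
    for residue in protein:
        if residue not in usage:
            raise KeyError(f"no codon usage entry for residue {residue!r}")
        table = usage[residue]
        codons.append(max(table, key=table.get))
    return "".join(codons)


def count_encodings(protein, usage=CODON_USAGE):
    """Count the distinct DNA sequences that encode the protein under this table."""
    total = 1
    for residue in protein:
        total *= len(usage[residue])
    return total
```

With this toy table, `backtranslate_most_used("MKFW")` yields `"ATGAAATTCTGG"`, while `count_encodings("MKFW")` reports 4 candidate coding sequences. The baseline always returns the same one of them and gives no estimate of how reliable that choice is, which is the gap the abstract says a probabilistic model addresses.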
Predicting the Impact of Alternative Splicing on Plant MADS Domain Protein Function
Severing, Edouard I.; van Dijk, Aalt D. J.; Morabito, Giuseppa; Busscher-Lange, Jacqueline; Immink, Richard G. H.; van Ham, Roeland C. H. J.
2012-01-01
Several genome-wide studies demonstrated that alternative splicing (AS) significantly increases the transcriptome complexity in plants. However, the impact of AS on the functional diversity of proteins is difficult to assess using genome-wide approaches. The availability of detailed sequence annotations for specific genes and gene families allows for a more detailed assessment of the potential effect of AS on their function. One example is the plant MADS-domain transcription factor family, members of which interact to form protein complexes that function in transcription regulation. Here, we perform an in silico analysis of the potential impact of AS on the protein-protein interaction capabilities of MIKC-type MADS-domain proteins. We first confirmed the expression of transcript isoforms resulting from predicted AS events. Expressed transcript isoforms were considered functional if they were likely to be translated and if their corresponding AS events either had an effect on predicted dimerisation motifs or occurred in regions known to be involved in multimeric complex formation, or otherwise, if their effect was conserved in different species. Nine out of twelve MIKC MADS-box genes predicted to produce multiple protein isoforms harbored putative functional AS events according to those criteria. AS events with conserved effects were only found at the borders of or within the K-box domain. We illustrate how AS can contribute to the evolution of interaction networks through an example of selective inclusion of a recently evolved interaction motif in the MADS AFFECTING FLOWERING1-3 (MAF1–3) subclade. Furthermore, we demonstrate the potential effect of an AS event in SHORT VEGETATIVE PHASE (SVP), resulting in the deletion of a short sequence stretch including a predicted interaction motif, by overexpression of the fully spliced and the alternatively spliced SVP transcripts. 
For most of the AS events we were able to formulate hypotheses about the potential impact on the interaction capabilities of the encoded MIKC proteins. PMID:22295091
Botha, M; Pesce, E-R; Blatch, G L
2007-01-01
Extensive structural and functional remodelling of Plasmodium falciparum (malaria)-infected erythrocytes follows the export of a range of proteins of parasite origin (exportome) across the parasitophorous vacuole into the host erythrocyte. The genome of P. falciparum encodes a diverse chaperone complement including at least 43 members of the heat shock protein 40kDa (Hsp40) family, and six members of the heat shock protein 70kDa (Hsp70) family. Nearly half of the Hsp40 proteins of P. falciparum are predicted to contain a PEXEL/HT (Plasmodium export element/host targeting signal) sequence motif, and hence are likely to be part of the exportome. In this review we critically evaluate the classification, sequence similarity and clustering, and possible interactors of the P. falciparum Hsp40 chaperone machinery. In addition to the types I, II and III Hsp40 proteins all exhibiting the signature J-domain, the P. falciparum genome also encodes a number of specialized Hsp40 proteins with a J-like domain, which we have categorized as type IV Hsp40 proteins. Analysis of the potential P. falciparum Hsp40 protein interaction network revealed connections predominantly with cytoskeletal and membrane proteins, transcriptional machinery, DNA repair and replication machinery, translational machinery, the proteasome and proteolytic enzymes, and enzymes involved in cellular physiology. Comparison of the Hsp40 proteins of P. falciparum to those of other apicomplexa reveals that most of the proteins (especially the PEXEL/HT-containing proteins) are unique to P. falciparum. Furthermore, very few of the P. falciparum Hsp40 proteins have human homologs, except for those proteins implicated in fundamental biological processes.
Our analysis suggests that P. falciparum has evolved an expanded and specialized Hsp40 protein machinery to enable it successfully to invade and remodel the human erythrocyte, and we propose a model in which these proteins are involved in chaperone-mediated translocation, folding, assembly and regulation of parasite and host proteins.
Plant, Ewan P.; Rakauskaitė, Rasa; Taylor, Deborah R.; Dinman, Jonathan D.
2010-01-01
In retroviruses and the double-stranded RNA totiviruses, the efficiency of programmed −1 ribosomal frameshifting (−1 PRF) is critical for ensuring the proper ratios of upstream-encoded capsid proteins to downstream-encoded replicase enzymes. The genomic organizations of many other frameshifting viruses, including the coronaviruses, are very different, in that their upstream open reading frames encode nonstructural proteins, the frameshift-dependent downstream open reading frames encode enzymes involved in transcription and replication, and their structural proteins are encoded by subgenomic mRNAs. The biological significance of frameshifting efficiency, and how the relative ratios of proteins encoded by the upstream and downstream open reading frames affect virus propagation, have not previously been explored. Here, three different strategies were employed to test the hypothesis that the −1 PRF signals of coronaviruses have evolved to produce the correct ratios of upstream- to downstream-encoded proteins. First, infectious clones of the severe acute respiratory syndrome (SARS)-associated coronavirus harboring mutations that lower frameshift efficiency decreased infectivity by >4 orders of magnitude. Second, a series of frameshift-promoting mRNA pseudoknot mutants was employed to demonstrate that the frameshift signals of the SARS-associated coronavirus and mouse hepatitis virus have evolved to promote optimal frameshift efficiencies. Finally, we show that a previously described frameshift attenuator element does not actually affect frameshifting per se but rather serves to limit the fraction of ribosomes available for frameshifting. The findings of these analyses all support a “golden mean” model in which viruses use both programmed ribosomal frameshifting and translational attenuation to control the relative ratios of their encoded proteins. PMID:20164235
SVM-Fold: a tool for discriminative multi-class protein fold and superfamily recognition
Melvin, Iain; Ie, Eugene; Kuang, Rui; Weston, Jason; Stafford, William Noble; Leslie, Christina
2007-01-01
Background Predicting a protein's structural class from its amino acid sequence is a fundamental problem in computational biology. Much recent work has focused on developing new representations for protein sequences, called string kernels, for use with support vector machine (SVM) classifiers. However, while some of these approaches exhibit state-of-the-art performance at the binary protein classification problem, i.e. discriminating between a particular protein class and all other classes, few of these studies have addressed the real problem of multi-class superfamily or fold recognition. Moreover, there are only limited software tools and systems for SVM-based protein classification available to the bioinformatics community. Results We present a new multi-class SVM-based protein fold and superfamily recognition system and web server called SVM-Fold, which can be found at . Our system uses an efficient implementation of a state-of-the-art string kernel for sequence profiles, called the profile kernel, where the underlying feature representation is a histogram of inexact matching k-mer frequencies. We also employ a novel machine learning approach to solve the difficult multi-class problem of classifying a sequence of amino acids into one of many known protein structural classes. Binary one-vs-the-rest SVM classifiers that are trained to recognize individual structural classes yield prediction scores that are not comparable, so that standard "one-vs-all" classification fails to perform well. Moreover, SVMs for classes at different levels of the protein structural hierarchy may make useful predictions, but one-vs-all does not try to combine these multiple predictions. To deal with these problems, our method learns relative weights between one-vs-the-rest classifiers and encodes information about the protein structural hierarchy for multi-class prediction. 
In large-scale benchmark results based on the SCOP database, our code weighting approach significantly improves on the standard one-vs-all method for both the superfamily and fold prediction in the remote homology setting and on the fold recognition problem. Moreover, our code weight learning algorithm strongly outperforms nearest-neighbor methods based on PSI-BLAST in terms of prediction accuracy on every structure classification problem we consider. Conclusion By combining state-of-the-art SVM kernel methods with a novel multi-class algorithm, the SVM-Fold system delivers efficient and accurate protein fold and superfamily recognition. PMID:17570145
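The point that raw one-vs-the-rest scores are not directly comparable can be illustrated with a deliberately simplified sketch. It applies one multiplicative weight per class before taking the maximum; the SVM-Fold method described above learns its weights and also encodes the structural hierarchy, neither of which is modeled here. The class names, scores, and weights are hypothetical.

```python
def predict_class(scores, weights=None):
    """Pick the class with the largest (optionally weighted) one-vs-rest score.

    scores: dict mapping class name -> raw decision value of its binary classifier.
    weights: dict mapping class name -> relative weight; missing classes default to 1.0.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    weights = weights or {}
    return max(scores, key=lambda name: weights.get(name, 1.0) * scores[name])


# Hypothetical example: the classifier for 'fold_a' outputs systematically
# larger values, so the unweighted argmax favours it.
raw_scores = {"fold_a": 2.0, "fold_b": 1.0}
learned_weights = {"fold_a": 0.4, "fold_b": 1.0}  # placeholder values
```

Here the unweighted call `predict_class(raw_scores)` returns `"fold_a"`, whereas `predict_class(raw_scores, learned_weights)` returns `"fold_b"` because 0.4 × 2.0 is smaller than 1.0 × 1.0. In practice the weights would be fitted on held-out data rather than set by hand.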
Medema, Marnix H; Zhou, Miaomiao; van Hijum, Sacha A F T; Gloerich, Jolein; Wessels, Hans J C T; Siezen, Roland J; Strous, Marc
2010-05-12
Anaerobic ammonium-oxidizing (anammox) bacteria perform a key step in global nitrogen cycling. These bacteria make use of an organelle to oxidize ammonia anaerobically to nitrogen (N2) and so contribute approximately 50% of the nitrogen in the atmosphere. It is currently unknown which proteins constitute the organellar proteome and how anammox bacteria are able to specifically target organellar and cell-envelope proteins to their correct final destinations. Experimental approaches are complicated by the absence of pure cultures and genetic accessibility. However, the genome of the anammox bacterium Candidatus "Kuenenia stuttgartiensis" has recently been sequenced. Here, we make use of these genome data to predict the organellar sub-proteome and address the molecular basis of protein sorting in anammox bacteria. Two training sets representing organellar (30 proteins) and cell envelope (59 proteins) proteins were constructed based on previous experimental evidence and comparative genomics. Random forest (RF) classifiers trained on these two sets could differentiate between organellar and cell envelope proteins with ~89% accuracy using 400 features consisting of frequencies of two adjacent amino acid combinations. A physicochemically distinct organellar sub-proteome containing 562 proteins was predicted with the best RF classifier. This set included almost all catabolic and respiratory factors encoded in the genome. Apparently, the cytoplasmic membrane performs no catabolic functions. We predict that the Tat-translocation system is located exclusively in the organellar membrane, whereas the Sec-translocation system is located on both the organellar and cytoplasmic membranes. Canonical signal peptides were predicted and validated experimentally, but a specific (N- or C-terminal) signal that could be used for protein targeting to the organelle remained elusive. A physicochemically distinct organellar sub-proteome was predicted from the genome of the anammox bacterium K. stuttgartiensis. This result provides strong in silico support for the existing experimental evidence for the existence of an organelle in this bacterium, and is an important step forward in unravelling a geochemically relevant case of cytoplasmic differentiation in bacteria. The predicted dual location of the Sec-translocation system and the apparent absence of a specific N- or C-terminal signal in the organellar proteins suggest that additional chaperones may be necessary that act on an as-yet unknown property of the targeted proteins.
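The 400 features mentioned in the abstract above correspond to the 20 x 20 possible pairs of adjacent standard amino acids. A minimal sketch of how such dipeptide frequencies could be computed is given here; the function name, the handling of non-standard residues, and the toy sequence are assumptions for illustration and are not taken from the study.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
# All ordered pairs of residues: 20 * 20 = 400 feature names.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_frequencies(sequence):
    """Return 400 relative frequencies of adjacent residue pairs.

    Pairs containing a non-standard residue (e.g. X) are skipped;
    this is an assumption of the sketch, not a documented choice.
    """
    sequence = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[p] / total for p in DIPEPTIDES]


# Hypothetical toy sequence, used only to show the shape of the output.
features = dipeptide_frequencies("MKKLLAVAAL")
```

A vector like `features` (one per protein) is the kind of input a random forest classifier could be trained on; the classifier itself is omitted here because the abstract does not describe its settings.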
Wilton, Brianne A.; Campbell, Stephanie; Van Buuren, Nicholas; Garneau, Robyn; Furukawa, Manabu; Xiong, Yue; Barry, Michele
2008-01-01
Cellular proteins containing BTB and kelch domains have been shown to function as adapters for the recruitment of substrates to cullin-3-based ubiquitin ligases. Poxviruses are the only family of viruses known to encode multiple BTB/kelch proteins, suggesting that poxviruses may modulate the ubiquitin pathway through interaction with cullin-3. Ectromelia virus encodes four BTB/kelch proteins and one BTB-only protein. Here we demonstrate that two of the ectromelia virus encoded BTB/kelch proteins, EVM150 and EVM167, interacted with cullin-3. Similar to cellular BTB proteins, the BTB domain of EVM150 and EVM167 was necessary and sufficient for cullin-3 interaction. During infection, EVM150 and EVM167 localized to discrete cytoplasmic regions, which co-localized with cullin-3. Furthermore, EVM150 and EVM167 co-localized and interacted with conjugated ubiquitin, as demonstrated by confocal microscopy and co-immunoprecipitation. Our findings suggest that the ectromelia virus encoded BTB/kelch proteins, EVM150 and EVM167, interact with cullin-3 potentially functioning to recruit unidentified substrates for ubiquitination. PMID:18221766
Cheng, Chao; Ung, Matthew; Grant, Gavin D.; Whitfield, Michael L.
2013-01-01
The cell cycle is a complex and highly supervised process that must proceed with regulatory precision to achieve successful cellular division. Despite their wide application, microarray time course experiments have several limitations in identifying cell cycle genes. We thus propose a computational model to predict human cell cycle genes based on transcription factor (TF) binding and regulatory motif information in their promoters. We utilize ENCODE ChIP-seq data and motif information as predictors to discriminate cell cycle from non-cell cycle genes. Our results show that both the trans- TF features and the cis- motif features are predictive of cell cycle genes, and a combination of the two types of features can further improve prediction accuracy. We apply our model to a complete list of GENCODE promoters to predict novel cell cycle driving promoters for both protein-coding genes and non-coding RNAs such as lincRNAs. We find that a similar percentage of lincRNAs are cell cycle regulated as protein-coding genes, suggesting the importance of non-coding RNAs in cell cycle division. The model we propose here provides not only a practical tool for identifying novel cell cycle genes with high accuracy, but also new insights on cell cycle regulation by TFs and cis-regulatory elements. PMID:23874175
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jou, Y.S.; Myers, R.M.
1994-09-01
Huntington disease (HD) appears to be caused by a mutation that results in an expanded number of CAG repeats at the 5' end of the gene. The nucleotide sequence of the gene and cDNA clones predicts a 347 kDa protein that contains a stretch of polyglutamine, encoded by the CAG repeat, located 17 amino acids downstream from the proposed translation initiation site. Because understanding the mechanisms of the pathology of HD depends on whether the CAG repeat is expressed in the protein, we used antibodies directed against portions of the predicted HD gene product to probe the structure of the protein in tissue culture cells. Two peptides, one located amino-terminal to the proposed polyglutamine stretch (hd1 peptide FESLKSFQQ from amino acids 11-19) and one located in the carboxy-terminal half of the predicted protein (hd2 peptide QQPRNKPLK from amino acids 2531-2539), were used to elicit polyclonal antibodies in NZW rabbits. We affinity-purified the antibodies and used them to analyze the HD protein. Both antisera specifically recognize the peptides used to elicit them, as well as the appropriate portions of the HD protein expressed in E. coli. Western blot analysis showed that both antisera recognize a protein with an apparent molecular weight of approximately 350,000 in human, monkey, rat and mouse cell lines, including two neuronal cell lines. These results, in combination with immunoprecipitation experiments, suggest strongly that the proposed polyglutamine stretch is indeed translated in the HD protein and is evolutionarily conserved in various mammalian species.
Kafkas, Alexandros; Montaldi, Daniela
2011-10-01
Thirty-five healthy participants incidentally encoded a set of man-made and natural object pictures, while their pupil response and eye movements were recorded. At retrieval, studied and new stimuli were rated as novel, familiar (strong, moderate, or weak), or recollected. We found that both pupil response and fixation patterns at encoding predict later recognition memory strength. The extent of pupillary response accompanying incidental encoding was found to be predictive of subsequent memory. In addition, the number of fixations was also predictive of later recognition memory strength, suggesting that the accumulation of greater visual detail, even for single objects, is critical for the creation of a strong memory. Moreover, fixation patterns at encoding distinguished between recollection and familiarity at retrieval, with more dispersed fixations predicting familiarity and more clustered fixations predicting recollection. These data reveal close links between the autonomic control of pupil responses and eye movement patterns on the one hand and memory encoding on the other. Moreover, the data illustrate quantitative as well as qualitative differences in the incidental visual processing of stimuli, which are differentially predictive of the strength and the kind of memory experienced at recognition.
Foreman, Pamela [Los Altos, CA]; Goedegebuur, Frits [Vlaardingen, NL]; Van Solingen, Pieter [Naaldwijk, NL]; Ward, Michael [San Francisco, CA]
2012-06-19
Described herein are novel gene sequences isolated from Trichoderma reesei. Two genes encoding proteins comprising a cellulose binding domain, one encoding an arabinofuranosidase and one encoding an acetylxylan esterase, are described. The sequences, CIP1 and CIP2, contain a cellulose binding domain. These proteins are especially useful in the textile and detergent industries and in the pulp and paper industry.
2006-07-01
ATM genetic variant identified affects radiosensitivity and levels of the protein encoded by the ATM gene for each mutation examined. 15. SUBJECT...women without breast cancer. An additional objective is to determine the functional impact upon the protein encoded by the ATM gene for each mutation ...each ATM variant identified affects radiosensitivity and levels of the protein encoded by the ATM gene for mutations identified. Body STATEMENT
A tick B-cell inhibitory protein from salivary glands of the hard tick, Hyalomma asiaticum asiaticum
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yu Da; Department of Life Science and Technology, Changshu Institute of Technology, Changshu 215500; Liang Jiangguo
2006-05-05
Studies to date suggest that a B-cell inhibitory factor occurs in tick saliva. In this study, a novel protein having B-cell inhibitory activity was purified and characterized from the salivary glands of the hard tick, Hyalomma asiaticum asiaticum. This protein was named B-cell inhibitory factor (BIF). The cDNA encoding BIF was cloned by cDNA library screening. The predicted protein from the cDNA sequence is composed of 138 amino acids including the mature BIF. No similarity was found by BLAST search. The lipopolysaccharide-induced B-cell proliferation was inhibited by BIF. This is the first report of the identification and characterization of a B-cell inhibitory protein from a tick. The current study facilitates investigation of the interactions among the tick, Borrelia burgdorferi (the causative agent of Lyme disease), and the host.
Proteochemometric model for predicting the inhibition of penicillin-binding proteins
NASA Astrophysics Data System (ADS)
Nabu, Sunanta; Nantasenamat, Chanin; Owasirikul, Wiwat; Lawung, Ratana; Isarankura-Na-Ayudhya, Chartchalerm; Lapins, Maris; Wikberg, Jarl E. S.; Prachayasittikul, Virapong
2015-02-01
Neisseria gonorrhoeae infection threatens to become an untreatable sexually transmitted disease in the near future owing to the increasing emergence of N. gonorrhoeae strains with reduced susceptibility and resistance to the extended-spectrum cephalosporins (ESCs), i.e. ceftriaxone and cefixime, which are the last remaining option for first-line treatment of gonorrhea. Alteration of the penA gene, encoding penicillin-binding protein 2 (PBP2), is the main mechanism conferring penicillin resistance including reduced susceptibility and resistance to ESCs. To predict and investigate putative amino acid mutations causing β-lactam resistance particularly for ESCs, we applied proteochemometric modeling to generalize N. gonorrhoeae susceptibility data for predicting the interaction of PBP2 with therapeutic β-lactam antibiotics. This was afforded by correlating publicly available data on antimicrobial susceptibility of wild-type and mutant N. gonorrhoeae strains for penicillin-G, cefixime and ceftriaxone with 50 PBP2 protein sequence data using partial least-squares projections to latent structures. The generated model revealed excellent predictability (R² = 0.91, Q² = 0.77, Q²ext = 0.78). Moreover, our model identified amino acid mutations in PBP2 with the highest impact on antimicrobial susceptibility and provided information on physicochemical properties of amino acid mutations affecting antimicrobial susceptibility. Our model thus provided insight into the physicochemical basis for resistance development in PBP2 suggesting its use for predicting and monitoring novel PBP2 mutations that may emerge in the future.
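The statistics quoted in the abstract above are conventionally read as goodness of fit on the training data (R squared) and predictive ability on held-out data (Q squared). The sketch below illustrates that distinction under stated assumptions: it swaps in a one-variable least-squares line for the partial least-squares model used in the study, uses leave-one-out cross-validation, and all function names are hypothetical. Definitions of Q squared vary slightly between software packages, so this is one common form and not necessarily the one the authors used.

```python
def r_squared(observed, predicted):
    """1 - SS_res / SS_tot, computed against the mean of the observed values."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def q_squared_loo(x, y):
    """Leave-one-out Q squared: each point is predicted by a model fit without it."""
    held_out = []
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        held_out.append(slope * x[i] + intercept)
    return r_squared(y, held_out)


# Hypothetical toy data, not taken from the study.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 1.9, 3.2, 3.8, 5.1]
slope, intercept = fit_line(x, y)
r2 = r_squared(y, [slope * xi + intercept for xi in x])
q2 = q_squared_loo(x, y)
```

For a least-squares fit, the leave-one-out value cannot exceed the training value, which matches the pattern in the abstract where Q squared is lower than R squared.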
Merkx-Jacques, Alexandra; Coors, Anja; Brousseau, Roland; Masson, Luke; Mazza, Alberto; Tien, Yuan-Ching; Topp, Edward
2013-04-01
The detection and abundance of Escherichia coli in water is used to monitor and mandate the quality of drinking and recreational water. Distinguishing commensal waterborne E. coli isolates from those that cause diarrhea or extraintestinal disease in humans is important for quantifying human health risk. A DNA microarray was used to evaluate the distribution of virulence genes in 148 E. coli environmental isolates from a watershed in eastern Ontario, Canada, and in eight clinical isolates. Their pathogenic potential was evaluated with Caenorhabditis elegans, and the concordance between the bioassay result and the pathotype deduced by genotyping was explored. Isolates identified as potentially pathogenic on the basis of their complement of virulence genes were significantly more likely to be pathogenic to C. elegans than those determined to be potentially nonpathogenic. A number of isolates that were identified as nonpathogenic on the basis of genotyping were pathogenic in the infection assay, suggesting that genotyping did not capture all potentially pathogenic types. The detection of sfaD, focA, and focG, which encode adhesins; of iroN2, which encodes a siderophore receptor; of pic, which encodes an autotransporter protein; and of b1432, which encodes a putative transposase, was significantly associated with pathogenicity in the infection assay. Overall, E. coli isolates predicted to be pathogenic on the basis of genotyping were indeed so in the C. elegans infection assay. Furthermore, the detection of C. elegans-infective environmental isolates predicted to be nonpathogenic on the basis of genotyping suggests that there are hitherto-unrecognized virulence factors or combinations thereof that are important in the establishment of infection.
Secretome Analysis of Vibrio cholerae Type VI Secretion System Reveals a New Effector-Immunity Pair
Altindis, Emrah; Dong, Tao; Catalano, Christy
2015-01-01
The type VI secretion system (T6SS) is a dynamic macromolecular organelle that many Gram-negative bacteria use to inhibit or kill other prokaryotic or eukaryotic cells. The toxic effectors of T6SS are delivered to the prey cells in a contact-dependent manner. In Vibrio cholerae, the etiologic agent of cholera, T6SS is active during intestinal infection. Here, we describe the use of comparative proteomics coupled with bioinformatics to identify a new T6SS effector-immunity pair. This analysis was able to identify all previously identified secreted substrates of T6SS except PAAR (proline, alanine, alanine, arginine) motif-containing proteins. Additionally, this approach led to the identification of a new secreted protein encoded by VCA0285 (TseH) that carries a predicted hydrolase domain. We confirmed that TseH is toxic when expressed in the periplasm of Escherichia coli and V. cholerae cells. The toxicity observed in V. cholerae was suppressed by coexpression of the protein encoded by VCA0286 (TsiH), indicating that this protein is the cognate immunity protein of TseH. Furthermore, exogenous addition of purified recombinant TseH to permeabilized E. coli cells caused cell lysis. Bioinformatics analysis of the TseH protein sequence suggests that it is a member of a new family of cell wall-degrading enzymes that include proteins belonging to the YD repeat and Rhs superfamilies and that orthologs of TseH are likely expressed by species belonging to phyla as diverse as Bacteroidetes and Proteobacteria. PMID:25759499
Intracellular Localization of Arabidopsis Sulfurtransferases
Bauer, Michael; Dietrich, Christof; Nowak, Katharina; Sierralta, Walter D.; Papenbrock, Jutta
2004-01-01
Sulfurtransferases (Str) comprise a group of enzymes widely distributed in archaea, eubacteria, and eukaryota which catalyze the transfer of a sulfur atom from suitable sulfur donors to nucleophilic sulfur acceptors. In all organisms analyzed to date, small gene families encoding Str proteins have been identified. The gene products were localized to different compartments of the cells. Our interest concerns the localization of Str proteins encoded in the nuclear genome of Arabidopsis. Computer-based prediction methods revealed localization in different compartments of the cell for six putative AtStrs. Several methods were used to determine the localization of the AtStr proteins experimentally. For AtStr1, a mitochondrial localization was demonstrated by immunodetection in the proteome of isolated mitochondria resolved by one- and two-dimensional gel electrophoresis and subsequent blotting. The respective mature AtStr1 protein was identified by mass spectrometry sequencing. The same result was obtained by transient expression of fusion constructs with the green fluorescent protein in Arabidopsis protoplasts, whereas AtStr2 was exclusively localized to the cytoplasm by this method. Three members of the single-domain AtStr were localized in the chloroplasts as demonstrated by transient expression of green fluorescent protein fusions in protoplasts and stomata, whereas the single-domain AtStr18 was shown to be cytoplasmic. The remarkable subcellular distribution of AtStr15 was additionally analyzed by transmission electron immunomicroscopy using a monospecific antibody against green fluorescent protein, indicating an attachment to the thylakoid membrane. The knowledge of the intracellular localization of the members of this multiprotein family will help elucidate their specific functions in the organism. PMID:15181206
Intracellular localization of Arabidopsis sulfurtransferases.
Bauer, Michael; Dietrich, Christof; Nowak, Katharina; Sierralta, Walter D; Papenbrock, Jutta
2004-06-01
Sulfurtransferases (Str) comprise a group of enzymes widely distributed in archaea, eubacteria, and eukaryota which catalyze the transfer of a sulfur atom from suitable sulfur donors to nucleophilic sulfur acceptors. In all organisms analyzed to date, small gene families encoding Str proteins have been identified. The gene products were localized to different compartments of the cells. Our interest concerns the localization of Str proteins encoded in the nuclear genome of Arabidopsis. Computer-based prediction methods revealed localization in different compartments of the cell for six putative AtStrs. Several methods were used to determine the localization of the AtStr proteins experimentally. For AtStr1, a mitochondrial localization was demonstrated by immunodetection in the proteome of isolated mitochondria resolved by one- and two-dimensional gel electrophoresis and subsequent blotting. The respective mature AtStr1 protein was identified by mass spectrometry sequencing. The same result was obtained by transient expression of fusion constructs with the green fluorescent protein in Arabidopsis protoplasts, whereas AtStr2 was exclusively localized to the cytoplasm by this method. Three members of the single-domain AtStr were localized in the chloroplasts as demonstrated by transient expression of green fluorescent protein fusions in protoplasts and stomata, whereas the single-domain AtStr18 was shown to be cytoplasmic. The remarkable subcellular distribution of AtStr15 was additionally analyzed by transmission electron immunomicroscopy using a monospecific antibody against green fluorescent protein, indicating an attachment to the thylakoid membrane. The knowledge of the intracellular localization of the members of this multiprotein family will help elucidate their specific functions in the organism.
On the role of PDZ domain-encoding genes in Drosophila border cell migration.
Aranjuez, George; Kudlaty, Elizabeth; Longworth, Michelle S; McDonald, Jocelyn A
2012-11-01
Cells often move as collective groups during normal embryonic development and wound healing, although the mechanisms governing this type of migration are poorly understood. The Drosophila melanogaster border cells migrate as a cluster during late oogenesis and serve as a powerful in vivo genetic model for collective cell migration. To discover new genes that participate in border cell migration, 64 out of 66 genes that encode PDZ domain-containing proteins were systematically targeted by in vivo RNAi knockdown. The PDZ domain is one of the largest families of protein-protein interaction domains found in eukaryotes. Proteins that contain PDZ domains participate in a variety of biological processes, including signal transduction and establishment of epithelial apical-basal polarity. Targeting PDZ proteins effectively assesses a larger number of genes via the protein complexes and pathways through which these proteins function. par-6, a known regulator of border cell migration, was a positive hit and thus validated the approach. Knockdown of 14 PDZ domain genes disrupted migration with multiple RNAi lines. The candidate genes have diverse predicted cellular functions and are anticipated to provide new insights into the mechanisms that control border cell movement. As a test of this concept, two genes that disrupted migration were characterized in more detail: big bang and the Dlg5 homolog CG6509. We present evidence that Big bang regulates JAK/STAT signaling, whereas Dlg5/CG6509 maintains cluster cohesion. Moreover, these results demonstrate that targeting a selected class of genes by RNAi can uncover novel regulators of collective cell migration.
Kim, Myoung-Ju; Choi, Jin-Won; Park, Seung-Moon; Cha, Byeong-Jin; Yang, Moon-Sik; Kim, Dae-Hyuk
2002-08-01
The chestnut blight fungus Cryphonectria parasitica and its hypovirus comprise a useful model system to study the mechanisms of hypoviral infection. We used degenerate primers based on fungal protein kinases to isolate a gene, cppk1, which encodes a novel Ser/Thr protein kinase of C. parasitica. The gene showed highest homology to ptk1, a Ser/Thr protein kinase from Trichoderma reesei. The encoded protein had a predicted mass of 70.5 kDa and a pI of 7.45. Northern blot analyses revealed that the cppk1 transcript was expressed from the beginning of culture, with a slight increase by 5 days of culture. However, its expression was specifically affected by the presence of virus, and it was transcriptionally upregulated in the fungal strain infected with the hypovirus. A kinase assay using Escherichia coli-derived CpPK1 revealed CpPK1-specific phosphorylated proteins with estimated masses of 50 kDa and 44 kDa. In addition, the phosphorylation of both proteins was higher in a cell-free extract from the hypovirulent strain. The increased expression of cppk1 by the introduction of an additional copy results in a subset of viral symptoms of reduced pigmentation and conidiation in a virus-free isolate. cppk1 overexpression also causes the downregulation of mating factor genes Mf2/1 and Mf2/2, resulting in female sterility. The present study suggests that the hypovirus disturbs fungal signalling by transcriptional upregulation of cppk1, which results in reduced pigmentation and conidiation and female sterility.
Gostinčar, Cene; Ohm, Robin A; Kogej, Tina; Sonjak, Silva; Turk, Martina; Zajc, Janja; Zalar, Polona; Grube, Martin; Sun, Hui; Han, James; Sharma, Aditi; Chiniquy, Jennifer; Ngan, Chew Yee; Lipzen, Anna; Barry, Kerrie; Grigoriev, Igor V; Gunde-Cimerman, Nina
2014-07-01
Aureobasidium pullulans is a black-yeast-like fungus used for production of the polysaccharide pullulan and the antimycotic aureobasidin A, and as a biocontrol agent in agriculture. It can cause opportunistic human infections, and it inhabits various extreme environments. To promote the understanding of these traits, we performed de-novo genome sequencing of the four varieties of A. pullulans. The 25.43-29.62 Mb genomes of these four varieties of A. pullulans encode between 10266 and 11866 predicted proteins. Their genomes encode most of the enzyme families involved in degradation of plant material and many sugar transporters, and they have genes possibly associated with degradation of plastic and aromatic compounds. Proteins believed to be involved in the synthesis of pullulan and siderophores, but not of aureobasidin A, are predicted. Putative stress-tolerance genes include several aquaporins and aquaglyceroporins, large numbers of alkali-metal cation transporters, genes for the synthesis of compatible solutes and melanin, all of the components of the high-osmolarity glycerol pathway, and bacteriorhodopsin-like proteins. All of these genomes contain a homothallic mating-type locus. The differences between these four varieties of A. pullulans are large enough to justify their redefinition as separate species: A. pullulans, A. melanogenum, A. subglaciale and A. namibiae. The redundancy observed in several gene families can be linked to the nutritional versatility of these species and their particular stress tolerance. The availability of the genome sequences of the four Aureobasidium species should improve their biotechnological exploitation and promote our understanding of their stress-tolerance mechanisms, diverse lifestyles, and pathogenic potential.
NASA Astrophysics Data System (ADS)
Yang, Jingwen; Xu, Yuchao; Xu, Ke; Ping, Hongling; Shi, Huilai; Lü, Zhenming; Wu, Changwen; Wang, Tianming
2017-08-01
Neuropeptide Y (NPY) has a pivotal role in the regulation of many physiological processes. In this study, the gene encoding an NPY receptor-like protein from the common Chinese cuttlefish Sepiella japonica (SjNPYR-like) was identified and characterized. The full-length SjNPYR-like cDNA was cloned and contained a 492-bp 5' untranslated region (UTR), a 1,182-bp open reading frame (ORF) encoding a protein of 393 amino acid residues, and a 228-bp 3' UTR. The putative protein was predicted to have a molecular weight of 45.54 kDa and an isoelectric point (pI) of 8.13. By informatic analyses, SjNPYR-like was identified as belonging to the class A G protein coupled receptor (GPCR) family (the rhodopsin-type). The amino acid sequence contained 12 potential phosphorylation sites and five predicted N-linked glycosylation sites. Multiple sequence alignment and 3D structure modeling were conducted to clarify the bioinformatic characteristics of SjNPYR-like. Phylogenetic analysis identified it as an NPYR with 33% identity to Lymnaea stagnalis NPFR. Transmembrane properties of SjNPYR-like were demonstrated in vitro using HEK293 cells and the pEGFP-N1 plasmid. Relative quantification of SjNPYR-like mRNA levels confirmed high-level expression and broad distribution of SjNPYR-like in various tissues of female S. japonica. In addition, the transcriptional profile of SjNPYR-like in the brain, liver, and ovary during gonadal development was analyzed. The results provide a basic understanding of the molecular characteristics of SjNPYR-like and its potential physiological functions.
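The segment lengths reported in the abstract above can be checked with simple arithmetic. The sketch below assumes that the reported ORF length includes the stop codon, which is a common convention but is not stated in the abstract; the function name is hypothetical.

```python
def residues_from_orf(orf_bp, includes_stop=True):
    """Number of amino acid residues encoded by an ORF of the given length."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # The stop codon is counted in the ORF length but encodes no residue.
    return codons - 1 if includes_stop else codons


# Values as reported in the abstract.
utr5_bp, orf_bp, utr3_bp = 492, 1182, 228

protein_length = residues_from_orf(orf_bp)   # 1182 / 3 = 394 codons, minus stop
segments_total_bp = utr5_bp + orf_bp + utr3_bp  # sum of the three reported segments
```

Under that assumption the ORF yields 393 residues, in agreement with the abstract, and the three reported segments sum to 1,902 bp.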
Dolan, Jackie; Walshe, Karen; Alsbury, Samantha; Hokamp, Karsten; O'Keeffe, Sean; Okafuji, Tatsuya; Miller, Suzanne FC; Tear, Guy; Mitchell, Kevin J
2007-01-01
Background: Leucine-rich repeats (LRRs) are highly versatile and evolvable protein-ligand interaction motifs found in a large number of proteins with diverse functions, including innate immunity and nervous system development. Here we catalogue all of the extracellular LRR (eLRR) proteins in worms, flies, mice and humans. We use convergent evidence from several transmembrane-prediction and motif-detection programs, including a customised algorithm, LRRscan, to identify eLRR proteins, and a hierarchical clustering method based on TribeMCL to establish their evolutionary relationships. Results: This yields a total of 369 proteins (29 in worm, 66 in fly, 135 in mouse and 139 in human), many of them of unknown function. We group eLRR proteins into several classes: those with only LRRs, those that cluster with Toll-like receptors (Tlrs), those with immunoglobulin or fibronectin-type 3 (FN3) domains and those with some other domain. These groups show differential patterns of expansion and diversification across species. Our analyses reveal several clusters of novel genes, including two Elfn genes, encoding transmembrane proteins with eLRRs and an FN3 domain, and six genes encoding transmembrane proteins with eLRRs only (the Elron cluster). Many of these are expressed in discrete patterns in the developing mouse brain, notably in the thalamus and cortex. We have also identified a number of novel fly eLRR proteins with discrete expression in the embryonic nervous system. Conclusion: This study provides the necessary foundation for a systematic analysis of the functions of this class of genes, which are likely to include prominently innate immunity, inflammation and neural development, especially the specification of neuronal connectivity. PMID:17868438