uncharacterized protein sequences: Topics by Science.gov

Sample records for uncharacterized protein sequences

Proteins of unknown function in the Protein Data Bank (PDB): an inventory of true uncharacterized proteins and computational tools for their analysis.

PubMed

Nadzirin, Nurul; Firdaus-Raih, Mohd

2012-10-08

Proteins of uncharacterized function form a large part of many of the currently available biological databases, and this situation exists even in the Protein Data Bank (PDB). Our analysis of recent PDB data revealed that only 42.53% of PDB entries (1084 coordinate files) that were categorized under "unknown function" are true examples of proteins of unknown function at this point in time. The remaining 1465 entries also annotated as such could have their annotations re-assessed, based on the availability of direct functional characterization experiments for the protein itself, or for homologous sequences or structures, thus enabling computational function inference.
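The percentage quoted in this abstract can be checked from the two entry counts it reports. The short sketch below assumes the 1084 and 1465 entries together make up the full "unknown function" set, which is how the abstract reads; the variable names are illustrative only.

```python
# Counts taken from the abstract above.
true_unknown = 1084   # entries that remain genuinely uncharacterized
reassessable = 1465   # entries whose annotation could be re-assessed

# Assumption: these two groups together form the whole "unknown function" set.
total = true_unknown + reassessable
share_unknown = 100 * true_unknown / total

print(total)                    # 2549
print(round(share_unknown, 2))  # 42.53
```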
Structural genomics analysis of uncharacterized protein families overrepresented in human gut bacteria identifies a novel glycoside hydrolase

PubMed Central

2014-01-01

Background Bacteroides spp. form a significant part of our gut microbiome and are well known for optimized metabolism of diverse polysaccharides. Initial analysis of the archetypal Bacteroides thetaiotaomicron genome identified 172 glycosyl hydrolases and a large number of uncharacterized proteins associated with polysaccharide metabolism. Results BT_1012 from Bacteroides thetaiotaomicron VPI-5482 is a protein of unknown function and a member of a large protein family consisting entirely of uncharacterized proteins. Initial sequence analysis predicted that this protein has two domains, one at the N-terminus and one at the C-terminus. A PSI-BLAST search found over 150 full-length homologs and over 90 half-size homologs consisting only of the N-terminal domain. The experimentally determined three-dimensional structure of the BT_1012 protein confirms its two-domain architecture, and structural analysis of both domains suggests their specific functions. The N-terminal domain is a putative catalytic domain with significant similarity to known glycoside hydrolases; the C-terminal domain has a beta-sandwich fold typically found in C-terminal domains of other glycosyl hydrolases; however, these domains are typically involved in substrate binding. We describe the structure of the BT_1012 protein and discuss its sequence-structure relationship and its possible functional implications. Conclusions Structural and sequence analyses of the BT_1012 protein identify it as a glycosyl hydrolase, expanding an already impressive catalog of enzymes involved in polysaccharide metabolism in Bacteroides spp. Based on this, we have renamed the Pfam families representing the two domains found in the BT_1012 protein, PF13204 and PF12904, as putative glycoside hydrolase and glycoside hydrolase-associated C-terminal domain, respectively. PMID:24742328
NovelFam3000 – Uncharacterized human protein domains conserved across model organisms

PubMed Central

Kemmer, Danielle; Podowski, Raf M; Arenillas, David; Lim, Jonathan; Hodges, Emily; Roth, Peggy; Sonnhammer, Erik LL; Höög, Christer; Wasserman, Wyeth W

2006-01-01

Background Despite significant efforts from the research community, an extensive portion of the proteins encoded by human genes lack an assigned cellular function. Most metazoan proteins are composed of structural and/or functional domains, of which many appear in multiple proteins. Once a domain is characterized in one protein, the presence of a similar sequence in an uncharacterized protein serves as a basis for inference of function. Thus knowledge of a domain's function, or the protein within which it arises, can facilitate the analysis of an entire set of proteins. Description From the Pfam domain database, we extracted uncharacterized protein domains represented in proteins from humans, worms, and flies. A data centre was created to facilitate the analysis of the uncharacterized domain-containing proteins. The centre both provides researchers with links to dispersed internet resources containing gene-specific experimental data and enables them to post relevant experimental results or comments. For each human gene in the system, a characterization score is posted, allowing users to track the progress of characterization over time or to identify for study uncharacterized domains in well-characterized genes. As a test of the system, a subset of 39 domains was selected for analysis and the experimental results posted to the NovelFam3000 system. For 25 human protein members of these 39 domain families, detailed sub-cellular localizations were determined. Specific observations are presented based on the analysis of the integrated information provided through the online NovelFam3000 system. Conclusion Consistent experimental results between multiple members of a domain family allow for inferences of the domain's functional role. We unite bioinformatics resources and experimental data in order to accelerate the functional characterization of scarcely annotated domain families. PMID:16533400
Prediction and characterization of enzymatic activities guided by sequence similarity and genome neighborhood networks

DOE PAGES

Zhao, Suwen; Sakai, Ayano; Zhang, Xinshuai; ...

2014-06-30

Metabolic pathways in eubacteria and archaea often are encoded by operons and/or gene clusters (genome neighborhoods) that provide important clues for assignment of both enzyme functions and metabolic pathways. We describe a bioinformatic approach (genome neighborhood network; GNN) that enables large scale prediction of the in vitro enzymatic activities and in vivo physiological functions (metabolic pathways) of uncharacterized enzymes in protein families. We demonstrate the utility of the GNN approach by predicting in vitro activities and in vivo functions in the proline racemase superfamily (PRS; InterPro IPR008794). The predictions were verified by measuring in vitro activities for 51 proteins in 12 families in the PRS that represent ~85% of the sequences; in vitro activities of pathway enzymes, carbon/nitrogen source phenotypes, and/or transcriptomic studies confirmed the predicted pathways. The synergistic use of sequence similarity networks and GNNs will facilitate the discovery of the components of novel, uncharacterized metabolic pathways in sequenced genomes.
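The genome-neighborhood idea in this abstract can be illustrated with a toy tally of which protein families sit near members of a query family. This is only a sketch under stated assumptions, not the authors' GNN pipeline: the function name `neighborhood_counts`, the `window` parameter, and the representation of a genome as an ordered list of family labels are all hypothetical.

```python
from collections import Counter

def neighborhood_counts(genomes, query_family, window=3):
    """Count families found within `window` genes of any query-family member.

    `genomes` is a list of genomes; each genome is an ordered list of
    family labels, one per gene (a hypothetical, simplified input format).
    """
    counts = Counter()
    for genes in genomes:
        for i, family in enumerate(genes):
            if family != query_family:
                continue
            lo = max(0, i - window)
            hi = min(len(genes), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[genes[j]] += 1
    return counts

# Toy data: two made-up genomes containing the query family "Q".
toy_genomes = [["A", "Q", "B", "C"], ["B", "Q", "D"]]
print(neighborhood_counts(toy_genomes, "Q", window=1))
```

Families that recur next to the query family across many genomes would be the candidates for shared pathway membership; a real analysis would also need strand, distance and statistical filters that this sketch leaves out.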
Exploiting Amino Acid Composition for Predicting Protein-Protein Interactions

PubMed Central

Roy, Sushmita; Martinez, Diego; Platero, Harriett; Lane, Terran; Werner-Washburne, Margaret

2009-01-01

Background Computational prediction of protein interactions typically uses protein domains as classifier features because they capture conserved information of interaction surfaces. However, approaches relying on domains as features cannot be applied to proteins without any domain information. In this paper, we explore the contribution of pure amino acid composition (AAC) for protein interaction prediction. This simple feature, which is based on normalized counts of single or pairs of amino acids, is applicable to proteins from any sequenced organism and can be used to compensate for the lack of domain information. Results AAC performed at par with protein interaction prediction based on domains on three yeast protein interaction datasets. Similar behavior was obtained using different classifiers, indicating that our results are a function of features and not of classifiers. In addition to yeast datasets, AAC performed comparably on worm and fly datasets. Prediction of interactions for the entire yeast proteome identified a large number of novel interactions, the majority of which co-localized or participated in the same processes. Our high confidence interaction network included both well-studied and uncharacterized proteins. Proteins with known function were involved in actin assembly and cell budding. Uncharacterized proteins interacted with proteins involved in reproduction and cell budding, thus providing putative biological roles for the uncharacterized proteins. Conclusion AAC is a simple, yet powerful feature for predicting protein interactions, and can be used alone or in conjunction with protein domains to predict new and validate existing interactions. More importantly, AAC alone performs at par with existing, but more complex, features indicating the presence of sequence-level information that is predictive of interaction, but which is not necessarily restricted to domains. PMID:19936254
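The AAC feature described in this abstract, normalized counts of single amino acids or of amino acid pairs, can be sketched in a few lines. The function names are hypothetical, and representing a protein pair by concatenating the two single-residue vectors is an assumption made for illustration; the abstract does not say how the authors combined the two proteins.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]

def aac_single(seq):
    """Return 20 normalized single-residue frequencies."""
    counts = Counter(seq)
    n = sum(counts[a] for a in AMINO_ACIDS)
    return [counts[a] / n if n else 0.0 for a in AMINO_ACIDS]

def aac_pairs(seq):
    """Return 400 normalized frequencies of adjacent residue pairs."""
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n = sum(counts[p] for p in PAIRS)
    return [counts[p] / n if n else 0.0 for p in PAIRS]

def pair_features(seq_a, seq_b):
    """Assumed encoding of a candidate interaction: concatenated vectors."""
    return aac_single(seq_a) + aac_single(seq_b)

print(len(pair_features("MKTAYIAK", "GAVLIPFW")))  # 40
```

The resulting vectors could be passed to any standard classifier, which is consistent with the abstract's remark that the results depended on the features rather than on the classifier.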
Global analysis of bacterial transcription factors to predict cellular target processes.

PubMed

Doerks, Tobias; Andrade, Miguel A; Lathe, Warren; von Mering, Christian; Bork, Peer

2004-03-01

Whole-genome sequences are now available for >100 bacterial species, giving unprecedented power to comparative genomics approaches. We have applied genome-context methods to predict target processes that are regulated by transcription factors (TFs). Of 128 orthologous groups of proteins annotated as TFs, to date, 36 are functionally uncharacterized; in our analysis we predict a probable cellular target process or biochemical pathway for half of these functionally uncharacterized TFs.
Activity-based proteomics of enzyme superfamilies: serine hydrolases as a case study.

PubMed

Simon, Gabriel M; Cravatt, Benjamin F

2010-04-09

Genome sequencing projects have uncovered thousands of uncharacterized enzymes in eukaryotic and prokaryotic organisms. Deciphering the physiological functions of enzymes requires tools to profile and perturb their activities in native biological systems. Activity-based protein profiling has emerged as a powerful chemoproteomic strategy to achieve these objectives through the use of chemical probes that target large swaths of enzymes that share active-site features. Here, we review activity-based protein profiling and its implementation to annotate the enzymatic proteome, with particular attention given to probes that target serine hydrolases, a diverse superfamily of enzymes replete with many uncharacterized members.
Gene discovery in the hamster: a comparative genomics approach for gene annotation by sequencing of hamster testis cDNAs

PubMed Central

Oduru, Sreedhar; Campbell, Janee L; Karri, SriTulasi; Hendry, William J; Khan, Shafiq A; Williams, Simon C

2003-01-01

Background Complete genome annotation will likely be achieved through a combination of computer-based analysis of available genome sequences combined with direct experimental characterization of expressed regions of individual genomes. We have utilized a comparative genomics approach involving the sequencing of randomly selected hamster testis cDNAs to begin to identify genes not previously annotated on the human, mouse, rat and Fugu (pufferfish) genomes. Results 735 distinct sequences were analyzed for their relatedness to known sequences in public databases. Eight of these sequences were derived from previously unidentified genes and expression of these genes in testis was confirmed by Northern blotting. The genomic locations of each sequence were mapped in human, mouse, rat and pufferfish, where applicable, and the structure of their cognate genes was derived using computer-based predictions, genomic comparisons and analysis of uncharacterized cDNA sequences from human and macaque. Conclusion The use of a comparative genomics approach resulted in the identification of eight cDNAs that correspond to previously uncharacterized genes in the human genome. The proteins encoded by these genes included a new member of the kinesin superfamily, a SET/MYND-domain protein, and six proteins for which no specific function could be predicted. Each gene was expressed primarily in testis, suggesting that they may play roles in the development and/or function of testicular cells. PMID:12783626
Proteomic Mapping of Dental Enamel Matrix from Inbred Mouse Strains: Unraveling Potential New Players in Enamel.

PubMed

Lima Leite, Aline; Silva Fernandes, Mileni; Charone, Senda; Whitford, Gary Milton; Everett, Eric T; Buzalaf, Marília Afonso Rabelo

2018-01-01

Enamel formation is a complex 2-step process by which proteins are secreted to form an extracellular matrix, followed by massive protein degradation and subsequent mineralization. Excessive systemic exposure to fluoride can disrupt this process and lead to a condition known as dental fluorosis. The genetic background influences the responses of mineralized tissues to fluoride, such as dental fluorosis, observed in A/J and 129P3/J mice. The aim of the present study was to map the protein profile of enamel matrix from A/J and 129P3/J strains. Enamel matrix samples were obtained from A/J and 129P3/J mice and analyzed by 2-dimensional electrophoresis and liquid chromatography coupled with mass spectrometry. A total of 120 proteins were identified, and 7 of them were classified as putative uncharacterized proteins and analyzed in silico for structural and functional characterization. An interesting finding was the possibility of the uncharacterized sequence Q8BIS2 being an enzyme involved in the degradation of matrix proteins. Thus, the results provide a comprehensive view of the structure and function for putative uncharacterized proteins found in the enamel matrix that could help to elucidate the mechanisms involved in enamel biomineralization and genetic susceptibility to dental fluorosis. © 2018 S. Karger AG, Basel.
Predicting Protein Relationships to Human Pathways through a Relational Learning Approach Based on Simple Sequence Features.

PubMed

García-Jiménez, Beatriz; Pons, Tirso; Sanchis, Araceli; Valencia, Alfonso

2014-01-01

Biological pathways are important elements of systems biology and in the past decade, an increasing number of pathway databases have been set up to document the growing understanding of complex cellular processes. Although more genome-sequence data are becoming available, a large fraction of it remains functionally uncharacterized. Thus, it is important to be able to predict the mapping of poorly annotated proteins to original pathway models. We have developed a Relational Learning-based Extension (RLE) system to investigate pathway membership through a function prediction approach that mainly relies on combinations of simple properties attributed to each protein. RLE searches for proteins with molecular similarities to specific pathway components. Using RLE, we associated 383 uncharacterized proteins to 28 pre-defined human Reactome pathways, demonstrating relative confidence after proper evaluation. Indeed, in specific cases manual inspection of the database annotations and the related literature supported the proposed classifications. Examples of possible additional components of the Electron transport system, Telomere maintenance and Integrin cell surface interactions pathways are discussed in detail. All the predicted human proteins for Reactome releases 30 and 40 (2009 and 2012, respectively) are available at http://rle.bioinfo.cnio.es.
Network-based function prediction and interactomics: the case for metabolic enzymes.

PubMed

Janga, S C; Díaz-Mejía, J Javier; Moreno-Hagelsieb, G

2011-01-01

As sequencing technologies increase in power, determining the functions of unknown proteins encoded by the DNA sequences so produced becomes a major challenge. Functional annotation is commonly done on the basis of amino-acid sequence similarity alone. Long after sequence similarity becomes undetectable by pair-wise comparison, profile-based identification of homologs can often succeed due to the conservation of position-specific patterns, important for a protein's three-dimensional folding and function. Nevertheless, prediction of protein function from homology-driven approaches is not without problems. Homologous proteins might evolve different functions and the power of homology detection has already started to reach its maximum. Computational methods for inferring protein function, which exploit the context of a protein in cellular networks, have come to be built on top of homology-based approaches. These network-based functional inference techniques provide both a first-hand hint of a protein's functional role and offer complementary insights to traditional methods for understanding the function of uncharacterized proteins. Most recent network-based approaches aim to integrate diverse kinds of functional interactions to boost both coverage and confidence level. These techniques not only promise to solve the moonlighting aspect of proteins by annotating proteins with multiple functions, but also increase our understanding of the interplay between different functional classes in a cell. In this article we review the state of the art in network-based function prediction and describe some of the underlying difficulties and successes. Given the volume of high-throughput data that is being reported, the time is ripe to employ these network-based approaches, which can be used to unravel the functions of the uncharacterized proteins accumulating in the genomic databases. © 2010 Elsevier Inc. All rights reserved.
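One simple form of the network-based inference discussed in this review is neighbor voting ("guilt by association"): an unannotated protein is assigned the functions most common among its annotated interaction partners. The sketch below is a generic illustration, not a method taken from the review; the function name, the edge-list format and the toy protein identifiers are all hypothetical.

```python
from collections import Counter

def predict_function(protein, edges, annotations):
    """Rank candidate functions by how many annotated neighbors carry them.

    `edges` is a list of (protein_a, protein_b) interaction pairs and
    `annotations` maps a protein to a set of function labels.
    """
    votes = Counter()
    for a, b in edges:
        if protein in (a, b):
            neighbor = b if a == protein else a
            for label in annotations.get(neighbor, ()):
                votes[label] += 1
    return votes.most_common()

# Toy network: P1 is uncharacterized, its partners are annotated.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P4", "P1")]
toy_annotations = {
    "P2": {"glycolysis"},
    "P3": {"glycolysis"},
    "P4": {"transport"},
}
print(predict_function("P1", toy_edges, toy_annotations))
```

Because several labels can receive votes, the ranked output can carry more than one function per protein, which is in the spirit of the multi-function ("moonlighting") annotation the abstract mentions.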
Using the underlying biological organization of the Mycobacterium tuberculosis functional network for protein function prediction.

PubMed

Mazandu, Gaston K; Mulder, Nicola J

2012-07-01

Despite ever-increasing amounts of sequence and functional genomics data, there is still a deficiency of functional annotation for many newly sequenced proteins. For Mycobacterium tuberculosis (MTB), more than half of its genome is still uncharacterized, which hampers the search for new drug targets within the bacterial pathogen and limits our understanding of its pathogenicity. As for many other genomes, the annotations of proteins in the MTB proteome were generally inferred from sequence homology, which is effective but has limited applicability. We have carried out large-scale biological data integration to produce an MTB protein functional interaction network. Protein functional relationships were extracted from the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database, and additional functional interactions from microarray, sequence and protein signature data. The confidence level of protein relationships in the additional functional interaction data was evaluated using a dynamic data-driven scoring system. This functional network has been used to predict functions of uncharacterized proteins using Gene Ontology (GO) terms, and the semantic similarity between these terms was measured using a state-of-the-art GO similarity metric. To achieve a better trade-off between improvement of quality, genomic coverage and scalability, this prediction is done by observing the key principles driving the biological organization of the functional network. This study yields a new functionally characterized MTB strain CDC1551 proteome, consisting of 3804 and 3698 proteins out of 4195 with annotations in terms of the biological process and molecular function ontologies, respectively. These data can contribute to research into the development of effective anti-tubercular drugs with novel biological mechanisms of action. Copyright © 2011 Elsevier B.V. All rights reserved.
An estimated 5% of new protein structures solved today represent a new Pfam family

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mistry, Jaina; Kloppmann, Edda; Rost, Burkhard

2013-11-01

This study uses the Pfam database to show that the sequence redundancy of protein structures deposited in the PDB is increasing. The possible reasons behind this trend are discussed. High-resolution structural knowledge is key to understanding how proteins function at the molecular level. The number of entries in the Protein Data Bank (PDB), the repository of all publicly available protein structures, continues to increase, with more than 8000 structures released in 2012 alone. The authors of this article have studied how structural coverage of the protein-sequence space has changed over time by monitoring the number of Pfam families that acquired their first representative structure each year from 1976 to 2012. Twenty years ago, for every 100 new PDB entries released, an estimated 20 Pfam families acquired their first structure. By 2012, this decreased to only about five families per 100 structures. The reasons behind the slower pace at which previously uncharacterized families are being structurally covered were investigated. It was found that although more than 50% of current Pfam families are still without a structural representative, this set is enriched in families that are small, functionally uncharacterized or rich in problem features such as intrinsically disordered and transmembrane regions. While these are important constraints, the reasons why it may not yet be time to give up the pursuit of a targeted but more comprehensive structural coverage of the protein-sequence space are discussed.
Directed proteomic analysis of the human nucleolus.

PubMed

Andersen, Jens S; Lyon, Carol E; Fox, Archa H; Leung, Anthony K L; Lam, Yun Wah; Steen, Hanno; Mann, Matthias; Lamond, Angus I

2002-01-08

The nucleolus is a subnuclear organelle containing the ribosomal RNA gene clusters and ribosome biogenesis factors. Recent studies suggest it may also have roles in RNA transport, RNA modification, and cell cycle regulation. Despite over 150 years of research into nucleoli, many aspects of their structure and function remain uncharacterized. We report a proteomic analysis of human nucleoli. Using a combination of mass spectrometry (MS) and sequence database searches, including online analysis of the draft human genome sequence, 271 proteins were identified. Over 30% of the nucleolar proteins were encoded by novel or uncharacterized genes, while the known proteins included several unexpected factors with no previously known nucleolar functions. MS analysis of nucleoli isolated from HeLa cells in which transcription had been inhibited showed that a subset of proteins was enriched. These data highlight the dynamic nature of the nucleolar proteome and show that proteins can either associate with nucleoli transiently or accumulate only under specific metabolic conditions. This extensive proteomic analysis shows that nucleoli have a surprisingly large protein complexity. The many novel factors and separate classes of proteins identified support the view that the nucleolus may perform additional functions beyond its known role in ribosome subunit biogenesis. The data also show that the protein composition of nucleoli is not static and can alter significantly in response to the metabolic state of the cell.
Characterization and complete genome sequence of a previously uncharacterized panicovirus from Bermuda grass detected by high throughput sequencing

USDA-ARS's Scientific Manuscript database

Bermuda grass samples were examined by transmission electron microscopy and 28-30 nm spherical virus particles were observed. Total RNA from these plants was subjected to high throughput sequencing (HTS). The nearly full genome sequence of a previously uncharacterized Panicovirus was identified from...
hPDI: a database of experimental human protein-DNA interactions.

PubMed

Xie, Zhi; Hu, Shaohui; Blackshaw, Seth; Zhu, Heng; Qian, Jiang

2010-01-15

The human protein-DNA Interactome (hPDI) database holds experimental protein-DNA interaction data for humans identified by protein microarray assays. The unique characteristics of hPDI are that it contains consensus DNA-binding sequences not only for nearly 500 human transcription factors but also for >500 unconventional DNA-binding proteins, which were previously completely uncharacterized. Users can browse, search and download a subset or the entire data set via a web interface. This database is freely accessible for any academic purpose. http://bioinfo.wilmer.jhu.edu/PDI/.
Cellular automata and its applications in protein bioinformatics.

PubMed

Xiao, Xuan; Wang, Pu; Chou, Kuo-Chen

2011-09-01

With the explosion of protein sequences generated in the postgenomic era, it is highly desirable to develop high-throughput tools for rapidly and reliably identifying various attributes of uncharacterized proteins based on their sequence information alone. The knowledge thus obtained can help us make timely use of these newly found protein sequences for both basic research and drug discovery. Many bioinformatics tools have been developed by means of machine learning methods. This review is focused on the applications of a new kind of science (cellular automata) in protein bioinformatics. A cellular automaton (CA) is an open, flexible and discrete dynamic model that holds enormous potential in modeling complex systems, in spite of the simplicity of the model itself. Researchers, scientists and practitioners from different fields have utilized cellular automata for visualizing protein sequences, investigating their evolution processes, and predicting their various attributes. Owing to its impressive power, intuitiveness and relative simplicity, the CA approach has great potential for use as a tool for bioinformatics.
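For readers unfamiliar with the model, a one-dimensional binary cellular automaton can be written in a few lines. The sketch below is a generic elementary CA with wrap-around boundaries; the choice of rule 90 and the starting row are arbitrary illustrations, and how a protein sequence would be encoded as the starting row is not specified in the abstract and is left out here.

```python
def ca_step(cells, rule=90):
    """Apply one update of an elementary cellular automaton.

    Each cell's next state depends on itself and its two neighbors;
    `rule` is the standard 0-255 rule number. Boundaries wrap around.
    """
    n = len(cells)
    out = []
    for i in range(n):
        left, mid, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (mid << 1) | right  # neighborhood as 0..7
        out.append((rule >> index) & 1)           # look up that bit of the rule
    return out

def evolve(cells, steps, rule=90):
    """Return the starting row followed by `steps` successive rows."""
    history = [list(cells)]
    for _ in range(steps):
        history.append(ca_step(history[-1], rule))
    return history

for row in evolve([0, 0, 1, 0, 0], steps=2):
    print("".join("#" if c else "." for c in row))
```

Stacking the successive rows gives the kind of two-dimensional picture that makes CA attractive for visualization, since a short binary string unfolds into a characteristic pattern.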
Bioinformatic and Comparative Localization of Rab Proteins Reveals Functional Insights into the Uncharacterized GTPases Ypt10p and Ypt11p

PubMed Central

Buvelot Frei, Stéphanie; Rahl, Peter B.; Nussbaum, Maria; Briggs, Benjamin J.; Calero, Monica; Janeczko, Stephanie; Regan, Andrew D.; Chen, Catherine Z.; Barral, Yves; Whittaker, Gary R.; Collins, Ruth N.

2006-01-01

A striking characteristic of a Rab protein is its steady-state localization to the cytosolic surface of a particular subcellular membrane. In this study, we have undertaken a combined bioinformatic and experimental approach to examine the evolutionary conservation of Rab protein localization. A comprehensive primary sequence classification shows that 10 out of the 11 Rab proteins identified in the yeast (Saccharomyces cerevisiae) genome can be grouped within a major subclass, each comprising multiple Rab orthologs from diverse species. We compared the locations of individual yeast Rab proteins with their localizations following ectopic expression in mammalian cells. Our results suggest that green fluorescent protein-tagged Rab proteins maintain localizations across large evolutionary distances and that the major known player in the Rab localization pathway, mammalian Rab-GDI, is able to function in yeast. These findings enable us to provide insight into novel gene functions and classify the uncharacterized Rab proteins Ypt10p (YBR264C) as being involved in endocytic function and Ypt11p (YNL304W) as being localized to the endoplasmic reticulum, where we demonstrate it is required for organelle inheritance. PMID:16980630
Rapid Identification of Sequences for Orphan Enzymes to Power Accurate Protein Annotation

PubMed Central

Ojha, Sunil; Watson, Douglas S.; Bomar, Martha G.; Galande, Amit K.; Shearer, Alexander G.

2013-01-01

The power of genome sequencing depends on the ability to understand what those genes and their protein products actually do. The automated methods used to assign functions to putative proteins in newly sequenced organisms are limited by the size of our library of proteins with both known function and sequence. Unfortunately this library grows slowly, lagging well behind the rapid increase in novel protein sequences produced by modern genome sequencing methods. One potential source for rapidly expanding this functional library is the “back catalog” of enzymology – “orphan enzymes,” those enzymes that have been characterized and yet lack any associated sequence. There are hundreds of orphan enzymes in the Enzyme Commission (EC) database alone. In this study, we demonstrate how this orphan enzyme “back catalog” is a fertile source for rapidly advancing the state of protein annotation. Starting from three orphan enzyme samples, we applied mass spectrometry-based analysis and computational methods (including sequence similarity networks, sequence and structural alignments, and operon context analysis) to rapidly identify the specific sequence for each orphan while avoiding the most time- and labor-intensive aspects of typical sequence identifications. We then used these three new sequences to more accurately predict the catalytic function of 385 previously uncharacterized or misannotated proteins. We expect that this kind of rapid sequence identification could be efficiently applied on a larger scale to make enzymology’s “back catalog” another powerful tool to drive accurate genome annotation. PMID:24386392
Rapid identification of sequences for orphan enzymes to power accurate protein annotation.

PubMed

Ramkissoon, Kevin R; Miller, Jennifer K; Ojha, Sunil; Watson, Douglas S; Bomar, Martha G; Galande, Amit K; Shearer, Alexander G

2013-01-01

The power of genome sequencing depends on the ability to understand what those genes and their protein products actually do. The automated methods used to assign functions to putative proteins in newly sequenced organisms are limited by the size of our library of proteins with both known function and sequence. Unfortunately this library grows slowly, lagging well behind the rapid increase in novel protein sequences produced by modern genome sequencing methods. One potential source for rapidly expanding this functional library is the "back catalog" of enzymology--"orphan enzymes," those enzymes that have been characterized and yet lack any associated sequence. There are hundreds of orphan enzymes in the Enzyme Commission (EC) database alone. In this study, we demonstrate how this orphan enzyme "back catalog" is a fertile source for rapidly advancing the state of protein annotation. Starting from three orphan enzyme samples, we applied mass spectrometry-based analysis and computational methods (including sequence similarity networks, sequence and structural alignments, and operon context analysis) to rapidly identify the specific sequence for each orphan while avoiding the most time- and labor-intensive aspects of typical sequence identifications. We then used these three new sequences to more accurately predict the catalytic function of 385 previously uncharacterized or misannotated proteins. We expect that this kind of rapid sequence identification could be efficiently applied on a larger scale to make enzymology's "back catalog" another powerful tool to drive accurate genome annotation.

A Genome-wide CRISPR Screen in Toxoplasma Identifies Essential Apicomplexan Genes.

PubMed

Sidik, Saima M; Huet, Diego; Ganesan, Suresh M; Huynh, My-Hang; Wang, Tim; Nasamu, Armiyaw S; Thiru, Prathapan; Saeij, Jeroen P J; Carruthers, Vern B; Niles, Jacquin C; Lourido, Sebastian

2016-09-08

Apicomplexan parasites are leading causes of human and livestock diseases such as malaria and toxoplasmosis, yet most of their genes remain uncharacterized. Here, we present the first genome-wide genetic screen of an apicomplexan. We adapted CRISPR/Cas9 to assess the contribution of each gene from the parasite Toxoplasma gondii during infection of human fibroblasts. Our analysis defines ∼200 previously uncharacterized, fitness-conferring genes unique to the phylum, from which 16 were investigated, revealing essential functions during infection of human cells. Secondary screens identify as an invasion factor the claudin-like apicomplexan microneme protein (CLAMP), which resembles mammalian tight-junction proteins and localizes to secretory organelles, making it critical to the initiation of infection. CLAMP is present throughout sequenced apicomplexan genomes and is essential during the asexual stages of the malaria parasite Plasmodium falciparum. These results provide broad-based functional information on T. gondii genes and will facilitate future approaches to expand the horizon of antiparasitic interventions. Copyright © 2016 Elsevier Inc. All rights reserved.
Integrating mRNA and Protein Sequencing Enables the Detection and Quantitative Profiling of Natural Protein Sequence Variants of Populus trichocarpa.

PubMed

Abraham, Paul E; Wang, Xiaojing; Ranjan, Priya; Nookaew, Intawat; Zhang, Bing; Tuskan, Gerald A; Hettich, Robert L

2015-12-04

Next-generation sequencing has transformed the ability to link genotypes to phenotypes and facilitates the dissection of genetic contribution to complex traits. However, it is challenging to link genetic variants with the perturbed functional effects on proteins encoded by such genes. Here we show how RNA sequencing can be exploited to construct genotype-specific protein sequence databases to assess natural variation in proteins, providing information about the molecular toolbox driving cellular processes. For this study, we used two natural genotypes selected from a recent genome-wide association study of Populus trichocarpa, an obligate outcrosser with tremendous phenotypic variation across the natural population. This strategy allowed us to comprehensively catalogue proteins containing single amino acid polymorphisms (SAAPs), as well as insertions and deletions. We profiled the frequency of 128 types of naturally occurring amino acid substitutions, including both expected (neutral) and unexpected (non-neutral) SAAPs, with a subset occurring in regions of the genome having strong polymorphism patterns consistent with recent positive and/or divergent selection. By zeroing in on the molecular signatures of these important regions that might have previously been uncharacterized, we now provide a high-resolution molecular inventory that should improve accessibility and subsequent identification of natural protein variants in future genotype-to-phenotype studies.
A novel class of plant-specific zinc-dependent DNA-binding protein that binds to A/T-rich DNA sequences

PubMed Central

Nagano, Yukio; Furuhashi, Hirofumi; Inaba, Takehito; Sasaki, Yukiko

2001-01-01

Complementary DNA encoding a DNA-binding protein, designated PLATZ1 (plant AT-rich sequence- and zinc-binding protein 1), was isolated from peas. The amino acid sequence of the protein is similar to those of other uncharacterized proteins predicted from the genome sequences of higher plants. However, no paralogous sequences have been found outside the plant kingdom. Multiple alignments among these paralogous proteins show that several cysteine and histidine residues are invariant, suggesting that these proteins are a novel class of zinc-dependent DNA-binding proteins with two distantly located regions, C-x2-H-x11-C-x2-C-x(4–5)-C-x2-C-x(3–7)-H-x2-H and C-x2-C-x(10–11)-C-x3-C. In an electrophoretic mobility shift assay, the zinc chelator 1,10-o-phenanthroline inhibited DNA binding, and two distant zinc-binding regions were required for DNA binding. A protein blot with 65ZnCl2 showed that both regions are required for zinc-binding activity. The PLATZ1 protein non-specifically binds to A/T-rich sequences, including the upstream region of the pea GTPase pra2 and plastocyanin petE genes. Expression of the PLATZ1 repressed those of the reporter constructs containing the coding sequence of luciferase gene driven by the cauliflower mosaic virus (CaMV) 35S90 promoter fused to the tandem repeat of the A/T-rich sequences. These results indicate that PLATZ1 is a novel class of plant-specific zinc-dependent DNA-binding protein responsible for A/T-rich sequence-mediated transcriptional repression. PMID:11600698
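The two spacing patterns quoted in the abstract above can be turned into regular expressions for scanning protein sequences. The following is a minimal sketch, not the authors' method: the function name `motif_to_regex` and the test sequence are hypothetical, and the notation is read as "xN" meaning any N residues and "x(A–B)" meaning any A to B residues.

```python
import re

# Tokens of the spacing notation: "x(4–5)" (range), "x2" (fixed gap),
# or a single upper-case residue letter such as "C" or "H".
TOKEN = re.compile(r"x\(\d+[–-]\d+\)|x\d+|[A-Z]")
RANGE = re.compile(r"x\((\d+)[–-](\d+)\)")

def motif_to_regex(motif: str) -> str:
    """Translate e.g. 'C-x2-C-x(10–11)-C-x3-C' into 'C.{2}C.{10,11}C.{3}C'."""
    parts = []
    for tok in TOKEN.findall(motif):
        m = RANGE.fullmatch(tok)
        if m:                       # variable-length gap
            parts.append(".{%s,%s}" % m.groups())
        elif tok.startswith("x"):   # fixed-length gap
            parts.append(".{%s}" % tok[1:])
        else:                       # conserved residue
            parts.append(tok)
    return "".join(parts)

# The two regions named in the abstract.
REGION_1 = motif_to_regex("C-x2-H-x11-C-x2-C-x(4–5)-C-x2-C-x(3–7)-H-x2-H")
REGION_2 = motif_to_regex("C-x2-C-x(10–11)-C-x3-C")

# Hypothetical toy sequence (not a real PLATZ protein) containing region 2.
toy = "MK" + "C" + "AA" + "C" + "A" * 10 + "C" + "AAA" + "C" + "GG"
hit = re.search(REGION_2, toy)
print(REGION_2, hit.start() if hit else None)
```

A pattern match of this kind only flags candidate sites; it says nothing about whether zinc or DNA is actually bound.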
Predicting Functions of Proteins in Mouse Based on Weighted Protein-Protein Interaction Network and Protein Hybrid Properties

PubMed Central

Shi, Xiaohe; Lu, Wen-Cong; Cai, Yu-Dong; Chou, Kuo-Chen

2011-01-01

Background With the huge amount of uncharacterized protein sequences generated in the post-genomic age, it is highly desirable to develop effective computational methods for quickly and accurately predicting their functions. The information thus obtained would be very useful for both basic research and drug development in a timely manner. Methodology/Principal Findings Although many efforts have been made in this regard, most of them were based on either sequence similarity or protein-protein interaction (PPI) information. However, the former often fails to work if a query protein has little or no sequence similarity to any proteins of known function, while the latter has a similar problem if the relevant PPI information is not available. In view of this, a new approach is proposed by hybridizing the PPI information and the biochemical/physicochemical features of protein sequences. The overall first-order success rates by the new predictor for the functions of mouse proteins on training set and test set were 69.1% and 70.2%, respectively, and the success rate covered by the results of the top-4 order from a total of 24 orders was 65.2%. Conclusions/Significance The results indicate that the new approach is quite promising and may open a new avenue for addressing this difficult and complicated problem. PMID:21283518
ssHMM: extracting intuitive sequence-structure motifs from high-throughput RNA-binding protein data

PubMed Central

Krestel, Ralf; Ohler, Uwe; Vingron, Martin; Marsico, Annalisa

2017-01-01

Abstract RNA-binding proteins (RBPs) play an important role in RNA post-transcriptional regulation and recognize target RNAs via sequence-structure motifs. The extent to which RNA structure influences protein binding in the presence or absence of a sequence motif is still poorly understood. Existing RNA motif finders either take the structure of the RNA only partially into account, or employ models which are not directly interpretable as sequence-structure motifs. We developed ssHMM, an RNA motif finder based on a hidden Markov model (HMM) and Gibbs sampling which fully captures the relationship between RNA sequence and secondary structure preference of a given RBP. Compared to previous methods which output separate logos for sequence and structure, it directly produces a combined sequence-structure motif when trained on a large set of sequences. ssHMM’s model is visualized intuitively as a graph and facilitates biological interpretation. ssHMM can be used to find novel bona fide sequence-structure motifs of uncharacterized RBPs, such as the one presented here for the YY1 protein. ssHMM reaches a high motif recovery rate on synthetic data, it recovers known RBP motifs from CLIP-Seq data, and scales linearly on the input size, being considerably faster than MEMERIS and RNAcontext on large datasets while being on par with GraphProt. It is freely available on Github and as a Docker image. PMID:28977546
Genetic characterization of K13965, a strain of Oak Vale virus from Western Australia

PubMed Central

Quan, Phenix-Lan; Williams, David T.; Johansen, Cheryl A.; Jain, Komal; Petrosov, Alexandra; Diviney, Sinead M.; Tashmukhamedova, Alla; Hutchison, Stephen K.; Tesh, Robert B.; Mackenzie, John S.; Briese, Thomas; Lipkin, W. Ian

2011-01-01

K13965, an uncharacterized virus, was isolated in 1993 from Anopheles annulipes mosquitoes collected in the Kimberley region of northern Western Australia. Here, we report its genomic sequence, identify it as a rhabdovirus, and characterize its phylogenetic relationships. The genome comprises a P′ (C) and SH protein similar to the recently characterized Tupaia and Durham viruses, and shows overlap between G and L genes. Comparison of K13965 genome sequence to other rhabdoviruses identified K13965 as a strain of the unclassified Australian Oak Vale rhabdovirus, whose complete genome sequence we also determined. Phylogenetic analysis of N and L sequences indicated genetic relationship to a recently proposed Sandjima virus clade, although the Oak Vale virus sequences form a branch separate from the African members of that group. PMID:21740935
Overlooked Short Toxin-Like Proteins: A Shortcut to Drug Design

PubMed Central

Linial, Michal

2017-01-01

Short stable peptides have huge potential for novel therapies and biosimilars. Cysteine-rich short proteins are characterized by multiple disulfide bridges in a compact structure. Many of these metazoan proteins are processed, folded, and secreted as soluble stable folds. These properties are shared by both marine and terrestrial animal toxins. These stable short proteins are promising sources for new drug development. We developed ClanTox (classifier of animal toxins) to identify toxin-like proteins (TOLIPs) using machine learning models trained on a large-scale proteomic database. Insect proteomes provide a rich source for protein innovations. Therefore, we seek overlooked toxin-like proteins from insects (coined iTOLIPs). Out of 4180 short (<75 amino acids) secreted proteins, 379 were predicted as iTOLIPs with high confidence, with as many as 30% of the genes marked as uncharacterized. Based on bioinformatics, structure modeling, and data-mining methods, we found that the most significant group of predicted iTOLIPs carry antimicrobial activity. Among the top predicted sequences were 120 termicin genes from termites with antifungal properties. Structural variations of insect antimicrobial peptides illustrate the similarity to a short version of the defensin fold with antifungal specificity. We also identified 9 proteins that strongly resemble ion channel inhibitors from scorpion and Conus toxins. Furthermore, we assigned functional folds to numerous uncharacterized iTOLIPs. We conclude that a systematic approach for finding iTOLIPs provides a rich source of peptides for drug design and innovative therapeutic discoveries. PMID:29109389
SIBIS: a Bayesian model for inconsistent protein sequence estimation.

PubMed

Khenoussi, Walyd; Vanhoutrève, Renaud; Poch, Olivier; Thompson, Julie D

2014-09-01

The prediction of protein coding genes is a major challenge that depends on the quality of genome sequencing, the accuracy of the model used to elucidate the exonic structure of the genes and the complexity of the gene splicing process leading to different protein variants. As a consequence, today's protein databases contain a huge amount of inconsistency, due to both natural variants and sequence prediction errors. We have developed a new method, called SIBIS, to detect such inconsistencies based on the evolutionary information in multiple sequence alignments. A Bayesian framework, combined with Dirichlet mixture models, is used to estimate the probability of observing specific amino acids and to detect inconsistent or erroneous sequence segments. We evaluated the performance of SIBIS on a reference set of protein sequences with experimentally validated errors and showed that the sensitivity is significantly higher than previous methods, with only a small loss of specificity. We also assessed a large set of human sequences from the UniProt database and found evidence of inconsistency in 48% of the previously uncharacterized sequences. We conclude that the integration of quality control methods like SIBIS in automatic analysis pipelines will be critical for the robust inference of structural, functional and phylogenetic information from these sequences. Source code, implemented in C on a Linux system, and the datasets of protein sequences are freely available for download at http://www.lbgi.fr/~julie/SIBIS. © The Author 2014. Published by Oxford University Press. All rights reserved.
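The idea of estimating how probable a residue is at an alignment column can be illustrated with a much simpler model than the one the abstract describes. The sketch below is an assumption-laden simplification: it uses a single symmetric Dirichlet prior rather than a Dirichlet mixture, and the function name `posterior_prob`, the pseudocount `alpha = 0.5`, and the example column are all hypothetical. It is not the SIBIS implementation.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def posterior_prob(column: str, residue: str, alpha: float = 0.5) -> float:
    """Posterior mean probability of `residue` at an alignment column.

    `column` holds the residues observed in the other aligned sequences;
    gap characters and anything outside the 20 standard amino acids are
    ignored. With a symmetric Dirichlet(alpha) prior the posterior mean is
    (count + alpha) / (total + 20 * alpha).
    """
    counts = Counter(c for c in column if c in AMINO_ACIDS)
    total = sum(counts.values())
    return (counts[residue] + alpha) / (total + len(AMINO_ACIDS) * alpha)

# Hypothetical column: nine leucines and one valine.
column = "LLLLLLLLLV"
p_leu = posterior_prob(column, "L")   # residue well supported by the column
p_trp = posterior_prob(column, "W")   # residue never seen in the column
print(p_leu, p_trp)
```

A run of consecutive positions with low posterior probability would be a candidate inconsistent segment under this toy model; any threshold for "low" would have to be chosen and validated separately.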
Sialome of a Generalist Lepidopteran Herbivore: Identification of Transcripts and Proteins from Helicoverpa armigera Labial Salivary Glands

PubMed Central

Celorio-Mancera, Maria de la Paz; Courtiade, Juliette; Muck, Alexander; Heckel, David G.; Musser, Richard O.; Vogel, Heiko

2011-01-01

Although the importance of insect saliva in insect-host plant interactions has been acknowledged, there is very limited information on the nature and complexity of the salivary proteome in lepidopteran herbivores. We inspected the labial salivary transcriptome and proteome of Helicoverpa armigera, an important polyphagous pest species. To identify the majority of the salivary proteins we have randomly sequenced 19,389 expressed sequence tags (ESTs) from a normalized cDNA library of salivary glands. In parallel, a non-cytosolic enriched protein fraction was obtained from labial salivary glands and subjected to two-dimensional gel electrophoresis (2-DE) and de novo peptide sequencing. This procedure allowed comparison of peptides and EST sequences and enabled us to identify 65 protein spots from the secreted labial saliva 2DE proteome. The mass spectrometry analysis revealed ecdysone, glucose oxidase, fructosidase, carboxyl/cholinesterase and an uncharacterized protein previously detected in H. armigera midgut proteome. Consistently, their corresponding transcripts are among the most abundant in our cDNA library. We did find redundancy of sequence identification of saliva-secreted proteins suggesting multiple isoforms. As expected, we found several enzymes responsible for digestion and plant offense. In addition, we identified non-digestive proteins such as an arginine kinase and abundant proteins of unknown function. This identification of secreted salivary gland proteins allows a more comprehensive understanding of insect feeding and poses new challenges for the elucidation of protein function. PMID:22046331
Genome sequence of a serotype M3 strain of group A Streptococcus: phage-encoded toxins, the high-virulence phenotype, and clone emergence.

PubMed

Beres, Stephen B; Sylva, Gail L; Barbian, Kent D; Lei, Benfang; Hoff, Jessica S; Mammarella, Nicole D; Liu, Meng-Yao; Smoot, James C; Porcella, Stephen F; Parkins, Larye D; Campbell, David S; Smith, Todd M; McCormick, John K; Leung, Donald Y M; Schlievert, Patrick M; Musser, James M

2002-07-23

Genome sequences are available for many bacterial strains, but there has been little progress in using these data to understand the molecular basis of pathogen emergence and differences in strain virulence. Serotype M3 strains of group A Streptococcus (GAS) are a common cause of severe invasive infections with unusually high rates of morbidity and mortality. To gain insight into the molecular basis of this high-virulence phenotype, we sequenced the genome of strain MGAS315, an organism isolated from a patient with streptococcal toxic shock syndrome. The genome is composed of 1,900,521 bp, and it shares approximately 1.7 Mb of related genetic material with genomes of serotype M1 and M18 strains. Phage-like elements account for the great majority of variation in gene content relative to the sequenced M1 and M18 strains. Recombination produces chimeric phages and strains with previously uncharacterized arrays of virulence factor genes. Strain MGAS315 has phage genes that encode proteins likely to contribute to pathogenesis, such as streptococcal pyrogenic exotoxin A (SpeA) and SpeK, streptococcal superantigen (SSA), and a previously uncharacterized phospholipase A(2) (designated Sla). Infected humans had anti-SpeK, -SSA, and -Sla antibodies, indicating that these GAS proteins are made in vivo. SpeK and SSA were pyrogenic and toxic for rabbits. Serotype M3 strains with the phage-encoded speK and sla genes increased dramatically in frequency late in the 20th century, commensurate with the rise in invasive disease caused by M3 organisms. Taken together, the results show that phage-mediated recombination has played a critical role in the emergence of a new, unusually virulent clone of serotype M3 GAS.
Not all transmembrane helices are born equal: Towards the extension of the sequence homology concept to membrane proteins

PubMed Central

2011-01-01

Background Sequence homology considerations widely used to transfer functional annotation to uncharacterized protein sequences require special precautions in the case of non-globular sequence segments including membrane-spanning stretches composed of non-polar residues. Simple, quantitative criteria are desirable for identifying transmembrane helices (TMs) that must be included in or should be excluded from start sequence segments in similarity searches aimed at finding distant homologues. Results We found that there are two types of TMs in membrane-associated proteins. On the one hand, there are so-called simple TMs with elevated hydrophobicity, low sequence complexity and extraordinary enrichment in long aliphatic residues. They merely serve as membrane-anchoring devices. In contrast, so-called complex TMs have lower hydrophobicity, higher sequence complexity and some functional residues. These TMs have additional roles besides membrane anchoring such as intra-membrane complex formation, ligand binding or a catalytic role. Simple and complex TMs can occur both in single- and multi-membrane-spanning proteins essentially in any type of topology. Whereas simple TMs have the potential to confuse searches for sequence homologues and to generate unrelated hits with seemingly convincing statistical significance, complex TMs contain essential evolutionary information. Conclusion For extending the homology concept onto membrane proteins, we provide a necessary quantitative criterion to distinguish simple TMs (and a sufficient criterion for complex TMs) in query sequences prior to their usage in homology searches based on assessment of hydrophobicity and sequence complexity of the TM sequence segments. Reviewers This article was reviewed by Shamil Sunyaev, L. Aravind and Arcady Mushegian. PMID:22024092
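The two quantities the abstract above relies on, hydrophobicity and sequence complexity of a TM segment, can be computed in a few lines. This is a rough sketch under stated assumptions: mean Kyte-Doolittle hydropathy stands in for hydrophobicity and Shannon entropy of the residue composition stands in for complexity. The abstract does not say which measures or cut-offs the authors used, so none are asserted here, and both example segments are made up.

```python
from collections import Counter
from math import log2

# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def mean_hydropathy(segment: str) -> float:
    """Average Kyte-Doolittle score over the residues of a segment."""
    return sum(KYTE_DOOLITTLE[r] for r in segment) / len(segment)

def shannon_entropy(segment: str) -> float:
    """Entropy (bits) of the residue composition; 0 for a homopolymer."""
    n = len(segment)
    return sum((c / n) * log2(n / c) for c in Counter(segment).values())

# Hypothetical segments: an aliphatic-rich anchor and a more varied helix.
simple_like = "LLLVLLLALLLVLLLILLLL"
complex_like = "GYSWTNAFCLGMHPTVQSIY"
for name, seg in (("simple-like", simple_like), ("complex-like", complex_like)):
    print(name, round(mean_hydropathy(seg), 2), round(shannon_entropy(seg), 2))
```

In this toy comparison the first segment scores higher on hydropathy and lower on entropy than the second, which is the direction of the contrast the abstract draws between simple and complex TMs.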
Deorphanizing the human transmembrane genome: A landscape of uncharacterized membrane proteins.

PubMed

Babcock, Joseph J; Li, Min

2014-01-01

The sequencing of the human genome has fueled the last decade of work to functionally characterize genome content. An important subset of genes encodes membrane proteins, which are the targets of many drugs. They reside in lipid bilayers, restricting their endogenous activity to a relatively specialized biochemical environment. Without a reference phenotype, the application of systematic screens to profile candidate membrane proteins is not immediately possible. Bioinformatics has begun to show its effectiveness in focusing the functional characterization of orphan proteins of a particular functional class, such as channels or receptors. Here we discuss integration of experimental and bioinformatics approaches for characterizing the orphan membrane proteome. By analyzing the human genome, a landscape reference for the human transmembrane genome is provided.
Genetic characterization of K13965, a strain of Oak Vale virus from Western Australia.

PubMed

Quan, Phenix-Lan; Williams, David T; Johansen, Cheryl A; Jain, Komal; Petrosov, Alexandra; Diviney, Sinead M; Tashmukhamedova, Alla; Hutchison, Stephen K; Tesh, Robert B; Mackenzie, John S; Briese, Thomas; Lipkin, W Ian

2011-09-01

K13965, an uncharacterized virus, was isolated in 1993 from Anopheles annulipes mosquitoes collected in the Kimberley region of northern Western Australia. Here, we report its genomic sequence, identify it as a rhabdovirus, and characterize its phylogenetic relationships. The genome comprises a P' (C) and SH protein similar to the recently characterized Tupaia and Durham viruses, and shows overlap between G and L genes. Comparison of K13965 genome sequence to other rhabdoviruses identified K13965 as a strain of the unclassified Australian Oak Vale rhabdovirus, whose complete genome sequence we also determined. Phylogenetic analysis of N and L sequences indicated genetic relationship to a recently proposed Sandjima virus clade, although the Oak Vale virus sequences form a branch separate from the African members of that group. Copyright © 2011 Elsevier B.V. All rights reserved.
A new method to improve network topological similarity search: applied to fold recognition

PubMed Central

Lhota, John; Hauptman, Ruth; Hart, Thomas; Ng, Clara; Xie, Lei

2015-01-01

Motivation: Similarity search is the foundation of bioinformatics. It plays a key role in establishing structural, functional and evolutionary relationships between biological sequences. Although the power of the similarity search has increased steadily in recent years, a high percentage of sequences remain uncharacterized in the protein universe. Thus, new similarity search strategies are needed to efficiently and reliably infer the structure and function of new sequences. The existing paradigm for studying protein sequence, structure, function and evolution has been established based on the assumption that the protein universe is discrete and hierarchical. Cumulative evidence suggests that the protein universe is continuous. As a result, conventional sequence homology search methods may not be able to detect novel structural, functional and evolutionary relationships between proteins from weak and noisy sequence signals. To overcome the limitations in existing similarity search methods, we propose a new algorithmic framework—Enrichment of Network Topological Similarity (ENTS)—to improve the performance of large scale similarity searches in bioinformatics. Results: We apply ENTS to a challenging unsolved problem: protein fold recognition. Our rigorous benchmark studies demonstrate that ENTS considerably outperforms state-of-the-art methods. As the concept of ENTS can be applied to any similarity metric, it may provide a general framework for similarity search on any set of biological entities, given their representation as a network. Availability and implementation: Source code freely available upon request. Contact: lxie@iscb.org PMID:25717198
Large scale ab initio modeling of structurally uncharacterized antimicrobial peptides reveals known and novel folds.

PubMed

Kozic, Mara; Fox, Stephen J; Thomas, Jens M; Verma, Chandra S; Rigden, Daniel J

2018-05-01

Antimicrobial resistance within a wide range of infectious agents is a severe and growing public health threat. Antimicrobial peptides (AMPs) are among the leading alternatives to current antibiotics, exhibiting broad spectrum activity. Their activity is determined by numerous properties such as cationic charge, amphipathicity, size, and amino acid composition. Currently, only around 10% of known AMP sequences have experimentally solved structures. To improve our understanding of the AMP structural universe we have carried out large scale ab initio 3D modeling of structurally uncharacterized AMPs that revealed similarities between predicted folds of the modeled sequences and structures of characterized AMPs. Two of the peptides whose models matched known folds are Lebocin Peptide 1A (LP1A) and Odorranain M, predicted to form β-hairpins but, interestingly, to lack the intramolecular disulfide bonds, cation-π or aromatic interactions that generally stabilize such AMP structures. Other examples include Ponericin Q42, Latarcin 4a, Kassinatuerin 1, Ceratotoxin D, and CPF-B1 peptide, which have α-helical folds, as well as mixed αβ folds of human Histatin 2 peptide and Garvicin A which are, to the best of our knowledge, the first linear αββ fold AMPs lacking intramolecular disulfide bonds. In addition to fold matches to experimentally derived structures, unique folds were also obtained, namely for Microcin M and Ipomicin. These results help in understanding the range of protein scaffolds that naturally bear antimicrobial activity and may facilitate protein design efforts towards better AMPs. © 2018 The Authors Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.
Analysis of functional redundancies within the Arabidopsis TCP transcription factor family.

PubMed

Danisman, Selahattin; van Dijk, Aalt D J; Bimbo, Andrea; van der Wal, Froukje; Hennig, Lars; de Folter, Stefan; Angenent, Gerco C; Immink, Richard G H

2013-12-01

Analyses of the functions of TEOSINTE-LIKE1, CYCLOIDEA, and PROLIFERATING CELL FACTOR1 (TCP) transcription factors have been hampered by functional redundancy between its individual members. In general, putative functionally redundant genes are predicted based on sequence similarity and confirmed by genetic analysis. In the TCP family, however, identification is impeded by relatively low overall sequence similarity. In a search for functionally redundant TCP pairs that control Arabidopsis leaf development, this work performed an integrative bioinformatics analysis, combining protein sequence similarities, gene expression data, and results of pair-wise protein-protein interaction studies for the 24 members of the Arabidopsis TCP transcription factor family. For this, the work completed any lacking gene expression and protein-protein interaction data experimentally and then performed a comprehensive prediction of potential functional redundant TCP pairs. Subsequently, redundant functions could be confirmed for selected predicted TCP pairs by genetic and molecular analyses. It is demonstrated that the previously uncharacterized class I TCP19 gene plays a role in the control of leaf senescence in a redundant fashion with TCP20. Altogether, this work shows the power of combining classical genetic and molecular approaches with bioinformatics predictions to unravel functional redundancies in the TCP transcription factor family.
SSMART: Sequence-structure motif identification for RNA-binding proteins.

PubMed

Munteanu, Alina; Mukherjee, Neelanjan; Ohler, Uwe

2018-06-11

RNA-binding proteins (RBPs) regulate every aspect of RNA metabolism and function. There are hundreds of RBPs encoded in eukaryotic genomes, and each recognizes its RNA targets through a specific mixture of RNA sequence and structure properties. For most RBPs, however, only a primary sequence motif has been determined, while the structure of the binding sites is uncharacterized. We developed SSMART, an RNA motif finder that simultaneously models the primary sequence and the structural properties of the RNA target sites. The sequence-structure motifs are represented as consensus strings over a degenerate alphabet, extending the IUPAC codes for nucleotides to account for secondary structure preferences. Evaluation on synthetic data showed that SSMART is able to recover both sequence and structure motifs implanted into 3'UTR-like sequences, for various degrees of structured/unstructured binding sites. In addition, we successfully used SSMART on high-throughput in vivo and in vitro data, showing that we not only recover the known sequence motif, but also gain insight into the structural preferences of the RBP. Availability: SSMART is freely available at https://ohlerlab.mdc-berlin.de/software/SSMART_137/. Supplementary data are available at Bioinformatics online.
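A consensus string over a degenerate alphabet, as mentioned in the abstract above, can be matched against a sequence by treating each symbol as a set of allowed nucleotides. The sketch below covers only the standard IUPAC nucleotide codes; the structure-aware extension used by SSMART is not described in the abstract and is not reproduced. The function name `find_sites` and the example motif and sequence are hypothetical.

```python
# Standard IUPAC nucleotide codes, written for RNA (U instead of T).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "CU", "S": "CG", "W": "AU",
    "K": "GU", "M": "AC",
    "B": "CGU", "D": "AGU", "H": "ACU", "V": "ACG",
    "N": "ACGU",
}

def find_sites(consensus: str, rna: str) -> list:
    """Return every start index where `rna` matches `consensus`.

    A sliding window is used so that overlapping matches are reported.
    """
    k = len(consensus)
    allowed = [IUPAC[symbol] for symbol in consensus]
    return [
        i
        for i in range(len(rna) - k + 1)
        if all(rna[i + j] in allowed[j] for j in range(k))
    ]

# Hypothetical motif "UGUR": U, G, U, then a purine (A or G).
sites = find_sites("UGUR", "AAUGUAUGUGC")
print(sites)
```

Representing each position as a set keeps the matcher easy to extend: a structure-aware alphabet would only need additional symbols and a second per-position annotation to test against.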
PaperBLAST: Text Mining Papers for Information about Homologs.

PubMed

Price, Morgan N; Arkin, Adam P

2017-01-01

Large-scale genome sequencing has identified millions of protein-coding genes whose function is unknown. Many of these proteins are similar to characterized proteins from other organisms, but much of this information is missing from annotation databases and is hidden in the scientific literature. To make this information accessible, PaperBLAST uses EuropePMC to search the full text of scientific articles for references to genes. PaperBLAST also takes advantage of curated resources (Swiss-Prot, GeneRIF, and EcoCyc) that link protein sequences to scientific articles. PaperBLAST's database includes over 700,000 scientific articles that mention over 400,000 different proteins. Given a protein of interest, PaperBLAST quickly finds similar proteins that are discussed in the literature and presents snippets of text from relevant articles or from the curators. PaperBLAST is available at http://papers.genomics.lbl.gov/. IMPORTANCE With the recent explosion of genome sequencing data, there are now millions of uncharacterized proteins. If a scientist becomes interested in one of these proteins, it can be very difficult to find information as to its likely function. Often a protein whose sequence is similar, and which is likely to have a similar function, has been studied already, but this information is not available in any database. To help find articles about similar proteins, PaperBLAST searches the full text of scientific articles for protein identifiers or gene identifiers, and it links these articles to protein sequences. Then, given a protein of interest, it can quickly find similar proteins in its database by using standard software (BLAST), and it can show snippets of text from relevant papers. We hope that PaperBLAST will make it easier for biologists to predict proteins' functions.
PaperBLAST: Text Mining Papers for Information about Homologs

DOE PAGES

Price, Morgan N.; Arkin, Adam P.

2017-08-15

Large-scale genome sequencing has identified millions of protein-coding genes whose function is unknown. Many of these proteins are similar to characterized proteins from other organisms, but much of this information is missing from annotation databases and is hidden in the scientific literature. To make this information accessible, PaperBLAST uses EuropePMC to search the full text of scientific articles for references to genes. PaperBLAST also takes advantage of curated resources (Swiss-Prot, GeneRIF, and EcoCyc) that link protein sequences to scientific articles. PaperBLAST’s database includes over 700,000 scientific articles that mention over 400,000 different proteins. Given a protein of interest, PaperBLAST quickly finds similar proteins that are discussed in the literature and presents snippets of text from relevant articles or from the curators. With the recent explosion of genome sequencing data, there are now millions of uncharacterized proteins. If a scientist becomes interested in one of these proteins, it can be very difficult to find information as to its likely function. Often a protein whose sequence is similar, and which is likely to have a similar function, has been studied already, but this information is not available in any database. To help find articles about similar proteins, PaperBLAST searches the full text of scientific articles for protein identifiers or gene identifiers, and it links these articles to protein sequences. Then, given a protein of interest, it can quickly find similar proteins in its database by using standard software (BLAST), and it can show snippets of text from relevant papers. We hope that PaperBLAST will make it easier for biologists to predict proteins’ functions.
PaperBLAST: Text Mining Papers for Information about Homologs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Price, Morgan N.; Arkin, Adam P.

Large-scale genome sequencing has identified millions of protein-coding genes whose function is unknown. Many of these proteins are similar to characterized proteins from other organisms, but much of this information is missing from annotation databases and is hidden in the scientific literature. To make this information accessible, PaperBLAST uses EuropePMC to search the full text of scientific articles for references to genes. PaperBLAST also takes advantage of curated resources (Swiss-Prot, GeneRIF, and EcoCyc) that link protein sequences to scientific articles. PaperBLAST’s database includes over 700,000 scientific articles that mention over 400,000 different proteins. Given a protein of interest, PaperBLAST quickly finds similar proteins that are discussed in the literature and presents snippets of text from relevant articles or from the curators. With the recent explosion of genome sequencing data, there are now millions of uncharacterized proteins. If a scientist becomes interested in one of these proteins, it can be very difficult to find information as to its likely function. Often a protein whose sequence is similar, and which is likely to have a similar function, has been studied already, but this information is not available in any database. To help find articles about similar proteins, PaperBLAST searches the full text of scientific articles for protein identifiers or gene identifiers, and it links these articles to protein sequences. Then, given a protein of interest, it can quickly find similar proteins in its database by using standard software (BLAST), and it can show snippets of text from relevant papers. We hope that PaperBLAST will make it easier for biologists to predict proteins’ functions.

PaperBLAST: Text Mining Papers for Information about Homologs

PubMed Central

Arkin, Adam P.

2017-01-01

ABSTRACT Large-scale genome sequencing has identified millions of protein-coding genes whose function is unknown. Many of these proteins are similar to characterized proteins from other organisms, but much of this information is missing from annotation databases and is hidden in the scientific literature. To make this information accessible, PaperBLAST uses EuropePMC to search the full text of scientific articles for references to genes. PaperBLAST also takes advantage of curated resources (Swiss-Prot, GeneRIF, and EcoCyc) that link protein sequences to scientific articles. PaperBLAST’s database includes over 700,000 scientific articles that mention over 400,000 different proteins. Given a protein of interest, PaperBLAST quickly finds similar proteins that are discussed in the literature and presents snippets of text from relevant articles or from the curators. PaperBLAST is available at http://papers.genomics.lbl.gov/. IMPORTANCE With the recent explosion of genome sequencing data, there are now millions of uncharacterized proteins. If a scientist becomes interested in one of these proteins, it can be very difficult to find information as to its likely function. Often a protein whose sequence is similar, and which is likely to have a similar function, has been studied already, but this information is not available in any database. To help find articles about similar proteins, PaperBLAST searches the full text of scientific articles for protein identifiers or gene identifiers, and it links these articles to protein sequences. Then, given a protein of interest, it can quickly find similar proteins in its database by using standard software (BLAST), and it can show snippets of text from relevant papers. We hope that PaperBLAST will make it easier for biologists to predict proteins’ functions. PMID:28845458
Computational mining for hypothetical patterns of amino acid side chains in protein data bank (PDB)

NASA Astrophysics Data System (ADS)

Ghani, Nur Syatila Ab; Firdaus-Raih, Mohd

2018-04-01

The three-dimensional structure of a protein can provide insights regarding its function. Functional relationships between proteins can be inferred from fold and sequence similarities. In certain cases, however, sequence or fold comparison fails to establish homology between proteins with similar mechanisms. Since structure is more conserved than sequence, a constellation of functional residues can be similarly arranged among proteins of similar mechanism. Local structural similarity searches are able to detect such constellations of amino acids among distinct proteins, which can be useful for annotating proteins of unknown function. Detecting such patterns of amino acids on a large scale can increase the repertoire of important 3D motifs, since the currently known 3D motifs cannot keep pace with the ever-increasing number of uncharacterized proteins to be annotated. Here, a computational platform for automated detection of 3D motifs is described. A fuzzy-pattern searching algorithm derived from IMagine an Amino Acid 3D Arrangement search EnGINE (IMAAAGINE) was implemented to develop an automated method for searching for hypothetical patterns of amino acid side chains in the Protein Data Bank (PDB), without the need for prior knowledge of the related sequence or structure of the pattern of interest. We present an example search: the detection of a hypothetical pattern derived from the known C2H2 structural motif of zinc fingers. The conservation of particular patterns of amino acid side chains in unrelated proteins is highlighted. This approach can complement available structure- and sequence-based platforms and may contribute to improving functional association between proteins.
Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST): A web tool for generating protein sequence similarity networks.

PubMed

Gerlt, John A; Bouvier, Jason T; Davidson, Daniel B; Imker, Heidi J; Sadkhin, Boris; Slater, David R; Whalen, Katie L

2015-08-01

The Enzyme Function Initiative, an NIH/NIGMS-supported Large-Scale Collaborative Project (EFI; U54GM093342; http://enzymefunction.org/), is focused on devising and disseminating bioinformatics and computational tools as well as experimental strategies for the prediction and assignment of functions (in vitro activities and in vivo physiological/metabolic roles) to uncharacterized enzymes discovered in genome projects. Protein sequence similarity networks (SSNs) are visually powerful tools for analyzing sequence relationships in protein families (H.J. Atkinson, J.H. Morris, T.E. Ferrin, and P.C. Babbitt, PLoS One 2009, 4, e4345). However, the members of the biological/biomedical community have not had access to the capability to generate SSNs for their "favorite" protein families. In this article we announce the EFI-EST (Enzyme Function Initiative-Enzyme Similarity Tool) web tool (http://efi.igb.illinois.edu/efi-est/) that is available without cost for the automated generation of SSNs by the community. The tool can create SSNs for the "closest neighbors" of a user-supplied protein sequence from the UniProt database (Option A) or of members of any user-supplied Pfam and/or InterPro family (Option B). We provide an introduction to SSNs, a description of EFI-EST, and a demonstration of the use of EFI-EST to explore sequence-function space in the OMP decarboxylase superfamily (PF00215). This article is designed as a tutorial that will allow members of the community to use the EFI-EST web tool for exploring sequence/function space in protein families. Copyright © 2015 Elsevier B.V. All rights reserved.
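The idea behind a sequence similarity network can be illustrated with a minimal Python sketch. It assumes the usual convention that nodes are sequences and that an edge joins two sequences whose pairwise similarity score meets a chosen threshold. The function names, the toy scores, and the threshold are hypothetical and are not taken from EFI-EST itself.

```python
def build_ssn(pairwise_scores, threshold):
    """Return an adjacency dict with an edge wherever score >= threshold."""
    network = {}
    for (a, b), score in pairwise_scores.items():
        # Register both sequences as nodes even if no edge is drawn.
        network.setdefault(a, set())
        network.setdefault(b, set())
        if score >= threshold:
            network[a].add(b)
            network[b].add(a)
    return network


def clusters(network):
    """Group nodes into connected components with a depth-first search."""
    seen = set()
    result = []
    for node in network:
        if node in seen:
            continue
        component = set()
        stack = [node]
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(network[current] - component)
        seen |= component
        result.append(component)
    return result


# Hypothetical pairwise similarity scores (for example, alignment scores).
toy_scores = {
    ("P1", "P2"): 80.0,
    ("P1", "P3"): 12.0,
    ("P2", "P3"): 15.0,
    ("P3", "P4"): 65.0,
}
ssn = build_ssn(toy_scores, threshold=50.0)
groups = clusters(ssn)
```

Raising the threshold splits the network into more, smaller clusters; lowering it merges them, which is how such networks are typically explored.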
Genomic Enzymology: Web Tools for Leveraging Protein Family Sequence-Function Space and Genome Context to Discover Novel Functions.

PubMed

Gerlt, John A

2017-08-22

The exponentially increasing number of protein and nucleic acid sequences provides opportunities to discover novel enzymes, metabolic pathways, and metabolites/natural products, thereby adding to our knowledge of biochemistry and biology. The challenge has evolved from generating sequence information to mining the databases to integrating and leveraging the available information, i.e., the availability of "genomic enzymology" web tools. Web tools that allow identification of biosynthetic gene clusters are widely used by the natural products/synthetic biology community, thereby facilitating the discovery of novel natural products and the enzymes responsible for their biosynthesis. However, many novel enzymes with interesting mechanisms participate in uncharacterized small-molecule metabolic pathways; their discovery and functional characterization also can be accomplished by leveraging information in protein and nucleic acid databases. This Perspective focuses on two genomic enzymology web tools that assist the discovery of novel metabolic pathways: (1) Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST) for generating sequence similarity networks to visualize and analyze sequence-function space in protein families and (2) Enzyme Function Initiative-Genome Neighborhood Tool (EFI-GNT) for generating genome neighborhood networks to visualize and analyze the genome context in microbial and fungal genomes. Both tools have been adapted to other applications to facilitate target selection for enzyme discovery and functional characterization. As the natural products community has demonstrated, the enzymology community needs to embrace the essential role of web tools that allow the protein and genome sequence databases to be leveraged for novel insights into enzymological problems.
Disruption of SMIM1 causes the Vel− blood type

PubMed Central

Ballif, Bryan A; Helias, Virginie; Peyrard, Thierry; Menanteau, Cécile; Saison, Carole; Lucien, Nicole; Bourgouin, Sébastien; Le Gall, Maude; Cartron, Jean-Pierre; Arnaud, Lionel

2013-01-01

Here, we report the biochemical and genetic basis of the Vel blood group antigen, which has been a vexing mystery for decades, especially as anti-Vel regularly causes severe haemolytic transfusion reactions. The protein carrying the Vel blood group antigen was biochemically purified from red blood cell membranes. Mass spectrometry-based de novo peptide sequencing identified this protein to be small integral membrane protein 1 (SMIM1), a previously uncharacterized single-pass membrane protein. Expression of SMIM1 cDNA in Vel− cultured cells generated anti-Vel cell surface reactivity, confirming that SMIM1 encoded the Vel blood group antigen. A cohort of 70 Vel− individuals was found to be uniformly homozygous for a 17 nucleotide deletion in the coding sequence of SMIM1. The genetic homogeneity of the Vel− blood type, likely having a common origin, facilitated the development of two highly specific DNA-based tests for rapid Vel genotyping, which can be easily integrated into blood group genotyping platforms. These results answer a 60-year-old riddle and provide tools of immediate assistance to all clinicians involved in the care of Vel− patients. PMID:23505126
A novel family of sequence-specific endoribonucleases associated with the clustered regularly interspaced short palindromic repeats.

PubMed

Beloglazova, Natalia; Brown, Greg; Zimmerman, Matthew D; Proudfoot, Michael; Makarova, Kira S; Kudritska, Marina; Kochinyan, Samvel; Wang, Shuren; Chruszcz, Maksymilian; Minor, Wladek; Koonin, Eugene V; Edwards, Aled M; Savchenko, Alexei; Yakunin, Alexander F

2008-07-18

Clustered regularly interspaced short palindromic repeats (CRISPRs) together with the associated CAS proteins protect microbial cells from invasion by foreign genetic elements using presently unknown molecular mechanisms. All CRISPR systems contain proteins of the CAS2 family, suggesting that these uncharacterized proteins play a central role in this process. Here we show that the CAS2 proteins represent a novel family of endoribonucleases. Six purified CAS2 proteins from diverse organisms cleaved single-stranded RNAs preferentially within U-rich regions. A representative CAS2 enzyme, SSO1404 from Sulfolobus solfataricus, cleaved the phosphodiester linkage on the 3'-side and generated 5'-phosphate- and 3'-hydroxyl-terminated oligonucleotides. The crystal structure of SSO1404 was solved at 1.6 Å resolution revealing the first ribonuclease with a ferredoxin-like fold. Mutagenesis of SSO1404 identified six residues (Tyr-9, Asp-10, Arg-17, Arg-19, Arg-31, and Phe-37) that are important for enzymatic activity and suggested that Asp-10 might be the principal catalytic residue. Thus, CAS2 proteins are sequence-specific endoribonucleases, and we propose that their role in the CRISPR-mediated anti-phage defense might involve degradation of phage or cellular mRNAs.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Bae, Euiyoung; Bingman, Craig A.; Aceti, David J.

LOC79017 (MW 21.0 kDa, residues 1-188) was annotated as a hypothetical protein encoded by Homo sapiens chromosome 7 open reading frame 24. It was selected as a target by the Center for Eukaryotic Structural Genomics (CESG) because it did not share more than 30% sequence identity with any protein for which the three-dimensional structure is known. The biological function of the protein has not been established yet. Parts of LOC79017 were identified as members of uncharacterized Pfam families (residues 1-95 as PB006073 and residues 104-180 as PB031696). BLAST searches revealed homologues of LOC79017 in many eukaryotes, but none of them have been functionally characterized. Here, we report the crystal structure of H. sapiens protein LOC79017 (UniGene code Hs.530024, UniProt code O75223, CESG target number go.35223).
A dehydration-inducible gene in the truffle Tuber borchii identifies a novel group of dehydrins

PubMed Central

Abba', Simona; Ghignone, Stefano; Bonfante, Paola

2006-01-01

Background The expressed sequence tag M6G10 was originally isolated from a screening for differentially expressed transcripts during the reproductive stage of the white truffle Tuber borchii. mRNA levels for M6G10 increased dramatically during fruiting body maturation compared to the vegetative mycelial stage. Results Bioinformatics tools, phylogenetic analysis and expression studies were used to support the hypothesis that this sequence, named TbDHN1, is the first dehydrin (DHN)-like coding gene isolated in fungi. Homologs of this gene, all defined as "coding for hypothetical proteins" in public databases, were exclusively found in ascomycetous fungi and in plants. Although complete (or almost complete) fungal genomes and EST collections of some Basidiomycota and Glomeromycota are already available, DHN-like proteins appear to be represented only in Ascomycota. A new and previously uncharacterized conserved signature pattern was identified and proposed to Uniprot database as the main distinguishing feature of this new group of DHNs. Expression studies provide experimental evidence of a transcript induction of TbDHN1 during cellular dehydration. Conclusion Expression pattern and sequence similarities to known plant DHNs indicate that TbDHN1 is the first characterized DHN-like protein in fungi. The high similarity of TbDHN1 with homolog coding sequences implies the existence of a novel fungal/plant group of LEA Class II proteins characterized by a previously undescribed signature pattern. PMID:16512918
Analysis of functional redundancies within the Arabidopsis TCP transcription factor family

PubMed Central

Danisman, Selahattin; de Folter, Stefan; Immink, Richard G. H.

2013-01-01

Analyses of the functions of TEOSINTE-LIKE1, CYCLOIDEA, and PROLIFERATING CELL FACTOR1 (TCP) transcription factors have been hampered by functional redundancy between its individual members. In general, putative functionally redundant genes are predicted based on sequence similarity and confirmed by genetic analysis. In the TCP family, however, identification is impeded by relatively low overall sequence similarity. In a search for functionally redundant TCP pairs that control Arabidopsis leaf development, this work performed an integrative bioinformatics analysis, combining protein sequence similarities, gene expression data, and results of pair-wise protein–protein interaction studies for the 24 members of the Arabidopsis TCP transcription factor family. For this, the work completed any lacking gene expression and protein–protein interaction data experimentally and then performed a comprehensive prediction of potential functional redundant TCP pairs. Subsequently, redundant functions could be confirmed for selected predicted TCP pairs by genetic and molecular analyses. It is demonstrated that the previously uncharacterized class I TCP19 gene plays a role in the control of leaf senescence in a redundant fashion with TCP20. Altogether, this work shows the power of combining classical genetic and molecular approaches with bioinformatics predictions to unravel functional redundancies in the TCP transcription factor family. PMID:24129704
Functional Testing of SLC26A4 Variants—Clinical and Molecular Analysis of a Cohort with Enlarged Vestibular Aqueduct from Austria

PubMed Central

Bernardinelli, Emanuele; Nofziger, Charity; Patsch, Wolfgang; Rasp, Gerd; Paulmichl, Markus; Dossena, Silvia

2018-01-01

The prevalence and spectrum of sequence alterations in the SLC26A4 gene, which codes for the anion exchanger pendrin, are population-specific and account for at least 50% of cases of non-syndromic hearing loss associated with an enlarged vestibular aqueduct. A cohort of nineteen patients from Austria with hearing loss and a radiological alteration of the vestibular aqueduct underwent Sanger sequencing of SLC26A4 and GJB2, coding for connexin 26. The pathogenicity of sequence alterations detected was assessed by determining ion transport and molecular features of the corresponding SLC26A4 protein variants. In this group, four uncharacterized sequence alterations within the SLC26A4 coding region were found. Three of these lead to protein variants with abnormal functional and molecular features, while one should be considered with no pathogenic potential. Pathogenic SLC26A4 sequence alterations were only found in 12% of patients. SLC26A4 sequence alterations commonly found in other Caucasian populations were not detected. This survey represents the first study on the prevalence and spectrum of SLC26A4 sequence alterations in an Austrian cohort and further suggests that genetic testing should always be integrated with functional characterization and determination of the molecular features of protein variants in order to unequivocally identify or exclude a causal link between genotype and phenotype. PMID:29320412
A novel BLAST-Based Relative Distance (BBRD) method can effectively group members of protein arginine methyltransferases and suggest their evolutionary relationship.

PubMed

Wang, Yi-Chun; Wang, Jing-Doo; Chen, Chin-Han; Chen, Yi-Wen; Li, Chuan

2015-03-01

We developed a novel BLAST-Based Relative Distance (BBRD) method by Pearson's correlation coefficient to avoid the problems of tedious multiple sequence alignment and complicated outgroup selection. We showed its application on reconstructing reliable phylogeny for nucleotide and protein sequences as exemplified by the fmr-1 gene and dihydrolipoamide dehydrogenase, respectively. We then used BBRD to resolve 124 protein arginine methyltransferases (PRMTs) that are homologues of nine mammalian PRMTs. The tree placed the uncharacterized PRMT9 with PRMT7 in the same clade, outside of all the Type I PRMTs including PRMT1 and its vertebrate paralogue PRMT8, PRMT3, PRMT6, PRMT2 and PRMT4. The PRMT7/9 branch then connects with the type II PRMT5. Some non-vertebrates contain different PRMTs without high sequence homology with the mammalian PRMTs. For example, in the case of Drosophila arginine methyltransferase (DART) and Trypanosoma brucei methyltransferases (TbPRMTs) in the analyses, the BBRD program grouped them with specific clades and thus suggested their evolutionary relationships. The BBRD method thus provided a great tool to construct a reliable tree for members of protein families through evolution. Copyright © 2015 Elsevier Inc. All rights reserved.
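A correlation-based distance of this kind can be sketched in a few lines of Python. The abstract does not give the exact BBRD formula, so this sketch assumes that each sequence is described by a vector of BLAST scores against a common set of reference sequences and that the distance between two sequences is 1 minus the Pearson correlation of their vectors. The profile values and function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length score vectors.

    Raises ZeroDivisionError if either vector has zero variance.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def relative_distances(score_profiles):
    """Assumed distance: 1 - Pearson r for every unordered pair of sequences."""
    names = sorted(score_profiles)
    return {
        (a, b): 1.0 - pearson(score_profiles[a], score_profiles[b])
        for a in names
        for b in names
        if a < b
    }


# Hypothetical BLAST bit scores of three sequences against four references.
profiles = {
    "seqA": [200.0, 180.0, 40.0, 35.0],
    "seqB": [180.0, 210.0, 45.0, 30.0],
    "seqC": [40.0, 45.0, 190.0, 170.0],
}
distances = relative_distances(profiles)
```

The resulting pairwise distances could then be passed to any standard distance-based tree-building method; because they come from score profiles, no multiple sequence alignment is needed at this step.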
A novel strategy for the determination of a rhabdovirus genome and its application to sequencing of Eggplant mottled dwarf virus.

PubMed

Pappi, Polyxeni G; Dovas, Chrysostomos I; Efthimiou, Konstantinos E; Maliogka, Varvara I; Katis, Nikolaos I

2013-08-01

A novel strategy employing the rhabdovirus untranslated conserved intergenic regions was developed and applied successfully for the determination of the complete nucleotide sequence of Eggplant mottled dwarf virus (EMDV). The EMDV genome contains seven open reading frames with the same organization as Potato yellow dwarf virus (PYDV), the type species of the genus Nucleorhabdovirus. These two species encode five core genes [nucleocapsid (N), phosphoprotein (P), matrix (M), glycoprotein (G), and the polymerase (L)] like other viruses of the genus and an additional one (X), located between N and P, giving rise to a protein with currently unknown function. Furthermore, both EMDV and PYDV contain a gene (Y), inserted between P and M, which probably encodes the virus movement protein, in concordance with the rest of the plant-infecting rhabdoviruses. Phylogenetic analysis of the polymerase gene confirmed the classification of EMDV within the genus Nucleorhabdovirus and showed a close evolutionary relationship to PYDV. The novel sequencing strategy developed is a useful tool for the genome determination of yet uncharacterized rhabdoviruses.
Predictive and comparative analysis of Ebolavirus proteins

PubMed Central

Cong, Qian; Pei, Jimin; Grishin, Nick V

2015-01-01

Ebolavirus is the pathogen for Ebola Hemorrhagic Fever (EHF). This disease exhibits a high fatality rate and has recently reached a historically epidemic proportion in West Africa. Out of the 5 known Ebolavirus species, only Reston ebolavirus has lost human pathogenicity, while retaining the ability to cause EHF in long-tailed macaque. Significant efforts have been spent to determine the three-dimensional (3D) structures of Ebolavirus proteins, to study their interaction with host proteins, and to identify the functional motifs in these viral proteins. Here, in light of these experimental results, we apply computational analysis to predict the 3D structures and functional sites for Ebolavirus protein domains with unknown structure, including a zinc-finger domain of VP30, the RNA-dependent RNA polymerase catalytic domain and a methyltransferase domain of protein L. In addition, we compare sequences of proteins that interact with Ebolavirus proteins from RESTV-resistant primates with those from RESTV-susceptible monkeys. The host proteins that interact with GP and VP35 show an elevated level of sequence divergence between the RESTV-resistant and RESTV-susceptible species, suggesting that they may be responsible for host specificity. Meanwhile, we detect variable positions in protein sequences that are likely associated with the loss of human pathogenicity in RESTV, map them onto the 3D structures and compare their positions to known functional sites. VP35 and VP30 are significantly enriched in these potential pathogenicity determinants and the clustering of such positions on the surfaces of VP35 and GP suggests possible uncharacterized interaction sites with host proteins that contribute to the virulence of Ebolavirus. PMID:26158395
Predictive and comparative analysis of Ebolavirus proteins.

PubMed

Cong, Qian; Pei, Jimin; Grishin, Nick V

2015-01-01

Ebolavirus is the pathogen for Ebola Hemorrhagic Fever (EHF). This disease exhibits a high fatality rate and has recently reached a historically epidemic proportion in West Africa. Out of the 5 known Ebolavirus species, only Reston ebolavirus has lost human pathogenicity, while retaining the ability to cause EHF in long-tailed macaque. Significant efforts have been spent to determine the three-dimensional (3D) structures of Ebolavirus proteins, to study their interaction with host proteins, and to identify the functional motifs in these viral proteins. Here, in light of these experimental results, we apply computational analysis to predict the 3D structures and functional sites for Ebolavirus protein domains with unknown structure, including a zinc-finger domain of VP30, the RNA-dependent RNA polymerase catalytic domain and a methyltransferase domain of protein L. In addition, we compare sequences of proteins that interact with Ebolavirus proteins from RESTV-resistant primates with those from RESTV-susceptible monkeys. The host proteins that interact with GP and VP35 show an elevated level of sequence divergence between the RESTV-resistant and RESTV-susceptible species, suggesting that they may be responsible for host specificity. Meanwhile, we detect variable positions in protein sequences that are likely associated with the loss of human pathogenicity in RESTV, map them onto the 3D structures and compare their positions to known functional sites. VP35 and VP30 are significantly enriched in these potential pathogenicity determinants and the clustering of such positions on the surfaces of VP35 and GP suggests possible uncharacterized interaction sites with host proteins that contribute to the virulence of Ebolavirus.
Enriching the annotation of Mycobacterium tuberculosis H37Rv proteome using remote homology detection approaches: insights into structure and function.

PubMed

Ramakrishnan, Gayatri; Ochoa-Montaño, Bernardo; Raghavender, Upadhyayula S; Mudgal, Richa; Joshi, Adwait G; Chandra, Nagasuma R; Sowdhamini, Ramanathan; Blundell, Tom L; Srinivasan, Narayanaswamy

2015-01-01

The availability of the genome sequence of Mycobacterium tuberculosis H37Rv has encouraged determination of large numbers of protein structures and detailed definition of the biological information encoded therein; yet, the functions of many proteins in M. tuberculosis remain unknown. The emergence of multidrug resistant strains makes it a priority to exploit recent advances in homology recognition and structure prediction to re-analyse its gene products. Here we report the structural and functional characterization of gene products encoded in the M. tuberculosis genome, with the help of sensitive profile-based remote homology search and fold recognition algorithms resulting in an enhanced annotation of the proteome where 95% of the M. tuberculosis proteins were identified wholly or partly with information on structure or function. New information includes association of 244 proteins with 205 domain families and a separate set of new association of folds to 64 proteins. Extending structural information across uncharacterized protein families represented in the M. tuberculosis proteome, by determining superfamily relationships between families of known and unknown structures, has contributed to an enhancement in the knowledge of structural content. In retrospect, such superfamily relationships have facilitated recognition of probable structure and/or function for several uncharacterized protein families, eventually aiding recognition of probable functions for homologous proteins corresponding to such families. Gene products unique to mycobacteria for which no functions could be identified are 183. Of these 18 were determined to be M. tuberculosis specific. Such pathogen-specific proteins are speculated to harbour virulence factors required for pathogenesis. A re-annotated proteome of M. tuberculosis, with greater completeness of annotated proteins and domain assigned regions, provides a valuable basis for experimental endeavours designed to obtain a better understanding of pathogenesis and to accelerate the process of drug target discovery. Copyright © 2014 Elsevier Ltd. All rights reserved.
Deciphering the molecular and functional basis of Dbl family proteins: a novel systematic approach toward classification of selective activation of the Rho family proteins.

PubMed

Jaiswal, Mamta; Dvorsky, Radovan; Ahmadian, Mohammad Reza

2013-02-08

The diffuse B-cell lymphoma (Dbl) family of the guanine nucleotide exchange factors is a direct activator of the Rho family proteins. The Rho family proteins are involved in almost every cellular process that ranges from fundamental (e.g. the establishment of cell polarity) to highly specialized processes (e.g. the contraction of vascular smooth muscle cells). Abnormal activation of the Rho proteins is known to play a crucial role in cancer, infectious and cognitive disorders, and cardiovascular diseases. However, the existence of 74 Dbl proteins and 25 Rho-related proteins in humans, which are largely uncharacterized, has led to increasing complexity in identifying specific upstream pathways. Thus, we comprehensively investigated sequence-structure-function-property relationships of 21 representatives of the Dbl protein family regarding their specificities and activities toward 12 Rho family proteins. The meta-analysis approach provides an unprecedented opportunity to broadly profile functional properties of Dbl family proteins, including catalytic efficiency, substrate selectivity, and signaling specificity. Our analysis has provided novel insights into the following: (i) understanding of the relative differences of various Rho protein members in nucleotide exchange; (ii) comparing and defining individual and overall guanine nucleotide exchange factor activities of a large representative set of the Dbl proteins toward 12 Rho proteins; (iii) grouping the Dbl family into functionally distinct categories based on both their catalytic efficiencies and their sequence-structural relationships; (iv) identifying conserved amino acids as fingerprints of the Dbl and Rho protein interaction; and (v) defining amino acid sequences conserved within, but not between, Dbl subfamilies. Therefore, the characteristics of such specificity-determining residues identified the regions or clusters conserved within the Dbl subfamilies.
A Novel Family of Sequence-specific Endoribonucleases Associated with the Clustered Regularly Interspaced Short Palindromic Repeats

DOE Office of Scientific and Technical Information (OSTI.GOV)

Beloglazova, Natalia; Brown, Greg; Zimmerman, Matthew D.

Clustered regularly interspaced short palindromic repeats (CRISPRs) together with the associated CAS proteins protect microbial cells from invasion by foreign genetic elements using presently unknown molecular mechanisms. All CRISPR systems contain proteins of the CAS2 family, suggesting that these uncharacterized proteins play a central role in this process. Here we show that the CAS2 proteins represent a novel family of endoribonucleases. Six purified CAS2 proteins from diverse organisms cleaved single-stranded RNAs preferentially within U-rich regions. A representative CAS2 enzyme, SSO1404 from Sulfolobus solfataricus, cleaved the phosphodiester linkage on the 3'-side and generated 5'-phosphate- and 3'-hydroxyl-terminated oligonucleotides. The crystal structure of SSO1404 was solved at 1.6 Å resolution revealing the first ribonuclease with a ferredoxin-like fold. Mutagenesis of SSO1404 identified six residues (Tyr-9, Asp-10, Arg-17, Arg-19, Arg-31, and Phe-37) that are important for enzymatic activity and suggested that Asp-10 might be the principal catalytic residue. Thus, CAS2 proteins are sequence-specific endoribonucleases, and we propose that their role in the CRISPR-mediated anti-phage defense might involve degradation of phage or cellular mRNAs.
Structural and immunologic characterization of bovine, horse, and rabbit serum albumins

PubMed Central

Majorek, Karolina A.; Porebski, Przemyslaw J.; Dayal, Arjun; Zimmerman, Matthew D.; Jablonska, Kamila; Stewart, Alan J.; Chruszcz, Maksymilian; Minor, Wladek

2012-01-01

Serum albumin (SA) is the most abundant plasma protein in mammals. SA is a multifunctional protein with extraordinary ligand binding capacity, making it a transporter molecule for a diverse range of metabolites, drugs, nutrients, metals and other molecules. Due to its ligand binding properties, albumins have wide clinical, pharmaceutical, and biochemical applications. Albumins are also allergenic, and exhibit a high degree of cross-reactivity due to significant sequence and structure similarity of SAs from different organisms. Here we present crystal structures of albumins from cattle (BSA), horse (ESA) and rabbit (RSA) serums. The structural data are correlated with the results of immunological studies of SAs. We also analyze the conservation or divergence of structures and sequences of SAs in the context of their potential allergenicity and cross-reactivity. In addition, we identified a previously uncharacterized ligand binding site in the structure of RSA, and calcium binding sites in the structure of BSA, which is the first serum albumin structure to contain metal ions. PMID:22677715
In silico serine β-lactamases analysis reveals a huge potential resistome in environmental and pathogenic species.

PubMed

Brandt, Christian; Braun, Sascha D; Stein, Claudia; Slickers, Peter; Ehricht, Ralf; Pletz, Mathias W; Makarewicz, Oliwia

2017-02-24

The secretion of antimicrobial compounds is an ancient mechanism with clear survival benefits for microbes competing with other microorganisms. Consequently, mechanisms that confer resistance are also ancient and may represent an underestimated reservoir in environmental bacteria. In this context, β-lactamases (BLs) are of great interest due to their long-term presence and diversification in the hospital environment, leading to the emergence of Gram-negative pathogens that are resistant to cephalosporins (extended spectrum BLs = ESBLs) and carbapenems (carbapenemases). In the current study, protein sequence databases were used to analyze BLs, and the results revealed a substantial number of unknown and functionally uncharacterized BLs in a multitude of environmental and pathogenic species. Together, these BLs represent an uncharacterized reservoir of potentially transferable resistance genes. Considering all available data, in silico approaches appear to more adequately reflect a given resistome than analyses of limited datasets. This approach leads to a more precise definition of BL clades and conserved motifs. Moreover, it may support the prediction of new resistance determinants and improve the tailored development of robust molecular diagnostics.
PFP: Automated prediction of gene ontology functional annotations with confidence scores using protein sequence data.

PubMed

Hawkins, Troy; Chitale, Meghana; Luban, Stanislav; Kihara, Daisuke

2009-02-15

Protein function prediction is a central problem in bioinformatics, increasing in importance recently due to the rapid accumulation of biological data awaiting interpretation. Sequence data represents the bulk of this new stock and is the obvious target for consideration as input, as newly sequenced organisms often lack any other type of biological characterization. We have previously introduced PFP (Protein Function Prediction) as our sequence-based predictor of Gene Ontology (GO) functional terms. PFP interprets the results of a PSI-BLAST search by extracting and scoring individual functional attributes, searching a wide range of E-value sequence matches, and utilizing conventional data mining techniques to fill in missing information. We have shown it to be effective in predicting both specific and low-resolution functional attributes when sufficient data is unavailable. Here we describe (1) significant improvements to the PFP infrastructure, including the addition of prediction significance and confidence scores, (2) a thorough benchmark of performance and comparisons to other related prediction methods, and (3) applications of PFP predictions to genome-scale data. We applied PFP predictions to uncharacterized protein sequences from 15 organisms. Among these sequences, 60-90% could be annotated with a GO molecular function term at high confidence (≥80%). We also applied our predictions to the protein-protein interaction network of the Malaria plasmodium (Plasmodium falciparum). High confidence GO biological process predictions (≥90%) from PFP increased the number of fully enriched interactions in this dataset from 23% of interactions to 94%. Our benchmark comparison shows significant performance improvement of PFP relative to GOtcha, InterProScan, and PSI-BLAST predictions. This is consistent with the performance of PFP as the overall best predictor in both the AFP-SIG '05 and CASP7 function (FN) assessments. PFP is available as a web service at http://dragon.bio.purdue.edu/pfp/. © 2008 Wiley-Liss, Inc.
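
The idea of scoring functional attributes across a wide range of E-value matches can be illustrated with a minimal sketch. This is not PFP's published scoring function: the function name score_go_terms, the input format, and the shift parameter are assumptions made for illustration. Each hit adds a weight of -log10(E-value) plus a shift to every GO term it carries, so that weak hits still contribute a little.

```python
import math
from collections import defaultdict


def score_go_terms(hits, shift=2.0):
    """Accumulate a simple E-value-weighted score per GO term.

    hits  : iterable of (e_value, go_terms) pairs, one per database hit
            (hypothetical input format, not PFP's own).
    shift : assumed constant that lets hits with E-values above 1
            still contribute a small positive weight.
    """
    scores = defaultdict(float)
    for e_value, go_terms in hits:
        # Clamp the E-value so that a reported 0.0 does not break log10.
        weight = -math.log10(max(e_value, 1e-300)) + shift
        if weight <= 0:
            continue  # hit is too weak to count under this shift
        for term in go_terms:
            scores[term] += weight
    return dict(scores)


example_hits = [
    (1e-50, ["GO:0003677"]),
    (1e-3, ["GO:0003677", "GO:0005524"]),
    (10.0, ["GO:0005524"]),
]
ranked = sorted(score_go_terms(example_hits).items(), key=lambda kv: -kv[1])
print(ranked)
```

In this toy input the term supported by the strong hit ranks first, while the term seen only in weak hits still receives a small score instead of being discarded.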

iHyd-PseCp: Identify hydroxyproline and hydroxylysine in proteins by incorporating sequence-coupled effects into general PseAAC.

PubMed

Qiu, Wang-Ren; Sun, Bi-Qian; Xiao, Xuan; Xu, Zhao-Chun; Chou, Kuo-Chen

2016-07-12

Protein hydroxylation is a posttranslational modification (PTM), in which a CH group in Pro (P) or Lys (K) residue has been converted into a COH group, or a hydroxyl group (-OH) is converted into an organic compound. Closely associated with cellular signaling activities, this type of PTM is also involved in some major diseases, such as stomach cancer and lung cancer. Therefore, from the angles of both basic research and drug development, we are facing a challenging problem: for an uncharacterized protein sequence containing many residues of P or K, which ones can be hydroxylated, and which ones cannot? With the explosive growth of protein sequences in the post-genomic age, the problem has become even more urgent. To address such a problem, we have developed a predictor called iHyd-PseCp by incorporating the sequence-coupled information into the general pseudo amino acid composition (PseAAC) and introducing the "Random Forest" algorithm to operate the calculation. Rigorous jackknife tests indicated that the new predictor remarkably outperformed the existing state-of-the-art prediction method for the same purpose. For the convenience of most experimental scientists, a user-friendly web-server for iHyd-PseCp has been established at http://www.jci-bioinfo.cn/iHyd-PseCp, by which users can easily obtain their desired results without the need to go through the complicated mathematical equations involved.
COMBREX-DB: an experiment centered database of protein function: knowledge, predictions and knowledge gaps.

PubMed

Chang, Yi-Chien; Hu, Zhenjun; Rachlin, John; Anton, Brian P; Kasif, Simon; Roberts, Richard J; Steffen, Martin

2016-01-04

The COMBREX database (COMBREX-DB; combrex.bu.edu) is an online repository of information related to (i) experimentally determined protein function, (ii) predicted protein function, (iii) relationships among proteins of unknown function and various types of experimental data, including molecular function, protein structure, and associated phenotypes. The database was created as part of the novel COMBREX (COMputational BRidges to EXperiments) effort aimed at accelerating the rate of gene function validation. It currently holds information on ∼ 3.3 million known and predicted proteins from over 1000 completely sequenced bacterial and archaeal genomes. The database also contains a prototype recommendation system for helping users identify those proteins whose experimental determination of function would be most informative for predicting function for other proteins within protein families. The emphasis on documenting experimental evidence for function predictions, and the prioritization of uncharacterized proteins for experimental testing distinguish COMBREX from other publicly available microbial genomics resources. This article describes updates to COMBREX-DB since an initial description in the 2011 NAR Database Issue. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Functional discovery via a compendium of expression profiles.

PubMed

Hughes, T R; Marton, M J; Jones, A R; Roberts, C J; Stoughton, R; Armour, C D; Bennett, H A; Coffey, E; Dai, H; He, Y D; Kidd, M J; King, A M; Meyer, M R; Slade, D; Lum, P Y; Stepaniants, S B; Shoemaker, D D; Gachotte, D; Chakraburtty, K; Simon, J; Bard, M; Friend, S H

2000-07-07

Ascertaining the impact of uncharacterized perturbations on the cell is a fundamental problem in biology. Here, we describe how a single assay can be used to monitor hundreds of different cellular functions simultaneously. We constructed a reference database or "compendium" of expression profiles corresponding to 300 diverse mutations and chemical treatments in S. cerevisiae, and we show that the cellular pathways affected can be determined by pattern matching, even among very subtle profiles. The utility of this approach is validated by examining profiles caused by deletions of uncharacterized genes: we identify and experimentally confirm that eight uncharacterized open reading frames encode proteins required for sterol metabolism, cell wall function, mitochondrial respiration, or protein synthesis. We also show that the compendium can be used to characterize pharmacological perturbations by identifying a novel target of the commonly used drug dyclonine.
Genomic Enzymology: Web Tools for Leveraging Protein Family Sequence–Function Space and Genome Context to Discover Novel Functions

PubMed Central

2017-01-01

The exponentially increasing number of protein and nucleic acid sequences provides opportunities to discover novel enzymes, metabolic pathways, and metabolites/natural products, thereby adding to our knowledge of biochemistry and biology. The challenge has evolved from generating sequence information to mining the databases to integrating and leveraging the available information, i.e., the availability of “genomic enzymology” web tools. Web tools that allow identification of biosynthetic gene clusters are widely used by the natural products/synthetic biology community, thereby facilitating the discovery of novel natural products and the enzymes responsible for their biosynthesis. However, many novel enzymes with interesting mechanisms participate in uncharacterized small-molecule metabolic pathways; their discovery and functional characterization also can be accomplished by leveraging information in protein and nucleic acid databases. This Perspective focuses on two genomic enzymology web tools that assist the discovery of novel metabolic pathways: (1) Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST) for generating sequence similarity networks to visualize and analyze sequence–function space in protein families and (2) Enzyme Function Initiative-Genome Neighborhood Tool (EFI-GNT) for generating genome neighborhood networks to visualize and analyze the genome context in microbial and fungal genomes. Both tools have been adapted to other applications to facilitate target selection for enzyme discovery and functional characterization. As the natural products community has demonstrated, the enzymology community needs to embrace the essential role of web tools that allow the protein and genome sequence databases to be leveraged for novel insights into enzymological problems. PMID:28826221
In Silico Pattern-Based Analysis of the Human Cytomegalovirus Genome

PubMed Central

Rigoutsos, Isidore; Novotny, Jiri; Huynh, Tien; Chin-Bow, Stephen T.; Parida, Laxmi; Platt, Daniel; Coleman, David; Shenk, Thomas

2003-01-01

More than 200 open reading frames (ORFs) from the human cytomegalovirus genome have been reported as potentially coding for proteins. We have used two pattern-based in silico approaches to analyze this set of putative viral genes. With the help of an objective annotation method that is based on the Bio-Dictionary, a comprehensive collection of amino acid patterns that describes the currently known natural sequence space of proteins, we have reannotated all of the previously reported putative genes of the human cytomegalovirus. Also, with the help of MUSCA, a pattern-based multiple sequence alignment algorithm, we have reexamined the original human cytomegalovirus gene family definitions. Our analysis of the genome shows that many of the coded proteins comprise amino acid combinations that are unique to either the human cytomegalovirus or the larger group of herpesviruses. We have confirmed that a surprisingly large portion of the analyzed ORFs encode membrane proteins, and we have discovered a significant number of previously uncharacterized proteins that are predicted to be G-protein-coupled receptor homologues. The analysis also indicates that many of the encoded proteins undergo posttranslational modifications such as hydroxylation, phosphorylation, and glycosylation. ORFs encoding proteins with similar functional behavior appear in neighboring regions of the human cytomegalovirus genome. All of the results of the present study can be found and interactively explored online (http://cbcsrv.watson.ibm.com/virus/). PMID:12634390
In silico pattern-based analysis of the human cytomegalovirus genome.

PubMed

Rigoutsos, Isidore; Novotny, Jiri; Huynh, Tien; Chin-Bow, Stephen T; Parida, Laxmi; Platt, Daniel; Coleman, David; Shenk, Thomas

2003-04-01

More than 200 open reading frames (ORFs) from the human cytomegalovirus genome have been reported as potentially coding for proteins. We have used two pattern-based in silico approaches to analyze this set of putative viral genes. With the help of an objective annotation method that is based on the Bio-Dictionary, a comprehensive collection of amino acid patterns that describes the currently known natural sequence space of proteins, we have reannotated all of the previously reported putative genes of the human cytomegalovirus. Also, with the help of MUSCA, a pattern-based multiple sequence alignment algorithm, we have reexamined the original human cytomegalovirus gene family definitions. Our analysis of the genome shows that many of the coded proteins comprise amino acid combinations that are unique to either the human cytomegalovirus or the larger group of herpesviruses. We have confirmed that a surprisingly large portion of the analyzed ORFs encode membrane proteins, and we have discovered a significant number of previously uncharacterized proteins that are predicted to be G-protein-coupled receptor homologues. The analysis also indicates that many of the encoded proteins undergo posttranslational modifications such as hydroxylation, phosphorylation, and glycosylation. ORFs encoding proteins with similar functional behavior appear in neighboring regions of the human cytomegalovirus genome. All of the results of the present study can be found and interactively explored online (http://cbcsrv.watson.ibm.com/virus/).
ProBiS-database: precalculated binding site similarities and local pairwise alignments of PDB structures.

PubMed

Konc, Janez; Cesnik, Tomo; Konc, Joanna Trykowska; Penca, Matej; Janežič, Dušanka

2012-02-27

ProBiS-Database is a searchable repository of precalculated local structural alignments in proteins detected by the ProBiS algorithm in the Protein Data Bank. Identification of functionally important binding regions of the protein is facilitated by structural similarity scores mapped to the query protein structure. PDB structures that have been aligned with a query protein may be rapidly retrieved from the ProBiS-Database, which is thus able to generate hypotheses concerning the roles of uncharacterized proteins. Presented with an uncharacterized protein structure, ProBiS-Database can discern relationships between such a query protein and other better known proteins in the PDB. Fast access and a user-friendly graphical interface promote easy exploration of this database of over 420 million local structural alignments. The ProBiS-Database is updated weekly and is freely available online at http://probis.cmm.ki.si/database.
In-depth proteomic analysis of a mollusc shell: acid-soluble and acid-insoluble matrix of the limpet Lottia gigantea

PubMed Central

2012-01-01

Background Invertebrate biominerals are characterized by their extraordinary functionality and physical properties, such as strength, stiffness and toughness that by far exceed those of the pure mineral component of such composites. This is attributed to the organic matrix, secreted by specialized cells, which pervades and envelops the mineral crystals. Despite the obvious importance of the protein fraction of the organic matrix, only few in-depth proteomic studies have been performed due to the lack of comprehensive protein sequence databases. The recent public release of the gastropod Lottia gigantea genome sequence and the associated protein sequence database provides for the first time the opportunity to do a state-of-the-art proteomic in-depth analysis of the organic matrix of a mollusc shell. Results Using three different sodium hypochlorite washing protocols before shell demineralization, a total of 569 proteins were identified in Lottia gigantea shell matrix. Of these, 311 were assembled in a consensus proteome comprising identifications contained in all proteomes irrespective of shell cleaning procedure. Some of these proteins were similar in amino acid sequence, amino acid composition, or domain structure to proteins identified previously in different bivalve or gastropod shells, such as BMSP, dermatopontin, nacrein, perlustrin, perlucin, or Pif. In addition there were dozens of previously uncharacterized proteins, many containing repeated short linear motifs or homorepeats. Such proteins may play a role in shell matrix construction or control of mineralization processes. Conclusions The organic matrix of Lottia gigantea shells is a complex mixture of proteins comprising possible homologs of some previously characterized mollusc shell proteins, but also many novel proteins with a possible function in biomineralization as framework building blocks or as regulatory components. We hope that this data set, the most comprehensive available at present, will provide a platform for the further exploration of biomineralization processes in molluscs. PMID:22540284
Implementation and assessment of a yeast orphan gene research project: involving undergraduates in authentic research experiences and progressing our understanding of uncharacterized open reading frames

PubMed Central

Bowling, Bethany V.; Schultheis, Patrick J.

2015-01-01

Saccharomyces cerevisiae was the first eukaryotic organism to be sequenced; however, little progress has been made in recent years in furthering our understanding of all open reading frames (ORFs). From October 2012 to May 2015 the number of verified ORFs has only risen from 75.31% to 78%, while the number of uncharacterized ORFs has decreased from 12.8% to 11% (representing more than 700 genes still left in this category) [http://www.yeastgenome.org/genomesnapshot]. Course-based research has been shown to increase student learning while providing experience with real scientific investigation; however, implementation in large, multi-section courses presents many challenges. This study sought to test the feasibility and effectiveness of incorporating authentic research into a core genetics course with multiple instructors to increase student learning and progress our understanding of uncharacterized ORFs. We generated a module-based annotation toolkit and utilized easily accessible bioinformatics tools to predict gene function for uncharacterized ORFs within the Saccharomyces Genome Database (SGD). Students were each assigned an uncharacterized ORF, which they annotated using contemporary comparative genomics methodologies including multiple sequence alignment, conserved domain identification, signal peptide prediction and cellular localization algorithms. Student learning outcomes were measured by quizzes, project reports and presentations, as well as a post-project questionnaire. Our results indicate that the authentic research experience had positive impacts on students' perception of their learning and their confidence to conduct future research. Furthermore, we believe that creation of an online repository and adoption and/or adaptation of this project across multiple researchers and institutions could speed the process of gene function prediction. PMID:26460164
Implementation and assessment of a yeast orphan gene research project: involving undergraduates in authentic research experiences and progressing our understanding of uncharacterized open reading frames.

PubMed

Bowling, Bethany V; Schultheis, Patrick J; Strome, Erin D

2016-02-01

Saccharomyces cerevisiae was the first eukaryotic organism to be sequenced; however, little progress has been made in recent years in furthering our understanding of all open reading frames (ORFs). From October 2012 to May 2015 the number of verified ORFs had only risen from 75.31% to 78%, while the number of uncharacterized ORFs had decreased from 12.8% to 11% (representing > 700 genes still left in this category; http://www.yeastgenome.org/genomesnapshot). Course-based research has been shown to increase student learning while providing experience with real scientific investigation; however, implementation in large, multi-section courses presents many challenges. This study sought to test the feasibility and effectiveness of incorporating authentic research into a core genetics course, with multiple instructors, to increase student learning and progress our understanding of uncharacterized ORFs. We generated a module-based annotation toolkit and utilized easily accessible bioinformatics tools to predict gene function for uncharacterized ORFs within the Saccharomyces Genome Database (SGD). Students were each assigned an uncharacterized ORF, which they annotated using contemporary comparative genomics methodologies, including multiple sequence alignment, conserved domain identification, signal peptide prediction and cellular localization algorithms. Student learning outcomes were measured by quizzes, project reports and presentations, as well as a post-project questionnaire. Our results indicate that the authentic research experience had positive impacts on students' perception of their learning and their confidence to conduct future research. Furthermore, we believe that creation of an online repository and adoption and/or adaptation of this project across multiple researchers and institutions could speed the process of gene function prediction. Copyright © 2015 John Wiley & Sons, Ltd.
The identification and functional annotation of RNA structures conserved in vertebrates

PubMed Central

Seemann, Stefan E.; Mirza, Aashiq H.; Hansen, Claus; Bang-Berthelsen, Claus H.; Garde, Christian; Christensen-Dalsgaard, Mikkel; Torarinsson, Elfar; Yao, Zizhen; Workman, Christopher T.; Pociot, Flemming; Nielsen, Henrik; Tommerup, Niels; Ruzzo, Walter L.; Gorodkin, Jan

2017-01-01

Structured elements of RNA molecules are essential in, e.g., RNA stabilization, localization, and protein interaction, and their conservation across species suggests a common functional role. We computationally screened vertebrate genomes for conserved RNA structures (CRSs), leveraging structure-based, rather than sequence-based, alignments. After careful correction for sequence identity and GC content, we predict ∼516,000 human genomic regions containing CRSs. We find that a substantial fraction of human–mouse CRS regions (1) colocalize consistently with binding sites of the same RNA binding proteins (RBPs) or (2) are transcribed in corresponding tissues. Additionally, a CaptureSeq experiment revealed expression of many of our CRS regions in human fetal brain, including 662 novel ones. For selected human and mouse candidate pairs, qRT-PCR and in vitro RNA structure probing supported both shared expression and shared structure despite low abundance and low sequence identity. About 30,000 CRS regions are located near coding or long noncoding RNA genes or within enhancers. Structured (CRS overlapping) enhancer RNAs and extended 3′ ends have significantly increased expression levels over their nonstructured counterparts. Our findings of transcribed uncharacterized regulatory regions that contain CRSs support their RNA-mediated functionality. PMID:28487280
iDNA-Prot: Identification of DNA Binding Proteins Using Random Forest with Grey Model

PubMed Central

Lin, Wei-Zhong; Fang, Jian-An; Xiao, Xuan; Chou, Kuo-Chen

2011-01-01

DNA-binding proteins play crucial roles in various cellular processes. Developing high throughput tools for rapidly and effectively identifying DNA-binding proteins is one of the major challenges in the field of genome annotation. Although many efforts have been made in this regard, further effort is needed to enhance the prediction power. By incorporating the features into the general form of pseudo amino acid composition that were extracted from protein sequences via the “grey model” and by adopting the random forest operation engine, we proposed a new predictor, called iDNA-Prot, for identifying uncharacterized proteins as DNA-binding proteins or non-DNA binding proteins based on their amino acid sequence information alone. The overall success rate by iDNA-Prot was 83.96%, obtained via jackknife tests on a newly constructed stringent benchmark dataset in which none of the proteins included has pairwise sequence identity to any other in a same subset. In addition to achieving a high success rate, the computational time for iDNA-Prot is remarkably shorter in comparison with the relevant existing predictors. Hence it is anticipated that iDNA-Prot may become a useful high throughput tool for large-scale analysis of DNA-binding proteins. As a user-friendly web-server, iDNA-Prot is freely accessible to the public at http://icpr.jci.edu.cn/bioinfo/iDNA-Prot or http://www.jci-bioinfo.cn/iDNA-Prot. Moreover, for the convenience of the vast majority of experimental scientists, a step-by-step guide is provided on how to use the web-server to get the desired results. PMID:21935457
Draft Genome Sequence of Catellicoccus marimammalium, a Novel Species Commonly Found in Gull Feces

EPA Science Inventory

Catellicoccus marimammalium is a relatively uncharacterized Gram-positive, facultative anaerobe with potential utility as an indicator of waterfowl fecal contamination. Here we report an annotated draft genome sequence that suggests this organism may be a symbiotic gut microbe.
Pan-Cancer Analysis of Mutation Hotspots in Protein Domains.

PubMed

Miller, Martin L; Reznik, Ed; Gauthier, Nicholas P; Aksoy, Bülent Arman; Korkut, Anil; Gao, Jianjiong; Ciriello, Giovanni; Schultz, Nikolaus; Sander, Chris

2015-09-23

In cancer genomics, recurrence of mutations in independent tumor samples is a strong indicator of functional impact. However, rare functional mutations can escape detection by recurrence analysis owing to lack of statistical power. We enhance statistical power by extending the notion of recurrence of mutations from single genes to gene families that share homologous protein domains. Domain mutation analysis also sharpens the functional interpretation of the impact of mutations, as domains more succinctly embody function than entire genes. By mapping mutations in 22 different tumor types to equivalent positions in multiple sequence alignments of domains, we confirm well-known functional mutation hotspots, identify uncharacterized rare variants in one gene that are equivalent to well-characterized mutations in another gene, detect previously unknown mutation hotspots, and provide hypotheses about molecular mechanisms and downstream effects of domain mutations. With the rapid expansion of cancer genomics projects, protein domain hotspot analysis will likely provide many more leads linking mutations in proteins to the cancer phenotype. Copyright © 2015 Elsevier Inc. All rights reserved.
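
Mapping mutations from several genes to equivalent positions in a domain alignment can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the toy alignment, and the gene labels GENE_A and GENE_B are hypothetical. Each protein position is translated to its alignment column, and mutations are then counted per column so that recurrence is pooled across the genes sharing the domain.

```python
from collections import Counter


def residue_to_column(aligned_seq):
    """Map 1-based ungapped residue positions to 0-based alignment columns."""
    mapping = {}
    position = 0
    for column, char in enumerate(aligned_seq):
        if char != "-":  # gap characters do not consume a residue position
            position += 1
            mapping[position] = column
    return mapping


def column_mutation_counts(alignment, mutations):
    """Count mutations per alignment column.

    alignment : dict of gene name -> gapped domain sequence (equal lengths).
    mutations : iterable of (gene name, 1-based residue position) pairs.
    Mutations in unknown genes or outside the aligned region are skipped.
    """
    maps = {gene: residue_to_column(seq) for gene, seq in alignment.items()}
    counts = Counter()
    for gene, position in mutations:
        column = maps.get(gene, {}).get(position)
        if column is not None:
            counts[column] += 1
    return counts


# Toy alignment with hypothetical gene names.
toy_alignment = {"GENE_A": "MK-LV", "GENE_B": "M-RLV"}
toy_mutations = [("GENE_A", 3), ("GENE_B", 3), ("GENE_A", 1)]
print(column_mutation_counts(toy_alignment, toy_mutations))
```

Here residue 3 of both genes falls in the same alignment column, so two mutations that would each look rare within a single gene are pooled into one candidate hotspot.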
Computational Analysis of Uncharacterized Proteins of Environmental Bacterial Genome

NASA Astrophysics Data System (ADS)

Coxe, K. J.; Kumar, M.

2017-12-01

Betaproteobacteria strain CB is a gram-negative bacterium in the phylum Proteobacteria and is found naturally in soil and water. In this complex environment, bacteria play a key role in efficiently eliminating organic material and other pollutants from wastewater. To investigate the process of pollutant removal from wastewater using bacteria, it is important to characterize the proteins encoded by the bacterial genome. Our study combines a number of bioinformatics tools to predict the function of unassigned proteins in the bacterial genome. The genome of Betaproteobacteria strain CB encodes 2,112 proteins, of which 508 have unknown function and are termed uncharacterized proteins (UPs). The localization of the UPs within the cell was determined and the structure of 38 UPs was accurately predicted. These UPs were predicted to belong to various classes of proteins such as enzymes, transporters, binding proteins, signal peptides, transmembrane proteins and other proteins. The outcome of this work will help to better understand wastewater treatment mechanisms.
PVP-SVM: Sequence-Based Prediction of Phage Virion Proteins Using a Support Vector Machine

PubMed Central

Manavalan, Balachandran; Shin, Tae H.; Lee, Gwang

2018-01-01

Accurately identifying bacteriophage virion proteins from uncharacterized sequences is important to understand interactions between the phage and its host bacteria in order to develop new antibacterial drugs. However, identification of such proteins using experimental techniques is expensive and often time consuming; hence, development of an efficient computational algorithm for the prediction of phage virion proteins (PVPs) prior to in vitro experimentation is needed. Here, we describe a support vector machine (SVM)-based PVP predictor, called PVP-SVM, which was trained with 136 optimal features. A feature selection protocol was employed to identify the optimal features from a large set that included amino acid composition, dipeptide composition, atomic composition, physicochemical properties, and chain-transition-distribution. PVP-SVM achieved an accuracy of 0.870 during leave-one-out cross-validation, which was 6% higher than control SVM predictors trained with all features, indicating the efficiency of the feature selection method. Furthermore, PVP-SVM displayed superior performance compared to the currently available method, PVPred, and two other machine-learning methods developed in this study when objectively evaluated with an independent dataset. For the convenience of the scientific community, a user-friendly and publicly accessible web server has been established at www.thegleelab.org/PVP-SVM/PVP-SVM.html. PMID:29616000
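
Two of the feature types named above, amino acid composition and dipeptide composition, can be computed in a few lines. The sketch below is a generic illustration and not the PVP-SVM code: the function names are hypothetical, and the choice to skip non-standard residues is an assumption.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq):
    """Return 20 fractions, one per standard residue, in AMINO_ACIDS order."""
    residues = [c for c in seq.upper() if c in AMINO_ACIDS]
    total = len(residues)
    return [residues.count(a) / total if total else 0.0 for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return 400 fractions, one per ordered pair of adjacent residues.

    Pairs containing a non-standard character are skipped (an assumption
    made here), so only truly adjacent standard residues are counted.
    """
    seq = seq.upper()
    pairs = [
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in AMINO_ACIDS and seq[i + 1] in AMINO_ACIDS
    ]
    total = len(pairs)
    return [
        pairs.count(a + b) / total if total else 0.0
        for a, b in product(AMINO_ACIDS, repeat=2)
    ]


features = amino_acid_composition("ACAC") + dipeptide_composition("ACAC")
print(len(features))
```

Concatenating the two vectors gives a 420-dimensional description of a sequence; a feature selection step such as the one the abstract describes would then pick a subset of such features before training a classifier.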
Simple and efficient identification of rare recessive pathologically important sequence variants from next generation exome sequence data.

PubMed

Carr, Ian M; Morgan, Joanne; Watson, Christopher; Melnik, Svitlana; Diggle, Christine P; Logan, Clare V; Harrison, Sally M; Taylor, Graham R; Pena, Sergio D J; Markham, Alexander F; Alkuraya, Fowzan S; Black, Graeme C M; Ali, Manir; Bonthron, David T

2013-07-01

Massively parallel ("next generation") DNA sequencing (NGS) has quickly become the method of choice for seeking pathogenic mutations in rare uncharacterized monogenic diseases. Typically, before DNA sequencing, protein-coding regions are enriched from patient genomic DNA, representing either the entire genome ("exome sequencing") or selected mapped candidate loci. Sequence variants, identified as differences between the patient's and the human genome reference sequences, are then filtered according to various quality parameters. Changes are screened against datasets of known polymorphisms, such as dbSNP and the 1000 Genomes Project, in the effort to narrow the list of candidate causative variants. An increasing number of commercial services now offer to both generate and align NGS data to a reference genome. This potentially allows small groups with limited computing infrastructure and informatics skills to utilize this technology. However, the capability to effectively filter and assess sequence variants is still an important bottleneck in the identification of deleterious sequence variants in both research and diagnostic settings. We have developed an approach to this problem comprising a user-friendly suite of programs that can interactively analyze, filter and screen data from enrichment-capture NGS data. These programs ("Agile Suite") are particularly suitable for small-scale gene discovery or for diagnostic analysis. © 2013 WILEY PERIODICALS, INC.
Complete Genome Sequence of a Porcine Polyomavirus from Nasal Swabs of Pigs with Respiratory Disease

PubMed Central

Smith, Catherine; Bishop, Brian; Stewart, Chelsea; Simonson, Randy

2018-01-01

ABSTRACT Metagenomic sequencing of pooled nasal swabs from pigs with unexplained respiratory disease identified a large number of reads mapping to a previously uncharacterized porcine polyomavirus. Sus scrofa polyomavirus 2 was most closely related to betapolyomaviruses frequently detected in mammalian respiratory samples. PMID:29700160
Molecular recognition of the Tes LIM2-3 domains by the actin-related protein Arp7A.

PubMed

Boëda, Batiste; Knowles, Phillip P; Briggs, David C; Murray-Rust, Judith; Soriano, Erika; Garvalov, Boyan K; McDonald, Neil Q; Way, Michael

2011-04-01

Actin-related proteins (Arps) are a highly conserved family of proteins that have extensive sequence and structural similarity to actin. All characterized Arps are components of large multimeric complexes associated with chromatin or the cytoskeleton. In addition, the human genome encodes five conserved but largely uncharacterized "orphan" Arps, which appear to be mostly testis-specific. Here we show that Arp7A, which has 43% sequence identity with β-actin, forms a complex with the cytoskeletal proteins Tes and Mena in the subacrosomal layer of round spermatids. The N-terminal 65-residue extension to the actin-like fold of Arp7A interacts directly with Tes. The crystal structure of the 1-65(Arp7A)·LIM2-3(Tes)·EVH1(Mena) complex reveals that residues 28-49 of Arp7A contact the LIM2-3 domains of Tes. Two alanine residues from Arp7A that occupy equivalent apolar pockets in both LIM domains as well as an intervening GPAK linker that binds the LIM2-3 junction are critical for the Arp7A-Tes interaction. Equivalent occupied apolar pockets are also seen in the tandem LIM domain structures of LMO4 and Lhx3 bound to unrelated ligands. Our results indicate that apolar pocket interactions are a common feature of tandem LIM domain interactions, but ligand specificity is principally determined by the linker sequence.

iCar-PseCp: identify carbonylation sites in proteins by Monte Carlo sampling and incorporating sequence coupled effects into general PseAAC.

PubMed

Jia, Jianhua; Liu, Zi; Xiao, Xuan; Liu, Bingxiang; Chou, Kuo-Chen

2016-06-07

Carbonylation is a posttranslational modification (PTM or PTLM), where a carbonyl group is added to lysine (K), proline (P), arginine (R), and threonine (T) residues of a protein molecule. Carbonylation plays an important role in orchestrating various biological processes but it is also associated with many diseases such as diabetes, chronic lung disease, Parkinson's disease, Alzheimer's disease, chronic renal failure, and sepsis. Therefore, from the angles of both basic research and drug development, we are facing a challenging problem: for an uncharacterized protein sequence containing many residues of K, P, R, or T, which ones can be carbonylated, and which ones cannot? To address this problem, we have developed a predictor called iCar-PseCp by incorporating the sequence-coupled information into the general pseudo amino acid composition, and balancing out the skewed training dataset by Monte Carlo sampling to expand the positive subset. Rigorous target cross-validations on the same set of carbonylation-known proteins indicated that the new predictor remarkably outperformed its existing counterparts. For the convenience of most experimental scientists, a user-friendly web-server for iCar-PseCp has been established at http://www.jci-bioinfo.cn/iCar-PseCp, by which users can easily obtain their desired results without the need to go through the complicated mathematical equations involved. It has not escaped our notice that the formulation and approach presented here can also be used to analyze many other problems in computational proteomics.
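The question posed in this abstract starts from enumerating every K, P, R, or T residue in a sequence together with its flanking context. The sketch below shows only that enumeration step, plus a naive random resampling of the positive subset as a stand-in for class balancing. It does not reproduce the iCar-PseCp feature encoding or the authors' Monte Carlo scheme; the function names, the window size, and the padding character are assumptions.

```python
import random

CARBONYLATABLE = set("KPRT")


def candidate_windows(sequence, flank=3, pad="X"):
    """Return (1-based position, residue, window) for each K/P/R/T residue.

    Windows have length 2*flank + 1; sequence ends are padded with `pad`.
    """
    padded = pad * flank + sequence + pad * flank
    windows = []
    for i, residue in enumerate(sequence):
        if residue in CARBONYLATABLE:
            # Residue i of `sequence` sits at index i + flank of `padded`,
            # so the slice below is centred on it.
            windows.append((i + 1, residue, padded[i:i + 2 * flank + 1]))
    return windows


def oversample(positives, target_size, rng):
    """Naively expand a positive set by resampling with replacement.

    A simple stand-in for balancing a skewed training set, not the
    sampling procedure used by the published predictor.
    """
    missing = max(0, target_size - len(positives))
    return list(positives) + [rng.choice(positives) for _ in range(missing)]


sites = candidate_windows("MKTAPR", flank=2)
balanced = oversample([w for _, _, w in sites], target_size=10,
                      rng=random.Random(0))
```

A classifier would then score each window; the padding character keeps windows near the termini the same length as those in the interior.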
Route of infection alters virulence of neonatal septicemia Escherichia coli clinical isolates

PubMed Central

Cole, Bryan K.; Scott, Edgar; Ilikj, Marko; Bard, David; Akins, Darrin R.; Dyer, David W.

2017-01-01

Escherichia coli is the leading cause of Gram-negative neonatal septicemia in the United States. Invasion and passage across the neonatal gut after ingestion of maternal E. coli strains produce bacteremia. In this study, we compared the virulence properties of the neonatal E. coli bacteremia clinical isolate SCB34 with the archetypal neonatal E. coli meningitis strain RS218. Whole-genome sequencing data was used to compare the protein coding sequences among these clinical isolates and 33 other representative E. coli strains. Oral inoculation of newborn animals with either strain produced septicemia, whereas intraperitoneal injection caused septicemia only in pups infected with RS218 but not in those injected with SCB34. In addition to being virulent only through the oral route, SCB34 demonstrated significantly greater invasion and transcytosis of polarized intestinal epithelial cells in vitro as compared to RS218. Protein coding sequences comparisons highlighted the presence of known virulence factors that are shared among several of these isolates, and revealed the existence of proteins exclusively encoded in SCB34, many of which remain uncharacterized. Our study demonstrates that oral acquisition is crucial for the virulence properties of the neonatal bacteremia clinical isolate SCB34. This characteristic, along with its enhanced ability to invade and transcytose intestinal epithelium are likely determined by the specific virulence factors that predominate in this strain. PMID:29236742
Metagenomics uncovers gaps in amplicon-based detection of microbial diversity

DOE PAGES

Eloe-Fadrosh, Emiley A.; Ivanova, Natalia N.; Woyke, Tanja; ...

2016-02-01

Our view of microbial diversity has expanded greatly over the past 40 years, primarily through the wide application of PCR-based surveys of the small-subunit ribosomal RNA (SSU rRNA) gene. Yet significant gaps in knowledge remain due to well-recognized limitations of this method. Here in this paper, we systematically survey primer fidelity in SSU rRNA gene sequences recovered from over 6,000 assembled metagenomes sampled globally. Our findings show that approximately 10% of environmental microbial sequences might be missed from classical PCR-based SSU rRNA gene surveys, mostly members of the Candidate Phyla Radiation (CPR) and as yet uncharacterized Archaea. In conclusion, these results underscore the extent of uncharacterized microbial diversity and provide fruitful avenues for describing additional phylogenetic lineages.
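A primer-fidelity check of the kind described here can be sketched as counting mismatches between a degenerate primer and each window of a target sequence, using the standard IUPAC nucleotide codes. The sketch below is only an illustration: the primer and sequences are invented toy strings, the function names are hypothetical, and the zero-mismatch threshold is an assumption rather than the criterion used in the study.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def mismatches(primer, site):
    """Count positions where the site base is not allowed by the primer code."""
    return sum(1 for p, s in zip(primer, site) if s not in IUPAC[p])


def best_primer_hit(primer, sequence):
    """Return (fewest mismatches, 0-based offset) over all windows, or None
    if the sequence is shorter than the primer."""
    n = len(primer)
    if len(sequence) < n:
        return None
    return min(
        (mismatches(primer, sequence[i:i + n]), i)
        for i in range(len(sequence) - n + 1)
    )


def would_be_missed(primer, sequence, max_mismatches=0):
    """True if no window matches the primer within the allowed mismatches."""
    hit = best_primer_hit(primer, sequence)
    return hit is None or hit[0] > max_mismatches


toy_primer = "ACGYT"        # hypothetical degenerate primer (Y = C or T)
matching = "TTACGCTGG"      # contains ACGCT, a perfect match
divergent = "TTACGATGG"     # contains ACGAT, one mismatch at the Y position
```

Applying `would_be_missed` across many recovered gene sequences and taking the fraction flagged gives a rough estimate of how much diversity a given primer would fail to amplify under the chosen threshold.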
Integron-Associated DfrB4, a Previously Uncharacterized Member of the Trimethoprim-Resistant Dihydrofolate Reductase B Family, Is a Clinically Identified Emergent Source of Antibiotic Resistance.

PubMed

Toulouse, Jacynthe L; Edens, Thaddeus J; Alejaldre, Lorea; Manges, Amee R; Pelletier, Joelle N

2017-05-01

Whole-genome sequencing of trimethoprim-resistant Escherichia coli clinical isolates identified a member of the trimethoprim-resistant type II dihydrofolate reductase gene family (dfrB). The dfrB4 gene was located within a class I integron flanked by multiple resistance genes. This arrangement was previously reported in a 130.6-kb multiresistance plasmid. The DfrB4 protein conferred a >2,000-fold increased trimethoprim resistance on overexpression in E. coli. Our results are consistent with the finding that dfrB4 contributes to clinical trimethoprim resistance. Copyright © 2017 American Society for Microbiology.
Dissecting protein loops with a statistical scalpel suggests a functional implication of some structural motifs.

PubMed

Regad, Leslie; Martin, Juliette; Camproux, Anne-Claude

2011-06-20

One of the strategies for protein function annotation is to search particular structural motifs that are known to be shared by proteins with a given function. Here, we present a systematic extraction of structural motifs of seven residues from protein loops and we explore their correspondence with functional sites. Our approach is based on the structural alphabet HMM-SA (Hidden Markov Model - Structural Alphabet), which allows simplification of protein structures into uni-dimensional sequences, and advanced pattern statistics adapted to short sequences. Structural motifs of interest are selected by looking for structural motifs significantly over-represented in SCOP superfamilies in protein loops. We discovered two types of structural motifs significantly over-represented in SCOP superfamilies: (i) ubiquitous motifs, shared by several superfamilies and (ii) superfamily-specific motifs, over-represented in few superfamilies. A comparison of ubiquitous words with known small structural motifs shows that they contain well-described motifs as turn, niche or nest motifs. A comparison between superfamily-specific motifs and biological annotations of Swiss-Prot reveals that some of them actually correspond to functional sites involved in the binding sites of small ligands, such as ATP/GTP, NAD(P) and SAH/SAM. Our findings show that statistical over-representation in SCOP superfamilies is linked to functional features. The detection of over-represented motifs within structures simplified by HMM-SA is therefore a promising approach for prediction of functional sites and annotation of uncharacterized proteins.
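Once structures are reduced to one-dimensional strings of structural-alphabet letters, the search for over-represented seven-residue motifs becomes a word-counting problem. The sketch below counts 7-letter words and computes a crude frequency ratio against a background set. It is a simplification under stated assumptions: the toy strings are invented, the function names are hypothetical, and the pseudocount ratio is a stand-in for the "advanced pattern statistics" the abstract refers to, which are not reproduced here.

```python
from collections import Counter


def count_words(sequences, k=7):
    """Count every overlapping k-letter word in a collection of strings."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def enrichment(word, family_counts, background_counts, pseudocount=1.0):
    """Ratio of the word's smoothed frequency in a family to its smoothed
    frequency in the background; values above 1 suggest over-representation."""
    fam_total = sum(family_counts.values())
    bg_total = sum(background_counts.values())
    fam_freq = (family_counts[word] + pseudocount) / (fam_total + pseudocount)
    bg_freq = (background_counts[word] + pseudocount) / (bg_total + pseudocount)
    return fam_freq / bg_freq


# Toy structural-alphabet strings, invented for illustration.
family = ["ABCDEFGH", "XABCDEFG"]
background = ["HHHHHHHHHH"]
family_counts = count_words(family)
background_counts = count_words(background)
score = enrichment("ABCDEFG", family_counts, background_counts)
```

A raw ratio like this ignores word overlap and sequence composition, which is why dedicated statistics for short sequences are needed before calling a motif significant.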
Dissecting protein loops with a statistical scalpel suggests a functional implication of some structural motifs

PubMed Central

2011-01-01

Background One of the strategies for protein function annotation is to search particular structural motifs that are known to be shared by proteins with a given function. Results Here, we present a systematic extraction of structural motifs of seven residues from protein loops and we explore their correspondence with functional sites. Our approach is based on the structural alphabet HMM-SA (Hidden Markov Model - Structural Alphabet), which allows simplification of protein structures into uni-dimensional sequences, and advanced pattern statistics adapted to short sequences. Structural motifs of interest are selected by looking for structural motifs significantly over-represented in SCOP superfamilies in protein loops. We discovered two types of structural motifs significantly over-represented in SCOP superfamilies: (i) ubiquitous motifs, shared by several superfamilies and (ii) superfamily-specific motifs, over-represented in few superfamilies. A comparison of ubiquitous words with known small structural motifs shows that they contain well-described motifs as turn, niche or nest motifs. A comparison between superfamily-specific motifs and biological annotations of Swiss-Prot reveals that some of them actually correspond to functional sites involved in the binding sites of small ligands, such as ATP/GTP, NAD(P) and SAH/SAM. Conclusions Our findings show that statistical over-representation in SCOP superfamilies is linked to functional features. The detection of over-represented motifs within structures simplified by HMM-SA is therefore a promising approach for prediction of functional sites and annotation of uncharacterized proteins. PMID:21689388
Three new members of the RNP protein family in Xenopus.

PubMed Central

Good, P J; Rebbert, M L; Dawid, I B

1993-01-01

Many RNP proteins contain one or more copies of the RNA recognition motif (RRM) and are thought to be involved in cellular RNA metabolism. We have previously characterized in Xenopus a nervous system specific gene, nrp1, that is more similar to the hnRNP A/B proteins than to other known proteins (K. Richter, P. J. Good, and I. B. Dawid (1990), New Biol. 2, 556-565). PCR amplification with degenerate primers was used to identify additional cDNAs encoding two RRMs in Xenopus. Three previously uncharacterized genes were identified. Two genes encode hnRNP A/B proteins with two RRMs and a glycine-rich domain. One of these is the Xenopus homolog of the human A2/B1 gene; the other, named hnRNP A3, is similar to both the A1 and A2 hnRNP genes. The Xenopus hnRNP A1, A2 and A3 genes are expressed throughout development and in all adult tissues. Multiple protein isoforms for the hnRNP A2 gene are predicted that differ by the insertion of short peptide sequences in the glycine-rich domain. The third newly isolated gene, named xrp1, encodes a protein that is related by sequence to the nrp1 protein but is expressed ubiquitously. Despite the similarity to nuclear RNP proteins, both the nrp1 and xrp1 proteins are localized to the cytoplasm in the Xenopus oocyte. The xrp1 gene may have a function in all cells that is similar to that executed by nrp1 specifically within the nervous system. PMID:8451200
TrypsNetDB: An integrated framework for the functional characterization of trypanosomatid proteins

PubMed Central

Gazestani, Vahid H.; Yip, Chun Wai; Nikpour, Najmeh; Berghuis, Natasha

2017-01-01

Trypanosomatid parasites cause serious infections in humans and production losses in livestock. Due to the high divergence from other eukaryotes, such as humans and model organisms, the functional roles of many trypanosomatid proteins cannot be predicted by homology-based methods, rendering a significant portion of their proteins as uncharacterized. Recent technological advances have led to the availability of multiple systematic and genome-wide datasets on trypanosomatid parasites that are informative regarding the biological role(s) of their proteins. Here, we report TrypsNetDB (http://trypsNetDB.org), a web-based resource for the functional annotation of 16 different species/strains of trypanosomatid parasites. The database not only visualizes the network context of the queried protein(s) in an intuitive way but also examines the response of the represented network in more than 50 different biological contexts and its enrichment for various biological terms and pathways, protein sequence signatures, and potential RNA regulatory elements. The interactome core of the database, as of Jan 23, 2017, contains 101,187 interactions among 13,395 trypanosomatid proteins inferred from 97 genome-wide and focused studies on the interactome of these organisms. PMID:28158179
Systematic discovery of novel eukaryotic transcriptional regulators using sequence homology independent prediction.

PubMed

Bossi, Flavia; Fan, Jue; Xiao, Jun; Chandra, Lilyana; Shen, Max; Dorone, Yanniv; Wagner, Doris; Rhee, Seung Y

2017-06-26

The molecular function of a gene is most commonly inferred by sequence similarity. Therefore, genes that lack sufficient sequence similarity to characterized genes (such as certain classes of transcriptional regulators) are difficult to classify using most function prediction algorithms and have remained uncharacterized. To identify novel transcriptional regulators systematically, we used a feature-based pipeline to screen protein families of unknown function. This method predicted 43 transcriptional regulator families in Arabidopsis thaliana, 7 families in Drosophila melanogaster, and 9 families in Homo sapiens. Literature curation validated 12 of the predicted families to be involved in transcriptional regulation. We tested 33 out of the 195 Arabidopsis putative transcriptional regulators for their ability to activate transcription of a reporter gene in planta and found twelve coactivators, five of which had no prior literature support. To investigate mechanisms of action in which the predicted regulators might work, we looked for interactors of an Arabidopsis candidate that did not show transactivation activity in planta and found that it might work with other members of its own family and a subunit of the Polycomb Repressive Complex 2 to regulate transcription. Our results demonstrate the feasibility of assigning molecular function to proteins of unknown function without depending on sequence similarity. In particular, we identified novel transcriptional regulators using biological features enriched in transcription factors. The predictions reported here should accelerate the characterization of novel regulators.
The identification and functional annotation of RNA structures conserved in vertebrates.

PubMed

Seemann, Stefan E; Mirza, Aashiq H; Hansen, Claus; Bang-Berthelsen, Claus H; Garde, Christian; Christensen-Dalsgaard, Mikkel; Torarinsson, Elfar; Yao, Zizhen; Workman, Christopher T; Pociot, Flemming; Nielsen, Henrik; Tommerup, Niels; Ruzzo, Walter L; Gorodkin, Jan

2017-08-01

Structured elements of RNA molecules are essential in, e.g., RNA stabilization, localization, and protein interaction, and their conservation across species suggests a common functional role. We computationally screened vertebrate genomes for conserved RNA structures (CRSs), leveraging structure-based, rather than sequence-based, alignments. After careful correction for sequence identity and GC content, we predict ∼516,000 human genomic regions containing CRSs. We find that a substantial fraction of human-mouse CRS regions (1) colocalize consistently with binding sites of the same RNA binding proteins (RBPs) or (2) are transcribed in corresponding tissues. Additionally, a CaptureSeq experiment revealed expression of many of our CRS regions in human fetal brain, including 662 novel ones. For selected human and mouse candidate pairs, qRT-PCR and in vitro RNA structure probing supported both shared expression and shared structure despite low abundance and low sequence identity. About 30,000 CRS regions are located near coding or long noncoding RNA genes or within enhancers. Structured (CRS overlapping) enhancer RNAs and extended 3' ends have significantly increased expression levels over their nonstructured counterparts. Our findings of transcribed uncharacterized regulatory regions that contain CRSs support their RNA-mediated functionality. © 2017 Seemann et al.; Published by Cold Spring Harbor Laboratory Press.
Integrated genomics and proteomics of the Torpedo californica electric organ: concordance with the mammalian neuromuscular junction

PubMed Central

2011-01-01

Background During development, the branchial mesoderm of Torpedo californica transdifferentiates into an electric organ capable of generating high voltage discharges to stun fish. The organ contains a high density of cholinergic synapses and has served as a biochemical model for the membrane specialization of myofibers, the neuromuscular junction (NMJ). We studied the genome and proteome of the electric organ to gain insight into its composition, to determine if there is concordance with skeletal muscle and the NMJ, and to identify novel synaptic proteins. Results Of 435 proteins identified, 300 mapped to Torpedo cDNA sequences with ≥2 peptides. We identified 14 uncharacterized proteins in the electric organ that are known to play a role in acetylcholine receptor clustering or signal transduction. In addition, two human open reading frames, C1orf123 and C6orf130, showed high sequence similarity to electric organ proteins. Our profile lists several proteins that are highly expressed in skeletal muscle or are muscle specific. Synaptic proteins such as acetylcholinesterase, acetylcholine receptor subunits, and rapsyn were present in the electric organ proteome but absent in the skeletal muscle proteome. Conclusions Our integrated genomic and proteomic analysis supports research describing a muscle-like profile of the organ. We show that it is a repository of NMJ proteins but we present limitations on its use as a comprehensive model of the NMJ. Finally, we identified several proteins that may become candidates for signaling proteins not previously characterized as components of the NMJ. PMID:21798097
Genome-Wide Sensitivity Analysis of the Microsymbiont Sinorhizobium meliloti to Symbiotically Important, Defensin-Like Host Peptides

PubMed Central

Arnold, Markus F. F.; Shabab, Mohammed; Penterman, Jon; Boehme, Kevin L.; Griffitts, Joel S.

2017-01-01

ABSTRACT The model legume species Medicago truncatula expresses more than 700 nodule-specific cysteine-rich (NCR) signaling peptides that mediate the differentiation of Sinorhizobium meliloti bacteria into nitrogen-fixing bacteroids. NCR peptides are essential for a successful symbiosis in legume plants of the inverted-repeat-lacking clade (IRLC) and show similarity to mammalian defensins. In addition to signaling functions, many NCR peptides exhibit antimicrobial activity in vitro and in vivo. Bacterial resistance to these antimicrobial activities is likely to be important for symbiosis. However, the mechanisms used by S. meliloti to resist antimicrobial activity of plant peptides are poorly understood. To address this, we applied a global genetic approach using transposon mutagenesis followed by high-throughput sequencing (Tn-seq) to identify S. meliloti genes and pathways that increase or decrease bacterial competitiveness during exposure to the well-studied cationic NCR247 peptide and also to the unrelated model antimicrobial peptide polymyxin B. We identified 78 genes and several diverse pathways whose interruption alters S. meliloti resistance to NCR247. These genes encode the following: (i) cell envelope polysaccharide biosynthesis and modification proteins, (ii) inner and outer membrane proteins, (iii) peptidoglycan (PG) effector proteins, and (iv) non-membrane-associated factors such as transcriptional regulators and ribosome-associated factors. We describe a previously uncharacterized yet highly conserved peptidase, which protects S. meliloti from NCR247 and increases competitiveness during symbiosis. Additionally, we highlight a considerable number of uncharacterized genes that provide the basis for future studies to investigate the molecular basis of symbiotic development as well as chronic pathogenic interactions. PMID:28765224
Mutation of the N-Terminal Region of Chikungunya Virus Capsid Protein: Implications for Vaccine Design.

PubMed

Taylor, Adam; Liu, Xiang; Zaid, Ali; Goh, Lucas Y H; Hobson-Peters, Jody; Hall, Roy A; Merits, Andres; Mahalingam, Suresh

2017-02-21

Mosquito-transmitted chikungunya virus (CHIKV) is an arthritogenic alphavirus of the Togaviridae family responsible for frequent outbreaks of arthritic disease in humans. Capsid protein, a structural protein encoded by the CHIKV RNA genome, is able to translocate to the host cell nucleolus. In encephalitic alphaviruses, nuclear translocation induces host cell transcriptional shutoff; however, the role of capsid protein nucleolar localization in arthritogenic alphaviruses remains unclear. Using recombinant enhanced green fluorescent protein (EGFP)-tagged expression constructs and CHIKV infectious clones, we describe a nucleolar localization sequence (NoLS) in the N-terminal region of capsid protein, previously uncharacterized in CHIKV. Mutation of the NoLS by site-directed mutagenesis reduced efficiency of nuclear import of CHIKV capsid protein. In the virus, mutation of the capsid protein NoLS (CHIKV-NoLS) attenuated replication in mammalian and mosquito cells, producing a small-plaque phenotype. Attenuation of CHIKV-NoLS is likely due to disruption of the viral replication cycle downstream of viral RNA synthesis. In mice, CHIKV-NoLS infection caused no disease signs compared to wild-type CHIKV (CHIKV-WT)-infected mice; lack of disease signs correlated with significantly reduced viremia and decreased expression of proinflammatory factors. Mice immunized with CHIKV-NoLS, challenged with CHIKV-WT at 30 days postimmunization, develop no disease signs and no detectable viremia. Serum from CHIKV-NoLS-immunized mice is able to efficiently neutralize CHIKV infection in vitro. Additionally, CHIKV-NoLS-immunized mice challenged with the related alphavirus Ross River virus showed reduced early and peak viremia postchallenge, indicating a cross-protective effect. The high degree of CHIKV-NoLS attenuation may improve CHIKV antiviral and rational vaccine design.
IMPORTANCE CHIKV is a mosquito-borne pathogen capable of causing explosive epidemics of incapacitating joint pain affecting millions of people. After a series of major outbreaks over the last 10 years, CHIKV and its mosquito vectors have been able to expand their range extensively, now making CHIKV a human pathogen of global importance. With no licensed vaccine or antiviral therapy for the treatment of CHIKV disease, there is a growing need to understand the molecular determinants of viral pathogenesis. These studies identify a previously uncharacterized nucleolar localization sequence (NoLS) in CHIKV capsid protein, begin a functional analysis of site-directed mutants of the capsid protein NoLS, and examine the effect of the NoLS mutation on CHIKV pathogenesis in vivo and its potential to influence CHIKV vaccine design. A better understanding of the pathobiology of CHIKV disease will aid the development of effective therapeutic strategies. Copyright © 2017 Taylor et al.
Directed evolution of a synthetic phylogeny of programmable Trp repressors.

PubMed

Ellefson, Jared W; Ledbetter, Michael P; Ellington, Andrew D

2018-04-01

As synthetic regulatory programs expand in sophistication, an ever increasing number of biological components with predictable phenotypes is required. Regulators are often 'part mined' from a diverse, but uncharacterized, array of genomic sequences, often leading to idiosyncratic behavior. Here, we generate an entire synthetic phylogeny from the canonical allosteric transcription factor TrpR. Iterative rounds of positive and negative compartmentalized partnered replication (CPR) led to the exponential amplification of variants that responded with high affinity and specificity to halogenated tryptophan analogs and novel operator sites. Fourteen repressor variants were evolved with unique regulatory profiles across five operators and three ligands. The logic of individual repressors can be modularly programmed by creating heterodimeric fusions, resulting in single proteins that display logic functions, such as 'NAND'. Despite the evolutionarily limited regulatory role of TrpR, vast functional spaces exist around this highly conserved protein scaffold and can be harnessed to create synthetic regulatory programs.
Niakha virus: A novel member of the family Rhabdoviridae isolated from phlebotomine sandflies in Senegal

PubMed Central

Vasilakis, Nikos; Widen, Steven; Mayer, Sandra V.; Seymour, Robert; Wood, Thomas G.; Popov, Vsevolov; Guzman, Hilda; da Rosa, Amelia P.A. Travassos; Ghedin, Elodie; Holmes, Edward C.; Walker, Peter J.; Tesh, Robert B.

2013-01-01

Members of the family Rhabdoviridae have been assigned to eight genera but many remain unassigned. Rhabdoviruses have a remarkably diverse host range that includes terrestrial and marine animals, invertebrates and plants. Transmission of some rhabdoviruses often requires an arthropod vector, such as mosquitoes, midges, sandflies, ticks, aphids and leafhoppers, in which they replicate. Herein we characterize Niakha virus (NIAV), a previously uncharacterized rhabdovirus isolated from phlebotomine sandflies in Senegal. Analysis of the 11,124 nt genome sequence indicates that it encodes the five common rhabdovirus proteins with alternative ORFs in the M, G and L genes. Phylogenetic analysis of the L protein indicates that NIAV’s closest relative is Oak Vale rhabdovirus, although in this analysis NIAV is still so phylogenetically distinct that it might be classified as distinct from the eight currently recognized Rhabdoviridae genera. This observation highlights the vast, and yet not fully recognized, diversity of this family. PMID:23773405
Peptide library synthesis on spectrally encoded beads for multiplexed protein/peptide bioassays

NASA Astrophysics Data System (ADS)

Nguyen, Huy Q.; Brower, Kara; Harink, Björn; Baxter, Brian; Thorn, Kurt S.; Fordyce, Polly M.

2017-02-01

Protein-peptide interactions are essential for cellular responses. Despite their importance, these interactions remain largely uncharacterized due to experimental challenges associated with their measurement. Current techniques (e.g. surface plasmon resonance, fluorescence polarization, and isothermal calorimetry) either require large amounts of purified material or direct fluorescent labeling, making high-throughput measurements laborious and expensive. In this report, we present a new technology for measuring antibody-peptide interactions in vitro that leverages spectrally encoded beads for biological multiplexing. Specific peptide sequences are synthesized directly on encoded beads with a 1:1 relationship between peptide sequence and embedded code, thereby making it possible to track many peptide sequences throughout the course of an experiment within a single small volume. We demonstrate the potential of these bead-bound peptide libraries by: (1) creating a set of 46 peptides composed of 3 commonly used epitope tags (myc, FLAG, and HA) and single amino-acid scanning mutants; (2) incubating with a mixture of fluorescently-labeled anti-myc, anti-FLAG, and anti-HA antibodies; and (3) imaging these bead-bound libraries to simultaneously identify the embedded spectral code (and thus the sequence of the associated peptide) and quantify the amount of each antibody bound. To our knowledge, these data demonstrate the first customized peptide library synthesized directly on spectrally encoded beads. While the implementation of the technology provided here is a high-affinity antibody/protein interaction with a small code space, we believe this platform can be broadly applicable to any range of peptide screening applications, with the capability to multiplex into libraries of hundreds to thousands of peptides in a single assay.
Comprehensive phylogenetic analysis of bacterial reverse transcriptases.

PubMed

Toro, Nicolás; Nisa-Martínez, Rafael

2014-01-01

Much less is known about reverse transcriptases (RTs) in prokaryotes than in eukaryotes, with most prokaryotic enzymes still uncharacterized. Two surveys involving BLAST searches for RT genes in prokaryotic genomes revealed the presence of large numbers of diverse, uncharacterized RTs and RT-like sequences. Here, using consistent annotation across all sequenced bacterial species from GenBank and other sources via RAST, available from the PATRIC (Pathogenic Resource Integration Center) platform, we have compiled the data for currently annotated reverse transcriptases from completely sequenced bacterial genomes. RT sequences are broadly distributed across bacterial phyla, but green sulfur bacteria and cyanobacteria have the highest levels of RT sequence diversity (≤85% identity) per genome. By contrast, phylum Actinobacteria, for which a large number of genomes have been sequenced, was found to have a low RT sequence diversity. Phylogenetic analyses revealed that bacterial RTs could be classified into 17 main groups: group II introns, retrons/retron-like RTs, diversity-generating retroelements (DGRs), Abi-like RTs, CRISPR-Cas-associated RTs, group II-like RTs (G2L), and 11 other groups of RTs of unknown function. Proteobacteria had the highest potential functional diversity, as they possessed most of the RT groups. Group II introns and DGRs were the most widely distributed RTs in bacterial phyla. Our results provide insights into bacterial RT phylogeny and the basis for an update of annotation systems based on sequence/domain homology.
Comprehensive Phylogenetic Analysis of Bacterial Reverse Transcriptases

PubMed Central

Toro, Nicolás; Nisa-Martínez, Rafael

2014-01-01

Much less is known about reverse transcriptases (RTs) in prokaryotes than in eukaryotes, with most prokaryotic enzymes still uncharacterized. Two surveys involving BLAST searches for RT genes in prokaryotic genomes revealed the presence of large numbers of diverse, uncharacterized RTs and RT-like sequences. Here, using consistent annotation across all sequenced bacterial species from GenBank and other sources via RAST, available from the PATRIC (Pathogenic Resource Integration Center) platform, we have compiled the data for currently annotated reverse transcriptases from completely sequenced bacterial genomes. RT sequences are broadly distributed across bacterial phyla, but green sulfur bacteria and cyanobacteria have the highest levels of RT sequence diversity (≤85% identity) per genome. By contrast, phylum Actinobacteria, for which a large number of genomes have been sequenced, was found to have a low RT sequence diversity. Phylogenetic analyses revealed that bacterial RTs could be classified into 17 main groups: group II introns, retrons/retron-like RTs, diversity-generating retroelements (DGRs), Abi-like RTs, CRISPR-Cas-associated RTs, group II-like RTs (G2L), and 11 other groups of RTs of unknown function. Proteobacteria had the highest potential functional diversity, as they possessed most of the RT groups. Group II introns and DGRs were the most widely distributed RTs in bacterial phyla. Our results provide insights into bacterial RT phylogeny and the basis for an update of annotation systems based on sequence/domain homology. PMID:25423096
Suppression of NF-κB signal pathway by NLRC3-like protein in stony coral Acropora aculeus under heat stress.

PubMed

Zhou, Zhi; Wu, Yibo; Zhang, Chengkai; Li, Can; Chen, Guangmei; Yu, Xiaopeng; Shi, Xiaowei; Xu, Yanlai; Wang, Lingui; Huang, Bo

2017-08-01

Heat stress is the most common factor for coral bleaching, which has increased both in frequency and severity due to global warming. In the present study, the stony coral Acropora aculeus was subjected to acute heat stress and entire transcriptomes were sequenced via the next generation sequencing platform. Four paired-end libraries were constructed and sequenced in two groups, including a control and a heat stress group. A total of 120,319,751 paired-end reads with lengths of 2 × 100 bp were assembled and 55,021 coral-derived genes were obtained. After read mapping and abundance estimation, 9110 differentially expressed genes were obtained in the comparison between the control and heat stress group, including 4465 significantly upregulated and 4645 significantly downregulated genes. Twenty-three GO terms in the Biological Process category were overrepresented for significantly upregulated genes, and divided into six groups according to their relationship. Three of these groups were related to the NF-κB signal pathway, and the remaining three groups were relevant for pathogen response, immunocyte activation and protein ubiquitination. Forty-three common genes were found in four GO terms, which were directly related to the NF-κB signal pathway. These included 2 NACHT, LRR, PYD domains-containing protein, 5 nucleotide-binding oligomerization domain-containing protein, 29 NLRC3-like protein, 4 NLRC5-like protein, and 3 uncharacterized protein. For significantly downregulated genes, 27 overrepresented GO terms were found in the Biological Process category, which were relevant to protein ubiquitination and ATP metabolism. Our results indicate that heat stress suppressed the immune response level via the NLRC3-like protein, the fine-tuning of protein turnover activity, and ATP metabolism. This might disrupt the balance of coral-zooxanthellae symbiosis and result in the bleaching of the coral A. aculeus. Copyright © 2017 Elsevier Ltd. All rights reserved.
pSuc-Lys: Predict lysine succinylation sites in proteins with PseAAC and ensemble random forest approach.

PubMed

Jia, Jianhua; Liu, Zi; Xiao, Xuan; Liu, Bingxiang; Chou, Kuo-Chen

2016-04-07

Being one type of post-translational modifications (PTMs), protein lysine succinylation is important in regulating varieties of biological processes. It is also involved with some diseases, however. Consequently, from the angles of both basic research and drug development, we are facing a challenging problem: for an uncharacterized protein sequence having many Lys residues therein, which ones can be succinylated, and which ones cannot? To address this problem, we have developed a predictor called pSuc-Lys through (1) incorporating the sequence-coupled information into the general pseudo amino acid composition, (2) balancing out skewed training dataset by random sampling, and (3) constructing an ensemble predictor by fusing a series of individual random forest classifiers. Rigorous cross-validations indicated that it remarkably outperformed the existing methods. A user-friendly web-server for pSuc-Lys has been established at http://www.jci-bioinfo.cn/pSuc-Lys, by which users can easily obtain their desired results without the need to go through the complicated mathematical equations involved. It has not escaped our notice that the formulation and approach presented here can also be used to analyze many other problems in computational proteomics. Copyright © 2016 Elsevier Ltd. All rights reserved.
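The question posed in the abstract above (which Lys residues in a sequence are candidate sites) starts from enumerating every lysine together with its sequence context. The following is a minimal sketch of that first step only; it is not the pSuc-Lys method, and the window size, the padding character, and the function name `lysine_windows` are assumptions chosen for illustration.

```python
def lysine_windows(sequence, flank=3, pad="X"):
    """Return (position, window) pairs for every Lys (K) in a protein sequence.

    position is 1-based; each window holds `flank` residues on either side
    of the lysine and is padded with `pad` at the sequence termini.
    """
    seq = sequence.upper()
    padded = pad * flank + seq + pad * flank
    windows = []
    for i, residue in enumerate(seq):
        if residue == "K":
            # In the padded string the lysine sits at index i + flank,
            # so the window starts at index i.
            windows.append((i + 1, padded[i:i + 2 * flank + 1]))
    return windows


# Example: two lysines, at positions 2 and 6.
print(lysine_windows("MKTAYK", flank=2))
```

Each window could then be encoded as a feature vector and passed to a classifier; that part is left out here because the abstract does not give the encoding details.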

Chemical-genetic profile analysis in yeast suggests that a previously uncharacterized open reading frame, YBR261C, affects protein synthesis

PubMed Central

Alamgir, Md; Eroukova, Veronika; Jessulat, Matthew; Xu, Jianhua; Golshani, Ashkan

2008-01-01

Background Functional genomics has received considerable attention in the post-genomic era, as it aims to identify function(s) for different genes. One way to study gene function is to investigate the alterations in the responses of deletion mutants to different stimuli. Here we investigate the genetic profile of yeast non-essential gene deletion array (yGDA, ~4700 strains) for increased sensitivity to paromomycin, which targets the process of protein synthesis. Results As expected, our analysis indicated that the majority of deletion strains (134) with increased sensitivity to paromomycin, are involved in protein biosynthesis. The remaining strains can be divided into smaller functional categories: metabolism (45), cellular component biogenesis and organization (28), DNA maintenance (21), transport (20), others (38) and unknown (39). These may represent minor cellular target sites (side-effects) for paromomycin. They may also represent novel links to protein synthesis. One of these strains carries a deletion for a previously uncharacterized ORF, YBR261C, that we term TAE1 for Translation Associated Element 1. Our focused follow-up experiments indicated that deletion of TAE1 alters the ribosomal profile of the mutant cells. Also, gene deletion strain for TAE1 has defects in both translation efficiency and fidelity. Miniaturized synthetic genetic array analysis further indicates that TAE1 genetically interacts with 16 ribosomal protein genes. Phenotypic suppression analysis using TAE1 overexpression also links TAE1 to protein synthesis. Conclusion We show that a previously uncharacterized ORF, YBR261C, affects the process of protein synthesis and reaffirm that large-scale genetic profile analysis can be a useful tool to study novel gene function(s). PMID:19055778
Chemical-genetic profile analysis in yeast suggests that a previously uncharacterized open reading frame, YBR261C, affects protein synthesis.

PubMed

Alamgir, Md; Eroukova, Veronika; Jessulat, Matthew; Xu, Jianhua; Golshani, Ashkan

2008-12-03

Functional genomics has received considerable attention in the post-genomic era, as it aims to identify function(s) for different genes. One way to study gene function is to investigate the alterations in the responses of deletion mutants to different stimuli. Here we investigate the genetic profile of yeast non-essential gene deletion array (yGDA, approximately 4700 strains) for increased sensitivity to paromomycin, which targets the process of protein synthesis. As expected, our analysis indicated that the majority of deletion strains (134) with increased sensitivity to paromomycin, are involved in protein biosynthesis. The remaining strains can be divided into smaller functional categories: metabolism (45), cellular component biogenesis and organization (28), DNA maintenance (21), transport (20), others (38) and unknown (39). These may represent minor cellular target sites (side-effects) for paromomycin. They may also represent novel links to protein synthesis. One of these strains carries a deletion for a previously uncharacterized ORF, YBR261C, that we term TAE1 for Translation Associated Element 1. Our focused follow-up experiments indicated that deletion of TAE1 alters the ribosomal profile of the mutant cells. Also, gene deletion strain for TAE1 has defects in both translation efficiency and fidelity. Miniaturized synthetic genetic array analysis further indicates that TAE1 genetically interacts with 16 ribosomal protein genes. Phenotypic suppression analysis using TAE1 overexpression also links TAE1 to protein synthesis. We show that a previously uncharacterized ORF, YBR261C, affects the process of protein synthesis and reaffirm that large-scale genetic profile analysis can be a useful tool to study novel gene function(s).
The uncharacterized transcription factor YdhM is the regulator of the nemA gene, encoding N-ethylmaleimide reductase.

PubMed

Umezawa, Yoshimasa; Shimada, Tomohiro; Kori, Ayako; Yamada, Kayoko; Ishihama, Akira

2008-09-01

N-ethylmaleimide (NEM) has been used as a specific reagent of Cys modification in proteins and thus is toxic for cell growth. On the Escherichia coli genome, the nemA gene coding for NEM reductase is located downstream of the gene encoding an as-yet-uncharacterized transcription factor, YdhM. Disruption of the ydhM gene results in reduction of nemA expression even in the induced state, indicating that the two genes form a single operon. After in vitro genomic SELEX screening, one of the target recognition sequences for YdhM was identified within the promoter region for this ydhM-nemA operon. Both YdhM binding in vitro to the ydhM promoter region and transcription repression in vivo of the ydhM-nemA operon by YdhM were markedly reduced by the addition of NEM. Taken together, we propose that YdhM is the repressor for the nemA gene, thus hereafter designated NemR. The repressor function of NemR was inactivated by the addition of not only NEM but also other Cys modification reagents, implying that Cys modification of NemR renders it inactive. This is an addition to the mode of controlling activity of transcription factors by alkylation with chemical agents.
OST-HTH: a novel predicted RNA-binding domain

PubMed Central

2010-01-01

Background The mechanism by which the arthropod Oskar and vertebrate TDRD5/TDRD7 proteins nucleate or organize structurally related ribonucleoprotein (RNP) complexes, the polar granule and nuage, is poorly understood. Using sequence profile searches we identify a novel domain in these proteins that is widely conserved across eukaryotes and bacteria. Results Using contextual information from domain architectures, sequence-structure superpositions and available functional information we predict that this domain is likely to adopt the winged helix-turn-helix fold and bind RNA with a potential specificity for dsRNA. We show that in eukaryotes this domain is often combined in the same polypeptide with protein-protein- or lipid- interaction domains that might play a role in anchoring these proteins to specific cytoskeletal structures. Conclusions Thus, proteins with this domain might have a key role in the recognition and localization of dsRNA, including miRNAs, rasiRNAs and piRNAs hybridized to their targets. In other cases, this domain is fused to ubiquitin-binding, E3 ligase and ubiquitin-like domains indicating a previously under-appreciated role for ubiquitination in regulating the assembly and stability of nuage-like RNP complexes. Both bacteria and eukaryotes encode a conserved family of proteins that combines this predicted RNA-binding domain with a previously uncharacterized domain (DUF88). We present evidence that it is an RNAse belonging to the superfamily that includes the 5'->3' nucleases, PIN and NYN domains and might be recruited to degrade certain RNAs. Reviewers This article was reviewed by Sandor Pongor and Arcady Mushegian. PMID:20302647
The phosphatidylinositol transfer protein RdgBβ binds 14-3-3 via its unstructured C-terminus, whereas its lipid-binding domain interacts with the integral membrane protein ATRAP (angiotensin II type I receptor-associated protein).

PubMed

Garner, Kathryn; Li, Michelle; Ugwuanya, Natalie; Cockcroft, Shamshad

2011-10-01

PITPs [PI (phosphatidylinositol) transfer proteins] bind and transfer PI between intracellular membranes and participate in many cellular processes including signalling, lipid metabolism and membrane traffic. The largely uncharacterized PITP RdgBβ (PITPNC1; retinal degeneration type B β), contains a long C-terminal disordered region following its defining N-terminal PITP domain. In the present study we report that the C-terminus contains two tandem phosphorylated binding sites (Ser(274) and Ser(299)) for 14-3-3. The C-terminus also contains PEST sequences which are shielded by 14-3-3 binding. Like many proteins containing PEST sequences, the levels of RdgBβ are regulated by proteolysis. RdgBβ is degraded with a half-life of 4 h following ubiquitination via the proteasome. A mutant RdgBβ which is unable to bind 14-3-3 is degraded even faster with a half-life of 2 h. In vitro, RdgBβ is 100-fold less active than PITPα for PI transfer, and RdgBβ proteins (wild-type and a mutant that cannot bind 14-3-3) expressed in COS-7 cells or endogenous proteins from heart cytosol do not exhibit transfer activity. When cells are treated with PMA, the PITP domain of RdgBβ interacts with the integral membrane protein ATRAP (angiotensin II type I receptor-associated protein; also known as AGTRAP) causing membrane recruitment. We suggest that RdgBβ executes its function following recruitment to membranes via its PITP domain and the C-terminal end of the protein could regulate entry to the hydrophobic cavity.
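The two half-lives reported above (4 h for wild-type RdgBβ, 2 h for the mutant that cannot bind 14-3-3) can be compared with a short calculation. The sketch below assumes simple first-order (exponential) decay, which the abstract does not state; the function name `fraction_remaining` is hypothetical.

```python
def fraction_remaining(hours, half_life_hours):
    """Fraction of protein left after `hours`, assuming first-order decay."""
    if half_life_hours <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (hours / half_life_hours)


# After 4 h, under the first-order assumption:
wild_type = fraction_remaining(4, 4)   # half of the wild-type protein remains
mutant = fraction_remaining(4, 2)      # a quarter of the mutant remains
print(wild_type, mutant)
```

Under this assumption the mutant level falls to half that of the wild type within one wild-type half-life.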
Substrate prediction of Ixodes ricinus salivary lipocalins differentially expressed during Borrelia afzelii infection

NASA Astrophysics Data System (ADS)

Valdés, James J.; Cabezas-Cruz, Alejandro; Sima, Radek; Butterill, Philip T.; Růžek, Daniel; Nuttall, Patricia A.

2016-09-01

Evolution has provided ticks with an arsenal of bioactive saliva molecules that counteract host defense mechanisms. This salivary pharmacopoeia enables blood-feeding while enabling pathogen transmission. High-throughput sequencing of tick salivary glands has thus become a major focus, revealing large expansion within protein encoding gene families. Among these are lipocalins, ubiquitous barrel-shaped proteins that sequester small, typically hydrophobic molecules. This study was initiated by mining the Ixodes ricinus salivary gland transcriptome for specific, uncharacterized lipocalins: three were identified. Differential expression of these I. ricinus lipocalins during feeding at distinct developmental stages and in response to Borrelia afzelii infection suggests a role in transmission of this Lyme disease spirochete. A phylogenetic analysis using 803 sequences places the three I. ricinus lipocalins with tick lipocalins that sequester monoamines, leukotrienes and fatty acids. Both structural analysis and biophysical simulations generated robust predictions showing these I. ricinus lipocalins have the potential to bind monoamines similar to other tick species previously reported. The multidisciplinary approach employed in this study characterized unique lipocalins that play a role in tick blood-feeding and transmission of the most important tick-borne pathogen in North America and Eurasia.
Cloning, expression and characterization of a new ι-carrageenase from marine bacterium, Cellulophaga sp.

PubMed

Ma, Su; Tan, Yu-Long; Yu, Wen-Gong; Han, Feng

2013-10-01

The purpose of this study is to report an ι-carrageenase which degrades ι-carrageenan yielding neo-ι-carratetraose as the main product in the absence of NaCl. The gene for a new ι-carrageenase, CgiB_Ce, from Cellulophaga sp. QY3 was cloned and sequenced. It comprised an ORF of 1,386 bp encoding a protein of 461 amino acid residues. From its sequence analysis, CgiB_Ce is a new member of GH family 82 and shared the highest identity of 32% in amino acids with ι-carrageenase CgiA2 from Zobellia galactanovorans, indicating that it is a hitherto uncharacterized protein. The recombinant CgiB_Ce had maximum specific activity (1,870 U/mg) at 45 °C and pH 6.5. It was stable between pH 6.0-9.6 and below 40 °C. Although its activity was enhanced by NaCl, the enzyme was active in the absence of NaCl. CgiB_Ce is an endo-type ι-carrageenase that hydrolyzes β-1,4-linkages of ι-carrageenan, yielding neo-ι-carratetraose as the main product (more than 80% of the total product).
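The ORF length and protein length quoted above are consistent with each other if the 1,386 bp include the stop codon, which is an assumption here since the abstract does not say so. A small sketch of the arithmetic, with a hypothetical helper name:

```python
def protein_length_from_orf(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of `orf_bp` base pairs.

    Assumes the ORF is in frame (length divisible by 3) and, by default,
    that the final codon is a stop codon that encodes no residue.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# 1,386 bp -> 462 codons -> 461 residues plus one stop codon
print(protein_length_from_orf(1386))
```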
Analysis and functional classification of transcripts from the nematode Meloidogyne incognita

PubMed Central

McCarter, James P; Dautova Mitreva, Makedonka; Martin, John; Dante, Mike; Wylie, Todd; Rao, Uma; Pape, Deana; Bowers, Yvette; Theising, Brenda; Murphy, Claire V; Kloek, Andrew P; Chiapelli, Brandi J; Clifton, Sandra W; Bird, David Mck; Waterston, Robert H

2003-01-01

Background Plant parasitic nematodes are major pathogens of most crops. Molecular characterization of these species as well as the development of new techniques for control can benefit from genomic approaches. As an entrée to characterizing plant parasitic nematode genomes, we analyzed 5,700 expressed sequence tags (ESTs) from second-stage larvae (L2) of the root-knot nematode Meloidogyne incognita. Results From these, 1,625 EST clusters were formed and classified by function using the Gene Ontology (GO) hierarchy and the Kyoto KEGG database. L2 larvae, which represent the infective stage of the life cycle before plant invasion, express a diverse array of ligand-binding proteins and abundant cytoskeletal proteins. L2 are structurally similar to Caenorhabditis elegans dauer larva and the presence of transcripts encoding glyoxylate pathway enzymes in the M. incognita clusters suggests that root-knot nematode larvae metabolize lipid stores while in search of a host. Homology to other species was observed in 79% of translated cluster sequences, with the C. elegans genome providing more information than any other source. In addition to identifying putative nematode-specific and Tylenchida-specific genes, sequencing revealed previously uncharacterized horizontal gene transfer candidates in Meloidogyne with high identity to rhizobacterial genes including homologs of nodL acetyltransferase and novel cellulases. Conclusions With sequencing from plant parasitic nematodes accelerating, the approaches to transcript characterization described here can be applied to more extensive datasets and also provide a foundation for more complex genome analyses. PMID:12702207
Systematic discovery of novel eukaryotic transcriptional regulators using sequence homology independent prediction

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bossi, Flavia; Fan, Jue; Xiao, Jun

Here, the molecular function of a gene is most commonly inferred by sequence similarity. Therefore, genes that lack sufficient sequence similarity to characterized genes (such as certain classes of transcriptional regulators) are difficult to classify using most function prediction algorithms and have remained uncharacterized. As a result, to identify novel transcriptional regulators systematically, we used a feature-based pipeline to screen protein families of unknown function. This method predicted 43 transcriptional regulator families in Arabidopsis thaliana, 7 families in Drosophila melanogaster, and 9 families in Homo sapiens. Literature curation validated 12 of the predicted families to be involved in transcriptional regulation. We tested 33 out of the 195 Arabidopsis putative transcriptional regulators for their ability to activate transcription of a reporter gene in planta and found twelve coactivators, five of which had no prior literature support. To investigate mechanisms of action in which the predicted regulators might work, we looked for interactors of an Arabidopsis candidate that did not show transactivation activity in planta and found that it might work with other members of its own family and a subunit of the Polycomb Repressive Complex 2 to regulate transcription. Our results demonstrate the feasibility of assigning molecular function to proteins of unknown function without depending on sequence similarity. In particular, we identified novel transcriptional regulators using biological features enriched in transcription factors. The predictions reported here should accelerate the characterization of novel regulators.
Systematic discovery of novel eukaryotic transcriptional regulators using sequence homology independent prediction

DOE PAGES

Bossi, Flavia; Fan, Jue; Xiao, Jun; ...

2017-06-26

Here, the molecular function of a gene is most commonly inferred by sequence similarity. Therefore, genes that lack sufficient sequence similarity to characterized genes (such as certain classes of transcriptional regulators) are difficult to classify using most function prediction algorithms and have remained uncharacterized. As a result, to identify novel transcriptional regulators systematically, we used a feature-based pipeline to screen protein families of unknown function. This method predicted 43 transcriptional regulator families in Arabidopsis thaliana, 7 families in Drosophila melanogaster, and 9 families in Homo sapiens. Literature curation validated 12 of the predicted families to be involved in transcriptional regulation. We tested 33 out of the 195 Arabidopsis putative transcriptional regulators for their ability to activate transcription of a reporter gene in planta and found twelve coactivators, five of which had no prior literature support. To investigate mechanisms of action in which the predicted regulators might work, we looked for interactors of an Arabidopsis candidate that did not show transactivation activity in planta and found that it might work with other members of its own family and a subunit of the Polycomb Repressive Complex 2 to regulate transcription. Our results demonstrate the feasibility of assigning molecular function to proteins of unknown function without depending on sequence similarity. In particular, we identified novel transcriptional regulators using biological features enriched in transcription factors. The predictions reported here should accelerate the characterization of novel regulators.
Cross-Species Analyses Identify the BNIP-2 and Cdc42GAP Homology (BCH) Domain as a Distinct Functional Subclass of the CRAL_TRIO/Sec14 Superfamily

PubMed Central

Gupta, Anjali Bansal; Wee, Liang En; Zhou, Yi Ting; Hortsch, Michael; Low, Boon Chuan

2012-01-01

The CRAL_TRIO protein domain, which is unique to the Sec14 protein superfamily, binds to a diverse set of small lipophilic ligands. Similar domains are found in a range of different proteins including neurofibromatosis type-1, a Ras GTPase-activating Protein (RasGAP) and Rho guanine nucleotide exchange factors (RhoGEFs). Proteins containing this structural protein domain exhibit a low sequence similarity and ligand specificity while maintaining an overall characteristic three-dimensional structure. We have previously demonstrated that the BNIP-2 and Cdc42GAP Homology (BCH) protein domain, which shares a low sequence homology with the CRAL_TRIO domain, can serve as a regulatory scaffold that binds to Rho, RhoGEFs and RhoGAPs to control various cell signalling processes. In this work, we investigate 175 BCH domain-containing proteins from a wide range of different organisms. A phylogenetic analysis with ∼100 CRAL_TRIO and similar domains from eight representative species indicates a clear distinction of BCH-containing proteins as a novel subclass within the CRAL_TRIO/Sec14 superfamily. BCH-containing proteins contain a hallmark sequence motif R(R/K)h(R/K)(R/K)NL(R/K)xhhhhHPs (‘h’ is a large, hydrophobic residue and ‘s’ is a small, weakly polar residue) and can be further subdivided into three unique subtypes associated with BNIP-2-N, macro- and RhoGAP-type protein domains. A previously unknown group of genes encoding ‘BCH-only’ domains is also identified in plants and arthropod species. Based on an analysis of their gene-structure and their protein domain context we hypothesize that BCH domain-containing genes evolved through gene duplication, intron insertions and domain swapping events. Furthermore, we explore the point of divergence between BCH and CRAL_TRIO proteins in relation to their ability to bind small GTPases, GAPs and GEFs and lipid ligands. Our study suggests a need for a more extensive analysis of previously uncharacterized BCH, ‘BCH-like’ and CRAL_TRIO-containing proteins and their significance in regulating signaling events involving small GTPases. PMID:22479462
Predicting membrane protein types by the LLDA algorithm.

PubMed

Wang, Tong; Yang, Jie; Shen, Hong-Bin; Chou, Kuo-Chen

2008-01-01

Membrane proteins are generally classified into the following eight types: (1) type I transmembrane, (2) type II, (3) type III, (4) type IV, (5) multipass transmembrane, (6) lipid-chain-anchored membrane, (7) GPI-anchored membrane, and (8) peripheral membrane (K.C. Chou and H.B. Shen: BBRC, 2007, 360: 339-345). Knowing the type of an uncharacterized membrane protein often provides useful clues for finding its biological function and interaction process with other molecules in a biological system. With the explosion of protein sequences generated in the Post-Genomic Age, it is urgent to develop an automated method to deal with such a challenge. Recently, the PsePSSM (Pseudo Position-Specific Score Matrix) descriptor was proposed by Chou and Shen (Biochem. Biophys. Res. Comm. 2007, 360, 339-345) to represent a protein sample. The advantage of the PsePSSM descriptor is that it can combine evolutionary information and sequence-correlated information. However, incorporating all these effects into a descriptor may cause the "high dimension disaster". To overcome this problem, Chou and Shen adopted a fusion approach. Here, a completely different approach, the so-called LLDA (Local Linear Discriminant Analysis), is introduced to extract the key features from the high-dimensional PsePSSM space. The dimension-reduced descriptor vector thus obtained is a compact representation of the original high-dimensional vector. Our jackknife and independent dataset test results indicate that it is very promising to use the LLDA approach to cope with complicated problems in biological systems, such as predicting the membrane protein type.
iPhos-PseEvo: Identifying Human Phosphorylated Proteins by Incorporating Evolutionary Information into General PseAAC via Grey System Theory.

PubMed

Qiu, Wang-Ren; Sun, Bi-Qian; Xiao, Xuan; Xu, Dong; Chou, Kuo-Chen

2017-05-01

Protein phosphorylation plays a critical role in the human body by altering the structural conformation of a protein, causing it to become activated or deactivated, or by modifying its function. Given an uncharacterized protein sequence, can we predict whether or not it may be phosphorylated? This is no doubt a very meaningful problem for both basic research and drug development. Unfortunately, to the best of our knowledge, no high-throughput bioinformatics tool has so far been developed to address this basic but important problem, owing to its extreme complexity and the lack of sufficient training data. Here we propose a predictor called iPhos-PseEvo by (1) incorporating the protein sequence evolutionary information into the general pseudo amino acid composition (PseAAC) via the grey system theory, (2) balancing out the skewed training datasets by the asymmetric bootstrap approach, and (3) constructing an ensemble predictor by fusing an array of individual random forest classifiers through a voting system. Rigorous jackknife tests have indicated that very promising success rates have been achieved by iPhos-PseEvo even for such a difficult problem. A user-friendly web-server for iPhos-PseEvo has been established at http://www.jci-bioinfo.cn/iPhos-PseEvo, by which users can easily obtain their desired results without the need to go through the complicated mathematical equations involved. It has not escaped our notice that the formulation and approach presented here can be used to analyze many other problems in protein science as well. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Comparative transcriptomics of Entelegyne spiders (Araneae, Entelegynae), with emphasis on molecular evolution of orphan genes.

PubMed

Carlson, David E; Hedin, Marshal

2017-01-01

Next-generation sequencing technology is rapidly transforming the landscape of evolutionary biology, and has become a cost-effective and efficient means of collecting exome information for non-model organisms. Due to their taxonomic diversity, production of interesting venom and silk proteins, and the relative scarcity of existing genomic resources, spiders in particular are excellent targets for next-generation sequencing (NGS) methods. In this study, the transcriptomes of six entelegyne spider species from three genera (Cicurina travisae, C. vibora, Habronattus signatus, H. ustulatus, Nesticus bishopi, and N. cooperi) were sequenced and de novo assembled. Each assembly was assessed for quality and completeness and functionally annotated using gene ontology information. Approximately 100 transcripts with evidence of homology to venom proteins were discovered. After identifying more than 3,000 putatively orthologous genes across all six taxa, we used comparative analyses to identify 24 instances of positively selected genes. In addition, between ~ 550 and 1,100 unique orphan genes were found in each genus. These unique, uncharacterized genes exhibited elevated rates of amino acid substitution, potentially consistent with lineage-specific adaptive evolution. The data generated for this study represent a valuable resource for future phylogenetic and molecular evolutionary research, and our results provide new insight into the forces driving genome evolution in taxa that span the root of entelegyne spider phylogeny.
Discovery of phosphorylation motif mixtures in phosphoproteomics data

PubMed Central

Ritz, Anna; Shakhnarovich, Gregory; Salomon, Arthur R.; Raphael, Benjamin J.

2009-01-01

Motivation: Modification of proteins via phosphorylation is a primary mechanism for signal transduction in cells. Phosphorylation sites on proteins are determined in part through particular patterns, or motifs, present in the amino acid sequence. Results: We describe an algorithm that simultaneously discovers multiple motifs in a set of peptides that were phosphorylated by several different kinases. Such sets of peptides are routinely produced in proteomics experiments. Our motif-finding algorithm uses the principle of minimum description length to determine a mixture of sequence motifs that distinguish a foreground set of phosphopeptides from a background set of unphosphorylated peptides. We show that our algorithm outperforms existing motif-finding algorithms on synthetic datasets consisting of mixtures of known phosphorylation sites. We also derive a motif specificity score that quantifies whether or not the phosphoproteins containing an instance of a motif have a significant number of known interactions. Application of our motif-finding algorithm to recently published human and mouse proteomic studies recovers several known phosphorylation motifs and reveals a number of novel motifs that are enriched for interactions with a particular kinase or phosphatase. Our tools provide a new approach for uncovering the sequence specificities of uncharacterized kinases or phosphatases. Availability: Software is available at http://cs.brown.edu/people/braphael/software.html. Contact: aritz@cs.brown.edu; braphael@cs.brown.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:18996944
SACCHARIS: an automated pipeline to streamline discovery of carbohydrate active enzyme activities within polyspecific families and de novo sequence datasets.

PubMed

Jones, Darryl R; Thomas, Dallas; Alger, Nicholas; Ghavidel, Ata; Inglis, G Douglas; Abbott, D Wade

2018-01-01

Deposition of new genetic sequences in online databases is expanding at an unprecedented rate. As a result, sequence identification continues to outpace functional characterization of carbohydrate active enzymes (CAZymes). In this paradigm, the discovery of enzymes with novel functions is often hindered by high volumes of uncharacterized sequences, particularly when the enzyme sequence belongs to a family that exhibits diverse functional specificities (i.e., polyspecificity). Therefore, to direct sequence-based discovery and characterization of new enzyme activities we have developed an automated in silico pipeline entitled: Sequence Analysis and Clustering of CarboHydrate Active enzymes for Rapid Informed prediction of Specificity (SACCHARIS). This pipeline streamlines the selection of uncharacterized sequences for discovery of new CAZyme or CBM specificity from families currently maintained on the CAZy website or within user-defined datasets. SACCHARIS was used to generate a phylogenetic tree of GH43, a CAZyme family with defined subfamily designations. This analysis confirmed that large datasets can be organized into sequence clusters of manageable sizes that possess related functions. Seeding this tree with a GH43 sequence from Bacteroides dorei DSM 17855 (BdGH43b) revealed that it partitioned as a single sequence within the tree. This pattern was consistent with it possessing a unique enzyme activity for GH43, as BdGH43b is the first α-glucanase described for this family. The capacity of SACCHARIS to extract and cluster characterized carbohydrate binding module sequences was demonstrated using family 6 CBMs (i.e., CBM6s). This CBM family displays a polyspecific ligand binding profile and contains many structurally determined members. Using SACCHARIS to identify a cluster of divergent sequences, a CBM6 sequence from a unique clade was demonstrated to bind yeast mannan, which represents the first description of an α-mannan binding CBM. Additionally, we have performed a CAZome analysis of an in-house sequenced bacterial genome and a comparative analysis of B. thetaiotaomicron VPI-5482 and B. thetaiotaomicron 7330, to demonstrate that SACCHARIS can generate "CAZome fingerprints", which differentiate between the saccharolytic potential of two related strains in silico. Establishing sequence-function and sequence-structure relationships in polyspecific CAZyme families are promising approaches for streamlining enzyme discovery. SACCHARIS facilitates this process by embedding CAZyme and CBM family trees generated from biochemically to structurally characterized sequences, with protein sequences that have unknown functions. In addition, these trees can be integrated with user-defined datasets (e.g., genomics, metagenomics, and transcriptomics) to inform experimental characterization of new CAZymes or CBMs not currently curated, and for researchers to compare differential sequence patterns between entire CAZomes. In this light, SACCHARIS provides an in silico tool that can be tailored for enzyme bioprospecting in datasets of increasing complexity and for diverse applications in glycobiotechnology.
Computational Studies of Snake Venom Toxins

PubMed Central

Ojeda, Paola G.; Caballero, Julio; Kaas, Quentin; González, Wendy

2017-01-01

Most snake venom toxins are proteins, and participate in envenomation through a diverse array of bioactivities, such as bleeding, inflammation, and pain, as well as cytotoxic, cardiotoxic or neurotoxic effects. The venom of a single snake species contains hundreds of toxins, and the venoms of the 725 species of venomous snakes represent a large pool of potentially bioactive proteins. Despite considerable discovery efforts, most of the snake venom toxins are still uncharacterized. Modern bioinformatics tools have been recently developed to mine snake venoms, helping focus experimental research on the most potentially interesting toxins. Some computational techniques predict toxin molecular targets, and the binding mode to these targets. This review gives an overview of current knowledge on the ~2200 sequences, and more than 400 three-dimensional structures of snake toxins deposited in public repositories, as well as of molecular modeling studies of the interaction between these toxins and their molecular targets. We also describe how modern bioinformatics have been used to study the snake venom protein phospholipase A2, the small basic myotoxin Crotamine, and the three-finger peptide Mambalgin. PMID:29271884
A subclass of plant heat shock cognate 70 chaperones carries a motif that facilitates trafficking through plasmodesmata

PubMed Central

Aoki, Koh; Kragler, Friedrich; Xoconostle-Cázares, Beatriz; Lucas, William J.

2002-01-01

Plasmodesmata establish a pathway for the trafficking of non-cell-autonomously acting proteins and ribonucleoprotein complexes. Plasmodesmal enriched cell fractions and the contents of enucleate sieve elements, in the form of phloem sap, were used to isolate and characterize heat shock cognate 70 (Hsc70) chaperones associated with this cell-to-cell transport pathway. Three Cucurbita maxima Hsc70 chaperones were cloned and functional and sequence analysis led to the identification of a previously uncharacterized subclass of non-cell-autonomous chaperones. The highly conserved nature of the heat shock protein 70 (Hsp70) family, in conjunction with mutant analysis, permitted the characterization of a motif that allows these Hsc70 chaperones to engage the plasmodesmal non-cell-autonomous translocation machinery. Proof of concept that this motif is necessary for Hsp70 gain-of-movement function was obtained through the engineering of a human Hsp70 that acquired the capacity to traffic through plasmodesmata. These results are discussed in terms of the roles likely played by this subclass of Hsc70 chaperones in the trafficking of non-cell-autonomous proteins. PMID:12456884
Interaction between focal adhesion kinase and Crk-associated tyrosine kinase substrate p130Cas.

PubMed

Polte, T R; Hanks, S K

1995-11-07

The focal adhesion kinase (FAK) has been implicated in integrin-mediated signaling events and in the mechanism of cell transformation by the v-Src and v-Crk oncoproteins. To gain further insight into FAK signaling pathways, we used a two-hybrid screen to identify proteins that interact with mouse FAK. The screen identified two proteins that interact with FAK via their Src homology 3 (SH3) domains: a v-Crk-associated tyrosine kinase substrate (Cas), p130Cas, and a still uncharacterized protein, FIPSH3-2, which contains an SH3 domain closely related to that of p130Cas. These SH3 domains bind to the same proline-rich region of FAK (APPKPSR) encompassing residues 711-717. The mouse p130Cas amino acid sequence was deduced from cDNA clones, revealing an overall high degree of similarity to the recently reported rat sequence. Coimmunoprecipitation experiments confirmed that p130Cas and FAK are associated in mouse fibroblasts. The stable interaction between p130Cas and FAK emerges as a likely key element in integrin-mediated signal transduction and further represents a direct molecular link between the v-Src and v-Crk oncoproteins. The Src family kinase Fyn, whose Src homology 2 (SH2) domain binds to the major FAK autophosphorylation site (tyrosine 397), was also identified in the two-hybrid screen.
Combining functional genomics and chemical biology to identify targets of bioactive compounds.

PubMed

Ho, Cheuk Hei; Piotrowski, Jeff; Dixon, Scott J; Baryshnikova, Anastasia; Costanzo, Michael; Boone, Charles

2011-02-01

Genome sequencing projects have revealed thousands of suspected genes, challenging researchers to develop efficient large-scale functional analysis methodologies. Determining the function of a gene product generally requires a means to alter its function. Genetically tractable model organisms have been widely exploited for the isolation and characterization of activating and inactivating mutations in genes encoding proteins of interest. Chemical genetics represents a complementary approach involving the use of small molecules capable of either inactivating or activating their targets. Saccharomyces cerevisiae has been an important test bed for the development and application of chemical genomic assays aimed at identifying targets and modes of action of known and uncharacterized compounds. Here we review yeast chemical genomic assay strategies for drug target identification. Copyright © 2010 Elsevier Ltd. All rights reserved.

Functional validation of the oncogenic cooperativity and targeting potential of tuberous sclerosis mutation in medulloblastoma using a MYC-amplified model cell line.

PubMed

Henderson, Jacob J; Wagner, Jacob P; Hofmann, Nicolle E; Eide, Christopher A; Cho, Yoon-Jae; Druker, Brian J; Davare, Monika A

2017-10-01

Medulloblastoma is the most common malignant brain tumor of childhood. To identify targetable vulnerabilities, we employed inhibitor screening that revealed mTOR inhibitor hypersensitivity in the MYC-overexpressing medulloblastoma cell line, D341. Concomitant exome sequencing unveiled an uncharacterized missense mutation, TSC2 A415V, in these cells. We biochemically demonstrate that the TSC2 A415V mutation is functionally deleterious, leading to shortened half-life and proteasome-mediated protein degradation. These data suggest that MYC cooperates with activated kinase pathways, enabling pharmacologic intervention in these treatment refractory tumors. We propose that identification of activated kinase pathways may allow for tailoring targeted therapy to improve survival and treatment-related morbidity in medulloblastoma. © 2017 Wiley Periodicals, Inc.
The Chlamydomonas Genome Reveals the Evolution of Key Animal and Plant Functions

PubMed Central

Merchant, Sabeeha S.; Prochnik, Simon E.; Vallon, Olivier; Harris, Elizabeth H.; Karpowicz, Steven J.; Witman, George B.; Terry, Astrid; Salamov, Asaf; Fritz-Laylin, Lillian K.; Maréchal-Drouard, Laurence; Marshall, Wallace F.; Qu, Liang-Hu; Nelson, David R.; Sanderfoot, Anton A.; Spalding, Martin H.; Kapitonov, Vladimir V.; Ren, Qinghu; Ferris, Patrick; Lindquist, Erika; Shapiro, Harris; Lucas, Susan M.; Grimwood, Jane; Schmutz, Jeremy; Cardol, Pierre; Cerutti, Heriberto; Chanfreau, Guillaume; Chen, Chun-Long; Cognat, Valérie; Croft, Martin T.; Dent, Rachel; Dutcher, Susan; Fernández, Emilio; Ferris, Patrick; Fukuzawa, Hideya; González-Ballester, David; González-Halphen, Diego; Hallmann, Armin; Hanikenne, Marc; Hippler, Michael; Inwood, William; Jabbari, Kamel; Kalanon, Ming; Kuras, Richard; Lefebvre, Paul A.; Lemaire, Stéphane D.; Lobanov, Alexey V.; Lohr, Martin; Manuell, Andrea; Meier, Iris; Mets, Laurens; Mittag, Maria; Mittelmeier, Telsa; Moroney, James V.; Moseley, Jeffrey; Napoli, Carolyn; Nedelcu, Aurora M.; Niyogi, Krishna; Novoselov, Sergey V.; Paulsen, Ian T.; Pazour, Greg; Purton, Saul; Ral, Jean-Philippe; Riaño-Pachón, Diego Mauricio; Riekhof, Wayne; Rymarquis, Linda; Schroda, Michael; Stern, David; Umen, James; Willows, Robert; Wilson, Nedra; Zimmer, Sara Lana; Allmer, Jens; Balk, Janneke; Bisova, Katerina; Chen, Chong-Jian; Elias, Marek; Gendler, Karla; Hauser, Charles; Lamb, Mary Rose; Ledford, Heidi; Long, Joanne C.; Minagawa, Jun; Page, M. Dudley; Pan, Junmin; Pootakham, Wirulda; Roje, Sanja; Rose, Annkatrin; Stahlberg, Eric; Terauchi, Aimee M.; Yang, Pinfen; Ball, Steven; Bowler, Chris; Dieckmann, Carol L.; Gladyshev, Vadim N.; Green, Pamela; Jorgensen, Richard; Mayfield, Stephen; Mueller-Roeber, Bernd; Rajamani, Sathish; Sayre, Richard T.; Brokstein, Peter; Dubchak, Inna; Goodstein, David; Hornick, Leila; Huang, Y. Wayne; Jhaveri, Jinal; Luo, Yigong; Martínez, Diego; Ngau, Wing Chi Abby; Otillar, Bobby; Poliakov, Alexander; Porter, Aaron; Szajkowski, Lukasz; Werner, Gregory; Zhou, Kemin; Grigoriev, Igor V.; Rokhsar, Daniel S.; Grossman, Arthur R.

2010-01-01

Chlamydomonas reinhardtii is a unicellular green alga whose lineage diverged from land plants over 1 billion years ago. It is a model system for studying chloroplast-based photosynthesis, as well as the structure, assembly, and function of eukaryotic flagella (cilia), which were inherited from the common ancestor of plants and animals, but lost in land plants. We sequenced the ∼120-megabase nuclear genome of Chlamydomonas and performed comparative phylogenomic analyses, identifying genes encoding uncharacterized proteins that are likely associated with the function and biogenesis of chloroplasts or eukaryotic flagella. Analyses of the Chlamydomonas genome advance our understanding of the ancestral eukaryotic cell, reveal previously unknown genes associated with photosynthetic and flagellar functions, and establish links between ciliopathy and the composition and function of flagella. PMID:17932292
Functional annotation by sequence-weighted structure alignments: statistical analysis and case studies from the Protein 3000 structural genomics project in Japan.

PubMed

Standley, Daron M; Toh, Hiroyuki; Nakamura, Haruki

2008-09-01

A method to functionally annotate structural genomics targets, based on a novel structural alignment scoring function, is proposed. In the proposed score, position-specific scoring matrices are used to weight structurally aligned residue pairs to highlight evolutionarily conserved motifs. The functional form of the score is first optimized for discriminating domains belonging to the same Pfam family from domains belonging to different families but the same CATH or SCOP superfamily. In the optimization stage, we consider four standard weighting functions as well as our own, the "maximum substitution probability," and combinations of these functions. The optimized score achieves an area of 0.87 under the receiver-operating characteristic curve with respect to identifying Pfam families within a sequence-unique benchmark set of domain pairs. Confidence measures are then derived from the benchmark distribution of true-positive scores. The alignment method is next applied to the task of functionally annotating 230 query proteins released to the public as part of the Protein 3000 structural genomics project in Japan. Of these queries, 78 were found to align to templates with the same Pfam family as the query or had sequence identities ≥ 30%. Another 49 queries were found to match more distantly related templates. Within this group, the template predicted by our method to be the closest functional relative was often not the most structurally similar. Several nontrivial cases are discussed in detail. Finally, 103 queries matched templates at the fold level, but not the family or superfamily level, and remain functionally uncharacterized. © 2008 Wiley-Liss, Inc.
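As a rough illustration of the evaluation metric quoted in this abstract, the area under the receiver-operating characteristic curve can be computed as the probability that a randomly chosen true pair outscores a randomly chosen false pair. The sketch below is a minimal pure-Python version under that definition; the function name `roc_auc` and the toy scores and labels are hypothetical and are not taken from the paper or its benchmark.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the
    positive item has the higher score; ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical alignment scores for six domain pairs; label 1 means
# "same Pfam family", label 0 means "different family".
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
toy_labels = [1, 1, 0, 1, 0, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
print(f"toy AUC = {toy_auc:.3f}")
```

The pairwise form is quadratic in the number of examples, which is acceptable for a small illustration; a rank-based implementation would be the usual choice for a large benchmark.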
Description of Drinking Water Bacterial Communities Using 16S rRNA Gene Sequence Analyses

EPA Science Inventory

Descriptions of bacterial communities inhabiting water distribution systems (WDS) have mainly been accomplished using culture-based approaches. Due to the inherent selective nature of culture-based approaches, the majority of bacteria inhabiting WDS remain uncharacterized. The go...
Structural snapshots of Xer recombination reveal activation by synaptic complex remodeling and DNA bending

PubMed Central

Bebel, Aleksandra; Karaca, Ezgi; Kumar, Banushree; Stark, W Marshall; Barabas, Orsolya

2016-01-01

Bacterial Xer site-specific recombinases play an essential genome maintenance role by unlinking chromosome multimers, but their mechanism of action has remained structurally uncharacterized. Here, we present two high-resolution structures of Helicobacter pylori XerH with its recombination site DNA difH, representing pre-cleavage and post-cleavage synaptic intermediates in the recombination pathway. The structures reveal that activation of DNA strand cleavage and rejoining involves large conformational changes and DNA bending, suggesting how interaction with the cell division protein FtsK may license recombination at the septum. Together with biochemical and in vivo analysis, our structures also reveal how a small sequence asymmetry in difH defines protein conformation in the synaptic complex and orchestrates the order of DNA strand exchanges. Our results provide insights into the catalytic mechanism of Xer recombination and a model for regulation of recombination activity during cell division. DOI: http://dx.doi.org/10.7554/eLife.19706.001 PMID:28009253
Niakha virus: a novel member of the family Rhabdoviridae isolated from phlebotomine sandflies in Senegal.

PubMed

Vasilakis, Nikos; Widen, Steven; Mayer, Sandra V; Seymour, Robert; Wood, Thomas G; Popov, Vsevolov; Guzman, Hilda; Travassos da Rosa, Amelia P A; Ghedin, Elodie; Holmes, Edward C; Walker, Peter J; Tesh, Robert B

2013-09-01

Members of the family Rhabdoviridae have been assigned to eight genera but many remain unassigned. Rhabdoviruses have a remarkably diverse host range that includes terrestrial and marine animals, invertebrates and plants. Transmission of some rhabdoviruses often requires an arthropod vector, such as mosquitoes, midges, sandflies, ticks, aphids and leafhoppers, in which they replicate. Herein we characterize Niakha virus (NIAV), a previously uncharacterized rhabdovirus isolated from phlebotomine sandflies in Senegal. Analysis of the 11,124 nt genome sequence indicates that it encodes the five common rhabdovirus proteins with alternative ORFs in the M, G and L genes. Phylogenetic analysis of the L protein indicates that NIAV's closest relative is Oak Vale rhabdovirus, although in this analysis NIAV is still so phylogenetically distinct that it might be classified as distinct from the eight currently recognized Rhabdoviridae genera. This observation highlights the vast, and not yet fully recognized, diversity of this family. Copyright © 2013 Elsevier Inc. All rights reserved.
Genome-wide transcriptional analysis of flagellar regeneration in Chlamydomonas reinhardtii identifies orthologs of ciliary disease genes

NASA Technical Reports Server (NTRS)

Stolc, Viktor; Samanta, Manoj Pratim; Tongprasit, Waraporn; Marshall, Wallace F.

2005-01-01

The important role that cilia and flagella play in human disease creates an urgent need to identify genes involved in ciliary assembly and function. The strong and specific induction of flagellar-coding genes during flagellar regeneration in Chlamydomonas reinhardtii suggests that transcriptional profiling of such cells would reveal new flagella-related genes. We have conducted a genome-wide analysis of RNA transcript levels during flagellar regeneration in Chlamydomonas by using maskless photolithography method-produced DNA oligonucleotide microarrays with unique probe sequences for all exons of the 19,803 predicted genes. This analysis represents a previously uncharacterized whole-genome transcriptional activity profiling study in this important model organism. Analysis of strongly induced genes reveals a large set of known flagellar components and also identifies a number of important disease-related proteins as being involved with cilia and flagella, including the zebrafish polycystic kidney genes Qilin, Reptin, and Pontin, as well as the testis-expressed tubby-like protein TULP2.
Genome-wide analysis of gene expression and protein secretion of Babesia canis during virulent infection identifies potential pathogenicity factors.

PubMed

Eichenberger, Ramon M; Ramakrishnan, Chandra; Russo, Giancarlo; Deplazes, Peter; Hehl, Adrian B

2017-06-13

Infections of dogs with virulent strains of Babesia canis are characterized by rapid onset and high mortality, comparable to complicated human malaria. As in other apicomplexan parasites, most Babesia virulence factors responsible for survival and pathogenicity are secreted to the host cell surface and beyond where they remodel and biochemically modify the infected cell interacting with host proteins in a very specific manner. Here, we investigated factors secreted by B. canis during acute infections in dogs and report on in silico predictions and experimental analysis of the parasite's exportome. As a backdrop, we generated a fully annotated B. canis genome sequence of a virulent Hungarian field isolate (strain BcH-CHIPZ) underpinned by extensive genome-wide RNA-seq analysis. We find evidence for conserved factors in apicomplexan hemoparasites involved in immune-evasion (e.g. VESA-protein family), proteins secreted across the iRBC membrane into the host bloodstream (e.g. SA- and Bc28 protein families), potential moonlighting proteins (e.g. profilin and histones), and uncharacterized antigens present during acute crisis in dogs. The combined data provides a first predicted and partially validated set of potential virulence factors exported during fatal infections, which can be exploited for urgently needed innovative intervention strategies aimed at facilitating diagnosis and management of canine babesiosis.
Expanded microbial genome coverage and improved protein family annotation in the COG database

PubMed Central

Galperin, Michael Y.; Makarova, Kira S.; Wolf, Yuri I.; Koonin, Eugene V.

2015-01-01

Microbial genome sequencing projects produce numerous sequences of deduced proteins, only a small fraction of which have been or will ever be studied experimentally. This leaves sequence analysis as the only feasible way to annotate these proteins and assign to them tentative functions. The Clusters of Orthologous Groups of proteins (COGs) database (http://www.ncbi.nlm.nih.gov/COG/), first created in 1997, has been a popular tool for functional annotation. Its success was largely based on (i) its reliance on complete microbial genomes, which allowed reliable assignment of orthologs and paralogs for most genes; (ii) orthology-based approach, which used the function(s) of the characterized member(s) of the protein family (COG) to assign function(s) to the entire set of carefully identified orthologs and describe the range of potential functions when there were more than one; and (iii) careful manual curation of the annotation of the COGs, aimed at detailed prediction of the biological function(s) for each COG while avoiding annotation errors and overprediction. Here we present an update of the COGs, the first since 2003, and a comprehensive revision of the COG annotations and expansion of the genome coverage to include representative complete genomes from all bacterial and archaeal lineages down to the genus level. This re-analysis of the COGs shows that the original COG assignments had an error rate below 0.5% and allows an assessment of the progress in functional genomics in the past 12 years. During this time, functions of many previously uncharacterized COGs have been elucidated and tentative functional assignments of many COGs have been validated, either by targeted experiments or through the use of high-throughput methods. A particularly important development is the assignment of functions to several widespread, conserved proteins many of which turned out to participate in translation, in particular rRNA maturation and tRNA modification. 
The new version of the COGs is expected to become an important tool for microbial genomics. PMID:25428365
A set of GFP organelle marker lines for intracellular localization studies in Medicago truncatula

USDA-ARS's Scientific Manuscript database

Genomics advances in the model legume Medicago truncatula have led to an increase in the number of identified genes encoding proteins with unknown biological function. Determining the intracellular location of uncharacterized proteins often aids in the elucidation of biological function. To expedite...
Genome-Wide Protein Interaction Screens Reveal Functional Networks Involving Sm-Like Proteins

PubMed Central

Fromont-Racine, Micheline; Mayes, Andrew E.; Brunet-Simon, Adeline; Rain, Jean-Christophe; Colley, Alan; Dix, Ian; Decourty, Laurence; Joly, Nicolas; Ricard, Florence; Beggs, Jean D.

2000-01-01

A set of seven structurally related Sm proteins forms the core of the snRNP particles containing the spliceosomal U1, U2, U4 and U5 snRNAs. A search of the genomic sequence of Saccharomyces cerevisiae has identified a number of open reading frames that potentially encode structurally similar proteins termed Lsm (Like Sm) proteins. With the aim of analysing all possible interactions between the Lsm proteins and any protein encoded in the yeast genome, we performed exhaustive and iterative genomic two-hybrid screens, starting with the Lsm proteins as baits. Indeed, extensive interactions amongst eight Lsm proteins were found that suggest the existence of a Lsm complex or complexes. These Lsm interactions apparently involve the conserved Sm domain that also mediates interactions between the Sm proteins. The screens also reveal functionally significant interactions with splicing factors, in particular with Prp4 and Prp24, compatible with genetic studies and with the reported association of Lsm proteins with spliceosomal U6 and U4/U6 particles. In addition, interactions with proteins involved in mRNA turnover, such as Mrt1, Dcp1, Dcp2 and Xrn1, point to roles for Lsm complexes in distinct RNA metabolic processes, that are confirmed in independent functional studies. These results provide compelling evidence that two-hybrid screens yield functionally meaningful information about protein–protein interactions and can suggest functions for uncharacterized proteins, especially when they are performed on a genome-wide scale. PMID:10900456
Fungal Genes in Context: Genome Architecture Reflects Regulatory Complexity and Function

PubMed Central

Noble, Luke M.; Andrianopoulos, Alex

2013-01-01

Gene context determines gene expression, with local chromosomal environment most influential. Comparative genomic analysis is often limited in scope to conserved or divergent gene and protein families, and fungi are well suited to this approach with low functional redundancy and relatively streamlined genomes. We show here that one aspect of gene context, the amount of potential upstream regulatory sequence maintained through evolution, is highly predictive of both molecular function and biological process in diverse fungi. Orthologs with large upstream intergenic regions (UIRs) are strongly enriched in information processing functions, such as signal transduction and sequence-specific DNA binding, and, in the genus Aspergillus, include the majority of experimentally studied, high-level developmental and metabolic transcriptional regulators. Many uncharacterized genes are also present in this class and, by implication, may be of similar importance. Large intergenic regions also share two novel sequence characteristics, currently of unknown significance: they are enriched for plus-strand polypyrimidine tracts and an information-rich, putative regulatory motif that was present in the last common ancestor of the Pezizomycotina. Systematic consideration of gene UIR in comparative genomics, particularly for poorly characterized species, could help reveal organisms’ regulatory priorities. PMID:23699226
DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhang, Yanfeng; Zheng, Yi; Qin, Ling

Beta-hydroxyacid dehydrogenase (β-HAD) genes have been identified in all sequenced genomes of eukaryotes and prokaryotes. Their gene products catalyze the NAD+- or NADP+-dependent oxidation of various β-hydroxy acid substrates into their corresponding semialdehyde. In many fungal and bacterial genomes, multiple β-HAD genes are observed leading to the hypothesis that these gene products may have unique, uncharacterized metabolic roles specific to their species. The genomes of Geobacter sulfurreducens and Geobacter metallireducens each contain two potential β-HAD genes. The protein sequences of one pair of these genes, Gs-βHAD (Q74DE4) and Gm-βHAD (Q39R98), have 65% sequence identity and 77% sequence similarity with each other. Both proteins reduce succinic semialdehyde, a metabolite of the GABA shunt. To further explore the structural and functional characteristics of these two β-HADs with a potentially unique substrate specificity, crystal structures for Gs-βHAD and Gm-βHAD in complex with NADP+ were determined to a resolution of 1.89 Å and 2.07 Å, respectively. The structures of both proteins are similar, composed of 14 α-helices and nine β-strands organized into two domains. Domain One (1-165) adopts a typical Rossmann fold composed of two α/β units: a six-strand parallel β-sheet surrounded by six α-helices (α1 – α6) followed by a mixed three-strand β-sheet surrounded by two α-helices (α7 and α8). Domain Two (166-287) is composed of a bundle of seven α-helices (α9 – α14). Four functional regions conserved in all β-HADs are spatially located near each other at the interdomain cleft in both Gs-βHAD and Gm-βHAD with a buried molecule of NADP+. The structural features of Gs-βHAD and Gm-βHAD are described in relation to the four conserved consensus sequences characteristic of β-HADs and the potential biochemical importance of these enzymes as an alternative pathway for the degradation of succinic semialdehyde.
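A pairwise sequence identity figure such as the 65% quoted in this abstract is typically obtained by counting identical residues over the columns of a pairwise alignment. The following is a minimal sketch of that calculation, assuming the two sequences are already aligned and that gap-containing columns are excluded from the denominator (conventions differ between tools). The function name `percent_identity` and the two short toy sequences are hypothetical; they are not the Gs-βHAD or Gm-βHAD sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap character.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != gap and y != gap]
    if not columns:
        raise ValueError("alignment has no ungapped columns")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment: 11 columns, 2 of which contain a gap.
toy_a = "MKT-AYIAKQR"
toy_b = "MKSLAYLAK-R"
toy_identity = percent_identity(toy_a, toy_b)
print(f"toy identity = {toy_identity:.1f}%")
```

Sequence similarity (the 77% figure) additionally counts conservative substitutions according to a scoring matrix, so it needs a substitution table that this sketch deliberately leaves out.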
The Starch Granule-Associated Protein EARLY STARVATION1 Is Required for the Control of Starch Degradation in Arabidopsis thaliana Leaves

PubMed Central

Feike, Doreen; Seung, David; Graf, Alexander; Bischof, Sylvain; Ellick, Tamaryn; Coiro, Mario; Soyk, Sebastian; Eicke, Simona; Mettler-Altmann, Tabea; Lu, Kuan Jen; Trick, Martin; Zeeman, Samuel C.

2016-01-01

To uncover components of the mechanism that adjusts the rate of leaf starch degradation to the length of the night, we devised a screen for mutant Arabidopsis thaliana plants in which starch reserves are prematurely exhausted. The mutation in one such mutant, named early starvation1 (esv1), eliminates a previously uncharacterized protein. Starch in mutant leaves is degraded rapidly and in a nonlinear fashion, so that reserves are exhausted 2 h prior to dawn. The ESV1 protein and a similar uncharacterized Arabidopsis protein (named Like ESV1 [LESV]) are located in the chloroplast stroma and are also bound into starch granules. The region of highest similarity between the two proteins contains a series of near-repeated motifs rich in tryptophan. Both proteins are conserved throughout starch-synthesizing organisms, from angiosperms and monocots to green algae. Analysis of transgenic plants lacking or overexpressing ESV1 or LESV, and of double mutants lacking ESV1 and another protein necessary for starch degradation, leads us to propose that these proteins function in the organization of the starch granule matrix. We argue that their misexpression affects starch degradation indirectly, by altering matrix organization and, thus, accessibility of starch polymers to starch-degrading enzymes. PMID:27207856
Molecular characterization and immunolocalization of the olfactory co-receptor Orco from two blood-feeding muscid flies, the stable fly (Stomoxys calcitrans, L.) and the horn fly (Haematobia irritans irritans, L.)

PubMed Central

Olafson, Pia Untalan

2012-01-01

Biting flies are economically important, blood-feeding pests of medical and veterinary significance. Chemosensory-based biting fly behaviors, such as host/nutrient source localization and ovipositional site selection, are intriguing targets for the development of supplemental control strategies. In an effort to expand our understanding of biting fly chemosensory pathways, transcripts encoding the highly conserved insect odorant co-receptor (Orco) were isolated from two representative biting fly species, the stable fly (Scal\Orco) and the horn fly (Hirr\Orco). Orco forms a complex with an odor-specific odorant receptor to form an odor-gated ion channel. The biting fly transcripts were predicted to encode proteins with 87% – 94% amino acid similarity to published insect Orco sequences and were detected in various immature stages as well as in adult structures associated with olfaction, i.e. antennae and maxillary palps, and gustation, i.e. proboscis. Further, the relevant proteins were immunolocalized to specific antennal sensilla using anti-serum raised against a peptide sequence conserved between the two fly species. Results from this study provide a basis for functional evaluation of repellent/attractant effects on as yet uncharacterized stable fly and horn fly conventional odorant receptors. PMID:23278866
Genetic and biological variation among nucleopolyhedrovirus isolates from Spodoptera frugiperda (Lepidoptera: Noctuidae)

USDA-ARS's Scientific Manuscript database

A PCR-based method was used to identify and distinguish among 40 uncharacterized nucleopolyhedrovirus (NPV) isolates from the moth Spodoptera frugiperda that were part of an insect virus collection. Phylogenetic analysis was carried out with sequences amplified from two strongly conserved loci (pol...
Global functional atlas of Escherichia coli encompassing previously uncharacterized proteins.

PubMed

Hu, Pingzhao; Janga, Sarath Chandra; Babu, Mohan; Díaz-Mejía, J Javier; Butland, Gareth; Yang, Wenhong; Pogoutse, Oxana; Guo, Xinghua; Phanse, Sadhna; Wong, Peter; Chandran, Shamanta; Christopoulos, Constantine; Nazarians-Armavil, Anaies; Nasseri, Negin Karimi; Musso, Gabriel; Ali, Mehrab; Nazemof, Nazila; Eroukova, Veronika; Golshani, Ashkan; Paccanaro, Alberto; Greenblatt, Jack F; Moreno-Hagelsieb, Gabriel; Emili, Andrew

2009-04-28

One-third of the 4,225 protein-coding genes of Escherichia coli K-12 remain functionally unannotated (orphans). Many map to distant clades such as Archaea, suggesting involvement in basic prokaryotic traits, whereas others appear restricted to E. coli, including pathogenic strains. To elucidate the orphans' biological roles, we performed an extensive proteomic survey using affinity-tagged E. coli strains and generated comprehensive genomic context inferences to derive a high-confidence compendium for virtually the entire proteome consisting of 5,993 putative physical interactions and 74,776 putative functional associations, most of which are novel. Clustering of the respective probabilistic networks revealed putative orphan membership in discrete multiprotein complexes and functional modules together with annotated gene products, whereas a machine-learning strategy based on network integration implicated the orphans in specific biological processes. We provide additional experimental evidence supporting orphan participation in protein synthesis, amino acid metabolism, biofilm formation, motility, and assembly of the bacterial cell envelope. This resource provides a "systems-wide" functional blueprint of a model microbe, with insights into the biological and evolutionary significance of previously uncharacterized proteins.
Identification of a second flagellin gene and functional characterization of a sigma70-like promoter upstream of a Leptospira borgpetersenii flaB gene.

PubMed

Lin, Min; Dan, Hanhong; Li, Yijing

2004-02-01

Leptospira borgpetersenii, one of the causative agents of leptospirosis in both animals and humans, is a bacterial pathogen with characteristic motility that is mediated by the rotation of two periplasmic flagella (PF). The flaB gene coding for a core polypeptide subunit of PF was previously characterized by sequence analysis of its open reading frame (ORF) (M. Lin, J Biochem Mol Biol Biophys 2:181-187, 1999). The present study was undertaken to isolate and clone the uncharacterized sequence upstream of the flaB gene by using a PCR-based genome walking procedure. This has resulted in a 1470-bp genomic DNA sequence in which an 846-bp ORF coding for a 281-amino acid polypeptide (31.3 kDa) is identified 455 bp upstream from the flaB start codon. The encoded protein exhibits 72% amino acid identity to the deduced FlaB protein sequence of L. borgpetersenii and a high degree of sequence homology to the FlaB proteins of other spirochaetes. This has demonstrated for the first time that a second flaB gene homolog is present in a Leptospira species. The newly identified gene is designated flaB1, and the previously cloned flaB renamed flaB2. Within the intergenic sequence between flaB1 and flaB2, a potential stem-loop structure (12-bp inverted repeats) was identified 25 bp downstream of the flaB1 stop codon; this could serve as a transcription terminator for the flaB1 mRNA. Three E. coli-like promoter regions (I, II, and III) for binding Esigma(70), a regulatory sequence uncommonly found in flagellar genes, were predicted upstream of the flaB2 ORF. Only promoter region II contains a promoter that is functional in E. coli, as revealed at phenotypic and transcriptional levels by its capability of directing the expression of the chloramphenicol acetyltransferase (CAT) gene in the promoter probe vector pKK232-8. These observations may suggest that flaB1 and flaB2 are transcribed separately and do not form a transcriptional operon controlled by a single promoter.
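The ORF figures quoted in this abstract (an 846-bp ORF encoding a 281-amino acid polypeptide) are consistent if the ORF length includes the stop codon. The following minimal sketch checks that arithmetic and shows a simple forward-strand ORF scan on a toy sequence. The function names and the toy sequence are illustrative assumptions, not part of the cited study.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def protein_length_from_orf(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of the given length in bp."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # The stop codon occupies one codon but encodes no residue.
    return codons - 1 if includes_stop else codons


def find_forward_orfs(dna, min_aa=1):
    """Return (start, end, aa_length) for ATG..stop ORFs in the three
    forward reading frames. 'end' is exclusive and includes the stop codon."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                aa_len = (i - start) // 3
                if aa_len >= min_aa:
                    orfs.append((start, i + 3, aa_len))
                start = None
    return orfs


if __name__ == "__main__":
    print(protein_length_from_orf(846))           # figures from the abstract
    print(find_forward_orfs("CCATGAAATTTTAGGG"))  # toy sequence, not real data
```

A real analysis would also scan the reverse complement and handle alternative start codons; both are omitted here for brevity.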
Using the structure-function linkage database to characterize functional domains in enzymes.

PubMed

Brown, Shoshana; Babbitt, Patricia

2014-12-12

The Structure-Function Linkage Database (SFLD; http://sfld.rbvi.ucsf.edu/) is a Web-accessible database designed to link enzyme sequence, structure, and functional information. This unit describes the protocols by which a user may query the database to predict the function of uncharacterized enzymes and to correct misannotated functional assignments. The information in this unit is especially useful in helping a user discriminate functional capabilities of a sequence that is only distantly related to characterized sequences in publicly available databases. Copyright © 2014 John Wiley & Sons, Inc.
Kolente virus, a rhabdovirus species isolated from ticks and bats in the Republic of Guinea.

PubMed

Ghedin, Elodie; Rogers, Matthew B; Widen, Steven G; Guzman, Hilda; Travassos da Rosa, Amelia P A; Wood, Thomas G; Fitch, Adam; Popov, Vsevolod; Holmes, Edward C; Walker, Peter J; Vasilakis, Nikos; Tesh, Robert B

2013-12-01

Kolente virus (KOLEV) is a rhabdovirus originally isolated from ticks and a bat in Guinea, West Africa, in 1985. Although tests at the time of isolation suggested that KOLEV is a novel rhabdovirus, it has remained largely uncharacterized. We assembled the complete genome sequence of the prototype strain DakAr K7292, which was found to encode the five canonical rhabdovirus structural proteins (N, P, M, G and L) with alternative ORFs (>180 nt) in the P and L genes. Serologically, KOLEV exhibited a weak antigenic relationship with Barur and Fukuoka viruses in the Kern Canyon group. Phylogenetic analysis revealed that KOLEV represents a distinct and divergent lineage that shows no clear relationship to any rhabdovirus except Oita virus, although with limited phylogenetic resolution. In summary, KOLEV represents a novel species in the family Rhabdoviridae.

Mutations in C4orf26, encoding a peptide with in vitro hydroxyapatite crystal nucleation and growth activity, cause amelogenesis imperfecta.

PubMed

Parry, David A; Brookes, Steven J; Logan, Clare V; Poulter, James A; El-Sayed, Walid; Al-Bahlani, Suhaila; Al Harasi, Sharifa; Sayed, Jihad; Raïf, El Mostafa; Shore, Roger C; Dashash, Mayssoon; Barron, Martin; Morgan, Joanne E; Carr, Ian M; Taylor, Graham R; Johnson, Colin A; Aldred, Michael J; Dixon, Michael J; Wright, J Tim; Kirkham, Jennifer; Inglehearn, Chris F; Mighell, Alan J

2012-09-07

Autozygosity mapping and clonal sequencing of an Omani family identified mutations in the uncharacterized gene, C4orf26, as a cause of recessive hypomineralized amelogenesis imperfecta (AI), a disease in which the formation of tooth enamel fails. Screening of a panel of 57 autosomal-recessive AI-affected families identified eight further families with loss-of-function mutations in C4orf26. C4orf26 encodes a putative extracellular matrix acidic phosphoprotein expressed in the enamel organ. A mineral nucleation assay showed that the protein's phosphorylated C terminus has the capacity to promote nucleation of hydroxyapatite, suggesting a possible function in enamel mineralization during amelogenesis. Copyright © 2012 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
iNR-PhysChem: A Sequence-Based Predictor for Identifying Nuclear Receptors and Their Subfamilies via Physical-Chemical Property Matrix

PubMed Central

Xiao, Xuan; Wang, Pu; Chou, Kuo-Chen

2012-01-01

Nuclear receptors (NRs) form a family of ligand-activated transcription factors that regulate a wide variety of biological processes, such as homeostasis, reproduction, development, and metabolism. The human genome contains 48 genes encoding NRs. These receptors have become one of the most important targets for therapeutic drug development. According to their different action mechanisms or functions, NRs have been classified into seven subfamilies. With the avalanche of protein sequences generated in the postgenomic age, we are facing the following challenging problems. Given an uncharacterized protein sequence, how can we identify whether it is a nuclear receptor? If it is, to which subfamily does it belong? To address these problems, we developed a predictor called iNR-PhysChem in which the protein samples were expressed by a novel mode of pseudo amino acid composition (PseAAC) whose components were derived from a physical-chemical matrix via a series of auto-covariance and cross-covariance transformations. It was observed that the overall success rate achieved by iNR-PhysChem was over 98% in identifying NRs or non-NRs, and over 92% in identifying NRs among the following seven subfamilies: NR1 (thyroid hormone like), NR2 (HNF4-like), NR3 (estrogen like), NR4 (nerve growth factor IB-like), NR5 (fushi tarazu-F1 like), NR6 (germ cell nuclear factor like), and NR0 (knirps like). These rates were derived by the jackknife tests on a stringent benchmark dataset in which none of the protein sequences included has pairwise sequence identity to any other in a same subset. As a user-friendly web-server, iNR-PhysChem is freely accessible to the public at either http://www.jci-bioinfo.cn/iNR-PhysChem or http://icpr.jci.edu.cn/bioinfo/iNR-PhysChem. Also a step-by-step guide is provided on how to use the web-server to get the desired results without the need to follow the complicated mathematics involved in developing the predictor. It is anticipated that iNR-PhysChem may become a useful high throughput tool for both basic research and drug design. PMID:22363503
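The auto-covariance transformation mentioned in this abstract can be illustrated with a minimal sketch. It is not the iNR-PhysChem implementation: the abstract does not list the physical-chemical properties used, so the Kyte-Doolittle hydropathy scale stands in as an assumed example property, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values, used here only as a stand-in property.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def auto_covariance(seq, lag, scale=KYTE_DOOLITTLE):
    """Auto-covariance of one property along a protein sequence at a given lag."""
    values = [scale[aa] for aa in seq.upper()]
    n = len(values)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(seq)")
    mean = sum(values) / n
    # Average product of mean-centred property values 'lag' residues apart.
    total = sum((values[i] - mean) * (values[i + lag] - mean)
                for i in range(n - lag))
    return total / (n - lag)


def ac_features(seq, max_lag, scale=KYTE_DOOLITTLE):
    """Fixed-length feature vector: auto-covariance at lags 1..max_lag."""
    return [auto_covariance(seq, lag, scale) for lag in range(1, max_lag + 1)]


if __name__ == "__main__":
    print(ac_features("IRIRIRLK", max_lag=3))  # toy peptide, not real data
```

Features of this kind give every sequence a vector of the same length regardless of sequence length, which is what allows a standard classifier to be trained on proteins of different sizes.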
CSL encodes a leucine-rich-repeat protein implicated in red/violet light signaling to the circadian clock in Chlamydomonas

PubMed Central

Kinoshita, Ayumi; Niwa, Yoshimi; Onai, Kiyoshi; Fukuzawa, Hideya; Ishiura, Masahiro

2017-01-01

The green alga Chlamydomonas reinhardtii shows various light responses in behavior and physiology. One such photoresponse is the circadian clock, which can be reset by external light signals to entrain its oscillation to daily environmental cycles. In a previous report, we suggested that a light-induced degradation of the clock protein ROC15 is a trigger to reset the circadian clock in Chlamydomonas. However, light signaling pathways of this process remained unclear. Here, we screened for mutants that show abnormal ROC15 diurnal rhythms, including the light-induced protein degradation at dawn, using a luciferase fusion reporter. In one mutant, ROC15 degradation and phase resetting of the circadian clock by light were impaired. Interestingly, the impairments were observed in response to red and violet light, but not to blue light. We revealed that an uncharacterized gene encoding a protein similar to RAS-signaling-related leucine-rich repeat (LRR) proteins is responsible for the mutant phenotypes. Our results indicate that a previously uncharacterized red/violet light signaling pathway is involved in the phase resetting of circadian clock in Chlamydomonas. PMID:28333924
Discovery of Escherichia coli CRISPR sequences in an undergraduate laboratory.

PubMed

Militello, Kevin T; Lazatin, Justine C

2017-05-01

Clustered regularly interspaced short palindromic repeats (CRISPRs) represent a novel type of adaptive immune system found in eubacteria and archaebacteria. CRISPRs have recently generated a lot of attention due to their unique ability to catalog foreign nucleic acids, their ability to destroy foreign nucleic acids in a mechanism that shares some similarity to RNA interference, and the ability to utilize reconstituted CRISPR systems for genome editing in numerous organisms. In order to introduce CRISPR biology into an undergraduate upper-level laboratory, a five-week set of exercises was designed to allow students to examine the CRISPR status of uncharacterized Escherichia coli strains and to allow the discovery of new repeats and spacers. Students started the project by isolating genomic DNA from E. coli and amplifying the iap CRISPR locus using the polymerase chain reaction (PCR). The PCR products were analyzed by Sanger DNA sequencing, and the sequences were examined for the presence of CRISPR repeat sequences. The regions between the repeats, the spacers, were extracted and analyzed with BLASTN searches. Overall, CRISPR loci were sequenced from several previously uncharacterized E. coli strains and one E. coli K-12 strain. Sanger DNA sequencing resulted in the discovery of 36 spacer sequences and their corresponding surrounding repeat sequences. Five of the spacers were homologous to foreign (non-E. coli) DNA. Assessment of the laboratory indicates that improvements were made in the ability of students to answer questions relating to the structure and function of CRISPRs. Future directions of the laboratory are presented and discussed. © 2016 by The International Union of Biochemistry and Molecular Biology, 45(3):262-269, 2017.
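The spacer-extraction step described in this abstract (locating the repeats in a sequenced locus and taking the regions between them) can be sketched as follows. This is a simplified illustration that assumes exact, non-overlapping repeat matches; real CRISPR repeats may carry mismatches. The function name, the repeat, and the locus are toy values, not data from the study.

```python
def extract_spacers(locus, repeat):
    """Return the sequences lying between consecutive exact copies of 'repeat'."""
    locus = locus.upper()
    repeat = repeat.upper()
    if not repeat:
        raise ValueError("repeat must be a non-empty sequence")

    # Collect the start position of every non-overlapping repeat copy.
    positions = []
    start = locus.find(repeat)
    while start != -1:
        positions.append(start)
        start = locus.find(repeat, start + len(repeat))

    # A spacer runs from the end of one repeat to the start of the next.
    return [locus[left + len(repeat):right]
            for left, right in zip(positions, positions[1:])]


if __name__ == "__main__":
    toy_repeat = "GGTTC"  # hypothetical repeat, far shorter than a real one
    toy_locus = "AA" + toy_repeat + "ACGTACGT" + toy_repeat + "TTGCA" + toy_repeat + "CC"
    for spacer in extract_spacers(toy_locus, toy_repeat):
        print(spacer)  # each spacer could then be submitted to a BLASTN search
```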
Functional similarity and molecular divergence of a novel reproductive transcriptome in two male-pregnant Syngnathus pipefish species

PubMed Central

Small, Clayton M; Harlin-Cognato, April D; Jones, Adam G

2013-01-01

Evolutionary studies have revealed that reproductive proteins in animals and plants often evolve more rapidly than the genome-wide average. The causes of this pattern, which may include relaxed purifying selection, sexual selection, sexual conflict, pathogen resistance, reinforcement, or gene duplication, remain elusive. Investigative expansions to additional taxa and reproductive tissues have the potential to shed new light on this unresolved problem. Here, we embark on such an expansion, in a comparison of the brood-pouch transcriptome between two male-pregnant species of the pipefish genus Syngnathus. Male brooding tissues in syngnathid fishes represent a novel, nonurogenital reproductive trait, heretofore mostly uncharacterized from a molecular perspective. We leveraged next-generation sequencing (Roche 454 pyrosequencing) to compare transcript abundance in the male brooding tissues of pregnant with nonpregnant samples from Gulf (S. scovelli) and dusky (S. floridae) pipefish. A core set of protein-coding genes, including multiple members of astacin metalloprotease and c-type lectin gene families, is consistent between species in both the direction and magnitude of expression bias. As predicted, coding DNA sequence analysis of these putative “male pregnancy proteins” suggests rapid evolution relative to nondifferentially expressed genes and reflects signatures of adaptation similar in magnitude to those reported from Drosophila male accessory gland proteins. Although the precise drivers of male pregnancy protein divergence remain unknown, we argue that the male pregnancy transcriptome in syngnathid fishes, a clade diverse with respect to brooding morphology and mating system, represents a unique and promising object of study for understanding the perplexing evolutionary nature of reproductive molecules. PMID:24324861
Chromosome segregation in Archaea mediated by a hybrid DNA partition machine

PubMed Central

Kalliomaa-Sanford, Anne K.; Rodriguez-Castañeda, Fernando A.; McLeod, Brett N.; Latorre-Roselló, Victor; Smith, Jasmine H.; Reimann, Julia; Albers, Sonja V.; Barillà, Daniela

2012-01-01

Eukarya and, more recently, some bacteria have been shown to rely on a cytoskeleton-based apparatus to drive chromosome segregation. In contrast, the factors and mechanisms underpinning this fundamental process are underexplored in archaea, the third domain of life. Here we establish that the archaeon Sulfolobus solfataricus harbors a hybrid segrosome consisting of two interacting proteins, SegA and SegB, that play a key role in genome segregation in this organism. SegA is an ortholog of bacterial, Walker-type ParA proteins, whereas SegB is an archaea-specific factor lacking sequence identity to either eukaryotic or bacterial proteins, but sharing homology with a cluster of uncharacterized factors conserved in both crenarchaea and euryarchaea, the two major archaeal sub-phyla. We show that SegA is an ATPase that polymerizes in vitro and that SegB is a site-specific DNA-binding protein contacting palindromic sequences located upstream of the segAB cassette. SegB interacts with SegA in the presence of nucleotides and dramatically affects its polymerization dynamics. Our data demonstrate that SegB strongly stimulates SegA polymerization, possibly by promoting SegA nucleation and accelerating polymer growth. Increased expression levels of segAB resulted in severe growth and chromosome segregation defects, including formation of anucleate cells, compact nucleoids confined to one half of the cell compartment and fragmented nucleoids. The overall picture emerging from our findings indicates that the SegAB complex fulfills a crucial function in chromosome segregation and is the prototype of a DNA partition machine widespread across archaea. PMID:22355141
Genome-wide identification, classification, and functional analysis of the basic helix-loop-helix transcription factors in the cattle, Bos taurus.

PubMed

Li, Fengmei; Liu, Wuyi

2017-06-01

The basic helix-loop-helix (bHLH) transcription factors (TFs) form a huge superfamily and play crucial roles in many essential developmental, genetic, and physiological-biochemical processes of eukaryotes. In total, 109 putative bHLH TFs were identified and categorized in the genomic databases of cattle, Bos taurus, after removing redundant sequences and merging genetic isoforms. Through phylogenetic analyses, 105 of these bHLH TFs were classified into 44 families with 46, 25, 14, 3, 13, and 4 members in the high-order groups A, B, C, D, E, and F, respectively. The remaining 4 bHLH proteins were classified as 'orphans.' Next, the 109 putative bHLH proteins were found to be significantly enriched in 524 Gene Ontology (GO) annotations (corrected P value ≤ 0.05) and 21 pathways (corrected P value ≤ 0.05) mapped by the web server KOBAS 2.0. Furthermore, 95 bHLH proteins were screened and analyzed together with two uncharacterized proteins in the STRING online database to reconstruct the protein-protein interaction network of cattle bHLH TFs. Ultimately, 89 bHLH proteins were fully mapped in a network with 67 biological processes, 13 molecular functions, 5 KEGG pathways, 12 PFAM protein domains, and 25 INTERPRO classified protein domains and features. These results provide useful information and a good reference for further functional investigations and updated research on cattle bHLH TFs.
The Large Mitochondrial Genome of Symbiodinium minutum Reveals Conserved Noncoding Sequences between Dinoflagellates and Apicomplexans

PubMed Central

Shoguchi, Eiichi; Shinzato, Chuya; Hisata, Kanako; Satoh, Nori; Mungpakdee, Sutada

2015-01-01

Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (∼326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (∼64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures. PMID:26199191
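The AT-richness figure reported in this abstract is a simple base-composition statistic. A minimal sketch of how such a value is computed is given below; the function name is hypothetical and the input is a toy string, not the S. minutum assembly.

```python
def at_content(seq):
    """Fraction of A and T among the unambiguous bases (A, C, G, T) of a sequence."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    # Ambiguity codes such as N are ignored rather than counted as non-AT.
    return sum(base in "AT" for base in counted) / len(counted)


if __name__ == "__main__":
    print(f"{at_content('ATTAGCATNNAT') * 100:.1f}% AT")  # toy sequence
```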
Diagnostic Markers of Ovarian Cancer by High-Throughput Antigen Cloning and Detection on Arrays

PubMed Central

Chatterjee, Madhumita; Mohapatra, Saroj; Ionan, Alexei; Bawa, Gagandeep; Ali-Fehmi, Rouba; Wang, Xiaoju; Nowak, James; Ye, Bin; Nahhas, Fatimah A.; Lu, Karen; Witkin, Steven S.; Fishman, David; Munkarah, Adnan; Morris, Robert; Levin, Nancy K.; Shirley, Natalie N.; Tromp, Gerard; Abrams, Judith; Draghici, Sorin; Tainsky, Michael A.

2008-01-01

A noninvasive screening test would significantly facilitate early detection of epithelial ovarian cancer. This study used a combination of high-throughput selection and array-based serologic detection of many antigens indicative of the presence of cancer, thereby using the immune system as a biosensor. This high-throughput selection involved biopanning of an ovarian cancer phage display library using serum immunoglobulins from an ovarian cancer patient as bait. Protein macroarrays containing 480 of these selected antigen clones revealed 65 clones that interacted with immunoglobulins in sera from 32 ovarian cancer patients but not with sera from 25 healthy women or 14 patients having other benign or malignant gynecologic diseases. Sequence analysis data of these 65 clones revealed 62 different antigens. Among the markers, we identified some known antigens, including RCAS1, signal recognition protein-19, AHNAK-related sequence, nuclear autoantigenic sperm protein, Nijmegen breakage syndrome 1 (Nibrin), ribosomal protein L4, Homo sapiens KIAA0419 gene product, eukaryotic initiation factor 5A, and casein kinase II, as well as many previously uncharacterized antigenic gene products. Using these 65 antigens on protein microarrays, we trained neural networks on two-color fluorescent detection of serum IgG binding and found an average sensitivity and specificity of 55% and 98%, respectively. In addition, the top 6 of the most specific clones resulted in an average sensitivity and specificity of 32% and 94%, respectively. This global approach to antigenic profiling, epitomics, has applications to cancer and autoimmune diseases for diagnostic and therapeutic studies. Further work with larger panels of antigens should provide a comprehensive set of markers with sufficient sensitivity and specificity suitable for clinical testing in high-risk populations. PMID:16424057
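Sensitivity and specificity, the two performance measures quoted in this abstract, are computed from a confusion matrix as sketched below. The counts in the usage example are invented purely for illustration and are not taken from the study; the function name is likewise hypothetical.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): fraction of true cases that test positive
    specificity = TN / (TN + FP): fraction of controls that test negative
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one control")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical counts: 10 patients (8 detected), 20 controls (18 negative).
    sens, spec = sensitivity_specificity(tp=8, fn=2, tn=18, fp=2)
    print(f"sensitivity={sens:.0%}, specificity={spec:.0%}")
```

A screening test aimed at a high-risk population typically prioritizes specificity, since false positives lead to unnecessary follow-up procedures; this is consistent with the high-specificity, moderate-sensitivity profile reported above.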
Proteolytic Processing of Turnip Yellow Mosaic Virus Replication Proteins and Functional Impact on Infectivity

PubMed Central

Jakubiec, Anna; Drugeon, Gabrièle; Camborde, Laurent; Jupin, Isabelle

2007-01-01

Turnip yellow mosaic virus (TYMV), a positive-strand RNA virus belonging to the alphavirus-like supergroup, encodes its nonstructural replication proteins as a 206K precursor with domains indicative of methyltransferase (MT), proteinase (PRO), NTPase/helicase (HEL), and polymerase (POL) activities. Subsequent processing of 206K generates a 66K protein encompassing the POL domain and uncharacterized 115K and 85K proteins. Here, we demonstrate that TYMV proteinase mediates an additional cleavage between the PRO and HEL domains of the polyprotein, generating the 115K protein and a 42K protein encompassing the HEL domain that can be detected in plant cells using a specific antiserum. Deletion and substitution mutagenesis experiments and sequence comparisons indicate that the scissile bond is located between residues Ser879 and Gln880. The 85K protein is generated by a host proteinase and is likely to result from nonspecific proteolytic degradation occurring during protein sample extraction or analysis. We also report that TYMV proteinase has the ability to process substrates in trans in vivo. Finally, we examined the processing of the 206K protein containing native, mutated, or shuffled cleavage sites and analyzed the effects of cleavage mutations on viral infectivity and RNA synthesis by performing reverse-genetics experiments. We present evidence that PRO/HEL cleavage is critical for productive virus infection and that the impaired infectivity of PRO/HEL cleavage mutants is due mainly to defective synthesis of positive-strand RNA. PMID:17686855
Identification of specific posttranslational O-mycoloylations mediating protein targeting to the mycomembrane.

PubMed

Carel, Clément; Marcoux, Julien; Réat, Valérie; Parra, Julien; Latgé, Guillaume; Laval, Françoise; Demange, Pascal; Burlet-Schiltz, Odile; Milon, Alain; Daffé, Mamadou; Tropis, Maryelle G; Renault, Marie A M

2017-04-18

The outer membranes (OMs) of members of the Corynebacteriales bacterial order, also called mycomembranes, harbor mycolic acids and unusual outer membrane proteins (OMPs), including those with α-helical structure. The signals that allow precursors of such proteins to be targeted to the mycomembrane remain uncharacterized. We report here the molecular features responsible for OMP targeting to the mycomembrane of Corynebacterium glutamicum, a nonpathogenic member of the Corynebacteriales order. To better understand the mechanisms by which OMP precursors were sorted in C. glutamicum, we first investigated the partitioning of endogenous and recombinant PorA, PorH, PorB, and PorC between bacterial compartments and showed that they were both imported into the mycomembrane and secreted into the extracellular medium. A detailed investigation of cell extracts and purified proteins by top-down MS, NMR spectroscopy, and site-directed mutagenesis revealed specific and well-conserved posttranslational modifications (PTMs), including O-mycoloylation, pyroglutamylation, and N-formylation, for mycomembrane-associated and -secreted OMPs. PTM site sequence analysis from C. glutamicum OMP and other O-acylated proteins in bacteria and eukaryotes revealed specific patterns. Furthermore, we found that such modifications were essential for targeting to the mycomembrane and sufficient for OMP assembly into mycolic acid-containing lipid bilayers. Collectively, it seems that these PTMs have evolved in the Corynebacteriales order and beyond to guide membrane proteins toward a specific cell compartment.
Identification of specific posttranslational O-mycoloylations mediating protein targeting to the mycomembrane

PubMed Central

Carel, Clément; Réat, Valérie; Parra, Julien; Latgé, Guillaume; Laval, Françoise; Burlet-Schiltz, Odile; Milon, Alain; Daffé, Mamadou; Tropis, Maryelle G.; Renault, Marie A. M.

2017-01-01

The outer membranes (OMs) of members of the Corynebacteriales bacterial order, also called mycomembranes, harbor mycolic acids and unusual outer membrane proteins (OMPs), including those with α-helical structure. The signals that allow precursors of such proteins to be targeted to the mycomembrane remain uncharacterized. We report here the molecular features responsible for OMP targeting to the mycomembrane of Corynebacterium glutamicum, a nonpathogenic member of the Corynebacteriales order. To better understand the mechanisms by which OMP precursors were sorted in C. glutamicum, we first investigated the partitioning of endogenous and recombinant PorA, PorH, PorB, and PorC between bacterial compartments and showed that they were both imported into the mycomembrane and secreted into the extracellular medium. A detailed investigation of cell extracts and purified proteins by top-down MS, NMR spectroscopy, and site-directed mutagenesis revealed specific and well-conserved posttranslational modifications (PTMs), including O-mycoloylation, pyroglutamylation, and N-formylation, for mycomembrane-associated and -secreted OMPs. PTM site sequence analysis from C. glutamicum OMP and other O-acylated proteins in bacteria and eukaryotes revealed specific patterns. Furthermore, we found that such modifications were essential for targeting to the mycomembrane and sufficient for OMP assembly into mycolic acid-containing lipid bilayers. Collectively, it seems that these PTMs have evolved in the Corynebacteriales order and beyond to guide membrane proteins toward a specific cell compartment. PMID:28373551
Identification and biochemical characterization of an acid sphingomyelinase-like protein from the bacterial plant pathogen Ralstonia solanacearum that hydrolyzes ATP to AMP but not sphingomyelin to ceramide.

PubMed

Airola, Michael V; Tumolo, Jessica M; Snider, Justin; Hannun, Yusuf A

2014-01-01

Acid sphingomyelinase (aSMase) is a human enzyme that catalyzes the hydrolysis of sphingomyelin to generate the bioactive lipid ceramide and phosphocholine. ASMase deficiency is the underlying cause of the genetic diseases Niemann-Pick Type A and B and has been implicated in the onset and progression of a number of other human diseases including cancer, depression, liver, and cardiovascular disease. ASMase is the founding member of the aSMase protein superfamily, which is a subset of the metallophosphatase (MPP) superfamily. To date, MPPs that share sequence homology with aSMase, termed aSMase-like proteins, have been annotated and presumed to function as aSMases. However, none of these aSMase-like proteins has been biochemically characterized to verify this. Here we identify RsASML, previously annotated as RSp1609: acid sphingomyelinase-like phosphodiesterase, as the first bacterial aSMase-like protein from the deadly plant pathogen Ralstonia solanacearum based on sequence homology with the catalytic and C-terminal domains of human aSMase. A biochemical characterization of RsASML does not support a role in sphingomyelin hydrolysis but rather finds RsASML capable of acting as an ATP diphosphohydrolase, catalyzing the hydrolysis of ATP and ADP to AMP. In addition, RsASML displays a neutral, not acidic, pH optimum and prefers Ni2+ or Mn2+, not Zn2+, for catalysis. This alters the expectation that all aSMase-like proteins function as acid SMases and expands the substrate possibilities of this protein superfamily to include nucleotides. Overall, we conclude that sequence homology with human aSMase is not sufficient to predict substrate specificity, pH optimum for catalysis, or metal dependence. This may have implications for the biochemically uncharacterized human aSMase paralogs, aSMase-like 3a (aSML3a) and aSML3b, which have been implicated in cancer and kidney disease, respectively, and assumed to function as aSMases.
Measuring the Global Substrate Specificity of Mycobacterial Serine Hydrolases Using a Library of Fluorogenic Ester Substrates.

PubMed

Bassett, Braden; Waibel, Brent; White, Alex; Hansen, Heather; Stephens, Dominique; Koelper, Andrew; Larsen, Erik M; Kim, Charles; Glanzer, Adam; Lavis, Luke D; Hoops, Geoffrey C; Johnson, R Jeremy

2018-04-16

Among the proteins required for lipid metabolism in Mycobacterium tuberculosis are a significant number of uncharacterized serine hydrolases, especially lipases and esterases. Using a streamlined synthetic method, a library of immolative fluorogenic ester substrates was expanded to better represent the natural lipidomic diversity of Mycobacterium. This expanded fluorogenic library was then used to rapidly characterize the global structure-activity relationship (SAR) of mycobacterial serine hydrolases in M. smegmatis under different growth conditions. Confirmation of fluorogenic substrate activation by mycobacterial serine hydrolases was performed using nonspecific serine hydrolase inhibitors and reinforced the biological significance of the SAR. The hydrolases responsible for the global SAR were then assigned using gel-resolved activity measurements, and these assignments were used to rapidly identify the relative substrate specificity of previously uncharacterized mycobacterial hydrolases. These measurements provide a global SAR of mycobacterial hydrolase activity, a picture of cycling hydrolase activity, and a detailed substrate specificity profile for previously uncharacterized hydrolases.
Selective inhibition of miR-92 in hippocampal neurons alters contextual fear memory.

PubMed

Vetere, Gisella; Barbato, Christian; Pezzola, Silvia; Frisone, Paola; Aceti, Massimiliano; Ciotti, MariaTeresa; Cogoni, Carlo; Ammassari-Teule, Martine; Ruberti, Francesca

2014-12-01

Post-transcriptional gene regulation mediated by microRNAs (miRNAs) is implicated in memory formation; however, the function of miR-92 in this regulation is uncharacterized. The present study shows that training mice in contextual fear conditioning produces a transient increase in miR-92 levels in the hippocampus and decreases several miR-92 gene targets, including: (i) the neuronal Cl(-) extruding K(+) Cl(-) co-transporter 2 (KCC2) protein; (ii) the cytoplasmic polyadenylation protein (CPEB3), an RNA-binding protein regulator of protein synthesis in neurons; and (iii) the transcription factor myocyte enhancer factor 2D (MEF2D), one of the MEF2 genes which negatively regulates memory-induced structural plasticity. Selective inhibition of endogenous miR-92 in CA1 hippocampal neurons, by a sponge lentiviral vector expressing multiple sequences imperfectly complementary to mature miR-92 under the control of the neuronal specific synapsin promoter, leads to up-regulation of KCC2, CPEB3 and MEF2D, impairs contextual fear conditioning, and prevents a memory-induced increase in the spine density. Taken together, the results indicate that neuronal-expressed miR-92 is an endogenous fine regulator of contextual fear memory in mice. © 2014 Wiley Periodicals, Inc.
C2orf62 and TTC17 are involved in actin organization and ciliogenesis in zebrafish and human.

PubMed

Bontems, Franck; Fish, Richard J; Borlat, Irene; Lembo, Frédérique; Chocu, Sophie; Chalmel, Frédéric; Borg, Jean-Paul; Pineau, Charles; Neerman-Arbez, Marguerite; Bairoch, Amos; Lane, Lydie

2014-01-01

Vertebrate genomes contain around 20,000 protein-encoding genes, of which a large fraction is still not associated with specific functions. A major task in future genomics will thus be to assign physiological roles to all open reading frames revealed by genome sequencing. Here we show that C2orf62, a highly conserved protein with little homology to characterized proteins, is strongly expressed in testis in zebrafish and mammals, and in various types of ciliated cells during zebrafish development. By yeast two hybrid and GST pull-down, C2orf62 was shown to interact with TTC17, another uncharacterized protein. Depletion of either C2orf62 or TTC17 in human ciliated cells interferes with actin polymerization and reduces the number of primary cilia without changing their length. Zebrafish embryos injected with morpholinos against C2orf62 or TTC17, or with mRNA coding for the C2orf62 C-terminal part containing a RII dimerization/docking (R2D2) - like domain show morphological defects consistent with imperfect ciliogenesis. We provide here the first evidence for a C2orf62-TTC17 axis that would regulate actin polymerization and ciliogenesis.
Discovery of uncharacterized sugarcane viruses by next generation sequencing technology: the case of Ramu stunt

USDA-ARS's Scientific Manuscript database

Ramu stunt disease of sugarcane was first reported in Papua New Guinea in the mid-1980s. The disease can reduce sugarcane yields significantly and causes severe stunting and mortality in highly susceptible cultivars. The causal agent of Ramu stunt has been investigated but its characterization has ...
Sequences of 95 human MHC haplotypes reveal extreme coding variation in genes other than highly polymorphic HLA class I and II

PubMed Central

Norman, Paul J.; Norberg, Steven J.; Guethlein, Lisbeth A.; Nemat-Gorgani, Neda; Royce, Thomas; Wroblewski, Emily E.; Dunn, Tamsen; Mann, Tobias; Alicata, Claudia; Hollenbach, Jill A.; Chang, Weihua; Shults Won, Melissa; Gunderson, Kevin L.; Abi-Rached, Laurent; Ronaghi, Mostafa; Parham, Peter

2017-01-01

The most polymorphic part of the human genome, the MHC, encodes over 160 proteins of diverse function. Half of them, including the HLA class I and II genes, are directly involved in immune responses. Consequently, the MHC region strongly associates with numerous diseases and clinical therapies. Notoriously, the MHC region has been intractable to high-throughput analysis at complete sequence resolution, and current reference haplotypes are inadequate for large-scale studies. To address these challenges, we developed a method that specifically captures and sequences the 4.8-Mbp MHC region from genomic DNA. For 95 MHC homozygous cell lines we assembled, de novo, a set of high-fidelity contigs and a sequence scaffold, representing a mean 98% of the target region. Included are six alternative MHC reference sequences of the human genome that we completed and refined. Characterization of the sequence and structural diversity of the MHC region shows the approach accurately determines the sequences of the highly polymorphic HLA class I and HLA class II genes and the complex structural diversity of complement factor C4A/C4B. It has also uncovered extensive and unexpected diversity in other MHC genes; an example is MUC22, which encodes a lung mucin and exhibits more coding sequence alleles than any HLA class I or II gene studied here. More than 60% of the coding sequence alleles analyzed were previously uncharacterized. We have created a substantial database of robust reference MHC haplotype sequences that will enable future population scale studies of this complicated and clinically important region of the human genome. PMID:28360230
Predicting highly-connected hubs in protein interaction networks by QSAR and biological data descriptors

PubMed Central

Hsing, Michael; Byler, Kendall; Cherkasov, Artem

2009-01-01

Hub proteins (those engaged in most physical interactions in a protein interaction network (PIN)) have recently gained much research interest due to their essential role in mediating cellular processes and their potential therapeutic value. It is straightforward to identify hubs if the underlying PIN is experimentally determined; however, theoretical hub prediction remains a very challenging task, as physicochemical properties that differentiate hubs from less connected proteins remain mostly uncharacterized. To adequately distinguish hubs from non-hub proteins we have utilized over 1300 protein descriptors, some of which represent QSAR (quantitative structure-activity relationship) parameters, and some reflect sequence-derived characteristics of proteins including domain composition and functional annotations. Those protein descriptors, together with available protein interaction data, have been processed by a machine learning method (boosting trees) and resulted in the development of hub classifiers that are capable of predicting highly interacting proteins for four model organisms: Escherichia coli, Saccharomyces cerevisiae, Drosophila melanogaster and Homo sapiens. More importantly, through the analyses of the most relevant protein descriptors, we are able to demonstrate that hub proteins not only share certain common physicochemical and structural characteristics that make them different from non-hub counterparts, but they also exhibit species-specific characteristics that should be taken into account when analyzing different PINs. The developed prediction models can be used for determining highly interacting proteins in the four studied species to assist future proteomics experiments and PIN analyses. Availability: The source code and executable program of the hub classifier are available for download at: http://www.cnbi2.ca/hub-analysis/ PMID:20198194
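The abstract notes that hubs are straightforward to identify once the PIN is experimentally determined. A minimal sketch of that baseline step follows, assuming the network is supplied as a list of undirected protein pairs and that a "hub" is any protein whose degree ranks in the top fraction of the network. The function name, the 20% cutoff, and the tie-breaking rule are illustrative assumptions, not the authors' definition or their boosting-tree classifier.

```python
from collections import defaultdict


def find_hubs(edges, top_fraction=0.2):
    """Rank proteins by degree and return the top `top_fraction` as hubs.

    `edges` is an iterable of (protein_a, protein_b) pairs.
    The cutoff is a hypothetical choice made for this sketch.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Degree = number of distinct interaction partners.
    degree = {protein: len(partners) for protein, partners in neighbours.items()}

    # Always report at least one hub when the network is non-empty.
    n_hubs = max(1, int(len(degree) * top_fraction)) if degree else 0

    # Sort by descending degree, then by name so ties are deterministic.
    ranked = sorted(degree, key=lambda p: (-degree[p], p))
    return ranked[:n_hubs], degree
```

Calling `find_hubs` on an edge list returns the hub names together with the full degree table, which could serve as the labels that a descriptor-based classifier is trained to reproduce.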

LIFEGUARD proteins support plant colonization by biotrophic powdery mildew fungi.

PubMed

Weis, Corina; Hückelhoven, Ralph; Eichmann, Ruth

2013-09-01

Pathogenic microbes manipulate eukaryotic cells during invasion and target plant proteins to achieve host susceptibility. BAX INHIBITOR-1 (BI-1) is an endoplasmic reticulum-resident cell death suppressor in plants and animals and is required for full susceptibility of barley to the barley powdery mildew fungus Blumeria graminis f.sp. hordei. LIFEGUARD (LFG) proteins resemble BI-1 proteins in terms of predicted membrane topology and cell-death-inhibiting function in metazoans, but display clear sequence-specific distinctions. This work shows that barley (Hordeum vulgare L.) and Arabidopsis thaliana genomes harbour five LFG genes, HvLFGa-HvLFGe and AtLFG1-AtLFG5, whose functions are largely uncharacterized. As observed for HvBI-1, single-cell overexpression of HvLFGa supports penetration success of B. graminis f.sp. hordei into barley epidermal cells, while transient-induced gene silencing restricts it. In penetrated barley epidermal cells, a green fluorescent protein-tagged HvLFGa protein accumulates at the site of fungal entry, around fungal haustoria and in endosomal or vacuolar membranes. The data further suggest a role of LFG proteins in plant-powdery mildew interactions in both monocot and dicot plants, because stable overexpression or knockdown of AtLFG1 or AtLFG2 also support or delay development of the powdery mildew fungus Erysiphe cruciferarum on the respective Arabidopsis mutants. Together, this work has identified new modulators of plant-powdery mildew interactions, and the data further support functional similarities between BI-1 and LFG proteins beyond cell death regulation.
LIFEGUARD proteins support plant colonization by biotrophic powdery mildew fungi

PubMed Central

Weis, Corina; Hückelhoven, Ralph; Eichmann, Ruth

2013-01-01

Pathogenic microbes manipulate eukaryotic cells during invasion and target plant proteins to achieve host susceptibility. BAX INHIBITOR-1 (BI-1) is an endoplasmic reticulum-resident cell death suppressor in plants and animals and is required for full susceptibility of barley to the barley powdery mildew fungus Blumeria graminis f.sp. hordei. LIFEGUARD (LFG) proteins resemble BI-1 proteins in terms of predicted membrane topology and cell-death-inhibiting function in metazoans, but display clear sequence-specific distinctions. This work shows that barley (Hordeum vulgare L.) and Arabidopsis thaliana genomes harbour five LFG genes, HvLFGa–HvLFGe and AtLFG1–AtLFG5, whose functions are largely uncharacterized. As observed for HvBI-1, single-cell overexpression of HvLFGa supports penetration success of B. graminis f.sp. hordei into barley epidermal cells, while transient-induced gene silencing restricts it. In penetrated barley epidermal cells, a green fluorescent protein-tagged HvLFGa protein accumulates at the site of fungal entry, around fungal haustoria and in endosomal or vacuolar membranes. The data further suggest a role of LFG proteins in plant–powdery mildew interactions in both monocot and dicot plants, because stable overexpression or knockdown of AtLFG1 or AtLFG2 also support or delay development of the powdery mildew fungus Erysiphe cruciferarum on the respective Arabidopsis mutants. Together, this work has identified new modulators of plant–powdery mildew interactions, and the data further support functional similarities between BI-1 and LFG proteins beyond cell death regulation. PMID:23888068
pLoc-mPlant: predict subcellular localization of multi-location plant proteins by incorporating the optimal GO information into general PseAAC.

PubMed

Cheng, Xiang; Xiao, Xuan; Chou, Kuo-Chen

2017-08-22

One of the fundamental goals in cellular biochemistry is to identify the functions of proteins in the context of compartments that organize them in the cellular environment. To realize this, it is indispensable to develop an automated method for fast and accurate identification of the subcellular locations of uncharacterized proteins. The current study is focused on plant protein subcellular location prediction based on the sequence information alone. Although considerable efforts have been made in this regard, the problem is far from being solved yet. Most of the existing methods can be used to deal with single-location proteins only. Actually, proteins with multi-locations may have some special biological functions. This kind of multiplex protein is particularly important for both basic research and drug design. Using the multi-label theory, we present a new predictor called "pLoc-mPlant" by extracting the optimal GO (Gene Ontology) information into the Chou's general PseAAC (Pseudo Amino Acid Composition). Rigorous cross-validation on the same stringent benchmark dataset indicated that the proposed pLoc-mPlant predictor is remarkably superior to iLoc-Plant, the state-of-the-art method for predicting plant protein subcellular localization. To maximize the convenience of most experimental scientists, a user-friendly web-server for the new predictor has been established at , by which users can easily get their desired results without the need to go through the complicated mathematics involved.
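PseAAC extends the plain amino acid composition of a sequence. The sketch below computes only that 20-dimensional baseline, as an illustration of what a sequence-derived feature vector looks like. It does not reproduce pLoc-mPlant's features, which according to the abstract incorporate GO information; the function name and the handling of non-standard residues are assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, fixed order


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in `sequence`.

    Non-standard symbols (X, B, Z, gaps, ...) are skipped, which is an
    assumption made for this sketch.
    """
    counted = [aa for aa in sequence.upper() if aa in AMINO_ACIDS]
    if not counted:
        raise ValueError("sequence contains no standard amino acids")
    total = len(counted)
    return [counted.count(aa) / total for aa in AMINO_ACIDS]
```

The returned list always has 20 entries that sum to 1, so proteins of different lengths map to vectors of the same size.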
Evaluation of Combining Ability and Grain Quality of Quality Protein Maize Derived from U.S. Public Inbred Lines

USDA-ARS's Scientific Manuscript database

Quality Protein Maize (QPM) has improved nutritional quality due to the opaque2 mutation as well as hard endosperm conferred by uncharacterized modifier genes. We have developed a series of QPM inbred lines based on crosses between public U.S. Corn Belt-adapted lines with QPM lines developed at the...
Systematic Identification and Characterization of Novel Human Skin-Associated Genes Encoding Membrane and Secreted Proteins

PubMed Central

Buhren, Bettina Alexandra; Martinez, Cynthia; Schrumpf, Holger; Gasis, Marcia; Grether-Beck, Susanne; Krutmann, Jean

2013-01-01

Through bioinformatics analyses of a human gene expression database representing 105 different tissues and cell types, we identified 687 skin-associated genes that are selectively and highly expressed in human skin. Over 50 of these represent uncharacterized genes not previously associated with skin and include a subset that encode novel secreted and plasma membrane proteins. The high levels of skin-associated expression for eight of these novel therapeutic target genes were confirmed by semi-quantitative real time PCR, western blot and immunohistochemical analyses of normal skin and skin-derived cell lines. Four of these are expressed specifically by epidermal keratinocytes; two that encode G-protein-coupled receptors (GPR87 and GPR115), and two that encode secreted proteins (WFDC5 and SERPINB7). Further analyses using cytokine-activated and terminally differentiated human primary keratinocytes or a panel of common inflammatory, autoimmune or malignant skin diseases revealed distinct patterns of regulation as well as disease associations that point to important roles in cutaneous homeostasis and disease. Some of these novel uncharacterized skin genes may represent potential biomarkers or drug targets for the development of future diagnostics or therapeutics. PMID:23840300
Discovering Deeply Divergent RNA Viruses in Existing Metatranscriptome Data with Machine Learning

NASA Astrophysics Data System (ADS)

Rivers, A. R.

2016-02-01

Most sampling of RNA viruses and phages has been directed toward a narrow range of hosts and environments. Several marine metagenomic studies have examined the RNA viral fraction in aquatic samples and found a number of picornaviruses and uncharacterized sequences. The lack of homology to known protein families has limited the discovery of new RNA viruses. We developed a computational method for identifying RNA viruses that relies on information in the codon transition probabilities of viral sequences to train a classifier. This approach does not rely on homology, but it has higher information content than other reference-free methods such as tetranucleotide frequency. Training and validation with RefSeq data gave true positive and true negative rates of 99.6% and 99.5% on the highly imbalanced validation sets (0.2% viruses) that, like the metatranscriptomes themselves, contain mostly non-viral sequences. To further test the method, a validation dataset of putative RNA virus genomes was identified in metatranscriptomes by the presence of RNA-dependent RNA polymerase, an essential gene for RNA viruses. The classifier successfully identified 99.4% of those contigs as viral. This approach is currently being extended to screen all metatranscriptome data sequenced at the DOE Joint Genome Institute, presently 4.5 Gb of assembled data from 504 public projects representing a wide range of marine, aquatic and terrestrial environments.
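The classifier described above is trained on codon transition probabilities. A minimal sketch of how such features might be computed for one sequence follows. It assumes the sequence is read in frame 0 and that any codon pair containing an ambiguous base is skipped; the abstract does not say how reading frames or ambiguity are handled, so both choices, and the function name, are hypothetical.

```python
from collections import Counter, defaultdict


def codon_transition_probabilities(sequence):
    """Estimate P(next codon | current codon) from one nucleotide sequence.

    Assumes reading frame 0 (an assumption of this sketch). RNA input is
    accepted by mapping U to T.
    """
    seq = sequence.upper().replace("U", "T")
    # Split into complete codons; a trailing partial codon is dropped.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]

    counts = defaultdict(Counter)
    for current, following in zip(codons, codons[1:]):
        # Skip pairs that contain ambiguous bases such as N.
        if set(current + following) <= set("ACGT"):
            counts[current][following] += 1

    # Normalise each row so the outgoing probabilities sum to 1.
    return {
        current: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for current, row in counts.items()
    }
```

Flattening the resulting table over a fixed codon ordering would give a fixed-length vector per contig, which is the kind of input a standard classifier expects.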
Consensus Prediction of Charged Single Alpha-Helices with CSAHserver.

PubMed

Dudola, Dániel; Tóth, Gábor; Nyitray, László; Gáspári, Zoltán

2017-01-01

Charged single alpha-helices (CSAHs) constitute a rare structural motif. CSAH is characterized by a high density of regularly alternating residues with positively and negatively charged side chains. Such segments exhibit unique structural properties; however, there are only a handful of proteins where its existence is experimentally verified. Therefore, establishing a pipeline that is capable of predicting the presence of CSAH segments with a low false positive rate is of considerable importance. Here we describe a consensus-based approach that relies on two conceptually different CSAH detection methods and a final filter based on the estimated helix-forming capabilities of the segments. This pipeline was shown to be capable of identifying previously uncharacterized CSAH segments that could be verified experimentally. The method is available as a web server at http://csahserver.itk.ppke.hu and also as a downloadable standalone program suitable to scan larger sequence collections.
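The defining property mentioned above, a high density of positively and negatively charged residues, can be illustrated with a simple sliding-window scan. This is only a crude first-pass filter written for illustration. It is not either of the two detection methods combined by CSAHserver, and the window length, the threshold, and the exclusion of histidine are assumptions.

```python
POSITIVE = set("KR")  # histidine is left out, an assumption of this sketch
NEGATIVE = set("DE")


def charged_windows(sequence, window=8, min_fraction=0.75):
    """Return (start, segment) for windows rich in both charge types.

    `start` is a 0-based index. A window qualifies only if it contains at
    least one positive and one negative residue and the charged fraction
    reaches `min_fraction` (hypothetical threshold).
    """
    seq = sequence.upper()
    hits = []
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        pos = sum(aa in POSITIVE for aa in segment)
        neg = sum(aa in NEGATIVE for aa in segment)
        if pos and neg and (pos + neg) / window >= min_fraction:
            hits.append((start, segment))
    return hits
```

A real predictor would also have to test the regularity of the charge pattern and the helix-forming propensity of the segment, which this scan ignores.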
Kolente virus, a rhabdovirus species isolated from ticks and bats in the Republic of Guinea

PubMed Central

Ghedin, Elodie; Rogers, Matthew B.; Widen, Steven G.; Guzman, Hilda; Travassos da Rosa, Amelia P. A.; Wood, Thomas G.; Fitch, Adam; Popov, Vsevolod; Holmes, Edward C.; Walker, Peter J.; Tesh, Robert B.

2013-01-01

Kolente virus (KOLEV) is a rhabdovirus originally isolated from ticks and a bat in Guinea, West Africa, in 1985. Although tests at the time of isolation suggested that KOLEV is a novel rhabdovirus, it has remained largely uncharacterized. We assembled the complete genome sequence of the prototype strain DakAr K7292, which was found to encode the five canonical rhabdovirus structural proteins (N, P, M, G and L) with alternative ORFs (>180 nt) in the P and L genes. Serologically, KOLEV exhibited a weak antigenic relationship with Barur and Fukuoka viruses in the Kern Canyon group. Phylogenetic analysis revealed that KOLEV represents a distinct and divergent lineage that shows no clear relationship to any rhabdovirus except Oita virus, although with limited phylogenetic resolution. In summary, KOLEV represents a novel species in the family Rhabdoviridae. PMID:24062532
High Throughput Sequencing Identifies Misregulated Genes in the Drosophila Polypyrimidine Tract-Binding Protein (hephaestus) Mutant Defective in Spermatogenesis.

PubMed

Sridharan, Vinod; Heimiller, Joseph; Robida, Mark D; Singh, Ravinder

2016-01-01

The Drosophila polypyrimidine tract-binding protein (dmPTB or hephaestus) plays an important role during spermatogenesis. The heph2 mutation in this gene results in a specific defect in spermatogenesis, causing aberrant spermatid individualization and male sterility. However, the array of molecular defects in the mutant remains uncharacterized. Using an unbiased high throughput sequencing approach, we have identified transcripts that are misregulated in this mutant. Aberrant transcripts show altered expression levels, exon skipping, and alternative 5' ends. We independently verified these findings by reverse-transcription and polymerase chain reaction (RT-PCR) analysis. Our analysis shows misregulation of transcripts that have been connected to spermatogenesis, including components of the actomyosin cytoskeletal apparatus. We show, for example, that the Myosin light chain 1 (Mlc1) transcript is aberrantly spliced. Furthermore, bioinformatics analysis reveals that Mlc1 contains a high affinity binding site(s) for dmPTB and that the site is conserved in many Drosophila species. We discuss that Mlc1 and other components of the actomyosin cytoskeletal apparatus offer important molecular links between the loss of dmPTB function and the observed developmental defect in spermatogenesis. This study provides the first comprehensive list of genes misregulated in vivo in the heph2 mutant in Drosophila and offers insight into the role of dmPTB during spermatogenesis.
Coupling unbiased mutagenesis to high-throughput DNA sequencing uncovers functional domains in the Ndc80 kinetochore protein of Saccharomyces cerevisiae.

PubMed

Tien, Jerry F; Fong, Kimberly K; Umbreit, Neil T; Payen, Celia; Zelter, Alex; Asbury, Charles L; Dunham, Maitreya J; Davis, Trisha N

2013-09-01

During mitosis, kinetochores physically link chromosomes to the dynamic ends of spindle microtubules. This linkage depends on the Ndc80 complex, a conserved and essential microtubule-binding component of the kinetochore. As a member of the complex, the Ndc80 protein forms microtubule attachments through a calponin homology domain. Ndc80 is also required for recruiting other components to the kinetochore and responding to mitotic regulatory signals. While the calponin homology domain has been the focus of biochemical and structural characterization, the function of the remainder of Ndc80 is poorly understood. Here, we utilized a new approach that couples high-throughput sequencing to a saturating linker-scanning mutagenesis screen in Saccharomyces cerevisiae. We identified domains in previously uncharacterized regions of Ndc80 that are essential for its function in vivo. We show that a helical hairpin adjacent to the calponin homology domain influences microtubule binding by the complex. Furthermore, a mutation in this hairpin abolishes the ability of the Dam1 complex to strengthen microtubule attachments made by the Ndc80 complex. Finally, we defined a C-terminal segment of Ndc80 required for tetramerization of the Ndc80 complex in vivo. This unbiased mutagenesis approach can be generally applied to genes in S. cerevisiae to identify functional properties and domains.
Expression and characterization of a new esterase with GCSAG motif from a permafrost metagenomic library.

PubMed

Petrovskaya, Lada E; Novototskaya-Vlasova, Ksenia A; Spirina, Elena V; Durdenko, Ekaterina V; Lomakina, Galina Yu; Zavialova, Maria G; Nikolaev, Evgeny N; Rivkina, Elizaveta M

2016-05-01

As a result of construction and screening of a metagenomic library prepared from a permafrost-derived microcosm, we have isolated a novel gene coding for a putative lipolytic enzyme that belongs to the hormone-sensitive lipase family. It encodes a polypeptide of 343 amino acid residues whose amino acid sequence displays maximum likelihood with uncharacterized proteins from Sphingomonas species. A putative catalytic serine residue of PMGL2 resides in a new variant of a recently discovered GTSAG sequence in which a Thr residue is replaced by a Cys residue (GCSAG). The recombinant PMGL2 was produced in Escherichia coli cells and purified by Ni-affinity chromatography. The resulting protein preferably utilizes short-chain p-nitrophenyl esters (C4 and C8) and therefore is an esterase. It possesses maximum activity at 45°C in slightly alkaline conditions and has limited thermostability at higher temperatures. Activity of PMGL2 is stimulated in the presence of 0.25-1.5 M NaCl indicating the good salt tolerance of the new enzyme. Mass spectrometric analysis demonstrated that N-terminal methionine in PMGL2 is processed and cysteine residues do not form a disulfide bond. The results of the study demonstrate the significance of the permafrost environment as a unique genetic reservoir and its potential for metagenomic exploration. © FEMS 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Bioinformatic analysis of the nucleolus.

PubMed

Leung, Anthony K L; Andersen, Jens S; Mann, Matthias; Lamond, Angus I

2003-12-15

The nucleolus is a plurifunctional, nuclear organelle, which is responsible for ribosome biogenesis and many other functions in eukaryotes, including RNA processing, viral replication and tumour suppression. Our knowledge of the human nucleolar proteome has been expanded dramatically by the two recent MS studies on isolated nucleoli from HeLa cells [Andersen, Lyon, Fox, Leung, Lam, Steen, Mann and Lamond (2002) Curr. Biol. 12, 1-11; Scherl, Coute, Deon, Calle, Kindbeiter, Sanchez, Greco, Hochstrasser and Diaz (2002) Mol. Biol. Cell 13, 4100-4109]. Nearly 400 proteins were identified within the nucleolar proteome so far in humans. Approx. 12% of the identified proteins were previously shown to be nucleolar in human cells and, as expected, nearly all of the known housekeeping proteins required for ribosome biogenesis were identified in these analyses. Surprisingly, approx. 30% represented either novel or uncharacterized proteins. This review focuses on how to apply the derived knowledge of this newly recognized nucleolar proteome, such as their amino acid/peptide composition and their homologies across species, to explore the function and dynamics of the nucleolus, and suggests ways to identify, in silico, possible functions of the novel/uncharacterized proteins and potential interaction networks within the human nucleolus, or between the nucleolus and other nuclear organelles, by drawing resources from the public domain.
Expanded microbial genome coverage and improved protein family annotation in the COG database.

PubMed

Galperin, Michael Y; Makarova, Kira S; Wolf, Yuri I; Koonin, Eugene V

2015-01-01

Microbial genome sequencing projects produce numerous sequences of deduced proteins, only a small fraction of which have been or will ever be studied experimentally. This leaves sequence analysis as the only feasible way to annotate these proteins and assign to them tentative functions. The Clusters of Orthologous Groups of proteins (COGs) database (http://www.ncbi.nlm.nih.gov/COG/), first created in 1997, has been a popular tool for functional annotation. Its success was largely based on (i) its reliance on complete microbial genomes, which allowed reliable assignment of orthologs and paralogs for most genes; (ii) orthology-based approach, which used the function(s) of the characterized member(s) of the protein family (COG) to assign function(s) to the entire set of carefully identified orthologs and describe the range of potential functions when there were more than one; and (iii) careful manual curation of the annotation of the COGs, aimed at detailed prediction of the biological function(s) for each COG while avoiding annotation errors and overprediction. Here we present an update of the COGs, the first since 2003, and a comprehensive revision of the COG annotations and expansion of the genome coverage to include representative complete genomes from all bacterial and archaeal lineages down to the genus level. This re-analysis of the COGs shows that the original COG assignments had an error rate below 0.5% and allows an assessment of the progress in functional genomics in the past 12 years. During this time, functions of many previously uncharacterized COGs have been elucidated and tentative functional assignments of many COGs have been validated, either by targeted experiments or through the use of high-throughput methods. A particularly important development is the assignment of functions to several widespread, conserved proteins many of which turned out to participate in translation, in particular rRNA maturation and tRNA modification. 
The new version of the COGs is expected to become an important tool for microbial genomics. Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by US Government employees and is in the public domain in the US.
Predicting membrane protein types using various decision tree classifiers based on various modes of general PseAAC for imbalanced datasets.

PubMed

Sankari, E Siva; Manimegalai, D

2017-12-21

Predicting membrane protein types is an important and challenging research area in bioinformatics and proteomics. Traditional biophysical methods are used to classify membrane protein types. Because of the large number of uncharacterized protein sequences in databases, traditional methods are very time consuming, expensive and susceptible to errors. Hence, it is highly desirable to develop a robust, reliable, and efficient method to predict membrane protein types. Imbalanced datasets and large datasets are often handled well by decision tree classifiers. Since imbalanced datasets are used, the performance of various decision tree classifiers such as Decision Tree (DT), Classification And Regression Tree (CART), C4.5, Random tree and REP (Reduced Error Pruning) tree, and of ensemble methods such as Adaboost, RUS (Random Under Sampling) boost, Rotation forest and Random forest, is analysed. Among the various decision tree classifiers, Random forest performs well in less time with a good accuracy of 96.35%. Another inference is that the RUS boost decision tree classifier is able to classify one or two samples in the class with very few samples, while the other classifiers such as DT, Adaboost, Rotation forest and Random forest are not sensitive to the classes with fewer samples. The performance of the decision tree classifiers is also compared with SVM (Support Vector Machine) and Naive Bayes classifiers. Copyright © 2017 Elsevier Ltd. All rights reserved.
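The random under-sampling step that gives RUS boost its name can be sketched on its own. The snippet below balances a labelled dataset by drawing, without replacement, as many samples from every class as the rarest class contains. It is an illustration only: in RUS boost the resampling is repeated inside each boosting round, which is not shown, and the function name and seed handling are assumptions.

```python
import random
from collections import defaultdict


def random_undersample(samples, labels, seed=0):
    """Balance classes by sampling each one down to the rarest class size.

    Raises ValueError if the input is empty. The fixed default seed only
    makes this sketch reproducible.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    if not by_class:
        raise ValueError("no samples given")

    smallest = min(len(members) for members in by_class.values())

    balanced_x, balanced_y = [], []
    for y, members in by_class.items():
        # Draw without replacement so no sample is duplicated.
        for x in rng.sample(members, smallest):
            balanced_x.append(x)
            balanced_y.append(y)
    return balanced_x, balanced_y
```

The minority class is kept whole, so its few samples carry as much weight as the majority class in whatever classifier is trained next.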
The Human Cytomegalovirus-Specific UL1 Gene Encodes a Late-Phase Glycoprotein Incorporated in the Virion Envelope

PubMed Central

Shikhagaie, Medya; Mercé-Maldonado, Eva; Isern, Elena; Muntasell, Aura; Albà, M. Mar; López-Botet, Miguel; Hengel, Hartmut

2012-01-01

We have investigated the previously uncharacterized human cytomegalovirus (HCMV) UL1 open reading frame (ORF), a member of the rapidly evolving HCMV RL11 family. UL1 is HCMV specific; the absence of UL1 in chimpanzee cytomegalovirus (CCMV) and sequence analysis studies suggest that UL1 may have originated by the duplication of an ancestor gene from the RL11-TRL cluster (TRL11, TRL12, and TRL13). Sequence similarity searches against human immunoglobulin (Ig)-containing proteins revealed that HCMV pUL1 shows significant similarity to the cellular carcinoembryonic antigen-related (CEA) protein family N-terminal Ig domain, which is responsible for CEA ligand recognition. Northern blot analysis revealed that UL1 is transcribed during the late phase of the viral replication cycle in both fibroblast-adapted and endotheliotropic strains of HCMV. We characterized the protein encoded by hemagglutinin (HA)-tagged UL1 in the AD169-derived HB5 background. UL1 is expressed as a 224-amino-acid type I transmembrane glycoprotein which becomes detectable at 48 h postinfection. In infected human fibroblasts, pUL1 colocalized at the cytoplasmic site of virion assembly and secondary envelopment together with TGN-46, a marker for the trans-Golgi network, and viral structural proteins, including the envelope glycoprotein gB and the tegument phosphoprotein pp28. Furthermore, analyses of highly purified AD169 UL1-HA epitope-tagged virions revealed that pUL1 is a novel constituent of the HCMV envelope. Importantly, the deletion of UL1 in HCMV TB40/E resulted in reduced growth in a cell type-specific manner, suggesting that pUL1 may be implicated in regulating HCMV cell tropism. PMID:22345456
Direct enzyme assay evidence confirms aldehyde reductase function of Ydr541cp and Ygl039wp from Saccharomyces cerevisiae.

PubMed

Moon, Jaewoong; Liu, Z Lewis

2015-04-01

The aldehyde reductase gene ARI1 is a recently characterized member of an intermediate subfamily within the short-chain dehydrogenase/reductase (SDR) superfamily that clarified mechanisms of in situ detoxification of 2-furaldehyde and 5-hydroxymethyl-2-furaldehyde by Saccharomyces cerevisiae. Uncharacterized open reading frames (ORFs) are common among tolerant candidate genes identified for lignocellulose-to-advanced biofuels conversion. This study presents partially purified proteins of two ORFs, YDR541C and YGL039W, and direct enzyme assay evidence against aldehyde-inhibitory compounds commonly encountered during lignocellulosic biomass fermentation processes. Each of the partially purified proteins encoded by these ORFs showed a molecular mass of approximately 38 kDa, similar to Ari1p, a protein encoded by aldehyde reductase gene. Both proteins demonstrated strong aldehyde reduction activities toward 14 aldehyde substrates, with high levels of reduction activity for Ydr541cp toward both aromatic and aliphatic aldehydes. While Ydr541cp was observed to have a significantly higher specific enzyme activity at 20 U/mg using co-factor NADPH, Ygl039wp displayed a NADH preference at 25 U/mg in reduction of butylaldehyde. Amino acid sequence analysis identified a characteristic catalytic triad, Ser, Tyr and Lys; a conserved catalytic motif of Tyr-X-X-X-Lys; and a cofactor-binding sequence motif, Gly-X-X-Gly-X-X-Ala, near the N-terminus that are shared by Ydr541cp, Ygl039wp, Yol151wp/GRE2 and Ari1p. Findings of aldehyde reductase genes contribute to the yeast gene annotation and aid development of the next-generation biocatalyst for advanced biofuels production. Copyright © 2015 John Wiley & Sons, Ltd.
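The two sequence motifs named in the abstract, Tyr-X-X-X-Lys and Gly-X-X-Gly-X-X-Ala, translate directly into regular expressions over one-letter amino acid codes. The sketch below scans a protein sequence for both. The function name and the dictionary keys are illustrative, and a match only flags a candidate site; it says nothing about whether the residues are catalytically active.

```python
import re

# X in the abstract's notation means "any residue", written as "." here.
MOTIFS = {
    "Tyr-X-X-X-Lys": "Y...K",
    "Gly-X-X-Gly-X-X-Ala": "G..G..A",
}


def find_motifs(sequence):
    """Return {motif name: [(1-based position, matched residues), ...]}."""
    seq = sequence.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        # The lookahead lets overlapping occurrences be reported too.
        hits[name] = [
            (m.start() + 1, m.group(1))
            for m in re.finditer(f"(?=({pattern}))", seq)
        ]
    return hits
```

Positions are reported 1-based to match the usual residue numbering in protein sequence records.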
Virus variants with differences in the P1 protein coexist in a Plum pox virus population and display particular host-dependent pathogenicity features.

PubMed

Maliogka, Varvara I; Salvador, Beatriz; Carbonell, Alberto; Sáenz, Pilar; León, David San; Oliveros, Juan Carlos; Delgadillo, Ma Otilia; García, Juan Antonio; Simón-Mateo, Carmen

2012-10-01

Subisolates segregated from an M-type Plum pox virus (PPV) isolate, PPV-PS, differ widely in pathogenicity despite their high degree of sequence similarity. A single amino acid substitution, K109E, in the helper component proteinase (HCPro) protein of PPV caused a significant enhancement of symptom severity in herbaceous hosts, and notably modified virus infectivity in peach seedlings. The presence of this substitution in certain subisolates that induced mild symptoms in herbaceous hosts and did not infect peach seedlings suggested the existence of uncharacterized attenuating factors in these subisolates. In this study, we show that two amino acid changes in the P1 protein are specifically associated with the mild pathogenicity exhibited by some PS subisolates. Site-directed mutagenesis studies demonstrated that both substitutions, W29R and V139E, but especially W29R, resulted in lower levels of virus accumulation and symptom severity in a woody host, Prunus persica. Furthermore, when W29R and V139E mutations were expressed concomitantly, PPV infectivity was completely abolished in this host. In contrast, the V139E substitution, but not W29R, was found to be responsible for symptom attenuation in herbaceous hosts. Deep sequencing analysis demonstrated that the W29R and V139E heterogeneities already existed in the original PPV-PS isolate before its segregation in different subisolates by local lesion cloning. These results highlight the potential complexity of potyviral populations and the relevance of the P1 protein of potyviruses in pathogenesis and viral adaptation to the host. © 2012 THE AUTHORS. MOLECULAR PLANT PATHOLOGY © 2012 BSPP AND BLACKWELL PUBLISHING LTD.
Demonstration of lysosomal localization for the mammalian ependymin-related protein using classical approaches combined with a novel density shift method.

PubMed

Della Valle, Maria Cecilia; Sleat, David E; Sohar, Istvan; Wen, Ting; Pintar, John E; Jadot, Michel; Lobel, Peter

2006-11-17

Most newly synthesized soluble lysosomal proteins are delivered to the lysosome via the mannose 6-phosphate (Man-6-P)-targeting pathway. The presence of the Man-6-P post-translational modification allows these proteins to be affinity-purified on immobilized Man-6-P receptors. This approach has formed the basis for a number of proteomic studies that identified multiple as yet uncharacterized Man-6-P glycoproteins that may represent new lysosomal proteins. Although the presence of Man-6-P is suggestive of lysosomal function, the subcellular localization of such candidates requires experimental verification. Here, we have investigated one such candidate, ependymin-related protein (EPDR). EPDR is a protein of unknown function with some sequence similarity to ependymin, a fish protein thought to play a role in memory consolidation and learning. Using classical subcellular fractionation on rat brain, EPDR co-distributes with lysosomal proteins, but there is significant overlap between lysosomal and mitochondrial markers. For more definitive localization, we have developed a novel approach based upon a selective buoyant density shift of the brain lysosomes in a mutant mouse lacking NPC2, a lysosomal protein involved in lipid transport. EPDR, in parallel with lysosomal markers, shows this density shift in gradient centrifugation experiments comparing mutant and wild type mice. This approach, combined with morphological analyses, demonstrates that EPDR resides in the lysosome. In addition, the lipidosis-induced density shift approach represents a valuable tool for identification and validation of both luminal and membrane lysosomal proteins that should be applicable to high throughput proteomic studies.
The Large Mitochondrial Genome of Symbiodinium minutum Reveals Conserved Noncoding Sequences between Dinoflagellates and Apicomplexans.

PubMed

Shoguchi, Eiichi; Shinzato, Chuya; Hisata, Kanako; Satoh, Nori; Mungpakdee, Sutada

2015-07-20

Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (∼326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (∼64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
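The AT-richness figure quoted above is a simple base-composition statistic. As a minimal sketch of how such a percentage is computed (the function name `at_content` and the choice to ignore ambiguous bases such as N are assumptions, not details taken from the study):

```python
def at_content(sequence):
    """Return the percentage of A and T among unambiguous bases (A, C, G, T)."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    at_bases = sum(1 for base in counted if base in "AT")
    return 100.0 * at_bases / len(counted)


# Toy input for illustration; the N characters are skipped.
print(at_content("AATGNN"))
```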
Rhodohalobacter barkolensis sp. nov., isolated from a saline lake and emended description of the genus Rhodohalobacter.

PubMed

Han, Shuai-Bo; Yu, Yang-Huan; Ju, Zhao; Li, Yu; Zhang, Ran; Hou, Xin-Jun; Ma, Xin-Yuan; Yu, Xiao-Yun; Sun, Cong; Wu, Min

2018-06-01

A Gram-stain-negative, non-motile, aerobic, rod-shaped bacterium, designated 15182T, was isolated from a saline lake in China. The novel strain 15182T was able to grow at 10-40 °C (optimum, 37 °C), pH 7.0-8.0 (optimum, 7.5) and with 0.5-4 % NaCl (optimum, 2-3 %, w/v). The phylogenetic analysis based on 16S rRNA gene sequences revealed that strain 15182T was most closely related to the genus Rhodohalobacter by sharing the highest sequence similarity of 97.0 % with Rhodohalobacter halophilus JZ3C29T. Chemotaxonomic analysis showed that the sole respiratory quinone was menaquinone 7, the major fatty acids included C16 : 0 N alcohol and C16 : 1ω11c. The major polar lipids included diphosphatidylglycerol, phosphatidylglycerol, phosphatidylethanolamine, four uncharacterized glycolipids, one uncharacterized phospholipid and two uncharacterized lipids. The genomic DNA G+C content of the strain 15182T was 42.4 mol%. The average nucleotide identity value between 15182T and R. halophilus JZ3C29T was 75.4 %, and the in silico DNA-DNA hybridization value of the two strains was 19.1 %. On the basis of its phenotypic, chemotaxonomic, genotypic and genomic characteristics presented in this study, strain 15182T is suggested to represent a novel species in the genus Rhodohalobacter, for which the name Rhodohalobacter barkolensis sp. nov. is proposed. The type strain is 15182T (=KCTC 62172T =MCCC 1K03442T). An emended description of the genus Rhodohalobacter is also presented.

Structural basis for plant plasma membrane protein dynamics and organization into functional nanodomains.

PubMed

Gronnier, Julien; Crowet, Jean-Marc; Habenstein, Birgit; Nasir, Mehmet Nail; Bayle, Vincent; Hosy, Eric; Platre, Matthieu Pierre; Gouguet, Paul; Raffaele, Sylvain; Martinez, Denis; Grelard, Axelle; Loquet, Antoine; Simon-Plas, Françoise; Gerbeau-Pissot, Patricia; Der, Christophe; Bayer, Emmanuelle M; Jaillais, Yvon; Deleu, Magali; Germain, Véronique; Lins, Laurence; Mongrand, Sébastien

2017-07-31

The plasma membrane (PM) is the primary structure for adjusting to ever-changing conditions. PM sub-compartmentalization in domains is thought to orchestrate signaling. Yet, mechanisms governing membrane organization are mostly uncharacterized. The plant-specific REMORINs are proteins regulating hormonal crosstalk and host invasion. REMs are the best-characterized nanodomain markers via an uncharacterized moiety called REMORIN C-terminal Anchor. By coupling biophysical methods, super-resolution microscopy and physiology, we decipher an original mechanism regulating the dynamics and organization of nanodomains. We showed that targeting of REMORIN is independent of the COP-II-dependent secretory pathway and mediated by PI4P and sterol. REM-CA is an unconventional lipid-binding motif that confers nanodomain organization. Analyses of REM-CA mutants by single particle tracking demonstrate that mobility and supramolecular organization are critical for immunity. This study provides a unique mechanistic insight into how the tight control of spatial segregation is critical in the definition of PM domain necessary to support biological function.
Starch Flocculation by the Sweet Potato Sour Liquid Is Mediated by the Adhesion of Lactic Acid Bacteria to Starch

PubMed Central

Zhang, Lili; Yu, Yang; Li, Xinhua; Li, Xiaona; Zhang, Huajiang; Zhang, Zhen; Xu, Yunhe

2017-01-01

In the current study, we focused on the mechanism underlying starch flocculation by the sweet potato sour liquid. The traditional microbial techniques and 16S rDNA sequencing revealed that Lactobacillus was the dominant flocculating microorganism in sour liquid. In total, 86 bacteria, 20 yeasts, and 10 molds were isolated from the sour liquid and only eight Lactobacillus species exhibited flocculating activity. Lactobacillus paracasei subsp. paracasei L1 strain with a high flocculating activity was isolated and identified, and the mechanism of starch flocculation was examined. L. paracasei subsp. paracasei L1 cells formed chain-like structures on starch granules. Consequently, these cells connected the starch granules to one another, leading to formation of large flocs. The results of various treatments of L1 cells indicated that bacterial surface proteins play a role in flocculation and L1 cells adhered to the surface of starch granules via specific surface proteins. These surface starch-binding proteins were extracted using the guanidine hydrochloride method; 10 proteins were identified by mass spectrometry: three of these proteins were glycolytic enzymes; two were identified as the translation elongation factor Tu; one was a cell wall hydrolase; one was a surface antigen; one was lysozyme M1; one was a glycoside hydrolase; and one was an uncharacterized protein. This study paves the way for future industrial application of the L1 isolate in starch processing and food manufacturing. PMID:28791000
A novel bioinformatics pipeline to discover genes related to arbuscular mycorrhizal symbiosis based on their evolutionary conservation pattern among higher plants.

PubMed

Favre, Patrick; Bapaume, Laure; Bossolini, Eligio; Delorenzi, Mauro; Falquet, Laurent; Reinhardt, Didier

2014-12-03

Genes involved in arbuscular mycorrhizal (AM) symbiosis have been identified primarily by mutant screens, followed by identification of the mutated genes (forward genetics). In addition, a number of AM-related genes have been identified by their AM-related expression patterns, and their function has subsequently been elucidated by knock-down or knock-out approaches (reverse genetics). However, genes that are members of functionally redundant gene families, or genes that have a vital function and therefore result in lethal mutant phenotypes, are difficult to identify. If such genes are constitutively expressed and therefore escape differential expression analyses, they remain elusive. The goal of this study was to systematically search for AM-related genes with a bioinformatics strategy that is insensitive to these problems. The central element of our approach is based on the fact that many AM-related genes are conserved only among AM-competent species. Our approach involves genome-wide comparisons at the proteome level of AM-competent host species with non-mycorrhizal species. Using a clustering method we first established orthologous/paralogous relationships and subsequently identified protein clusters that contain members only of the AM-competent species. Proteins of these clusters were then analyzed in an extended set of 16 plant species and ranked based on their relatedness among AM-competent monocot and dicot species, relative to non-mycorrhizal species. In addition, we combined the information on the protein-coding sequence with gene expression data and with promoter analysis. As a result we present a list of yet uncharacterized proteins that show a strongly AM-related pattern of sequence conservation, indicating that the respective genes may have been under selection for a function in AM. Among the top candidates are three genes that encode a small family of similar receptor-like kinases that are related to the S-locus receptor kinases involved in sporophytic self-incompatibility. We present a new systematic strategy of gene discovery based on conservation of the protein-coding sequence that complements classical forward and reverse genetics. This strategy can be applied to diverse other biological phenomena if species with established genome sequences fall into distinguished groups that differ in a defined functional trait of interest.
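The cluster-selection step described in this abstract (keep protein clusters whose members come only from AM-competent species) can be sketched as a set comparison. This is an illustration under stated assumptions rather than the published pipeline: the function name `am_specific_clusters`, the `min_species` threshold, the input layout and the placeholder species and protein labels are all hypothetical.

```python
def am_specific_clusters(clusters, am_species, min_species=2):
    """Select clusters whose members all belong to AM-competent species.

    clusters: dict mapping cluster id -> list of (species, protein_id) pairs.
    am_species: set of species labels treated as AM-competent.
    min_species: assumed threshold; clusters seen in fewer species are dropped.
    """
    selected = {}
    for cluster_id, members in clusters.items():
        species = {sp for sp, _protein in members}
        # "species <= am_species" is the subset test: no non-mycorrhizal member.
        if species <= am_species and len(species) >= min_species:
            selected[cluster_id] = sorted(species)
    return selected


# Placeholder data for illustration only.
clusters = {
    "c1": [("host_A", "p1"), ("host_B", "p2")],
    "c2": [("host_A", "p3"), ("nonhost_C", "p4")],
    "c3": [("host_B", "p5")],
}
result = am_specific_clusters(clusters, am_species={"host_A", "host_B"})
print(result)
```

In this sketch only `c1` survives: `c2` contains a member from a non-mycorrhizal species, and `c3` is present in a single species. The ranking, expression and promoter analyses mentioned in the abstract are separate steps that are not modelled here.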
On the detection of functionally coherent groups of protein domains with an extension to protein annotation

PubMed Central

McLaughlin, William A; Chen, Ken; Hou, Tingjun; Wang, Wei

2007-01-01

Background Protein domains coordinate to perform multifaceted cellular functions, and domain combinations serve as the functional building blocks of the cell. The available methods to identify functional domain combinations are limited in their scope, e.g. to the identification of combinations falling within individual proteins or within specific regions in a translated genome. Further effort is needed to identify groups of domains that span across two or more proteins and are linked by a cooperative function. Such functional domain combinations can be useful for protein annotation. Results Using a new computational method, we have identified 114 groups of domains, referred to as domain assembly units (DASSEM units), in the proteome of budding yeast Saccharomyces cerevisiae. The units participate in many important cellular processes such as transcription regulation, translation initiation, and mRNA splicing. Within the units the domains were found to function in a cooperative manner; and each domain contributed to a different aspect of the unit's overall function. The member domains of DASSEM units were found to be significantly enriched among proteins contained in transcription modules, defined as genes sharing similar expression profiles and presumably similar functions. The observation further confirmed the functional coherence of DASSEM units. The functional linkages of units were found in both functionally characterized and uncharacterized proteins, which enabled the assessment of protein function based on domain composition. Conclusion A new computational method was developed to identify groups of domains that are linked by a common function in the proteome of Saccharomyces cerevisiae. These groups can either lie within individual proteins or span across different proteins. We propose that the functional linkages among the domains within the DASSEM units can be used as a non-homology based tool to annotate uncharacterized proteins. PMID:17937820
Molecular cloning of ADIR, a novel interferon responsive gene encoding a protein related to the torsins.

PubMed

Dron, Michel; Meritet, Jean François; Dandoy-Dron, Françoise; Meyniel, Jean-Philippe; Maury, Chantal; Tovey, Michael G

2002-03-01

The expression of the previously uncharacterized gene Adir (for ATP dependent interferon responsive gene) was increased by 5- to 15-fold in tissue of the oral cavity or in spleen and liver of mice treated orally or intraperitoneally with IFN-alpha, and in mouse cells treated in vitro with IFN-alpha or IFN-gamma. The level of Adir mRNA was also increased 20- to 40-fold in the brains of animals infected with encephalomyocarditis virus. Adir is expressed ubiquitously in mouse tissues as 1.9-, 2.4-, and 3.5-kb mRNA transcripts encoding a 385-amino-acid protein with a conserved ATP binding domain containing typical nucleotide and Mg(2+) binding sites. We also characterized the human ortholog, ADIR, which is located on chromosome 1q25-q31 and contains six exons encoding a 397-amino-acid protein with 80% homology to the mouse protein. A single 2.3-kb mRNA was detected in all human tissues examined, except for placenta, which also contained a 1.25-kb tissue-specific transcript generated by alternative splicing and encoding a putative 336-amino-acid protein. Although ADIR exhibits low homology to DYT1 and TOR1B, the deduced ADIR protein sequences are highly homologous to torsin A and torsin B and more distantly related to members of the Clp/HSP100 family of proteins, suggesting that ADIR, like torsins, is related to the AAA chaperone-like family of ATPases. An ADIR-EGFP fusion protein expressed in HeLa cells was shown to be associated with the endoplasmic reticulum.
Isolation and characterization of a new zinc-binding protein from albacore tuna plasma

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dyke, B.; Hegenauer, J.; Saltman, P.

1987-06-02

The protein responsible for sequestering high levels of zinc in the plasma of the albacore tuna (Thunnus alalunga) has been isolated by sequential chromatography. The glycoprotein has a molecular weight of 66,000. Approximately 8.2% of its amino acid residues are histidines. Equilibrium dialysis experiments show it to bind 3 mol of zinc/mol of protein. The stoichiometric constant for the association of zinc with a binding site containing three histidines was determined to be 10^9.4. This protein is different from albumin and represents a previously uncharacterized zinc transport protein.
A novel plant enzyme with dual activity: an atypical Nudix hydrolase and a dipeptidyl peptidase III

PubMed Central

Karačić, Zrinka; Vukelić, Bojana; Ho, Gabrielle H.; Jozić, Iva; Sučec, Iva; Salopek-Sondi, Branka; Kozlović, Marija; Brenner, Steven E.; Ludwig-Müller, Jutta; Abramić, Marija

2017-01-01

In a search for plant homologues of dipeptidyl peptidase III (DPP III) family, we found a predicted protein from the moss Physcomitrella patens (UniProt entry: A9TLP4), which shared 61% sequence identity with the Arabidopsis thaliana uncharacterized protein, designated Nudix hydrolase 3. Both proteins contained all conserved regions of the DPP III family, but instead of the characteristic hexapeptide HEXXGH zinc-binding motif, they possessed a pentapeptide HEXXH, and at the N-terminus, a Nudix box, a hallmark of Nudix hydrolases, known to act upon a variety of nucleoside diphosphate derivatives. To investigate their biochemical properties, we expressed heterologously and purified Physcomitrella (PpND) and Arabidopsis (AtND) protein. Both hydrolyzed, with comparable catalytic efficiency, the isopentenyl diphosphate (IPP), a universal precursor for the biosynthesis of isoprenoid compounds. In addition, PpND dephosphorylated four purine nucleotides (ADP, dGDP, dGTP, and 8-oxo-dATP) with strong preference for oxidized dATP. Furthermore, PpND and AtND showed DPP III activity against dipeptidyl-2-arylamide substrates, which they cleaved with different specificity. This is the first report of a dual activity enzyme, highly conserved in land plants, which catalyses the hydrolysis of a peptide bond and of a phosphate bond, acting both as a dipeptidyl peptidase III and an atypical Nudix hydrolase. PMID:27467751
Cysteine-rich domains related to Frizzled receptors and Hedgehog-interacting proteins

PubMed Central

Pei, Jimin; Grishin, Nick V

2012-01-01

Frizzled and Smoothened are homologous seven-transmembrane proteins functioning in the Wnt and Hedgehog signaling pathways, respectively. They harbor an extracellular cysteine-rich domain (FZ-CRD), a mobile evolutionary unit that has been found in a number of other metazoan proteins and Frizzled-like proteins in Dictyostelium. Domains distantly related to FZ-CRDs, in Hedgehog-interacting proteins (HHIPs), folate receptors and riboflavin-binding proteins (FRBPs), and Niemann-Pick Type C1 proteins (NPC1s), referred to as HFN-CRDs, exhibit similar structures and disulfide connectivity patterns compared with FZ-CRDs. We used computational analyses to expand the homologous set of FZ-CRDs and HFN-CRDs, providing a better understanding of their evolution and classification. First, FZ-CRD-containing proteins with various domain compositions were identified in several major eukaryotic lineages including plants and Chromalveolata, revealing a wider phylogenetic distribution of FZ-CRDs than previously recognized. Second, two new and distinct groups of highly divergent FZ-CRDs were found by sensitive similarity searches. One of them is present in the calcium channel component Mid1 in fungi and the uncharacterized FAM155 proteins in metazoans. Members of the other new FZ-CRD group occur in the metazoan-specific RECK (reversion-inducing-cysteine-rich protein with Kazal motifs) proteins that are putative tumor suppressors acting as inhibitors of matrix metalloproteases. Finally, sequence and three-dimensional structural comparisons helped us uncover a divergent HFN-CRD in glypicans, which are important morphogen-binding heparan sulfate proteoglycans. Such a finding reinforces the evolutionary ties between the Wnt and Hedgehog signaling pathways and underscores the importance of gene duplications in creating essential signaling components in metazoan evolution. PMID:22693159
A snapshot of the microbiome of Amblyomma tuberculatum ticks infesting the gopher tortoise, an endangered species.

PubMed

Budachetri, Khemraj; Gaillard, Daniel; Williams, Jaclyn; Mukherjee, Nabanita; Karim, Shahid

2016-10-01

The gopher tortoise tick, Amblyomma tuberculatum, has a unique relationship with the gopher tortoise, Gopherus polyphemus, found in sandy habitats across the southeastern United States. We aimed to understand the overall bacterial community associated with A. tuberculatum while also focusing on spotted fever group Rickettsia. These tortoises in the Southern Mississippi region are a federally threatened species; therefore, we have carefully trapped the tortoises and removed the species-specific ticks attached to them. Genomic DNA was extracted from individual ticks and used to explore overall bacterial load using pyrosequencing of bacterial 16S rRNA on 454-sequencing platform. The spotted fever group of Rickettsia was explored by amplifying rickettsial outer membrane protein A (rompA) gene by nested PCR. Sequencing results revealed 330 bacterial operational taxonomic units (OTUs) after all the necessary curation of sequences. Four whole A. tuberculatum ticks showed Proteobacteria, Firmicutes, Actinobacteria and Bacteroidetes as the most dominant phyla with a total of 74 different bacterial genera detected. Together Rickettsiae and Francisella showed >85% abundance, thus dominating the bacterial community structure. Partial sequences obtained from ompA amplicons revealed the presence of an uncharacterized Rickettsia similar to the Rickettsial endosymbiont of A. tuberculatum. This is the first preliminary profile of a complete bacterial community from gopher tortoise ticks and warrants further investigation regarding the functional role of Rickettsial and Francisella-like endosymbionts in tick physiology. Copyright © 2016 Elsevier GmbH. All rights reserved.
Retrieval Does Not Induce Reconsolidation of Inhibitory Avoidance Memory

ERIC Educational Resources Information Center

Medina, Jorge H.; Izquierdo, Ivan; Cammarota, Martin; Bevilaqua, Lia R. M.

2004-01-01

It has been suggested that retrieval during a nonreinforced test induces reconsolidation instead of extinction of the mnemonic trace. Reconsolidation would preserve the original memory from the labilization induced by its nonreinforced recall through a hitherto uncharacterized mechanism requiring protein synthesis. Given the importance that such a…
ICAM-1-related long non-coding RNA: promoter analysis and expression in human retinal endothelial cells.

PubMed

Lumsden, Amanda L; Ma, Yuefang; Ashander, Liam M; Stempel, Andrew J; Keating, Damien J; Smith, Justine R; Appukuttan, Binoy

2018-05-09

Regulation of intercellular adhesion molecule (ICAM)-1 in retinal endothelial cells is a promising druggable target for retinal vascular diseases. The ICAM-1-related (ICR) long non-coding RNA stabilizes ICAM-1 transcript, increasing protein expression. However, studies of ICR involvement in disease have been limited as the promoter is uncharacterized. To address this issue, we undertook a comprehensive in silico analysis of the human ICR gene promoter region. We used genomic evolutionary rate profiling to identify a 115 base pair (bp) sequence within 500 bp upstream of the transcription start site of the annotated human ICR gene that was conserved across 25 eutherian genomes. A second constrained sequence upstream of the orthologous mouse gene (68 bp; conserved across 27 Eutherian genomes including human) was also discovered. Searching these elements identified 33 matrices predictive of binding sites for transcription factors known to be responsive to a broad range of pathological stimuli, including hypoxia, and metabolic and inflammatory proteins. Five phenotype-associated single nucleotide polymorphisms (SNPs) in the immediate vicinity of these elements included four SNPs (i.e. rs2569693, rs281439, rs281440 and rs11575074) predicted to impact binding motifs of transcription factors, and thus the expression of ICR and ICAM-1 genes, with potential to influence disease susceptibility. We verified that human retinal endothelial cells expressed ICR, and observed induction of expression by tumor necrosis factor-α.
Computational modeling of RNA 3D structures, with the aid of experimental restraints

PubMed Central

Magnus, Marcin; Matelska, Dorota; Łach, Grzegorz; Chojnowski, Grzegorz; Boniecki, Michal J; Purta, Elzbieta; Dawson, Wayne; Dunin-Horkawicz, Stanislaw; Bujnicki, Janusz M

2014-01-01

In addition to mRNAs whose primary function is transmission of genetic information from DNA to proteins, numerous other classes of RNA molecules exist, which are involved in a variety of functions, such as catalyzing biochemical reactions or performing regulatory roles. In analogy to proteins, the function of RNAs depends on their structure and dynamics, which are largely determined by the ribonucleotide sequence. Experimental determination of high-resolution RNA structures is both laborious and difficult, and therefore, the majority of known RNAs remain structurally uncharacterized. To address this problem, computational structure prediction methods were developed that simulate either the physical process of RNA structure formation (“Greek science” approach) or utilize information derived from known structures of other RNA molecules (“Babylonian science” approach). All computational methods suffer from various limitations that make them generally unreliable for structure prediction of long RNA sequences. However, in many cases, the limitations of computational and experimental methods can be overcome by combining these two complementary approaches with each other. In this work, we review computational approaches for RNA structure prediction, with emphasis on implementations (particular programs) that can utilize restraints derived from experimental analyses. We also list experimental approaches, whose results can be relatively easily used by computational methods. Finally, we describe case studies where computational and experimental analyses were successfully combined to determine RNA structures that would remain out of reach for each of these approaches applied separately. PMID:24785264
mCSF1, a nucleus-encoded CRM protein required for the processing of many mitochondrial introns, is involved in the biogenesis of respiratory complexes I and IV in Arabidopsis.

PubMed

Zmudjak, Michal; Colas des Francs-Small, Catherine; Keren, Ido; Shaya, Felix; Belausov, Eduard; Small, Ian; Ostersetzer-Biran, Oren

2013-07-01

The coding regions of many mitochondrial genes in plants are interrupted by intervening sequences that are classified as group II introns. Their splicing is essential for the expression of the genes they interrupt and hence for respiratory function, and is facilitated by various protein cofactors. Despite the importance of these cofactors, only a few of them have been characterized. CRS1-YhbY domain (CRM) is a recently recognized RNA-binding domain that is present in several characterized splicing factors in plant chloroplasts. The Arabidopsis genome encodes 16 CRM proteins, but these are largely uncharacterized. Here, we analyzed the intracellular location of one of these hypothetical proteins in Arabidopsis, mitochondrial CAF-like splicing factor 1 (mCSF1; At4g31010), and analyzed the growth phenotypes and organellar activities associated with mcsf1 mutants in plants. Our data indicated that mCSF1 resides within mitochondria and its functions are essential during embryogenesis. Mutant plants with reduced mCSF1 displayed inhibited germination and retarded growth phenotypes that were tightly associated with reduced complex I and IV activities. Analogously to the functions of plastid-localized CRM proteins, analysis of the RNA profiles in wildtype and mcsf1 plants showed that mCSF1 acts in the splicing of many of the group II intron RNAs in Arabidopsis mitochondria. © 2013 The Authors. New Phytologist © 2013 New Phytologist Trust.
GSHSite: Exploiting an Iteratively Statistical Method to Identify S-Glutathionylation Sites with Substrate Specificity

PubMed Central

Chen, Yi-Ju; Lu, Cheng-Tsung; Huang, Kai-Yao; Wu, Hsin-Yi; Chen, Yu-Ju; Lee, Tzong-Yi

2015-01-01

S-glutathionylation, the covalent attachment of a glutathione (GSH) to the sulfur atom of cysteine, is a selective and reversible protein post-translational modification (PTM) that regulates protein activity, localization, and stability. Despite its implication in the regulation of protein functions and cell signaling, the substrate specificity of cysteine S-glutathionylation remains unknown. Based on a total of 1783 experimentally identified S-glutathionylation sites from mouse macrophages, this work presents an informatics investigation on S-glutathionylation sites including structural factors such as the flanking amino acid composition and the accessible surface area (ASA). TwoSampleLogo analysis indicates that positively charged amino acids flanking the S-glutathionylated cysteine may influence the formation of S-glutathionylation in a closed three-dimensional environment. A statistical method is further applied to iteratively detect the conserved substrate motifs with statistical significance. A support vector machine (SVM) is then applied to generate a predictive model considering the substrate motifs. According to five-fold cross-validation, the SVMs trained with substrate motifs could achieve an enhanced sensitivity, specificity, and accuracy, and provide a promising performance in an independent test set. The effectiveness of the proposed method is demonstrated by the correct identification of previously reported S-glutathionylation sites of mouse thioredoxin (TXN) and human protein tyrosine phosphatase 1b (PTP1B). Finally, the constructed models are adopted to implement an effective web-based tool, named GSHSite (http://csb.cse.yzu.edu.tw/GSHSite/), for identifying uncharacterized GSH substrate sites on the protein sequences. PMID:25849935
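Two generic building blocks behind a site predictor of this kind are the extraction of the residues flanking each cysteine and the computation of sensitivity, specificity and accuracy from a confusion matrix. The sketch below illustrates both with the standard library only. It is not the GSHSite implementation: the function names, the flank length of 5 and the "-" padding character are assumptions made for the example.

```python
def cysteine_windows(sequence, flank=5, pad="-"):
    """Return (1-based position, window) for every cysteine in the sequence.

    Each window holds the cysteine plus `flank` residues on either side;
    positions beyond the sequence ends are filled with `pad`.
    """
    seq = sequence.upper()
    padded = pad * flank + seq + pad * flank
    windows = []
    for i, residue in enumerate(seq):
        if residue == "C":
            # In the padded string the cysteine sits at index i + flank,
            # so the window starts at index i.
            windows.append((i + 1, padded[i:i + 2 * flank + 1]))
    return windows


def binary_metrics(tp, fp, tn, fn):
    """Compute the three performance measures named in the abstract."""
    return {
        "sensitivity": tp / (tp + fn),   # fraction of true sites recovered
        "specificity": tn / (tn + fp),   # fraction of non-sites rejected
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Toy inputs for illustration only.
print(cysteine_windows("AKCRT", flank=2))
print(binary_metrics(tp=8, fp=1, tn=9, fn=2))
```

In a five-fold cross-validation these metrics would be computed on each held-out fold and then averaged; the windows would first be encoded numerically before being passed to a classifier such as an SVM.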
The R148.3 Gene Modulates Caenorhabditis elegans Lifespan and Fat Metabolism

PubMed Central

Roy-Bellavance, Catherine; Grants, Jennifer M.; Miard, Stéphanie; Lee, Kayoung; Rondeau, Évelyne; Guillemette, Chantal; Simard, Martin J.; Taubert, Stefan; Picard, Frédéric

2017-01-01

Despite many advances, the molecular links between energy metabolism and longevity are not well understood. Here, we have used the nematode model Caenorhabditis elegans to study the role of the yet-uncharacterized gene R148.3 in fat accumulation and lifespan. In wild-type worms, a R148.3p::GFP reporter showed enhanced expression throughout life in the pharynx, in neurons, and in muscles. Functionally, a protein fusing a predicted 22 amino acid N-terminal signal sequence (SS) of R148.3 to mCherry displayed robust accumulation in coelomyocytes, indicating that R148.3 is a secreted protein. Systematic depletion of R148.3 by RNA interference (RNAi) at L1 but not at young-adult stage enhanced triglyceride accumulation, which was associated with increased food uptake and lower expression of genes involved in lipid oxidation. However, RNAi of R148.3 at both L1 and young-adult stages robustly diminished mean and maximal lifespan of wild-type worms, and also abolished the long-lived phenotypes of eat-2 and daf-2/InsR mutants. Based on these data, we propose that R148.3 is an SS that modulates fat mass and longevity in an independent manner. PMID:28620088
Spliced Leader RNAs, Mitochondrial Gene Frameshifts and Multi-Protein Phylogeny Expand Support for the Genus Perkinsus as a Unique Group of Alveolates

PubMed Central

Zhang, Huan; Campbell, David A.; Sturm, Nancy R.; Dungan, Christopher F.; Lin, Senjie

2011-01-01

The genus Perkinsus occupies a precarious phylogenetic position. To gain a better understanding of the relationship between perkinsids, dinoflagellates and other alveolates, we analyzed the nuclear-encoded spliced-leader (SL) RNA and mitochondrial genes, intron prevalence, and multi-protein phylogenies. In contrast to the canonical 22-nt SL found in dinoflagellates (DinoSL), P. marinus has a shorter (21-nt) and a longer (22-nt) SL with slightly different sequences than DinoSL. The major SL RNA transcripts range in size between 80–83 nt in P. marinus, and ∼83 nt in P. chesapeaki, significantly larger than the typical ≤56-nt dinoflagellate SL RNA. In most of the phylogenetic trees based on 41 predicted protein sequences, P. marinus branched at the base of the dinoflagellate clade that included the ancient taxa Oxyrrhis and Amoebophrya, sister to the clade of apicomplexans, and in some cases clustered with apicomplexans as a sister to the dinoflagellate clade. Of 104 Perkinsus spp. genes examined 69.2% had introns, a higher intron prevalence than in dinoflagellates. Examination of Perkinsus spp. mitochondrial cytochrome B and cytochrome C oxidase subunit I genes and their cDNAs revealed no mRNA editing, but these transcripts can only be translated when frameshifts are introduced at every AGG and CCC codon as if AGGY codes for glycine and CCCCU for proline. These results, along with the presence of the numerous uncharacterized ‘marine alveolate group I' and Perkinsus-like lineages separating perkinsids from core dinoflagellates, expand support for the affiliation of the genus Perkinsus with an independent lineage (Perkinsozoa) positioned between the phyla of Apicomplexa and Dinoflagellata. PMID:21629701
A novel estrogen-regulated avian apolipoprotein

PubMed Central

Nikolay, Birgit; Plieschnig, Julia A.; Šubik, Desiree; Schneider, Jeannine D.; Schneider, Wolfgang J.; Hermann, Marcela

2013-01-01

In search for yet uncharacterized proteins involved in lipid metabolism of the chicken, we have isolated a hitherto unknown protein from the serum lipoprotein fraction with a buoyant density of ≤1.063 g/ml. Data obtained by protein microsequencing and molecular cloning of cDNA defined a 537 bp cDNA encoding a precursor molecule of 178 residues. As determined by SDS-PAGE, the major circulating form of the protein, which we designate apolipoprotein-VLDL-IV (Apo-IV), has an apparent Mr of approximately 17 kDa. Northern Blot analysis of different tissues of laying hens revealed Apo-IV expression mainly in the liver and small intestine, compatible with an involvement of the protein in lipoprotein metabolism. To further investigate the biology of Apo-IV, we raised an antibody against a GST-Apo-IV fusion protein, which allowed the detection of the 17-kDa protein in rooster plasma, whereas in laying hens it was detectable only in the isolated ≤1.063 g/ml density lipoprotein fraction. Interestingly, estrogen treatment of roosters caused a reduction of Apo-IV in the liver and in the circulation to levels similar to those in mature hens. Furthermore, the antibody crossreacted with a 17-kDa protein in quail plasma, indicating conservation of Apo-IV in avian species. In search for mammalian counterparts of Apo-IV, alignment of the sequence of the novel chicken protein with those of different mammalian apolipoproteins revealed stretches with limited similarity to regions of ApoC-IV and possibly with ApoE from various mammalian species. These data suggest that Apo-IV is a newly identified avian apolipoprotein. PMID:24047540
A Complete Structural Inventory of the Mycobacterial Microcompartment Shell Proteins Constrains Models of Global Architecture and Transport

PubMed Central

Mallette, Evan

2017-01-01

Bacterial microcompartments are bacterial analogs of eukaryotic organelles in that they spatially segregate aspects of cellular metabolism, but they do so by building not a lipid membrane but a thin polyhedral protein shell. Although multiple shell protein structures are known for several microcompartment types, additional uncharacterized components complicate systematic investigations of shell architecture. We report here the structures of all four proteins proposed to form the shell of an uncharacterized microcompartment designated the Rhodococcus and Mycobacterium microcompartment (RMM), which, along with crystal interactions and docking studies, suggests possible models for the particle's vertex and edge organization. MSM0272 is a typical hexameric β-sandwich shell protein thought to form the bulk of the facet. MSM0273 is a pentameric β-barrel shell protein that likely plugs the vertex of the particle. MSM0271 is an unusual double-ringed bacterial microcompartment shell protein whose rings are organized in an offset position relative to all known related proteins. MSM0275 is related to MSM0271 but self-organizes as linear strips that may line the facet edge; here, the presence of a novel extendable loop may help ameliorate poor packing geometry of the rigid main particle at the angled edges. In contrast to previously characterized homologs, both of these proteins show closed pores at both ends. This suggests a model where key interactions at the vertex and edges are mediated at the inner layer of the shell by MSM0271 (encircling MSM0273) and MSM0275, and the facet is built from MSM0272 hexamers tiling in the outer layer of the shell. PMID:27927988
A Complete Structural Inventory of the Mycobacterial Microcompartment Shell Proteins Constrains Models of Global Architecture and Transport.

PubMed

Mallette, Evan; Kimber, Matthew S

2017-01-27

Bacterial microcompartments are bacterial analogs of eukaryotic organelles in that they spatially segregate aspects of cellular metabolism, but they do so by building not a lipid membrane but a thin polyhedral protein shell. Although multiple shell protein structures are known for several microcompartment types, additional uncharacterized components complicate systematic investigations of shell architecture. We report here the structures of all four proteins proposed to form the shell of an uncharacterized microcompartment designated the Rhodococcus and Mycobacterium microcompartment (RMM), which, along with crystal interactions and docking studies, suggests possible models for the particle's vertex and edge organization. MSM0272 is a typical hexameric β-sandwich shell protein thought to form the bulk of the facet. MSM0273 is a pentameric β-barrel shell protein that likely plugs the vertex of the particle. MSM0271 is an unusual double-ringed bacterial microcompartment shell protein whose rings are organized in an offset position relative to all known related proteins. MSM0275 is related to MSM0271 but self-organizes as linear strips that may line the facet edge; here, the presence of a novel extendable loop may help ameliorate poor packing geometry of the rigid main particle at the angled edges. In contrast to previously characterized homologs, both of these proteins show closed pores at both ends. This suggests a model where key interactions at the vertex and edges are mediated at the inner layer of the shell by MSM0271 (encircling MSM0273) and MSM0275, and the facet is built from MSM0272 hexamers tiling in the outer layer of the shell. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
The uncharacterized gene 1700093K21Rik and flanking regions are correlated with reproductive isolation in the house mouse, Mus musculus.

PubMed

Kass, David H; Janoušek, Václav; Wang, Liuyang; Tucker, Priscilla K

2014-06-01

Reproductive barriers exist between the house mouse subspecies, Mus musculus musculus and M. m. domesticus, members of the Mus musculus species complex, primarily as a result of hybrid male infertility, and a hybrid zone exists where their ranges intersect in Europe. Using single nucleotide polymorphisms (SNPs) diagnostic for the two taxa, the extent of introgression across the genome was previously compared in these hybrid populations. Sixty-nine of 1316 autosomal SNPs exhibited reduced introgression in two hybrid zone transects suggesting maladaptive interactions among certain loci. One of these markers is within a region on chromosome 11 that, in other studies, has been associated with hybrid male sterility of these subspecies. We assessed sequence variation in a 20 Mb region on chromosome 11 flanking this marker, and observed its inclusion within a roughly 150 kb stretch of DNA showing elevated sequence differentiation between the two subspecies. Four genes are associated with this genomic subregion, with two entirely encompassed. One of the two genes, the uncharacterized 1700093K21Rik gene, displays distinguishing features consistent with a potential role in reproductive isolation between these subspecies. Along with its expression specifically within spermatogenic cells, we present various sequence analyses that demonstrate a high rate of molecular evolution of this gene, as well as identify a subspecies amino acid variant resulting in a structural difference. Taken together, the data suggest a role for this gene in reproductive isolation.

Transcriptome Engineering with RNA-Targeting Type VI-D CRISPR Effectors.

PubMed

Konermann, Silvana; Lotfy, Peter; Brideau, Nicholas J; Oki, Jennifer; Shokhirev, Maxim N; Hsu, Patrick D

2018-04-19

Class 2 CRISPR-Cas systems endow microbes with diverse mechanisms for adaptive immunity. Here, we analyzed prokaryotic genome and metagenome sequences to identify an uncharacterized family of RNA-guided, RNA-targeting CRISPR systems that we classify as type VI-D. Biochemical characterization and protein engineering of seven distinct orthologs generated a ribonuclease effector derived from Ruminococcus flavefaciens XPD3002 (CasRx) with robust activity in human cells. CasRx-mediated knockdown exhibits high efficiency and specificity relative to RNA interference across diverse endogenous transcripts. As one of the most compact single-effector Cas enzymes, CasRx can also be flexibly packaged into adeno-associated virus. We target virally encoded, catalytically inactive CasRx to cis elements of pre-mRNA to manipulate alternative splicing, alleviating dysregulated tau isoform ratios in a neuronal model of frontotemporal dementia. Our results present CasRx as a programmable RNA-binding module for efficient targeting of cellular RNA, enabling a general platform for transcriptome engineering and future therapeutic development. Copyright © 2018 Elsevier Inc. All rights reserved.
Global analysis of the Burkholderia thailandensis quorum sensing-controlled regulon.

PubMed

Majerczyk, Charlotte; Brittnacher, Mitchell; Jacobs, Michael; Armour, Christopher D; Radey, Mathew; Schneider, Emily; Phattarasokul, Somsak; Bunt, Richard; Greenberg, E Peter

2014-04-01

Burkholderia thailandensis contains three acyl-homoserine lactone quorum sensing circuits and has two additional LuxR homologs. To identify B. thailandensis quorum sensing-controlled genes, we carried out transcriptome sequencing (RNA-seq) analyses of quorum sensing mutants and their parent. The analyses were grounded in the fact that we identified genes coding for factors shown previously to be regulated by quorum sensing among a larger set of quorum-controlled genes. We also found that genes coding for contact-dependent inhibition were induced by quorum sensing and confirmed that specific quorum sensing mutants had a contact-dependent inhibition defect. Additional quorum-controlled genes included those for the production of numerous secondary metabolites, an uncharacterized exopolysaccharide, and a predicted chitin-binding protein. This study provides insights into the roles of the three quorum sensing circuits in the saprophytic lifestyle of B. thailandensis, and it provides a foundation on which to build an understanding of the roles of quorum sensing in the biology of B. thailandensis and the closely related pathogenic Burkholderia pseudomallei and Burkholderia mallei.
Structural basis for plant plasma membrane protein dynamics and organization into functional nanodomains

PubMed Central

Gronnier, Julien; Crowet, Jean-Marc; Habenstein, Birgit; Nasir, Mehmet Nail; Bayle, Vincent; Hosy, Eric; Platre, Matthieu Pierre; Gouguet, Paul; Raffaele, Sylvain; Martinez, Denis; Grelard, Axelle; Loquet, Antoine; Simon-Plas, Françoise; Gerbeau-Pissot, Patricia; Der, Christophe; Bayer, Emmanuelle M; Jaillais, Yvon; Deleu, Magali; Germain, Véronique; Lins, Laurence; Mongrand, Sébastien

2017-01-01

The plasma membrane (PM) is the primary structure for adjusting to ever-changing conditions. PM sub-compartmentalization into domains is thought to orchestrate signaling. Yet, mechanisms governing membrane organization are mostly uncharacterized. The plant-specific REMORINs (REMs) are proteins regulating hormonal crosstalk and host invasion. REMs are the best-characterized nanodomain markers via an uncharacterized moiety called REMORIN C-terminal Anchor (REM-CA). By coupling biophysical methods, super-resolution microscopy and physiology, we decipher an original mechanism regulating the dynamics and organization of nanodomains. We showed that targeting of REMORIN is independent of the COP-II-dependent secretory pathway and mediated by PI4P and sterol. REM-CA is an unconventional lipid-binding motif that confers nanodomain organization. Analyses of REM-CA mutants by single particle tracking demonstrate that mobility and supramolecular organization are critical for immunity. This study provides a unique mechanistic insight into how the tight control of spatial segregation is critical in the definition of PM domains necessary to support biological function. DOI: http://dx.doi.org/10.7554/eLife.26404.001 PMID:28758890
An Uncharacterized Member of the Ribokinase Family in Thermococcus kodakarensis Exhibits myo-Inositol Kinase Activity

PubMed Central

Sato, Takaaki; Fujihashi, Masahiro; Miyamoto, Yukika; Kuwata, Keiko; Kusaka, Eriko; Fujita, Haruo; Miki, Kunio; Atomi, Haruyuki

2013-01-01

Here we performed structural and biochemical analyses on the TK2285 gene product, an uncharacterized protein annotated as a member of the ribokinase family, from the hyperthermophilic archaeon Thermococcus kodakarensis. The three-dimensional structure of the TK2285 protein resembled those of previously characterized members of the ribokinase family including ribokinase, adenosine kinase, and phosphofructokinase. Conserved residues characteristic of this protein family were located in a cleft of the TK2285 protein as in other members whose structures have been determined. We thus examined the kinase activity of the TK2285 protein toward various sugars recognized by well characterized ribokinase family members. Although activity with sugar phosphates and nucleosides was not detected, kinase activity was observed toward d-allose, d-lyxose, d-tagatose, d-talose, d-xylose, and d-xylulose. Kinetic analyses with the six sugar substrates revealed high Km values, suggesting that they were not the true physiological substrates. By examining activity toward amino sugars, sugar alcohols, and disaccharides, we found that the TK2285 protein exhibited prominent kinase activity toward myo-inositol. Kinetic analyses with myo-inositol revealed a greater kcat and much lower Km value than those obtained with the monosaccharides, resulting in over a 2,000-fold increase in kcat/Km values. TK2285 homologs are distributed among members of Thermococcales, and in most species, the gene is positioned close to a myo-inositol monophosphate synthase gene. Our results suggest the presence of a novel subfamily of the ribokinase family whose members are present in Archaea and recognize myo-inositol as a substrate. PMID:23737529
Sequencing of the needle transcriptome from Norway spruce (Picea abies Karst L.) reveals lower substitution rates, but similar selective constraints in gymnosperms and angiosperms

PubMed Central

2012-01-01

Background A detailed knowledge about spatial and temporal gene expression is important for understanding both the function of genes and their evolution. For the vast majority of species, transcriptomes are still largely uncharacterized and even in those where substantial information is available it is often in the form of partially sequenced transcriptomes. With the development of next generation sequencing, a single experiment can now simultaneously identify the transcribed part of a species genome and estimate levels of gene expression. Results mRNA from actively growing needles of Norway spruce (Picea abies) was sequenced using next generation sequencing technology. In total, close to 70 million fragments with a length of 76 bp were sequenced resulting in 5 Gbp of raw data. A de novo assembly of these reads, together with publicly available expressed sequence tag (EST) data from Norway spruce, was used to create a reference transcriptome. Of the 38,419 PUTs (putative unique transcripts) longer than 150 bp in this reference assembly, 83.5% show similarity to ESTs from other spruce species and of the remaining PUTs, 3,704 show similarity to protein sequences from other plant species, leaving 4,167 PUTs with limited similarity to currently available plant proteins. By predicting coding frames and comparing not only the Norway spruce PUTs, but also PUTs from the close relatives Picea glauca and Picea sitchensis to both Pinus taeda and Taxus mairei, we obtained estimates of synonymous and non-synonymous divergence among conifer species. In addition, we detected close to 15,000 SNPs of high quality and estimated gene expression differences between samples collected under dark and light conditions. Conclusions Our study yielded a large number of single nucleotide polymorphisms as well as estimates of gene expression on transcriptome scale. 
In agreement with a recent study we find that the synonymous substitution rate per year (0.6 × 10⁻⁹ and 1.1 × 10⁻⁹) is an order of magnitude smaller than values reported for angiosperm herbs. However, if one takes generation time into account, most of this difference disappears. The estimates of the dN/dS ratio (non-synonymous over synonymous divergence) reported here are in general much lower than 1 and only a few genes showed a ratio larger than 1. PMID:23122049
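The generation-time argument in this abstract rests on simple arithmetic: a per-year substitution rate multiplied by generation time gives a per-generation rate. The sketch below illustrates that conversion together with the dN/dS ratio. The per-year rates are the two values quoted in the abstract; the 25-year generation time and the dN and dS inputs are hypothetical placeholders chosen only for illustration and do not come from the study.

```python
def rate_per_generation(rate_per_year, generation_time_years):
    """Convert a per-site, per-year substitution rate to a per-generation rate."""
    return rate_per_year * generation_time_years


def dn_ds(dn, ds):
    """Ratio of non-synonymous to synonymous divergence.

    Values well below 1 are conventionally read as purifying selection and
    values above 1 as possible positive selection.
    """
    if ds == 0:
        raise ValueError("dS is zero; the dN/dS ratio is undefined")
    return dn / ds


if __name__ == "__main__":
    SPRUCE_GENERATION_YEARS = 25  # hypothetical value, not from the abstract
    for yearly in (0.6e-9, 1.1e-9):  # rates quoted in the abstract
        per_gen = rate_per_generation(yearly, SPRUCE_GENERATION_YEARS)
        print(f"{yearly:.1e} per year -> {per_gen:.1e} per generation")
    # Hypothetical divergence estimates for a single gene.
    print("dN/dS =", dn_ds(0.01, 0.05))
```

Under the assumed generation time, the per-generation rate is more than an order of magnitude larger than the per-year rate, which is the sense in which accounting for generation time can narrow the gap with short-lived herbs.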
Methylation of human eukaryotic elongation factor alpha (eEF1A) by a member of a novel protein lysine methyltransferase family modulates mRNA translation

PubMed Central

Małecki, Jędrzej; Nilges, Benedikt S.; Moen, Anders; Leidel, Sebastian A.

2017-01-01

Many cellular proteins are methylated on lysine residues and this has been most intensively studied for histone proteins. Lysine methylations on non-histone proteins are also frequent, but in most cases the functional significance of the methylation event, as well as the identity of the responsible lysine (K) specific methyltransferase (KMT), remain unknown. Several recently discovered KMTs belong to the so-called seven-β-strand (7BS) class of MTases and we have here investigated an uncharacterized human 7BS MTase currently annotated as part of the endothelin converting enzyme 2, but which should be considered a separate enzyme. Combining in vitro enzymology and analyses of knockout cells, we demonstrate that this MTase efficiently methylates K36 in eukaryotic translation elongation factor 1 alpha (eEF1A) in vitro and in vivo. We suggest that this novel KMT is named eEF1A-KMT4 (gene name EEF1AKMT4), in agreement with the recently established nomenclature. Furthermore, by ribosome profiling we show that the absence of K36 methylation affects translation dynamics and changes translation speed of distinct codons. Finally, we show that eEF1A-KMT4 is part of a novel family of human KMTs, defined by a shared sequence motif in the active site and we demonstrate the importance of this motif for catalytic activity. PMID:28520920
Enterotoxigenic Escherichia coli Elicits Immune Responses to Multiple Surface Proteins

PubMed Central

Roy, Koushik; Bartels, Scott; Qadri, Firdausi; Fleckenstein, James M.

2010-01-01

Enterotoxigenic Escherichia coli (ETEC) causes considerable morbidity and mortality due to diarrheal illness in developing countries, particularly in young children. Despite the global importance of these heterogeneous pathogens, a broadly protective vaccine is not yet available. While much is known regarding the immunology of well-characterized virulence proteins, in particular the heat-labile toxin (LT) and colonization factors (CFs), to date, evaluation of the immune response to other antigens has been limited. However, the availability of genomic DNA sequences for ETEC strains coupled with proteomics technology affords opportunities to examine novel uncharacterized antigens that might also serve as targets for vaccine development. Analysis of whole or fractionated bacterial proteomes with convalescent-phase sera can potentially accelerate identification of secreted or surface-expressed targets that are recognized during the course of infection. Here we report results of an immunoproteomics approach to antigen discovery with ETEC strain H10407. Immunoblotting of proteins separated by two-dimensional electrophoresis (2DE) with sera from mice infected with strain H10407 or with convalescent human sera obtained following natural ETEC infections demonstrated multiple immunoreactive molecules in culture supernatant, outer membrane, and outer membrane vesicle preparations, suggesting that many antigens are recognized during the course of infection. Proteins identified by this approach included established virulence determinants, more recently identified putative virulence factors, as well as novel secreted and outer membrane proteins. Together, these studies suggest that existing and emerging proteomics technologies can provide a useful complement to ongoing approaches to ETEC vaccine development. PMID:20457787
Discovery of novel bacterial toxins by genomics and computational biology.

PubMed

Doxey, Andrew C; Mansfield, Michael J; Montecucco, Cesare

2018-06-01

Hundreds and hundreds of bacterial protein toxins are presently known. Traditionally, toxin identification begins with pathological studies of bacterial infectious disease. Following identification and cultivation of a bacterial pathogen, the protein toxin is purified from the culture medium and its pathogenic activity is studied using the methods of biochemistry and structural biology, cell biology, tissue and organ biology, and appropriate animal models, supplemented by bioimaging techniques. The ongoing and explosive development of high-throughput DNA sequencing and bioinformatic approaches has set in motion a revolution in many fields of biology, including microbiology. One consequence is that genes encoding novel bacterial toxins can be identified by bioinformatic and computational methods based on previous knowledge accumulated from studies of the biology and pathology of thousands of known bacterial protein toxins. Starting from the paradigmatic cases of diphtheria toxin, tetanus and botulinum neurotoxins, this review discusses traditional experimental approaches as well as bioinformatics and genomics-driven approaches that facilitate the discovery of novel bacterial toxins. We discuss recent work on the identification of novel botulinum-like toxins from genera such as Weissella, Chryseobacterium, and Enterococcus, and the implications of these computationally identified toxins in the field. Finally, we discuss the promise of metagenomics in the discovery of novel toxins and their ecological niches, and present data suggesting the existence of uncharacterized, botulinum-like toxin genes in insect gut metagenomes. Copyright © 2018. Published by Elsevier Ltd.
Distinctive acceptor-end structure and other determinants of Escherichia coli tRNAPro identity.

PubMed Central

McClain, W H; Schneider, J; Gabriel, K

1994-01-01

The previously uncharacterized determinants of the specificity of tRNAPro for aminoacylation (tRNAPro identity) were defined by a computer comparison of all Escherichia coli tRNA sequences and tested by a functional analysis of amber suppressor tRNAs in vivo. We determined the amino acid specificity of tRNA by sequencing a suppressed protein and the aminoacylation efficiency of tRNA by examining the steady-state level of aminoacyl-tRNA. On substituting nucleotides derived from the acceptor end and variable pocket of tRNAPro for the corresponding nucleotides in a tRNAPhe gene, the identity of the resulting tRNA changed substantially but incompletely to that of tRNAPro. The redesigned tRNAPhe was weakly active and aminoacyl-tRNA was not detected. Ethyl methanesulfonate mutagenesis of the redesigned tRNAPhe gene produced a mutant with a wobble pair in place of a base pair in the end of the acceptor-stem helix of the transcribed tRNA. This mutant exhibited both a tRNAPro identity and substantial aminoacyl-tRNA. The results speak for the importance of a distinctive conformation in the acceptor-stem helix of tRNAPro for aminoacylation by the prolyl-tRNA synthetase. The anticodon also contributes to tRNAPro identity but is not necessary in vivo. PMID:8127693
A multivariate prediction model for Rho-dependent termination of transcription.

PubMed

Nadiras, Cédric; Eveno, Eric; Schwartz, Annie; Figueroa-Bossi, Nara; Boudvillain, Marc

2018-06-21

Bacterial transcription termination proceeds via two main mechanisms triggered either by simple, well-conserved (intrinsic) nucleic acid motifs or by the motor protein Rho. Although bacterial genomes can harbor hundreds of termination signals of either type, only intrinsic terminators are reliably predicted. Computational tools to detect the more complex and diversiform Rho-dependent terminators are lacking. To tackle this issue, we devised a prediction method based on Orthogonal Projections to Latent Structures Discriminant Analysis [OPLS-DA] of a large set of in vitro termination data. Using previously uncharacterized genomic sequences for biochemical evaluation and OPLS-DA, we identified new Rho-dependent signals and quantitative sequence descriptors with significant predictive value. Most relevant descriptors specify features of transcript C>G skewness, secondary structure, and richness in regularly-spaced 5'CC/UC dinucleotides that are consistent with known principles for Rho-RNA interaction. Descriptors collectively warrant OPLS-DA predictions of Rho-dependent termination with a ∼85% success rate. Scanning of the Escherichia coli genome with the OPLS-DA model identifies significantly more termination-competent regions than anticipated from transcriptomics and predicts that regions intrinsically refractory to Rho are primarily located in open reading frames. Altogether, this work delineates features important for Rho activity and describes the first method able to predict Rho-dependent terminators in bacterial genomes.
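Two of the sequence descriptors named in this abstract, C>G skewness and richness in CC/UC dinucleotides, can be illustrated with simple counting. The sketch below is not the authors' OPLS-DA descriptor set: the skew formula (C - G) / (C + G), the window and step sizes, and all function names are assumptions, and the dinucleotide count ignores the "regularly-spaced" aspect mentioned in the abstract.

```python
def normalize_rna(sequence):
    """Uppercase the sequence and write T as U so DNA input is accepted."""
    return sequence.upper().replace("T", "U")


def cg_skew(sequence):
    """Return (C - G) / (C + G); positive values mean more C than G.

    This formula is one common way to express skew and is an assumption,
    not the definition used in the paper.
    """
    rna = normalize_rna(sequence)
    c, g = rna.count("C"), rna.count("G")
    if c + g == 0:
        return 0.0
    return (c - g) / (c + g)


def sliding_cg_skew(sequence, window=20, step=5):
    """Return (start index, skew) for each full window along the sequence."""
    rna = normalize_rna(sequence)
    return [
        (start, cg_skew(rna[start:start + window]))
        for start in range(0, len(rna) - window + 1, step)
    ]


def count_yc_dinucleotides(sequence):
    """Count overlapping CC and UC dinucleotides."""
    rna = normalize_rna(sequence)
    return sum(1 for i in range(len(rna) - 1) if rna[i:i + 2] in ("CC", "UC"))


if __name__ == "__main__":
    toy = "AUCCUCCACUCCAUCCGGGAGG"  # toy transcript, not a real terminator
    print(cg_skew(toy), count_yc_dinucleotides(toy))
    print(sliding_cg_skew(toy, window=10, step=4))
```

In a real predictor, descriptors like these would be computed per candidate region and passed to a multivariate model; that modeling step is outside the scope of this sketch.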
Integrating metagenomic and amplicon databases to resolve the phylogenetic and ecological diversity of the Chlamydiae

PubMed Central

Lagkouvardos, Ilias; Weinmaier, Thomas; Lauro, Federico M; Cavicchioli, Ricardo; Rattei, Thomas; Horn, Matthias

2014-01-01

In the era of metagenomics and amplicon sequencing, comprehensive analyses of available sequence data remain a challenge. Here we describe an approach exploiting metagenomic and amplicon data sets from public databases to elucidate phylogenetic diversity of defined microbial taxa. We investigated the phylum Chlamydiae whose known members are obligate intracellular bacteria that represent important pathogens of humans and animals, as well as symbionts of protists. Despite their medical relevance, our knowledge about chlamydial diversity is still scarce. Most of the nine known families are represented by only a few isolates, while previous clone library-based surveys suggested the existence of yet uncharacterized members of this phylum. Here we identified more than 22 000 high quality, non-redundant chlamydial 16S rRNA gene sequences in diverse databases, as well as 1900 putative chlamydial protein-encoding genes. Even when applying the most conservative approach, clustering of chlamydial 16S rRNA gene sequences into operational taxonomic units revealed an unexpectedly high species, genus and family-level diversity within the Chlamydiae, including 181 putative families. These in silico findings were verified experimentally in one Antarctic sample, which contained a high diversity of novel Chlamydiae. In our analysis, the Rhabdochlamydiaceae, whose known members infect arthropods, represents the most diverse and species-rich chlamydial family, followed by the protist-associated Parachlamydiaceae, and a putative new family (PCF8) with unknown host specificity. Available information on the origin of metagenomic samples indicated that marine environments contain the majority of the newly discovered chlamydial lineages, highlighting this environment as an important chlamydial reservoir. PMID:23949660
LncRNA NEAT1 promotes autophagy in MPTP-induced Parkinson's disease through stabilizing PINK1 protein.

PubMed

Yan, Wang; Chen, Zhao-Ying; Chen, Jia-Qi; Chen, Hui-Min

2018-02-19

Long non-coding RNA nuclear paraspeckle assembly transcript 1 (lncRNA NEAT1) was found to be closely related to pathological changes in the brain and nervous system. However, the role of NEAT1 and its potential mechanism in Parkinson's disease (PD) remain largely uncharacterized. In this study, a PD mouse model was established by intraperitoneal injection of MPTP. The numbers of TH+ neurons, NEAT1 expression and the levels of PINK1, LC3-II and LC3-I protein were assessed in PD mice. SH-SY5Y cells were treated with MPP+ as a PD cell model. An RNA pull-down assay was used to identify the interaction between NEAT1 and PINK1 in vitro. The endogenous expression of NEAT1 was modified in vivo by a lentiviral vector carrying an interference sequence for NEAT1. The numbers of TH+ neurons significantly decreased in PD mice compared with the control. The expression of NEAT1 and PINK1 protein and the LC3-II/LC3-I level were increased by MPTP in vitro and in vivo. Moreover, NEAT1 positively regulated the protein level of PINK1 through inhibition of PINK1 protein degradation, and NEAT1 mediated the effects of MPP+ on SH-SY5Y cells through stabilization of PINK1 protein. The results of in vivo experiments revealed that NEAT1 knockdown could effectively suppress MPTP-induced autophagy in vivo, which alleviated dopaminergic neuronal injury. LncRNA NEAT1 promoted the MPTP-induced autophagy in PD through stabilization of PINK1 protein. Copyright © 2017 Elsevier Inc. All rights reserved.
A SNARE-Like Superfamily Protein SbSLSP from the Halophyte Salicornia brachiata Confers Salt and Drought Tolerance by Maintaining Membrane Stability, K+/Na+ Ratio, and Antioxidant Machinery

PubMed Central

Singh, Dinkar; Yadav, Narendra Singh; Tiwari, Vivekanand; Agarwal, Pradeep K.; Jha, Bhavanath

2016-01-01

About 1000 salt-responsive ESTs were identified from an extreme halophyte Salicornia brachiata. Among these, a novel salt-inducible gene SbSLSP (Salicornia brachiata SNARE-like superfamily protein), showed up-regulation upon salinity and dehydration stress. The presence of cis-regulatory motifs related to abiotic stress in the putative promoter region supports our finding that SbSLSP gene is inducible by abiotic stress. The SbSLSP protein showed a high sequence identity to hypothetical/uncharacterized proteins from Beta vulgaris, Spinacia oleracea, Eucalyptus grandis, and Prunus persica and with SNARE-like superfamily proteins from Zostera marina and Arabidopsis thaliana. Bioinformatics analysis predicted a clathrin adaptor complex small-chain domain and N-myristoylation site in the SbSLSP protein. Subcellular localization studies indicated that the SbSLSP protein is mainly localized in the plasma membrane. Using transgenic tobacco lines, we establish that overexpression of SbSLSP resulted in elevated tolerance to salt and drought stress. The improved tolerance was confirmed by alterations in a range of physiological parameters, including high germination and survival rate, higher leaf chlorophyll contents, and reduced accumulation of Na+ ion and reactive oxygen species (ROS). Furthermore, overexpressing lines also showed lower water loss, higher cell membrane stability, and increased accumulation of proline and ROS-scavenging enzymes. Overexpression of SbSLSP also enhanced the transcript levels of ROS-scavenging and signaling enzyme genes. This study is the first investigation of the function of the SbSLSP gene as a novel determinant of salinity/drought tolerance. The results suggest that SbSLSP could be a potential candidate to increase salinity and drought tolerance in crop plants for sustainable agriculture in semi-arid saline soil. PMID:27313584
Using Affinity Chromatography to Investigate Novel Protein–Protein Interactions in an Undergraduate Cell and Molecular Biology Lab Course

PubMed Central

2009-01-01

Inquiry-driven lab exercises require students to think carefully about a question, carry out an investigation of that question, and critically analyze the results of their investigation. Here, we describe the implementation and assessment of an inquiry-based laboratory exercise in which students obtain and analyze novel data that contribute to our understanding of macromolecular trafficking between the nucleus and cytoplasm in eukaryotic cells. Although many of the proteins involved in nucleocytoplasmic transport are known, the physical interactions between some of these polypeptides remain uncharacterized. In this cell and molecular biology lab exercise, students investigate novel protein–protein interactions between factors involved in nuclear RNA export. Using recombinant protein expression, protein extraction, affinity chromatography, SDS-polyacrylamide gel electrophoresis, and Western blotting, undergraduates in a sophomore-level lab course identified a previously unreported association between the soluble mRNA transport factor Mex67 and the C-terminal region of the yeast nuclear pore complex protein Nup1. This exercise immersed students in the process of investigative science, from proposing and performing experiments through analyzing data and reporting outcomes. On completion of this investigative lab sequence, students reported enhanced understanding of the scientific process, increased proficiency with cellular and molecular methods and content, greater understanding of data analysis and the importance of appropriate controls, an enhanced ability to communicate science effectively, and an increased enthusiasm for scientific research and for the lab component of the course. The modular nature of this exercise and its focus on asking novel questions about protein–protein interactions make it easily transferable to undergraduate lab courses performed in a wide variety of contexts. PMID:19723816
Prolyl hydroxylation regulates protein degradation, synthesis, and splicing in human induced pluripotent stem cell-derived cardiomyocytes

PubMed Central

Stoehr, Andrea; Yang, Yanqin; Patel, Sajni; Evangelista, Alicia M.; Aponte, Angel; Wang, Guanghui; Liu, Poching; Boylston, Jennifer; Kloner, Philip H.; Lin, Yongshun; Gucek, Marjan; Zhu, Jun; Murphy, Elizabeth

2016-01-01

Aims Protein hydroxylases are oxygen- and α-ketoglutarate-dependent enzymes that catalyse hydroxylation of amino acids such as proline, thus linking oxygen and metabolism to enzymatic activity. Prolyl hydroxylation is a dynamic post-translational modification that regulates protein stability and protein–protein interactions; however, the extent of this modification is largely uncharacterized. The goals of this study are to investigate the biological consequences of prolyl hydroxylation and to identify new targets that undergo prolyl hydroxylation in human cardiomyocytes. Methods and results We used human induced pluripotent stem cell-derived cardiomyocytes in combination with pulse-chase amino acid labelling and proteomics to analyse the effects of prolyl hydroxylation on protein degradation and synthesis. We identified 167 proteins that exhibit differences in degradation with inhibition of prolyl hydroxylation by dimethyloxalylglycine (DMOG); 164 were stabilized. Proteins involved in RNA splicing such as serine/arginine-rich splicing factor 2 (SRSF2) and splicing factor and proline- and glutamine-rich (SFPQ) were stabilized with DMOG. DMOG also decreased protein translation of cytoskeletal and sarcomeric proteins such as α-cardiac actin. We searched the mass spectrometry data for proline hydroxylation and identified 134 high confidence peptides mapping to 78 unique proteins. We identified SRSF2, SFPQ, α-cardiac actin, and cardiac titin as prolyl hydroxylated. We identified 29 prolyl hydroxylated proteins that showed a significant difference in either protein degradation or synthesis. Additionally, we performed next-generation RNA sequencing and showed that the observed decrease in protein synthesis was not due to changes in mRNA levels. Because RNA splicing factors were prolyl hydroxylated, we investigated splicing ± inhibition of prolyl hydroxylation and detected 369 alternative splicing events, with a preponderance of exon skipping. 
Conclusions This study provides the first extensive characterization of the cardiac prolyl hydroxylome and demonstrates that inhibition of α-ketoglutarate hydroxylases alters protein stability, translation, and splicing. PMID:27095734
Identification of Novel Saccharomyces cerevisiae Proteins with Nuclear Export Activity: Cell Cycle-Regulated Transcription Factor Ace2p Shows Cell Cycle-Independent Nucleocytoplasmic Shuttling

PubMed Central

Jensen, Torben Heick; Neville, Megan; Rain, Jean Christophe; McCarthy, Terri; Legrain, Pierre; Rosbash, Michael

2000-01-01

Nuclear export of proteins containing leucine-rich nuclear export signals (NESs) is mediated by the NES receptor CRM1/Crm1p. We have carried out a yeast two-hybrid screen with Crm1p as a bait. The Crm1p-interacting clones were subscreened for nuclear export activity in a visual assay utilizing the Crm1p-inhibitor leptomycin B (LMB). This approach identified three Saccharomyces cerevisiae proteins not previously known to have nuclear export activity. These proteins are the 5′ RNA triphosphatase Ctl1p, the cell cycle-regulated transcription factor Ace2p, and a protein encoded by the previously uncharacterized open reading frame YDR499W. Mutagenesis analysis shows that YDR499Wp contains an NES that conforms to the consensus sequence for leucine-rich NESs. Mutagenesis of Ctl1p and Ace2p was unable to identify specific NES residues. However, a 29-amino-acid region of Ace2p, rich in hydrophobic residues, contains nuclear export activity. Ace2p accumulates in the nucleus at the end of mitosis and activates early-G1-specific genes. We now provide evidence that Ace2p is nuclear not only in late M-early G1 but also during other stages of the cell cycle. This feature of Ace2p localization explains its ability to activate genes such as CUP1, which are not expressed in a cell cycle-dependent manner. PMID:11027275
A Rapid and Reliable Method for Total Protein Extraction from Succulent Plants for Proteomic Analysis.

PubMed

Lledías, Fernando; Hernández, Felipe; Rivas, Viridiana; García-Mendoza, Abisaí; Cassab, Gladys I; Nieto-Sotelo, Jorge

2017-08-01

Crassulacean acid metabolism plants have some morphological features, such as succulent and reduced leaves, thick cuticles, and sunken stomata that help them prevent excessive water loss and irradiation. As molecular constituents of these morphological adaptations to xeric environments, succulent plants produce a set of specific compounds such as complex polysaccharides, pigments, waxes, and terpenoids, to name a few, in addition to uncharacterized proteases. Since all these compounds interfere with the analysis of proteins by electrophoretic techniques, preparation of high quality samples from these sources represents a real challenge. The absence of adequate protocols for protein extraction has restrained the study of this class of plants at the molecular level. Here, we present a rapid and reliable protocol that could be accomplished in 1 h and applied to a broad range of plants with reproducible results. We were able to obtain well-resolved SDS/PAGE protein patterns in extracts from different members of the subfamilies Agavoideae (Agave, Yucca, Manfreda, and Furcraea), Nolinoideae (Dasylirion and Beaucarnea), and the Cactaceae family. This method is based on the differential solubility of contaminants and proteins in the presence of acetone and pH-altered solutions. We speculate about the role of saponins and high molecular weight carbohydrates to produce electrophoretic-compatible samples. A modification of the basic protocol allowed the analysis of samples by bidimensional electrophoresis (2DE) for proteomic analysis. Furostanol glycoside 26-O-β-glucosidase (an enzyme involved in steroid saponin synthesis) was successfully identified by mass spectrometry analysis and de novo sequencing of a 2DE spot from an Agave attenuata sample.
A trans-outer membrane porin-cytochrome protein complex for extracellular electron transfer by Geobacter sulfurreducens PCA

DOE PAGES

Liu, Yimo; Wang, Zheming; Liu, Juan; ...

2014-09-24

The multiheme, outer membrane c-type cytochrome (c-Cyt) OmcB of Geobacter sulfurreducens was previously proposed to mediate electron transfer across the outer membrane. However, the underlying mechanism has remained uncharacterized. In G. sulfurreducens, the omcB gene is part of two tandem four-gene clusters, each of which is predicted to encode a transcriptional factor (OrfR/OrfS), a porin-like outer membrane protein (OmbB/OmbC), a periplasmic c-type cytochrome (OmaB/OmaC), and an outer membrane c-Cyt (OmcB/OmcC), respectively. Here we showed that OmbB/OmbC, OmaB/OmaC and OmcB/OmcC of G. sulfurreducens PCA formed the porin-cytochrome (Pcc) protein complexes, which were involved in transferring electrons across the outer membrane. The isolated Pcc protein complexes reconstituted in proteoliposomes transferred electrons from reduced methyl viologen across the lipid bilayer of liposomes to Fe(III)-citrate and ferrihydrite. The pcc clusters were found in all eight sequenced Geobacter and 11 other bacterial genomes from six different phyla, demonstrating a widespread distribution of Pcc protein complexes in phylogenetically diverse bacteria. Deletion of ombB-omaB-omcB-orfS-ombC-omaC-omcC gene clusters had no impact on the growth of G. sulfurreducens PCA with fumarate, but diminished the ability of G. sulfurreducens PCA to reduce Fe(III)-citrate and ferrihydrite. Finally, complementation with the ombB-omaB-omcB gene cluster restored the ability of G. sulfurreducens PCA to reduce Fe(III)-citrate and ferrihydrite.
Processing of Nonconjugative Resistance Plasmids by Conjugation Nicking Enzyme of Staphylococci

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pollet, Rebecca M.; Ingle, James D.; Hymes, Jeff P.

Antimicrobial resistance in Staphylococcus aureus presents an increasing threat to human health. This resistance is often encoded on mobile plasmids, such as pSK41; however, the mechanism of transfer of these plasmids is not well understood. In this study, we first examine key protein-DNA interactions formed by the relaxase enzyme, NES, which initiates and terminates the transfer of the multidrug resistance plasmid pSK41. Two loops on the NES protein, hairpin loops 1 and 2, form extensive contacts with the DNA hairpin formed at the oriT region of pSK41, and here we establish that these contacts are essential for proper DNA cleavage and religation by the full 665-residue NES protein in vitro. Second, pSK156 and pCA347 are nonconjugative Staphylococcus aureus plasmids that contain sequences similar to the oriT region of pSK41 but differ in the sequence predicted to form a DNA hairpin. We show that pSK41-encoded NES is able to bind, cleave, and religate the oriT sequences of these nonconjugative plasmids in vitro. Although pSK41 could mobilize a coresident plasmid harboring its cognate oriT, it was unable to mobilize plasmids containing the pSK156 and pCA347 variant oriT mimics, suggesting that an accessory protein like that previously shown to confer specificity in the pWBG749 system may also be involved in transmission of plasmids containing a pSK41-like oriT. These data indicate that the conjugative relaxase in trans mechanism recently described for the pWBG749 family of plasmids also applies to the pSK41 family of plasmids, further heightening the potential significance of this mechanism in the horizontal transfer of staphylococcal plasmids.
IMPORTANCE Understanding the mechanism of antimicrobial resistance transfer in bacteria such as Staphylococcus aureus is an important step toward potentially slowing the spread of antimicrobial-resistant infections. This work establishes protein-DNA interactions essential for the transfer of the Staphylococcus aureus multiresistance plasmid pSK41 by its relaxase, NES. This enzyme also processed variant oriT-like sequences found on numerous plasmids previously considered nontransmissible, suggesting that in conjunction with an uncharacterized accessory protein, these plasmids may be transferred horizontally via a relaxase in trans mechanism. These findings have important implications for our understanding of staphylococcal resistance plasmid evolution.
Identification of Variant-Specific Functions of PIK3CA by Rapid Phenotyping of Rare Mutations

Cancer.gov

Large-scale sequencing efforts are uncovering the complexity of cancer genomes, which are composed of causal "driver" mutations that promote tumor progression along with many more pathologically neutral "passenger" events. The majority of mutations, both in known cancer drivers and uncharacterized genes, are generally of low occurrence, highlighting the need to functionally annotate the long tail of infrequent mutations present in heterogeneous cancers.

The polyphenol oxidase gene family in land plants: Lineage-specific duplication and expansion

PubMed Central

2012-01-01

Background Plant polyphenol oxidases (PPOs) are enzymes that typically use molecular oxygen to oxidize ortho-diphenols to ortho-quinones. These commonly cause browning reactions following tissue damage, and may be important in plant defense. Some PPOs function as hydroxylases or in cross-linking reactions, but in most plants their physiological roles are not known. To better understand the importance of PPOs in the plant kingdom, we surveyed PPO gene families in 25 sequenced genomes from chlorophytes, bryophytes, lycophytes, and flowering plants. The PPO genes were then analyzed in silico for gene structure, phylogenetic relationships, and targeting signals. Results Many previously uncharacterized PPO genes were uncovered. The moss, Physcomitrella patens, contained 13 PPO genes and Selaginella moellendorffii (spike moss) and Glycine max (soybean) each had 11 genes. Populus trichocarpa (poplar) contained a highly diversified gene family with 11 PPO genes, but several flowering plants had only a single PPO gene. By contrast, no PPO-like sequences were identified in several chlorophyte (green algae) genomes or Arabidopsis (A. lyrata and A. thaliana). We found that many PPOs contained one or two introns often near the 3’ terminus. Furthermore, N-terminal amino acid sequence analysis using ChloroP and TargetP 1.1 predicted that several putative PPOs are synthesized via the secretory pathway, a unique finding as most PPOs are predicted to be chloroplast proteins. Phylogenetic reconstruction of these sequences revealed that large PPO gene repertoires in some species are mostly a consequence of independent bursts of gene duplication, while the lineage leading to Arabidopsis must have lost all PPO genes. Conclusion Our survey identified PPOs in gene families of varying sizes in all land plants except in the genus Arabidopsis. 
While we found variation in intron numbers and positions, overall PPO gene structure is congruent with the phylogenetic relationships based on primary sequence data. The dynamic nature of this gene family differentiates PPO from other oxidative enzymes, and is consistent with a protein important for a diversity of functions relating to environmental adaptation. PMID:22897796
Structural characteristics of ScBx genes controlling the biosynthesis of hydroxamic acids in rye (Secale cereale L.).

PubMed

Bakera, Beata; Makowska, Bogna; Groszyk, Jolanta; Niziołek, Michał; Orczyk, Wacław; Bolibok-Brągoszewska, Hanna; Hromada-Judycka, Aneta; Rakoczy-Trojanowska, Monika

2015-08-01

Benzoxazinoids (BX) are major secondary metabolites of gramineous plants that play an important role in disease resistance and allelopathy. They also have many other unique properties including anti-bacterial and anti-fungal activity, and the ability to reduce alpha-amylase activity. The biosynthesis and modification of BX are controlled by the genes Bx1 to Bx10, GT and glu, and the majority of these Bx genes have been mapped in maize, wheat and rye. However, the genetic basis of BX biosynthesis remains largely uncharacterized apart from some data from maize and wheat. The aim of this study was to isolate, sequence and characterize five genes (ScBx1, ScBx2, ScBx3, ScBx4 and ScBx5) encoding enzymes involved in the synthesis of DIBOA, an important defense compound of rye. Using a modified 3D procedure of BAC library screening, seven BAC clones containing all of the ScBx genes were isolated and sequenced. Bioinformatic analyses of the resulting contigs were used to examine the structure and other features of these genes, including their promoters, introns and 3'UTRs. Comparative analysis showed that the ScBx genes are similar to those of other Poaceae species, especially to the TaBx genes. The polymorphisms present both in the coding sequences and non-coding regions of ScBx in relation to other Bx genes are predicted to have an impact on the expression, structure and properties of the encoded proteins.
Structures of Arg- and Gln-type bacterial cysteine dioxygenase homologs: Arg- and Gln-type Bacterial CDO Homologs

DOE PAGES

Driggers, Camden M.; Hartman, Steven J.; Karplus, P. Andrew

2015-01-01

In some bacteria, cysteine is converted to cysteine sulfinic acid by cysteine dioxygenases (CDO) that are only ~15–30% identical in sequence to mammalian CDOs. Among bacterial proteins having this range of sequence similarity to mammalian CDO are some that conserve an active site Arg residue (“Arg-type” enzymes) and some having a Gln substituted for this Arg (“Gln-type” enzymes). Here, we describe a structure from each of these enzyme types by analyzing structures originally solved by structural genomics groups but not published: a Bacillus subtilis “Arg-type” enzyme that has cysteine dioxygenase activity (BsCDO), and a Ralstonia eutropha “Gln-type” CDO homolog of uncharacterized activity (ReCDOhom). The BsCDO active site is well conserved with mammalian CDO, and a cysteine complex captured in the active site confirms that the cysteine binding mode is also similar. The ReCDOhom structure reveals a new active site Arg residue that is hydrogen bonding to an iron-bound diatomic molecule we have interpreted as dioxygen. Notably, the Arg position is not compatible with the mode of Cys binding seen in both rat CDO and BsCDO. As sequence alignments show that this newly discovered active site Arg is well conserved among “Gln-type” CDO enzymes, we conclude that the “Gln-type” CDO homologs are not authentic CDOs but will have substrate specificity more similar to 3-mercaptopropionate dioxygenases.
Construction of reliable protein-protein interaction networks with a new interaction generality measure.

PubMed

Saito, Rintaro; Suzuki, Harukazu; Hayashizaki, Yoshihide

2003-04-12

Recent screening techniques have made large amounts of protein-protein interaction data available, from which biologically important information such as the function of uncharacterized proteins, the existence of novel protein complexes, and novel signal-transduction pathways can be discovered. However, experimental data on protein interactions contain many false positives, making these discoveries difficult. Therefore computational methods of assessing the reliability of each candidate protein-protein interaction are urgently needed. We developed a new 'interaction generality' measure (IG2) to assess the reliability of protein-protein interactions using only the topological properties of their interaction-network structure. Using yeast protein-protein interaction data, we showed that reliable protein-protein interactions had significantly lower IG2 values than less-reliable interactions, suggesting that IG2 values can be used to evaluate and filter interaction data to enable the construction of reliable protein-protein interaction networks.
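To make the idea of scoring an interaction from network topology alone more concrete, here is a deliberately simplified sketch. It is not the published IG2 definition, which the abstract does not give. The score used here, the number of partners of a candidate pair that have no other interactions, is a hypothetical stand-in, and the names `build_graph` and `leaf_partner_score` as well as the toy network are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str]


def build_graph(edges: Iterable[Edge]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from a list of interaction pairs."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def leaf_partner_score(graph: Dict[str, Set[str]], a: str, b: str) -> int:
    """Count partners of a or b (excluding a and b) that interact with nothing else.

    Hypothetical score: many single-interaction partners around a pair is
    treated here as a hint that one member may be a promiscuous binder.
    """
    partners = (graph[a] | graph[b]) - {a, b}
    return sum(1 for p in partners if len(graph[p]) == 1)


if __name__ == "__main__":
    # Toy network: H is a hub with four single-interaction partners,
    # while A, B and C form a mutually supported triangle.
    edges = [("H", "L1"), ("H", "L2"), ("H", "L3"), ("H", "L4"),
             ("H", "A"), ("A", "B"), ("B", "C"), ("A", "C")]
    g = build_graph(edges)
    for pair in [("H", "L1"), ("A", "B")]:
        print(pair, leaf_partner_score(g, *pair))
```

In this toy network the hub-to-leaf pair scores higher than the pair inside the triangle. That matches the abstract's finding that reliable interactions had lower values, although the actual measure is more elaborate than this count.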
Transcriptionally active PCR for antigen identification and vaccine development: in vitro genome-wide screening and in vivo immunogenicity

PubMed Central

Regis, David P.; Dobaño, Carlota; Quiñones-Olson, Paola; Liang, Xiaowu; Graber, Norma L.; Stefaniak, Maureen E.; Campo, Joseph J.; Carucci, Daniel J.; Roth, David A.; He, Huaping; Felgner, Philip L.; Doolan, Denise L.

2009-01-01

We have evaluated a technology called Transcriptionally Active PCR (TAP) for high throughput identification and prioritization of novel target antigens from genomic sequence data using the Plasmodium parasite, the causative agent of malaria, as a model. First, we adapted the TAP technology for the highly AT-rich Plasmodium genome, using well-characterized P. falciparum and P. yoelii antigens and a small panel of uncharacterized open reading frames from the P. falciparum genome sequence database. We demonstrated that TAP fragments encoding six well-characterized P. falciparum antigens and five well-characterized P. yoelii antigens could be amplified in an equivalent manner from both plasmid DNA and genomic DNA templates, and that uncharacterized open reading frames could also be amplified from genomic DNA template. Second, we showed that the in vitro expression of the TAP fragments was equivalent or superior to that of supercoiled plasmid DNA encoding the same antigen. Third, we evaluated the in vivo immunogenicity of TAP fragments encoding a subset of the model P. falciparum and P. yoelii antigens. We found that antigen-specific antibody and cellular immune responses induced by the TAP fragments in mice were equivalent or superior to those induced by the corresponding plasmid DNA vaccines. Finally, we developed and demonstrated proof-of-principle for an in vitro humoral immunoscreening assay for down-selection of novel target antigens. These data support the potential of a TAP approach for rapid high throughput functional screening and identification of potential candidate vaccine antigens from genomic sequence data. PMID:18164079
Transcriptionally active PCR for antigen identification and vaccine development: in vitro genome-wide screening and in vivo immunogenicity.

PubMed

Regis, David P; Dobaño, Carlota; Quiñones-Olson, Paola; Liang, Xiaowu; Graber, Norma L; Stefaniak, Maureen E; Campo, Joseph J; Carucci, Daniel J; Roth, David A; He, Huaping; Felgner, Philip L; Doolan, Denise L

2008-03-01

We have evaluated a technology called transcriptionally active PCR (TAP) for high throughput identification and prioritization of novel target antigens from genomic sequence data using the Plasmodium parasite, the causative agent of malaria, as a model. First, we adapted the TAP technology for the highly AT-rich Plasmodium genome, using well-characterized P. falciparum and P. yoelii antigens and a small panel of uncharacterized open reading frames from the P. falciparum genome sequence database. We demonstrated that TAP fragments encoding six well-characterized P. falciparum antigens and five well-characterized P. yoelii antigens could be amplified in an equivalent manner from both plasmid DNA and genomic DNA templates, and that uncharacterized open reading frames could also be amplified from genomic DNA template. Second, we showed that the in vitro expression of the TAP fragments was equivalent or superior to that of supercoiled plasmid DNA encoding the same antigen. Third, we evaluated the in vivo immunogenicity of TAP fragments encoding a subset of the model P. falciparum and P. yoelii antigens. We found that antigen-specific antibody and cellular immune responses induced by the TAP fragments in mice were equivalent or superior to those induced by the corresponding plasmid DNA vaccines. Finally, we developed and demonstrated proof-of-principle for an in vitro humoral immunoscreening assay for down-selection of novel target antigens. These data support the potential of a TAP approach for rapid high throughput functional screening and identification of potential candidate vaccine antigens from genomic sequence data.
ATRX tolerates activity-dependent histone H3 methyl/phos switching to maintain repetitive element silencing in neurons

PubMed Central

Noh, Kyung-Min; Zhao, Dan; Xiang, Bin; Wenderski, Wendy; Lewis, Peter W.; Shen, Li; Li, Haitao; Allis, C. David

2015-01-01

ATRX (the alpha thalassemia/mental retardation syndrome X-linked protein) is a member of the switch2/sucrose nonfermentable2 (SWI2/SNF2) family of chromatin-remodeling proteins and primarily functions at heterochromatic loci via its recognition of “repressive” histone modifications [e.g., histone H3 lysine 9 tri-methylation (H3K9me3)]. Despite significant roles for ATRX during normal neural development, as well as its relationship to human disease, ATRX function in the central nervous system is not well understood. Here, we describe ATRX’s ability to recognize an activity-dependent combinatorial histone modification, histone H3 lysine 9 tri-methylation/serine 10 phosphorylation (H3K9me3S10ph), in postmitotic neurons. In neurons, this “methyl/phos” switch occurs exclusively after periods of stimulation and is highly enriched at heterochromatic repeats associated with centromeres. Using a multifaceted approach, we reveal that H3K9me3S10ph-bound Atrx represses noncoding transcription of centromeric minor satellite sequences during instances of heightened activity. Our results indicate an essential interaction between ATRX and a previously uncharacterized histone modification in the central nervous system and suggest a potential role for abnormal repetitive element transcription in pathological states manifested by ATRX dysfunction. PMID:25538301
ATRX tolerates activity-dependent histone H3 methyl/phos switching to maintain repetitive element silencing in neurons.

PubMed

Noh, Kyung-Min; Maze, Ian; Zhao, Dan; Xiang, Bin; Wenderski, Wendy; Lewis, Peter W; Shen, Li; Li, Haitao; Allis, C David

2015-06-02

ATRX (the alpha thalassemia/mental retardation syndrome X-linked protein) is a member of the switch2/sucrose nonfermentable2 (SWI2/SNF2) family of chromatin-remodeling proteins and primarily functions at heterochromatic loci via its recognition of "repressive" histone modifications [e.g., histone H3 lysine 9 tri-methylation (H3K9me3)]. Despite significant roles for ATRX during normal neural development, as well as its relationship to human disease, ATRX function in the central nervous system is not well understood. Here, we describe ATRX's ability to recognize an activity-dependent combinatorial histone modification, histone H3 lysine 9 tri-methylation/serine 10 phosphorylation (H3K9me3S10ph), in postmitotic neurons. In neurons, this "methyl/phos" switch occurs exclusively after periods of stimulation and is highly enriched at heterochromatic repeats associated with centromeres. Using a multifaceted approach, we reveal that H3K9me3S10ph-bound Atrx represses noncoding transcription of centromeric minor satellite sequences during instances of heightened activity. Our results indicate an essential interaction between ATRX and a previously uncharacterized histone modification in the central nervous system and suggest a potential role for abnormal repetitive element transcription in pathological states manifested by ATRX dysfunction.
Open reading frames associated with cancer in the dark matter of the human genome.

PubMed

Delgado, Ana Paula; Brandao, Pamela; Chapado, Maria Julia; Hamid, Sheilin; Narayanan, Ramaswamy

2014-01-01

The uncharacterized proteins (open reading frames, ORFs) in the human genome offer an opportunity to discover novel targets for cancer. A systematic analysis of the dark matter of the human proteome for druggability and biomarker discovery is crucial to mining the genome. Numerous data mining tools are available to mine these ORFs to develop a comprehensive knowledge base for future target discovery and validation. Using the Genetic Association Database, the ORFs of the human dark matter proteome were screened for evidence of association with neoplasms. The Phenome-Genome Integrator tool was used to establish phenotypic association with disease traits including cancer. Batch analysis of the tools for protein expression analysis, gene ontology and motifs and domains was used to characterize the ORFs. Sixty-two ORFs were identified for neoplasm association. The expression Quantitative Trait Loci (eQTL) analysis identified thirteen ORFs related to cancer traits. Protein expression, motifs and domain analysis and genome-wide association studies verified the relevance of these OncoORFs in diverse tumors. The OncoORFs are also associated with a wide variety of human diseases and disorders. Our results link the OncoORFs to diverse diseases and disorders. This suggests a complex landscape of the uncharacterized proteome in human diseases. These results open the dark matter of the proteome to novel cancer target research. Copyright© 2014, International Institute of Anticancer Research (Dr. John G. Delinasios), All rights reserved.
Coinheritance of biallelic SLURP1 and SLC39A4 mutations cause a severe genodermatosis with skin peeling and hair loss all over the body.

PubMed

Harms, F L; Nampoothiri, S; Kortüm, F; Thomas, J; Panicker, V V; Alawi, M; Altmüller, J; Yesodharan, D; Kutsche, K

2018-06-27

Next-generation sequencing (NGS), especially multi-gene panels and whole-exome sequencing (WES), is a tool for identifying the cause of monogenic disorders and has played a role in uncovering the genetic cause of previously uncharacterized genodermatoses. By the application of NGS, the concept of apparently novel or atypical clinical presentations has been challenged by the finding of two or more genetic diagnoses in affected individuals. Approximately 5% of cases in which WES was informative had dual or multiple molecular diagnoses. This article is protected by copyright. All rights reserved.
Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks.

PubMed

Smoot, James C; Barbian, Kent D; Van Gompel, Jamie J; Smoot, Laura M; Chaussee, Michael S; Sylva, Gail L; Sturdevant, Daniel E; Ricklefs, Stacy M; Porcella, Stephen F; Parkins, Larye D; Beres, Stephen B; Campbell, David S; Smith, Todd M; Zhang, Qing; Kapur, Vivek; Daly, Judy A; Veasy, L George; Musser, James M

2002-04-02

Acute rheumatic fever (ARF), a sequela of group A Streptococcus (GAS) infection, is the most common cause of preventable childhood heart disease worldwide. The molecular basis of ARF and the subsequent rheumatic heart disease are poorly understood. Serotype M18 GAS strains have been associated for decades with ARF outbreaks in the U.S. As a first step toward gaining new insight into ARF pathogenesis, we sequenced the genome of strain MGAS8232, a serotype M18 organism isolated from a patient with ARF. The genome is a circular chromosome of 1,895,017 bp, and it shares 1.7 Mb of closely related genetic material with strain SF370 (a sequenced serotype M1 strain). Strain MGAS8232 has 178 ORFs absent in SF370. Phages, phage-like elements, and insertion sequences are the major sources of variation between the genomes. The genomes of strain MGAS8232 and SF370 encode many of the same proven or putative virulence factors. Importantly, strain MGAS8232 has genes encoding many additional secreted proteins involved in human-GAS interactions, including streptococcal pyrogenic exotoxin A (scarlet fever toxin) and two uncharacterized pyrogenic exotoxin homologues, all phage-associated. DNA microarray analysis of 36 serotype M18 strains from diverse localities showed that most regions of variation were phages or phage-like elements. Two epidemics of ARF occurring 12 years apart in Salt Lake City, UT, were caused by serotype M18 strains that were genetically identical, or nearly so. Our analysis provides a critical foundation for accelerated research into ARF pathogenesis and a molecular framework to study the plasticity of GAS genomes.
A phylogenomic analysis of the Actinomycetales mce operons.

PubMed

Casali, Nicola; Riley, Lee W

2007-02-26

The genome of Mycobacterium tuberculosis harbors four copies of a cluster of genes termed mce operons. Despite extensive research that has demonstrated the importance of these operons on infection outcome, their physiological function remains obscure. Expanding databases of complete microbial genome sequences facilitate a comparative genomic approach that can provide valuable insight into the role of uncharacterized proteins. The M. tuberculosis mce loci each include two yrbE and six mce genes, which have homology to ABC transporter permeases and substrate-binding proteins, respectively. Operons with an identical structure were identified in all Mycobacterium species examined, as well as in five other Actinomycetales genera. Some of the Actinomycetales mce operons include an mkl gene, which encodes an ATPase resembling those of ABC uptake transporters. The phylogenetic profile of Mkl orthologs exactly matched that of the Mce and YrbE proteins. Through topology and motif analyses of YrbE homologs, we identified a region within the penultimate cytoplasmic loop that may serve as the site of interaction with the putative cognate Mkl ATPase. Homologs of the exported proteins encoded adjacent to the M. tuberculosis mce operons were detected in a conserved chromosomal location downstream of the majority of Actinomycetales operons. Operons containing linked mkl, yrbE and mce genes, resembling the classic organization of an ABC importer, were found to be common in Gram-negative bacteria and appear to be associated with changes in properties of the cell surface. Evidence presented suggests that the mce operons of Actinomycetales species and related operons in Gram-negative bacteria encode a subfamily of ABC uptake transporters with a possible role in remodeling the cell envelope.
ATP-binding Cassette (ABC) Transport System Solute-binding Protein-guided Identification of Novel d-Altritol and Galactitol Catabolic Pathways in Agrobacterium tumefaciens C58

PubMed Central

Wichelecki, Daniel J.; Vetting, Matthew W.; Chou, Liyushang; Al-Obaidi, Nawar; Bouvier, Jason T.; Almo, Steven C.; Gerlt, John A.

2015-01-01

Innovations in the discovery of the functions of uncharacterized proteins/enzymes have become increasingly important as advances in sequencing technology flood protein databases with an exponentially growing number of open reading frames. This study documents one such innovation developed by the Enzyme Function Initiative (EFI; U54GM093342), the use of solute-binding proteins for transport systems to identify novel metabolic pathways. In a previous study, this strategy was applied to the tripartite ATP-independent periplasmic transporters. Here, we apply this strategy to the ATP-binding cassette transporters and report the discovery of novel catabolic pathways for d-altritol and galactitol in Agrobacterium tumefaciens C58. These efforts resulted in the description of three novel enzymatic reactions as follows: 1) oxidation of d-altritol to d-tagatose via a dehydrogenase in Pfam family PF00107, a previously unknown reaction; 2) phosphorylation of d-tagatose to d-tagatose 6-phosphate via a kinase in Pfam family PF00294, a previously orphan EC number; and 3) epimerization of d-tagatose 6-phosphate C-4 to d-fructose 6-phosphate via a member of Pfam family PF08013, another previously unknown reaction. The epimerization reaction catalyzed by a member of PF08013 is especially noteworthy, because the functions of members of PF08013 have been unknown. These discoveries were assisted by the following two synergistic bioinformatics web tools made available by the Enzyme Function Initiative: the EFI-Enzyme Similarity Tool and the EFI-Genome Neighborhood Tool. PMID:26472925
Genome Sequencing of Listeria monocytogenes “Quargel” Listeriosis Outbreak Strains Reveals Two Different Strains with Distinct In Vitro Virulence Potential

PubMed Central

Rychli, Kathrin; Müller, Anneliese; Zaiser, Andreas; Schoder, Dagmar; Allerberger, Franz; Wagner, Martin; Schmitz-Esser, Stephan

2014-01-01

A large listeriosis outbreak occurred in Austria, Germany and the Czech Republic in 2009 and 2010. The outbreak was traced back to a traditional Austrian curd cheese called “Quargel” which was contaminated with two distinct serovar 1/2a Listeria monocytogenes strains (QOC1 and QOC2). In this study we sequenced and analysed the genomes of both outbreak strains in order to investigate the extent of genetic diversity between the two strains belonging to MLST sequence types 398 (QOC2) and 403 (QOC1). Both genomes are highly similar, but also display distinct properties: The QOC1 genome is approximately 74 kbp larger than the QOC2 genome. In addition, the strains harbour 93 (QOC1) and 45 (QOC2) genes encoding strain-specific proteins. A 21 kbp region showing highest similarity to plasmid pLMIV encoding three putative internalins is integrated in the QOC1 genome. In contrast to QOC1, strain QOC2 harbours a vip homologue, which encodes a LPXTG surface protein involved in cell invasion. In accordance, in vitro virulence assays revealed distinct differences in invasion efficiency and intracellular proliferation within different cell types. The higher virulence potential of QOC1 in non-phagocytic cells may be explained by the presence of additional internalins in the pLMIV-like region, whereas the higher invasion capability of QOC2 into phagocytic cells may be due to the presence of a vip homologue. In addition, both strains show differences in stress-related gene content. Strain QOC1 encodes a so-called stress survival islet 1, whereas strain QOC2 harbours a homologue of the uncharacterized LMOf2365_0481 gene. Consistently, QOC1 shows higher resistance to acidic, alkaline and gastric stress. In conclusion, our results show that strain QOC1 and QOC2 are distinct and did not recently evolve from a common ancestor. PMID:24587155
Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence

PubMed Central

2011-01-01

Background Many plants have large and complex genomes with an abundance of repeated sequences. Many plants are also polyploid. Both of these attributes typify the genome architecture in the tribe Triticeae, whose members include economically important wheat, rye and barley. Large genome sizes, an abundance of repeated sequences, and polyploidy present challenges to genome-wide SNP discovery using next-generation sequencing (NGS) of total genomic DNA by making alignment and clustering of short reads generated by the NGS platforms difficult, particularly in the absence of a reference genome sequence. Results An annotation-based, genome-wide SNP discovery pipeline is reported using NGS data for large and complex genomes without a reference genome sequence. Roche 454 shotgun reads with low genome coverage of one genotype are annotated in order to distinguish single-copy sequences and repeat junctions from repetitive sequences and sequences shared by paralogous genes. Multiple genome equivalents of shotgun reads of another genotype generated with SOLiD or Solexa are then mapped to the annotated Roche 454 reads to identify putative SNPs. A pipeline program package, AGSNP, was developed and used for genome-wide SNP discovery in Aegilops tauschii, the diploid source of the wheat D genome, which has a genome size of 4.02 Gb, of which 90% is repetitive sequence. Genomic DNA of Ae. tauschii accession AL8/78 was sequenced with the Roche 454 NGS platform. Genomic DNA and cDNA of Ae. tauschii accession AS75 were sequenced primarily with SOLiD, although some Solexa and Roche 454 genomic sequences were also generated. A total of 195,631 putative SNPs were discovered in gene sequences, 155,580 putative SNPs were discovered in uncharacterized single-copy regions, and another 145,907 putative SNPs were discovered in repeat junctions. These SNPs were dispersed across the entire Ae. tauschii genome. To assess the false positive SNP discovery rate, DNA containing putative SNPs was amplified by PCR from AL8/78 and AS75 and resequenced with the ABI 3730 xl. In a sample of 302 randomly selected putative SNPs, 84.0% in gene regions, 88.0% in repeat junctions, and 81.3% in uncharacterized regions were validated. Conclusion An annotation-based genome-wide SNP discovery pipeline for NGS platforms was developed. The pipeline is suitable for SNP discovery in genomic libraries of complex genomes and does not require a reference genome sequence. The pipeline is applicable to all current NGS platforms, provided that at least one such platform generates relatively long reads. The pipeline package, AGSNP, and the discovered 497,118 Ae. tauschii SNPs can be accessed at http://avena.pw.usda.gov/wheatD/agsnp.shtml. PMID:21266061
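As a quick consistency check on the figures quoted in the abstract above, the three per-category SNP counts can be summed and compared with the reported total. The Python sketch below uses only numbers taken from the abstract; the variable names are illustrative and are not part of the AGSNP package.

```python
# Putative SNP counts per category, as quoted in the abstract.
snp_counts = {
    "gene sequences": 195_631,
    "uncharacterized single-copy regions": 155_580,
    "repeat junctions": 145_907,
}
reported_total = 497_118  # total number of SNPs stated in the conclusion

# Sum the categories and compare with the reported total.
total = sum(snp_counts.values())
print(f"sum of categories: {total:,}")
print(f"matches reported total: {total == reported_total}")

# Validation rates (percent) reported for the resequenced sample of putative SNPs.
validation_rates = {
    "gene regions": 84.0,
    "repeat junctions": 88.0,
    "uncharacterized regions": 81.3,
}
for category, rate in validation_rates.items():
    # The share that failed validation is the complement of the validation rate.
    print(f"{category}: {100 - rate:.1f}% not validated")
```

The complement of each validation rate gives a rough per-category estimate of the false positive rate that the resequencing experiment was designed to assess.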
Functional Metagenomics Reveals Previously Unrecognized Diversity of Antibiotic Resistance Genes in Gulls

PubMed Central

Martiny, Adam C.; Martiny, Jennifer B. H.; Weihe, Claudia; Field, Andrew; Ellis, Julie C.

2011-01-01

Wildlife may facilitate the spread of antibiotic resistance (AR) between human-dominated habitats and the surrounding environment. Here, we use functional metagenomics to survey the diversity and genomic context of AR genes in gulls. Using this approach, we found a variety of AR genes not previously detected in gulls and wildlife, including class A and C β-lactamases as well as six tetracycline resistance gene types. An analysis of the flanking sequences indicates that most of these genes are present in Enterobacteriaceae and various Gram-positive bacteria. In addition to finding known gene types, we detected 31 previously undescribed AR genes. These undescribed genes include one most similar to an uncharacterized gene in Verrucomicrobium and another to a putative DNA repair protein in Lactobacillus. Overall, the study more than doubled the number of clinically relevant AR gene types known to be carried by gulls or by wildlife in general. Together with the propensity of gulls to visit human-dominated habitats, this high diversity of AR gene types suggests that gulls could facilitate the spread of AR. PMID:22347872
Global Analysis of the Burkholderia thailandensis Quorum Sensing-Controlled Regulon

PubMed Central

Majerczyk, Charlotte; Brittnacher, Mitchell; Jacobs, Michael; Armour, Christopher D.; Radey, Mathew; Schneider, Emily; Phattarasokul, Somsak; Bunt, Richard

2014-01-01

Burkholderia thailandensis contains three acyl-homoserine lactone quorum sensing circuits and has two additional LuxR homologs. To identify B. thailandensis quorum sensing-controlled genes, we carried out transcriptome sequencing (RNA-seq) analyses of quorum sensing mutants and their parent. The analyses were grounded in the fact that we identified genes coding for factors shown previously to be regulated by quorum sensing among a larger set of quorum-controlled genes. We also found that genes coding for contact-dependent inhibition were induced by quorum sensing and confirmed that specific quorum sensing mutants had a contact-dependent inhibition defect. Additional quorum-controlled genes included those for the production of numerous secondary metabolites, an uncharacterized exopolysaccharide, and a predicted chitin-binding protein. This study provides insights into the roles of the three quorum sensing circuits in the saprophytic lifestyle of B. thailandensis, and it provides a foundation on which to build an understanding of the roles of quorum sensing in the biology of B. thailandensis and the closely related pathogenic Burkholderia pseudomallei and Burkholderia mallei. PMID:24464461
Interactions among tobacco sieve element occlusion (SEO) proteins.

PubMed

Jekat, Stephan B; Ernst, Antonia M; Zielonka, Sascia; Noll, Gundula A; Prüfer, Dirk

2012-12-01

Angiosperms transport their photoassimilates through sieve tubes, which comprise longitudinally-connected sieve elements. In dicots and also some monocots, the sieve elements contain parietal structural proteins known as phloem proteins or P-proteins. Following injury, P-proteins disperse and accumulate as viscous plugs at the sieve plates to prevent the loss of valuable transport sugars. Tobacco (Nicotiana tabacum) P-proteins are multimeric complexes comprising subunits encoded by members of the SEO (sieve element occlusion) gene family. The existence of multiple subunits suggests that P-protein assembly involves interactions between SEO proteins, but this process is largely uncharacterized and it is unclear whether the different subunits perform unique roles or are redundant. We therefore extended our analysis of the tobacco P-proteins NtSEO1 and NtSEO2 to investigate potential interactions between them, and found that both proteins can form homomeric and heteromeric complexes in planta.
Molecular characterization of human ABHD2 as TAG lipase and ester hydrolase

PubMed Central

M., Naresh Kumar; V.B.S.C., Thunuguntla; G.K., Veeramachaneni; B., Chandra Sekhar; Guntupalli, Swapna; J.S., Bondili

2016-01-01

Alterations in lipid metabolism have been progressively documented as a characteristic property of cancer cells. Although the human ABHD2 gene was found to be highly expressed in breast and lung cancers, its biochemical function is as yet uncharacterized. In the present study, we report human ABHD2 as a triacylglycerol (TAG) lipase with ester-hydrolysing capacity. Sequence analysis of ABHD2 revealed the presence of the conserved motifs G(205)XS(207)XG(209) and H(120)XXXXD(125). Phylogenetic analysis showed homology to known lipases such as Drosophila melanogaster CG3488. To evaluate the biochemical role, recombinant ABHD2 was expressed in Saccharomyces cerevisiae using the pYES2/CT vector, and the His-tag purified protein showed TAG lipase activity. Ester hydrolase activity was confirmed with pNP acetate, butyrate and palmitate substrates. Further, an ABHD2 homology model was built and the modelled protein was analysed based on the RMSD and root mean square fluctuation (RMSF) of the 100 ns simulation trajectory. Docking the acetate, butyrate and palmitate ligands with the model confirmed covalent binding of the ligands to Ser(207) of the GXSXG motif. The model was validated with a mutant ABHD2 carrying alanine in place of Ser(207), and the docking studies revealed loss of interaction between the selected ligands and the active site of the mutant protein. Based on the above results, human ABHD2 was identified as a novel TAG lipase and ester hydrolase. PMID:27247428
Molecular characterization of human ABHD2 as TAG lipase and ester hydrolase.

PubMed

M, Naresh Kumar; V B S C, Thunuguntla; G K, Veeramachaneni; B, Chandra Sekhar; Guntupalli, Swapna; J S, Bondili

2016-08-01

Alterations in lipid metabolism have been progressively documented as a characteristic property of cancer cells. Although the human ABHD2 gene was found to be highly expressed in breast and lung cancers, its biochemical function is as yet uncharacterized. In the present study, we report human ABHD2 as a triacylglycerol (TAG) lipase with ester-hydrolysing capacity. Sequence analysis of ABHD2 revealed the presence of the conserved motifs G(205)XS(207)XG(209) and H(120)XXXXD(125). Phylogenetic analysis showed homology to known lipases such as Drosophila melanogaster CG3488. To evaluate the biochemical role, recombinant ABHD2 was expressed in Saccharomyces cerevisiae using the pYES2/CT vector, and the His-tag purified protein showed TAG lipase activity. Ester hydrolase activity was confirmed with pNP acetate, butyrate and palmitate substrates. Further, an ABHD2 homology model was built and the modelled protein was analysed based on the RMSD and root mean square fluctuation (RMSF) of the 100 ns simulation trajectory. Docking the acetate, butyrate and palmitate ligands with the model confirmed covalent binding of the ligands to Ser(207) of the GXSXG motif. The model was validated with a mutant ABHD2 carrying alanine in place of Ser(207), and the docking studies revealed loss of interaction between the selected ligands and the active site of the mutant protein. Based on the above results, human ABHD2 was identified as a novel TAG lipase and ester hydrolase. © 2016 The Author(s).
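The GXSXG and HXXXXD motifs named in the abstract above are simple fixed-spacing patterns, so a scan for them can be sketched with regular expressions. The Python example below is only an illustration: the helper `find_motif` and the toy sequence are hypothetical, and the sequence is not the ABHD2 protein sequence.

```python
import re


def find_motif(sequence, pattern):
    """Return (1-based start, matched text) for every match, including overlapping ones."""
    # Wrapping the pattern in a lookahead lets overlapping occurrences be reported.
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


# "X" in the motif notation means any residue, which is "." in a regular expression.
GXSXG = "G.S.G"   # Gly, any residue, Ser, any residue, Gly
HX4D = "H.{4}D"   # His, any four residues, Asp

# Hypothetical toy sequence for illustration only (not ABHD2).
toy_sequence = "MKHAAAADLLGHSMGAW"

print(find_motif(toy_sequence, GXSXG))
print(find_motif(toy_sequence, HX4D))
```

Positions are reported 1-based to match the residue numbering style used in the abstract, such as Ser(207).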

The structural role of the zinc ion can be dispensable in prokaryotic zinc-finger domains

PubMed Central

Baglivo, Ilaria; Russo, Luigi; Esposito, Sabrina; Malgieri, Gaetano; Renda, Mario; Salluzzo, Antonio; Di Blasio, Benedetto; Isernia, Carla; Fattorusso, Roberto; Pedone, Paolo V.

2009-01-01

The recent characterization of the prokaryotic Cys2His2 zinc-finger domain, identified in Ros protein from Agrobacterium tumefaciens, has demonstrated that, although possessing a similar zinc coordination sphere, this domain is structurally very different from its eukaryotic counterpart. A search in the databases has identified ≈300 homologues with a high sequence identity to the Ros protein, including the amino acids that form the extensive hydrophobic core in Ros. Surprisingly, the Cys2His2 zinc coordination sphere is generally poorly conserved in the Ros homologues, raising the question of whether the zinc ion is always preserved in these proteins. Here, we present a functional and structural study of a point mutant of Ros protein, Ros56–142C82D, in which the second coordinating cysteine is replaced by an aspartate, 5 previously-uncharacterized representative Ros homologues from Mesorhizobium loti, and 2 mutants of the homologues. Our results indicate that the prokaryotic zinc-finger domain, which in Ros protein tetrahedrally coordinates Zn(II) through the typical Cys2His2 coordination, in Ros homologues can either exploit a CysAspHis2 coordination sphere, previously never described in DNA binding zinc finger domains to our knowledge, or lose the metal, while still preserving the DNA-binding activity. We demonstrate that this class of prokaryotic zinc-finger domains is structurally very adaptable, and surprisingly single mutations can transform a zinc-binding domain into a nonzinc-binding domain and vice versa, without affecting the DNA-binding ability. In light of our findings an evolutionary link between the prokaryotic and eukaryotic zinc-finger domains, based on bacteria-to-eukaryota horizontal gene transfer, is discussed. PMID:19369210
Identification and Characterization of FAM124B as a Novel Component of a CHD7 and CHD8 Containing Complex

PubMed Central

Batsukh, Tserendulam; Schulz, Yvonne; Wolf, Stephan; Rabe, Tamara I.; Oellerich, Thomas; Urlaub, Henning; Schaefer, Inga-Marie; Pauli, Silke

2012-01-01

Background Mutations in the chromodomain helicase DNA binding protein 7 gene (CHD7) lead to CHARGE syndrome, an autosomal dominant multiple malformation disorder. Proteins involved in chromatin remodeling typically act in multiprotein complexes. We previously demonstrated that a part of human CHD7 interacts with a part of human CHD8, another chromodomain helicase DNA binding protein presumably being involved in the pathogenesis of neurodevelopmental (NDD) and autism spectrum disorders (ASD). Because identification of novel CHD7 and CHD8 interacting partners will provide further insights into the pathogenesis of CHARGE syndrome and ASD/NDD, we searched for additional associated polypeptides using the method of stable isotope labeling by amino acids in cell culture (SILAC) in combination with mass spectrometry. Principal findings The hitherto uncharacterized FAM124B (Family with sequence similarity 124B) was identified as a potential interaction partner of both CHD7 and CHD8. We confirmed the result by co-immunoprecipitation studies and showed a direct binding to the CHD8 part by direct yeast two hybrid experiments. Furthermore, we characterized FAM124B as a mainly nuclear localized protein with a widespread expression in embryonic and adult mouse tissues. Conclusion Our results demonstrate that FAM124B is a potential interacting partner of a CHD7 and CHD8 containing complex. From the overlapping expression pattern between Chd7 and Fam124B at murine embryonic day E12.5 and the high expression of Fam124B in the developing mouse brain, we conclude that Fam124B is a novel protein possibly involved in the pathogenesis of CHARGE syndrome and neurodevelopmental disorders. PMID:23285124
Genome-Wide Binding Analysis of the Transcription Activator IDEAL PLANT ARCHITECTURE1 Reveals a Complex Network Regulating Rice Plant Architecture

PubMed Central

Lu, Zefu; Yu, Hong; Xiong, Guosheng; Wang, Jing; Jiao, Yongqing; Liu, Guifu; Jing, Yanhui; Meng, Xiangbing; Hu, Xingming; Qian, Qian; Fu, Xiangdong; Wang, Yonghong; Li, Jiayang

2013-01-01

IDEAL PLANT ARCHITECTURE1 (IPA1) is critical in regulating rice (Oryza sativa) plant architecture and substantially enhances grain yield. To elucidate its molecular basis, we first confirmed IPA1 as a functional transcription activator and then identified 1067 and 2185 genes associated with IPA1 binding sites in shoot apices and young panicles, respectively, through chromatin immunoprecipitation sequencing assays. The SQUAMOSA PROMOTER BINDING PROTEIN-box direct binding core motif GTAC was highly enriched in IPA1 binding peaks; interestingly, a previously uncharacterized indirect binding motif TGGGCC/T was found to be significantly enriched through the interaction of IPA1 with proliferating cell nuclear antigen PROMOTER BINDING FACTOR1 or PROMOTER BINDING FACTOR2. Genome-wide expression profiling by RNA sequencing revealed IPA1 roles in diverse pathways. Moreover, our results demonstrated that IPA1 could directly bind to the promoter of rice TEOSINTE BRANCHED1, a negative regulator of tiller bud outgrowth, to suppress rice tillering, and directly and positively regulate DENSE AND ERECT PANICLE1, an important gene regulating panicle architecture, to influence plant height and panicle length. The elucidation of target genes of IPA1 genome-wide will contribute to understanding the molecular mechanisms underlying plant architecture and to facilitating the breeding of elite varieties with ideal plant architecture. PMID:24170127
Functional Diversity of Haloacid Dehalogenase Superfamily Phosphatases from Saccharomyces cerevisiae: BIOCHEMICAL, STRUCTURAL, AND EVOLUTIONARY INSIGHTS.

PubMed

Kuznetsova, Ekaterina; Nocek, Boguslaw; Brown, Greg; Makarova, Kira S; Flick, Robert; Wolf, Yuri I; Khusnutdinova, Anna; Evdokimova, Elena; Jin, Ke; Tan, Kemin; Hanson, Andrew D; Hasnain, Ghulam; Zallot, Rémi; de Crécy-Lagard, Valérie; Babu, Mohan; Savchenko, Alexei; Joachimiak, Andrzej; Edwards, Aled M; Koonin, Eugene V; Yakunin, Alexander F

2015-07-24

The haloacid dehalogenase (HAD)-like enzymes comprise a large superfamily of phosphohydrolases present in all organisms. The Saccharomyces cerevisiae genome encodes at least 19 soluble HADs, including 10 uncharacterized proteins. Here, we biochemically characterized 13 yeast phosphatases from the HAD superfamily, which includes both specific and promiscuous enzymes active against various phosphorylated metabolites and peptides with several HADs implicated in detoxification of phosphorylated compounds and pseudouridine. The crystal structures of four yeast HADs provided insight into their active sites, whereas the structure of the YKR070W dimer in complex with substrate revealed a composite substrate-binding site. Although the S. cerevisiae and Escherichia coli HADs share low sequence similarities, the comparison of their substrate profiles revealed seven phosphatases with common preferred substrates. The cluster of secondary substrates supporting significant activity of both S. cerevisiae and E. coli HADs includes 28 common metabolites that appear to represent the pool of potential activities for the evolution of novel HAD phosphatases. Evolution of novel substrate specificities of HAD phosphatases shows no strict correlation with sequence divergence. Thus, evolution of the HAD superfamily combines the conservation of the overall substrate pool and the substrate profiles of some enzymes with remarkable biochemical and structural flexibility of other superfamily members. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
High-throughput amplification of mature microRNAs in uncharacterized animal models using polyadenylated RNA and stem-loop reverse transcription polymerase chain reaction.

PubMed

Biggar, Kyle K; Wu, Cheng-Wei; Storey, Kenneth B

2014-10-01

This study makes a significant advancement on a microRNA amplification technique previously used for expression analysis and sequencing in animal models without annotated mature microRNA sequences. As research progresses into the post-genomic era of microRNA prediction and analysis, the need for a rapid and cost-effective method for microRNA amplification is critical to facilitate wide-scale analysis of microRNA expression. To facilitate this requirement, we have reoptimized the design of amplification primers and introduced a polyadenylation step to allow amplification of all mature microRNAs from a single RNA sample. Importantly, this method retains the ability to sequence reverse transcription polymerase chain reaction (RT-PCR) products, validating microRNA-specific amplification. Copyright © 2014 Elsevier Inc. All rights reserved.
Genetic mapping of the LOBED LEAF 1 (ClLL1) gene to a 127.6-kb region in watermelon (Citrullus lanatus L.)

PubMed Central

Wei, Chunhua; Chen, Xiner; Wang, Zhongyuan; Liu, Qiyan; Li, Hao; Zhang, Yong; Ma, Jianxiang; Yang, Jianqiang

2017-01-01

The lobed leaf character is a unique morphologic trait in crops, featuring many potential advantages for agricultural productivity. Although the majority of watermelon varieties feature lobed leaves, the genetic factors responsible for lobed leaf formation remain elusive. The F2:3 leaf shape segregating population offers the opportunity to study the underlying mechanism of lobed leaf formation in watermelon. Genetic analysis revealed that a single dominant allele (designated ClLL1) controlled the lobed leaf trait. A large-sized F3:4 population derived from F2:3 individuals was used to map ClLL1. A total of 5,966 reliable SNPs and indels were identified genome-wide via a combination of BSA and RNA-seq. Using the validated SNP and indel markers, the location of ClLL1 was narrowed down to a 127.6-kb region between markers W08314 and W07061, containing 23 putative ORFs. Expression analysis via qRT-PCR revealed differential expression patterns (fold-changes above 2-fold or below 0.5-fold) of three ORFs (ORF3, ORF11, and ORF18) between lobed and non-lobed leaf plants. Based on gene annotation and expression analysis, ORF18 (encoding an uncharacterized protein) and ORF22 (encoding a homeobox-leucine zipper-like protein) were considered as most likely candidate genes. Furthermore, sequence analysis revealed no polymorphisms in cDNA sequences of ORF18; however, two notable deletions were identified in ORF22. This study is the first report to map a leaf shape gene in watermelon and will facilitate cloning and functional characterization of ClLL1 in future studies. PMID:28704497
Genetic mapping of the LOBED LEAF 1 (ClLL1) gene to a 127.6-kb region in watermelon (Citrullus lanatus L.).

PubMed

Wei, Chunhua; Chen, Xiner; Wang, Zhongyuan; Liu, Qiyan; Li, Hao; Zhang, Yong; Ma, Jianxiang; Yang, Jianqiang; Zhang, Xian

2017-01-01

The lobed leaf character is a unique morphologic trait in crops, featuring many potential advantages for agricultural productivity. Although the majority of watermelon varieties feature lobed leaves, the genetic factors responsible for lobed leaf formation remain elusive. The F2:3 leaf shape segregating population offers the opportunity to study the underlying mechanism of lobed leaf formation in watermelon. Genetic analysis revealed that a single dominant allele (designated ClLL1) controlled the lobed leaf trait. A large-sized F3:4 population derived from F2:3 individuals was used to map ClLL1. A total of 5,966 reliable SNPs and indels were identified genome-wide via a combination of BSA and RNA-seq. Using the validated SNP and indel markers, the location of ClLL1 was narrowed down to a 127.6-kb region between markers W08314 and W07061, containing 23 putative ORFs. Expression analysis via qRT-PCR revealed differential expression patterns (fold-changes above 2-fold or below 0.5-fold) of three ORFs (ORF3, ORF11, and ORF18) between lobed and non-lobed leaf plants. Based on gene annotation and expression analysis, ORF18 (encoding an uncharacterized protein) and ORF22 (encoding a homeobox-leucine zipper-like protein) were considered as most likely candidate genes. Furthermore, sequence analysis revealed no polymorphisms in cDNA sequences of ORF18; however, two notable deletions were identified in ORF22. This study is the first report to map a leaf shape gene in watermelon and will facilitate cloning and functional characterization of ClLL1 in future studies.
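The differential-expression criterion quoted in the abstract above (fold-changes above 2-fold or below 0.5-fold) amounts to a simple threshold rule. The Python sketch below shows one way to apply it. The function name, the ORF labels and the fold-change values are hypothetical and are not data from the study.

```python
def classify_fold_change(fold_change, up=2.0, down=0.5):
    """Label a fold-change using the thresholds quoted in the abstract."""
    if fold_change > up:
        return "up"          # more than 2-fold higher
    if fold_change < down:
        return "down"        # less than 0.5-fold, i.e. more than 2-fold lower
    return "unchanged"       # within the 0.5 to 2-fold band


# Hypothetical fold-changes (lobed vs. non-lobed) for illustration only.
toy_fold_changes = {"ORF_a": 3.1, "ORF_b": 0.4, "ORF_c": 1.2}

for orf, fc in toy_fold_changes.items():
    print(orf, fc, classify_fold_change(fc))
```

Values exactly at 2.0 or 0.5 are treated as unchanged here, since the abstract specifies "above" and "below" the thresholds.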
Identification of novel and known oocyte-specific genes using complementary DNA subtraction and microarray analysis in three different species.

PubMed

Vallée, Maud; Gravel, Catherine; Palin, Marie-France; Reghenas, Hélène; Stothard, Paul; Wishart, David S; Sirard, Marc-André

2005-07-01

The main objective of the present study was to identify novel oocyte-specific genes in three different species: bovine, mouse, and Xenopus laevis. To achieve this goal, two powerful technologies were combined: a polymerase chain reaction (PCR)-based cDNA subtraction, and cDNA microarrays. Three subtractive libraries consisting of 3456 clones were established and enriched for oocyte-specific transcripts. Sequencing analysis of the positive insert-containing clones resulted in the following classification: 53% of the clones corresponded to known cDNAs, 26% were classified as uncharacterized cDNAs, and a final 9% were classified as novel sequences. All these clones were used for cDNA microarray preparation. Results from these microarray analyses revealed that in addition to already known oocyte-specific genes, such as GDF9, BMP15, and ZP, known genes with unknown function in the oocyte were identified, such as a MLF1-interacting protein (MLF1IP), B-cell translocation gene 4 (BTG4), and phosphotyrosine-binding protein (xPTB). Furthermore, 15 novel oocyte-specific genes were validated by reverse transcription-PCR to confirm their preferential expression in the oocyte compared to somatic tissues. The results obtained in the present study confirmed that microarray analysis is a robust technique to identify true positives from the suppressive subtractive hybridization experiment. Furthermore, obtaining oocyte-specific genes from three species simultaneously allowed us to look at important genes that are conserved across species. Further characterization of these novel oocyte-specific genes will lead to a better understanding of the molecular mechanisms related to the unique functions found in the oocyte.
Discovery of a novel iota carrageenan sulfatase isolated from the marine bacterium Pseudoalteromonas carrageenovora

NASA Astrophysics Data System (ADS)

Genicot, Sabine; Groisillier, Agnès; Rogniaux, Hélène; Meslet-Cladière, Laurence; Barbeyron, Tristan; Helbert, William

2014-08-01

Carrageenans are sulfated polysaccharides extracted from the cell wall of some marine red algae. These polysaccharides are widely used as gelling, stabilizing, and viscosifying agents in the food and pharmaceutical industries. Since the rheological properties of these polysaccharides depend on their sulfate content, we screened several isolated marine bacteria for carrageenan specific sulfatase activity, in the aim of developing enzymatic bioconversion of carrageenans. As a result of the screening, an iota-carrageenan sulfatase was detected in the cell-free lysate of the marine bacterium Pseudoalteromonas carrageenovora strain PscT. It was purified through Phenyl Sepharose and Diethylaminoethyl Sepharose chromatography. The pure enzyme, Psc ι-CgsA, was characterized. It had a molecular weight of 115.9 kDaltons and exhibited an optimal activity/stability at pH ~8.3 and at 40°C ± 5°C. It was inactivated by phenylmethylsulfonyl fluoride but not by ethylene diamine tetraacetic acid. Psc ι-CgsA specifically catalyzes the hydrolysis of the 4-S sulfate of iota-carrageenan. The purified enzyme could transform iota-carrageenan into hybrid iota-/alpha- or pure alpha-carrageenan under controlled conditions. The gene encoding Psc ι-CgsA, a protein of 1038 amino acids, was cloned into Escherichia coli, and the sequence analysis revealed that Psc ι-CgsA has more than 90% sequence identity with a putative uncharacterized protein Q3IKL4 from the marine strain Pseudoalteromonas haloplanktis TAC 125, but besides this did not share any homology to characterized sulfatases. Phylogenetic studies show that P. carrageenovora sulfatase thus represents the first characterized member of a new sulfatase family, with a C-terminal domain having strong similarity with the superfamily of amidohydrolases, highlighting the still unexplored diversity of marine polysaccharide modifying enzymes.
Discovery of a novel iota carrageenan sulfatase isolated from the marine bacterium Pseudoalteromonas carrageenovora.

PubMed

Genicot, Sabine M; Groisillier, Agnès; Rogniaux, Hélène; Meslet-Cladière, Laurence; Barbeyron, Tristan; Helbert, William

2014-01-01

Carrageenans are sulfated polysaccharides extracted from the cell wall of some marine red algae. These polysaccharides are widely used as gelling, stabilizing, and viscosifying agents in the food and pharmaceutical industries. Since the rheological properties of these polysaccharides depend on their sulfate content, we screened several isolated marine bacteria for carrageenan specific sulfatase activity, in the aim of developing enzymatic bioconversion of carrageenans. As a result of the screening, an iota-carrageenan sulfatase was detected in the cell-free lysate of the marine bacterium Pseudoalteromonas carrageenovora strain Psc(T). It was purified through Phenyl Sepharose and Diethylaminoethyl Sepharose chromatography. The pure enzyme, Psc ι-CgsA, was characterized. It had a molecular weight of 115.9 kDaltons and exhibited an optimal activity/stability at pH ~8.3 and at 40 ± 5°C. It was inactivated by phenylmethylsulfonyl fluoride but not by ethylene diamine tetraacetic acid. Psc ι-CgsA specifically catalyzes the hydrolysis of the 4-S sulfate of iota-carrageenan. The purified enzyme could transform iota-carrageenan into hybrid iota-/alpha- or pure alpha-carrageenan under controlled conditions. The gene encoding Psc ι-CgsA, a protein of 1038 amino acids, was cloned into Escherichia coli, and the sequence analysis revealed that Psc ι-CgsA has more than 90% sequence identity with a putative uncharacterized protein Q3IKL4 from the marine strain Pseudoalteromonas haloplanktis TAC 125, but besides this did not share any homology to characterized sulfatases. Phylogenetic studies show that P. carrageenovora sulfatase thus represents the first characterized member of a new sulfatase family, with a C-terminal domain having strong similarity with the superfamily of amidohydrolases, highlighting the still unexplored diversity of marine polysaccharide modifying enzymes.
Discovery of a novel iota carrageenan sulfatase isolated from the marine bacterium Pseudoalteromonas carrageenovora

PubMed Central

Genicot, Sabine M.; Groisillier, Agnès; Rogniaux, Hélène; Meslet-Cladière, Laurence; Barbeyron, Tristan; Helbert, William

2014-01-01

Carrageenans are sulfated polysaccharides extracted from the cell wall of some marine red algae. These polysaccharides are widely used as gelling, stabilizing, and viscosifying agents in the food and pharmaceutical industries. Since the rheological properties of these polysaccharides depend on their sulfate content, we screened several isolated marine bacteria for carrageenan specific sulfatase activity, in the aim of developing enzymatic bioconversion of carrageenans. As a result of the screening, an iota-carrageenan sulfatase was detected in the cell-free lysate of the marine bacterium Pseudoalteromonas carrageenovora strain PscT. It was purified through Phenyl Sepharose and Diethylaminoethyl Sepharose chromatography. The pure enzyme, Psc ι-CgsA, was characterized. It had a molecular weight of 115.9 kDaltons and exhibited an optimal activity/stability at pH ~8.3 and at 40 ± 5°C. It was inactivated by phenylmethylsulfonyl fluoride but not by ethylene diamine tetraacetic acid. Psc ι-CgsA specifically catalyzes the hydrolysis of the 4-S sulfate of iota-carrageenan. The purified enzyme could transform iota-carrageenan into hybrid iota-/alpha- or pure alpha-carrageenan under controlled conditions. The gene encoding Psc ι-CgsA, a protein of 1038 amino acids, was cloned into Escherichia coli, and the sequence analysis revealed that Psc ι-CgsA has more than 90% sequence identity with a putative uncharacterized protein Q3IKL4 from the marine strain Pseudoalteromonas haloplanktis TAC 125, but besides this did not share any homology to characterized sulfatases. Phylogenetic studies show that P. carrageenovora sulfatase thus represents the first characterized member of a new sulfatase family, with a C-terminal domain having strong similarity with the superfamily of amidohydrolases, highlighting the still unexplored diversity of marine polysaccharide modifying enzymes. PMID:25207269
Identification of new regulators of embryonic patterning and morphogenesis in Xenopus gastrulae by RNA sequencing.

PubMed

Popov, Ivan K; Kwon, Taejoon; Crossman, David K; Crowley, Michael R; Wallingford, John B; Chang, Chenbei

2017-06-15

During early vertebrate embryogenesis, cell fate specification is often coupled with cell acquisition of specific adhesive, polar and/or motile behaviors. In Xenopus gastrulae, tissues fated to form different axial structures display distinct motility. The cells in the early organizer move collectively and directionally toward the animal pole and contribute to anterior mesendoderm, whereas the dorsal and the ventral-posterior trunk tissues surrounding the blastopore of mid-gastrula embryos undergo convergent extension and convergent thickening movements, respectively. While factors regulating cell lineage specification have been described in some detail, the molecular machinery that controls cell motility is not understood in depth. To gain insight into the gene battery that regulates both cell fates and motility in particular embryonic tissues, we performed RNA sequencing (RNA-seq) to investigate differentially expressed genes in the early organizer, the dorsal and the ventral marginal zone of Xenopus gastrulae. We uncovered many known signaling and transcription factors that have been reported to play roles in embryonic patterning during gastrulation. We also identified many uncharacterized genes as well as genes that encoded extracellular matrix (ECM) proteins or potential regulators of actin cytoskeleton. Co-expression of a selected subset of the differentially expressed genes with activin in animal caps revealed that they had distinct ability to block activin-induced animal cap elongation. Most of these factors did not interfere with mesodermal induction by activin, but an ECM protein, EFEMP2, inhibited activin signaling and acted downstream of the activated type I receptor. By focusing on a secreted protein kinase PKDCC1, we showed with overexpression and knockdown experiments that PKDCC1 regulated gastrulation movements as well as anterior neural patterning during early Xenopus development. 
Overall, our studies identify many differentially expressed signaling and cytoskeleton regulators in different embryonic regions of Xenopus gastrulae and imply their functions in regulating cell fates and/or behaviors during gastrulation. Copyright © 2016 Elsevier Inc. All rights reserved.
Extensive Conserved Synteny of Genes between the Karyotypes of Manduca sexta and Bombyx mori Revealed by BAC-FISH Mapping

PubMed Central

Tanaka-Okuyama, Makiko; Shibata, Fukashi; Yoshido, Atsuo; Marec, František; Wu, Chengcang; Zhang, Hongbin; Goldsmith, Marian R.

2009-01-01

Background Genome sequencing projects have been completed for several species representing four highly diverged holometabolous insect orders, Diptera, Hymenoptera, Coleoptera, and Lepidoptera. The striking evolutionary diversity of insects argues a need for efficient methods to apply genome information from such models to genetically uncharacterized species. Constructing conserved synteny maps plays a crucial role in this task. Here, we demonstrate the use of fluorescence in situ hybridization with bacterial artificial chromosome probes as a powerful tool for physical mapping of genes and comparative genome analysis in Lepidoptera, which have numerous and morphologically uniform holokinetic chromosomes. Methodology/Principal Findings We isolated 214 clones containing 159 orthologs of well conserved single-copy genes of a sequenced lepidopteran model, the silkworm, Bombyx mori, from a BAC library of a sphingid with an unexplored genome, the tobacco hornworm, Manduca sexta. We then constructed a BAC-FISH karyotype identifying all 28 chromosomes of M. sexta by mapping 124 loci using the corresponding BAC clones. BAC probes from three M. sexta chromosomes also generated clear signals on the corresponding chromosomes of the convolvulus hawk moth, Agrius convolvuli, which belongs to the same subfamily, Sphinginae, as M. sexta. Conclusions/Significance Comparison of the M. sexta BAC physical map with the linkage map and genome sequence of B. mori pointed to extensive conserved synteny including conserved gene order in most chromosomes. Only a few rearrangements, including three inversions, three translocations, and two fission/fusion events were estimated to have occurred after the divergence of Bombycidae and Sphingidae. These results add to accumulating evidence for the stability of lepidopteran genomes. Generating signals on A. convolvuli chromosomes using heterologous M. sexta probes demonstrated that BAC-FISH with orthologous sequences can be used for karyotyping a wide range of related and genetically uncharacterized species, significantly extending the ability to develop synteny maps for comparative and functional genomics. PMID:19829706
Rapid quantification of mutant fitness in diverse bacteria by sequencing randomly bar-coded transposons

DOE PAGES

Wetmore, Kelly M.; Price, Morgan N.; Waters, Robert J.; ...

2015-05-12

Transposon mutagenesis with next-generation sequencing (TnSeq) is a powerful approach to annotate gene function in bacteria, but existing protocols for TnSeq require laborious preparation of every sample before sequencing. Thus, the existing protocols are not amenable to the throughput necessary to identify phenotypes and functions for the majority of genes in diverse bacteria. Here, we present a method, random bar code transposon-site sequencing (RB-TnSeq), which increases the throughput of mutant fitness profiling by incorporating random DNA bar codes into Tn5 and mariner transposons and by using bar code sequencing (BarSeq) to assay mutant fitness. RB-TnSeq can be used with any transposon, and TnSeq is performed once per organism instead of once per sample. Each BarSeq assay requires only a simple PCR, and 48 to 96 samples can be sequenced on one lane of an Illumina HiSeq system. We demonstrate the reproducibility and biological significance of RB-TnSeq with Escherichia coli, Phaeobacter inhibens, Pseudomonas stutzeri, Shewanella amazonensis, and Shewanella oneidensis. To demonstrate the increased throughput of RB-TnSeq, we performed 387 successful genome-wide mutant fitness assays representing 130 different bacterium-carbon source combinations and identified 5,196 genes with significant phenotypes across the five bacteria. In P. inhibens, we used our mutant fitness data to identify genes important for the utilization of diverse carbon substrates, including a putative D-mannose isomerase that is required for mannitol catabolism. RB-TnSeq will enable the cost-effective functional annotation of diverse bacteria using mutant fitness profiling. A large challenge in microbiology is the functional assessment of the millions of uncharacterized genes identified by genome sequencing. Transposon mutagenesis coupled to next-generation sequencing (TnSeq) is a powerful approach to assign phenotypes and functions to genes. However, the current strategies for TnSeq are too laborious to be applied to hundreds of experimental conditions across multiple bacteria. Here, we describe an approach, random bar code transposon-site sequencing (RB-TnSeq), which greatly simplifies the measurement of gene fitness by using bar code sequencing (BarSeq) to monitor the abundance of mutants. We performed 387 genome-wide fitness assays across five bacteria and identified phenotypes for over 5,000 genes. RB-TnSeq can be applied to diverse bacteria and is a powerful tool to annotate uncharacterized genes using phenotype data.
Rapid quantification of mutant fitness in diverse bacteria by sequencing randomly bar-coded transposons

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wetmore, Kelly M.; Price, Morgan N.; Waters, Robert J.

Transposon mutagenesis with next-generation sequencing (TnSeq) is a powerful approach to annotate gene function in bacteria, but existing protocols for TnSeq require laborious preparation of every sample before sequencing. Thus, the existing protocols are not amenable to the throughput necessary to identify phenotypes and functions for the majority of genes in diverse bacteria. Here, we present a method, random bar code transposon-site sequencing (RB-TnSeq), which increases the throughput of mutant fitness profiling by incorporating random DNA bar codes into Tn5 and mariner transposons and by using bar code sequencing (BarSeq) to assay mutant fitness. RB-TnSeq can be used with any transposon, and TnSeq is performed once per organism instead of once per sample. Each BarSeq assay requires only a simple PCR, and 48 to 96 samples can be sequenced on one lane of an Illumina HiSeq system. We demonstrate the reproducibility and biological significance of RB-TnSeq with Escherichia coli, Phaeobacter inhibens, Pseudomonas stutzeri, Shewanella amazonensis, and Shewanella oneidensis. To demonstrate the increased throughput of RB-TnSeq, we performed 387 successful genome-wide mutant fitness assays representing 130 different bacterium-carbon source combinations and identified 5,196 genes with significant phenotypes across the five bacteria. In P. inhibens, we used our mutant fitness data to identify genes important for the utilization of diverse carbon substrates, including a putative D-mannose isomerase that is required for mannitol catabolism. RB-TnSeq will enable the cost-effective functional annotation of diverse bacteria using mutant fitness profiling. A large challenge in microbiology is the functional assessment of the millions of uncharacterized genes identified by genome sequencing. Transposon mutagenesis coupled to next-generation sequencing (TnSeq) is a powerful approach to assign phenotypes and functions to genes. However, the current strategies for TnSeq are too laborious to be applied to hundreds of experimental conditions across multiple bacteria. Here, we describe an approach, random bar code transposon-site sequencing (RB-TnSeq), which greatly simplifies the measurement of gene fitness by using bar code sequencing (BarSeq) to monitor the abundance of mutants. We performed 387 genome-wide fitness assays across five bacteria and identified phenotypes for over 5,000 genes. RB-TnSeq can be applied to diverse bacteria and is a powerful tool to annotate uncharacterized genes using phenotype data.
Binding ligand prediction for proteins using partial matching of local surface patches.

PubMed

Sael, Lee; Kihara, Daisuke

2010-01-01

Functional elucidation of uncharacterized protein structures is an important task in bioinformatics. We report our new approach for structure-based function prediction which captures local surface features of ligand binding pockets. Function of proteins, specifically, binding ligands of proteins, can be predicted by finding similar local surface regions of known proteins. To enable partial comparison of binding sites in proteins, a weighted bipartite matching algorithm is used to match pairs of surface patches. The surface patches are encoded with the 3D Zernike descriptors. Unlike the existing methods which compare global characteristics of the protein fold or the global pocket shape, the local surface patch method can find functional similarity between non-homologous proteins and binding pockets for flexible ligand molecules. The proposed method improves prediction results over global pocket shape-based method which was previously developed by our group.
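The partial matching step described in this abstract can be illustrated with a minimal sketch. It assumes each surface patch has already been reduced to a numeric descriptor vector (the abstract uses 3D Zernike descriptors; the short vectors below are made-up placeholders), and it uses plain Euclidean distance and a brute-force search over assignments rather than the authors' actual weighting scheme or matching algorithm. The function names `patch_distance` and `best_partial_matching` are hypothetical.

```python
from itertools import permutations
import math


def patch_distance(a, b):
    """Euclidean distance between two patch descriptor vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_partial_matching(query, target):
    """Pair each patch of the smaller pocket with a distinct patch of the larger
    pocket so that the summed distance is minimal.

    Brute force over all assignments, so only suitable for a handful of patches.
    Returns (total_cost, pairs) where pairs are (query_index, target_index).
    """
    swapped = len(query) > len(target)
    small, large = (target, query) if swapped else (query, target)
    best_cost, best_pairs = math.inf, []
    for perm in permutations(range(len(large)), len(small)):
        cost = sum(patch_distance(small[i], large[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_pairs = cost, list(enumerate(perm))
    if swapped:
        # Report pairs as (query_index, target_index) regardless of which was smaller.
        best_pairs = [(j, i) for i, j in best_pairs]
    return best_cost, sorted(best_pairs)


# Placeholder descriptors: two query patches, three target patches.
query_patches = [[0.0, 0.0], [1.0, 1.0]]
target_patches = [[1.0, 1.1], [5.0, 5.0], [0.1, 0.0]]
cost, pairs = best_partial_matching(query_patches, target_patches)
print(pairs, round(cost, 3))
```

Because only the smaller set must be fully matched, one target patch is left unpaired here, which is what makes the comparison partial. For realistic pocket sizes a polynomial-time assignment solver would replace the brute-force loop.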
Binding Ligand Prediction for Proteins Using Partial Matching of Local Surface Patches

PubMed Central

Sael, Lee; Kihara, Daisuke

2010-01-01

Functional elucidation of uncharacterized protein structures is an important task in bioinformatics. We report our new approach for structure-based function prediction which captures local surface features of ligand binding pockets. Function of proteins, specifically, binding ligands of proteins, can be predicted by finding similar local surface regions of known proteins. To enable partial comparison of binding sites in proteins, a weighted bipartite matching algorithm is used to match pairs of surface patches. The surface patches are encoded with the 3D Zernike descriptors. Unlike the existing methods which compare global characteristics of the protein fold or the global pocket shape, the local surface patch method can find functional similarity between non-homologous proteins and binding pockets for flexible ligand molecules. The proposed method improves prediction results over global pocket shape-based method which was previously developed by our group. PMID:21614188
Prolyl hydroxylation regulates protein degradation, synthesis, and splicing in human induced pluripotent stem cell-derived cardiomyocytes.

PubMed

Stoehr, Andrea; Yang, Yanqin; Patel, Sajni; Evangelista, Alicia M; Aponte, Angel; Wang, Guanghui; Liu, Poching; Boylston, Jennifer; Kloner, Philip H; Lin, Yongshun; Gucek, Marjan; Zhu, Jun; Murphy, Elizabeth

2016-06-01

Protein hydroxylases are oxygen- and α-ketoglutarate-dependent enzymes that catalyse hydroxylation of amino acids such as proline, thus linking oxygen and metabolism to enzymatic activity. Prolyl hydroxylation is a dynamic post-translational modification that regulates protein stability and protein-protein interactions; however, the extent of this modification is largely uncharacterized. The goals of this study are to investigate the biological consequences of prolyl hydroxylation and to identify new targets that undergo prolyl hydroxylation in human cardiomyocytes. We used human induced pluripotent stem cell-derived cardiomyocytes in combination with pulse-chase amino acid labelling and proteomics to analyse the effects of prolyl hydroxylation on protein degradation and synthesis. We identified 167 proteins that exhibit differences in degradation with inhibition of prolyl hydroxylation by dimethyloxalylglycine (DMOG); 164 were stabilized. Proteins involved in RNA splicing such as serine/arginine-rich splicing factor 2 (SRSF2) and splicing factor and proline- and glutamine-rich (SFPQ) were stabilized with DMOG. DMOG also decreased protein translation of cytoskeletal and sarcomeric proteins such as α-cardiac actin. We searched the mass spectrometry data for proline hydroxylation and identified 134 high confidence peptides mapping to 78 unique proteins. We identified SRSF2, SFPQ, α-cardiac actin, and cardiac titin as prolyl hydroxylated. We identified 29 prolyl hydroxylated proteins that showed a significant difference in either protein degradation or synthesis. Additionally, we performed next-generation RNA sequencing and showed that the observed decrease in protein synthesis was not due to changes in mRNA levels. Because RNA splicing factors were prolyl hydroxylated, we investigated splicing ± inhibition of prolyl hydroxylation and detected 369 alternative splicing events, with a preponderance of exon skipping. 
This study provides the first extensive characterization of the cardiac prolyl hydroxylome and demonstrates that inhibition of α-ketoglutarate hydroxylases alters protein stability, translation, and splicing. Published by Oxford University Press on behalf of the European Society of Cardiology 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Frequent side chain methyl carbon-oxygen hydrogen bonding in proteins revealed by computational and stereochemical analysis of neutron structures.

PubMed

Yesselman, Joseph D; Horowitz, Scott; Brooks, Charles L; Trievel, Raymond C

2015-03-01

The propensity of backbone Cα atoms to engage in carbon-oxygen (CH···O) hydrogen bonding is well-appreciated in protein structure, but side chain CH···O hydrogen bonding remains largely uncharacterized. The extent to which side chain methyl groups in proteins participate in CH···O hydrogen bonding is examined through a survey of neutron crystal structures, quantum chemistry calculations, and molecular dynamics simulations. Using these approaches, methyl groups were observed to form stabilizing CH···O hydrogen bonds within protein structure that are maintained through protein dynamics and participate in correlated motion. Collectively, these findings illustrate that side chain methyl CH···O hydrogen bonding contributes to the energetics of protein structure and folding. © 2014 Wiley Periodicals, Inc.
A novel protein RLS1 with NB-ARM domains is involved in chloroplast degradation during leaf senescence in rice.

PubMed

Jiao, Bin-Bin; Wang, Jian-Jun; Zhu, Xu-Dong; Zeng, Long-Jun; Li, Qun; He, Zu-Hua

2012-01-01

Leaf senescence, a type of programmed cell death (PCD) characterized by chlorophyll degradation, is important to plant growth and crop productivity. It emerges that autophagy is involved in chloroplast degradation during leaf senescence. However, the molecular mechanism(s) involved in the process is not well understood. In this study, the genetic and physiological characteristics of the rice rls1 (rapid leaf senescence 1) mutant were identified. The rls1 mutant developed small, yellow-brown lesions resembling disease scattered over the whole surfaces of leaves that displayed earlier senescence than those of wild-type plants. The rapid loss of chlorophyll content during senescence was the main cause of accelerated leaf senescence in rls1. Microscopic observation indicated that PCD was misregulated, probably resulting in the accelerated degradation of chloroplasts in rls1 leaves. Map-based cloning of the RLS1 gene revealed that it encodes a previously uncharacterized NB (nucleotide-binding site)-containing protein with an ARM (armadillo) domain at the carboxyl terminus. Consistent with its involvement in leaf senescence, RLS1 was up-regulated during dark-induced leaf senescence and down-regulated by cytokinin. Intriguingly, constitutive expression of RLS1 also slightly accelerated leaf senescence with decreased chlorophyll content in transgenic rice plants. Our study identified a previously uncharacterized NB-ARM protein involved in PCD during plant growth and development, providing a unique tool for dissecting possible autophagy-mediated PCD during senescence in plants.

A Global Survey of ATPase Activity in Plasmodium falciparum Asexual Blood Stages and Gametocytes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ortega, Corrie; Frando, Andrew; Webb-Robertson, Bobbie-Jo

Effective malaria control and elimination in hyperendemic areas of the world will require treatment of disease-causing Plasmodium falciparum (Pf) blood stage infection but also blocking parasite transmission from humans to mosquito to prevent disease spread. Numerous antimalarial drugs have become ineffective due to parasite drug resistance and many currently used therapies do not kill gametocytes, highly specialized sexual parasite stages with distinct physiology that are necessary for transmission from the human host to the mosquito vector. Further confounding next generation drug development against Pf is the lack of known biochemical activity for most parasite gene products as well as the unknown metabolic needs of non-replicating gametocytes. Here, we take a systematic activity-based proteomics approach to survey the large and druggable ATPase family that is associated with replicating blood stage asexual parasites and transmissible gametocytes. We experimentally confirm existing annotation and predict ATPase function for 38 uncharacterized proteins. ATPase activity broadly changes during the transition from asexual schizonts to gametocytes, indicating altered metabolism and regulatory roles of ATPases specific for each lifecycle stage. By mapping the activity of ATPases associated with gametocytogenesis, we assign biochemical activity to a large number of uncharacterized proteins and identify new candidate transmission blocking targets.
Plum Pox Virus 6K1 Protein Is Required for Viral Replication and Targets the Viral Replication Complex at the Early Stage of Infection

PubMed Central

Cui, Hongguang

2016-01-01

ABSTRACT The potyviral RNA genome encodes two polyproteins that are proteolytically processed by three viral protease domains into 11 mature proteins. Extensive molecular studies have identified functions for the majority of the viral proteins. For example, 6K2, one of the two smallest potyviral proteins, is an integral membrane protein and induces the endoplasmic reticulum (ER)-originated replication vesicles that target the chloroplast for robust viral replication. However, the functional role of 6K1, the other smallest protein, remains uncharacterized. In this study, we developed a series of recombinant full-length viral cDNA clones derived from a Canadian Plum pox virus (PPV) isolate. We found that deletion of any of the short motifs of 6K1 (each of which ranged from 5 to 13 amino acids), most of the 6K1 sequence (but with the conserved sequence of the cleavage sites being retained), or all of the 6K1 sequence in the PPV infectious clone abolished viral replication. The trans expression of 6K1 or the cis expression of a dislocated 6K1 failed to rescue the loss-of-replication phenotype, suggesting the temporal and spatial requirement of 6K1 for viral replication. Disruption of the N- or C-terminal cleavage site of 6K1, which prevented the release of 6K1 from the polyprotein, either partially or completely inhibited viral replication, suggesting the functional importance of the mature 6K1. We further found that green fluorescent protein-tagged 6K1 formed punctate inclusions at the viral early infection stage and colocalized with chloroplast-bound viral replicase elements 6K2 and NIb. Taken together, our results suggest that 6K1 is required for viral replication and is an important viral element of the viral replication complex at the early infection stage. IMPORTANCE Potyviruses account for more than 30% of known plant viruses and consist of many agriculturally important viruses. 
The genomes of potyviruses encode two polyproteins that are proteolytically processed into 11 mature proteins, with the majority of them having been at least partially functionally characterized. However, the functional role of a small protein named 6K1 remains obscure. In this study, we showed that deletion of 6K1 or a short motif/region of 6K1 in the full-length cDNA clones of plum pox virus abolishes viral replication and that mutation of the N- or C-terminal cleavage sites of 6K1 to prevent its release from the polyprotein greatly attenuates or completely inhibits viral replication, suggesting its important role in potyviral infection. We report that 6K1 forms punctate structures and targets the replication vesicles in PPV-infected plant leaf cells at the early infection stage. Our data reveal that 6K1 is an important viral protein of the potyviral replication complex. PMID:26962227
Aquimarina salinaria sp. nov., a novel algicidal bacterium isolated from a saltpan.

PubMed

Chen, Wen-Ming; Sheu, Fu-Sian; Sheu, Shih-Yi

2012-02-01

A bacterial strain designated antisso-27(T), previously isolated from saltpan in Taiwan while screening for bacteria for algicidal activity, was characterized using the polyphasic taxonomic approach. Strain antisso-27(T) was Gram-negative, aerobic, brownish yellow colored, rod-shaped, non-flagellated and non-gliding. Phylogenetic analyses based on 16S rRNA gene sequences showed that strain antisso-27(T) belonged to the genus Aquimarina within the family Flavobacteriaceae with relatively low sequence similarities of 94.0-96.6% to other valid Aquimarina spp. It contained iso-C(17:0) 3-OH, iso-C(15:0), iso-C(16:0), iso-C(15:1) and iso-C(15:0) 3-OH as the main fatty acids and contained a menaquinone with six isoprene units (MK-6) as the major isoprenoid quinone. Major polar lipids were phosphatidylethanolamine, diphosphatidylglycerol, an uncharacterized aminolipid and five uncharacterized phospholipids. Strain antisso-27(T) employed direct mode of algicidal lysis to Chlorella vulgaris strain 211-31; nevertheless, it released an algicidal substance against M. aeruginosa strain MTY01. This is the first study that the Aquimarina species possesses both direct and indirect algicidal activities. On the basis of the phylogenetic and phenotypic data, strain antisso-27(T) should be classified as representing a novel species, for which the name A. salinaria sp. nov. is proposed. The type strain is A. salinaria antisso-27(T) (= BCRC 80080(T) = LMG 25375(T)).
A traveling salesman approach for predicting protein functions.

PubMed

Johnson, Olin; Liu, Jing

2006-10-12

Protein-protein interaction information can be used to predict unknown protein functions and to help study biological pathways. Here we present a new approach utilizing the classic Traveling Salesman Problem to study the protein-protein interactions and to predict protein functions in budding yeast Saccharomyces cerevisiae. We apply the global optimization tool from combinatorial optimization algorithms to cluster the yeast proteins based on the global protein interaction information. We then use this clustering information to help us predict protein functions. We use our algorithm together with the direct neighbor algorithm [1] on characterized proteins and compare the prediction accuracy of the two methods. We show our algorithm can produce better predictions than the direct neighbor algorithm, which only considers the immediate neighbors of the query protein. Our method is a promising one to be used as a general tool to predict functions of uncharacterized proteins and a successful sample of using computer science knowledge and algorithms to study biological problems.
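The direct neighbor baseline that this abstract compares against can be sketched as a simple majority vote over the annotations of a protein's immediate interaction partners. This is an illustrative reading of "only considers the immediate neighbors of the query protein", not the authors' Traveling Salesman clustering method and not necessarily the exact scoring used in the cited baseline. The function name `predict_by_direct_neighbors`, the protein identifiers, and the function labels are all hypothetical.

```python
from collections import Counter


def predict_by_direct_neighbors(protein, interactions, annotations, top_k=1):
    """Rank candidate functions for `protein` by how many of its direct
    interaction partners carry each annotation.

    interactions: iterable of (protein_a, protein_b) pairs (undirected).
    annotations:  dict mapping protein -> set of known function labels.
    """
    neighbors = set()
    for a, b in interactions:
        if a == protein:
            neighbors.add(b)
        if b == protein:
            neighbors.add(a)
    neighbors.discard(protein)  # ignore self-interactions

    votes = Counter()
    for n in neighbors:
        votes.update(annotations.get(n, ()))

    # Most frequent label first; ties broken alphabetically for determinism.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [label for label, _ in ranked[:top_k]]


# Made-up toy network: QUERY interacts with A, B and C.
toy_interactions = [("QUERY", "A"), ("B", "QUERY"), ("C", "QUERY"), ("A", "B")]
toy_annotations = {"A": {"transport"}, "B": {"transport", "kinase"}, "C": {"transport"}}
print(predict_by_direct_neighbors("QUERY", toy_interactions, toy_annotations, top_k=2))
```

A vote like this uses no information beyond one hop in the network, which is the limitation the abstract attributes to the baseline.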
A traveling salesman approach for predicting protein functions

PubMed Central

Johnson, Olin; Liu, Jing

2006-01-01

Background Protein-protein interaction information can be used to predict unknown protein functions and to help study biological pathways. Results Here we present a new approach utilizing the classic Traveling Salesman Problem to study the protein-protein interactions and to predict protein functions in budding yeast Saccharomyces cerevisiae. We apply the global optimization tool from combinatorial optimization algorithms to cluster the yeast proteins based on the global protein interaction information. We then use this clustering information to help us predict protein functions. We use our algorithm together with the direct neighbor algorithm [1] on characterized proteins and compare the prediction accuracy of the two methods. We show our algorithm can produce better predictions than the direct neighbor algorithm, which only considers the immediate neighbors of the query protein. Conclusion Our method is a promising one to be used as a general tool to predict functions of uncharacterized proteins and a successful sample of using computer science knowledge and algorithms to study biological problems. PMID:17147783
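The abstract describes the direct neighbor baseline only as a method that considers the immediate neighbors of the query protein. A minimal sketch of one common reading of that idea, a majority vote over the annotations of direct interaction partners, is given below. The function name, the toy network and the annotations are all hypothetical, and the cited algorithm [1] may differ in its details; this is not the authors' Traveling Salesman method.

```python
from collections import Counter

def predict_by_direct_neighbors(query, interactions, annotations, top_k=1):
    """Rank candidate functions by how many direct neighbors of `query` carry them.

    interactions: iterable of (protein_a, protein_b) pairs, treated as undirected.
    annotations:  dict mapping protein -> set of known function labels.
    """
    # Collect the immediate interaction partners of the query protein.
    neighbors = set()
    for a, b in interactions:
        if a == query and b != query:
            neighbors.add(b)
        elif b == query and a != query:
            neighbors.add(a)

    # Each annotated neighbor casts one vote per function it is known to have.
    votes = Counter()
    for neighbor in neighbors:
        votes.update(annotations.get(neighbor, ()))

    # Highest vote count first; ties broken alphabetically for determinism.
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return [function for function, _ in ranked[:top_k]]

# Toy, made-up interaction network (assumption: not data from the paper).
interactions = [("P1", "Q"), ("P2", "Q"), ("Q", "P3"), ("P3", "P4")]
annotations = {
    "P1": {"transport"},
    "P2": {"transport", "signaling"},
    "P3": {"transport"},
    "P4": {"metabolism"},
}

prediction = predict_by_direct_neighbors("Q", interactions, annotations, top_k=2)
print(prediction)
```

Because only immediate partners vote, the annotation of P4 (two steps away from Q) never contributes, which is the limitation the abstract contrasts with clustering on global interaction information.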
Nasopharyngeal teratoma, congenital diaphragmatic hernia and Dandy-Walker malformation - a yet uncharacterized syndrome.

PubMed

Gupta, N; Shastri, S; Singh, P K; Jana, M; Mridha, A; Verma, G; Kabra, M

2016-11-01

An association of congenital diaphragmatic hernia, Dandy-Walker malformation and nasopharyngeal teratoma is very rare. Here, we report a fourth case with this association, in which chromosomal microarray and whole exome sequencing (WES) were performed to understand the underlying genetic basis. The finding of a few variants, especially a novel variation in HIRA, provided some insights. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Proteomic study of differential protein expression in mouse lung tissues after aerosolized ricin poisoning.

PubMed

Guo, Zhendong; Han, Chao; Du, Jiajun; Zhao, Siyan; Fu, Yingying; Zheng, Guanyu; Sun, Yucheng; Zhang, Yi; Liu, Wensen; Wan, Jiayu; Qian, Jun; Liu, Linna

2014-04-28

Ricin is one of the most poisonous natural toxins from plants and is classified as a Class B biological threat pathogen by the Centers for Disease Control and Prevention (CDC) of U.S.A. Ricin exposure can occur through oral or aerosol routes. Ricin poisoning has a rapid onset and a short incubation period. There is no effective treatment for ricin poisoning. In this study, an aerosolized ricin-exposed mouse model was developed and the pathology was investigated. The protein expression profile in the ricin-poisoned mouse lung tissue was analyzed using proteomic techniques to determine the proteins that were closely related to the toxicity of ricin. 2D gel electrophoresis, mass spectrometry and subsequent biological functional analysis revealed that six proteins including Apoa1 apolipoprotein, Ywhaz 14-3-3 protein, Prdx6 Uncharacterized Protein, Selenium-binding protein 1, HMGB1, and DPYL-2, were highly related to ricin poisoning.
Isolation and Identification of Genes Activating Uas2-Dependent Adh2 Expression in Saccharomyces Cerevisiae

PubMed Central

Donoviel, M. S.; Young, E. T.

1996-01-01

Two cis-acting elements have been identified that act synergistically to regulate expression of the glucose-repressed alcohol dehydrogenase 2 (ADH2) gene. UAS1 is bound by the trans-activator Adr1p. UAS2 is thought to be the binding site for an unidentified regulatory protein. A genetic selection based on a UAS2-dependent ADH2 reporter was devised to isolate genes capable of activating UAS2-dependent transcription. One set of UAS2-dependent genes contained SPT6/CRE2/SSN20. Multicopy SPT6 caused improper expression of chromosomal ADH2. A second set of UAS2-dependent clones contained a previously uncharacterized open reading frame designated MEU1 (Multicopy Enhancer of UAS2). A frame shift mutation in MEU1 abolished its ability to activate UAS2-dependent gene expression. Multicopy MEU1 expression suppressed the constitutive ADH2 expression caused by cre2-1. Disruption of MEU1 reduced endogenous ADH2 expression about twofold but had no effect on cell viability or growth. No homologues of MEU1 were identified by low-stringency Southern hybridization of yeast genomic DNA, and no significant homologues were found in the sequence data bases. A MEU1/β-gal fusion protein was not localized to a particular region of the cell. MEU1 is linked to PPR1 on chromosome XII. PMID:8807288
Global Analysis of Photosynthesis Transcriptional Regulatory Networks

PubMed Central

Imam, Saheed; Noguera, Daniel R.; Donohue, Timothy J.

2014-01-01

Photosynthesis is a crucial biological process that depends on the interplay of many components. This work analyzed the gene targets for 4 transcription factors: FnrL, PrrA, CrpK and MppG (RSP_2888), which are known or predicted to control photosynthesis in Rhodobacter sphaeroides. Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) identified 52 operons under direct control of FnrL, illustrating its regulatory role in photosynthesis, iron homeostasis, nitrogen metabolism and regulation of sRNA synthesis. Using global gene expression analysis combined with ChIP-seq, we mapped the regulons of PrrA, CrpK and MppG. PrrA regulates ∼34 operons encoding mainly photosynthesis and electron transport functions, while CrpK, a previously uncharacterized Crp-family protein, regulates genes involved in photosynthesis and maintenance of iron homeostasis. Furthermore, CrpK and FnrL share similar DNA binding determinants, possibly explaining our observation of the ability of CrpK to partially compensate for the growth defects of a ΔFnrL mutant. We show that the Rrf2 family protein, MppG, plays an important role in photopigment biosynthesis, as part of an incoherent feed-forward loop with PrrA. Our results reveal a previously unrealized, high degree of combinatorial regulation of photosynthetic genes and significant cross-talk between their transcriptional regulators, while illustrating previously unidentified links between photosynthesis and the maintenance of iron homeostasis. PMID:25503406
Helminths and Cancers From the Evolutionary Perspective.

PubMed

Scholte, Larissa L S; Pascoal-Xavier, Marcelo A; Nahum, Laila A

2018-01-01

Helminths include free-living and parasitic Platyhelminthes and Nematoda which infect millions of people worldwide. Some Platyhelminthes species of blood flukes (Schistosoma haematobium, Schistosoma japonicum, and Schistosoma mansoni) and liver flukes (Clonorchis sinensis and Opisthorchis viverrini) are known to be involved in human cancers. Other helminths are likely to be carcinogenic. Our main goals are to summarize the current knowledge of human cancers caused by Platyhelminthes, point out some helminth and human biomarkers identified so far, and highlight the potential contributions of phylogenetics and molecular evolution to cancer research. Human cancers caused by helminth infection include cholangiocarcinoma, colorectal hepatocellular carcinoma, squamous cell carcinoma, and urinary bladder cancer. Chronic inflammation is proposed as a common pathway for cancer initiation and development. Furthermore, different bacteria present in gastric, colorectal, and urogenital microbiomes might be responsible for enlarging inflammatory and fibrotic responses in cancers. Studies have suggested that different biomarkers are involved in helminth infection and human cancer development, although the detailed mechanisms remain under debate. Different helminth proteins have been studied by different approaches. However, their evolutionary relationships remain unsolved. Here, we illustrate the strengths of homology identification and function prediction of uncharacterized proteins from genome sequencing projects based on an evolutionary framework. Together, these approaches may help identify new biomarkers for disease diagnostics and intervention measures. This work has potential applications in the field of phylomedicine (evolutionary medicine) and may contribute to parasite and cancer research.
Revisiting the TALE repeat.

PubMed

Deng, Dong; Yan, Chuangye; Wu, Jianping; Pan, Xiaojing; Yan, Nieng

2014-04-01

Transcription activator-like (TAL) effectors specifically bind to double stranded (ds) DNA through a central domain of tandem repeats. Each TAL effector (TALE) repeat comprises 33-35 amino acids and recognizes one specific DNA base through a highly variable residue at a fixed position in the repeat. Structural studies have revealed the molecular basis of DNA recognition by TALE repeats. Examination of the overall structure reveals that the basic building block of TALE protein, namely a helical hairpin, is one-helix shifted from the previously defined TALE motif. Here we wish to suggest a structure-based re-demarcation of the TALE repeat which starts with the residues that bind to the DNA backbone phosphate and concludes with the base-recognition hyper-variable residue. This new numbering system is consistent with the α-solenoid superfamily to which TALE belongs, and reflects the structural integrity of TAL effectors. In addition, it confers integral number of TALE repeats that matches the number of bound DNA bases. We then present fifteen crystal structures of engineered dHax3 variants in complex with target DNA molecules, which elucidate the structural basis for the recognition of bases adenine (A) and guanine (G) by reported or uncharacterized TALE codes. Finally, we analyzed the sequence-structure correlation of the amino acid residues within a TALE repeat. The structural analyses reported here may advance the mechanistic understanding of TALE proteins and facilitate the design of TALEN with improved affinity and specificity.
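The one-repeat-to-one-base relationship described above can be illustrated with a small lookup. The sketch below assumes the commonly reported preferences of the hyper-variable residue pair (the repeat-variable diresidue, RVD) from the general TALE literature; these mappings are not stated in the abstract, and the function name and repeat array are hypothetical.

```python
# Commonly reported RVD-to-base preferences (assumption: drawn from the general
# TALE literature, not from the abstract above).
# IUPAC ambiguity codes are used where an RVD accepts several bases:
# R = A or G, N = any base.
RVD_TO_BASE = {
    "NI": "A",
    "HD": "C",
    "NG": "T",
    "NN": "R",
    "NS": "N",
}

def predicted_target(rvds):
    """Return the predicted DNA target, one base per TALE repeat."""
    bases = []
    for position, rvd in enumerate(rvds, start=1):
        if rvd not in RVD_TO_BASE:
            # Do not guess a base for codes that are absent from the table.
            raise ValueError(f"repeat {position}: no preference recorded for RVD {rvd!r}")
        bases.append(RVD_TO_BASE[rvd])
    return "".join(bases)

repeats = ["NI", "HD", "NG", "NN", "HD"]  # hypothetical repeat array
target = predicted_target(repeats)
print(target)
```

The output has exactly as many bases as there are repeats, which mirrors the point made in the abstract that an integral number of repeats should match the number of bound DNA bases.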
Global analysis of photosynthesis transcriptional regulatory networks.

PubMed

Imam, Saheed; Noguera, Daniel R; Donohue, Timothy J

2014-12-01

Photosynthesis is a crucial biological process that depends on the interplay of many components. This work analyzed the gene targets for 4 transcription factors: FnrL, PrrA, CrpK and MppG (RSP_2888), which are known or predicted to control photosynthesis in Rhodobacter sphaeroides. Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) identified 52 operons under direct control of FnrL, illustrating its regulatory role in photosynthesis, iron homeostasis, nitrogen metabolism and regulation of sRNA synthesis. Using global gene expression analysis combined with ChIP-seq, we mapped the regulons of PrrA, CrpK and MppG. PrrA regulates ∼34 operons encoding mainly photosynthesis and electron transport functions, while CrpK, a previously uncharacterized Crp-family protein, regulates genes involved in photosynthesis and maintenance of iron homeostasis. Furthermore, CrpK and FnrL share similar DNA binding determinants, possibly explaining our observation of the ability of CrpK to partially compensate for the growth defects of a ΔFnrL mutant. We show that the Rrf2 family protein, MppG, plays an important role in photopigment biosynthesis, as part of an incoherent feed-forward loop with PrrA. Our results reveal a previously unrealized, high degree of combinatorial regulation of photosynthetic genes and significant cross-talk between their transcriptional regulators, while illustrating previously unidentified links between photosynthesis and the maintenance of iron homeostasis.
Integration of deep transcriptome and proteome analyses reveals the components of alkaloid metabolism in opium poppy cell cultures

PubMed Central

2010-01-01

Background Papaver somniferum (opium poppy) is the source for several pharmaceutical benzylisoquinoline alkaloids including morphine, codeine and sanguinarine. In response to treatment with a fungal elicitor, the biosynthesis and accumulation of sanguinarine is induced along with other plant defense responses in opium poppy cell cultures. The transcriptional induction of alkaloid metabolism in cultured cells provides an opportunity to identify components of this process via the integration of deep transcriptome and proteome databases generated using next-generation technologies. Results A cDNA library was prepared for opium poppy cell cultures treated with a fungal elicitor for 10 h. Using 454 GS-FLX Titanium pyrosequencing, 427,369 expressed sequence tags (ESTs) with an average length of 462 bp were generated. Assembly of these sequences yielded 93,723 unigenes, of which 23,753 were assigned Gene Ontology annotations. Transcripts encoding all known sanguinarine biosynthetic enzymes were identified in the EST database, 5 of which were represented among the 50 most abundant transcripts. Liquid chromatography-tandem mass spectrometry (LC-MS/MS) of total protein extracts from cell cultures treated with a fungal elicitor for 50 h facilitated the identification of 1,004 proteins. Proteins were fractionated by one-dimensional SDS-PAGE and digested with trypsin prior to LC-MS/MS analysis. Query of an opium poppy-specific EST database substantially enhanced peptide identification. Eight out of 10 known sanguinarine biosynthetic enzymes and many relevant primary metabolic enzymes were represented in the peptide database. Conclusions The integration of deep transcriptome and proteome analyses provides an effective platform to catalogue the components of secondary metabolism, and to identify genes encoding uncharacterized enzymes.
The establishment of corresponding transcript and protein databases generated by next-generation technologies in a system with a well-defined metabolite profile facilitates an improved linkage between genes, enzymes, and pathway components. The proteome database represents the most relevant alkaloid-producing enzymes, compared with the much deeper and more complete transcriptome library. The transcript database contained full-length mRNAs encoding most alkaloid biosynthetic enzymes, which is a key requirement for the functional characterization of novel gene candidates. PMID:21083930
The centrosomin CM2 domain is a multi-functional binding domain with distinct cell cycle roles.

PubMed

Citron, Y Rose; Fagerstrom, Carey J; Keszthelyi, Bettina; Huang, Bo; Rusan, Nasser M; Kelly, Mark J S; Agard, David A

2018-01-01

The centrosome serves as the main microtubule-organizing center in metazoan cells, yet despite its functional importance, little is known mechanistically about the structure and organizational principles that dictate protein organization in the centrosome. In particular, the protein-protein interactions that allow for the massive structural transition between the tightly organized interphase centrosome and the highly expanded matrix-like arrangement of the mitotic centrosome have been largely uncharacterized. Among the proteins that undergo a major transition is the Drosophila melanogaster protein centrosomin that contains a conserved carboxyl terminus motif, CM2. Recent crystal structures have shown this motif to be dimeric and capable of forming an intramolecular interaction with a central region of centrosomin. Here we use a combination of in-cell microscopy and in vitro oligomer assessment to show that dimerization is not necessary for CM2 recruitment to the centrosome and that CM2 alone undergoes significant cell cycle dependent rearrangement. We use NMR binding assays to confirm this intramolecular interaction and show that residues involved in solution are consistent with the published crystal structure and identify L1137 as critical for binding. Additionally, we show for the first time an in vitro interaction of CM2 with the Drosophila pericentrin-like-protein that exploits the same set of residues as the intramolecular interaction. Furthermore, NMR experiments reveal a calcium sensitive interaction between CM2 and calmodulin. Although unexpected because of sequence divergence, this suggests that centrosomin-mediated assemblies, like the mammalian pericentrin, may be calcium regulated. 
From these results, we suggest an expanded model where during interphase CM2 interacts with pericentrin-like-protein to form a layer of centrosomin around the centriole wall and that at the onset of mitosis this population acts as a nucleation site of intramolecular centrosomin interactions that support the expansion into the metaphase matrix.
High-quality de novo assembly of the apple genome and methylome dynamics of early fruit development.

PubMed

Daccord, Nicolas; Celton, Jean-Marc; Linsmith, Gareth; Becker, Claude; Choisne, Nathalie; Schijlen, Elio; van de Geest, Henri; Bianco, Luca; Micheletti, Diego; Velasco, Riccardo; Di Pierro, Erica Adele; Gouzy, Jérôme; Rees, D Jasper G; Guérif, Philippe; Muranty, Hélène; Durel, Charles-Eric; Laurens, François; Lespinasse, Yves; Gaillard, Sylvain; Aubourg, Sébastien; Quesneville, Hadi; Weigel, Detlef; van de Weg, Eric; Troggio, Michela; Bucher, Etienne

2017-07-01

Using the latest sequencing and optical mapping technologies, we have produced a high-quality de novo assembly of the apple (Malus domestica Borkh.) genome. Repeat sequences, which represented over half of the assembly, provided an unprecedented opportunity to investigate the uncharacterized regions of a tree genome; we identified a new hyper-repetitive retrotransposon sequence that was over-represented in heterochromatic regions and estimated that a major burst of different transposable elements (TEs) occurred 21 million years ago. Notably, the timing of this TE burst coincided with the uplift of the Tian Shan mountains, which is thought to be the center of the location where the apple originated, suggesting that TEs and associated processes may have contributed to the diversification of the apple ancestor and possibly to its divergence from pear. Finally, genome-wide DNA methylation data suggest that epigenetic marks may contribute to agronomically relevant aspects, such as apple fruit development.
Numerous uncharacterized and highly divergent microbes which colonize humans are revealed by circulating cell-free DNA

PubMed Central

Camunas-Soler, Joan; Kertesz, Michael; De Vlaminck, Iwijn; Koh, Winston; Pan, Wenying; Martin, Lance; Neff, Norma F.; Okamoto, Jennifer; Wong, Ronald J.; Kharbanda, Sandhya; El-Sayed, Yasser; Blumenfeld, Yair; Stevenson, David K.; Shaw, Gary M.; Wolfe, Nathan D.; Quake, Stephen R.

2017-01-01

Blood circulates throughout the human body and contains molecules drawn from virtually every tissue, including the microbes and viruses which colonize the body. Through massive shotgun sequencing of circulating cell-free DNA from the blood, we identified hundreds of new bacteria and viruses which represent previously unidentified members of the human microbiome. Analyzing cumulative sequence data from 1,351 blood samples collected from 188 patients enabled us to assemble 7,190 contiguous regions (contigs) larger than 1 kbp, of which 3,761 are novel with little or no sequence homology in any existing databases. The vast majority of these novel contigs possess coding sequences, and we have validated their existence both by finding their presence in independent experiments and by performing direct PCR amplification. When their nearest neighbors are located in the tree of life, many of the organisms represent entirely novel taxa, showing that microbial diversity within the human body is substantially broader than previously appreciated. PMID:28830999
The Legionella pneumophila Dot/Icm-secreted Effector PlcC/CegC1 Together with PlcA and PlcB Promotes Virulence and Belongs to a Novel Zinc Metallophospholipase C Family Present in Bacteria and Fungi*

PubMed Central

Aurass, Philipp; Schlegel, Maren; Metwally, Omar; Harding, Clare R.; Schroeder, Gunnar N.; Frankel, Gad; Flieger, Antje

2013-01-01

Legionella pneumophila is a water-borne bacterium that causes pneumonia in humans. PlcA and PlcB are two previously defined L. pneumophila proteins with homology to the phosphatidylcholine-specific phospholipase C (PC-PLC) of Pseudomonas fluorescens. Additionally, we found that Lpg0012 shows similarity to PLCs and has been shown to be a Dot/Icm-injected effector, CegC1, which is designated here as PlcC. It remained unclear, however, whether these L. pneumophila proteins exhibit PLC activity. PlcC expressed in Escherichia coli hydrolyzed a broad phospholipid spectrum, including PC, phosphatidylglycerol (PG), and phosphatidylinositol. The addition of Zn2+ ions activated, whereas EDTA inhibited, PlcC-derived PLC activity. Protein homology search revealed that the three Legionella enzymes and P. fluorescens PC-PLC share conserved domains also present in uncharacterized fungal proteins. Fifteen conserved amino acids were essential for enzyme activity as identified via PlcC mutagenesis. Analysis of defined L. pneumophila knock-out mutants indicated Lsp-dependent export of PG-hydrolyzing PLC activity. PlcA and PlcB exhibited PG-specific activity and contain a predicted Sec signal sequence. In line with the reported requirement of host cell contact for Dot/Icm-dependent effector translocation, PlcC showed cell-associated PC-specific PLC activity after bacterial growth in broth. A PLC triple mutant, but not single or double mutants, exhibited reduced host killing in a Galleria mellonella infection model, highlighting the importance of the three PLCs in pathogenesis. In summary, we describe here a novel Zn2+-dependent PLC family present in Legionella, Pseudomonas, and fungi with broad substrate preference and function in virulence. PMID:23457299
A phylogenomic analysis of the Actinomycetales mce operons

PubMed Central

Casali, Nicola; Riley, Lee W

2007-01-01

Background The genome of Mycobacterium tuberculosis harbors four copies of a cluster of genes termed mce operons. Despite extensive research that has demonstrated the importance of these operons on infection outcome, their physiological function remains obscure. Expanding databases of complete microbial genome sequences facilitate a comparative genomic approach that can provide valuable insight into the role of uncharacterized proteins. Results The M. tuberculosis mce loci each include two yrbE and six mce genes, which have homology to ABC transporter permeases and substrate-binding proteins, respectively. Operons with an identical structure were identified in all Mycobacterium species examined, as well as in five other Actinomycetales genera. Some of the Actinomycetales mce operons include an mkl gene, which encodes an ATPase resembling those of ABC uptake transporters. The phylogenetic profile of Mkl orthologs exactly matched that of the Mce and YrbE proteins. Through topology and motif analyses of YrbE homologs, we identified a region within the penultimate cytoplasmic loop that may serve as the site of interaction with the putative cognate Mkl ATPase. Homologs of the exported proteins encoded adjacent to the M. tuberculosis mce operons were detected in a conserved chromosomal location downstream of the majority of Actinomycetales operons. Operons containing linked mkl, yrbE and mce genes, resembling the classic organization of an ABC importer, were found to be common in Gram-negative bacteria and appear to be associated with changes in properties of the cell surface. Conclusion Evidence presented suggests that the mce operons of Actinomycetales species and related operons in Gram-negative bacteria encode a subfamily of ABC uptake transporters with a possible role in remodeling the cell envelope. PMID:17324287
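The phylogenetic-profile comparison mentioned in the Results can be sketched as follows: each gene family is reduced to a presence/absence vector across a fixed list of genomes, and families whose vectors are identical to the query are reported. The genome labels, the presence data and the helper names are invented for illustration and are not taken from the study.

```python
def profile(present_in, genomes):
    """Presence/absence vector of a gene family across an ordered genome list."""
    return tuple(genome in present_in for genome in genomes)

def families_matching(query, presence, genomes):
    """Names of families whose profile is identical to that of `query`."""
    query_profile = profile(presence[query], genomes)
    return sorted(
        name
        for name, present_in in presence.items()
        if name != query and profile(present_in, genomes) == query_profile
    )

# Hypothetical genomes and ortholog presence calls (assumption: toy data).
genomes = ["G1", "G2", "G3", "G4", "G5"]
presence = {
    "mkl": {"G1", "G2", "G4"},
    "mce": {"G1", "G2", "G4"},
    "yrbE": {"G1", "G2", "G4"},
    "unrelated": {"G1", "G3"},
}

matches = families_matching("mkl", presence, genomes)
print(matches)
```

Identical profiles are only suggestive of a functional link; in the abstract the inference is supported further by operon structure and motif analysis.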
Comparative Life Cycle Transcriptomics Revises Leishmania mexicana Genome Annotation and Links a Chromosome Duplication with Parasitism of Vertebrates

PubMed Central

Fiebig, Michael; Kelly, Steven; Gluenz, Eva

2015-01-01

Leishmania spp. are protozoan parasites that have two principal life cycle stages: the motile promastigote forms that live in the alimentary tract of the sandfly and the amastigote forms, which are adapted to survive and replicate in the harsh conditions of the phagolysosome of mammalian macrophages. Here, we used Illumina sequencing of poly-A selected RNA to characterise and compare the transcriptomes of L. mexicana promastigotes, axenic amastigotes and intracellular amastigotes. These data allowed the production of the first transcriptome evidence-based annotation of gene models for this species, including genome-wide mapping of trans-splice sites and poly-A addition sites. The revised genome annotation encompassed 9,169 protein-coding genes including 936 novel genes as well as modifications to previously existing gene models. Comparative analysis of gene expression across promastigote and amastigote forms revealed that 3,832 genes are differentially expressed between promastigotes and intracellular amastigotes. A large proportion of genes that were downregulated during differentiation to amastigotes were associated with the function of the motile flagellum. In contrast, those genes that were upregulated included cell surface proteins, transporters, peptidases and many uncharacterized genes, including 293 of the 936 novel genes. Genome-wide distribution analysis of the differentially expressed genes revealed that the tetraploid chromosome 30 is highly enriched for genes that were upregulated in amastigotes, providing the first evidence of a link between this whole chromosome duplication event and adaptation to the vertebrate host in this group. Peptide evidence for 42 proteins encoded by novel transcripts supports the idea of an as yet uncharacterised set of small proteins in Leishmania spp. with possible implications for host-pathogen interactions. PMID:26452044
Anti-angiogenic activities of snake venom CRISP isolated from Echis carinatus sochureki.

PubMed

Lecht, Shimon; Chiaverelli, Rachel A; Gerstenhaber, Jonathan; Calvete, Juan J; Lazarovici, Philip; Casewell, Nicholas R; Harrison, Robert; Lelkes, Peter I; Marcinkiewicz, Cezary

2015-06-01

Cysteine-rich secretory protein (CRISP) is present in the majority of vertebrates, including humans. The physiological role of this protein is not characterized. We report that a CRISP isolated from Echis carinatus sochureki venom (ES-CRISP) inhibits angiogenesis. The anti-angiogenic activity of purified ES-CRISP from snake venom was investigated in vitro using endothelial cell assays such as proliferation, migration and tube formation in Matrigel, as well as in vivo in the quail embryonic CAM system. The modulatory effect of ES-CRISP on the expression of major angiogenesis factors and activation of angiogenesis pathways was tested by qRT-PCR and Western blot. The amino acid sequence of ES-CRISP was found to be highly similar to other members of this snake venom protein family, and shares over 50% identity with human CRISP-3. ES-CRISP supported adhesion to endothelial cells, although it was also internalized into the cytoplasm in a granule-like manner. It blocked EC proliferation, migration and tube formation in Matrigel. In the embryonic quail CAM system, ES-CRISP abolished the neovascularization process induced by exogenous growth factors (bFGF, vpVEGF) and by developing gliomas. CRISP modulates the expression of several factors at the mRNA level, which were characterized as regulators of angiogenesis, and blocked activation of MAPK Erk1/2 induced by VEGF. ES-CRISP was characterized as a negative regulator of angiogenesis, by direct interaction with endothelial cells. The presented work may lead to the development of novel angiostatic therapy, as well as contribute to the identification of the physiological relevance of this functionally uncharacterized protein. Copyright © 2015 Elsevier B.V. All rights reserved.

Secretory expression of Lentinula edodes intracellular laccase by yeast high-cell-density system: sub-milligram production of difficult-to-express secretory protein.

PubMed

Kurose, Takeshi; Saito, Yuta; Kimata, Koichi; Nakagawa, Yuko; Yano, Akira; Ito, Keisuke; Kawarasaki, Yasuaki

2014-06-01

While a number of heterologous expression systems have been reported for extracellular laccases, there are few for their intracellular counterparts. The Lentinula edodes intracellular laccase Lcc4 is an enzyme with industrial potential owing to its unique substrate specificity. The heterologous production of the intracellular laccase, however, had been difficult because of its expression-dependent toxicity. We previously demonstrated that recombinant yeast cells synthesized and, interestingly, secreted Lcc4 only when they were suspended in an inducing medium at a high cell density (J. Biosci. Bioeng., 113, 154-159, 2012). The high cell-density system was versatile and applicable to other difficult-to-express secretory proteins. Nevertheless, the system's great dependence on aeration, which was a practical obstacle to scale-up production of the enzyme and some other proteins, left the secretion pathway and enzymatic properties of the Lcc4 uncharacterized. In this report, we demonstrate a successful production of Lcc4 by applying a jar-fermentor to the high cell-density system. The elevated yield (0.6 mg L(-1)) due to the sufficient aeration allowed us to prepare and purify the enzyme to homogeneity. The enzyme had been secreted as a hyper-glycosylated protein, resulting in smear band-formations in SDS-PAGE. The amino acid sequencing analysis suggested that the N-terminal 17 residues had been recognized as a secretion signal. The recombinant enzyme showed similar enzymatic properties to the naturally occurring Lcc4. The characteristics of the scaled-up expression system, which include helpful information for potential users, have also been described. Copyright © 2013 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.
The Legionella pneumophila Dot/Icm-secreted effector PlcC/CegC1 together with PlcA and PlcB promotes virulence and belongs to a novel zinc metallophospholipase C family present in bacteria and fungi.

PubMed

Aurass, Philipp; Schlegel, Maren; Metwally, Omar; Harding, Clare R; Schroeder, Gunnar N; Frankel, Gad; Flieger, Antje

2013-04-19

Legionella pneumophila is a water-borne bacterium that causes pneumonia in humans. PlcA and PlcB are two previously defined L. pneumophila proteins with homology to the phosphatidylcholine-specific phospholipase C (PC-PLC) of Pseudomonas fluorescens. Additionally, we found that Lpg0012 shows similarity to PLCs and has been shown to be a Dot/Icm-injected effector, CegC1, which is designated here as PlcC. It remained unclear, however, whether these L. pneumophila proteins exhibit PLC activity. PlcC expressed in Escherichia coli hydrolyzed a broad phospholipid spectrum, including PC, phosphatidylglycerol (PG), and phosphatidylinositol. The addition of Zn(2+) ions activated, whereas EDTA inhibited, PlcC-derived PLC activity. Protein homology search revealed that the three Legionella enzymes and P. fluorescens PC-PLC share conserved domains also present in uncharacterized fungal proteins. Fifteen conserved amino acids were essential for enzyme activity as identified via PlcC mutagenesis. Analysis of defined L. pneumophila knock-out mutants indicated Lsp-dependent export of PG-hydrolyzing PLC activity. PlcA and PlcB exhibited PG-specific activity and contain a predicted Sec signal sequence. In line with the reported requirement of host cell contact for Dot/Icm-dependent effector translocation, PlcC showed cell-associated PC-specific PLC activity after bacterial growth in broth. A PLC triple mutant, but not single or double mutants, exhibited reduced host killing in a Galleria mellonella infection model, highlighting the importance of the three PLCs in pathogenesis. In summary, we describe here a novel Zn(2+)-dependent PLC family present in Legionella, Pseudomonas, and fungi with broad substrate preference and function in virulence.
The conserved apicomplexan Aurora kinase TgArk3 is involved in endodyogeny, duplication rate and parasite virulence

PubMed Central

Morlon-Guyot, Juliette; Bordat, Yann; Lebrun, Maryse; Gubbels, Marc-Jan; Doerig, Christian; Daher, Wassim

2016-01-01

Aurora kinases are eukaryotic serine/threonine protein kinases that regulate key events associated with chromatin condensation, centrosome and spindle function, and cytokinesis. Elucidating the roles of Aurora kinases in apicomplexan parasites is crucial to understand the cell cycle control during Plasmodium schizogony or Toxoplasma endodyogeny. Here, we report on the localization of two previously uncharacterized Toxoplasma Aurora-related kinases (Ark2 and Ark3) in tachyzoites and of the uncharacterized Ark3 orthologue in Plasmodium falciparum erythrocytic stages. In T. gondii, we show that TgArk2 and TgArk3 concentrate at specific sub-cellular structures linked to parasite division: the mitotic spindle and intranuclear mitotic structures (TgArk2), and the outer core of the centrosome and the budding daughter cells cytoskeleton (TgArk3). By tagging the endogenous PfArk3 gene with the green fluorescent protein (GFP) in live parasites, we show that PfArk3 protein expression peaks late in schizogony and localizes at the periphery of budding schizonts. Disruption of the TgArk2 gene reveals no essential function for tachyzoite propagation in vitro, which is surprising given that the P. falciparum and P. berghei orthologues are essential for erythrocyte schizogony. In contrast, knock-down of TgArk3 protein results in pronounced defects in parasite division and a major growth deficiency. TgArk3-depleted parasites display several defects, such as reduced parasite growth rate, delayed egress and parasite duplication, defect in rosette formation, reduced parasite size and invasion efficiency and lack of virulence in mice. Our study provides new insights into cell cycle control in Toxoplasma and malaria parasites, and highlights Aurora kinase 3 as a potential drug target. PMID:26833682
Resolving the Complexity of Human Skin Metagenomes Using Single-Molecule Sequencing

PubMed Central

Tsai, Yu-Chih; Deming, Clayton; Segre, Julia A.; Kong, Heidi H.; Korlach, Jonas

2016-01-01

Deep metagenomic shotgun sequencing has emerged as a powerful tool to interrogate composition and function of complex microbial communities. Computational approaches to assemble genome fragments have been demonstrated to be an effective tool for de novo reconstruction of genomes from these communities. However, the resultant “genomes” are typically fragmented and incomplete due to the limited ability of short-read sequence data to assemble complex or low-coverage regions. Here, we use single-molecule, real-time (SMRT) sequencing to reconstruct a high-quality, closed genome of a previously uncharacterized Corynebacterium simulans and its companion bacteriophage from a skin metagenomic sample. Considerable improvement in assembly quality occurs in hybrid approaches incorporating short-read data, with even relatively small amounts of long-read data being sufficient to improve metagenome reconstruction. Using short-read data to evaluate strain variation of this C. simulans in its skin community at single-nucleotide resolution, we observed a dominant C. simulans strain with moderate allelic heterozygosity throughout the population. We demonstrate the utility of SMRT sequencing and hybrid approaches in metagenome quantitation, reconstruction, and annotation. PMID:26861018
An IgaA/UmoB Family Protein from Serratia marcescens Regulates Motility, Capsular Polysaccharide Biosynthesis, and Secondary Metabolite Production.

PubMed

Stella, Nicholas A; Brothers, Kimberly M; Callaghan, Jake D; Passerini, Angelina M; Sigindere, Cihad; Hill, Preston J; Liu, Xinyu; Wozniak, Daniel J; Shanks, Robert M Q

2018-03-15

Secondary metabolites are an important source of pharmaceuticals and key modulators of microbe-microbe interactions. The bacterium Serratia marcescens is part of the Enterobacteriaceae family of eubacteria and produces a number of biologically active secondary metabolites. In this study, we screened for novel regulators of secondary metabolites synthesized by a clinical isolate of S. marcescens and found mutations in a gene for an uncharacterized UmoB/IgaA family member here named gumB. Mutation of gumB conferred a severe loss of the secondary metabolites prodigiosin and serratamolide. The gumB mutation conferred pleiotropic phenotypes, including altered biofilm formation, highly increased capsular polysaccharide production, and loss of swimming and swarming motility. These phenotypes corresponded to transcriptional changes in fimA, wecA, and flhD. Unlike other UmoB/IgaA family members, gumB was found to be not essential for growth in S. marcescens, yet igaA from Salmonella enterica, yrfF from Escherichia coli, and an uncharacterized predicted ortholog from Klebsiella pneumoniae complemented the gumB mutant secondary metabolite defects, suggesting highly conserved function. These data support the idea that UmoB/IgaA family proteins are functionally conserved and extend the known regulatory influence of UmoB/IgaA family proteins to the control of competition-associated secondary metabolites and biofilm formation. IMPORTANCE IgaA/UmoB family proteins are found in members of the Enterobacteriaceae family of bacteria, which are of environmental and public health importance. IgaA/UmoB family proteins are thought to be inner membrane proteins that report extracellular stresses to intracellular signaling pathways that respond to environmental challenge. This study introduces a new member of the IgaA/UmoB family and demonstrates a high degree of functional similarity between IgaA/UmoB family proteins. Moreover, this study extends the phenomena controlled by IgaA/UmoB family proteins to include the biosynthesis of antimicrobial secondary metabolites. Copyright © 2018 American Society for Microbiology.
Quantitative Profiling Identifies Potential Regulatory Proteins Involved in Development from Dauer Stage to L4 Stage in Caenorhabditis elegans.

PubMed

Kim, Sunhee; Lee, Hyoung-Joo; Hahm, Jeong-Hoon; Jeong, Seul-Ki; Park, Don-Ha; Hancock, William S; Paik, Young-Ki

2016-02-05

When Caenorhabditis elegans encounters unfavorable growth conditions, it enters the dauer stage, an alternative L3 developmental period. A dauer larva resumes larval development to the normal L4 stage by uncharacterized postdauer reprogramming (PDR) when growth conditions become more favorable. During this transition period, certain heterochronic genes involved in controlling the proper sequence of developmental events are known to act, with their mutations suppressing the Muv (multivulva) phenotype in C. elegans. To identify the specific proteins in which the Muv phenotype is highly suppressed, quantitative proteomic analysis with iTRAQ labeling of samples obtained from worms at L1 + 30 h (for continuous development [CD]) and dauer recovery +3 h (for postdauer development [PD]) was carried out to detect changes in protein abundance in the CD and PD states of both N2 and lin-28(n719). Of the 1661 unique proteins identified with a < 1% false discovery rate at the peptide level, we selected 58 proteins exhibiting ≥2-fold up-regulation or ≥2-fold down-regulation in the PD state and analyzed the Gene Ontology terms. RNAi assays against 15 selected up-regulated genes showed that seven genes were predicted to be involved in higher Muv phenotype (p < 0.05) in lin-28(n719), which is not seen in N2. Specifically, two genes, K08H10.1 and W05H9.1, displayed not only the highest rate (%) of Muv phenotype in the RNAi assay but also the dauer-specific mRNA expression, indicating that these genes may be required for PDR, leading to the very early onset of dauer recovery. Thus, our proteomic approach identifies and quantitates the regulatory proteins potentially involved in PDR in C. elegans, which safeguards the overall lifecycle in response to environmental changes.
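The ≥2-fold selection step described in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how the ratios were computed or filtered, so the ratio definition (PD abundance divided by CD abundance), the function name `select_regulated`, and the protein labels are all hypothetical and serve only to show the threshold logic.

```python
def select_regulated(ratios, fold=2.0):
    """Split proteins into up- and down-regulated sets by a fold-change cutoff.

    `ratios` maps a protein label to an assumed PD/CD abundance ratio.
    A protein is "up" if ratio >= fold and "down" if ratio <= 1/fold.
    """
    up = {name: r for name, r in ratios.items() if r >= fold}
    down = {name: r for name, r in ratios.items() if 0 < r <= 1.0 / fold}
    return up, down


# Hypothetical toy ratios, not data from the study.
toy_ratios = {
    "protein_A": 2.5,   # at least 2-fold up
    "protein_B": 0.4,   # at least 2-fold down
    "protein_C": 1.3,   # within the cutoff, not selected
    "protein_D": 0.5,   # exactly 2-fold down, selected
}

up, down = select_regulated(toy_ratios)
print(sorted(up), sorted(down))
```

A symmetric cutoff (ratio ≥ 2 or ≤ 0.5) is one common reading of "≥2-fold up- or down-regulation"; a real analysis would typically also account for replicate variance.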
Megadalton Complexes in the Chloroplast Stroma of Arabidopsis thaliana Characterized by Size Exclusion Chromatography, Mass Spectrometry, and Hierarchical Clustering

PubMed Central

Olinares, Paul Dominic B.; Ponnala, Lalit; van Wijk, Klaas J.

2010-01-01

To characterize MDa-sized macromolecular chloroplast stroma protein assemblies and to extend coverage of the chloroplast stroma proteome, we fractionated soluble chloroplast stroma in the non-denatured state by size exclusion chromatography with a size separation range up to ∼5 MDa. To maximize protein complex stability and resolution of megadalton complexes, ionic strength and composition were optimized. Subsequent high accuracy tandem mass spectrometry analysis (LTQ-Orbitrap) identified 1081 proteins across the complete native mass range. Protein complexes and assembly states above 0.8 MDa were resolved using hierarchical clustering, and protein heat maps were generated from normalized protein spectral counts for each of the size exclusion chromatography fractions; this complemented previous analysis of stromal complexes up to 0.8 MDa (Peltier, J. B., Cai, Y., Sun, Q., Zabrouskov, V., Giacomelli, L., Rudella, A., Ytterberg, A. J., Rutschow, H., and van Wijk, K. J. (2006) The oligomeric stromal proteome of Arabidopsis thaliana chloroplasts. Mol. Cell. Proteomics 5, 114–133). These combined experimental and bioinformatics analyses resolved chloroplast ribosomes in different assembly and functional states (e.g. 30, 50, and 70 S), which enabled the identification of plastid homologues of prokaryotic ribosome assembly factors as well as proteins involved in co-translational modifications, targeting, and folding. The roles of these ribosome-associating proteins will be discussed. Known RNA splice factors (e.g. CAF1/WTF1/RNC1) as well as uncharacterized proteins with RNA-binding domains (pentatricopeptide repeat, RNA recognition motif, and chloroplast ribosome maturation), RNases, and DEAD box helicases were found in various sized complexes. Chloroplast DNA (>3 MDa) was found in association with the complete heteromeric plastid-encoded DNA polymerase complex, and a dozen other DNA-binding proteins, e.g. DNA gyrase, topoisomerase, and various DNA repair enzymes. The heteromeric ≥5-MDa pyruvate dehydrogenase complex and the 0.8–1-MDa acetyl-CoA carboxylase complex associated with uncharacterized biotin carboxyl carrier domain proteins constitute the entry point to fatty acid metabolism in leaves; we suggest that their large size relates to the need for metabolic channeling. Protein annotations and identification data are available through the Plant Proteomics Database, and mass spectrometry data are available through Proteomics Identifications database. PMID:20423899
Direct Repeat Unit (dru) Typing of Methicillin-Resistant Staphylococcus pseudintermedius from Dogs and Cats.

PubMed

Kadlec, Kristina; Schwarz, Stefan; Goering, Richard V; Weese, J Scott

2015-12-01

Methicillin-resistant Staphylococcus pseudintermedius (MRSP) has emerged in a remarkable manner as an important problem in dogs and cats. However, limited molecular epidemiological information is available. The aims of this study were to apply direct repeat unit (dru) typing in a large collection of well-characterized MRSP isolates and to use dru typing to analyze a collection of previously uncharacterized MRSP isolates. Two collections of MRSP isolates from dogs and cats were included in this study. The first collection comprised 115 well-characterized MRSP isolates from North America and Europe. The data for these isolates included multilocus sequence typing (MLST) and staphylococcal protein A gene (spa) typing results as well as SmaI macrorestriction patterns after pulsed-field gel electrophoresis (PFGE). The second collection was a convenience sample of 360 isolates from North America. The dru region was amplified by PCR, sequenced, and analyzed. For the first collection, the discriminatory indices of the typing methods were calculated. All isolates were successfully dru typed. The discriminatory power for dru typing (D = 0.423) was comparable to that of spa typing (D = 0.445) and of MLST (D = 0.417) in the first collection. Occasionally, dru typing was able to further discriminate between isolates that shared the same spa type. Among all 475 isolates, 26 different dru types were identified, with 2 predominant types (dt9a and dt11a) among 349 (73.4%) isolates. The results of this study underline that dru typing is a useful tool for MRSP typing, being an objective, standardized, sequence-based method that is relatively cost-efficient and easy to perform. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
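The abstract reports discriminatory indices (D) for each typing method without stating the formula. A minimal Python sketch follows, assuming the widely used Simpson's index of diversity in the form popularized by Hunter and Gaston for typing schemes, D = 1 − Σ n_j(n_j − 1) / (N(N − 1)); whether the authors used exactly this form is an assumption. The function name and type labels are hypothetical.

```python
from collections import Counter


def discriminatory_index(type_labels):
    """Simpson's index of diversity for a list of per-isolate type labels.

    D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)),
    where n_j is the number of isolates of type j and N is the total.
    D is the probability that two isolates drawn without replacement
    belong to different types.
    """
    counts = Counter(type_labels)
    n_total = sum(counts.values())
    if n_total < 2:
        raise ValueError("need at least two isolates")
    same_type_pairs = sum(n * (n - 1) for n in counts.values())
    return 1.0 - same_type_pairs / (n_total * (n_total - 1))


# Hypothetical toy collection: 6, 3 and 1 isolates of three placeholder types.
toy_types = ["type_A"] * 6 + ["type_B"] * 3 + ["type_C"]
print(round(discriminatory_index(toy_types), 3))  # 1 - 36/90 = 0.6
```

Under this definition, a collection dominated by one or two types, as with the two predominant dru types reported above, yields a low D, while a collection in which every isolate has a distinct type gives D = 1.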
Retroviral insertions in the VISION database identify molecular pathways in mouse lymphoid leukemia and lymphoma

PubMed Central

Weiser, Keith C.; Liu, Bin; Hansen, Gwenn M.; Skapura, Darlene; Hentges, Kathryn E.; Yarlagadda, Sujatha; Morse III, Herbert C.

2007-01-01

AKXD recombinant inbred (RI) strains develop a variety of leukemias and lymphomas due to somatically acquired insertions of retroviral DNA into the genome of hematopoietic cells that can mutate cellular proto-oncogenes and tumor suppressor genes. We generated a new set of tumors from nine AKXD RI strains selected for their propensity to develop B-cell tumors, the most common type of human hematopoietic cancers. We employed a PCR technique called viral insertion site amplification (VISA) to rapidly isolate genomic sequence at the site of provirus insertion. Here we describe 550 VISA sequence tags (VSTs) that identify 74 common insertion sites (CISs), of which 21 have not been identified previously. Several suspected proto-oncogenes and tumor suppressor genes lie near CISs, providing supportive evidence for their roles in cancer. Furthermore, numerous previously uncharacterized genes lie near CISs, providing a pool of candidate disease genes for future research. Pathway analysis of candidate genes identified several signaling pathways as common and powerful routes to blood cancer, including Notch, E-protein, NFκB, and Ras signaling. Misregulation of several Notch signaling genes was confirmed by quantitative RT-PCR. Our data suggest that analyses of insertional mutagenesis on a single genetic background are biased toward the identification of cooperating mutations. This tumor collection represents the most comprehensive study of the genetics of B-cell leukemia and lymphoma development in mice. We have deposited the VST sequences, CISs in a genome viewer, histopathology, and molecular tumor typing data in a public web database called VISION (Viral Insertion Sites Identifying Oncogenes), which is located at http://www.mouse-genome.bcm.tmc.edu/vision. PMID:17926094
Retroviral insertions in the VISION database identify molecular pathways in mouse lymphoid leukemia and lymphoma.

PubMed

Weiser, Keith C; Liu, Bin; Hansen, Gwenn M; Skapura, Darlene; Hentges, Kathryn E; Yarlagadda, Sujatha; Morse III, Herbert C; Justice, Monica J

2007-10-01

AKXD recombinant inbred (RI) strains develop a variety of leukemias and lymphomas due to somatically acquired insertions of retroviral DNA into the genome of hematopoietic cells that can mutate cellular proto-oncogenes and tumor suppressor genes. We generated a new set of tumors from nine AKXD RI strains selected for their propensity to develop B-cell tumors, the most common type of human hematopoietic cancers. We employed a PCR technique called viral insertion site amplification (VISA) to rapidly isolate genomic sequence at the site of provirus insertion. Here we describe 550 VISA sequence tags (VSTs) that identify 74 common insertion sites (CISs), of which 21 have not been identified previously. Several suspected proto-oncogenes and tumor suppressor genes lie near CISs, providing supportive evidence for their roles in cancer. Furthermore, numerous previously uncharacterized genes lie near CISs, providing a pool of candidate disease genes for future research. Pathway analysis of candidate genes identified several signaling pathways as common and powerful routes to blood cancer, including Notch, E-protein, NFkappaB, and Ras signaling. Misregulation of several Notch signaling genes was confirmed by quantitative RT-PCR. Our data suggest that analyses of insertional mutagenesis on a single genetic background are biased toward the identification of cooperating mutations. This tumor collection represents the most comprehensive study of the genetics of B-cell leukemia and lymphoma development in mice. We have deposited the VST sequences, CISs in a genome viewer, histopathology, and molecular tumor typing data in a public web database called VISION (Viral Insertion Sites Identifying Oncogenes), which is located at http://www.mouse-genome.bcm.tmc.edu/vision.
Genome-wide exploration of silicon (Si) transporter genes, Lsi1 and Lsi2 in plants; insights into Si-accumulation status/capacity of plants.

PubMed

Vatansever, Recep; Ozyigit, Ibrahim Ilker; Filiz, Ertugrul; Gozukirmizi, Nermin

2017-04-01

Silicon (Si) is a nonessential, beneficial micronutrient for plants. It increases plant stress tolerance in relation to its accumulation capacity. In this work, root Si transporter genes were characterized in 17 different plants and used to infer the plants' Si-accumulation status. A total of 62 Si transporter genes (31 Lsi1 and 31 Lsi2) were identified in the studied plants. Lsi1s were 261-324-residue proteins with a MIP family domain, whereas Lsi2s were 472-547 residues long with a citrate transporter family domain. Lsi1s possessed characteristic sequence features that can be employed as a benchmark in predicting the Si-accumulation status/capacity of plants. Silicic acid selectivity in Lsi1s was associated with two highly conserved NPA (Asn-Pro-Ala) motifs and a Gly-Ser-Gly-Arg (GSGR) ar/R filter. Two NPA regions were present in all Lsi1 members, but in some the Ala was substituted with Ser or Val. The GSGR filter was present only in the proposed high and moderate Si accumulators. In the phylogeny, Lsi1s formed three clusters (low, moderate and high Si accumulators) based on tree topology and presence of the GSGR filter. Low accumulators contained the filters WIGR, AIGR, FAAR, WVAR and AVAR; high accumulators contained only the GSGR filter; and moderate accumulators mostly contained GSGR, but some had A/CSGR filters. A positive correlation was also observed between sequence homology and Si-accumulation status of the tested plants. Thus, the presence of the GSGR selectivity filter and the degree of sequence homology could be used as signatures in predicting Si-accumulation status in experimentally uncharacterized plants. Moreover, interaction partner and expression profile analyses implicated the involvement of Si transporters in plant stress tolerance.
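The filter-to-status mapping reported in this abstract can be written as a small lookup. The sketch below is only an illustration in Python: the function name and return labels are hypothetical, "A/CSGR" is read as the two filters ASGR and CSGR, and, because the abstract says GSGR occurs in both high and moderate accumulators, the filter alone cannot separate those two classes.

```python
# ar/R selectivity filters listed in the abstract for low accumulators.
LOW_FILTERS = {"WIGR", "AIGR", "FAAR", "WVAR", "AVAR"}
# "A/CSGR" interpreted as ASGR or CSGR (seen in some moderate accumulators).
MODERATE_ONLY_FILTERS = {"ASGR", "CSGR"}


def si_accumulator_class(arr_filter):
    """Map a four-residue Lsi1 ar/R filter to a proposed Si-accumulation class.

    This encodes only the filter rule from the abstract; the authors also
    used tree topology and sequence homology, which are not modeled here.
    """
    residues = arr_filter.strip().upper()
    if residues == "GSGR":
        return "high or moderate"
    if residues in MODERATE_ONLY_FILTERS:
        return "moderate"
    if residues in LOW_FILTERS:
        return "low"
    return "unclassified"


for example in ["GSGR", "CSGR", "WVAR", "NNNN"]:
    print(example, "->", si_accumulator_class(example))
```

Any filter not named in the abstract is returned as "unclassified" rather than guessed.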
Exome Sequencing Identified a Splice Site Mutation in FHL1 that Causes Uruguay Syndrome, an X-Linked Disorder With Skeletal Muscle Hypertrophy and Premature Cardiac Death.

PubMed

Xue, Yuan; Schoser, Benedikt; Rao, Aliz R; Quadrelli, Roberto; Vaglio, Alicia; Rupp, Verena; Beichler, Christine; Nelson, Stanley F; Schapacher-Tilp, Gudrun; Windpassinger, Christian; Wilcox, William R

2016-04-01

Previously, we reported a rare X-linked disorder, Uruguay syndrome, in a single family. The main features are pugilistic facies, skeletal deformities, and muscular hypertrophy despite a lack of exercise and cardiac ventricular hypertrophy leading to premature death. An ≈19 Mb critical region on the X chromosome was identified through identity-by-descent analysis of 3 affected males. Exome sequencing was conducted on one affected male to identify the disease-causing gene and variant. A splice site variant (c.502-2A>G) in the FHL1 gene was highly suspicious among other candidate genes and variants. FHL1A is the predominant isoform of FHL1 in cardiac and skeletal muscle. Sequencing cDNA showed the splice site variant led to skipping of exon 6 of the FHL1A isoform, equivalent to the FHL1C isoform. Targeted analysis showed that this splice site variant cosegregated with disease in the family. Western blot and immunohistochemical analysis of muscle from the proband showed a significant decrease in protein expression of FHL1A. Real-time polymerase chain reaction analysis of different isoforms of FHL1 demonstrated that FHL1C is markedly increased. Mutations in the FHL1 gene have been reported in disorders with skeletal and cardiac myopathy but none has the skeletal or facial phenotype seen in patients with Uruguay syndrome. Our data suggest that a novel FHL1 splice site variant results in the absence of FHL1A and the abundance of FHL1C, which may contribute to the complex and severe phenotype. Mutation screening of the FHL1 gene should be considered for patients with uncharacterized myopathies and cardiomyopathies. © 2016 American Heart Association, Inc.
Itaya virus, a Novel Orthobunyavirus Associated with Human Febrile Illness, Peru.

PubMed

Hontz, Robert D; Guevara, Carolina; Halsey, Eric S; Silvas, Jesus; Santiago, Felix W; Widen, Steven G; Wood, Thomas G; Casanova, Wilma; Vasilakis, Nikos; Watts, Douglas M; Kochel, Tadeusz J; Ebihara, Hideki; Aguilar, Patricia V

2015-05-01

Our genetic analyses of uncharacterized bunyaviruses isolated in Peru identified a possible reassortant virus containing small and large gene segment sequences closely related to the Caraparu virus and a medium gene segment sequence potentially derived from an unidentified group C orthobunyavirus. Neutralization tests confirmed serologic distinction among the newly identified virus and the prototype and Caraparu strains. This virus, named Itaya, was isolated in 1999 and 2006 from febrile patients in the cities of Iquitos and Yurimaguas in Peru. The geographic distance between the 2 cases suggests that the Itaya virus could be widely distributed throughout the Amazon basin in northeastern Peru. Identification of a new Orthobunyavirus species that causes febrile disease in humans reinforces the need to expand viral disease surveillance in tropical regions of South America.
Itaya virus, a Novel Orthobunyavirus Associated with Human Febrile Illness, Peru

PubMed Central

Hontz, Robert D.; Guevara, Carolina; Halsey, Eric S.; Silvas, Jesus; Santiago, Felix W.; Widen, Steven G.; Wood, Thomas G.; Casanova, Wilma; Vasilakis, Nikos; Watts, Douglas M.; Kochel, Tadeusz J.; Ebihara, Hideki

2015-01-01

Our genetic analyses of uncharacterized bunyaviruses isolated in Peru identified a possible reassortant virus containing small and large gene segment sequences closely related to the Caraparu virus and a medium gene segment sequence potentially derived from an unidentified group C orthobunyavirus. Neutralization tests confirmed serologic distinction among the newly identified virus and the prototype and Caraparu strains. This virus, named Itaya, was isolated in 1999 and 2006 from febrile patients in the cities of Iquitos and Yurimaguas in Peru. The geographic distance between the 2 cases suggests that the Itaya virus could be widely distributed throughout the Amazon basin in northeastern Peru. Identification of a new Orthobunyavirus species that causes febrile disease in humans reinforces the need to expand viral disease surveillance in tropical regions of South America. PMID:25898901
Profiling the nucleobase and structure selectivity of anticancer drugs and other DNA alkylating agents by RNA sequencing.

PubMed

Gillingham, Dennis; Sauter, Basilius

2018-05-06

Drugs that covalently modify DNA are components of most chemotherapy regimens, often serving as first-line treatments. Classically the chemical reactivity of DNA alkylators has been determined in vitro with short oligonucleotides. Here we use next generation RNA sequencing to report on the chemoselectivity of alkylating agents. We develop the method with the well-known clinically used DNA-modifying drugs streptozotocin and temozolomide, and then apply the technique to profile RNA modification with uncharacterized alkylation reactions such as with powerful electrophiles like trimethylsilyldiazomethane. The multiplexed and massively parallel format of NGS allows analyses of chemical reactivity in nucleic acids to be accomplished in less time with greater statistical power. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Proteolysis of complement factors iC3b and C5 by the serine protease prostate-specific antigen in prostatic fluid and seminal plasma.

PubMed

Manning, Michael L; Williams, Simon A; Jelinek, Christine A; Kostova, Maya B; Denmeade, Samuel R

2013-03-15

Prostate-specific Ag (PSA) is a serine protease that is expressed exclusively by normal and malignant prostate epithelial cells. The continued high-level expression of PSA by the majority of men with both high- and low-grade prostate cancer throughout the course of disease progression, even in the androgen-ablated state, suggests that PSA has a role in the pathogenesis of disease. Current experimental and clinical evidence suggests that chronic inflammation, regardless of the cause, may predispose men to prostate cancer. The responsibility of the immune system in immune surveillance and eventually tumor progression is well appreciated but not completely understood. In this study, we used a mass spectrometry-based evaluation of prostatic fluid obtained from diseased prostates after removal by radical prostatectomy to identify potential immunoregulatory proteins. This analysis revealed the presence of Igs and the complement system proteins C3, factor B, and clusterin. Verification of these findings by Western blot confirmed the high-level expression of C3 in the prostatic fluid and the presence of a previously uncharacterized C-terminal C3 cleavage product. Biochemical analysis of this C3 cleavage fragment revealed a putative PSA cleavage site after tyrosine-1348. Purified PSA was able to cleave iC3b and the related complement protein C5. These results suggest a previously uncharacterized function of PSA as an immunoregulatory protease that could help to create an environment hospitable to malignancy through proteolysis of the complement system.
Pleiotropic Regulation of Virulence Genes in Streptococcus mutans by the Conserved Small Protein SprV.

PubMed

Shankar, Manoharan; Hossain, Mohammad S; Biswas, Indranil

2017-04-15

Streptococcus mutans, an oral pathogen associated with dental caries, colonizes tooth surfaces as polymicrobial biofilms known as dental plaque. S. mutans expresses several virulence factors that allow the organism to tolerate environmental fluctuations and compete with other microorganisms. We recently identified a small hypothetical protein (90 amino acids) essential for the normal growth of the bacterium. Inactivation of the gene, SMU.2137, encoding this protein caused a significant growth defect and loss of various virulence-associated functions. An S. mutans strain lacking this gene was more sensitive to acid, temperature, osmotic, oxidative, and DNA damage-inducing stresses. In addition, we observed an altered protein profile and defects in biofilm formation, bacteriocin production, and natural competence development, possibly due to the fitness defect associated with SMU.2137 deletion. Transcriptome sequencing revealed that nearly 20% of the S. mutans genes were differentially expressed upon SMU.2137 deletion, thereby suggesting a pleiotropic effect. Therefore, we have renamed this hitherto uncharacterized gene as sprV (streptococcal pleiotropic regulator of virulence). The transcript levels of several relevant genes in the sprV mutant corroborated the phenotypes observed upon sprV deletion. Owing to its highly conserved nature, inactivation of the sprV ortholog in Streptococcus gordonii also resulted in poor growth and defective UV tolerance and competence development as in the case of S. mutans. Our experiments suggest that SprV is functionally distinct from its homologs identified by structure and sequence homology. Nonetheless, our current work is aimed at understanding the importance of SprV in the S. mutans biology. IMPORTANCE Streptococcus mutans employs several virulence factors and stress resistance mechanisms to colonize tooth surfaces and cause dental caries. Bacterial pathogenesis is generally controlled by regulators of fitness that are critical for successful disease establishment. Sometimes these regulators, which are potential targets for antimicrobials, are lost in the genomic context due to the lack of annotated homologs. This work outlines the regulatory impact of a small, highly conserved hypothetical protein, SprV, encoded by S. mutans. We show that SprV affects the transcript levels of various virulence factors required for normal growth, biofilm formation, stress tolerance, genetic competence, and bacteriocin production. Copyright © 2017 American Society for Microbiology.
Pleiotropic Regulation of Virulence Genes in Streptococcus mutans by the Conserved Small Protein SprV

PubMed Central

Shankar, Manoharan; Hossain, Mohammad S.

2017-01-01

Streptococcus mutans, an oral pathogen associated with dental caries, colonizes tooth surfaces as polymicrobial biofilms known as dental plaque. S. mutans expresses several virulence factors that allow the organism to tolerate environmental fluctuations and compete with other microorganisms. We recently identified a small hypothetical protein (90 amino acids) essential for the normal growth of the bacterium. Inactivation of the gene, SMU.2137, encoding this protein caused a significant growth defect and loss of various virulence-associated functions. An S. mutans strain lacking this gene was more sensitive to acid, temperature, osmotic, oxidative, and DNA damage-inducing stresses. In addition, we observed an altered protein profile and defects in biofilm formation, bacteriocin production, and natural competence development, possibly due to the fitness defect associated with SMU.2137 deletion. Transcriptome sequencing revealed that nearly 20% of the S. mutans genes were differentially expressed upon SMU.2137 deletion, thereby suggesting a pleiotropic effect. Therefore, we have renamed this hitherto uncharacterized gene as sprV (streptococcal pleiotropic regulator of virulence). The transcript levels of several relevant genes in the sprV mutant corroborated the phenotypes observed upon sprV deletion. Owing to its highly conserved nature, inactivation of the sprV ortholog in Streptococcus gordonii also resulted in poor growth and defective UV tolerance and competence development as in the case of S. mutans. Our experiments suggest that SprV is functionally distinct from its homologs identified by structure and sequence homology. Nonetheless, our current work is aimed at understanding the importance of SprV in the S. mutans biology. IMPORTANCE Streptococcus mutans employs several virulence factors and stress resistance mechanisms to colonize tooth surfaces and cause dental caries. Bacterial pathogenesis is generally controlled by regulators of fitness that are critical for successful disease establishment. Sometimes these regulators, which are potential targets for antimicrobials, are lost in the genomic context due to the lack of annotated homologs. This work outlines the regulatory impact of a small, highly conserved hypothetical protein, SprV, encoded by S. mutans. We show that SprV affects the transcript levels of various virulence factors required for normal growth, biofilm formation, stress tolerance, genetic competence, and bacteriocin production. PMID:28167518
Translation Control of Swarming Proficiency in Bacillus subtilis by 5-Amino-pentanolylated Elongation Factor P.

PubMed

Rajkovic, Andrei; Hummels, Katherine R; Witzky, Anne; Erickson, Sarah; Gafken, Philip R; Whitelegge, Julian P; Faull, Kym F; Kearns, Daniel B; Ibba, Michael

2016-05-20

Elongation factor P (EF-P) accelerates diprolyl synthesis and requires a posttranslational modification to maintain proteostasis. Two phylogenetically distinct EF-P modification pathways have been described and are encoded in the majority of Gram-negative bacteria, but neither is present in Gram-positive bacteria. Prior work suggested that the EF-P-encoding gene (efp) primarily supports Bacillus subtilis swarming differentiation, whereas EF-P in Gram-negative bacteria has a more global housekeeping role, prompting our investigation to determine whether EF-P is modified and how it impacts gene expression in motile cells. We identified a 5-aminopentanol moiety attached to Lys(32) of B. subtilis EF-P that is required for swarming motility. A fluorescent in vivo B. subtilis reporter system identified peptide motifs whose efficient synthesis was most dependent on 5-aminopentanol EF-P. Examination of the B. subtilis genome sequence showed that these EF-P-dependent peptide motifs were represented in flagellar genes. Taken together, these data show that, in B. subtilis, a previously uncharacterized posttranslational modification of EF-P can modulate the synthesis of specific diprolyl motifs present in proteins required for swarming motility. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
Novel recA-Independent Horizontal Gene Transfer in Escherichia coli K-12.

PubMed

Kingston, Anthony W; Roussel-Rossin, Chloé; Dupont, Claire; Raleigh, Elisabeth A

2015-01-01

In bacteria, mechanisms that incorporate DNA into a genome without strand-transfer proteins such as RecA play a major role in generating novelty by horizontal gene transfer. We describe a new illegitimate recombination event in Escherichia coli K-12: RecA-independent homologous replacements, with very large (megabase-length) donor patches replacing recipient DNA. A previously uncharacterized gene (yjiP) increases the frequency of RecA-independent replacement recombination. To show this, we used conjugal DNA transfer, combining a classical conjugation donor, HfrH, with modern genome engineering methods and whole genome sequencing analysis to enable interrogation of genetic dependence of integration mechanisms and characterization of recombination products. As in classical experiments, genomic DNA transfer begins at a unique position in the donor, entering the recipient via conjugation; antibiotic resistance markers are then used to select recombinant progeny. Different configurations of this system were used to compare known mechanisms for stable DNA incorporation, including homologous recombination, F'-plasmid formation, and genome duplication. A genome island of interest known as the immigration control region was specifically replaced in a minority of recombinants, at a frequency of 3 × 10^-12 CFU/recipient per hour.

Isolation and Molecular Characterization of the Transformer Gene From Bactrocera cucurbitae (Diptera: Tephritidae)

PubMed Central

Luo, Ya; Zhao, Santao; Li, Jiahui; Li, Peizheng

2017-01-01

transformer (tra) is a switch gene of sex determination in many insects, particularly in Dipterans. However, the sex determination pathway in Bactrocera cucurbitae (Coquillett), a highly destructive pest, remains largely uncharacterized. In this study, we have isolated and characterized one female-specific and two male-specific transcripts of the tra gene (Bcutra) of B. cucurbitae. The genomic structure of Bcutra has been determined and the presence of multiple conserved Transformer (TRA)/TRA-2 binding sites in Bcutra has been found. BcuTRA is highly conserved relative to its homologues in other tephritid fruit flies. Gene expression analysis of Bcutra at different developmental stages demonstrates that the female transcript of Bcutra appears earlier than the male counterparts, indicating that the maternal TRA is inherited in eggs and might play a role in the regulation of TRA expression. The conservation of protein sequence and sex-specific splicing of Bcutra and its expression patterns during development suggest that Bcutra is probably the master gene of sex determination of B. cucurbitae. Isolation of Bcutra will facilitate the development of a genetic sexing strain for its biological control. PMID:28931159
Limonene Arrests Parasite Development and Inhibits Isoprenylation of Proteins in Plasmodium falciparum

PubMed Central

Moura, Ivan Cruz; Wunderlich, Gerhard; Uhrig, Maria L.; Couto, Alicia S.; Peres, Valnice J.; Katzin, Alejandro M.; Kimura, Emília A.

2001-01-01

Isoprenylation is an essential protein modification in eukaryotic cells. Herein, we report that in Plasmodium falciparum, a number of proteins were labeled upon incubation of intraerythrocytic forms with either [3H]farnesyl pyrophosphate or [3H]geranylgeranyl pyrophosphate. By thin-layer chromatography, we showed that attached isoprenoids are partially modified to dolichol and other, uncharacterized, residues, confirming active isoprenoid metabolism in this parasite. Incubation of blood-stage P. falciparum treated with the isoprenylation inhibitor limonene significantly decreased the parasites' progression from the ring stage to the trophozoite stage and at 1.22 mM, 50% of the parasites died after the first cycle. Using Ras- and Rap-specific monoclonal antibodies, putative Rap and Ras proteins of P. falciparum were immunoprecipitated. Upon treatment with 0.5 mM limonene, isoprenylation of these proteins was significantly decreased, possibly explaining the observed arrest of parasite development. PMID:11502528
Glucose-6-phosphate isomerase is necessary for embryo implantation in the domestic ferret

PubMed Central

Schulz, Laura Clamon; Bahr, Janice M.

2003-01-01

The mechanism of implantation in carnivores is poorly understood. However, a previously unidentified 60-kDa protein has been shown to be necessary for embryo implantation in ferrets. Here we identify this protein as glucose-6-phosphate isomerase (GPI). GPI is expressed by the corpus luteum on days 6–9 of pregnancy, the time at which implantation-promoting activity has been found in corpora lutea. Passive immunization against GPI reduced the number of implantation sites in pregnant ferrets in a dose-dependent manner. GPI is a multifunctional protein. Although first identified for its role in glycolysis, GPI has since been implicated in neural growth, lymphocyte maturation, and metastasis. This study demonstrates a previously uncharacterized function of this protein that may represent the natural motility-stimulating activity that has been co-opted by tumor cells. PMID:12826606
Plum Pox Virus 6K1 Protein Is Required for Viral Replication and Targets the Viral Replication Complex at the Early Stage of Infection.

PubMed

Cui, Hongguang; Wang, Aiming

2016-05-15

The potyviral RNA genome encodes two polyproteins that are proteolytically processed by three viral protease domains into 11 mature proteins. Extensive molecular studies have identified functions for the majority of the viral proteins. For example, 6K2, one of the two smallest potyviral proteins, is an integral membrane protein and induces the endoplasmic reticulum (ER)-originated replication vesicles that target the chloroplast for robust viral replication. However, the functional role of 6K1, the other smallest protein, remains uncharacterized. In this study, we developed a series of recombinant full-length viral cDNA clones derived from a Canadian Plum pox virus (PPV) isolate. We found that deletion of any of the short motifs of 6K1 (each of which ranged from 5 to 13 amino acids), most of the 6K1 sequence (but with the conserved sequence of the cleavage sites being retained), or all of the 6K1 sequence in the PPV infectious clone abolished viral replication. The trans expression of 6K1 or the cis expression of a dislocated 6K1 failed to rescue the loss-of-replication phenotype, suggesting the temporal and spatial requirement of 6K1 for viral replication. Disruption of the N- or C-terminal cleavage site of 6K1, which prevented the release of 6K1 from the polyprotein, either partially or completely inhibited viral replication, suggesting the functional importance of the mature 6K1. We further found that green fluorescent protein-tagged 6K1 formed punctate inclusions at the viral early infection stage and colocalized with chloroplast-bound viral replicase elements 6K2 and NIb. Taken together, our results suggest that 6K1 is required for viral replication and is an important viral element of the viral replication complex at the early infection stage.
Potyviruses account for more than 30% of known plant viruses and consist of many agriculturally important viruses. The genomes of potyviruses encode two polyproteins that are proteolytically processed into 11 mature proteins, with the majority of them having been at least partially functionally characterized. However, the functional role of a small protein named 6K1 remains obscure. In this study, we showed that deletion of 6K1 or a short motif/region of 6K1 in the full-length cDNA clones of plum pox virus abolishes viral replication and that mutation of the N- or C-terminal cleavage sites of 6K1 to prevent its release from the polyprotein greatly attenuates or completely inhibits viral replication, suggesting its important role in potyviral infection. We report that 6K1 forms punctate structures and targets the replication vesicles in PPV-infected plant leaf cells at the early infection stage. Our data reveal that 6K1 is an important viral protein of the potyviral replication complex. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
3D-SURFER 2.0: web platform for real-time search and characterization of protein surfaces.

PubMed

Xiong, Yi; Esquivel-Rodriguez, Juan; Sael, Lee; Kihara, Daisuke

2014-01-01

The increasing number of uncharacterized protein structures necessitates the development of computational approaches for function annotation using the protein tertiary structures. Protein structure database search is the basis of any structure-based functional elucidation of proteins. 3D-SURFER is a web platform for real-time protein surface comparison of a given protein structure against the entire PDB using 3D Zernike descriptors. It can smoothly navigate the protein structure space in real-time from one query structure to another. A major new feature of Release 2.0 is the ability to compare the protein surface of a single chain, a single domain, or a single complex against databases of protein chains, domains, complexes, or a combination of all three in the latest PDB. Additionally, two types of protein structures can now be compared: all-atom-surface and backbone-atom-surface. The server can also accept a batch job for a large number of database searches. Pockets in protein surfaces can be identified by VisGrid and LIGSITE(csc). The server is available at http://kiharalab.org/3d-surfer/.
Genome sequencing of bacteria: sequencing, de novo assembly and rapid analysis using open source tools.

PubMed

Kisand, Veljo; Lettieri, Teresa

2013-04-01

De novo genome sequencing of previously uncharacterized microorganisms has the potential to open up new frontiers in microbial genomics by providing insight into both functional capabilities and biodiversity. Until recently, Roche 454 pyrosequencing was the NGS method of choice for de novo assembly because it generates hundreds of thousands of long reads (<450 bps), which are presumed to aid in the analysis of uncharacterized genomes. The array of tools for processing NGS data is increasingly free and open source, and these tools are often adopted for both their high quality and role in promoting academic freedom. The error rate of pyrosequencing the Alcanivorax borkumensis genome was such that thousands of insertions and deletions were artificially introduced into the finished genome. Despite a high coverage (~30 fold), it did not allow the reference genome to be fully mapped. Reads from regions with errors had low quality, low coverage, or were missing. The main defect of the reference mapping was the introduction of artificial indels into contigs through lower than 100% consensus and distracting gene calling due to artificial stop codons. No assembler was able to perform de novo assembly comparable to reference mapping. Automated annotation tools performed similarly on reference mapped and de novo draft genomes, and annotated most CDSs in the de novo assembled draft genomes. Free and open source software (FOSS) tools for assembly and annotation of NGS data are being developed rapidly to provide accurate results with less computational effort. Usability is not a high priority and these tools currently do not allow the data to be processed without manual intervention. Despite this, genome assemblers now readily assemble medium short reads into long contigs (>97-98% genome coverage). A notable gap in pyrosequencing technology is the quality of base pair calling and conflicting base pairs between single reads at the same nucleotide position. Regardless, using draft whole genomes that are not finished and remain fragmented into tens of contigs allows one to characterize unknown bacteria with modest effort.
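The "~30 fold" coverage figure quoted in the record above is a ratio of sequenced bases to genome length. The following is a minimal sketch of that arithmetic only; the function name, read count, read length, and genome size are illustrative assumptions and are not values reported by the study.

```python
def fold_coverage(read_lengths, genome_length):
    """Mean fold coverage: total sequenced bases divided by genome length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return sum(read_lengths) / genome_length


# Hypothetical round numbers (not taken from the study):
# 200,000 reads of 450 bp against a 3,000,000 bp genome.
example_reads = [450] * 200_000
example_coverage = fold_coverage(example_reads, 3_000_000)
print(f"{example_coverage:.1f}x")
```

Mean coverage says nothing about how evenly reads are distributed, which is consistent with the abstract's remark that some regions had low coverage or were missing despite a high overall figure.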
CD8+ T Lymphocyte Epitopes From The Herpes Simplex Virus Type 2 ICP27, VP22 and VP13/14 Proteins To Facilitate Vaccine Design And Characterization

PubMed Central

Platt, Rebecca J.; Khodai, Tansi; Townend, Tim J.; Bright, Helen H.; Cockle, Paul; Perez-Tosar, Luis; Webster, Rob; Champion, Brian; Hickling, Timothy P.; Mirza, Fareed

2013-01-01

CD8+ T cells have the potential to control HSV-2 infection. However, limited information has been available on CD8+ T cell epitopes or the functionality of antigen specific T cells during infection or following immunization with experimental vaccines. Peptide panels from HSV-2 proteins ICP27, VP22 and VP13/14 were selected from in silico predictions of binding to human HLA-A*0201 and mouse H-2Kd, Ld and Dd molecules. Nine previously uncharacterized CD8+ T cell epitopes were identified from HSV-2 infected BALB/c mice. HSV-2 specific peptide sequences stabilized HLA-A*02 surface expression with intermediate or high affinity binding. Peptide specific CD8+ human T cell lines from peripheral blood lymphocytes were generated from a HLA-A*02+ donor. High frequencies of peptide specific CD8+ T cell responses were elicited in mice by DNA vaccination with ICP27, VP22 and VP13/14, as demonstrated by CD107a mobilization. Vaccine driven T cell responses displayed a more focused immune response than those induced by viral infection. Furthermore, vaccination with ICP27 reduced viral shedding and reduced the clinical impact of disease. In conclusion, this study describes novel HSV-2 epitopes eliciting strong CD8+ T cell responses that may facilitate epitope based vaccine design and aid immunomonitoring of antigen specific T cell frequencies in preclinical and clinical settings. PMID:24709642
YNL134C from Saccharomyces cerevisiae encodes a novel protein with aldehyde reductase activity for detoxification of furfural derived from lignocellulosic biomass.

PubMed

Zhao, Xianxian; Tang, Juan; Wang, Xu; Yang, Ruoheng; Zhang, Xiaoping; Gu, Yunfu; Li, Xi; Ma, Menggen

2015-05-01

Furfural and 5-hydroxymethylfurfural (HMF) are the two main aldehyde compounds derived from pentoses and hexoses, respectively, during lignocellulosic biomass pretreatment. These two compounds inhibit microbial growth and interfere with subsequent alcohol fermentation. Saccharomyces cerevisiae has the in situ ability to detoxify furfural and HMF to the less toxic 2-furanmethanol (FM) and furan-2,5-dimethanol (FDM), respectively. Herein, we report that an uncharacterized gene, YNL134C, was highly up-regulated under furfural or HMF stress and Yap1p and Msn2/4p transcription factors likely controlled its up-regulated expression. Enzyme activity assays showed that YNL134C is an NADH-dependent aldehyde reductase, which plays a role in detoxification of furfural to FM. However, no NADH- or NADPH-dependent enzyme activity was observed for detoxification of HMF to FDM. This enzyme did not catalyse the reverse reaction of FM to furfural or FDM to HMF. Further studies showed that YNL134C is a broad-substrate aldehyde reductase, which can reduce multiple aldehydes to their corresponding alcohols. Although YNL134C is grouped into the quinone oxidoreductase family, no quinone reductase activity was observed using 1,2-naphthoquinone or 9,10-phenanthrenequinone as a substrate, and phylogenetic analysis indicates that it is genetically distant to quinone reductases. Proteins similar to YNL134C in sequence from S. cerevisiae and other microorganisms were phylogenetically analysed. Copyright © 2015 John Wiley & Sons, Ltd.
Community structure of free-floating filamentous cyanobacterial mats from the Wonder Lake geothermal springs in the Philippines.

PubMed

Lacap, Donnabella C; Smith, Gavin J D; Warren-Rhodes, Kimberley; Pointing, Stephen B

2005-07-01

Cyanobacterial mats were characterized from pools of 45-60 °C in near-neutral pH, low-sulphide geothermal springs in the Philippines. Mat structure did not vary with temperature. All mats possessed highly ordered layers of airspaces at both the macroscopic and microscopic level, and these appear to be an adaptation to a free-floating growth habit. Upper mat layers supported biomass with elevated carotenoid:chlorophyll a ratios and an as yet uncharacterized waxy layer on the dorsal surface. Microscopic examination revealed mats comprised a single Fischerella morphotype, with abundant heterocysts throughout mats at all temperatures. Molecular analysis of mat community structure only partly matched morphological identification. All samples supported greater 16S rDNA-defined diversity than morphology suggested, with a progressive loss in the number of genotypes with increasing temperature. Fischerella-like sequences were recovered from mats occurring at all temperatures, but some mats also yielded Oscillatoria-like sequences, although corresponding phenotypes were not observed. Phylogenetic analysis revealed that Fischerella-like sequences were most closely affiliated with Fischerella major and the Oscillatoria-like sequences with Oscillatoria amphigranulata.
Functional Features of TonB Energy Transduction Systems of Acinetobacter baumannii

PubMed Central

Zimbler, Daniel L.; Arivett, Brock A.; Beckett, Amber C.; Menke, Sharon M.

2013-01-01

Acinetobacter baumannii is an opportunistic pathogen that causes severe nosocomial infections. Strain ATCC 19606T utilizes the siderophore acinetobactin to acquire iron under iron-limiting conditions encountered in the host. Accordingly, the genome of this strain has three tonB genes encoding proteins for energy transduction functions needed for the active transport of nutrients, including iron, through the outer membrane. Phylogenetic analysis indicates that these tonB genes, which are present in the genomes of all sequenced A. baumannii strains, were acquired from different sources. Two of these genes occur as components of tonB-exbB-exbD operons and one as a monocistronic copy; all are actively transcribed in ATCC 19606T. The abilities of components of these TonB systems to complement the growth defect of Escherichia coli W3110 mutants KP1344 (tonB) and RA1051 (exbBD) under iron-chelated conditions further support the roles of these TonB systems in iron acquisition. Mutagenesis analysis of ATCC 19606T tonB1 (subscripted numbers represent different copies of genes or proteins) and tonB2 supports this hypothesis: their inactivation results in growth defects in iron-chelated media, without affecting acinetobactin biosynthesis or the production of the acinetobactin outer membrane receptor protein BauA. In vivo assays using Galleria mellonella show that each TonB protein is involved in, but not essential for, bacterial virulence in this infection model. Furthermore, we observed that TonB2 plays a role in the ability of bacteria to bind to fibronectin and to adhere to A549 cells by uncharacterized mechanisms. Taken together, these results indicate that A. baumannii ATCC 19606T produces three independent TonB proteins, which appear to provide the energy-transducing functions needed for iron acquisition and cellular processes that play a role in the virulence of this pathogen. PMID:23817614
Using phylogenetically-informed annotation (PIA) to search for light-interacting genes in transcriptomes from non-model organisms.

PubMed

Speiser, Daniel I; Pankey, M Sabrina; Zaharoff, Alexander K; Battelle, Barbara A; Bracken-Grissom, Heather D; Breinholt, Jesse W; Bybee, Seth M; Cronin, Thomas W; Garm, Anders; Lindgren, Annie R; Patel, Nipam H; Porter, Megan L; Protas, Meredith E; Rivera, Ajna S; Serb, Jeanne M; Zigler, Kirk S; Crandall, Keith A; Oakley, Todd H

2014-11-19

Tools for high throughput sequencing and de novo assembly make the analysis of transcriptomes (i.e. the suite of genes expressed in a tissue) feasible for almost any organism. Yet a challenge for biologists is that it can be difficult to assign identities to gene sequences, especially from non-model organisms. Phylogenetic analyses are one useful method for assigning identities to these sequences, but such methods tend to be time-consuming because of the need to re-calculate trees for every gene of interest and each time a new data set is analyzed. In response, we employed existing tools for phylogenetic analysis to produce a computationally efficient, tree-based approach for annotating transcriptomes or new genomes that we term Phylogenetically-Informed Annotation (PIA), which places uncharacterized genes into pre-calculated phylogenies of gene families. We generated maximum likelihood trees for 109 genes from a Light Interaction Toolkit (LIT), a collection of genes that underlie the function or development of light-interacting structures in metazoans. To do so, we searched protein sequences predicted from 29 fully-sequenced genomes and built trees using tools for phylogenetic analysis in the Osiris package of Galaxy (an open-source workflow management system). Next, to rapidly annotate transcriptomes from organisms that lack sequenced genomes, we repurposed a maximum likelihood-based Evolutionary Placement Algorithm (implemented in RAxML) to place sequences of potential LIT genes on to our pre-calculated gene trees. Finally, we implemented PIA in Galaxy and used it to search for LIT genes in 28 newly-sequenced transcriptomes from the light-interacting tissues of a range of cephalopod mollusks, arthropods, and cubozoan cnidarians. Our new trees for LIT genes are available on the Bitbucket public repository ( http://bitbucket.org/osiris_phylogenetics/pia/ ) and we demonstrate PIA on a publicly-accessible web server ( http://galaxy-dev.cnsi.ucsb.edu/pia/ ). 
Our new trees for LIT genes will be a valuable resource for researchers studying the evolution of eyes or other light-interacting structures. We also introduce PIA, a high throughput method for using phylogenetic relationships to identify LIT genes in transcriptomes from non-model organisms. With simple modifications, our methods may be used to search for different sets of genes or to annotate data sets from taxa outside of Metazoa.
High-resolution characterization of a hepatocellular carcinoma genome.

PubMed

Totoki, Yasushi; Tatsuno, Kenji; Yamamoto, Shogo; Arai, Yasuhito; Hosoda, Fumie; Ishikawa, Shumpei; Tsutsumi, Shuichi; Sonoda, Kohtaro; Totsuka, Hirohiko; Shirakihara, Takuya; Sakamoto, Hiromi; Wang, Linghua; Ojima, Hidenori; Shimada, Kazuaki; Kosuge, Tomoo; Okusaka, Takuji; Kato, Kazuto; Kusuda, Jun; Yoshida, Teruhiko; Aburatani, Hiroyuki; Shibata, Tatsuhiro

2011-05-01

Hepatocellular carcinoma, one of the most common virus-associated cancers, is the third most frequent cause of cancer-related death worldwide. By massively parallel sequencing of a primary hepatitis C virus-positive hepatocellular carcinoma (36× coverage) and matched lymphocytes (>28× coverage) from the same individual, we identified more than 11,000 somatic substitutions of the tumor genome that showed predominance of T>C/A>G transition and a decrease of the T>C substitution on the transcribed strand, suggesting preferential DNA repair. Gene annotation enrichment analysis of 63 validated non-synonymous substitutions revealed enrichment of phosphoproteins. We further validated 22 chromosomal rearrangements, generating four fusion transcripts that had altered transcriptional regulation (BCORL1-ELF4) or promoter activity. Whole-exome sequencing at a higher sequence depth (>76× coverage) revealed a TSC1 nonsense substitution in a subpopulation of the tumor cells. This first high-resolution characterization of a virus-associated cancer genome identified previously uncharacterized mutation patterns, intra-chromosomal rearrangements and fusion genes, as well as genetic heterogeneity within the tumor.
Precise detection of chromosomal translocation or inversion breakpoints by whole-genome sequencing.

PubMed

Suzuki, Toshifumi; Tsurusaki, Yoshinori; Nakashima, Mitsuko; Miyake, Noriko; Saitsu, Hirotomo; Takeda, Satoru; Matsumoto, Naomichi

2014-12-01

Structural variations (SVs), including translocations, inversions, deletions and duplications, are potentially associated with Mendelian diseases and contiguous gene syndromes. Determination of SV-related breakpoints at the nucleotide level is important to reveal the genetic causes for diseases. Whole-genome sequencing (WGS) by next-generation sequencers is expected to determine structural abnormalities more directly and efficiently than conventional methods. In this study, 14 SVs (9 balanced translocations, 1 inversion and 4 microdeletions) in 9 patients were analyzed by WGS with a shallow (5×) to moderate read coverage (20×). Among 28 breakpoints (as each SV has two breakpoints), 19 SV breakpoints had been determined previously at the nucleotide level by other methods and 9 were uncharacterized. BreakDancer and Integrative Genomics Viewer determined 20 breakpoints (16 translocation, 2 inversion and 2 deletion breakpoints), but did not detect 8 breakpoints (2 translocation and 6 deletion breakpoints). These data indicate the efficacy of WGS for the precise determination of translocation and inversion breakpoints.
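The breakpoint counts in the record above can be tallied per SV type to show why the authors single out translocations and inversions. This is only a bookkeeping sketch: the counts are copied from the abstract, while the dictionary layout and the function name are illustrative assumptions.

```python
# Breakpoint counts as stated in the abstract (each SV has two breakpoints).
detected = {"translocation": 16, "inversion": 2, "deletion": 2}
missed = {"translocation": 2, "inversion": 0, "deletion": 6}


def detection_summary(detected, missed):
    """Return {sv_type: (detected_breakpoints, total_breakpoints)}."""
    summary = {}
    for sv_type, n_detected in detected.items():
        total = n_detected + missed.get(sv_type, 0)
        summary[sv_type] = (n_detected, total)
    return summary


summary = detection_summary(detected, missed)
for sv_type, (found, total) in summary.items():
    print(f"{sv_type}: {found}/{total} breakpoints detected")
```

The per-type totals (18, 2 and 8) match 9 translocations, 1 inversion and 4 microdeletions at two breakpoints each, and most of the undetected breakpoints fall in the deletion class.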
In vivo insertion pool sequencing identifies virulence factors in a complex fungal–host interaction

PubMed Central

Uhse, Simon; Pflug, Florian G.; Stirnberg, Alexandra; Ehrlinger, Klaus; von Haeseler, Arndt

2018-01-01

Large-scale insertional mutagenesis screens can be powerful genome-wide tools if they are streamlined with efficient downstream analysis, which is a serious bottleneck in complex biological systems. A major impediment to the success of next-generation sequencing (NGS)-based screens for virulence factors is that the genetic material of pathogens is often underrepresented within the eukaryotic host, making detection extremely challenging. We therefore established insertion Pool-Sequencing (iPool-Seq) on maize infected with the biotrophic fungus U. maydis. iPool-Seq features tagmentation, unique molecular barcodes, and affinity purification of pathogen insertion mutant DNA from in vivo-infected tissues. In a proof of concept using iPool-Seq, we identified 28 virulence factors, including 23 that were previously uncharacterized, from an initial pool of 195 candidate effector mutants. Because of its sensitivity and quantitative nature, iPool-Seq can be applied to any insertional mutagenesis library and is especially suitable for genetically complex setups like pooled infections of eukaryotic hosts. PMID:29684023
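The record above mentions unique molecular barcodes as one ingredient of iPool-Seq. The general idea behind such barcodes is that reads sharing a barcode are treated as copies of one original molecule, so abundance is estimated from distinct barcodes rather than raw read counts. The sketch below illustrates only that counting idea; it is not the published iPool-Seq pipeline, it performs no barcode error correction, and the function name, mutant labels, and barcodes are hypothetical.

```python
from collections import defaultdict


def count_unique_molecules(reads):
    """Count distinct barcodes per insertion mutant.

    reads: iterable of (mutant_id, barcode) tuples, one per sequenced read.
    Returns a dict mapping mutant_id to the number of distinct barcodes seen.
    """
    barcodes_by_mutant = defaultdict(set)
    for mutant_id, barcode in reads:
        barcodes_by_mutant[mutant_id].add(barcode)
    return {mutant: len(barcodes) for mutant, barcodes in barcodes_by_mutant.items()}


# Hypothetical reads: the first two are duplicates of the same molecule.
example_reads = [
    ("mutant_1", "AACGT"),
    ("mutant_1", "AACGT"),
    ("mutant_1", "GGTCA"),
    ("mutant_2", "TTAGC"),
]
molecule_counts = count_unique_molecules(example_reads)
print(molecule_counts)
```

Comparing such per-mutant counts between an input pool and infected tissue is one plausible way to flag depleted mutants, though the abstract does not describe the statistical procedure actually used.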
A lncRNA Perspective into (Re)Building the Heart.

PubMed

Frank, Stefan; Aguirre, Aitor; Hescheler, Juergen; Kurian, Leo

2016-01-01

Our conception of the human genome, long focused on the 2% that codes for proteins, has profoundly changed since its first draft assembly in 2001. Since then, an unanticipatedly expansive functionality and convolution has been attributed to the majority of the genome that is transcribed in a cell-type/context-specific manner into transcripts with no apparent protein coding ability. While the majority of these transcripts, currently annotated as long non-coding RNAs (lncRNAs), are functionally uncharacterized, their prominent role in embryonic development and tissue homeostasis, especially in the context of the heart, is emerging. In this review, we summarize and discuss the latest advances in understanding the relevance of lncRNAs in (re)building the heart.
Proteomic Analysis of Mitotic RNA Polymerase II Reveals Novel Interactors and Association With Proteins Dysfunctional in Disease

PubMed Central

Möller, André; Xie, Sheila Q.; Hosp, Fabian; Lang, Benjamin; Phatnani, Hemali P.; James, Sonya; Ramirez, Francisco; Collin, Gayle B.; Naggert, Jürgen K.; Babu, M. Madan; Greenleaf, Arno L.; Selbach, Matthias; Pombo, Ana

2012-01-01

RNA polymerase II (RNAPII) transcribes protein-coding genes in eukaryotes and interacts with factors involved in chromatin remodeling, transcriptional activation, elongation, and RNA processing. Here, we present the isolation of native RNAPII complexes using mild extraction conditions and immunoaffinity purification. RNAPII complexes were extracted from mitotic cells, where they exist dissociated from chromatin. The proteomic content of native complexes in total and size-fractionated extracts was determined using highly sensitive LC-MS/MS. Protein associations with RNAPII were validated by high-resolution immunolocalization experiments in both mitotic cells and in interphase nuclei. Functional assays of transcriptional activity were performed after siRNA-mediated knockdown. We identify >400 RNAPII associated proteins in mitosis, among these previously uncharacterized proteins for which we show roles in transcriptional elongation. We also identify, as novel functional RNAPII interactors, two proteins involved in human disease, ALMS1 and TFG, emphasizing the importance of gene regulation for normal development and physiology. PMID:22199231
Identification and characterization of a novel antimicrobial protein from the housefly Musca domestica.

PubMed

Guo, Guo; Tao, Ruyu; Li, Yan; Ma, Huiling; Xiu, Jiangfan; Fu, Ping; Wu, Jianwei

2017-08-26

Antimicrobial peptides/proteins are immune-related molecules that are widely distributed in bacteria, fungi, plants, invertebrates and higher animals. They have exhibited great potential to be developed into antimicrobial drugs. The housefly, Musca domestica, lives in a highly contaminated environment and has evolved a robust immune system against various pathogens. As an effort to search for new antimicrobial molecules in the housefly, we investigated the function of an uncharacterized gene, first confirming that its expression was induced by infection in M. domestica. The corresponding protein was then shown to have potent antimicrobial activity. Scanning Electron Microscopy data showed that treatment of C. albicans cells with the protein reduced cell size and caused cell elongation. The results here suggest that the protein represents a novel class of antimicrobial protein and provide new insights into the immunological mechanisms by which M. domestica combats invading C. albicans. Copyright © 2017 Elsevier Inc. All rights reserved.
XK-related protein 5 (XKR5) is a novel negative regulator of KIT/D816V-mediated transformation.

PubMed

Sun, Jianmin; Thingholm, Tine; Højrup, Peter; Rönnstrand, Lars

2018-06-18

In order to investigate the molecular mechanisms by which the oncogenic mutant KIT/D816V causes transformation of cells, we investigated proteins that selectively bind KIT/D816V, but not wild-type KIT, as potential mediators of transformation. By mass spectrometry several proteins were identified, among them a previously uncharacterized protein denoted XKR5 (XK-related protein 5), which is related to the X Kell blood group proteins. We could demonstrate that interaction between XKR5 and KIT/D816V leads to phosphorylation of XKR5 at Tyr369, Tyr487, and Tyr543. Tyrosine phosphorylated XKR5 acts as a negative regulator of KIT signaling, which leads to downregulation of phosphorylation of ERK, AKT, and p38. This led to reduced proliferation and colony forming capacity in semi-solid medium. Taken together, our data demonstrate that XKR5 is a novel type of negative regulator of KIT-mediated transformation.
The developmental proteome of Drosophila melanogaster

PubMed Central

Casas-Vila, Nuria; Bluhm, Alina; Sayols, Sergi; Dinges, Nadja; Dejung, Mario; Altenhein, Tina; Kappei, Dennis; Altenhein, Benjamin; Roignant, Jean-Yves; Butter, Falk

2017-01-01

Drosophila melanogaster is a widely used genetic model organism in developmental biology. While this model organism has been intensively studied at the RNA level, a comprehensive proteomic study covering the complete life cycle is still missing. Here, we apply label-free quantitative proteomics to explore proteome remodeling across Drosophila’s life cycle, resulting in 7952 proteins, and provide a high temporal-resolved embryogenesis proteome of 5458 proteins. Our proteome data enabled us to monitor isoform-specific expression of 34 genes during development, to identify the pseudogene Cyp9f3Ψ as a protein-coding gene, and to obtain evidence of 268 small proteins. Moreover, the comparison with available transcriptomic data uncovered examples of poor correlation between mRNA and protein, underscoring the importance of proteomics to study developmental progression. Data integration of our embryogenesis proteome with tissue-specific data revealed spatial and temporal information for further functional studies of yet uncharacterized proteins. Overall, our high resolution proteomes provide a powerful resource and can be explored in detail in our interactive web interface. PMID:28381612

Comparative sequence analyses of sixteen reptilian paramyxoviruses

USGS Publications Warehouse

Ahne, W.; Batts, W.N.; Kurath, G.; Winton, J.R.

1999-01-01

Viral genomic RNA of Fer-de-Lance virus (FDLV), a paramyxovirus highly pathogenic for reptiles, was reverse transcribed and cloned. Plasmids with significant sequence similarities to the hemagglutinin-neuraminidase (HN) and polymerase (L) genes of mammalian paramyxoviruses were identified by BLAST search. Partial sequences of the FDLV genes were used to design primers for amplification by nested polymerase chain reaction (PCR) and sequencing of 518-bp L gene and 352-bp HN gene fragments from a collection of 15 previously uncharacterized reptilian paramyxoviruses. Phylogenetic analyses of the partial L and HN sequences produced similar trees in which there were two distinct subgroups of isolates that were supported with maximum bootstrap values, and several intermediate isolates. Within each subgroup the nucleotide divergence values were less than 2.5%, while the divergence between the two subgroups was 20-22%. This indicated that the two subgroups represent distinct virus species containing multiple virus strains. The five intermediate isolates had nucleotide divergence values of 11-20% and may represent additional distinct species. In addition to establishing diversity among reptilian paramyxoviruses, the phylogenetic groupings showed some correlation with geographic location, and clearly demonstrated a low level of host species-specificity within these viruses. Copyright (C) 1999 Elsevier Science B.V.
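The nucleotide divergence values quoted in the abstract above can be illustrated with a small sketch. The abstract does not say which distance measure was used, so the code assumes the simplest one, an uncorrected p-distance (the fraction of differing sites between two aligned sequences). The function names and the toy sequences are hypothetical and are not taken from the paper.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    This is an uncorrected p-distance, assumed here for illustration.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gap columns
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def divergence_matrix(aligned: dict[str, str]) -> dict[tuple[str, str], float]:
    """Pairwise p-distances for every unordered pair of named sequences."""
    names = sorted(aligned)
    return {
        (x, y): p_distance(aligned[x], aligned[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }


if __name__ == "__main__":
    # Toy alignment (hypothetical isolates, not real viral sequences).
    toy = {
        "isolate_1": "ACGTACGTAC",
        "isolate_2": "ACGTACGTAA",
        "isolate_3": "TCGAACGTGG",
    }
    for pair, d in divergence_matrix(toy).items():
        print(pair, f"{100 * d:.1f}%")
```

Under this measure, isolates within one subgroup would show small values and isolates from different subgroups larger ones. Published analyses often apply a substitution-model correction instead, which this sketch does not attempt.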
Navigating highly homologous genes in a molecular diagnostic setting: a resource for clinical next-generation sequencing.

PubMed

Mandelker, Diana; Schmidt, Ryan J; Ankala, Arunkanth; McDonald Gibson, Kristin; Bowser, Mark; Sharma, Himanshu; Duffy, Elizabeth; Hegde, Madhuri; Santani, Avni; Lebo, Matthew; Funke, Birgit

2016-12-01

Next-generation sequencing (NGS) is now routinely used to interrogate large sets of genes in a diagnostic setting. Regions of high sequence homology continue to be a major challenge for short-read technologies and can lead to false-positive and false-negative diagnostic errors. At the scale of whole-exome sequencing (WES), laboratories may be limited in their knowledge of genes and regions that pose technical hurdles due to high homology. We have created an exome-wide resource that catalogs highly homologous regions and is tailored toward diagnostic applications. This resource was developed using a mappability-based approach tailored to current Sanger and NGS protocols. Gene-level and exon-level lists delineate regions that are difficult or impossible to analyze via standard NGS. These regions are ranked by degree of affectedness, annotated for medical relevance, and classified by the type of homology (within-gene, different functional gene, known pseudogene, uncharacterized noncoding region). Additionally, we provide a list of exons that cannot be analyzed by short-amplicon Sanger sequencing. This resource can help guide clinical test design, supplemental assay implementation, and results interpretation in the context of high homology. Genet Med 18(12), 1282-1289.
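The idea behind a mappability-based approach can be sketched in a few lines. The abstract does not describe the actual pipeline, so what follows is an assumed, simplified version: each position is scored as 1 divided by the number of times its k-mer occurs in the reference, and positions scoring below 1 are flagged as non-unique. Only the forward strand and exact matches are considered, and all names are hypothetical.

```python
from collections import Counter


def kmer_mappability(reference: str, k: int) -> list[float]:
    """Score each k-mer start position as 1 / (occurrences of that k-mer).

    A score of 1.0 means the k-mer is unique in the reference; lower
    scores mean a read of length k starting there could map elsewhere.
    Forward strand and exact matches only (a deliberate simplification).
    """
    ref = reference.upper()
    kmers = [ref[i:i + k] for i in range(len(ref) - k + 1)]
    counts = Counter(kmers)
    return [1.0 / counts[kmer] for kmer in kmers]


def low_mappability_positions(reference: str, k: int) -> list[int]:
    """Start positions whose k-mer occurs more than once in the reference."""
    scores = kmer_mappability(reference, k)
    return [i for i, score in enumerate(scores) if score < 1.0]


if __name__ == "__main__":
    toy_reference = "ACGTTTACGA"  # toy sequence; "ACG" occurs twice
    print(kmer_mappability(toy_reference, 3))
    print(low_mappability_positions(toy_reference, 3))
```

A realistic version would work on a whole genome, allow mismatches, and consider both strands, which is why dedicated mappability tools are normally used for this.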
Investigation of DNA sequence recognition by a streptomycete MarR family transcriptional regulator through surface plasmon resonance and X-ray crystallography

PubMed Central

Stevenson, Clare E. M.; Assaad, Aoun; Chandra, Govind; Le, Tung B. K.; Greive, Sandra J.; Bibb, Mervyn J.; Lawson, David M.

2013-01-01

Consistent with their complex lifestyles and rich secondary metabolite profiles, the genomes of streptomycetes encode a plethora of transcription factors, the vast majority of which are uncharacterized. Herein, we use Surface Plasmon Resonance (SPR) to identify and delineate putative operator sites for SCO3205, a MarR family transcriptional regulator from Streptomyces coelicolor that is well represented in sequenced actinomycete genomes. In particular, we use a novel SPR footprinting approach that exploits indirect ligand capture to vastly extend the lifetime of a standard streptavidin SPR chip. We define two operator sites upstream of sco3205 and a pseudopalindromic consensus sequence derived from these enables further potential operator sites to be identified in the S. coelicolor genome. We evaluate each of these through SPR and test the importance of the conserved bases within the consensus sequence. Informed by these results, we determine the crystal structure of a SCO3205-DNA complex at 2.8 Å resolution, enabling molecular level rationalization of the SPR data. Taken together, our observations support a DNA recognition mechanism involving both direct and indirect sequence readout. PMID:23748564
Cell signaling, post-translational protein modifications and NMR spectroscopy

PubMed Central

Theillet, Francois-Xavier; Smet-Nocca, Caroline; Liokatis, Stamatios; Thongwichian, Rossukon; Kosten, Jonas; Yoon, Mi-Kyung; Kriwacki, Richard W.; Landrieu, Isabelle; Lippens, Guy

2016-01-01

Post-translationally modified proteins make up the majority of the proteome and establish, to a large part, the impressive level of functional diversity in higher, multi-cellular organisms. Most eukaryotic post-translational protein modifications (PTMs) denote reversible, covalent additions of small chemical entities such as phosphate-, acyl-, alkyl- and glycosyl-groups onto selected subsets of modifiable amino acids. In turn, these modifications induce highly specific changes in the chemical environments of individual protein residues, which are readily detected by high-resolution NMR spectroscopy. In the following, we provide a concise compendium of NMR characteristics of the main types of eukaryotic PTMs: serine, threonine, tyrosine and histidine phosphorylation, lysine acetylation, lysine and arginine methylation, and serine, threonine O-glycosylation. We further delineate the previously uncharacterized NMR properties of lysine propionylation, butyrylation, succinylation, malonylation and crotonylation, which, altogether, define an initial reference frame for comprehensive PTM studies by high-resolution NMR spectroscopy. PMID:23011410
A human haploid gene trap collection to study lncRNAs with unusual RNA biology.

PubMed

Kornienko, Aleksandra E; Vlatkovic, Irena; Neesen, Jürgen; Barlow, Denise P; Pauler, Florian M

2016-01-01

Many thousand long non-coding (lnc) RNAs are mapped in the human genome. Time consuming studies using reverse genetic approaches by post-transcriptional knock-down or genetic modification of the locus demonstrated diverse biological functions for a few of these transcripts. The Human Gene Trap Mutant Collection in haploid KBM7 cells is a ready-to-use tool for studying protein-coding gene function. As lncRNAs show remarkable differences in RNA biology compared to protein-coding genes, it is unclear if this gene trap collection is useful for functional analysis of lncRNAs. Here we use the uncharacterized LOC100288798 lncRNA as a model to answer this question. Using public RNA-seq data we show that LOC100288798 is ubiquitously expressed, but inefficiently spliced. The minor spliced LOC100288798 isoforms are exported to the cytoplasm, whereas the major unspliced isoform is nuclear localized. This shows that LOC100288798 RNA biology differs markedly from typical mRNAs. De novo assembly from RNA-seq data suggests that LOC100288798 extends 289kb beyond its annotated 3' end and overlaps the downstream SLC38A4 gene. Three cell lines with independent gene trap insertions in LOC100288798 were available from the KBM7 gene trap collection. RT-qPCR and RNA-seq confirmed successful lncRNA truncation and its extended length. Expression analysis from RNA-seq data shows significant deregulation of 41 protein-coding genes upon LOC100288798 truncation. Our data shows that gene trap collections in human haploid cell lines are useful tools to study lncRNAs, and identifies the previously uncharacterized LOC100288798 as a potential gene regulator.
YLL056C from Saccharomyces cerevisiae encodes a novel protein with aldehyde reductase activity.

PubMed

Wang, Han-Yu; Xiao, Di-Fan; Zhou, Chang; Wang, Lin-Lu; Wu, Lan; Lu, Ya-Ting; Xiang, Quan-Ju; Zhao, Ke; Li, Xi; Ma, Meng-Gen

2017-06-01

The short-chain dehydrogenase/reductase (SDR) family, the largest family in dehydrogenase/reductase superfamily, is divided into "classical," "extended," "intermediate," "divergent," "complex," and "atypical" groups. Recently, several open reading frames (ORFs) were characterized as intermediate SDR aldehyde reductase genes in Saccharomyces cerevisiae. However, no functional protein in the atypical group has been characterized in S. cerevisiae until now. Herein, we report that an uncharacterized ORF YLL056C from S. cerevisiae was significantly upregulated under high furfural (2-furaldehyde) or 5-(hydroxymethyl)-2-furaldehyde concentrations, and transcription factors Yap1p, Hsf1p, Pdr1/3p, Yrr1p, and Stb5p likely controlled its upregulated transcription. This ORF indeed encoded a protein (Yll056cp), which was grouped into the atypical subgroup 7 in the SDR family and localized to the cytoplasm. Enzyme activity assays showed that Yll056cp is not a quinone or ketone reductase but an NADH-dependent aldehyde reductase, which can reduce at least seven aldehyde compounds. This enzyme showed the best Vmax, Kcat, and Kcat/Km to glycolaldehyde, but the highest affinity (Km) to formaldehyde. The optimum pH and temperature of this enzyme were pH 6.5 for reduction of glycolaldehyde, furfural, formaldehyde, butyraldehyde, and propylaldehyde, and 30 °C for reduction of formaldehyde or 35 °C for reduction of glycolaldehyde, furfural, butyraldehyde, and propylaldehyde. Temperature and pH affected stability of this enzyme and this influence varied with aldehyde substrate. Metal ions, salts, and chemical protective additives, especially at high concentrations, had different influence on enzyme activities for reduction of different aldehydes. This research provided guidelines for study of more uncharacterized atypical SDR enzymes from S. cerevisiae and other organisms.
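The observation that one substrate can give the best Vmax while another shows the highest affinity (lowest Km) follows from standard Michaelis-Menten kinetics, v = Vmax * [S] / (Km + [S]). The sketch below assumes that model applies. The parameter values are invented purely for illustration and are not the values measured for Yll056cp; the function and variable names are hypothetical.

```python
def michaelis_menten_rate(substrate: float, vmax: float, km: float) -> float:
    """Initial reaction rate v = Vmax * [S] / (Km + [S])."""
    if substrate < 0 or km <= 0:
        raise ValueError("substrate must be >= 0 and Km must be > 0")
    return vmax * substrate / (km + substrate)


def catalytic_efficiency(kcat: float, km: float) -> float:
    """Specificity constant kcat / Km."""
    return kcat / km


if __name__ == "__main__":
    # Invented parameters (arbitrary units), NOT taken from the paper.
    high_vmax = {"vmax": 10.0, "km": 5.0}      # fast at saturation, weak binding
    high_affinity = {"vmax": 4.0, "km": 0.5}   # slower at saturation, tight binding

    for s in (0.1, 50.0):
        v1 = michaelis_menten_rate(s, **high_vmax)
        v2 = michaelis_menten_rate(s, **high_affinity)
        print(f"[S]={s}: high-Vmax substrate {v1:.2f}, high-affinity substrate {v2:.2f}")
```

With these made-up numbers the tight-binding substrate is turned over faster at low concentration, while the high-Vmax substrate wins near saturation, which is why Vmax and Km are reported separately.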
An ABRE promoter sequence is involved in osmotic stress-responsive expression of the DREB2A gene, which encodes a transcription factor regulating drought-inducible genes in Arabidopsis.

PubMed

Kim, June-Sik; Mizoi, Junya; Yoshida, Takuya; Fujita, Yasunari; Nakajima, Jun; Ohori, Teppei; Todaka, Daisuke; Nakashima, Kazuo; Hirayama, Takashi; Shinozaki, Kazuo; Yamaguchi-Shinozaki, Kazuko

2011-12-01

In plants, osmotic stress-responsive transcriptional regulation depends mainly on two major classes of cis-acting elements found in the promoter regions of stress-inducible genes: ABA-responsive elements (ABREs) and dehydration-responsive elements (DREs). ABRE has been shown to perceive ABA-mediated osmotic stress signals, whereas DRE is known to be involved in an ABA-independent pathway. Previously, we reported that the transcription factor DRE-BINDING PROTEIN 2A (DREB2A) regulates DRE-mediated transcription of target genes under osmotic stress conditions in Arabidopsis (Arabidopsis thaliana). However, the transcriptional regulation of DREB2A itself remains largely uncharacterized. To elucidate the transcriptional mechanism associated with the DREB2A gene under osmotic stress conditions, we generated a series of truncated and base-substituted variants of the DREB2A promoter and evaluated their transcriptional activities individually. We found that both ABRE and coupling element 3 (CE3)-like sequences located approximately -100 bp from the transcriptional initiation site are necessary for the dehydration-responsive expression of DREB2A. Coupling our transient expression analyses with yeast one-hybrid and chromatin immunoprecipitation (ChIP) assays indicated that the ABRE-BINDING PROTEIN 1 (AREB1), AREB2 and ABRE-BINDING FACTOR 3 (ABF3) bZIP transcription factors can bind to and activate the DREB2A promoter in an ABRE-dependent manner. Exogenous ABA application induced only a modest accumulation of the DREB2A transcript when compared with the osmotic stress treatment. However, the osmotic stress-induced DREB2A expression was found to be markedly impaired in several ABA-deficient and ABA-insensitive mutants. These results suggest that in addition to an ABA-independent pathway, the ABA-dependent pathway plays a positive role in the osmotic stress-responsive expression of DREB2A.
PBOV1 Is a Human De Novo Gene with Tumor-Specific Expression That Is Associated with a Positive Clinical Outcome of Cancer

PubMed Central

Samusik, Nikolay; Krukovskaya, Larisa; Meln, Irina; Shilov, Evgeny; Kozlov, Andrey P.

2013-01-01

PBOV1 is a known human protein-coding gene with an uncharacterized function. We have previously found that PBOV1 lacks orthologs in non-primate genomes and is expressed in a wide range of tumor types. Here we report that PBOV1 protein-coding sequence is human-specific and has originated de novo in the primate evolution through a series of frame-shift and stop codon mutations. We profiled PBOV1 expression in multiple cancer and normal tissue samples and found that it was expressed in 19 out of 34 tumors of various origins but completely lacked expression in any of the normal adult or fetal human tissues. We found that, unlike the cancer/testis antigens that are typically controlled by CpG island-containing promoters, PBOV1 was expressed from a GC-poor TATA-containing promoter which was not influenced by CpG demethylation and was inactive in testis. Our analysis of public microarray data suggests that PBOV1 activation in tumors could be dependent on the Hedgehog signaling pathway. Despite the recent de novo origin and the lack of identifiable functional signatures, a missense SNP in the PBOV1 coding sequence has been previously associated with an increased risk of breast cancer. Using publicly available microarray datasets, we found that high levels of PBOV1 expression in breast cancer and glioma samples were significantly associated with a positive outcome of the cancer disease. We also found that PBOV1 was highly expressed in primary but not in recurrent high-grade gliomas, suggesting the presence of a negative selection against PBOV1-expressing cancer cells. Our findings could contribute to the understanding of the mechanisms behind de novo gene origin and the possible role of tumors in this process. PMID:23418531
The Ser/Thr Protein Kinase Protein-Protein Interaction Map of M. tuberculosis.

PubMed

Wu, Fan-Lin; Liu, Yin; Jiang, He-Wei; Luan, Yi-Zhao; Zhang, Hai-Nan; He, Xiang; Xu, Zhao-Wei; Hou, Jing-Li; Ji, Li-Yun; Xie, Zhi; Czajkowsky, Daniel M; Yan, Wei; Deng, Jiao-Yu; Bi, Li-Jun; Zhang, Xian-En; Tao, Sheng-Ce

2017-08-01

Mycobacterium tuberculosis (Mtb) is the causative agent of tuberculosis, the leading cause of death among all infectious diseases. There are 11 eukaryotic-like serine/threonine protein kinases (STPKs) in Mtb, which are thought to play pivotal roles in cell growth, signal transduction and pathogenesis. However, their underlying mechanisms of action remain largely uncharacterized. In this study, using a Mtb proteome microarray, we have globally identified the binding proteins in Mtb for all of the STPKs, and constructed the first STPK protein interaction (KPI) map that includes 492 binding proteins and 1,027 interactions. Bioinformatics analysis showed that the interacting proteins reflect diverse functions, including roles in two-component system, transcription, protein degradation, and cell wall integrity. Functional investigations confirmed that PknG regulates cell wall integrity through key components of peptidoglycan (PG) biosynthesis, e.g. MurC. The global STPK-KPIs network constructed here is expected to serve as a rich resource for understanding the key signaling pathways in Mtb, thus facilitating drug development and effective control of Mtb. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
Holotrichia oblita Midgut Proteins That Bind to Bacillus thuringiensis Cry8-Like Toxin and Assembly of the H. oblita Midgut Tissue Transcriptome

PubMed Central

Jiang, Jian; Huang, Ying; Shu, Changlong; Soberón, Mario; Bravo, Alejandra; Liu, Chunqing; Song, Fuping; Lai, Jinsheng

2017-01-01

ABSTRACT The Bacillus thuringiensis strain HBF-18 (CGMCC 2070), containing two cry genes (cry8-like and cry8Ga), is toxic to Holotrichia oblita larvae. Both Cry8-like and Cry8Ga proteins are active against this insect pest, and Cry8-like is more toxic. To analyze the characteristics of the binding of Cry8-like and Cry8Ga proteins to brush border membrane vesicles (BBMVs) in H. oblita larvae, binding assays were conducted with a fluorescent DyLight488-labeled Cry8-like toxin. The results of saturation binding assays demonstrated that Cry8-like bound specifically to binding sites on BBMVs from H. oblita, and heterologous competition assays revealed that Cry8Ga shared binding sites with Cry8-like. Furthermore, Cry8-like-binding proteins in the midgut from H. oblita larvae were identified by pulldown assays and liquid chromatography-tandem mass spectrometry (LC-MS/MS). In addition, the H. oblita midgut transcriptome was assembled by high-throughput RNA sequencing and used for identification of Cry8-like-binding proteins. Eight Cry8-like-binding proteins were obtained from pulldown assays conducted with BBMVs. The LC-MS/MS data for these proteins were successfully matched with the H. oblita transcriptome, and BLASTX results identified five proteins as serine protease, transferrin-like, uncharacterized protein LOC658236 of Tribolium castaneum, ATPase catalytic subunit, and actin. These identified Cry8-like-binding proteins were different from those confirmed previously as receptors for Cry1A proteins in lepidopteran insect species, such as aminopeptidase, alkaline phosphatase, and cadherin. IMPORTANCE Holotrichia oblita is one of the main soil-dwelling pests in China. The larvae damage the roots of crops, resulting in significant yield reductions and economic losses. H. oblita is difficult to control, principally due to its soil-dwelling habits. In recent years, some Cry8 toxins from Bacillus thuringiensis were shown to be active against this pest. 
Study of the mechanism of action of these Cry8 toxins is needed for their effective use in the control of H. oblita and for their future utilization in transgenic plants. Our work provides important basic data and promotes understanding of the insecticidal mechanism of Cry8 proteins against H. oblita larvae. PMID:28389549
Unexpected biodiversity of ciliates in marine samples from below the photic zone.

PubMed

Grattepanche, Jean-David; Santoferrara, Luciana F; McManus, George B; Katz, Laura A

2016-08-01

Marine microbial eukaryotes play critical roles in planktonic food webs and have been described as most diverse in the photic zone where productivity is high. We used high-throughput sequencing (HTS) to analyse the spatial distribution of planktonic ciliate diversity from shallow waters (<30 m depth) to beyond the continental shelf (>800 m depth) along a 163 km transect off the coast of New England, USA. We focus on ciliates in the subclasses Oligotrichia and Choreotrichia (class Spirotrichea), as these taxa are major components of marine food webs. We did not observe the decrease of diversity below the photic zone expected based on productivity and previous analyses. Instead, we saw an increase of diversity with depth. We also observed that the ciliate communities assessed by HTS cluster by depth layer and degree of water column stratification, suggesting that community assembly is driven by environmental factors. Across our samples, abundant OTUs tend to match previously characterized morphospecies while rare OTUs are more often undescribed, consistent with the idea that species in the rare biosphere remain to be characterized by microscopy. Finally, samples taken below the photic zone also reveal the prevalence of two uncharacterized (i.e. lacking sequenced morphospecies) clades - clusters X1 and X2 - that are enriched within the nano-sized fraction (2-10 μm) and are defined by deletions within the region of the SSU-rDNA analysed here. Together, these data reinforce that we still have much to learn about microbial diversity in marine ecosystems, especially in deep-waters that may be a reservoir for rare species and uncharacterized taxa. © 2016 John Wiley & Sons Ltd.
Deficiency of a Niemann-Pick, Type C1-related Protein in Toxoplasma Is Associated with Multiple Lipidoses and Increased Pathogenicity

PubMed Central

Lige, Bao; Romano, Julia D.; Bandaru, Veera Venkata Ratnam; Ehrenman, Karen; Levitskaya, Jelena; Sampels, Vera; Haughey, Norman J.; Coppens, Isabelle

2011-01-01

Several proteins that play key roles in cholesterol synthesis, regulation, trafficking and signaling are united by sharing the phylogenetically conserved ‘sterol-sensing domain’ (SSD). The intracellular parasite Toxoplasma possesses at least one gene coding for a protein containing the canonical SSD. We investigated the role of this protein to provide information on lipid regulatory mechanisms in the parasite. The protein sequence predicts an uncharacterized Niemann-Pick, type C1-related protein (NPC1) with significant identity to human NPC1, and it contains many residues implicated in human NPC disease. We named this NPC1-related protein, TgNCR1. Mammalian NPC1 localizes to endo-lysosomes and promotes the movement of sterols and sphingolipids across the membranes of these organelles. Miscoding patient mutations in NPC1 cause overloading of these lipids in endo-lysosomes. TgNCR1, however, lacks endosomal targeting signals, and localizes to flattened vesicles beneath the plasma membrane of Toxoplasma. When expressed in mammalian NPC1 mutant cells and properly addressed to endo-lysosomes, TgNCR1 restores cholesterol and GM1 clearance from these organelles. To clarify the role of TgNCR1 in the parasite, we genetically disrupted NCR1; mutant parasites were viable. Quantitative lipidomic analyses on the ΔNCR1 strain reveal normal cholesterol levels but an overaccumulation of several species of cholesteryl esters, sphingomyelins and ceramides. ΔNCR1 parasites are also characterized by abundant storage lipid bodies and long membranous tubules derived from their parasitophorous vacuoles. Interestingly, these mutants can generate multiple daughters per single mother cell at high frequencies, allowing fast replication in vitro, and they are slightly more virulent in mice than the parental strain. 
These data suggest that the ΔNCR1 strain has lost the ability to control the intracellular levels of several lipids, which subsequently results in the stimulation of lipid storage, membrane biosynthesis and parasite division. Based on these observations, we ascribe a role for TgNCR1 in lipid homeostasis in Toxoplasma. PMID:22174676
Screening for Protein-DNA Interactions by Automatable DNA-Protein Interaction ELISA

PubMed Central

Schüssler, Axel; Kolukisaoglu, H. Üner; Koch, Grit; Wallmeroth, Niklas; Hecker, Andreas; Thurow, Kerstin; Zell, Andreas; Harter, Klaus; Wanke, Dierk

2013-01-01

DNA-binding proteins (DBPs), such as transcription factors, constitute about 10% of the protein-coding genes in eukaryotic genomes and play pivotal roles in the regulation of chromatin structure and gene expression by binding to short stretches of DNA. Despite their number and importance, only for a minor portion of DBPs the binding sequence had been disclosed. Methods that allow the de novo identification of DNA-binding motifs of known DBPs, such as protein binding microarray technology or SELEX, are not yet suited for high-throughput and automation. To close this gap, we report an automatable DNA-protein-interaction (DPI)-ELISA screen of an optimized double-stranded DNA (dsDNA) probe library that allows the high-throughput identification of hexanucleotide DNA-binding motifs. In contrast to other methods, this DPI-ELISA screen can be performed manually or with standard laboratory automation. Furthermore, output evaluation does not require extensive computational analysis to derive a binding consensus. We could show that the DPI-ELISA screen disclosed the full spectrum of binding preferences for a given DBP. As an example, AtWRKY11 was used to demonstrate that the automated DPI-ELISA screen revealed the entire range of in vitro binding preferences. In addition, protein extracts of AtbZIP63 and the DNA-binding domain of AtWRKY33 were analyzed, which led to a refinement of their known DNA-binding consensi. Finally, we performed a DPI-ELISA screen to disclose the DNA-binding consensus of a yet uncharacterized putative DBP, AtTIFY1. A palindromic TGATCA-consensus was uncovered and we could show that the GATC-core is compulsory for AtTIFY1 binding. This specific interaction between AtTIFY1 and its DNA-binding motif was confirmed by in vivo plant one-hybrid assays in protoplasts. 
Thus, the value and applicability of the DPI-ELISA screen for de novo binding site identification of DBPs, also under automatized conditions, is a promising approach for a deeper understanding of gene regulation in any organism of choice. PMID:24146751
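Two properties mentioned in the abstract above are easy to check in code: that TGATCA is palindromic (it equals its own reverse complement) and that it contains the GATC core. The sketch also enumerates all possible hexanucleotides for scale. The paper's optimized probe library is not described in the abstract and is not reproduced here, and the helper names are hypothetical.

```python
from itertools import product

# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True if the sequence reads the same on both strands, 5' to 3'."""
    s = seq.upper()
    return s == reverse_complement(s)


def all_hexamers() -> list[str]:
    """Every possible hexanucleotide, 4**6 = 4096 sequences."""
    return ["".join(bases) for bases in product("ACGT", repeat=6)]


def contains_core(seq: str, core: str = "GATC") -> bool:
    """True if the core motif occurs anywhere in the sequence."""
    return core in seq.upper()


if __name__ == "__main__":
    motif = "TGATCA"
    print(motif, "palindromic:", is_palindromic(motif))
    print(motif, "contains GATC core:", contains_core(motif))
    hexamers = all_hexamers()
    print("hexamers:", len(hexamers))
    print("palindromic hexamers:", sum(is_palindromic(h) for h in hexamers))
```

Only a small share of hexamers are palindromic, so a palindromic consensus such as TGATCA is a fairly specific finding.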
Keep your fingers off my DNA: protein-protein interactions mediated by C2H2 zinc finger domains.

PubMed

Brayer, Kathryn J; Segal, David J

2008-01-01

Cys2-His2 (C2H2) zinc finger domains (ZFs) were originally identified as DNA-binding domains, and uncharacterized domains are typically assumed to function in DNA binding. However, a growing body of evidence suggests an important and widespread role for these domains in protein binding. There are even examples of zinc fingers that support both DNA and protein interactions, which can be found in well-known DNA-binding proteins such as Sp1, Zif268, and Yin Yang 1 (YY1). C2H2 protein-protein interactions (PPIs) are proving to be more abundant than previously appreciated, more plastic than their DNA-binding counterparts, and more variable and complex in their interaction surfaces. Here we review the current knowledge of over 100 C2H2 zinc finger-mediated PPIs, focusing on what is known about the binding surface, contributions of individual fingers to the interaction, and function. An accurate understanding of zinc finger biology will likely require greater insights into the potential protein interaction capabilities of C2H2 ZFs.
Structural basis of O-GlcNAc recognition by mammalian 14-3-3 proteins.

PubMed

Toleman, Clifford A; Schumacher, Maria A; Yu, Seok-Ho; Zeng, Wenjie; Cox, Nathan J; Smith, Timothy J; Soderblom, Erik J; Wands, Amberlyn M; Kohler, Jennifer J; Boyce, Michael

2018-06-05

O-GlcNAc is an intracellular posttranslational modification that governs myriad cell biological processes and is dysregulated in human diseases. Despite this broad pathophysiological significance, the biochemical effects of most O-GlcNAcylation events remain uncharacterized. One prevalent hypothesis is that O-GlcNAc moieties may be recognized by "reader" proteins to effect downstream signaling. However, no general O-GlcNAc readers have been identified, leaving a considerable gap in the field. To elucidate O-GlcNAc signaling mechanisms, we devised a biochemical screen for candidate O-GlcNAc reader proteins. We identified several human proteins, including 14-3-3 isoforms, that bind O-GlcNAc directly and selectively. We demonstrate that 14-3-3 proteins bind O-GlcNAc moieties in human cells, and we present the structures of 14-3-3β/α and γ bound to glycopeptides, providing biophysical insights into O-GlcNAc-mediated protein-protein interactions. Because 14-3-3 proteins also bind to phospho-serine and phospho-threonine, they may integrate information from O-GlcNAc and O-phosphate signaling pathways to regulate numerous physiological functions.
Physical Mapping of bchG, orf427, and orf177 in the Photosynthesis Gene Cluster of Rhodobacter sphaeroides: Functional Assignment of the Bacteriochlorophyll Synthetase Gene

PubMed Central

Addlesee, Hugh A.; Fiedor, Leszek; Hunter, C. Neil

2000-01-01

The purple photosynthetic bacterium Rhodobacter sphaeroides has within its genome a cluster of photosynthesis-related genes approximately 41 kb in length. In an attempt to identify genes involved in the terminal esterification stage of bacteriochlorophyll biosynthesis, a previously uncharacterized 5-kb region of this cluster was sequenced. Four open reading frames (ORFs) were identified, and each was analyzed by transposon mutagenesis. The product of one of these ORFs, bchG, shows close homologies with (bacterio)chlorophyll synthetases, and mutants in this gene were found to accumulate bacteriopheophorbide, the metal-free derivative of the bacteriochlorophyll precursor bacteriochlorophyllide, suggesting that bchG is responsible for the esterification of bacteriochlorophyllide with an alcohol moiety. This assignment of function to bchG was verified by the performance of assays demonstrating the ability of BchG protein, heterologously synthesized in Escherichia coli, to esterify bacteriochlorophyllide with geranylgeranyl pyrophosphate in vitro, thereby generating bacteriochlorophyll. This step is pivotal to the assembly of a functional photosystem in R. sphaeroides, a model organism for the study of structure-function relationships in photosynthesis. A second gene, orf177, is a member of a large family of isopentenyl diphosphate isomerases, while sequence homologies suggest that a third gene, orf427, may encode an assembly factor for photosynthetic complexes. The function of the remaining ORF, bchP, is the subject of a separate paper (H. Addlesee and C. N. Hunter, J. Bacteriol. 181:7248–7255, 1999). An operonal arrangement of the genes is proposed. PMID:10809697
Hypomorphic mutations in TRNT1 cause retinitis pigmentosa with erythrocytic microcytosis

PubMed Central

DeLuca, Adam P.; Whitmore, S. Scott; Barnes, Jenna; Sharma, Tasneem P.; Westfall, Trudi A.; Scott, C. Anthony; Weed, Matthew C.; Wiley, Jill S.; Wiley, Luke A.; Johnston, Rebecca M.; Schnieders, Michael J.; Lentz, Steven R.; Tucker, Budd A.; Mullins, Robert F.; Scheetz, Todd E.; Stone, Edwin M.; Slusarski, Diane C.

2016-01-01

Retinitis pigmentosa (RP) is a highly heterogeneous group of disorders characterized by degeneration of the retinal photoreceptor cells and progressive loss of vision. While hundreds of mutations in more than 100 genes have been reported to cause RP, discovering the causative mutations in many patients remains a significant challenge. Exome sequencing in an individual affected with non-syndromic RP revealed two plausibly disease-causing variants in TRNT1, a gene encoding a nucleotidyltransferase critical for tRNA processing. A total of 727 additional unrelated individuals with molecularly uncharacterized RP were completely screened for TRNT1 coding sequence variants, and a second family was identified with two members who exhibited a phenotype that was remarkably similar to the index patient. Inactivating mutations in TRNT1 have been previously shown to cause a severe congenital syndrome of sideroblastic anemia, B-cell immunodeficiency, recurrent fevers and developmental delay (SIFD). Complete blood counts of all three of our patients revealed red blood cell microcytosis and anisocytosis with only mild anemia. Characterization of TRNT1 in patient-derived cell lines revealed reduced but detectable TRNT1 protein, consistent with partial function. Suppression of trnt1 expression in zebrafish recapitulated several features of the human SIFD syndrome, including anemia and sensory organ defects. When levels of trnt1 were titrated, visual dysfunction was found in the absence of other phenotypes. The visual defects in the trnt1-knockdown zebrafish were ameliorated by the addition of exogenous human TRNT1 RNA. Our findings indicate that hypomorphic TRNT1 mutations can cause a recessive disease that is almost entirely limited to the retina. PMID:26494905
Characterization of Deletions of the HBA and HBB Loci by Array Comparative Genomic Hybridization

PubMed Central

Sabath, Daniel E.; Bender, Michael A.; Sankaran, Vijay G.; Vamos, Esther; Kentsis, Alex; Yi, Hye-Son; Greisman, Harvey A.

2017-01-01

Thalassemia is among the most common genetic diseases worldwide. α-Thalassemia is usually caused by deletion of one or more of the duplicated HBA genes on chromosome 16. In contrast, most β-thalassemia results from point mutations that decrease or eliminate expression of the HBB gene on chromosome 11. Deletions within the HBB locus result in thalassemia or hereditary persistence of fetal Hb. Although routine diagnostic testing cannot distinguish thalassemia deletions from point mutations, deletional hereditary persistence of fetal Hb is notable for having an elevated HbF level with a normal mean corpuscular volume. A small number of deletions accounts for most α-thalassemias; in contrast, there are no predominant HBB deletions causing β-thalassemia. To facilitate the identification and characterization of deletions of the HBA and HBB globin loci, we performed array-based comparative genomic hybridization using a custom oligonucleotide microarray. We accurately mapped the breakpoints of known and previously uncharacterized HBB deletions, defining the previously uncharacterized deletion breakpoints by PCR amplification and sequencing. The array also successfully identified the common HBA deletions --SEA and --FIL. In summary, comparative genomic hybridization can be used to characterize deletions of the HBA and HBB loci, allowing high-resolution characterization of novel deletions that are not readily detected by PCR-based methods. PMID:26612711
A database analysis method identifies an endogenous trans-acting short-interfering RNA that targets the Arabidopsis ARF2, ARF3, and ARF4 genes

PubMed Central

Williams, Leor; Carles, Cristel C.; Osmont, Karen S.; Fletcher, Jennifer C.

2005-01-01

Two classes of small RNAs, microRNAs and short-interfering RNAs (siRNAs), have been extensively studied in plants and animals. In Arabidopsis, the capacity to uncover previously uncharacterized small RNAs by means of conventional strategies seems to be reaching its limits. To discover new plant small RNAs, we developed a protocol to mine an Arabidopsis nonannotated, noncoding EST database. Using this approach, we identified an endogenous small RNA, trans-acting short-interfering RNA–auxin response factor (tasiR-ARF), that shares a 21- and 22-nt region of sequence similarity with members of the ARF gene family. tasiR-ARF has characteristics of both short-interfering RNA and microRNA, recently defined as tasiRNA. Accumulation of trans-acting siRNA depends on DICER-LIKE1 and RNA-DEPENDENT RNA POLYMERASE6 but not RNA-DEPENDENT RNA POLYMERASE2. We demonstrate that tasiR-ARF targets three ARF genes, ARF2, ARF3/ETT, and ARF4, and that both the tasiR-ARF precursor and its target genes are evolutionarily conserved. The identification of tasiRNA-ARF as a low-abundance, previously uncharacterized small RNA species proves our method to be a useful tool to uncover additional small regulatory RNAs. PMID:15980147
Strategic Protein Target Analysis for Developing Drugs to Stop Dental Caries

PubMed Central

Horst, J.A.; Pieper, U.; Sali, A.; Zhan, L.; Chopra, G.; Samudrala, R.; Featherstone, J.D.B.

2012-01-01

Dental caries is the most common disease to cause irreversible damage in humans. Several therapeutic agents are available to treat or prevent dental caries, but none besides fluoride has significantly influenced the disease burden globally. Etiologic mechanisms of the mutans group streptococci and specific Lactobacillus species have been characterized to various degrees of detail, from identification of physiologic processes to specific proteins. Here, we analyze the entire Streptococcus mutans proteome for potential drug targets by investigating their uniqueness with respect to non-cariogenic dental plaque bacteria, quality of protein structure models, and the likelihood of finding a drug for the active site. Our results suggest specific targets for rational drug discovery, including 15 known virulence factors, 16 proteins for which crystallographic structures are available, and 84 previously uncharacterized proteins, with various levels of similarity to homologs in dental plaque bacteria. This analysis provides a map to streamline the process of clinical development of effective multispecies pharmacologic interventions for dental caries. PMID:22899687

Top-Down Characterization of the Post-Translationally Modified Intact Periplasmic Proteome from the Bacterium Novosphingobium aromaticivorans

DOE PAGES

Wu, Si; Brown, Roslyn N.; Payne, Samuel H.; ...

2013-01-01

The periplasm of Gram-negative bacteria is a dynamic and physiologically important subcellular compartment where the constant exposure to potential environmental insults amplifies the need for proper protein folding and modifications. Top-down proteomics analysis of the periplasmic fraction at the intact protein level provides unrestricted characterization and annotation of the periplasmic proteome, including the post-translational modifications (PTMs) on these proteins. Here, we used single-dimension ultra-high pressure liquid chromatography coupled with Fourier transform mass spectrometry (FTMS) to investigate the intact periplasmic proteome of Novosphingobium aromaticivorans. Our top-down analysis provided the confident identification of 55 proteins in the periplasm and characterized their PTMs, including signal peptide removal, N-terminal methionine excision, acetylation, glutathionylation, pyroglutamate, and disulfide bond formation. This study provides the first experimental evidence for the expression and periplasmic localization of many hypothetical and uncharacterized proteins and the first unrestrictive, large-scale data on PTMs in the bacterial periplasm.
Eros is a novel transmembrane protein that controls the phagocyte respiratory burst and is essential for innate immunity

PubMed Central

Thomas, David C.; Clare, Simon; Sowerby, John M.; Juss, Jatinder K.; Goulding, David A.; van der Weyden, Louise; Prakash, Ananth; Harcourt, Katherine; Mukhopadhyay, Subhankar; Antrobus, Robin; Bateman, Alex

2017-01-01

The phagocyte respiratory burst is crucial for innate immunity. The transfer of electrons to oxygen is mediated by a membrane-bound heterodimer, comprising gp91phox and p22phox subunits. Deficiency of either subunit leads to severe immunodeficiency. We describe Eros (essential for reactive oxygen species), a protein encoded by the previously undefined mouse gene bc017643, and show that it is essential for host defense via the phagocyte NADPH oxidase. Eros is required for expression of the NADPH oxidase components, gp91phox and p22phox. Consequently, Eros-deficient mice quickly succumb to infection. Eros also contributes to the formation of neutrophil extracellular traps (NETs) and impacts on the immune response to melanoma metastases. Eros is an ortholog of the plant protein Ycf4, which is necessary for expression of proteins of the photosynthetic photosystem I complex, itself also an NADPH oxidoreductase. We thus describe the key role of the previously uncharacterized protein Eros in host defense. PMID:28351984
The microbiomes of blowflies and houseflies as bacterial transmission reservoirs.

PubMed

Junqueira, Ana Carolina M; Ratan, Aakrosh; Acerbi, Enzo; Drautz-Moses, Daniela I; Premkrishnan, Balakrishnan N V; Costea, Paul I; Linz, Bodo; Purbojati, Rikky W; Paulo, Daniel F; Gaultier, Nicolas E; Subramanian, Poorani; Hasan, Nur A; Colwell, Rita R; Bork, Peer; Azeredo-Espin, Ana Maria L; Bryant, Donald A; Schuster, Stephan C

2017-11-24

Blowflies and houseflies are mechanical vectors inhabiting synanthropic environments around the world. They feed and breed in fecal and decaying organic matter, but the microbiome they harbour and transport is largely uncharacterized. We sampled 116 individual houseflies and blowflies from varying habitats on three continents and subjected them to high-coverage, whole-genome shotgun sequencing. This allowed for genomic and metagenomic analyses of the host-associated microbiome at the species level. Both fly host species segregate based on principal coordinate analysis of their microbial communities, but they also show an overlapping core microbiome. Legs and wings displayed the largest microbial diversity and were shown to be an important route for microbial dispersion. The environmental sequencing approach presented here detected a stochastic distribution of human pathogens, such as Helicobacter pylori, thereby demonstrating the potential of flies as proxies for environmental and public health surveillance.
Subterranean mammals show convergent regression in ocular genes and enhancers, along with adaptation to tunneling

PubMed Central

Partha, Raghavendran; Chauhan, Bharesh K; Ferreira, Zelia; Robinson, Joseph D; Lathrop, Kira; Nischal, Ken K

2017-01-01

The underground environment imposes unique demands on life that have led subterranean species to evolve specialized traits, many of which evolved convergently. We studied convergence in evolutionary rate in subterranean mammals in order to associate phenotypic evolution with specific genetic regions. We identified a strong excess of vision- and skin-related genes that changed at accelerated rates in the subterranean environment due to relaxed constraint and adaptive evolution. We also demonstrate that ocular-specific transcriptional enhancers were convergently accelerated, whereas enhancers active outside the eye were not. Furthermore, several uncharacterized genes and regulatory sequences demonstrated convergence and thus constitute novel candidate sequences for congenital ocular disorders. The strong evidence of convergence in these species indicates that evolution in this environment is recurrent and predictable and can be used to gain insights into phenotype–genotype relationships. PMID:29035697
SAR11 bacteria linked to ocean anoxia and nitrogen loss

PubMed Central

Tsementzi, Despina; Wu, Jieying; Deutsch, Samuel; Nath, Sangeeta; Rodriguez-R, Luis M; Burns, Andrew S.; Ranjan, Piyush; Sarode, Neha; Malmstrom, Rex R.; Padilla, Cory C.; Stone, Benjamin K.; Bristow, Laura A.; Larsen, Morten; Glass, Jennifer B.; Thamdrup, Bo; Woyke, Tanja; Konstantinidis, Konstantinos T.; Stewart, Frank J.

2016-01-01

Summary Bacteria of the SAR11 clade constitute up to one half of all microbial cells in the oxygen-rich surface ocean. DNA sequences from SAR11 are also abundant in oxygen minimum zones (OMZs) where oxygen falls below detection and anaerobic microbes play important roles in converting bioavailable nitrogen to N2 gas. Evidence for anaerobic metabolism in SAR11 has not yet been observed, and the question of how these bacteria contribute to OMZ biogeochemical cycling is unanswered. Here, we identify the metabolic basis for SAR11 activity in anoxic ocean waters. Genomic analysis of single cells from the world’s largest OMZ revealed diverse and previously uncharacterized SAR11 lineages that peak in abundance at anoxic depths, but are largely undetectable in oxygen-rich ocean regions. OMZ SAR11 contain adaptations to low oxygen, including genes for respiratory nitrate reductases (Nar). SAR11 nar genes were experimentally verified to encode proteins catalyzing the nitrite-producing first step of denitrification and constituted ~40% of all OMZ nar transcripts, with transcription peaking in the zone of maximum nitrate reduction rates. These results redefine the ecological niche of Earth’s most abundant organismal group and suggest an important contribution of SAR11 to nitrite production in OMZs, and thus to pathways of ocean nitrogen loss. PMID:27487207
Global Profiling and Inhibition of Protein Lipidation in Vector and Host Stages of the Sleeping Sickness Parasite Trypanosoma brucei.

PubMed

Wright, Megan H; Paape, Daniel; Price, Helen P; Smith, Deborah F; Tate, Edward W

2016-06-10

The enzyme N-myristoyltransferase (NMT) catalyzes the essential fatty acylation of substrate proteins with myristic acid in eukaryotes and is a validated drug target in the parasite Trypanosoma brucei, the causative agent of African trypanosomiasis (sleeping sickness). N-Myristoylation typically mediates membrane localization of proteins and is essential to the function of many. However, only a handful of proteins are experimentally validated as N-myristoylated in T. brucei. Here, we perform metabolic labeling with an alkyne-tagged myristic acid analogue, enabling the capture of lipidated proteins in insect and host life stages of T. brucei. We further compare this with a longer chain palmitate analogue to explore the chain length-specific incorporation of fatty acids into proteins. Finally, we combine the alkynyl-myristate analogue with NMT inhibitors and quantitative chemical proteomics to globally define N-myristoylated proteins in the clinically relevant bloodstream form parasites. This analysis reveals five ARF family small GTPases, calpain-like proteins, phosphatases, and many uncharacterized proteins as substrates of NMT in the parasite, providing a global view of the scope of this important protein modification and further evidence for the crucial and pleiotropic role of NMT in the cell.
Expression profiling of Crambe abyssinica under arsenate stress identifies genes and gene networks involved in arsenic metabolism and detoxification

PubMed Central

2010-01-01

Background Arsenic contamination is widespread throughout the world and this toxic metalloid is known to cause cancers of organs such as liver, kidney, skin, and lung in humans. In spite of a recent surge in arsenic related studies, we are still far from a comprehensive understanding of arsenic uptake, detoxification, and sequestration in plants. Crambe abyssinica, commonly known as 'abyssinian mustard', is a non-food, high biomass oil seed crop that is naturally tolerant to heavy metals. Moreover, it accumulates significantly higher levels of arsenic as compared to other species of the Brassicaceae family. Thus, C. abyssinica has great potential to be utilized as an ideal inedible crop for phytoremediation of heavy metals and metalloids. However, the mechanism of arsenic metabolism in higher plants, including C. abyssinica, remains elusive. Results To identify the differentially expressed transcripts and the pathways involved in arsenic metabolism and detoxification, C. abyssinica plants were subjected to arsenate stress and a PCR-Select Suppression Subtraction Hybridization (SSH) approach was employed. A total of 105 differentially expressed subtracted cDNAs were sequenced, which were found to represent 38 genes. Those genes encode proteins functioning as antioxidants, metal transporters, reductases, enzymes involved in the protein degradation pathway, and several novel uncharacterized proteins. The transcripts corresponding to the subtracted cDNAs showed strong upregulation by arsenate stress, as confirmed by semi-quantitative RT-PCR. Conclusions Our study revealed novel insights into the plant defense mechanisms and the regulation of genes and gene networks in response to arsenate toxicity.
The differential expression of transcripts encoding glutathione-S-transferases, antioxidants, sulfur metabolism, heat-shock proteins, metal transporters, and enzymes in the ubiquitination pathway of protein degradation as well as several unknown novel proteins serves as molecular evidence for the physiological responses to arsenate stress in plants. Additionally, many of these cDNA clones showing strong upregulation due to arsenate stress could be used as valuable markers. Further characterization of these differentially expressed genes would be useful to develop novel strategies for efficient phytoremediation as well as for engineering arsenic tolerant crops with reduced arsenic translocation to the edible parts of plants. PMID:20546591
Characterization of rat serum amyloid A4 (SAA4): A novel member of the SAA superfamily

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rossmann, Christine; Windpassinger, Christian; Brunner, Daniela

2014-08-08

Highlights: • The full-length rat SAA4 (rSAA4) mRNA was characterized by rapid amplification of cDNA ends. • rSAA4 mRNA has 1830 bases including a GA dinucleotide tandem repeat in the 5′UTR. • Three consecutive C/EBP promoter elements are crucial for transcription of rSAA4. • rSAA4 is abundantly expressed in the liver at the mRNA and protein levels. - Abstract: The serum amyloid A (SAA) family of proteins is encoded by multiple genes, which display allelic variation and a high degree of homology in mammals. The SAA1/2 genes code for non-glycosylated acute-phase SAA1/2 proteins, which may increase up to 1000-fold during inflammation. The SAA4 gene, well characterized in humans (hSAA4) and mice (mSaa4), codes for a SAA4 protein that is glycosylated only in humans. We here report on a previously uncharacterized SAA4 gene (rSAA4) and its product in Rattus norvegicus, the only mammalian species known not to express acute-phase SAA. The exon/intron organization of rSAA4 is similar to that reported for hSAA4 and mSaa4. By performing 5′- and 3′RACE, we identified an rSAA4 mRNA of 1830 bases (including a GA-dinucleotide tandem repeat). Highest rSAA4 mRNA expression was detected in rat liver. In McA-RH7777 rat hepatoma cells, rSAA4 transcription was significantly upregulated in response to LPS and IL-6 while IL-1α/β and TNFα were without effect. Luciferase assays with promoter-truncation constructs identified three proximal C/EBP-elements that mediate expression of rSAA4 in McA-RH7777 cells. In line with sequence prediction, a 14-kDa non-glycosylated SAA4 protein is abundantly expressed in rat liver. Fluorescence microscopy revealed predominant localization of rSAA4-GFP-tagged fusion protein in the ER.
Diversity and evolution of phycobilisomes in marine Synechococcus spp.: a comparative genomics study.

PubMed

Six, Christophe; Thomas, Jean-Claude; Garczarek, Laurence; Ostrowski, Martin; Dufresne, Alexis; Blot, Nicolas; Scanlan, David J; Partensky, Frédéric

2007-01-01

Marine Synechococcus owe their specific vivid color (ranging from blue-green to orange) to their large extrinsic antenna complexes called phycobilisomes, comprising a central allophycocyanin core and rods of variable phycobiliprotein composition. Three major pigment types can be defined depending on the major phycobiliprotein found in the rods (phycocyanin, phycoerythrin I or phycoerythrin II). Among strains containing both phycoerythrins I and II, four subtypes can be distinguished based on the ratio of the two chromophores bound to these phycobiliproteins. Genomes of eleven marine Synechococcus strains recently became available with one to four strains per pigment type or subtype, allowing an unprecedented comparative genomics study of genes involved in phycobilisome metabolism. By carefully comparing the Synechococcus genomes, we have retrieved candidate genes potentially required for the synthesis of phycobiliproteins in each pigment type. This includes linker polypeptides, phycobilin lyases and a number of novel genes of uncharacterized function. Interestingly, strains belonging to a given pigment type have similar phycobilisome gene complements and organization, independent of the core genome phylogeny (as assessed using concatenated ribosomal proteins). While phylogenetic trees based on concatenated allophycocyanin protein sequences are congruent with the latter, those based on phycocyanin and phycoerythrin notably differ and match the Synechococcus pigment types. We conclude that the phycobilisome core has likely evolved together with the core genome, while rods must have evolved independently, possibly by lateral transfer of phycobilisome rod genes or gene clusters between Synechococcus strains, either via viruses or by natural transformation, allowing rapid adaptation to a variety of light niches.
Novel Entries in a Fungal Biofilm Matrix Encyclopedia

PubMed Central

Zarnowski, Robert; Westler, William M.; Lacmbouh, Ghislain Ade; Marita, Jane M.; Bothe, Jameson R.; Bernhardt, Jörg; Lounes-Hadj Sahraoui, Anissa; Fontaine, Joël; Sanchez, Hiram; Hatfield, Ronald D.; Ntambi, James M.; Nett, Jeniel E.; Mitchell, Aaron P.

2014-01-01

ABSTRACT Virulence of Candida is linked with its ability to form biofilms. Once established, biofilm infections are nearly impossible to eradicate. Biofilm cells live immersed in a self-produced matrix, a blend of extracellular biopolymers, many of which are uncharacterized. In this study, we provide a comprehensive analysis of the matrix manufactured by Candida albicans both in vitro and in a clinical niche animal model. We further explore the function of matrix components, including the impact on drug resistance. We uncovered components from each of the macromolecular classes (55% protein, 25% carbohydrate, 15% lipid, and 5% nucleic acid) in the C. albicans biofilm matrix. Three individual polysaccharides were identified and were suggested to interact physically. Surprisingly, a previously identified polysaccharide of functional importance, β-1,3-glucan, comprised only a small portion of the total matrix carbohydrate. Newly described, more abundant polysaccharides included α-1,2 branched α-1,6-mannans (87%) associated with unbranched β-1,6-glucans (13%) in an apparent mannan-glucan complex (MGCx). Functional matrix proteomic analysis revealed 458 distinct activities. The matrix lipids consisted of neutral glycerolipids (89.1%), polar glycerolipids (10.4%), and sphingolipids (0.5%). Examination of matrix nucleic acid identified DNA, primarily noncoding sequences. Several of the in vitro matrix components, including proteins and each of the polysaccharides, were also present in the matrix of a clinically relevant in vivo biofilm. Nuclear magnetic resonance (NMR) analysis demonstrated interaction of aggregate matrix with the antifungal fluconazole, consistent with a role in drug impedance and contribution of multiple matrix components. PMID:25096878
Targeted next-generation sequencing in steroid-resistant nephrotic syndrome: mutations in multiple glomerular genes may influence disease severity.

PubMed

Bullich, Gemma; Trujillano, Daniel; Santín, Sheila; Ossowski, Stephan; Mendizábal, Santiago; Fraga, Gloria; Madrid, Álvaro; Ariceta, Gema; Ballarín, José; Torra, Roser; Estivill, Xavier; Ars, Elisabet

2015-09-01

Genetic diagnosis of steroid-resistant nephrotic syndrome (SRNS) using Sanger sequencing is complicated by the high genetic heterogeneity and phenotypic variability of this disease. We aimed to improve the genetic diagnosis of SRNS by simultaneously sequencing 26 glomerular genes using massive parallel sequencing and to study whether mutations in multiple genes increase disease severity. High-throughput mutation analysis was performed in 50 SRNS and/or focal segmental glomerulosclerosis (FSGS) patients, a validation cohort of 25 patients with known pathogenic mutations, and a discovery cohort of 25 uncharacterized patients with probable genetic etiology. In the validation cohort, we identified the 42 previously known pathogenic mutations across NPHS1, NPHS2, WT1, TRPC6, and INF2 genes. In the discovery cohort, disease-causing mutations in SRNS/FSGS genes were found in nine patients. We detected three patients with mutations in an SRNS/FSGS gene and COL4A3. Two of them were familial cases and presented a more severe phenotype than family members with mutation in only one gene. In conclusion, our results show that massive parallel sequencing is feasible and robust for genetic diagnosis of SRNS/FSGS. Our results indicate that patients carrying mutations in an SRNS/FSGS gene and also in COL4A3 gene have increased disease severity.
Mining Metatranscriptomic Data of a Cyanobacterial Bloom for Patterns of Secondary Metabolism Gene Expression

NASA Astrophysics Data System (ADS)

Penn, K.; Wang, J.; Thompson, J. R.

2012-12-01

The secondary metabolism of bacterial cells produces small molecules that can have both medicinal properties and toxigenic effects. This study focuses on mining metatranscriptomes from a tropical eutrophic water reservoir in Singapore experiencing a cyanobacterial Harmful Algal Bloom dominated by Microcystis, to identify the types of secondary metabolite genes being expressed and by what taxa. A phylogenomic approach as implemented in the online tool Natural Product Domain Seeker (NaPDoS) was used. NaPDoS was recently developed to classify ketosynthase and condensation domains from polyketide synthases and non-ribosomal peptide synthetases, respectively, to provide insight into potential types of pathway products. Water samples from the reservoir were collected six times over a day/night cycle. Total RNA was extracted and subjected to ribosomal depletion followed by cDNA synthesis and next-generation Illumina DNA sequencing, generating 493,468 to 678,064 post-quality-control reads of 95-101 base pairs per sample. Evidence for expression of PKS- and NRPS-type genes, based on identification of ketosynthase (KS) and condensation (C) domains, is present at all time points. KS domains fall into two main phylogenetic groups, type I and type II; the type II group includes domains for fatty acid biosynthesis (fab), which is considered a part of primary metabolism. Type I KS domains are part of the classic PKS natural product biosynthetic genes that make compounds such as antibiotics and toxins such as microcystin. A total of 2849 KS domains were detected in the combined reservoir samples; of these, 1141 were likely from fatty acid biosynthesis and 1708 were related to secondary-metabolism-type KS domains. The most abundant KS domains (485) besides the fab genes are closely related to a KS domain that is not currently experimentally linked to a known secondary metabolite, but the domain is found in four Microcystis genomes along with two other species of cyanobacteria.
The three KS domains from the microcystin pathway make up 238 of the KS domains. The third most abundant KS domain is related to a protein annotated as a heterocyst glycolipid protein from the Microcystis aeruginosa genome sequence; because Microcystis is not known to produce heterocysts, the gene is likely a part of an undescribed type of glycolipid biosynthetic pathway. In relation to NRPS pathways, there were 899 reads classified as condensation domains. The most abundant one is closely related to the C domains from an uncharacterized NRPS pathway. The next most abundant domains are from microcystin (178), aeruginosin (84) and micropeptin (47), all of which are NRPS pathways from Microcystis. Although it is unsurprising that most of the KS and C domains are from Microcystis, it is clear that there are still uncharacterized secondary metabolites produced by this well-studied bacterial genus. Unexpectedly, there are more KS domains related to secondary metabolism than to fab genes. This study provides unique insight into the production of secondary metabolites in a natural setting and supports the view that these have an important ecological function because of the significant transcription levels at all time points. A clear understanding of the ecological function of secondary metabolites will undoubtedly be crucial to future efforts to control cyanoHABs.
Preliminary X-ray crystallographic analysis of SMU.573, a putative sugar kinase from Streptococcus mutans

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhou, Yan-Feng; Li, Lan-Fen; Yang, Cheng

2008-01-01

SMU.573 from S. mutans was expressed in E. coli and crystallized. The crystals belong to space group I4 and 2.5 Å resolution diffraction data were collected at an in-house chromium radiation source. SMU.573 from Streptococcus mutans is a structurally and functionally uncharacterized protein that was selected for structural biology studies. Native and SeMet-labelled proteins were expressed with an N-His tag in Escherichia coli BL21 (DE3) and purified by Ni²⁺-chelating and size-exclusion chromatography. Crystals of the SeMet-labelled protein were obtained by the hanging-drop vapour-diffusion method and a 2.5 Å resolution diffraction data set was collected using an in-house chromium radiation source. The crystals belong to space group I4, with unit-cell parameters a = b = 96.53, c = 56.26 Å, α = β = γ = 90°.
The Cellular Autophagy Pathway Modulates Human T-Cell Leukemia Virus Type 1 Replication

PubMed Central

Tang, Sai-Wen; Chen, Chia-Yen; Klase, Zachary; Zane, Linda

2013-01-01

Autophagy, a general homeostatic process for degradation of cytosolic proteins or organelles, has been reported to modulate the replication of many viruses. The role of autophagy in human T-cell leukemia virus type 1 (HTLV-1) replication has, however, been uncharacterized. Here, we report that HTLV-1 infection increases the accumulation of autophagosomes and that this accumulation increases HTLV-1 production. We found that the HTLV-1 Tax protein increases cellular autophagosome accumulation by acting to block the fusion of autophagosomes to lysosomes, preventing the degradation of the former by the latter. Interestingly, the inhibition of cellular autophagosome-lysosome fusion using bafilomycin A increased the stability of the Tax protein, suggesting that cellular degradation of Tax occurs in part through autophagy. Our current findings indicate that by interrupting the cell's autophagic process, Tax exerts a positive feedback on its own stability. PMID:23175371
Association Between Germline Mutation in VSIG10L and Familial Barrett Neoplasia.

PubMed

Fecteau, Ryan E; Kong, Jianping; Kresak, Adam; Brock, Wendy; Song, Yeunjoo; Fujioka, Hisashi; Elston, Robert; Willis, Joseph E; Lynch, John P; Markowitz, Sanford D; Guda, Kishore; Chak, Amitabh

2016-10-01

Esophageal adenocarcinoma and its precursor lesion Barrett esophagus have seen a dramatic increase in incidence over the past 4 decades yet marked genetic heterogeneity of this disease has precluded advances in understanding its pathogenesis and improving treatment. To identify novel disease susceptibility variants in a familial syndrome of esophageal adenocarcinoma and Barrett esophagus, termed familial Barrett esophagus, by using high-throughput sequencing in affected individuals from a large, multigenerational family. We performed whole exome sequencing (WES) from peripheral lymphocyte DNA on 4 distant relatives from our multiplex, multigenerational familial Barrett esophagus family to identify candidate disease susceptibility variants. Gene variants were filtered, verified, and segregation analysis performed to identify a single candidate variant. Gene expression analysis was done with both quantitative real-time polymerase chain reaction and in situ RNA hybridization. A 3-dimensional organotypic cell culture model of esophageal maturation was utilized to determine the phenotypic effects of our gene variant. We used electron microscopy on esophageal mucosa from an affected family member carrying the gene variant to assess ultrastructural changes. Identification of a novel, germline disease susceptibility variant in a previously uncharacterized gene. A multiplex, multigenerational family with 14 members affected (3 members with esophageal adenocarcinoma and 11 with Barrett esophagus) was identified, and whole-exome sequencing identified a germline mutation (S631G) at a highly conserved serine residue in the uncharacterized gene VSIG10L that segregated in affected members. Transfection of S631G variant into a 3-dimensional organotypic culture model of normal esophageal squamous cells dramatically inhibited epithelial maturation compared with the wild-type. 
VSIG10L exhibited high expression in normal squamous esophagus with marked loss of expression in Barrett-associated lesions. Electron microscopy of squamous esophageal mucosa harboring the S631G variant revealed dilated intercellular spaces and reduced desmosomes. This study presents VSIG10L as a candidate familial Barrett esophagus susceptibility gene, with a putative role in maintaining normal esophageal homeostasis. Further research assessing VSIG10L function may reveal pathways important for esophageal maturation and the pathogenesis of Barrett esophagus and esophageal adenocarcinoma.
Association Between Germline Mutation in VSIG10L and Familial Barrett Neoplasia

PubMed Central

Fecteau, Ryan E.; Kong, Jianping; Kresak, Adam; Brock, Wendy; Song, Yeunjoo; Fujioka, Hisashi; Elston, Robert; Willis, Joseph E.; Lynch, John P.; Markowitz, Sanford D.; Guda, Kishore; Chak, Amitabh

2016-01-01

IMPORTANCE Esophageal adenocarcinoma and its precursor lesion Barrett esophagus have seen a dramatic increase in incidence over the past 4 decades yet marked genetic heterogeneity of this disease has precluded advances in understanding its pathogenesis and improving treatment. OBJECTIVE To identify novel disease susceptibility variants in a familial syndrome of esophageal adenocarcinoma and Barrett esophagus, termed familial Barrett esophagus, by using high-throughput sequencing in affected individuals from a large, multigenerational family. DESIGN, SETTING, AND PARTICIPANTS We performed whole exome sequencing (WES) from peripheral lymphocyte DNA on 4 distant relatives from our multiplex, multigenerational familial Barrett esophagus family to identify candidate disease susceptibility variants. Gene variants were filtered, verified, and segregation analysis performed to identify a single candidate variant. Gene expression analysis was done with both quantitative real-time polymerase chain reaction and in situ RNA hybridization. A 3-dimensional organotypic cell culture model of esophageal maturation was utilized to determine the phenotypic effects of our gene variant. We used electron microscopy on esophageal mucosa from an affected family member carrying the gene variant to assess ultrastructural changes. MAIN OUTCOMES AND MEASURES Identification of a novel, germline disease susceptibility variant in a previously uncharacterized gene. RESULTS A multiplex, multigenerational family with 14 members affected (3 members with esophageal adenocarcinoma and 11 with Barrett esophagus) was identified, and whole-exome sequencing identified a germline mutation (S631G) at a highly conserved serine residue in the uncharacterized gene VSIG10L that segregated in affected members. Transfection of S631G variant into a 3-dimensional organotypic culture model of normal esophageal squamous cells dramatically inhibited epithelial maturation compared with the wild-type. 
VSIG10L exhibited high expression in normal squamous esophagus with marked loss of expression in Barrett-associated lesions. Electron microscopy of squamous esophageal mucosa harboring the S631G variant revealed dilated intercellular spaces and reduced desmosomes. CONCLUSIONS AND RELEVANCE This study presents VSIG10L as a candidate familial Barrett esophagus susceptibility gene, with a putative role in maintaining normal esophageal homeostasis. Further research assessing VSIG10L function may reveal pathways important for esophageal maturation and the pathogenesis of Barrett esophagus and esophageal adenocarcinoma. PMID:27467440
LINKIN, a new transmembrane protein necessary for cell adhesion

PubMed Central

Kato, Mihoko; Chou, Tsui-Fen; Yu, Collin Z; DeModena, John; Sternberg, Paul W

2014-01-01

In epithelial collective migration, leader and follower cells migrate while maintaining cell–cell adhesion and tissue polarity. We have identified a conserved protein and interactors required for maintaining cell adhesion during a simple collective migration in the developing C. elegans male gonad. LINKIN is a previously uncharacterized, transmembrane protein conserved throughout Metazoa. We identified seven atypical FG–GAP domains in the extracellular domain, which potentially folds into a β-propeller structure resembling the α-integrin ligand-binding domain. C. elegans LNKN-1 localizes to the plasma membrane of all gonadal cells, with apical and lateral bias. We identified the LINKIN interactors RUVBL1, RUVBL2, and α-tubulin by using SILAC mass spectrometry on human HEK 293T cells and testing candidates for lnkn-1-like function in C. elegans male gonad. We propose that LINKIN promotes adhesion between neighboring cells through its extracellular domain and regulates microtubule dynamics through RUVBL proteins at its intracellular domain. DOI: http://dx.doi.org/10.7554/eLife.04449.001 PMID:25437307
Mutant phenotypes for thousands of bacterial genes of unknown function

DOE PAGES

Price, Morgan N.; Wetmore, Kelly M.; Waters, R. Jordan; ...

2018-05-16

One-third of all protein-coding genes from bacterial genomes cannot be annotated with a function. Here, to investigate the functions of these genes, we present genome-wide mutant fitness data from 32 diverse bacteria across dozens of growth conditions. We identified mutant phenotypes for 11,779 protein-coding genes that had not been annotated with a specific function. Many genes could be associated with a specific condition because the gene affected fitness only in that condition, or with another gene in the same bacterium because they had similar mutant phenotypes. Of the poorly annotated genes, 2,316 had associations that have high confidence because they are conserved in other bacteria. By combining these conserved associations with comparative genomics, we identified putative DNA repair proteins; in addition, we propose specific functions for poorly annotated enzymes and transporters and for uncharacterized protein families. Lastly, our study demonstrates the scalability of microbial genetics and its utility for improving gene annotations.
Mutant phenotypes for thousands of bacterial genes of unknown function

DOE Office of Scientific and Technical Information (OSTI.GOV)

Price, Morgan N.; Wetmore, Kelly M.; Waters, R. Jordan

One-third of all protein-coding genes from bacterial genomes cannot be annotated with a function. Here, to investigate the functions of these genes, we present genome-wide mutant fitness data from 32 diverse bacteria across dozens of growth conditions. We identified mutant phenotypes for 11,779 protein-coding genes that had not been annotated with a specific function. Many genes could be associated with a specific condition because the gene affected fitness only in that condition, or with another gene in the same bacterium because they had similar mutant phenotypes. Of the poorly annotated genes, 2,316 had associations that have high confidence because they are conserved in other bacteria. By combining these conserved associations with comparative genomics, we identified putative DNA repair proteins; in addition, we propose specific functions for poorly annotated enzymes and transporters and for uncharacterized protein families. Lastly, our study demonstrates the scalability of microbial genetics and its utility for improving gene annotations.
The protein arginine methyltransferase PRMT5 promotes D2-like dopamine receptor signaling

PubMed Central

Likhite, Neah; Jackson, Christopher A.; Liang, Mao-Shih; Krzyzanowski, Michelle C.; Lei, Pedro; Wood, Jordan F.; Birkaya, Barbara; Michaels, Kerry L.; Andreadis, Stelios T.; Clark, Stewart D.; Yu, Michael C.; Ferkey, Denise M.

2017-01-01

Protein arginine methylation regulates diverse functions of eukaryotic cells, including gene expression, the DNA damage response, and circadian rhythms. We showed that arginine residues within the third intracellular loop of the human D2 dopamine receptor, which are conserved in the DOP-3 receptor in the nematode Caenorhabditis elegans, were methylated by protein arginine methyl-transferase 5 (PRMT5). By mutating these arginine residues, we further showed that their methylation enhanced the D2 receptor–mediated inhibition of cyclic adenosine monophosphate (cAMP) signaling in cultured human embryonic kidney (HEK) 293T cells. Analysis of prmt-5–deficient worms indicated that methylation promoted the dopamine-mediated modulation of chemosensory and locomotory behaviors in C. elegans through the DOP-3 receptor. In addition to delineating a previously uncharacterized means of regulating GPCR (heterotrimeric guanine nucleotide–binding protein–coupled receptor) signaling, these findings may lead to the development of a new class of pharmacological therapies that modulate GPCR signaling by changing the methylation status of these key proteins. PMID:26554819

Convergence of isoprene and polyketide biosynthetic machinery: isoprenyl-S-carrier proteins in the pksX pathway of Bacillus subtilis.

PubMed

Calderone, Christopher T; Kowtoniuk, Walter E; Kelleher, Neil L; Walsh, Christopher T; Dorrestein, Pieter C

2006-06-13

The pksX gene cluster from Bacillus subtilis is predicted to encode the biosynthesis of an as yet uncharacterized hybrid nonribosomal peptide/polyketide secondary metabolite. We used a combination of biochemical and mass spectrometric techniques to assign functional roles to the proteins AcpK, PksC, PksL, PksF, PksG, PksH, and PksI, and we conclude that they act to incorporate an acetate-derived beta-methyl branch on an acetoacetyl-S-carrier protein and ultimately generate a Delta(2)-isoprenyl-S-carrier protein. This work highlights the power of mass spectrometry to elucidate the functions of orphan biosynthetic enzymes, and it details a mechanism by which single-carbon beta-branches can be inserted into polyketide-like structures. This pathway represents a noncanonical route to the construction of prenyl units and serves as a prototype for the intersection of isoprenoid and polyketide biosynthetic manifolds in other natural product biosynthetic pathways.
Identification of Thiotetronic Acid Antibiotic Biosynthetic Pathways by Target-directed Genome Mining.

PubMed

Tang, Xiaoyu; Li, Jie; Millán-Aguiñaga, Natalie; Zhang, Jia Jia; O'Neill, Ellis C; Ugalde, Juan A; Jensen, Paul R; Mantovani, Simone M; Moore, Bradley S

2015-12-18

Recent genome sequencing efforts have led to the rapid accumulation of uncharacterized or "orphaned" secondary metabolic biosynthesis gene clusters (BGCs) in public databases. This increase in DNA-sequenced big data has given rise to significant challenges in the applied field of natural product genome mining, including (i) how to prioritize the characterization of orphan BGCs and (ii) how to rapidly connect genes to biosynthesized small molecules. Here, we show that by correlating putative antibiotic resistance genes that encode target-modified proteins with orphan BGCs, we predict the biological function of pathway specific small molecules before they have been revealed in a process we call target-directed genome mining. By querying the pan-genome of 86 Salinispora bacterial genomes for duplicated house-keeping genes colocalized with natural product BGCs, we prioritized an orphan polyketide synthase-nonribosomal peptide synthetase hybrid BGC (tlm) with a putative fatty acid synthase resistance gene. We employed a new synthetic double-stranded DNA-mediated cloning strategy based on transformation-associated recombination to efficiently capture tlm and the related ttm BGCs directly from genomic DNA and to heterologously express them in Streptomyces hosts. We show the production of a group of unusual thiotetronic acid natural products, including the well-known fatty acid synthase inhibitor thiolactomycin that was first described over 30 years ago, yet never at the genetic level in regards to biosynthesis and autoresistance. This finding not only validates the target-directed genome mining strategy for the discovery of antibiotic producing gene clusters without a priori knowledge of the molecule synthesized but also paves the way for the investigation of novel enzymology involved in thiotetronic acid natural product biosynthesis.
Bacterial sulfite dehydrogenases in organotrophic metabolism: separation and identification in Cupriavidus necator H16 and in Delftia acidovorans SPH-1.

PubMed

Denger, Karin; Weinitschke, Sonja; Smits, Theo H M; Schleheck, David; Cook, Alasdair M

2008-01-01

The utilization of organosulfonates as carbon sources by aerobic or nitrate-reducing bacteria usually involves a measurable, uncharacterized sulfite dehydrogenase. This is tacitly assumed to be sulfite : ferricytochrome-c oxidoreductase [EC 1.8.2.1], despite negligible interaction with (eukaryotic) cytochrome c: the enzyme is assayed at high specific activity with ferricyanide as electron acceptor. Purified periplasmic sulfite dehydrogenases (SorAB, SoxCD) are known from chemoautotrophic growth and are termed 'sulfite oxidases' by bioinformatic services. The catalytic unit (SorA, SoxC; termed 'sulfite oxidases' cd02114 and cd02113, respectively) binds a molybdenum-cofactor (Moco), and involves a cytochrome c (SorB, SoxD) as electron acceptor. The genomes of several bacteria that express a sulfite dehydrogenase during heterotrophic growth contain neither sorAB nor soxCD genes; others contain at least four paralogues, for example Cupriavidus necator H16, which is known to express an inducible sulfite dehydrogenase during growth with taurine (2-aminoethanesulfonate). This soluble enzyme was enriched 320-fold in four steps. The 40 kDa protein (denatured) had an N-terminal amino acid sequence which started at position 42 of the deduced sequence of H16_B0860 (termed 'sulfite oxidase' cd02114), which we named SorA. The neighbouring gene is an orthologue of sorB, and the sorAB genes were co-transcribed. Cell fractionation showed SorA to be periplasmic. The corresponding enzyme in Delftia acidovorans SPH-1 was enriched 270-fold, identified as Daci_0055 (termed 'sulfite oxidase' cd02110) and has a cytochrome c encoded downstream. We presume, from genomic data for bacteria and archaea, that there are several subgroups of sulfite dehydrogenases, which all contain a Moco, and transfer electrons to a specific cytochrome c.
Follow-up on long-term antiretroviral therapy for cats infected with feline immunodeficiency virus.

PubMed

Medeiros, Sheila de Oliveira; Abreu, Celina Monteiro; Delvecchio, Rodrigo; Ribeiro, Anísia Praxedes; Vasconcelos, Zilton; Brindeiro, Rodrigo de Moraes; Tanuri, Amilcar

2016-04-01

Feline immunodeficiency virus (FIV) is a lentivirus that induces AIDS-like disease in cats. Some of the antiretroviral drugs available to treat patients with HIV type 1 are used to treat FIV-infected cats; however, antiretroviral therapy (ART) is not used in cats as a long-term treatment. In this study, the effects of long-term ART were evaluated in domestic cats treated initially with the nucleoside reverse transcriptase inhibitor (NRTI) zidovudine (AZT) over a period ranging from 5 to 6 years, followed by a regimen of the NRTI lamivudine (3TC) plus AZT over 3 years. Viral load, sequencing of pol (reverse transcriptase [RT]) region and CD4:CD8 lymphocyte ratio were evaluated during and after treatment. Untreated cats were evaluated as a control group. CD4:CD8 ratios were lower, and uncharacterized resistance mutations were found in the RT region in the group of treated cats. A slight increase in viral load was observed in some cats after discontinuing treatment. The data strongly suggest that treated cats were resistant to therapy, and uncharacterized resistance mutations in the RT gene of FIV were selected for by AZT. Few studies have been conducted to evaluate the effect of long-term antiretroviral therapy in cats. To date, resistance mutations have not been described in vivo. © ISFM and AAFP 2015.
Product analysis illuminates the final steps of IES deletion in Tetrahymena thermophila

PubMed Central

Saveliev, Sergei V.; Cox, Michael M.

2001-01-01

DNA sequences (IES elements) eliminated from the developing macronucleus in the ciliate Tetrahymena thermophila are released as linear fragments, which have now been detected and isolated. A PCR-mediated examination of fragment end structures reveals three types of strand scission events, reflecting three steps in the deletion process. New evidence is provided for two steps proposed previously: an initiating double-stranded cleavage, and strand transfer to create a branched deletion intermediate. The fragment ends provide evidence for a previously uncharacterized third step: the branched DNA strand is cleaved at one of several defined sites located within 15–16 nucleotides of the IES boundary, liberating the deleted DNA in a linear form. PMID:11406601
Product analysis illuminates the final steps of IES deletion in Tetrahymena thermophila.

PubMed

Saveliev, S V; Cox, M M

2001-06-15

DNA sequences (IES elements) eliminated from the developing macronucleus in the ciliate Tetrahymena thermophila are released as linear fragments, which have now been detected and isolated. A PCR-mediated examination of fragment end structures reveals three types of strand scission events, reflecting three steps in the deletion process. New evidence is provided for two steps proposed previously: an initiating double-stranded cleavage, and strand transfer to create a branched deletion intermediate. The fragment ends provide evidence for a previously uncharacterized third step: the branched DNA strand is cleaved at one of several defined sites located within 15-16 nucleotides of the IES boundary, liberating the deleted DNA in a linear form.
Protein complexes, big data, machine learning and integrative proteomics: lessons learned over a decade of systematic analysis of protein interaction networks.

PubMed

Havugimana, Pierre C; Hu, Pingzhao; Emili, Andrew

2017-10-01

Elucidation of the networks of physical (functional) interactions present in cells and tissues is fundamental for understanding the molecular organization of biological systems, the mechanistic basis of essential and disease-related processes, and for functional annotation of previously uncharacterized proteins (via guilt-by-association or -correlation). After a decade in the field, we felt it timely to document our own experiences in the systematic analysis of protein interaction networks. Areas covered: Researchers worldwide have contributed innovative experimental and computational approaches that have driven the rapidly evolving field of 'functional proteomics'. These include mass spectrometry-based methods to characterize macromolecular complexes on a global-scale and sophisticated data analysis tools - most notably machine learning - that allow for the generation of high-quality protein association maps. Expert commentary: Here, we recount some key lessons learned, with an emphasis on successful workflows, and challenges, arising from our own and other groups' ongoing efforts to generate, interpret and report proteome-scale interaction networks in increasingly diverse biological contexts.
PiSCP1 and PiCDPK2 Localize to Peroxisomes and Are Involved in Pollen Tube Growth in Petunia Inflata

PubMed Central

Guo, Feng; Yoon, Gyeong Mee; McCubbin, Andrew G.

2013-01-01

Petunia inflata small CDPK-interacting protein 1 (PiSCP1) was identified as a pollen expressed PiCDPK1 interacting protein using the yeast two hybrid system and the interaction confirmed using pull-down and phosphorylation assays. PiSCP1 is pollen specific and shares amino acid homology with uncharacterized proteins from diverse species of higher plants, but no protein of known function. Expression of PiSCP1-GFP in vivo inhibited pollen tube growth and was shown to localize to peroxisomes in growing pollen tubes. As PiCDPK1 is plasma membrane localized, we investigated the localization of a second isoform, PiCDPK2, and show that it co-localizes to peroxisomes with PiSCP1 and that the two proteins interact in the yeast 2 hybrid interaction assay, suggesting that interaction with the latter CDPK isoform is likely the one of biological relevance. Both PiCDPK2 and PiSCP1 affect pollen tube growth, presumably by mediating peroxisome function, however how they do so is currently not clear. PMID:27137367
SOLO: a meiotic protein required for centromere cohesion, coorientation, and SMC1 localization in Drosophila melanogaster.

PubMed

Yan, Rihui; Thomas, Sharon E; Tsai, Jui-He; Yamada, Yukihiro; McKee, Bruce D

2010-02-08

Sister chromatid cohesion is essential to maintain stable connections between homologues and sister chromatids during meiosis and to establish correct centromere orientation patterns on the meiosis I and II spindles. However, the meiotic cohesion apparatus in Drosophila melanogaster remains largely uncharacterized. We describe a novel protein, sisters on the loose (SOLO), which is essential for meiotic cohesion in Drosophila. In solo mutants, sister centromeres separate before prometaphase I, disrupting meiosis I centromere orientation and causing nondisjunction of both homologous and sister chromatids. Centromeric foci of the cohesin protein SMC1 are absent in solo mutants at all meiotic stages. SOLO and SMC1 colocalize to meiotic centromeres from early prophase I until anaphase II in wild-type males, but both proteins disappear prematurely at anaphase I in mutants for mei-S332, which encodes the Drosophila homologue of the cohesin protector protein shugoshin. The solo mutant phenotypes and the localization patterns of SOLO and SMC1 indicate that they function together to maintain sister chromatid cohesion in Drosophila meiosis.
Eros is a novel transmembrane protein that controls the phagocyte respiratory burst and is essential for innate immunity.

PubMed

Thomas, David C; Clare, Simon; Sowerby, John M; Pardo, Mercedes; Juss, Jatinder K; Goulding, David A; van der Weyden, Louise; Storisteanu, Daniel; Prakash, Ananth; Espéli, Marion; Flint, Shaun; Lee, James C; Hoenderdos, Kim; Kane, Leanne; Harcourt, Katherine; Mukhopadhyay, Subhankar; Umrania, Yagnesh; Antrobus, Robin; Nathan, James A; Adams, David J; Bateman, Alex; Choudhary, Jyoti S; Lyons, Paul A; Condliffe, Alison M; Chilvers, Edwin R; Dougan, Gordon; Smith, Kenneth G C

2017-04-03

The phagocyte respiratory burst is crucial for innate immunity. The transfer of electrons to oxygen is mediated by a membrane-bound heterodimer, comprising gp91phox and p22phox subunits. Deficiency of either subunit leads to severe immunodeficiency. We describe Eros (essential for reactive oxygen species), a protein encoded by the previously undefined mouse gene bc017643, and show that it is essential for host defense via the phagocyte NADPH oxidase. Eros is required for expression of the NADPH oxidase components, gp91phox and p22phox. Consequently, Eros-deficient mice quickly succumb to infection. Eros also contributes to the formation of neutrophil extracellular traps (NETs) and impacts on the immune response to melanoma metastases. Eros is an ortholog of the plant protein Ycf4, which is necessary for expression of proteins of the photosynthetic photosystem 1 complex, itself also an NADPH oxidoreductase. We thus describe the key role of the previously uncharacterized protein Eros in host defense. © 2017 Thomas et al.
Genome analysis of the thermoacidophilic archaeon Acidianus copahuensis focusing on the metabolisms associated to biomining activities.

PubMed

Urbieta, María Sofía; Rascovan, Nicolás; Vázquez, Martín P; Donati, Edgardo

2017-06-06

Several archaeal species from the order Sulfolobales are interesting from the biotechnological point of view due to their biomining capacities. Within this group, the genus Acidianus contains four biomining species (from ten known Acidianus species), but none of these have their genome sequenced. To get insights into the genetic potential and metabolic pathways involved in the biomining activity of this group, we sequenced the genome of Acidianus copahuensis ALE1 strain, a novel thermoacidophilic crenarchaeon (optimum growth: 75 °C, pH 3) isolated from the volcanic geothermal area of Copahue at Neuquén province in Argentina. Previous experimental characterization of A. copahuensis revealed a high biomining potential, exhibited as high oxidation activity of sulfur and sulfur compounds, ferrous iron and sulfide minerals (e.g., pyrite). This strain is also autotrophic and tolerant to heavy metals; thus, it can grow under adverse conditions for most forms of life with a low nutrient demand, conditions that are commonly found in mining environments. In this work we analyzed the genome of Acidianus copahuensis and describe the genetic pathways involved in biomining processes. We identified the enzymes that are most likely involved in growth on sulfur and ferrous iron oxidation as well as those involved in autotrophic carbon fixation. We also found that the A. copahuensis genome gathers different features that are only present in particular lineages or species from the order Sulfolobales, some of which are involved in biomining. We found that although most of its genes (81%) were found in at least one other Sulfolobales species, it is not specifically closer to any particular species (60-70% of proteins shared with each of them). Although almost one fifth of A. copahuensis proteins are not found in any other Sulfolobales species, most of them corresponded to hypothetical proteins from uncharacterized metabolisms.
In this work we identified the genes responsible for the biomining metabolisms that we have previously observed experimentally. We provide a landscape of the metabolic potentials of this strain in the context of Sulfolobales and propose various pathways and cellular processes not yet fully understood that can use A. copahuensis as an experimental model to further understand the fascinating biology of thermoacidophilic biomining archaea.
Systematic Proteomic Approach to Characterize the Impacts of ...

EPA Pesticide Factsheets

Chemical interactions have posed a big challenge in toxicity characterization and human health risk assessment of environmental mixtures. To characterize the impacts of chemical interactions on protein and cytotoxicity responses to environmental mixtures, we established a systems biology approach integrating proteomics, bioinformatics, statistics, and computational toxicology to measure expression or phosphorylation levels of 21 critical toxicity pathway regulators and 445 downstream proteins in human BEAS-2B cells treated with 4 concentrations of nickel, 2 concentrations each of cadmium and chromium, as well as 12 defined binary and 8 defined ternary mixtures of these metals in vitro. Multivariate statistical analysis and mathematical modeling of the metal-mediated proteomic response patterns showed a high correlation between changes in protein expression or phosphorylation and cellular toxic responses to both individual metals and metal mixtures. Of the identified correlated proteins, only a small set of proteins including HIF-1a is likely to be responsible for selective cytotoxic responses to different metals and metals mixtures. Furthermore, support vector machine learning was utilized to computationally predict protein responses to uncharacterized metal mixtures using experimentally generated protein response profiles corresponding to known metal mixtures. This study provides a novel proteomic approach for characterization and prediction of toxicities of
Bt proteins Cry1Ah and Cry2Ab do not affect cotton aphid Aphis gossypii and ladybeetle Propylea japonica

PubMed Central

Zhao, Yao; Zhang, Shuai; Luo, Jun-Yu; Wang, Chun-Yi; Lv, Li-Min; Wang, Xiao-Ping; Cui, Jin-Jie; Lei, Chao-Liang

2016-01-01

Plant varieties expressing the Bt (Bacillus thuringiensis) insecticidal proteins Cry1Ah and Cry2Ab have potential commercialization prospects in China. However, their potential effects on non-target arthropods (NTAs) remain uncharacterized. The cotton aphid Aphis gossypii is a worldwide pest that damages various important crops. The ladybeetle Propylea japonica is a common and abundant natural enemy in many cropping systems in East Asia. In the present study, the effects of Cry1Ah and Cry2Ab proteins on A. gossypii and P. japonica were assessed from three aspects. First, neither of the Cry proteins affected the growth or developmental characteristics of the two test insects. Second, the expression levels of the detoxification-related genes of the two test insects did not change significantly in either Cry protein treatment. Third, neither of the Cry proteins had a favourable effect on the expression of genes associated with the amino acid metabolism of A. gossypii and the nutrition utilization of P. japonica. In conclusion, the Cry1Ah and Cry2Ab proteins do not appear to affect the cotton aphid A. gossypii or the ladybeetle P. japonica. PMID:26829252
Dissecting Stop Transfer versus Conservative Sorting Pathways for Mitochondrial Inner Membrane Proteins in Vivo

PubMed Central

Park, Kwangjin; Botelho, Salomé Calado; Hong, Joonki; Österberg, Marie; Kim, Hyun

2013-01-01

Mitochondrial inner membrane proteins that carry an N-terminal presequence are sorted by one of two pathways: stop transfer or conservative sorting. However, the sorting pathway is known for only a small number of proteins, in part due to the lack of robust experimental tools with which to study. Here we present an approach that facilitates determination of inner membrane protein sorting pathways in vivo by fusing a mitochondrial inner membrane protein to the C-terminal part of Mgm1p containing the rhomboid cleavage region. We validated the Mgm1 fusion approach using a set of proteins for which the sorting pathway is known, and determined sorting pathways of inner membrane proteins for which the sorting mode was previously uncharacterized. For Sdh4p, a multispanning membrane protein, our results suggest that both conservative sorting and stop transfer mechanisms are required for insertion. Furthermore, the sorting process of Mgm1 fusion proteins was analyzed under different growth conditions and yeast mutant strains that were defective in the import motor or the m-AAA protease function. Our results show that the sorting of mitochondrial proteins carrying moderately hydrophobic transmembrane segments is sensitive to cellular conditions, implying that mitochondrial import and membrane sorting in the physiological environment may be dynamically tuned. PMID:23184936
Identification of Several Mutations in ATP2C1 in Lebanese Families: Insight into the Pathogenesis of Hailey-Hailey Disease

PubMed Central

Btadini, Waed; Abou Hassan, Ossama K.; Saadeh, Dana; Abbas, Ossama; Ballout, Farah; Kibbi, Abdul-Ghani; Dbaibo, Ghassan; Darwiche, Nadine; Nemer, Georges; Kurban, Mazen

2015-01-01

Background Hailey-Hailey disease (HHD) is an inherited blistering dermatosis characterized by recurrent erosions and erythematous plaques that generally manifest in intertriginous areas. Genetically, HHD is an autosomal dominant disease, resulting from heterozygous mutations in ATP2C1, which encodes a Ca2+/Mn2+ ATPase. In this study, we aimed to identify and analyze mutations in five patients from unrelated families diagnosed with HHD and to study the underlying molecular pathogenesis. Objectives To genetically study Lebanese families with HHD, and the underlying molecular pathogenesis of the disease. Methods We performed DNA sequencing for the coding sequence and exon-intron boundaries of ATP2C1. Heat shock experiments were done on several cell types. This was followed by real-time PCR and western blotting for ATP2C1, caspase 3, and PARP proteins to examine any possible role of apoptosis in HHD. This was followed by TUNEL staining to confirm the western blotting results. We then performed heat shock experiments on neonatal rat primary cardiomyocytes. Results Four mutations were detected, three of which were novel and one recurrent mutation in two families. Because HHD requires both the genetic alteration and environmental stress in order to manifest, we performed heat shock experiments on fibroblasts (HH and normal) and HaCaT cells, mimicking the environmental factor seen in HHD. It was found that stress stimuli, represented here as temperature stress, lead to an increase in the mRNA and protein levels of ATP2C1 in heat-shocked cells as compared to non-heat-shocked ones. However, the increase in ATP2C1 and heat shock protein hsp90 is significantly lower in HH fibroblasts in comparison to normal fibroblasts and HaCaT cells. We did not find a role for apoptosis in the pathogenesis of HHD. A similar approach (heat shock experiments) done on rat cardiomyocytes led to a significant variation in ATP2C1 transcript and protein levels.
Conclusion This is the first genetic report of HHD from Lebanon in which we identified three novel mutations in ATP2C1 and shed light on the molecular mechanisms and pathogenesis of HHD by linking stress signals like heat shock to the observed phenotypes. This link was also found in cultured cardiomyocytes suggesting thus a yet uncharacterized cardiac phenotype in HHD patients masked by its in-expressivity in normal health conditions. PMID:25658765
FmvB: A Francisella tularensis Magnesium-Responsive Outer Membrane Protein that Plays a Role in Virulence

PubMed Central

Wu, Xiaojun; Ren, Guoping; Gunning, William T.; Weaver, David A.; Kalinoski, Andrea L.; Khuder, Sadik A.; Huntley, Jason F.

2016-01-01

Francisella tularensis is the causative agent of the lethal disease tularemia. Despite decades of research, little is understood about why F. tularensis is so virulent. Bacterial outer membrane proteins (OMPs) are involved in various virulence processes, including protein secretion, host cell attachment, and intracellular survival. Many pathogenic bacteria require metals for intracellular survival and OMPs often play important roles in metal uptake. Previous studies identified three F. tularensis OMPs that play roles in iron acquisition. In this study, we examined two previously uncharacterized proteins, FTT0267 (named fmvA, for Francisella metal and virulence) and FTT0602c (fmvB), which are homologs of the previously studied F. tularensis iron acquisition genes and are predicted OMPs. To study the potential roles of FmvA and FmvB in metal acquisition and virulence, we first examined fmvA and fmvB expression following pulmonary infection of mice, finding that fmvB was upregulated up to 5-fold during F. tularensis infection of mice. Despite sequence homology to previously-characterized iron-acquisition genes, FmvA and FmvB do not appear to be involved in iron uptake, as neither fmvA nor fmvB was upregulated in iron-limiting media and neither ΔfmvA nor ΔfmvB exhibited growth defects in iron limitation. However, when other metals were examined in this study, magnesium-limitation significantly induced fmvB expression, ΔfmvB was found to express significantly higher levels of lipopolysaccharide (LPS) in magnesium-limiting medium, and increased numbers of surface protrusions were observed on ΔfmvB in magnesium-limiting medium, compared to wild-type F. tularensis grown in magnesium-limiting medium. RNA sequencing analysis of ΔfmvB revealed the potential mechanism for increased LPS expression, as LPS synthesis genes kdtA and wbtA were significantly upregulated in ΔfmvB, compared with wild-type F. tularensis.
To provide further evidence for the potential role of FmvB in magnesium uptake, we demonstrated that FmvB was outer membrane-localized. Finally, ΔfmvB was found to be attenuated in mice and cytokine analyses revealed that ΔfmvB-infected mice produced lower levels of pro-inflammatory cytokines, including GM-CSF, IL-3, and IL-10, compared with mice infected with wild-type F. tularensis. Taken together, although the function of FmvA remains unknown, FmvB appears to play a role in magnesium uptake and F. tularensis virulence. These results may provide new insights into the importance of magnesium for intracellular pathogens. PMID:27513341
Saturation Mutagenesis of Burkholderia cepacia R34 2,4-Dinitrotoluene Dioxygenase at DntAc Valine 350 for Synthesizing Nitrohydroquinone, Methylhydroquinone, and Methoxyhydroquinone

PubMed Central

Keenan, Brendan G.; Leungsakul, Thammajun; Smets, Barth F.; Wood, Thomas K.

2004-01-01

Saturation mutagenesis of the 2,4-dinitrotoluene dioxygenase (DDO) of Burkholderia cepacia R34 at position valine 350 of the DntAc α-subunit generated mutant V350F with significantly increased activity towards o-nitrophenol (47 times), m-nitrophenol (34 times), and o-methoxyphenol (174 times) as well as an expanded substrate range that now includes m-methoxyphenol, o-cresol, and m-cresol (wild-type DDO had no detectable activity for these substrates). Another mutant, V350M, also displays increased activity towards o-nitrophenol (20 times) and o-methoxyphenol (162 times) as well as novel activity towards o-cresol. Products were synthesized using whole Escherichia coli TG1 cells expressing the recombinant R34 dntA loci from pBS(Kan)R34, and the initial rates of product formation were determined at 1 mM substrate by reverse-phase high-pressure liquid chromatography. V350F produced both nitrohydroquinone at a rate of 0.75 ± 0.15 nmol/min/mg of protein and 3-nitrocatechol at a rate of 0.069 ± 0.001 nmol/min/mg of protein from o-nitrophenol, 4-nitrocatechol from m-nitrophenol at 0.29 ± 0.02 nmol/min/mg of protein, methoxyhydroquinone from o-methoxyphenol at 2.5 ± 0.6 nmol/min/mg of protein, methoxyhydroquinone from m-methoxyphenol at 0.55 ± 0.02 nmol/min/mg of protein, both methylhydroquinone at 1.52 ± 0.02 nmol/min/mg of protein and 2-hydroxybenzyl alcohol at 0.74 ± 0.05 nmol/min/mg of protein from o-cresol, and methylhydroquinone at 0.43 ± 0.1 nmol/min/mg of protein from m-cresol. V350M produced both nitrohydroquinone at a rate of 0.33 nmol/min/mg of protein and 3-nitrocatechol at 0.089 nmol/min/mg of protein from o-nitrophenol, methoxyhydroquinone from o-methoxyphenol at 2.4 nmol/min/mg of protein, methylhydroquinone at 1.97 nmol/min/mg of protein and 2-hydroxybenzyl alcohol at 0.11 nmol/min/mg of protein from o-cresol. 
The DDO variants V350F and V350M also exhibited 10-fold-enhanced activity towards naphthalene (8 ± 2.6 nmol/min/mg of protein), forming (1R,2S)-cis-1,2-dihydro-1,2-dihydroxynaphthalene. Hence, mutagenesis of wild-type DDO through active-site engineering generated variants with relatively high rates toward a previously uncharacterized class of substituted phenols for the nitroarene dioxygenases; seven previously uncharacterized substrates were evaluated for wild-type DDO, and four novel monooxygenase-like products were found for the DDO variants V350F and V350M (methoxyhydroquinone, methylhydroquinone, 2-hydroxybenzyl alcohol, and 3-nitrocatechol). PMID:15184115
Differential Protein Expressions in Virus-Infected and Uninfected Trichomonas vaginalis.

PubMed

He, Ding; Pengtao, Gong; Ju, Yang; Jianhua, Li; He, Li; Guocai, Zhang; Xichen, Zhang

2017-04-01

Protozoan viruses may influence the function and pathogenicity of the protozoa. Trichomonas vaginalis is a parasitic protozoan that could contain a double stranded RNA (dsRNA) virus, T. vaginalis virus (TVV). However, there are few reports on the properties of the virus. To further determine variations in protein expression of T. vaginalis, we detected 2 strains of T. vaginalis, the virus-infected (V+) and uninfected (V-) isolates, to examine differentially expressed proteins upon TVV infection. Using a stable isotope N-terminal labeling strategy (iTRAQ) on soluble fractions to analyze proteomes, we identified 293 proteins, of which 50 were altered in V+ compared with V- isolates. The results showed that the expression of 29 proteins was increased, and that of 21 proteins was decreased, in V+ isolates. These differentially expressed proteins can be classified into 4 categories: ribosomal proteins, metabolic enzymes, heat shock proteins, and putative uncharacterized proteins. Quantitative PCR was used to detect 4 metabolic process proteins: glycogen phosphorylase, malate dehydrogenase, triosephosphate isomerase, and glucose-6-phosphate isomerase, which were differentially expressed in V+ and V- isolates. Our findings suggest that mRNA levels of these genes were consistent with protein expression levels. This study is the first to analyze protein expression variations upon TVV infection. These observations will provide a basis for future studies concerning the possible roles of these proteins in host-parasite interactions.
Coralloluteibacterium stylophorae gen. nov., sp. nov., a new member of the family Lysobacteraceae isolated from the reef-building coral Stylophora sp.

PubMed

Chen, Wen-Ming; Xie, Pei-Bei; Tang, Sen-Lin; Sheu, Shih-Yi

2018-04-01

A bacterial strain, designated Sty a-1T, was isolated from a reef-building coral Stylophora sp., collected off the coast of southern Taiwan and characterized using the polyphasic taxonomy approach. Cells of strain Sty a-1T were Gram-staining-negative, aerobic, poly-β-hydroxybutyrate accumulating, motile by means of flagella, non-spore forming, straight rod-shaped and colonies were yellow and circular. Growth occurred at 15-40 °C (optimum, 30-35 °C), at pH 6-10 (optimum, pH 6.5-8) and with 0-7% NaCl (optimum, 2-3%). The predominant fatty acids were iso-C15:0, iso-C17:1 ω9c, summed feature 3 (comprising C16:1 ω7c and/or C16:1 ω6c) and iso-C17:0. The major isoprenoid quinone was Q-8 and the DNA G+C content was 68.5 mol%. The polar lipid profile consisted of a mixture of phosphatidylethanolamine, phosphatidylglycerol, phosphatidylcholine, diphosphatidylglycerol, an uncharacterized aminophospholipid and three uncharacterized lipids. The major polyamines were spermidine, putrescine and homospermidine. Phylogenetic analyses based on 16S rRNA and four housekeeping gene sequences (recA, atpD, rpoA and rpoB) showed that strain Sty a-1T forms a distinct lineage with respect to closely related genera in the family Lysobacteraceae, most closely related to Lysobacter, Silanimonas, Arenimonas and Luteimonas, and the levels of 16S rRNA gene sequence similarity with respect to the type species of related genera are less than 95%. On the basis of the genotypic and phenotypic data, strain Sty a-1T represents a novel genus and species of the family Lysobacteraceae, for which the name Coralloluteibacterium stylophorae gen. nov., sp. nov. is proposed. The type strain is Sty a-1T (= BCRC 80968T = LMG 29479T = KCTC 52167T).
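The record above reports two sequence-derived quantities: DNA G+C content (in mol%) and percent 16S rRNA gene sequence similarity. As a minimal sketch of how such values are commonly computed, the Python below works on toy sequences. The function names and example sequences are hypothetical and are not taken from the study, which does not state the software used; real analyses rely on genome-scale data and curated alignments.

```python
def gc_content(seq: str) -> float:
    """Return G+C content in mol% for a DNA sequence.

    Only unambiguous bases (A, C, G, T) are counted; anything else
    (N, gaps, whitespace) is ignored.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity between two pre-aligned sequences.

    Assumption: both strings come from the same pairwise alignment and
    have equal length. Columns with a gap ('-') in either sequence are
    skipped, which is only one of several conventions in use.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and y != "-"
    ]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    # Toy inputs for illustration only (hypothetical, not real 16S data).
    print(gc_content("ATGC"))
    print(percent_identity("ACGT-A", "ACGA-A"))
```

Under this gap-skipping convention, a similarity below a chosen cutoff (the abstract cites values under 95% against type species of related genera) would be read off the second function's result.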
C1orf163/RESA1 is a novel mitochondrial intermembrane space protein connected to respiratory chain assembly.

PubMed

Kozjak-Pavlovic, Vera; Prell, Florian; Thiede, Bernd; Götz, Monika; Wosiek, Dominik; Ott, Christine; Rudel, Thomas

2014-02-20

Oxidative phosphorylation (OXPHOS) in mitochondria takes place at the inner membrane, which folds into numerous cristae. The stability of cristae depends, among other things, on the mitochondrial intermembrane space bridging complex. Its components include inner mitochondrial membrane protein mitofilin and outer membrane protein Sam50. We identified a conserved, uncharacterized protein, C1orf163 [SEL1 repeat containing 1 protein (SELRC1)], as one of the proteins significantly reduced after the knockdown of Sam50 and mitofilin. We show that C1orf163 is a mitochondrial soluble intermembrane space protein. Sam50 depletion moderately affects the import and assembly of C1orf163 into two protein complexes of approximately 60 kDa and 150 kDa. We observe that the knockdown of C1orf163 leads to a reduction in the levels of proteins belonging to the OXPHOS complexes. The activity of complexes I and IV is reduced in C1orf163-depleted cells, and we observe the strongest defects in the assembly of complex IV. Therefore, we propose C1orf163 to be a novel factor important for the assembly of respiratory chain complexes in human mitochondria and suggest naming it RESA1 (for RESpiratory chain Assembly 1). Copyright © 2013 Elsevier Ltd. All rights reserved.

The cell envelope proteome of Aggregatibacter actinomycetemcomitans

PubMed Central

Smith, Kenneth P.; Fields, Julia G.; Voogt, Richard D.; Deng, Bin; Lam, Ying-Wai; Mintz, Keith P.

2014-01-01

Summary The cell envelope of Gram-negative bacteria serves a critical role in maintenance of cellular homeostasis, resistance to external stress, and host-pathogen interactions. Envelope protein composition is influenced by the physiological and environmental demands placed on the bacterium. In this study, we report a comprehensive compilation of cell envelope proteins from the periodontal and systemic pathogen Aggregatibacter actinomycetemcomitans VT1169, an afimbriated serotype b strain. The urea-extracted membrane proteins were identified by mass spectrometry-based shotgun proteomics. The membrane proteome, isolated from actively growing bacteria under normal laboratory conditions, included 648 proteins representing 28% of the predicted ORFs in the genome. Bioinformatic analyses were used to annotate and predict the cellular location and function of the proteins. Surface adhesins, porins, lipoproteins, numerous influx and efflux pumps, multiple sugar, amino acid and iron transporters, and components of the type I, II and V secretion systems were identified. Periplasmic space and cytoplasmic proteins with chaperone function were also identified. 107 proteins with unknown function were associated with the cell envelope. Orthologs of a subset of these uncharacterized proteins are present in other bacterial genomes, while others are found exclusively in A. actinomycetemcomitans. This knowledge will contribute to elucidating the role of cell envelope proteins in bacterial growth and survival in the oral cavity. PMID:25055881
Genomic analysis of coxsackieviruses A1, A19, A22, enteroviruses 113 and 104: viruses representing two clades with distinct tropism within enterovirus C

PubMed Central

Haq, Saddef; Sameroff, Stephen; Howie, Stephen R. C.; Lipkin, W. Ian

2013-01-01

Coxsackieviruses (CV) A1, CV-A19 and CV-A22 have historically comprised a distinct phylogenetic clade within Enterovirus (EV) C. Several novel serotypes that are genetically similar to these three viruses have been recently discovered and characterized. Here, we report the coding sequence analysis of two genotypes of a previously uncharacterized serotype EV-C113 from Bangladesh and demonstrate that it is most similar to CV-A22 and EV-C116 within the capsid region. We sequenced novel genotypes of CV-A1, CV-A19 and CV-A22 from Bangladesh and observed a high rate of recombination within this group. We also report genomic analysis of the rarely reported EV-C104 circulating in the Gambia in 2009. All available EV-C104 sequences displayed a high degree of similarity within the structural genes but formed two clusters within the non-structural genes. One cluster included the recently reported EV-C117, suggesting an ancestral recombination between these two serotypes. Phylogenetic analysis of all available complete genome sequences indicated the existence of two subgroups within this distinct Enterovirus C clade: one has been exclusively recovered from gastrointestinal samples, while the other cluster has been implicated in respiratory disease. PMID:23761409
Broad Surveys of DNA Viral Diversity Obtained through Viral Metagenomics of Mosquitoes

PubMed Central

Ng, Terry Fei Fan; Willner, Dana L.; Lim, Yan Wei; Schmieder, Robert; Chau, Betty; Nilsson, Christina; Anthony, Simon; Ruan, Yijun; Rohwer, Forest; Breitbart, Mya

2011-01-01

Viruses are the most abundant and diverse genetic entities on Earth; however, broad surveys of viral diversity are hindered by the lack of a universal assay for viruses and the inability to sample a sufficient number of individual hosts. This study utilized vector-enabled metagenomics (VEM) to provide a snapshot of the diversity of DNA viruses present in three mosquito samples from San Diego, California. The majority of the sequences were novel, suggesting that the viral community in mosquitoes, as well as the animal and plant hosts they feed on, is highly diverse and largely uncharacterized. Each mosquito sample contained a distinct viral community. The mosquito viromes contained sequences related to a broad range of animal, plant, insect and bacterial viruses. Animal viruses identified included anelloviruses, circoviruses, herpesviruses, poxviruses, and papillomaviruses, which mosquitoes may have obtained from vertebrate hosts during blood feeding. Notably, sequences related to human papillomaviruses were identified in one of the mosquito samples. Sequences similar to plant viruses were identified in all mosquito viromes, which were potentially acquired through feeding on plant nectar. Numerous bacteriophages and insect viruses were also detected, including a novel densovirus likely infecting Culex erythrothorax. Through sampling insect vectors, VEM enables broad survey of viral diversity and has significantly increased our knowledge of the DNA viruses present in mosquitoes. PMID:21674005
Whole-genome sequencing in patients with ciliopathies uncovers a novel recurrent tandem duplication in IFT140.

PubMed

Geoffroy, Véronique; Stoetzel, Corinne; Scheidecker, Sophie; Schaefer, Elise; Perrault, Isabelle; Bär, Séverine; Kröll, Ariane; Delbarre, Marion; Antin, Manuela; Leuvrey, Anne-Sophie; Henry, Charline; Blanché, Hélène; Decker, Eva; Kloth, Katja; Klaus, Günter; Mache, Christoph; Martin-Coignard, Dominique; McGinn, Steven; Boland, Anne; Deleuze, Jean-François; Friant, Sylvie; Saunier, Sophie; Rozet, Jean-Michel; Bergmann, Carsten; Dollfus, Hélène; Muller, Jean

2018-04-24

Ciliopathies represent a wide spectrum of rare diseases with overlapping phenotypes and a high genetic heterogeneity. Among those, IFT140 is implicated in a variety of phenotypes ranging from isolated retinitis pigmentosa to more syndromic cases. Using whole-genome sequencing in patients with uncharacterized ciliopathies, we identified a novel recurrent tandem duplication of exons 27-30 (6.7 kb) in IFT140, c.3454-488_4182+2588dup p.(Tyr1152_Thr1394dup), missed by whole-exome sequencing. Pathogenicity of the mutation was assessed on the patients' skin fibroblasts. Several hundred patients with a ciliopathy phenotype were screened and biallelic mutations were identified in 11 families representing 12 pathogenic variants, of which seven are novel. Among those unrelated families, especially those with a Mainzer-Saldino syndrome, eight carried the same tandem duplication (two in the homozygous state and six in the heterozygous state). In conclusion, we demonstrated the implication of structural variations in IFT140-related diseases, expanding its mutation spectrum. We also provide evidence for a unique genomic event mediated by an Alu-Alu recombination occurring on a shared haplotype. We confirm that whole-genome sequencing can be instrumental in the ability to detect structural variants for genomic disorders. © 2018 Wiley Periodicals, Inc.
Dissection of Influenza Infection In Vivo by Single-Cell RNA Sequencing.

PubMed

Steuerman, Yael; Cohen, Merav; Peshes-Yaloz, Naama; Valadarsky, Liran; Cohn, Ofir; David, Eyal; Frishberg, Amit; Mayo, Lior; Bacharach, Eran; Amit, Ido; Gat-Viks, Irit

2018-06-01

The influenza virus is a major cause of morbidity and mortality worldwide. Yet, both the impact of intracellular viral replication and the variation in host response across different cell types remain uncharacterized. Here we used single-cell RNA sequencing to investigate the heterogeneity in the response of lung tissue cells to in vivo influenza infection. Analysis of viral and host transcriptomes in the same single cell enabled us to resolve the cellular heterogeneity of bystander (exposed but uninfected) as compared with infected cells. We reveal that all major immune and non-immune cell types manifest substantial fractions of infected cells, albeit at low viral transcriptome loads relative to epithelial cells. We show that all cell types respond primarily with a robust generic transcriptional response, and we demonstrate novel markers specific for influenza-infected as opposed to bystander cells. These findings open new avenues for targeted therapy aimed exclusively at infected cells. Copyright © 2018 Elsevier Inc. All rights reserved.
Basic Helix-Loop-Helix Transcription Factor Gene Family Phylogenetics and Nomenclature

PubMed Central

Skinner, Michael K.; Rawls, Alan; Wilson-Rawls, Jeanne; Roalson, Eric H.

2010-01-01

A phylogenetic analysis of the basic helix-loop-helix (bHLH) gene superfamily was performed using seven different species (human, mouse, rat, worm, fly, yeast, and plant Arabidopsis) and involving over 600 bHLH genes [1]. All bHLH genes were identified in the genomes of the various species, including expressed sequence tags, and the entire coding sequence was used in the analysis. Nearly 15% of the gene family has been updated or added since the original publication. A super-tree involving six clades and all structural relationships was established and is now presented for four of the species. The wealth of functional data available for members of the bHLH gene superfamily provides us with the opportunity to use this exhaustive phylogenetic tree to predict potential functions of uncharacterized members of the family. This phylogenetic and genomic analysis of the bHLH gene family has revealed unique elements of the evolution and functional relationships of the different genes in the bHLH gene family. PMID:20219281
Reverse transcriptase genes are highly abundant and transcriptionally active in marine plankton assemblages

PubMed Central

Lescot, Magali; Hingamp, Pascal; Kojima, Kenji K; Villar, Emilie; Romac, Sarah; Veluchamy, Alaguraj; Boccara, Martine; Jaillon, Olivier; Iudicone, Daniele; Bowler, Chris; Wincker, Patrick; Claverie, Jean-Michel; Ogata, Hiroyuki

2016-01-01

Genes encoding reverse transcriptases (RTs) are found in most eukaryotes, often as a component of retrotransposons, as well as in retroviruses and in prokaryotic retroelements. We investigated the abundance, classification and transcriptional status of RTs based on Tara Oceans marine metagenomes and metatranscriptomes encompassing a wide organism size range. Our analyses revealed that RTs predominate large-size fraction metagenomes (>5 μm), where they reached a maximum of 13.5% of the total gene abundance. Metagenomic RTs were widely distributed across the phylogeny of known RTs, but many belonged to previously uncharacterized clades. Metatranscriptomic RTs showed distinct abundance patterns across samples compared with metagenomic RTs. The relative abundances of viral and bacterial RTs among identified RT sequences were higher in metatranscriptomes than in metagenomes and these sequences were detected in all metatranscriptome size fractions. Overall, these observations suggest an active proliferation of various RT-assisted elements, which could be involved in genome evolution or adaptive processes of plankton assemblage. PMID:26613339
DOE Office of Scientific and Technical Information (OSTI.GOV)

Rosatelli, M.C.; Faa, V.; Sardu, R.

This study reports the molecular characterization of β-thalassemia in the Sardinian population. Three thousand β-thalassemia chromosomes from prospective parents presenting at the genetic service were initially analyzed by dot blot analysis with oligonucleotide probes complementary to the most common β-thalassemia mutations in the Mediterranean at-risk populations. The mutations that remained uncharacterized by this approach were defined by denaturing gradient gel electrophoresis (DGGE) followed by direct sequence analysis on amplified DNA. The authors reconfirmed that the predominant mutation in the Sardinian population is the codon 39 nonsense mutation, which accounts for 95.7% of the β-thalassemia chromosomes. The other two relatively common mutations are frameshifts at codon 6 (2.1%) and at codon 76 (0.7%), relatively uncommon in other Mediterranean-origin populations. In this study they have detected a novel β-thalassemia mutation, i.e., a frameshift at codon 1, in three β-thalassemia chromosomes. The DGGE procedure followed by direct sequencing on amplified DNA is a powerful approach for the characterization of unknown mutations in this genetic system.
Novel adenoviruses detected in British mustelids, including a unique Aviadenovirus in the tissues of pine martens (Martes martes)

PubMed Central

Gregory, William F.; Turnbull, Dylan; Rocchi, Mara; Meredith, Anna L.; Philbey, Adrian W.; Sharp, Colin P.

2017-01-01

Several adenoviruses are known to cause severe disease in veterinary species. Recent evidence suggests that canine adenovirus type 1 (CAV-1) persists in the tissues of healthy red foxes (Vulpes vulpes), which may be a source of infection for susceptible species. It was hypothesized that mustelids native to the UK, including pine martens (Martes martes) and Eurasian otters (Lutra lutra), may also be persistently infected with adenoviruses. Based on high-throughput sequencing and additional Sanger sequencing, a novel Aviadenovirus, tentatively named marten adenovirus type 1 (MAdV-1), was detected in pine marten tissues. The detection of an Aviadenovirus in mammalian tissue has not been reported previously. Two mastadenoviruses, tentatively designated marten adenovirus type 2 (MAdV-2) and lutrine adenovirus type 1 (LAdV-1), were also detected in tissues of pine martens and Eurasian otters, respectively. Apparently healthy free-ranging animals may be infected with uncharacterized adenoviruses with possible implications for translocation of wildlife. PMID:28749327
A Novel Selenite- and Tellurite-Inducible Gene in Escherichia coli

PubMed Central

Guzzo, Julie; Dubow, Michael S.

2000-01-01

Selenium is both an essential and a toxic trace element, and the range of concentrations between the two is extremely narrow. Although tellurium is not essential and is only rarely found in the environment, it is considered to be extremely toxic. Several hypotheses have been proposed to account for the toxic effects of selenite and tellurite. However, these potential mechanisms have yet to be fully substantiated. Through screening of an Escherichia coli luxAB transcriptional gene fusion library, we identified a clone whose luminescence increased in the presence of increasing concentrations of sodium selenite or sodium tellurite. Cloning and sequencing of the luxAB junction revealed that the fusion had occurred in a previously uncharacterized open reading frame, termed o393 or yhfC, which we have now designated gutS, for gene up-regulated by tellurite and selenite. Transcription from gutS in the presence of selenite or tellurite was confirmed by RNA dot blot analysis. In vivo expression of the GutS polypeptide, using the pET expression system, revealed a polypeptide of approximately 43 kDa, in good agreement with its predicted molecular mass. Although the function of GutS remains to be elucidated, homology searches as well as protein motif and secondary-structure analyses have provided clues which may implicate GutS in transport in response to selenite and tellurite. PMID:11055951
Prediction of Binding Energy of Keap1 Interaction Motifs in the Nrf2 Antioxidant Pathway and Design of Potential High-Affinity Peptides.

PubMed

Karttunen, Mikko; Choy, Wing-Yiu; Cino, Elio A

2018-06-07

Nuclear factor erythroid 2-related factor 2 (Nrf2) is a transcription factor and principal regulator of the antioxidant pathway. The Kelch domain of Kelch-like ECH-associated protein 1 (Keap1) binds to motifs in the N-terminal region of Nrf2, promoting its degradation. There is interest in developing ligands that can compete with Nrf2 for binding to Kelch, thereby activating its transcriptional activities and increasing antioxidant levels. Using experimental ΔGbind values of Kelch-binding motifs determined previously, a revised hydrophobicity-based model was developed for estimating ΔGbind from amino acid sequence and applied to rank potential uncharacterized Kelch-binding motifs identified from interaction databases and BLAST searches. Model predictions and molecular dynamics (MD) simulations suggested that full-length MAD2A binds Kelch more favorably than a high-affinity 20-mer Nrf2 E78P peptide, but that the motif in isolation is not a particularly strong binder. Endeavoring to develop shorter peptides for activating Nrf2, new designs were created based on the E78P peptide, some of which showed considerable propensity to form binding-competent structures in MD, and were predicted to interact with Kelch more favorably than the E78P peptide. The peptides could be promising new ligands for enhancing the oxidative stress response.
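The abstract above describes ranking candidate motifs with a hydrophobicity-based score computed from amino acid sequence, but it does not give the model's form or parameters. The Python sketch below illustrates only the general idea of sequence-based hydrophobicity scoring, using the standard Kyte-Doolittle hydropathy scale as a stand-in. It is not the authors' revised model and yields no binding free energy; the function names and example peptides are hypothetical.

```python
# Kyte-Doolittle hydropathy index (higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide: str) -> float:
    """Average Kyte-Doolittle hydropathy over all residues of a peptide."""
    residues = peptide.upper()
    if not residues:
        raise ValueError("empty peptide")
    unknown = set(residues) - KYTE_DOOLITTLE.keys()
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[r] for r in residues) / len(residues)


def rank_by_hydropathy(peptides: list[str]) -> list[str]:
    """Return peptides sorted from most to least hydrophobic.

    Assumption for illustration: ranking by mean hydropathy alone.
    A fitted model would instead map sequence features to measured
    binding free energies.
    """
    return sorted(peptides, key=mean_hydropathy, reverse=True)


if __name__ == "__main__":
    # Hypothetical toy peptides, not motifs from the study.
    for pep in rank_by_hydropathy(["DDEE", "LLVV", "GSTA"]):
        print(pep, round(mean_hydropathy(pep), 2))
```

A calibrated model of the kind the abstract mentions would need experimental binding data to fit; this sketch only orders sequences by a generic physicochemical score.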
Whole-genome relationships among Francisella bacteria of diverse origins define new species and provide specific regions for detection

DOE PAGES

Challacombe, Jean Faust; Petersen, Jeannine M.; Gallegos-Graves, La Verne A.; ...

2016-11-23

Francisella tularensis is a highly virulent zoonotic pathogen that causes tularemia and, because of weaponization efforts in past world wars, is considered a tier 1 biothreat agent. Detection and surveillance of F. tularensis may be confounded by the presence of uncharacterized, closely related organisms. Through DNA-based diagnostics and environmental surveys, novel clinical and environmental Francisella isolates have been obtained in recent years. Here we present 7 new Francisella genomes and a comparison of their characteristics to each other and to 24 publicly available genomes as well as a comparative analysis of 16S rRNA and sdhA genes from over 90 Francisella strains. Delineation of new species in bacteria is challenging, especially when isolates having very close genomic characteristics exhibit different physiological features, for example, when some are virulent pathogens in humans and animals while others are nonpathogenic or are opportunistic pathogens. Species resolution within Francisella varies with analyses of single genes, multiple gene or protein sets, or whole-genome comparisons of nucleic acid and amino acid sequences. Analyses focusing on single genes (16S rRNA, sdhA), multiple gene sets (virulence genes, lipopolysaccharide [LPS] biosynthesis genes, pathogenicity island), and whole-genome comparisons (nucleotide and protein) gave congruent results, but with different levels of discrimination confidence. We designate four new species within the genus: Francisella opportunistica sp. nov. (MA06-7296), Francisella salina sp. nov. (TX07-7308), Francisella uliginis sp. nov. (TX07-7310), and Francisella frigiditurris sp. nov. (CA97-1460). Lastly, this study provides a robust comparative framework to discern species and virulence features of newly detected Francisella bacteria.
Molecular characterization of a Penicillium chrysogenum exo-rhamnogalacturonan lyase that is structurally distinct from other polysaccharide lyase family proteins.

PubMed

Iwai, Marin; Kawakami, Takuya; Ikemoto, Takeshi; Fujiwara, Daisuke; Takenaka, Shigeo; Nakazawa, Masami; Ueda, Mitsuhiro; Sakamoto, Tatsuji

2015-10-01

We previously described an endo-acting rhamnogalacturonan (RG) lyase, termed PcRGL4A, of Penicillium chrysogenum 31B. Here, we describe a second RG lyase, called PcRGLX. We determined the cDNA sequence of the Pcrglx gene, which encodes PcRGLX. Based on analyses using a BLAST search and a conserved domain search, PcRGLX was found to be structurally distinct from known RG lyases and might belong to a new polysaccharide lyase family together with uncharacterized fungal proteins of Nectria haematococca, Aspergillus oryzae, and Fusarium oxysporum. The Pcrglx cDNA gene product (rPcRGLX) expressed in Escherichia coli demonstrated specific activity against RG but not against homogalacturonan. Divalent cations were not essential for the enzymatic activity of rPcRGLX. rPcRGLX mainly released unsaturated galacturonosyl rhamnose (ΔGR) from RG backbones used as the substrate from the initial stage of the reaction, indicating that the enzyme can be classified as an exo-acting RG lyase (EC 4.2.2.24). This is the first report of an RG lyase with this mode of action in Eukaryota. rPcRGLX acted synergistically with PcRGL4A to degrade soybean RG and released ΔGR. This ΔGR was partially decorated with galactose (Gal) residues, indicating that rPcRGLX preferred oligomeric RGs to polymeric RGs, that the enzyme did not require Gal decoration of RG backbones for degradation, and that the enzyme bypassed the Gal side chains of RG backbones. These characteristics of rPcRGLX might be useful in the determination of complex structures of pectins.
Pathogen-induced ERF68 regulates hypersensitive cell death in tomato.

PubMed

Liu, An-Chi; Cheng, Chiu-Ping

2017-10-01

Ethylene response factors (ERFs) are a large plant-specific transcription factor family and play diverse important roles in various plant functions. However, most tomato ERFs have not been characterized. In this study, we showed that the expression of an uncharacterized member of the tomato ERF-IX subgroup, ERF68, was significantly induced by treatments with different bacterial pathogens, ethylene (ET) and salicylic acid (SA), but only slightly induced by bacterial mutants defective in the type III secretion system (T3SS) or non-host pathogens. The ERF68-green fluorescent protein (ERF68-GFP) fusion protein was localized in the nucleus. Transactivation and electrophoretic mobility shift assays (EMSAs) further showed that ERF68 was a functional transcriptional activator and was bound to the GCC-box. Moreover, transient overexpression of ERF68 led to spontaneous lesions in tomato and tobacco leaves and enhanced the expression of genes involved in ET, SA, jasmonic acid (JA) and hypersensitive response (HR) pathways, whereas silencing of ERF68 increased tomato susceptibility to two incompatible Xanthomonas spp. These results reveal the involvement of ERF68 in the effector-triggered immunity (ETI) pathway. To identify ERF68 target genes, chromatin immunoprecipitation combined with high-throughput sequencing (ChIP-seq) was performed. Amongst the confirmed target genes, a few genes involved in cell death or disease defence were differentially regulated by ERF68. Our study demonstrates the function of ERF68 in the positive regulation of hypersensitive cell death and disease defence by modulation of multiple signalling pathways, and provides important new information on the complex regulatory function of ERFs. © 2016 BSPP AND JOHN WILEY & SONS LTD.
Cloning, functional characterization and genomic organization of 1,8-cineole synthases from Lavandula.

PubMed

Demissie, Zerihun A; Cella, Monica A; Sarker, Lukman S; Thompson, Travis J; Rheault, Mark R; Mahmoud, Soheil S

2012-07-01

Several members of the genus Lavandula produce valuable essential oils (EOs) that are primarily constituted of low molecular weight isoprenoids, particularly monoterpenes. We isolated over 8,000 ESTs from the glandular trichomes of L. x intermedia flowers (where the bulk of the EO is synthesized) to facilitate the discovery of genes that control the biosynthesis of EO constituents. The expression profile of these ESTs in L. x intermedia and its parents L. angustifolia and L. latifolia was established using microarrays. The resulting data highlighted a differentially expressed, previously uncharacterized cDNA with strong homology to known 1,8-cineole synthase (CINS) genes. The ORF, excluding the transit peptide, of this cDNA was expressed in E. coli, purified by Ni-NTA agarose affinity chromatography and functionally characterized in vitro. The ca. 63 kDa bacterially produced recombinant protein, designated L. x intermedia CINS (LiCINS), converted geranyl diphosphate (the linear monoterpene precursor) primarily to 1,8-cineole with K_m and k_cat values of 5.75 μM and 8.8 × 10^-3 s^-1, respectively. The genomic DNA of CINS in the studied Lavandula species had identical exon-intron architecture and coding sequences, except for a single polymorphic nucleotide in the L. angustifolia ortholog which did not alter protein function. Additional nucleotide variations restricted to L. angustifolia introns were also observed, suggesting that LiCINS was most likely inherited from L. latifolia. The LiCINS mRNA levels paralleled the 1,8-cineole content in mature flowers of the three lavender species, and in developmental stages of L. x intermedia inflorescence, indicating that the production of 1,8-cineole in Lavandula is most likely controlled through transcriptional regulation of LiCINS.
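The kinetic constants reported above can be put into context with a short calculation. The sketch below assumes simple Michaelis-Menten kinetics, which the abstract does not state explicitly; the function names are illustrative and not taken from the paper.

```python
# Minimal sketch, assuming simple Michaelis-Menten kinetics for LiCINS.
# The two constants come from the abstract; everything else is illustrative.

KM_UM = 5.75         # reported K_m, in micromolar
KCAT_PER_S = 8.8e-3  # reported k_cat, in 1/s


def turnover_rate(substrate_um, km_um=KM_UM, kcat=KCAT_PER_S):
    """Per-enzyme turnover rate (1/s) at a given substrate concentration (uM)."""
    return kcat * substrate_um / (km_um + substrate_um)


def catalytic_efficiency(km_um=KM_UM, kcat=KCAT_PER_S):
    """k_cat / K_m in 1/(M*s), converting K_m from micromolar to molar."""
    return kcat / (km_um * 1e-6)


if __name__ == "__main__":
    # At [S] = K_m the rate is half of k_cat by definition.
    print(turnover_rate(KM_UM))
    print(catalytic_efficiency())
```

Under these assumptions the catalytic efficiency comes out at roughly 1.5 × 10^3 M^-1 s^-1.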
Computational and Experimental Analysis of the Secretome of Methylococcus capsulatus (Bath)

PubMed Central

Indrelid, Stine; Mathiesen, Geir; Jacobsen, Morten; Lea, Tor; Kleiveland, Charlotte R.

2014-01-01

The Gram-negative methanotroph Methylococcus capsulatus (Bath) was recently demonstrated to abrogate inflammation in a murine model of inflammatory bowel disease, suggesting interactions with cells involved in maintaining mucosal homeostasis and emphasizing the importance of understanding the many properties of M. capsulatus. Secreted proteins determine how bacteria may interact with their environment, and a comprehensive knowledge of such proteins is therefore vital to understand bacterial physiology and behavior. The aim of this study was to systematically analyze protein secretion in M. capsulatus (Bath) by identifying the secretion systems present and the respective secreted substrates. Computational analysis revealed that in addition to previously recognized type II secretion systems and a type VII secretion system, a type Vb (two-partner) secretion system and putative type I secretion systems are present in M. capsulatus (Bath). In silico analysis suggests that the diverse secretion systems in M. capsulatus transport proteins likely to be involved in adhesion, colonization, nutrient acquisition and homeostasis maintenance. Results of the computational analysis were verified and extended by an experimental approach showing that, in addition, an uncharacterized protein and putative moonlighting proteins are released to the medium during exponential growth of M. capsulatus (Bath). PMID:25479164
The biosynthetic capacities of the plastids and integration between cytoplasmic and chloroplast processes.

PubMed

Rolland, Norbert; Curien, Gilles; Finazzi, Giovanni; Kuntz, Marcel; Maréchal, Eric; Matringe, Michel; Ravanel, Stéphane; Seigneurin-Berny, Daphné

2012-01-01

Plastids are semiautonomous organelles derived from cyanobacterial ancestors. Following endosymbiosis, plastids have evolved to optimize their functions, thereby limiting metabolic redundancy with other cell compartments. Contemporary plastids have also recruited proteins produced by the nuclear genome of the host cell. In addition, many genes acquired from the cyanobacterial ancestor evolved to code for proteins that are targeted to cell compartments other than the plastid. Consequently, metabolic pathways are now a patchwork of enzymes of diverse origins, located in various cell compartments. Because of this, a wide range of metabolites and ions traffic between the plastids and other cell compartments. In this review, we provide a comprehensive analysis of the well-known, and of the as yet uncharacterized, chloroplast/cytosol exchange processes, which can be deduced from what is currently known about compartmentation of plant-cell metabolism.
First insight into the faecal microbiota of the high Arctic muskoxen (Ovibos moschatus)

PubMed Central

Bockwoldt, Mathias; Hagen, Live H.; Pope, Phillip B.; Sundset, Monica A.

2016-01-01

The faecal microbiota of muskoxen (n=3) pasturing on Ryøya (69° 33′ N 18° 43′ E), Norway, in late September was characterized using high-throughput sequencing of partial 16S rRNA gene regions. A total of 16 209 high-quality sequence reads from bacterial domains and 19 462 from archaea were generated. Preliminary taxonomic classifications of 806 bacterial operational taxonomic units (OTUs) resulted in 53.7–59.3 % of the total sequences being without designations beyond the family level. Firmicutes (70.7–81.1 % of the total sequences) and Bacteroidetes (16.8–25.3 %) constituted the two major bacterial phyla, with uncharacterized members within the family Ruminococcaceae (28.9–40.9 %) as the major phylotype. Multiple-library comparisons between muskoxen and other ruminants indicated a higher similarity for muskoxen faeces and reindeer caecum (P>0.05) and some samples from cattle faeces. The archaeal sequences clustered into 37 OTUs, with dominating phylotypes affiliated to the methane-producing genus Methanobrevibacter (80–92 % of the total sequences). UniFrac analysis demonstrated heterogeneity between muskoxen archaeal libraries and those from reindeer and roe deer (P=1.0e-02, Bonferroni corrected), but not with foregut fermenters. The high proportion of cellulose-degrading Ruminococcus-affiliated bacteria agrees with the ingestion of a highly fibrous diet. Further experiments are required to elucidate the role played by these novel bacteria in the digestion of this fibrous Arctic diet eaten by muskoxen. PMID:28348861
Proteome-wide covalent ligand discovery in native biological systems

PubMed Central

Backus, Keriann M.; Correia, Bruno E.; Lum, Kenneth M.; Forli, Stefano; Horning, Benjamin D.; González-Páez, Gonzalo E.; Chatterjee, Sandip; Lanning, Bryan R.; Teijaro, John R.; Olson, Arthur J.; Wolan, Dennis W.; Cravatt, Benjamin F.

2016-01-01

Small molecules are powerful tools for investigating protein function and can serve as leads for new therapeutics. Most human proteins, however, lack small-molecule ligands, and entire protein classes are considered “undruggable” 1,2. Fragment-based ligand discovery (FBLD) can identify small-molecule probes for proteins that have proven difficult to target using high-throughput screening of complex compound libraries 1,3. Although reversibly binding ligands are commonly pursued, covalent fragments provide an alternative route to small-molecule probes 4–10, including those that can access regions of proteins that are difficult to access through binding affinity alone 5,10,11. In this manuscript, we report a quantitative analysis of cysteine-reactive small-molecule fragments screened against thousands of proteins. Covalent ligands were identified for >700 cysteines found in both druggable proteins and proteins deficient in chemical probes, including transcription factors, adaptor/scaffolding proteins, and uncharacterized proteins. Among the atypical ligand-protein interactions discovered were compounds that react preferentially with pro- (inactive) caspases. We used these ligands to distinguish extrinsic apoptosis pathways in human cell lines versus primary human T-cells, showing that the former is largely mediated by caspase-8 while the latter depends on both caspase-8 and −10. Fragment-based covalent ligand discovery provides a greatly expanded portrait of the ligandable proteome and furnishes compounds that can illuminate protein functions in native biological systems. PMID:27309814

Proteomic analysis of protein-protein interactions within the Cysteine Sulfinate Desulfinase Fe-S cluster biogenesis system.

PubMed

Bolstad, Heather M; Botelho, Danielle J; Wood, Matthew J

2010-10-01

Fe-S cluster biogenesis is of interest to many fields, including bioenergetics and gene regulation. The CSD system is one of three Fe-S cluster biogenesis systems in E. coli and is comprised of the cysteine desulfurase CsdA, the sulfur acceptor protein CsdE, and the E1-like protein CsdL. The biological role, biochemical mechanism, and protein targets of the system remain uncharacterized. Here we show that the active site CsdE C61 has a lowered pK_a value of 6.5, which is nearly identical to that of C51 in the homologous SufE protein and which is likely critical for its function. We observed that CsdE forms disulfide bonds with multiple proteins and identified the proteins that copurify with CsdE. The identification of Fe-S proteins and both putative and established Fe-S cluster assembly (ErpA, glutaredoxin-3, glutaredoxin-4) and sulfur trafficking (CsdL, YchN) proteins supports the two-pathway model, in which the CSD system is hypothesized to synthesize both Fe-S clusters and other sulfur-containing cofactors. We suggest that the identified Fe-S cluster assembly proteins may be the scaffold and/or shuttle proteins for the CSD system. By comparison with previous analysis of SufE, we demonstrate that there is some overlap in the CsdE and SufE interactomes.
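To illustrate why a lowered cysteine pK_a matters, the Henderson-Hasselbalch relation gives the fraction of the thiol present as the reactive thiolate at a given pH. The sketch below uses the reported value of 6.5. The comparison value of 8.5 for an unperturbed cysteine and the pH of 7.5 are assumptions chosen for illustration, not figures from the abstract.

```python
# Minimal sketch: fraction of a cysteine thiol deprotonated (thiolate) at a
# given pH, from the Henderson-Hasselbalch relation.

CSDE_C61_PKA = 6.5       # reported in the abstract
TYPICAL_CYS_PKA = 8.5    # assumed reference value, for comparison only


def thiolate_fraction(ph, pka):
    """Fraction of the side chain in the deprotonated form at the given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    ph = 7.5  # assumed near-neutral pH
    print(thiolate_fraction(ph, CSDE_C61_PKA))
    print(thiolate_fraction(ph, TYPICAL_CYS_PKA))
```

At the assumed pH of 7.5, a pK_a of 6.5 leaves about 91% of the residue as thiolate, against about 9% for a pK_a of 8.5.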
Comparative Genomics of Four Isosphaeraceae Planctomycetes: A Common Pool of Plasmids and Glycoside Hydrolase Genes Shared by Paludisphaera borealis PX4T, Isosphaera pallida IS1BT, Singulisphaera acidiphila DSM 18658T, and Strain SH-PL62

PubMed Central

Ivanova, Anastasia A.; Naumoff, Daniil G.; Miroshnikov, Kirill K.; Liesack, Werner; Dedysh, Svetlana N.

2017-01-01

The family Isosphaeraceae accommodates stalk-free planctomycetes with spherical cells, which can be assembled in short chains, long filaments, or aggregates. These bacteria inhabit a wide variety of terrestrial environments, among those the recently described Paludisphaera borealis PX4T that was isolated from acidic boreal wetlands. Here, we analyzed its finished genome in comparison to those of three other members of the Isosphaeraceae: Isosphaera pallida IS1BT, Singulisphaera acidiphila DSM 18658T, and the uncharacterized planctomycete strain SH-PL62. The complete genome of P. borealis PX4T consists of a 7.5 Mb chromosome and two plasmids, 112 and 43 kb in size. Annotation of the genome sequence revealed 5802 potential protein-coding genes of which 2775 could be functionally assigned. The genes encoding metabolic pathways common for chemo-organotrophic bacteria, such as glycolysis, citrate cycle, pentose-phosphate pathway, and oxidative phosphorylation were identified. Several genes involved in the synthesis of peptidoglycan as well as N-methylated ornithine lipids were present in the genome of P. borealis PX4T. A total of 26 giant genes with a size >5 kb were detected. The genome encodes a wide repertoire of carbohydrate-active enzymes (CAZymes) including 44 glycoside hydrolases (GH) and 83 glycosyltransferases (GT) affiliated with 21 and 13 CAZy families, respectively. The most-represented families are GH5, GH13, GH57, GT2, GT4, and GT83. The experimentally determined carbohydrate utilization pattern agrees well with the genome-predicted capabilities. The CAZyme repertoire in P. borealis PX4T is highly similar to that in the uncharacterized planctomycete SH-PL62 and S. acidiphila DSM 18658T, but different to that in the thermophile I. pallida IS1BT. The latter strain has a strongly reduced CAZyme content. In P. borealis PX4T, many of its CAZyme genes are organized in clusters. 
Contrary to most other members of the order Planctomycetales, all four analyzed Isosphaeraceae planctomycetes have plasmids in numbers varying from one to four. The plasmids from P. borealis PX4T display synteny to plasmids from other family members, providing evidence for their common evolutionary origin. PMID:28360896
SAP97-mediated ADAM10 trafficking from Golgi outposts depends on PKC phosphorylation

PubMed Central

Saraceno, C; Marcello, E; Di Marino, D; Borroni, B; Claeysen, S; Perroy, J; Padovani, A; Tramontano, A; Gardoni, F; Di Luca, M

2014-01-01

A disintegrin and metalloproteinase 10 (ADAM10) is the major α-secretase that catalyzes the amyloid precursor protein (APP) ectodomain shedding in the brain and prevents amyloid formation. Its activity depends on correct intracellular trafficking and on synaptic membrane insertion. Here, we describe that in hippocampal neurons the synapse-associated protein-97 (SAP97), an excitatory synapse scaffolding element, governs ADAM10 trafficking from dendritic Golgi outposts to synaptic membranes. This process is mediated by a previously uncharacterized protein kinase C phosphosite in the SAP97 SRC homology 3 domain that modulates SAP97 association with ADAM10. This mechanism is essential for ADAM10 trafficking from the Golgi outposts to the synapse, but does not affect ADAM10 transport from the endoplasmic reticulum. Notably, this process is altered in Alzheimer's disease brains. These results help in understanding the mechanism responsible for the modulation of the ADAM10 intracellular path, and can constitute an innovative therapeutic strategy to finely tune ADAM10 shedding activity towards APP. PMID:25429624
EARP, a multisubunit tethering complex involved in endocytic recycling

PubMed Central

Schindler, Christina; Chen, Yu; Pu, Jing; Guo, Xiaoli; Bonifacino, Juan S.

2015-01-01

Recycling of endocytic receptors to the cell surface involves passage through a series of membrane-bound compartments by mechanisms that are poorly understood. In particular, it is unknown if endocytic recycling requires the function of multisubunit tethering complexes, as is the case for other intracellular trafficking pathways. Herein we describe a tethering complex named Endosome-Associated Recycling Protein (EARP) that is structurally related to the previously described Golgi-Associated Retrograde Protein (GARP) complex. Both complexes share the Ang2, Vps52 and Vps53 subunits, but EARP comprises an uncharacterized protein, Syndetin, in place of the Vps54 subunit of GARP. This change determines differential localization of EARP to recycling endosomes and GARP to the Golgi complex. EARP interacts with the target-SNARE Syntaxin 6 and various cognate SNAREs. Depletion of Syndetin or Syntaxin 6 delays recycling of internalized transferrin to the cell surface. These findings implicate EARP in canonical membrane-fusion events in the process of endocytic recycling. PMID:25799061
The vomeronasal organ mediates interspecies defensive behaviors through detection of protein pheromone homologs.

PubMed

Papes, Fabio; Logan, Darren W; Stowers, Lisa

2010-05-14

Potential predators emit uncharacterized chemosignals that warn receiving species of danger. Neurons that sense these stimuli remain unknown. Here we show that detection and processing of fear-evoking odors emitted from cat, rat, and snake require the function of sensory neurons in the vomeronasal organ. To investigate the molecular nature of the sensory cues emitted by predators, we isolated the salient ligands from two species using a combination of innate behavioral assays in naive receiving animals, calcium imaging, and c-Fos induction. Surprisingly, the defensive behavior-promoting activity released by other animals is encoded by species-specific ligands belonging to the major urinary protein (Mup) family, homologs of aggression-promoting mouse pheromones. We show that recombinant Mup proteins are sufficient to activate sensory neurons and initiate defensive behavior similarly to native odors. This co-option of existing sensory mechanisms provides a molecular solution to the difficult problem of evolving a variety of species-specific molecular detectors. Copyright (c) 2010 Elsevier Inc. All rights reserved.
The vomeronasal organ mediates interspecies defensive behaviors through detection of protein pheromone homologs

PubMed Central

Papes, Fabio; Logan, Darren W.; Stowers, Lisa

2010-01-01

Summary Potential predators emit uncharacterized chemosignals that warn receiving species of danger. Neurons that sense these stimuli remain unknown. Here we show that detection and processing of fear-evoking odors emitted from cat, rat, and snake require the function of sensory neurons in the vomeronasal organ. To investigate the molecular nature of the sensory cues emitted by predators, we isolated the salient ligands from two species using a combination of innate behavioral assays in naïve receiving animals, calcium imaging, and cFos induction. Surprisingly, the defensive behavior-promoting activity released by other animals is encoded by species-specific ligands belonging to the major urinary protein (Mup) family, homologs of aggression-promoting mouse pheromones. We show that recombinant Mup proteins are sufficient to activate sensory neurons and initiate defensive behavior similar to native odors. This co-option of existing sensory mechanisms provides a molecular solution to the difficult problem of evolving a variety of species-specific molecular detectors. PMID:20478258
Mutations in C4orf26, Encoding a Peptide with In Vitro Hydroxyapatite Crystal Nucleation and Growth Activity, Cause Amelogenesis Imperfecta

PubMed Central

Parry, David A.; Brookes, Steven J.; Logan, Clare V.; Poulter, James A.; El-Sayed, Walid; Al-Bahlani, Suhaila; Al Harasi, Sharifa; Sayed, Jihad; Raïf, El Mostafa; Shore, Roger C.; Dashash, Mayssoon; Barron, Martin; Morgan, Joanne E.; Carr, Ian M.; Taylor, Graham R.; Johnson, Colin A.; Aldred, Michael J.; Dixon, Michael J.; Wright, J. Tim; Kirkham, Jennifer; Inglehearn, Chris F.; Mighell, Alan J.

2012-01-01

Autozygosity mapping and clonal sequencing of an Omani family identified mutations in the uncharacterized gene, C4orf26, as a cause of recessive hypomineralized amelogenesis imperfecta (AI), a disease in which the formation of tooth enamel fails. Screening of a panel of 57 autosomal-recessive AI-affected families identified eight further families with loss-of-function mutations in C4orf26. C4orf26 encodes a putative extracellular matrix acidic phosphoprotein expressed in the enamel organ. A mineral nucleation assay showed that the protein’s phosphorylated C terminus has the capacity to promote nucleation of hydroxyapatite, suggesting a possible function in enamel mineralization during amelogenesis. PMID:22901946
Early Neolithic genomes from the eastern Fertile Crescent

PubMed Central

Broushaki, Farnaz; Thomas, Mark G; Link, Vivian; López, Saioa; van Dorp, Lucy; Kirsanow, Karola; Hofmanová, Zuzana; Diekmann, Yoan; Cassidy, Lara M.; Díez-del-Molino, David; Kousathanas, Athanasios; Sell, Christian; Robson, Harry K.; Martiniano, Rui; Blöcher, Jens; Scheu, Amelie; Kreutzer, Susanne; Bollongino, Ruth; Bobo, Dean; Davudi, Hossein; Munoz, Olivia; Currat, Mathias; Abdi, Kamyar; Biglari, Fereidoun; Craig, Oliver E.; Bradley, Daniel G; Shennan, Stephen; Veeramah, Krishna; Mashkour, Marjan

2016-01-01

We sequenced Early Neolithic genomes from the Zagros region of Iran (eastern Fertile Crescent), where some of the earliest evidence for farming is found, and identify a previously uncharacterized population that is neither ancestral to the first European farmers nor has contributed significantly to the ancestry of modern Europeans. These people are estimated to have separated from Early Neolithic farmers in Anatolia some 46-77,000 years ago and show affinities to modern day Pakistani and Afghan populations, but particularly to Iranian Zoroastrians. We conclude that multiple, genetically differentiated hunter-gatherer populations adopted farming in SW-Asia, that components of pre-Neolithic population structure were preserved as farming spread into neighboring regions, and that the Zagros region was the cradle of eastward expansion. PMID:27417496
MALDI Top-Down sequencing: calling N- and C-terminal protein sequences with high confidence and speed.

PubMed

Suckau, Detlev; Resemann, Anja

2009-12-01

The ability to match Top-Down protein sequencing (TDS) results by MALDI-TOF to protein sequences by classical protein database searching was evaluated in this work. These analyses yielded the protein identity, the simultaneous assignment of the N- and C-termini, and protein sequences of up to 70 residues from either terminus. In combination with de novo sequencing using the MALDI-TDS data, even fusion proteins were assigned and the detailed sequence around the fusion site was elucidated. MALDI-TDS made it possible to match protein sequences quickly and to validate recombinant protein structures, in particular protein termini, at the level of undigested proteins.
Differential regulation of oligodendrocyte markers by glucocorticoids: Post-transcriptional regulation of both proteolipid protein and myelin basic protein and transcriptional regulation of glycerol phosphate dehydrogenase

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kumar, S.; Cole, R.; Chiappelli, F.

During neonatal development glucocorticoids potentiate oligodendrocyte differentiation and myelinogenesis by regulating the expression of myelin basic protein, proteolipid protein, and glycerol phosphate dehydrogenase. The actual locus at which hydrocortisone exerts its developmental influence on glial physiology is, however, not well understood. Glycerol phosphate dehydrogenase is glucocorticoid-inducible in oligodendrocytes at all stages of development both in vivo and in vitro. In newborn rat cerebral cultures, between 9 and 15 days in vitro, a 2- to 3-fold increase in myelin basic protein and proteolipid protein mRNA levels occurs in oligodendrocytes within 12 hr of hydrocortisone treatment. Immunostaining demonstrates that this increase in mRNAs is followed by a 2- to 3-fold increase in the protein levels within 24 hr. In vitro transcription assays performed with oligodendrocyte nuclei show an 11-fold increase in the transcriptional activity of glycerol phosphate dehydrogenase in response to hydrocortisone but no increase in transcription of myelin basic protein or proteolipid protein. These results indicate that during early myelinogenesis, glucocorticoids influence the expression of key oligodendroglial markers by different processes: the expression of glycerol phosphate dehydrogenase is regulated at the transcriptional level, whereas the expression of myelin basic protein and proteolipid protein is modulated via a different, yet uncharacterized, mechanism involving post-transcriptional regulation.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Kim, Inae; Hur, Jung; Jeong, Sunjoo, E-mail: sjsj@dankook.ac.kr

Highlights: • Wnt signaling as well as β-catenin overexpression enhance HuR cytoplasmic export. • HuR overexpression promotes cytoplasmic localization of β-catenin from the perinuclear fraction. • Wnt/β-catenin-mediated transcriptional activity is repressed by HuR. Abstract: β-Catenin is the key transcriptional activator of canonical Wnt signaling in the nucleus; thus, nuclear accumulation of β-catenin is a critical step for expressing target genes. β-Catenin accumulates in the nucleus of cancer cells where it activates oncogenic target genes. Hu antigen R (HuR) is an RNA-binding protein that regulates multiple post-transcriptional processes including RNA stability. Thus, cytoplasmic HuR protein may be involved in tumorigenesis by stabilizing oncogenic transcripts, but the molecular mechanism remains unclear. Here, we observed that Wnt/β-catenin signaling induced export of the HuR protein, whereas HuR overexpression promoted accumulation of the β-catenin protein in the cytoplasm. Thus, Wnt/β-catenin-mediated transcriptional activity in the nucleus was reduced by overexpressing HuR. These results suggest novel and uncharacterized cytoplasmic β-catenin functions related to HuR-mediated RNA metabolism in cancer cells.
Roles of AGCVIII Kinases in the Hypocotyl Phototropism of Arabidopsis Seedlings.

PubMed

Haga, Ken; Frank, Lena; Kimura, Taro; Schwechheimer, Claus; Sakai, Tatsuya

2018-05-01

Regulation of protein function by phosphorylation and dephosphorylation is an important mechanism in many cellular events. The phototropin blue-light photoreceptors, plant-specific AGCVIII kinases, are essential for phototropic responses. Members of the D6 PROTEIN KINASE (D6PK) family, representing a subfamily of the AGCVIII kinases, also contribute to phototropic responses, suggesting that further AGCVIII kinases may control phototropism. The present study investigates the functional roles of Arabidopsis (Arabidopsis thaliana) AGCVIII kinases in hypocotyl phototropism. We demonstrate that D6PK family kinases are required not only for the second but also for the first positive phototropism. In addition, we find that a previously uncharacterized AGCVIII protein, AGC1-12, is involved in the first positive phototropism and gravitropism. AGC1-12 phosphorylates serine residues in the cytoplasmic loop of PIN-FORMED 1 (PIN1) and shares phosphosite preferences with D6PK. Our work strongly suggests that the D6PK family and AGC1-12 are critical components for both hypocotyl phototropism and gravitropism, and that these kinases control tropic responses mainly through regulation of PIN-mediated auxin transport by protein phosphorylation.
The CASTOR proteins are arginine sensors for the mTORC1 pathway

PubMed Central

Chantranupong, Lynne; Scaria, Sonia M.; Saxton, Robert A.; Gygi, Melanie P.; Shen, Kuang; Wyant, Gregory A.; Wang, Tim; Harper, J. Wade; Gygi, Steven P.; Sabatini, David M.

2016-01-01

Amino acids signal to the mTOR complex I (mTORC1) growth pathway through the Rag GTPases. Multiple distinct complexes regulate the Rags, including GATOR1, a GTPase activating protein (GAP), and GATOR2, a positive regulator of unknown molecular function. Arginine stimulation of cells activates mTORC1, but how it is sensed is not well understood. Recently, SLC38A9 was identified as a putative lysosomal arginine sensor required for arginine to activate mTORC1 but how arginine deprivation represses mTORC1 is unknown. Here, we show that CASTOR1, a previously uncharacterized protein, interacts with GATOR2 and is required for arginine deprivation to inhibit mTORC1. CASTOR1 homodimerizes and can also heterodimerize with the related protein, CASTOR2. Arginine disrupts the CASTOR1-GATOR2 complex by binding to CASTOR1 with a dissociation constant of ~30 μM, and its arginine-binding capacity is required for arginine to activate mTORC1 in cells. Collectively, these results establish CASTOR1 as an arginine sensor for the mTORC1 pathway. PMID:26972053
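The reported dissociation constant can be translated into an approximate occupancy of CASTOR1 at a given arginine concentration. The sketch below assumes a single independent binding site and a free arginine concentration roughly equal to the total; neither assumption is stated in the abstract, and the function name is illustrative.

```python
# Minimal sketch, assuming one-site equilibrium binding of arginine to CASTOR1
# with free ligand in large excess over protein.

KD_UM = 30.0  # approximate dissociation constant from the abstract, in uM


def fraction_bound(ligand_um, kd_um=KD_UM):
    """Equilibrium fraction of protein with ligand bound at the given ligand concentration (uM)."""
    return ligand_um / (kd_um + ligand_um)


if __name__ == "__main__":
    for arginine_um in (3.0, 30.0, 270.0):
        print(arginine_um, fraction_bound(arginine_um))
```

Under this model, half of CASTOR1 is occupied at 30 μM arginine, and a ninefold higher concentration is needed to reach 90% occupancy.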
Proteomic analysis of laser capture microscopy purified myotendinous junction regions from muscle sections

PubMed Central

2014-01-01

The myotendinous junction is a specialized structure of the muscle fibre enriched in mechanosensing complexes, including costameric proteins and core elements of the z-disc. Here, laser capture microdissection was applied to purify membrane regions from the myotendinous junctions of mouse skeletal muscles, which were then processed for proteomic analysis. Sarcolemma sections from the longitudinal axis of the muscle fibre were used as control for the specificity of the junctional preparation. Gene ontology term analysis of the combined lists indicated a statistically significant enrichment in membrane-associated proteins. The myotendinous junction preparation contained previously uncharacterized proteins, a number of z-disc costameric ligands (e.g., actinins, capZ, αB-crystallin, filamin C, cypher, calsarcin, desmin, FHL1, telethonin, nebulin, titin and an enigma-like protein) and other proposed players of sarcomeric stretch sensing and signalling, such as myotilin and the three myomesin homologs. A subset were confirmed by immunofluorescence analysis as enriched at the myotendinous junction, suggesting that laser capture microdissection from muscle sections is a valid approach to identify novel myotendinous junction players potentially involved in mechanotransduction pathways. PMID:25071420
GRAF1 forms a complex with MICAL-L1 and EHD1 to cooperate in tubular recycling endosome vesiculation

PubMed Central

Cai, Bishuang; Xie, Shuwei; Caplan, Steve; Naslavsky, Naava

2014-01-01

The biogenesis of tubular recycling endosomes (TREs) and their subsequent vesiculation after cargo-sorting has occurred, is essential for receptor and lipid recycling to the plasma membrane. Although recent studies have implicated the C-terminal Eps15 Homology Domain (EHD) protein, EHD1, as a key regulator of TRE vesiculation, additional proteins involved in this process have been largely uncharacterized. In the present study, we identify the GTPase Regulator Associated with Focal adhesion kinase-1 (GRAF1) protein in a complex with EHD1 and the TRE hub protein, Molecules Interacting with CasL-Like1 (MICAL-L1). Over-expression of GRAF1 caused vesiculation of MICAL-L1-containing TRE, whereas GRAF1-depletion led to impaired TRE vesiculation and delayed receptor recycling. Moreover, co-addition of purified EHD1 and GRAF1 in a semi-permeabilized cell vesiculation assay produced synergistic TRE vesiculation. Overall, based on our data, we suggest that in addition to its roles in clathrin-independent endocytosis, GRAF1 synergizes with EHD1 to support TRE vesiculation. PMID:25364729
NHS-Esters As Versatile Reactivity-Based Probes for Mapping Proteome-Wide Ligandable Hotspots.

PubMed

Ward, Carl C; Kleinman, Jordan I; Nomura, Daniel K

2017-06-16

Most of the proteome is considered undruggable, oftentimes hindering translational efforts for drug discovery. Identifying previously unknown druggable hotspots in proteins would enable strategies for pharmacologically interrogating these sites with small molecules. Activity-based protein profiling (ABPP) has arisen as a powerful chemoproteomic strategy that uses reactivity-based chemical probes to map reactive, functional, and ligandable hotspots in complex proteomes, which has enabled inhibitor discovery against various therapeutic protein targets. Here, we report an alkyne-functionalized N-hydroxysuccinimide-ester (NHS-ester) as a versatile reactivity-based probe for mapping the reactivity of a wide range of nucleophilic ligandable hotspots, including lysines, serines, threonines, and tyrosines, encompassing active sites, allosteric sites, post-translational modification sites, protein interaction sites, and previously uncharacterized potential binding sites. Surprisingly, we also show that fragment-based NHS-ester ligands can be made to confer selectivity for specific lysine hotspots on specific targets including Dpyd, Aldh2, and Gstt1. We thus put forth NHS-esters as promising reactivity-based probes and chemical scaffolds for covalent ligand discovery.
The Linker Histone GH1-HMGA1 Is Involved in Telomere Stability and DNA Damage Repair

PubMed Central

Charbonnel, Cyril; Benyahya, Fatiha; Butter, Falk

2018-01-01

Despite intensive searches, few proteins involved in telomere homeostasis have been identified in plants. Here, we used pull-down assays to identify potential telomeric interactors in the model plant species Arabidopsis (Arabidopsis thaliana). We identified the candidate protein GH1-HMGA1 (also known as HON4), an uncharacterized linker histone protein of the High Mobility Group Protein A (HMGA) family in plants. HMGAs are architectural transcription factors and have been suggested to function in DNA damage repair, but their precise biological roles remain unclear. Here, we show that GH1-HMGA1 is required for efficient DNA damage repair and telomere integrity in Arabidopsis. GH1-HMGA1 mutants exhibit developmental and growth defects, accompanied by ploidy defects, increased telomere dysfunction-induced foci, mitotic anaphase bridges, and degraded telomeres. Furthermore, mutants have a higher sensitivity to genotoxic agents such as mitomycin C and γ-irradiation. Our work also suggests that GH1-HMGA1 is involved directly in the repair process by allowing the completion of homologous recombination. PMID:29622687
Evolution of strigolactone receptors by gradual neo-functionalization of KAI2 paralogues.

PubMed

Bythell-Douglas, Rohan; Rothfels, Carl J; Stevenson, Dennis W D; Graham, Sean W; Wong, Gane Ka-Shu; Nelson, David C; Bennett, Tom

2017-06-29

Strigolactones (SLs) are a class of plant hormones that control many aspects of plant growth. The SL signalling mechanism is homologous to that of karrikins (KARs), smoke-derived compounds that stimulate seed germination. In angiosperms, the SL receptor is an α/β-hydrolase known as DWARF14 (D14); its close homologue, KARRIKIN INSENSITIVE2 (KAI2), functions as a KAR receptor and likely recognizes an uncharacterized, endogenous signal ('KL'). Previous phylogenetic analyses have suggested that the KAI2 lineage is ancestral in land plants, and that canonical D14-type SL receptors only arose in seed plants; this is paradoxical, however, as non-vascular plants synthesize and respond to SLs. We have used a combination of phylogenetic and structural approaches to re-assess the evolution of the D14/KAI2 family in land plants. We analysed 339 members of the D14/KAI2 family from land plants and charophyte algae. Our phylogenetic analyses show that the divergence between the eu-KAI2 lineage and the DDK (D14/DLK2/KAI2) lineage that includes D14 occurred very early in land plant evolution. We show that eu-KAI2 proteins are highly conserved, and have unique features not found in DDK proteins. Conversely, we show that DDK proteins show considerable sequence and structural variation to each other, and lack clearly definable characteristics. We use homology modelling to show that the earliest members of the DDK lineage structurally resemble KAI2 and that SL receptors in non-seed plants likely do not have D14-like structure. We also show that certain groups of DDK proteins lack the otherwise conserved MORE AXILLARY GROWTH2 (MAX2) interface, and may thus function independently of MAX2, which we show is highly conserved throughout land plant evolution. Our results suggest that D14-like structure is not required for SL perception, and that SL perception has relatively relaxed structural requirements compared to KAI2-mediated signalling. 
We suggest that SL perception gradually evolved by neo-functionalization within the DDK lineage, and that the transition from KAI2-like to D14-like protein may have been driven by interactions with protein partners, rather than being required for SL perception per se.
A critical assessment of Mus musculus gene function prediction using integrated genomic evidence

PubMed Central

Peña-Castillo, Lourdes; Tasan, Murat; Myers, Chad L; Lee, Hyunju; Joshi, Trupti; Zhang, Chao; Guan, Yuanfang; Leone, Michele; Pagnani, Andrea; Kim, Wan Kyu; Krumpelman, Chase; Tian, Weidong; Obozinski, Guillaume; Qi, Yanjun; Mostafavi, Sara; Lin, Guan Ning; Berriz, Gabriel F; Gibbons, Francis D; Lanckriet, Gert; Qiu, Jian; Grant, Charles; Barutcuoglu, Zafer; Hill, David P; Warde-Farley, David; Grouios, Chris; Ray, Debajyoti; Blake, Judith A; Deng, Minghua; Jordan, Michael I; Noble, William S; Morris, Quaid; Klein-Seetharaman, Judith; Bar-Joseph, Ziv; Chen, Ting; Sun, Fengzhu; Troyanskaya, Olga G; Marcotte, Edward M; Xu, Dong; Hughes, Timothy R; Roth, Frederick P

2008-01-01

Background: Several years after sequencing the human genome and the mouse genome, much remains to be discovered about the functions of most human and mouse genes. Computational prediction of gene function promises to help focus limited experimental resources on the most likely hypotheses. Several algorithms using diverse genomic data have been applied to this task in model organisms; however, the performance of such approaches in mammals has not yet been evaluated. Results: In this study, a standardized collection of mouse functional genomic data was assembled; nine bioinformatics teams used this data set to independently train classifiers and generate predictions of function, as defined by Gene Ontology (GO) terms, for 21,603 mouse genes; and the best performing submissions were combined in a single set of predictions. We identified strengths and weaknesses of current functional genomic data sets and compared the performance of function prediction algorithms. This analysis inferred functions for 76% of mouse genes, including 5,000 currently uncharacterized genes. At a recall rate of 20%, a unified set of predictions averaged 41% precision, with 26% of GO terms achieving a precision better than 90%. Conclusion: We performed a systematic evaluation of diverse, independently developed computational approaches for predicting gene function from heterogeneous data sources in mammals. The results show that currently available data for mammals allows predictions with both breadth and accuracy. Importantly, many highly novel predictions emerge for the 38% of mouse genes that remain uncharacterized. PMID:18613946
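The precision and recall figures quoted above follow the standard definitions, which can be sketched as below. The gene identifiers and set sizes are hypothetical and chosen only to make the arithmetic visible; the function name `precision_recall` is not from the study.

```python
def precision_recall(predicted: set, actual: set) -> tuple:
    """Return (precision, recall) for one GO term.

    predicted: genes a classifier assigns to the term.
    actual: genes truly annotated with the term.
    """
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical example: 4 predictions, 10 true annotations, 2 overlap.
    predicted = {"g1", "g2", "g3", "g4"}
    actual = {"g1", "g2"} | {f"g{i}" for i in range(5, 13)}
    p, r = precision_recall(predicted, actual)
    print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case the classifier recovers 20% of the true annotations (recall) and half of its calls are correct (precision).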
Proteome Analysis of Human Sebaceous Follicle Infundibula Extracted from Healthy and Acne-Affected Skin

PubMed Central

Bek-Thomsen, Malene; Lomholt, Hans B.; Scavenius, Carsten; Enghild, Jan J.; Brüggemann, Holger

2014-01-01

Acne vulgaris is a very common disease of the pilosebaceous unit of the human skin. The pathological processes of acne are not fully understood. To gain further insight, sebaceous follicular casts were extracted from 18 healthy and 20 acne-affected individuals by cyanoacrylate-gel biopsies and further processed for mass spectrometry analysis, aiming at a proteomic analysis of the sebaceous follicular casts. Human as well as bacterial proteins were identified. Human proteins enriched in acne samples and human proteins enriched in normal samples were both detected. Normal follicular casts are enriched in proteins such as prohibitins and peroxiredoxins which are involved in the protection from various stresses, including reactive oxygen species. By contrast, follicular casts extracted from acne-affected skin contained proteins involved in inflammation, wound healing and tissue remodeling. Among the most distinguishing proteins were myeloperoxidase, lactotransferrin, neutrophil elastase inhibitor and, surprisingly, vimentin. The most significant biological process among all acne-enriched proteins was ‘response to a bacterium’. Identified bacterial proteins were exclusively from Propionibacterium acnes. The most abundant P. acnes proteins were surface-exposed dermatan sulphate adhesins, CAMP factors, and a so far uncharacterized lipase in follicular casts extracted from normal as well as acne-affected skin. This is a first proteomic study that identified human proteins together with proteins of the skin microbiota in sebaceous follicular casts. PMID:25238151

Two-dimensional blue native/SDS-PAGE analysis of whole cell lysate protein complexes of rice in response to salt stress.

PubMed

Hashemi, Amenehsadat; Gharechahi, Javad; Nematzadeh, Ghorbanali; Shekari, Faezeh; Hosseini, Seyed Abdollah; Salekdeh, Ghasem Hosseini

2016-08-01

To understand the biology of a plant in response to stress, insight into protein-protein interactions, which almost define cell behavior, is thought to be crucial. Here, we provide a comparative complexomics analysis of leaf whole cell lysate of two rice genotypes with contrasting responses to salt using two-dimensional blue native/SDS-PAGE (2D-BN/SDS-PAGE). We aimed to identify changes in subunit composition and stoichiometry of protein complexes elicited by salt. Using mild detergent for protein complex solubilization, we were able to identify 9 protein assemblies as hetero-oligomeric and 30 as homo-oligomeric complexes. A total of 20 proteins were identified as monomers in the 2D-BN/SDS-PAGE gels. In addition to identifying known protein complexes that confirm the technical validity of our analysis, we were also able to discover novel protein-protein interactions. Interestingly, an interaction was detected for glycolytic enzymes enolase (ENO1) and triosephosphate isomerase (TPI) and also for a chlorophyll a-b binding protein and RuBisCo small subunit. To show changes in subunit composition and stoichiometry of protein assemblies during salt stress, the differential abundance of interacting proteins was compared between salt-treated and control plants. A detailed exploration of some of the protein complexes provided novel insight into the function, composition, stoichiometry and dynamics of known and previously uncharacterized protein complexes in response to salt stress. Copyright © 2016 Elsevier GmbH. All rights reserved.
Rpn (YhgA-Like) Proteins of Escherichia coli K-12 and Their Contribution to RecA-Independent Horizontal Transfer.

PubMed

Kingston, Anthony W; Ponkratz, Christine; Raleigh, Elisabeth A

2017-04-01

Bacteria use a variety of DNA-mobilizing enzymes to facilitate environmental niche adaptation via horizontal gene transfer. This has led to real-world problems, like the spread of antibiotic resistance, yet many mobilization proteins remain undefined. In the study described here, we investigated the uncharacterized family of YhgA-like transposase_31 (Pfam PF04754) proteins. Our primary focus was the genetic and biochemical properties of the five Escherichia coli K-12 members of this family, which we designate RpnA to RpnE, where Rpn represents recombination-promoting nuclease. We employed a conjugal system developed by our lab that demanded RecA-independent recombination following transfer of chromosomal DNA. Overexpression of RpnA (YhgA), RpnB (YfcI), RpnC (YadD), and RpnD (YjiP) increased RecA-independent recombination, reduced cell viability, and induced the expression of a reporter of DNA damage. For the exemplar of the family, RpnA, mutational changes in proposed catalytic residues reduced or abolished all three phenotypes in concert. In vitro, RpnA displayed magnesium-dependent, calcium-stimulated DNA endonuclease activity with little, if any, sequence specificity and a preference for double-strand cleavage. We propose that Rpn/YhgA-like family nucleases can participate in gene acquisition processes. IMPORTANCE Bacteria adapt to new environments by obtaining new genes from other bacteria. Here, we characterize a set of genes that can promote the acquisition process by a novel mechanism. Genome comparisons had suggested the horizontal spread of the genes for the YhgA-like family of proteins through bacteria. Although annotated as transposase_31, no member of the family has previously been characterized experimentally. We show that four Escherichia coli K-12 paralogs contribute to a novel RecA-independent recombination mechanism in vivo. For RpnA, we demonstrate in vitro action as a magnesium-dependent, calcium-stimulated nonspecific DNA endonuclease.
The cleavage products are capable of providing priming sites for DNA polymerase, which can enable DNA joining by primer-template switching. Copyright © 2017 Kingston et al.
Bacterial motility complexes require the actin-like protein, MreB and the Ras homologue, MglA.

PubMed

Mauriello, Emilia M F; Mouhamar, Fabrice; Nan, Beiyan; Ducret, Adrien; Dai, David; Zusman, David R; Mignot, Tâm

2010-01-20

Gliding motility in the bacterium Myxococcus xanthus uses two motility engines: S-motility powered by type-IV pili and A-motility powered by uncharacterized motor proteins and focal adhesion complexes. In this paper, we identified MreB, an actin-like protein, and MglA, a small GTPase of the Ras superfamily, as essential for both motility systems. A22, an inhibitor of MreB cytoskeleton assembly, reversibly inhibited S- and A-motility, causing rapid dispersal of S- and A-motility protein clusters, FrzS and AglZ. This suggests that the MreB cytoskeleton is involved in directing the positioning of these proteins. We also found that a ΔmglA motility mutant showed defective localization of AglZ and FrzS clusters. Interestingly, MglA-YFP localization mimicked both FrzS and AglZ patterns and was perturbed by A22 treatment, consistent with results indicating that both MglA and MreB bind to motility complexes. We propose that MglA and the MreB cytoskeleton act together in a pathway to localize motility proteins such as AglZ and FrzS to assemble the A-motility machineries. Interestingly, M. xanthus motility systems, like eukaryotic systems, use an actin-like protein and a small GTPase spatial regulator.
Bacterial motility complexes require the actin-like protein, MreB and the Ras homologue, MglA

PubMed Central

Mauriello, Emilia M F; Mouhamar, Fabrice; Nan, Beiyan; Ducret, Adrien; Dai, David; Zusman, David R; Mignot, Tâm

2010-01-01

Gliding motility in the bacterium Myxococcus xanthus uses two motility engines: S-motility powered by type-IV pili and A-motility powered by uncharacterized motor proteins and focal adhesion complexes. In this paper, we identified MreB, an actin-like protein, and MglA, a small GTPase of the Ras superfamily, as essential for both motility systems. A22, an inhibitor of MreB cytoskeleton assembly, reversibly inhibited S- and A-motility, causing rapid dispersal of S- and A-motility protein clusters, FrzS and AglZ. This suggests that the MreB cytoskeleton is involved in directing the positioning of these proteins. We also found that a ΔmglA motility mutant showed defective localization of AglZ and FrzS clusters. Interestingly, MglA–YFP localization mimicked both FrzS and AglZ patterns and was perturbed by A22 treatment, consistent with results indicating that both MglA and MreB bind to motility complexes. We propose that MglA and the MreB cytoskeleton act together in a pathway to localize motility proteins such as AglZ and FrzS to assemble the A-motility machineries. Interestingly, M. xanthus motility systems, like eukaryotic systems, use an actin-like protein and a small GTPase spatial regulator. PMID:19959988
Subinhibitory Concentrations of Antimicrobial Agents Reduce the Uptake of Legionella pneumophila into Acanthamoeba castellanii and U937 Cells by Altering the Expression of Virulence-Associated Antigens

PubMed Central

Lück, P. Christian; Schmitt, Jürgen W.; Hengerer, Arne; Helbig, Jürgen H.

1998-01-01

We determined the MICs of ampicillin, ciprofloxacin, erythromycin, imipenem, and rifampin for two clinical isolates of Legionella pneumophila serogroup 1 by 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) reduction assay and by quantitative culture. To test the influence of subinhibitory concentrations (sub-MICs) of antimicrobial agents on Legionella uptake into Acanthamoeba castellanii and U937 macrophage-like cells, both strains were pretreated with 0.25 MICs of the antibiotics for 24 h. In comparison to that for the untreated control, subinhibitory concentrations of antibiotics significantly reduced Legionella uptake into the host cells. Measurement of the binding of monoclonal antibodies against several Legionella antigens by enzyme-linked immunoassays indicated that sub-MIC antibiotic treatment reduced the expression of the macrophage infectivity potentiator protein (Mip), the Hsp 60 protein, the outer membrane protein (OmpM), an as-yet-uncharacterized protein of 55 kDa, and a few lipopolysaccharide (LPS) epitopes. In contrast, the expression of some LPS epitopes recognized by monoclonal antibodies 8/5 and 30/4 as well as a 45-kDa protein, a 58-kDa protein, and the major outer membrane protein (OmpS) remained unaffected. PMID:9797218
A conserved OmpA-like protein in Legionella pneumophila required for efficient intracellular replication.

PubMed

Goodwin, Ian P; Kumova, Ogan K; Ninio, Shira

2016-08-01

The OmpA-like protein domain has been associated with peptidoglycan-binding proteins, and is often found in virulence factors of bacterial pathogens. The intracellular pathogen Legionella pneumophila encodes six proteins that contain the OmpA-like domain, among them the highly conserved uncharacterized protein we named CmpA. Here we set out to characterize the CmpA protein and determine its contribution to intracellular survival of L. pneumophila. Secondary structure analysis suggests that CmpA is an inner membrane protein with a peptidoglycan-binding domain at the C-terminus. A cmpA mutant was able to replicate normally in broth, but failed to compete with an isogenic wild-type strain in an intracellular growth competition assay. The cmpA mutant also displayed significant intracellular growth defects in both the protozoan host Acanthamoeba castellanii and in primary bone marrow-derived macrophages, where uptake into the cells was also impaired. The cmpA phenotypes were completely restored upon expression of CmpA in trans. The data presented here establish CmpA as a novel virulence factor of L. pneumophila that is required for efficient intracellular replication in both mammalian and protozoan hosts. © FEMS 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
A Conserved Coatomer-related Complex Containing Sec13 and Seh1 Dynamically Associates With the Vacuole in Saccharomyces cerevisiae*

PubMed Central

Dokudovskaya, Svetlana; Waharte, Francois; Schlessinger, Avner; Pieper, Ursula; Devos, Damien P.; Cristea, Ileana M.; Williams, Rosemary; Salamero, Jean; Chait, Brian T.; Sali, Andrej; Field, Mark C.; Rout, Michael P.; Dargemont, Catherine

2011-01-01

The presence of multiple membrane-bound intracellular compartments is a major feature of eukaryotic cells. Many of the proteins required for formation and maintenance of these compartments share an evolutionary history. Here, we identify the SEA (Seh1-associated) protein complex in yeast that contains the nucleoporin Seh1 and Sec13, the latter a subunit of both the nuclear pore complex and the COPII coating complex. The SEA complex also contains Npr2 and Npr3 proteins (upstream regulators of TORC1 kinase) and four previously uncharacterized proteins (Sea1–Sea4). Combined computational and biochemical approaches indicate that the SEA complex proteins possess structural characteristics similar to the membrane coating complexes COPI, COPII, the nuclear pore complex, and, in particular, the related Vps class C vesicle tethering complexes HOPS and CORVET. The SEA complex dynamically associates with the vacuole in vivo. Genetic assays indicate a role for the SEA complex in intracellular trafficking, amino acid biogenesis, and response to nitrogen starvation. These data demonstrate that the SEA complex is an additional member of a family of membrane coating and vesicle tethering assemblies, extending the repertoire of protocoatomer-related complexes. PMID:21454883
CUP-1 Is a Novel Protein Involved in Dietary Cholesterol Uptake in Caenorhabditis elegans

PubMed Central

Valdes, Victor J.; Athie, Alejandro; Salinas, Laura S.; Navarro, Rosa E.; Vaca, Luis

2012-01-01

Sterol transport and distribution are essential processes in all multicellular organisms. Survival of the nematode Caenorhabditis elegans depends on dietary absorption of sterols present in the environment. However, the general mechanisms associated with sterol uptake in nematodes are poorly understood. In the present work we provide evidence showing that a previously uncharacterized transmembrane protein, designated Cholesterol Uptake Protein-1 (CUP-1), is involved in dietary cholesterol uptake in C. elegans. Animals lacking CUP-1 showed hypersensitivity to cholesterol limitation and were unable to take up cholesterol. A CUP-1-GFP fusion protein colocalized with cholesterol-rich vesicles, endosomes and lysosomes as well as the plasma membrane. Additionally, by FRET imaging, a direct interaction was found between the cholesterol analog DHE and the transmembrane “cholesterol recognition/interaction amino acid consensus” (CRAC) motif present in C. elegans CUP-1. In-silico analysis identified two mammalian homologues of CUP-1. Most interestingly, CRAC motifs are conserved in mammalian CUP-1 homologues. Our results suggest a role of CUP-1 in cholesterol uptake in C. elegans and open up the possibility for the existence of a new class of proteins involved in sterol absorption in mammals. PMID:22479487
The protein interaction map of bacteriophage lambda

PubMed Central

2011-01-01

Background Bacteriophage lambda is a model phage for most other dsDNA phages and has been studied for over 60 years. Although it is probably the best-characterized phage there are still about 20 poorly understood open reading frames in its 48-kb genome. For a complete understanding we need to know all interactions among its proteins. We have manually curated the lambda literature and compiled a total of 33 interactions that have been found among lambda proteins. We set out to find out how many protein-protein interactions remain to be found in this phage. Results In order to map lambda's interactions, we have cloned 68 out of 73 lambda open reading frames (the "ORFeome") into Gateway vectors and systematically tested all proteins for interactions using exhaustive array-based yeast two-hybrid screens. These screens identified 97 interactions. We found 16 out of 30 previously published interactions (53%). We have also found at least 18 new plausible interactions among functionally related proteins. All previously found and new interactions are combined into structural and network models of phage lambda. Conclusions Phage lambda serves as a benchmark for future studies of protein interactions among phage, viruses in general, or large protein assemblies. We conclude that we could not find all the known interactions because they require chaperones, post-translational modifications, or multiple proteins for their interactions. The lambda protein network connects 12 proteins of unknown function with well characterized proteins, which should shed light on the functional associations of these uncharacterized proteins. PMID:21943085
Development of aptamers against unpurified proteins.

PubMed

Goto, Shinichi; Tsukakoshi, Kaori; Ikebukuro, Kazunori

2017-12-01

SELEX (Systematic Evolution of Ligands by EXponential enrichment) has been widely used for the generation of aptamers against target proteins. However, its requirement for pure target proteins remains a major problem in aptamer selection, as procedures for protein purification from crude bio-samples are not only complicated but also time and labor consuming. This is because native proteins can be found in a large number of diverse forms because of posttranslational modifications and their complicated molecular conformations. Moreover, several proteins are difficult to purify owing to their chemical fragility and/or rarity in native samples. An alternative route is the use of recombinant proteins for aptamer selection, because they are homogenous and easily purified. However, aptamers generated against recombinant proteins produced in prokaryotic cells may not interact with the same proteins expressed in eukaryotic cells because of posttranslational modifications. Moreover, to date recombinant proteins have been constructed for only a fraction of proteins expressed in the human body. Therefore, the demand for advanced SELEX methods not relying on complicated purification processes from native samples or recombinant proteins is growing. This review article describes several such techniques that allow researchers to directly develop an aptamer from various unpurified samples, such as whole cells, tissues, serum, and cell lysates. The key advantages of advanced SELEX are that it does not require a purification process from a crude bio-sample, maintains the functional states of target proteins, and facilitates the development of aptamers against unidentified and uncharacterized proteins in unpurified biological samples. © 2017 Wiley Periodicals, Inc.
Genome sequence and analysis of a stress-tolerant, wild-derived strain of Saccharomyces cerevisiae used in biofuels research

DOE Office of Scientific and Technical Information (OSTI.GOV)

McIlwain, Sean J.; Peris, Davis; Sardi, Maria

The genome sequences of more than 100 strains of the yeast Saccharomyces cerevisiae have been published. Unfortunately, most of these genome assemblies contain dozens to hundreds of gaps at repetitive sequences, including transposable elements, tRNAs, and subtelomeric regions, which is where novel genes generally reside. Relatively few strains have been chosen for genome sequencing based on their biofuel production potential, leaving an additional knowledge gap. Here, we describe the nearly complete genome sequence of GLBRCY22-3 (Y22-3), a strain of S. cerevisiae derived from the stress-tolerant wild strain NRRL YB-210 and subsequently engineered for xylose metabolism. After benchmarking several genome assembly approaches, we developed a pipeline to integrate Pacific Biosciences (PacBio) and Illumina sequencing data and achieved one of the highest quality genome assemblies for any S. cerevisiae strain. Specifically, the contig N50 is 693 kbp, and the sequences of most chromosomes, the mitochondrial genome, and the 2-micron plasmid are complete. Our annotation predicts 92 genes that are not present in the reference genome of the laboratory strain S288c, over 70% of which were expressed. We predicted functions for 43 of these genes, 28 of which were previously uncharacterized and unnamed. Remarkably, many of these genes are predicted to be involved in stress tolerance and carbon metabolism and are shared with a Brazilian bioethanol production strain, even though the strains differ dramatically at most genetic loci. Lastly, the Y22-3 genome sequence provides an exceptionally high-quality resource for basic and applied research in bioenergy and genetics.
Genome sequence and analysis of a stress-tolerant, wild-derived strain of Saccharomyces cerevisiae used in biofuels research

DOE PAGES

McIlwain, Sean J.; Peris, Davis; Sardi, Maria; ...

2016-04-20

The genome sequences of more than 100 strains of the yeast Saccharomyces cerevisiae have been published. Unfortunately, most of these genome assemblies contain dozens to hundreds of gaps at repetitive sequences, including transposable elements, tRNAs, and subtelomeric regions, which is where novel genes generally reside. Relatively few strains have been chosen for genome sequencing based on their biofuel production potential, leaving an additional knowledge gap. Here, we describe the nearly complete genome sequence of GLBRCY22-3 (Y22-3), a strain of S. cerevisiae derived from the stress-tolerant wild strain NRRL YB-210 and subsequently engineered for xylose metabolism. After benchmarking several genome assembly approaches, we developed a pipeline to integrate Pacific Biosciences (PacBio) and Illumina sequencing data and achieved one of the highest quality genome assemblies for any S. cerevisiae strain. Specifically, the contig N50 is 693 kbp, and the sequences of most chromosomes, the mitochondrial genome, and the 2-micron plasmid are complete. Our annotation predicts 92 genes that are not present in the reference genome of the laboratory strain S288c, over 70% of which were expressed. We predicted functions for 43 of these genes, 28 of which were previously uncharacterized and unnamed. Remarkably, many of these genes are predicted to be involved in stress tolerance and carbon metabolism and are shared with a Brazilian bioethanol production strain, even though the strains differ dramatically at most genetic loci. Lastly, the Y22-3 genome sequence provides an exceptionally high-quality resource for basic and applied research in bioenergy and genetics.
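The contig N50 cited above is a standard assembly contiguity statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. A minimal sketch of the usual calculation follows; the contig lengths are made up for illustration and are not from the Y22-3 assembly, and the function name `n50` is hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches half of the
    summed length of all contigs.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # integer-safe test for "at least half"
            return length


if __name__ == "__main__":
    # Hypothetical contig lengths in kbp.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

A larger N50 indicates that more of the assembly sits in long contigs, which is why it is commonly reported alongside the number of gaps.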
Identification and Validation of Selected Universal Stress Protein Domain Containing Drought-Responsive Genes in Pigeonpea (Cajanus cajan L.)

PubMed Central

Sinha, Pallavi; Pazhamala, Lekha T.; Singh, Vikas K.; Saxena, Rachit K.; Krishnamurthy, L.; Azam, Sarwar; Khan, Aamir W.; Varshney, Rajeev K.

2016-01-01

Pigeonpea is a resilient crop, which is relatively more drought tolerant than many other legume crops. To understand the molecular mechanisms of this unique feature of pigeonpea, Hidden Markov Models (HMM) were used to select 51 genes that code for proteins closely similar to the universal stress protein domain. Validation of these genes was conducted on three pigeonpea genotypes (ICPL 151, ICPL 8755, and ICPL 227) having different levels of drought tolerance. Gene expression analysis using qRT-PCR revealed 6, 8, and 18 genes to be ≥2-fold differentially expressed in ICPL 151, ICPL 8755, and ICPL 227, respectively. A total of 10 differentially expressed genes showed ≥2-fold up-regulation in the more drought tolerant genotype, which encoded four different classes of proteins. These include plant U-box protein (four genes), universal stress protein A-like protein (four genes), cation/H(+) antiporter protein (one gene) and an uncharacterized protein (one gene). Genes C.cajan_29830 and C.cajan_33874, belonging to uspA, were found to be significantly expressed in all three genotypes with ≥2-fold expression variations. Expression profiling of these two genes in four other legume crops revealed their specific role in pigeonpea. Therefore, these genes seem to be promising candidates for conferring drought tolerance specifically to pigeonpea. PMID:26779199
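The abstract does not say how the qRT-PCR fold changes were computed. One widely used approach is the comparative Ct (2^-ΔΔCt) method, sketched here under the assumptions of near-100% amplification efficiency and a single stable reference gene. The Ct values and the function name `fold_change` are hypothetical and are not taken from the study.

```python
def fold_change(ct_target_treated: float, ct_ref_treated: float,
                ct_target_control: float, ct_ref_control: float) -> float:
    """Relative expression by the comparative Ct (2^-ddCt) method.

    Assumes each PCR cycle doubles the product, so a difference of one
    cycle corresponds to a 2-fold difference in starting template.
    """
    delta_treated = ct_target_treated - ct_ref_treated  # normalise to reference gene
    delta_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_treated - delta_control
    return 2.0 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values: the target crosses threshold 2 cycles earlier under stress.
    fc = fold_change(22.0, 18.0, 24.0, 18.0)
    print(f"fold change = {fc:.1f}; meets >=2-fold cutoff: {fc >= 2.0}")
```

With this convention, a ≥2-fold up-regulation threshold corresponds to a ΔΔCt of -1 or lower.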
A Forward Genetic Approach in Chlamydomonas reinhardtii as a Strategy for Exploring Starch Catabolism

PubMed Central

Duchêne, Thierry; Cogez, Virginie; Cousin, Charlotte; Peltier, Gilles; Ball, Steven G.; Dauvillée, David

2013-01-01

A screen was recently developed to study the mobilization of starch in the unicellular green alga Chlamydomonas reinhardtii. This screen relies on starch synthesis accumulation during nitrogen starvation followed by the supply of nitrogen and the switch to darkness. Hence multiple regulatory networks including those of nutrient starvation, cell cycle control and light to dark transitions are likely to impact the recovery of mutant candidates. In this paper we monitor the specificity of this mutant screen by characterizing the nature of the genes disrupted in the selected mutants. We show that one third of the mutants consisted of strains mutated in genes previously reported to be of paramount importance in starch catabolism such as those encoding β-amylases, the maltose export protein, and branching enzyme I. The other mutants were defective for previously uncharacterized functions some of which are likely to define novel proteins affecting starch mobilization in green algae. PMID:24019981
Alternative cytoskeletal landscapes: cytoskeletal novelty and evolution in basal excavate protists

PubMed Central

Dawson, Scott C.; Paredez, Alexander R.

2016-01-01

Microbial eukaryotes encompass the majority of eukaryotic evolutionary and cytoskeletal diversity. The cytoskeletal complexity observed in multicellular organisms appears to be an expansion of components present in genomes of diverse microbial eukaryotes such as the basal lineage of flagellates, the Excavata. Excavate protists have complex and diverse cytoskeletal architectures and life cycles – essentially alternative cytoskeletal “landscapes” – yet still possess conserved microtubule- and actin-associated proteins. Comparative genomic analyses have revealed that a subset of excavates, however, lack many canonical actin-binding proteins central to actin cytoskeleton function in other eukaryotes. Overall, excavates possess numerous uncharacterized and “hypothetical” genes, and may represent an undiscovered reservoir of novel cytoskeletal genes and cytoskeletal mechanisms. The continued development of molecular genetic tools in these complex microbial eukaryotes will undoubtedly contribute to our overall understanding of cytoskeletal diversity and evolution. PMID:23312067
Algorithm, applications and evaluation for protein comparison by Ramanujan Fourier transform.

PubMed

Zhao, Jian; Wang, Jiasong; Hua, Wei; Ouyang, Pingkai

2015-12-01

The amino acid sequence of a protein determines its chemical properties, chain conformation and biological functions. Protein sequence comparison is of great importance to identify similarities of protein structures and infer their functions. Many properties of a protein correspond to the low-frequency signals within the sequence. Low frequency modes in protein sequences are linked to the secondary structures, membrane protein types, and sub-cellular localizations of the proteins. In this paper, we present Ramanujan Fourier transform (RFT) with a fast algorithm to analyze the low-frequency signals of protein sequences. The RFT method is applied to similarity analysis of protein sequences with the Resonant Recognition Model (RRM). The results show that the proposed fast RFT method on protein comparison is more efficient than commonly used discrete Fourier transform (DFT). RFT can detect common frequencies as significant feature for specific protein families, and the RFT spectrum heat-map of protein sequences demonstrates the information conservation in the sequence comparison. The proposed method offers a new tool for pattern recognition, feature extraction and structural analysis on protein sequences. Copyright © 2015 Elsevier Ltd. All rights reserved.
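The abstract above does not give the transform itself. As a rough sketch, the Ramanujan sums c_q(n) that underlie a Ramanujan Fourier transform can be computed from the Möbius function, and a signal can be projected onto them. The averaged coefficient used here is one common textbook-style definition and is an assumption, not necessarily the fast algorithm of the paper. The numeric test signal is arbitrary and does not represent any amino acid encoding.

```python
from math import gcd


def mobius(n):
    """Möbius function: 0 if n has a squared prime factor, else (-1)^(number of primes)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # p^2 divided the original n
            result = -result
        p += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result


def ramanujan_sum(q, n):
    """c_q(n) = sum over divisors d of gcd(q, n) of mu(q/d) * d."""
    g = gcd(q, n)
    return sum(mobius(q // d) * d for d in range(1, g + 1) if g % d == 0)


def rft_coefficient(signal, q):
    """Assumed definition: (1 / (N * phi(q))) * sum_n x(n) * c_q(n), with n starting at 1."""
    phi_q = ramanujan_sum(q, 0)  # c_q(0) equals Euler's totient phi(q)
    total = sum(x * ramanujan_sum(q, n) for n, x in enumerate(signal, start=1))
    return total / (len(signal) * phi_q)


# Arbitrary period-3 test signal built from c_3(n) itself.
signal = [ramanujan_sum(3, n) for n in range(1, 7)]
print([rft_coefficient(signal, q) for q in (1, 2, 3)])
```

Since c_q(n) is integer valued and periodic in n with period q, the projection picks out period-q structure without complex arithmetic, which is the usual motivation for using it in place of a DFT on such signals.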
Exome sequencing identifies a DNAJB6 mutation in a family with dominantly-inherited limb-girdle muscular dystrophy.

PubMed

Couthouis, Julien; Raphael, Alya R; Siskind, Carly; Findlay, Andrew R; Buenrostro, Jason D; Greenleaf, William J; Vogel, Hannes; Day, John W; Flanigan, Kevin M; Gitler, Aaron D

2014-05-01

Limb-girdle muscular dystrophy primarily affects the muscles of the hips and shoulders (the "limb-girdle" muscles), although it is a heterogeneous disorder that can present with varying symptoms. There is currently no cure. We sought to identify the genetic basis of limb-girdle muscular dystrophy type 1 in an American family of Northern European descent using exome sequencing. Exome sequencing was performed on DNA samples from two affected siblings and one unaffected sibling and resulted in the identification of eleven candidate mutations that co-segregated with the disease. Notably, this list included a previously reported mutation in DNAJB6, p.Phe89Ile, which was recently identified as a cause of limb-girdle muscular dystrophy type 1D. Additional family members were Sanger sequenced and the mutation in DNAJB6 was only found in affected individuals. Subsequent haplotype analysis indicated that this DNAJB6 p.Phe89Ile mutation likely arose independently of the previously reported mutation. Since other published mutations are located close by in the G/F domain of DNAJB6, this suggests that the area may represent a mutational hotspot. Exome sequencing provided an unbiased and effective method for identifying the genetic etiology of limb-girdle muscular dystrophy type 1 in a previously genetically uncharacterized family. This work further confirms the causative role of DNAJB6 mutations in limb-girdle muscular dystrophy type 1D. Copyright © 2014 Elsevier B.V. All rights reserved.
Identification of Clinical Coryneform Bacterial Isolates: Comparison of Biochemical Methods and Sequence Analysis of 16S rRNA and rpoB Genes

PubMed Central

Adderson, Elisabeth E.; Boudreaux, Jan W.; Cummings, Jessica R.; Pounds, Stanley; Wilson, Deborah A.; Procop, Gary W.; Hayden, Randall T.

2008-01-01

We compared the relative levels of effectiveness of three commercial identification kits and three nucleic acid amplification tests for the identification of coryneform bacteria by testing 50 diverse isolates, including 12 well-characterized control strains and 38 organisms obtained from pediatric oncology patients at our institution. Between 33.3 and 75.0% of control strains were correctly identified to the species level by phenotypic systems or nucleic acid amplification assays. The most sensitive tests were the API Coryne system and amplification and sequencing of the 16S rRNA gene using primers optimized for coryneform bacteria, which correctly identified 9 of 12 control isolates to the species level, and all strains with a high-confidence call were correctly identified. Organisms not correctly identified were species not included in the test kit databases or not producing a pattern of reactions included in kit databases or which could not be differentiated among several genospecies based on reaction patterns. Nucleic acid amplification assays had limited abilities to identify some bacteria to the species level, and comparison of sequence homologies was complicated by the inclusion of allele sequences obtained from uncultivated and uncharacterized strains in databases. The utility of rpoB genotyping was limited by the small number of representative gene sequences that are currently available for comparison. The correlation between identifications produced by different classification systems was poor, particularly for clinical isolates. PMID:18160450
The Peculiar Landscape of Repetitive Sequences in the Olive (Olea europaea L.) Genome

PubMed Central

Barghini, Elena; Natali, Lucia; Cossu, Rosa Maria; Giordani, Tommaso; Pindo, Massimo; Cattonaro, Federica; Scalabrin, Simone; Velasco, Riccardo; Morgante, Michele; Cavallini, Andrea

2014-01-01

Analyzing genome structure in different species provides insight into the evolution of plant genome size. Olive (Olea europaea L.) has a medium-sized haploid genome of 1.4 Gb, whose structure is largely uncharacterized, despite the growing importance of this tree as an oil crop. Next-generation sequencing technologies and different computational procedures have been used to study the composition of the olive genome and its repetitive fraction. A total of 2.03 and 2.3 genome equivalents of Illumina and 454 reads from genomic DNA, respectively, were assembled following different procedures, which produced more than 200,000 differently redundant contigs, with a mean length greater than 1,000 nt. Mapping Illumina reads onto the assembled sequences was used to estimate their redundancy. The genome data set was subdivided into highly and medium redundant and nonredundant contigs. By combining identification and mapping of repeated sequences, it was established that tandem repeats represent a very large portion of the olive genome (∼31% of the whole genome), consisting of six main families of different length, two of which were first discovered in these experiments. The other large redundant class in the olive genome is represented by transposable elements (especially long terminal repeat-retrotransposons). On the whole, the results of our analyses show the peculiar landscape of the olive genome, related to the massive amplification of tandem repeats, more than that reported for any other sequenced plant genome. PMID:24671744
The peculiar landscape of repetitive sequences in the olive (Olea europaea L.) genome.

PubMed

Barghini, Elena; Natali, Lucia; Cossu, Rosa Maria; Giordani, Tommaso; Pindo, Massimo; Cattonaro, Federica; Scalabrin, Simone; Velasco, Riccardo; Morgante, Michele; Cavallini, Andrea

2014-04-01

Analyzing genome structure in different species provides insight into the evolution of plant genome size. Olive (Olea europaea L.) has a medium-sized haploid genome of 1.4 Gb, whose structure is largely uncharacterized, despite the growing importance of this tree as an oil crop. Next-generation sequencing technologies and different computational procedures have been used to study the composition of the olive genome and its repetitive fraction. A total of 2.03 and 2.3 genome equivalents of Illumina and 454 reads from genomic DNA, respectively, were assembled following different procedures, which produced more than 200,000 differently redundant contigs, with a mean length greater than 1,000 nt. Mapping Illumina reads onto the assembled sequences was used to estimate their redundancy. The genome data set was subdivided into highly and medium redundant and nonredundant contigs. By combining identification and mapping of repeated sequences, it was established that tandem repeats represent a very large portion of the olive genome (∼31% of the whole genome), consisting of six main families of different length, two of which were first discovered in these experiments. The other large redundant class in the olive genome is represented by transposable elements (especially long terminal repeat-retrotransposons). On the whole, the results of our analyses show the peculiar landscape of the olive genome, related to the massive amplification of tandem repeats, more than that reported for any other sequenced plant genome.

Sequence and RT-PCR expression analysis of two peroxidases from Arabidopsis thaliana belonging to a novel evolutionary branch of plant peroxidases.

PubMed

Kjaersgård, I V; Jespersen, H M; Rasmussen, S K; Welinder, K G

1997-03-01

cDNA clones encoding two new Arabidopsis thaliana peroxidases, ATP 1a and ATP 2a, have been identified by searching the Arabidopsis database of expressed sequence tags (dbEST). They represent a novel branch of hitherto uncharacterized plant peroxidases that is only 35% identical in amino acid sequence to the well characterized group of basic plant peroxidases represented by the horseradish (Armoracia rusticana) isoperoxidases HRP C, HRP E5 and the similar Arabidopsis isoperoxidases ATP Ca, ATP Cb, and ATP Ea. However, ATP 1a is 87% identical in amino acid sequence to a peroxidase encoded by an mRNA isolated from cotton (Gossypium hirsutum). As cotton and Arabidopsis belong to rather diverse families (Malvaceae and Cruciferae, respectively), in contrast with Arabidopsis and horseradish (both Cruciferae), the high degree of sequence identity indicates that this novel type of peroxidase, albeit of unknown function, is likely to be widespread in plant species. The atp 1 and atp 2 types of cDNA sequences were the most redundant among the 28 different isoperoxidases identified among about 200 peroxidase encoding ESTs. Interestingly, 8 of the 38 EST sequences coding for ATP 1 showed three identical nucleotide substitutions. This variant form is designated ATP 1b. Similarly, 6 of the 16 EST sequences coding for ATP 2 showed a number of deletions and nucleotide changes. This variant form is designated ATP 2b. The selected EST clones are full-length and contain coding regions of 993 nucleotides for atp 1a, and 984 nucleotides for atp 2a. These regions show 61% DNA sequence identity. The predicted mature proteins ATP 1a and ATP 2a are 57% identical in sequence and contain the structurally and functionally important residues characteristic of the plant peroxidase superfamily. However, they do show two differences of importance to peroxidase catalysis: (1) the asparagine residue linked with the active site distal histidine via hydrogen bonding is absent; (2) an N-glycosylation site is located right at the entrance to the heme channel. The reverse transcriptase polymerase chain reaction (RT-PCR) was used to identify mRNAs coding for ATP 1a/b and ATP 2a/b in germinating seeds, seedlings, roots, leaves, stems, flowers and cell suspension culture using elongation factor 1alpha (EF-1alpha) for the first time as a positive control. Both mRNAs were transcribed at levels comparable to EF-1alpha in all plant tissues investigated which were more than two days old, and in cell suspension culture. In addition, the mRNA coding for ATP 1a/b was found in two-day-old germinating seeds. The abundant transcription of ATP 1a/b and ATP 2a/b is in line with their many entries in dbEST, and indicates essential roles for these novel peroxidases.
Clinical germline diagnostic exome sequencing for hereditary cancer: Findings within novel candidate genes are prevalent.

PubMed

Powis, Zöe; Espenschied, Carin R; LaDuca, Holly; Hagman, Kelly D; Paudyal, Tripti; Li, Shuwei; Inaba, Hiroto; Mauer, Ann; Nathanson, Katherine L; Knost, James; Chao, Elizabeth C; Tang, Sha

2018-08-01

Clinical diagnostic exome sequencing (DES) has been effective in diagnosing individuals with suspected genetic conditions; nevertheless, little has been described regarding its clinical utility in individuals with a personal and family history of cancer. This study aimed to assess diagnostic yield and clinical characteristics of pediatric and adult patients undergoing germline DES for hereditary cancer. We retrospectively reviewed 2171 patients referred for DES; cases with a personal and/or family history of cancer were further studied. Of 39 cancer patients, relevant alterations were found in eight individuals (21%), including one (3%) positive pathogenic alteration within a characterized gene, two (5%) uncertain findings in characterized genes, and five (13%) alterations in novel candidate genes. Two of the 5 pediatric patients undergoing testing (40%) had findings in novel candidate genes, with the remainder being negative. We include brief case studies to illustrate the variety of challenging issues related to these patients. Our observations demonstrate the utility of family-based exome sequencing in patients with suspected hereditary cancer, including familial co-segregation analysis and comprehensive medical review. DES may be particularly useful when traditional approaches do not result in a diagnosis or in families with unique phenotypes. This work also highlights the importance and complexity of analysis of uncharacterized genes in exome sequencing for hereditary cancer. Copyright © 2018 Elsevier Inc. All rights reserved.
Folding and Stabilization of Native-Sequence-Reversed Proteins

PubMed Central

Zhang, Yuanzhao; Weber, Jeffrey K; Zhou, Ruhong

2016-01-01

Though the problem of sequence-reversed protein folding is largely unexplored, one might speculate that reversed native protein sequences should be significantly more foldable than purely random heteropolymer sequences. In this article, we investigate how the reverse-sequences of native proteins might fold by examining a series of small proteins of increasing structural complexity (α-helix, β-hairpin, α-helix bundle, and α/β-protein). Employing a tandem protein structure prediction algorithmic and molecular dynamics simulation approach, we find that the ability of reverse sequences to adopt native-like folds is strongly influenced by protein size and the flexibility of the native hydrophobic core. For β-hairpins with reverse-sequences that fail to fold, we employ a simple mutational strategy for guiding stable hairpin formation that involves the insertion of amino acids into the β-turn region. This systematic look at reverse sequence duality sheds new light on the problem of protein sequence-structure mapping and may serve to inspire new protein design and protein structure prediction protocols. PMID:27113844
Folding and Stabilization of Native-Sequence-Reversed Proteins

NASA Astrophysics Data System (ADS)

Zhang, Yuanzhao; Weber, Jeffrey K.; Zhou, Ruhong

2016-04-01

Though the problem of sequence-reversed protein folding is largely unexplored, one might speculate that reversed native protein sequences should be significantly more foldable than purely random heteropolymer sequences. In this article, we investigate how the reverse-sequences of native proteins might fold by examining a series of small proteins of increasing structural complexity (α-helix, β-hairpin, α-helix bundle, and α/β-protein). Employing a tandem protein structure prediction algorithmic and molecular dynamics simulation approach, we find that the ability of reverse sequences to adopt native-like folds is strongly influenced by protein size and the flexibility of the native hydrophobic core. For β-hairpins with reverse-sequences that fail to fold, we employ a simple mutational strategy for guiding stable hairpin formation that involves the insertion of amino acids into the β-turn region. This systematic look at reverse sequence duality sheds new light on the problem of protein sequence-structure mapping and may serve to inspire new protein design and protein structure prediction protocols.
A CRM domain protein functions dually in group I and group II intron splicing in land plant chloroplasts.

PubMed

Asakura, Yukari; Barkan, Alice

2007-12-01

The CRM domain is a recently recognized RNA binding domain found in three group II intron splicing factors in chloroplasts, in a bacterial protein that associates with ribosome precursors, and in a family of uncharacterized proteins in plants. To elucidate the functional repertoire of proteins with CRM domains, we studied CFM2 (for CRM Family Member 2), which harbors four CRM domains. RNA coimmunoprecipitation assays showed that CFM2 in maize (Zea mays) chloroplasts is associated with the group I intron in pre-trnL-UAA and group II introns in the ndhA and ycf3 pre-mRNAs. T-DNA insertions in the Arabidopsis thaliana ortholog condition a defective-seed phenotype (strong allele) or chlorophyll-deficient seedlings with impaired splicing of the trnL group I intron and the ndhA, ycf3-int1, and clpP-int2 group II introns (weak alleles). CFM2 and two previously described CRM proteins are bound simultaneously to the ndhA and ycf3-int1 introns and act in a nonredundant fashion to promote their splicing. With these findings, CRM domain proteins are implicated in the activities of three classes of catalytic RNA: group I introns, group II introns, and 23S rRNA.
Shotgun proteomics of Aspergillus niger microsomes upon D-xylose induction.

PubMed

Ferreira de Oliveira, José Miguel P; van Passel, Mark W J; Schaap, Peter J; de Graaff, Leo H

2010-07-01

Protein secretion plays an eminent role in cell maintenance and adaptation to the extracellular environment of microorganisms. Although protein secretion is an extremely efficient process in filamentous fungi, the mechanisms underlying protein secretion have remained largely uncharacterized in these organisms. In this study, we analyzed the effects of the d-xylose induction of cellulase and hemicellulase enzyme secretion on the protein composition of secretory organelles in Aspergillus niger. We aimed to systematically identify the components involved in the secretion of these enzymes via mass spectrometry of enriched subcellular microsomal fractions. Under each condition, fractions enriched for secretory organelles were processed for tandem mass spectrometry, resulting in the identification of peptides that originate from 1,081 proteins, 254 of which (many of them hypothetical proteins) were predicted to play direct roles in the secretory pathway. d-Xylose induction led to an increase in specific small GTPases known to be associated with polarized growth, exocytosis, and endocytosis. Moreover, the endoplasmic-reticulum-associated degradation (ERAD) components Cdc48 and all 14 of the 20S proteasomal subunits were recruited to the secretory organelles. In conclusion, induction of extracellular enzymes results in specific changes in the secretory subproteome of A. niger, and the most prominent change found in this study was the recruitment of the 20S proteasomal subunits to the secretory organelles.
Transport capabilities of environmental Pseudomonads for sulfur compounds

DOE PAGES

Zerbs, Sarah; Korajczyk, Peter J.; Noirot, Philippe H.; ...

2017-01-27

Sulfur is an essential element in plant rhizospheres and microbial activity plays a key role in increasing the biological availability of sulfur in soil environments. To better understand the mechanisms facilitating the exchange of sulfur-containing molecules in soil, we profiled the binding specificities of eight previously uncharacterized ABC transporter solute-binding proteins from plant-associated Pseudomonads. A high-throughput screening procedure indicated eighteen significant organosulfur binding ligands, with at least one high-quality screening hit for each protein target. Calorimetric and spectroscopic methods were used to validate the best ligand assignments and catalog the thermodynamic properties of the protein-ligand interactions. Two novel high-affinity ligand binding activities were identified and quantified in this set of solute binding proteins. Bacteria were cultured in minimal media with screening library components supplied as the sole sulfur sources, demonstrating that these organosulfur compounds can be metabolized and confirming the relevance of ligand assignments. These results expand the set of experimentally validated ligands amenable to transport by this ABC transporter family and demonstrate the complex range of protein-ligand interactions that can be accomplished by solute-binding proteins. As a result, characterizing new nutrient import pathways provides insight into Pseudomonad metabolic capabilities which can be used to further interrogate bacterial survival and participation in soil and rhizosphere communities.
Hekate: Software Suite for the Mass Spectrometric Analysis and Three-Dimensional Visualization of Cross-Linked Protein Samples

PubMed Central

2013-01-01

Chemical cross-linking of proteins combined with mass spectrometry provides an attractive and novel method for the analysis of native protein structures and protein complexes. Analysis of the data, however, is complex. Only a small number of cross-linked peptides are produced during sample preparation and must be identified against a background of more abundant native peptides. To facilitate the search and identification of cross-linked peptides, we have developed a novel software suite, named Hekate. Hekate is a suite of tools that address the challenges involved in analyzing protein cross-linking experiments when combined with mass spectrometry. The software is an integrated pipeline for the automation of the data analysis workflow and provides a novel scoring system based on principles of linear peptide analysis. In addition, it provides a tool for the visualization of identified cross-links using three-dimensional models, which is particularly useful when combining chemical cross-linking with other structural techniques. Hekate was validated by the comparative analysis of cytochrome c (bovine heart) against previously reported data. Further validation was carried out on known structural elements of DNA polymerase III, the catalytic α-subunit of the Escherichia coli DNA replisome along with new insight into the previously uncharacterized C-terminal domain of the protein. PMID:24010795
DOE Office of Scientific and Technical Information (OSTI.GOV)

Zerbs, Sarah; Korajczyk, Peter J.; Noirot, Philippe H.

Sulfur is an essential element in plant rhizospheres and microbial activity plays a key role in increasing the biological availability of sulfur in soil environments. To better understand the mechanisms facilitating the exchange of sulfur-containing molecules in soil, we profiled the binding specificities of eight previously uncharacterized ABC transporter solute-binding proteins from plant-associated Pseudomonads. A high-throughput screening procedure indicated eighteen significant organosulfur binding ligands, with at least one high-quality screening hit for each protein target. Calorimetric and spectroscopic methods were used to validate the best ligand assignments and catalog the thermodynamic properties of the protein-ligand interactions. Two novel high-affinity ligand binding activities were identified and quantified in this set of solute binding proteins. Bacteria were cultured in minimal media with screening library components supplied as the sole sulfur sources, demonstrating that these organosulfur compounds can be metabolized and confirming the relevance of ligand assignments. These results expand the set of experimentally validated ligands amenable to transport by this ABC transporter family and demonstrate the complex range of protein-ligand interactions that can be accomplished by solute-binding proteins. As a result, characterizing new nutrient import pathways provides insight into Pseudomonad metabolic capabilities which can be used to further interrogate bacterial survival and participation in soil and rhizosphere communities.
Genome-wide identification, characterization, and expression profile of aquaporin gene family in flax (Linum usitatissimum)

PubMed Central

Shivaraj, S. M.; Deshmukh, Rupesh K.; Rai, Rhitu; Bélanger, Richard; Agrawal, Pawan K.; Dash, Prasanta K.

2017-01-01

Membrane intrinsic proteins (MIPs) form transmembrane channels and facilitate transport of myriad substrates across the cell membrane in many organisms. The majority of plant MIPs have water-transporting ability and are commonly referred to as aquaporins (AQPs). In the present study, we identified aquaporin-coding genes in flax by genome-wide analysis and examined their structure, function, and expression pattern by pan-genome exploration. Cross-genera phylogenetic analysis with known aquaporins from rice, Arabidopsis, and poplar showed five subgroups of flax aquaporins representing 16 plasma membrane intrinsic proteins (PIPs), 17 tonoplast intrinsic proteins (TIPs), 13 NOD26-like intrinsic proteins (NIPs), 2 small basic intrinsic proteins (SIPs), and 3 uncharacterized intrinsic proteins (XIPs). Among aquaporins, PIPs contained a hydrophilic aromatic/arginine (ar/R) selectivity filter, whereas the TIP, NIP, SIP, and XIP subfamilies mostly contained a hydrophobic ar/R selectivity filter. Analysis of RNA-seq and microarray data revealed high expression of PIPs in multiple tissues, low expression of NIPs, and seed-specific expression of TIP3 in flax. Exploration of aquaporin homologs in three closely related Linum species (L. bienne, L. grandiflorum, and L. leonii) revealed the presence of 49, 39, and 19 AQPs, respectively. The genome-wide identification of aquaporins, the first in flax, provides insight into their physiological and developmental roles in flax. PMID:28447607
Genome-wide identification, characterization, and expression profile of aquaporin gene family in flax (Linum usitatissimum).

PubMed

Shivaraj, S M; Deshmukh, Rupesh K; Rai, Rhitu; Bélanger, Richard; Agrawal, Pawan K; Dash, Prasanta K

2017-04-27

Membrane intrinsic proteins (MIPs) form transmembrane channels and facilitate transport of myriad substrates across the cell membrane in many organisms. The majority of plant MIPs have water-transporting ability and are commonly referred to as aquaporins (AQPs). In the present study, we identified aquaporin-coding genes in flax by genome-wide analysis and examined their structure, function, and expression pattern by pan-genome exploration. Cross-genera phylogenetic analysis with known aquaporins from rice, Arabidopsis, and poplar showed five subgroups of flax aquaporins representing 16 plasma membrane intrinsic proteins (PIPs), 17 tonoplast intrinsic proteins (TIPs), 13 NOD26-like intrinsic proteins (NIPs), 2 small basic intrinsic proteins (SIPs), and 3 uncharacterized intrinsic proteins (XIPs). Among aquaporins, PIPs contained a hydrophilic aromatic/arginine (ar/R) selectivity filter, whereas the TIP, NIP, SIP, and XIP subfamilies mostly contained a hydrophobic ar/R selectivity filter. Analysis of RNA-seq and microarray data revealed high expression of PIPs in multiple tissues, low expression of NIPs, and seed-specific expression of TIP3 in flax. Exploration of aquaporin homologs in three closely related Linum species (L. bienne, L. grandiflorum, and L. leonii) revealed the presence of 49, 39, and 19 AQPs, respectively. The genome-wide identification of aquaporins, the first in flax, provides insight into their physiological and developmental roles in flax.
Database-independent Protein Sequencing (DiPS) Enables Full-length de Novo Protein and Antibody Sequence Determination.

PubMed

Savidor, Alon; Barzilay, Rotem; Elinger, Dalia; Yarden, Yosef; Lindzen, Moshit; Gabashvili, Alexandra; Adiv Tal, Ophir; Levin, Yishai

2017-06-01

Traditional "bottom-up" proteomic approaches use proteolytic digestion, LC-MS/MS, and database searching to elucidate peptide identities and their parent proteins. Protein sequences absent from the database cannot be identified, and even if present in the database, complete sequence coverage is rarely achieved even for the most abundant proteins in the sample. Thus, sequencing of unknown proteins such as antibodies or constituents of metaproteomes remains a challenging problem. To date, there is no available method for full-length protein sequencing, independent of a reference database, in high throughput. Here, we present Database-independent Protein Sequencing, a method for unambiguous, rapid, database-independent, full-length protein sequencing. The method is a novel combination of non-enzymatic, semi-random cleavage of the protein, LC-MS/MS analysis, peptide de novo sequencing, extraction of peptide tags, and their assembly into a consensus sequence using an algorithm named "Peptide Tag Assembler." As proof-of-concept, the method was applied to samples of three known proteins representing three size classes and to a previously un-sequenced, clinically relevant monoclonal antibody. Excluding leucine/isoleucine and glutamic acid/deamidated glutamine ambiguities, end-to-end full-length de novo sequencing was achieved with 99-100% accuracy for all benchmarking proteins and the antibody light chain. Accuracy of the sequenced antibody heavy chain, including the entire variable region, was also 100%, but there was a 23-residue gap in the constant region sequence. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
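The abstract above describes assembling peptide tags into a consensus sequence but does not specify how the "Peptide Tag Assembler" works. As a loose illustration of the general idea only, the sketch below merges overlapping tags greedily by longest suffix-prefix overlap. This is a generic overlap-assembly technique swapped in for illustration, not the authors' algorithm. The function names, the `min_len` parameter, and the toy sequence are all hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that is a prefix of `b`, or 0 if shorter than min_len."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(tags, min_len=3):
    """Greedily merge the pair of fragments with the largest overlap until none remain."""
    unique = set(tags)
    # Discard tags fully contained in another tag; sort for deterministic behavior.
    frags = sorted(t for t in unique if not any(t != u and t in u for u in unique))
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: leave the fragments as separate contigs
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return sorted(frags, key=len, reverse=True)


# Hypothetical overlapping tags cut from a made-up 16-residue sequence.
tags = ["MKTAYIA", "AYIAKQR", "KQRQISF", "QISFVK"]
print(greedy_assemble(tags))
```

A real assembler would also need to handle sequencing errors and the leucine/isoleucine ambiguity noted in the abstract, typically by building a per-position consensus from many redundant tags rather than trusting any single overlap.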
Experimental measurement-device-independent quantum key distribution with uncharacterized encoding.

PubMed

Wang, Chao; Wang, Shuang; Yin, Zhen-Qiang; Chen, Wei; Li, Hong-Wei; Zhang, Chun-Mei; Ding, Yu-Yang; Guo, Guang-Can; Han, Zheng-Fu

2016-12-01

Measurement-device-independent quantum key distribution (MDI QKD) is an efficient way to share secrets using untrusted measurement devices. However, the assumption on the characterizations of encoding states is still necessary in this promising protocol, which may lead to unnecessary complexity and potential loopholes in realistic implementations. Here, by using the mismatched-basis statistics, we present the first proof-of-principle experiment of MDI QKD with uncharacterized encoding sources. In this demonstration, the encoded states are only required to be constrained in a two-dimensional Hilbert space, and two distant parties (Alice and Bob) are resistant to state preparation flaws even if they have no idea about the detailed information of their encoding states. The positive final secure key rates of our system exhibit the feasibility of this novel protocol, and demonstrate its value for the application of secure communication with uncharacterized devices.
Proteomic analysis of cow, yak, buffalo, goat and camel milk whey proteins: quantitative differential expression patterns.

PubMed

Yang, Yongxin; Bu, Dengpan; Zhao, Xiaowei; Sun, Peng; Wang, Jiaqi; Zhou, Lingyun

2013-04-05

To aid in unraveling diverse genetic and biological unknowns, a proteomic approach was used to analyze the whey proteome in cow, yak, buffalo, goat, and camel milk based on the isobaric tag for relative and absolute quantification (iTRAQ) techniques. This analysis is the first to produce proteomic data for the milk from the above-mentioned animal species: 211 proteins have been identified and 113 proteins have been categorized according to molecular function, cellular components, and biological processes based on gene ontology annotation. The results of principal component analysis showed significant differences in proteomic patterns among goat, camel, cow, buffalo, and yak milk. Furthermore, 177 differentially expressed proteins were submitted to advanced hierarchical clustering. The resulting clustering pattern included three major sample clusters: (1) cow, buffalo, and yak milk; (2) goat, cow, buffalo, and yak milk; and (3) camel milk. Certain proteins were chosen as characterization traits for a given species: whey acidic protein and quinone oxidoreductase for camel milk, biglycan for goat milk, uncharacterized protein (Accession Number: F1MK50) for yak milk, clusterin for buffalo milk, and primary amine oxidase for cow milk. These results help reveal the quantitative milk whey proteome pattern for the analyzed species. This provides information for evaluating adulteration of milk from a specific species and may provide potential directions for application of specific milk protein production based on physiological differences among animal species.
Sequence Complexity of Amyloidogenic Regions in Intrinsically Disordered Human Proteins

PubMed Central

Das, Swagata; Pal, Uttam; Das, Supriya; Bagga, Khyati; Roy, Anupam; Mrigwani, Arpita; Maiti, Nakul C.

2014-01-01

An amyloidogenic region (AR) in a protein sequence plays a significant role in protein aggregation and amyloid formation. We have investigated the sequence complexity of ARs present in intrinsically disordered human proteins. More than 80% of human proteins in the disordered protein databases (DisProt+IDEAL) contained one or more ARs. AR content in the protein sequence decreased as protein disorder decreased. A probability density distribution analysis and discrete analysis of AR sequences showed that ∼8% of the residues in a protein sequence were in ARs and that an AR was on average 8 residues long. The residues in the ARs were high in sequence complexity, and ARs seldom overlapped with low complexity regions (LCRs), which were abundant in disordered proteins. The sequences in the ARs showed mixed conformational adaptability towards α-helix, β-sheet/strand and coil conformations. PMID:24594841
How Many Protein Sequences Fold to a Given Structure? A Coevolutionary Analysis.

PubMed

Tian, Pengfei; Best, Robert B

2017-10-17

Quantifying the relationship between protein sequence and structure is key to understanding the protein universe. A fundamental measure of this relationship is the total number of amino acid sequences that can fold to a target protein structure, known as the "sequence capacity," which has been suggested as a proxy for how designable a given protein fold is. Although sequence capacity has been extensively studied using lattice models and theory, numerical estimates for real protein structures are currently lacking. In this work, we have quantitatively estimated the sequence capacity of 10 proteins with a variety of different structures using a statistical model based on residue-residue co-evolution to capture the variation of sequences from the same protein family. Remarkably, we find that even for the smallest protein folds, such as the WW domain, the number of foldable sequences is extremely large, exceeding the Avogadro constant. In agreement with earlier theoretical work, the calculated sequence capacity is positively correlated with the size of the protein, or better, the density of contacts. This allows the absolute sequence capacity of a given protein to be approximately predicted from its structure. On the other hand, the relative sequence capacity, i.e., normalized by the total number of possible sequences, is an extremely tiny number and is strongly anti-correlated with the protein length. Thus, although there may be more foldable sequences for larger proteins, it will be much harder to find them. Lastly, we have correlated the evolutionary age of proteins in the CATH database with their sequence capacity as predicted by our model. The results suggest a trade-off between the opposing requirements of high designability and the likelihood of a novel fold emerging by chance. Published by Elsevier Inc.
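The distinction this abstract draws between absolute and relative sequence capacity comes down to dividing by the size of sequence space, 20^L for a chain of L residues. The sketch below works in log10 units to avoid overflow. The chain length of 35 is an illustrative assumption rather than a value from the paper, and using Avogadro's constant as the capacity simply applies the lower bound quoted in the abstract.

```python
import math

AVOGADRO = 6.02214076e23  # per mole (exact SI value)
N_AMINO_ACIDS = 20        # standard amino acid alphabet


def log10_sequence_space(length: int) -> float:
    """log10 of the number of possible sequences of a given length (20^L)."""
    return length * math.log10(N_AMINO_ACIDS)


def log10_relative_capacity(log10_capacity: float, length: int) -> float:
    """log10 of (foldable sequences / all possible sequences)."""
    return log10_capacity - log10_sequence_space(length)


if __name__ == "__main__":
    length = 35  # assumed length for a small domain, for illustration only
    lower_bound = math.log10(AVOGADRO)  # capacity "exceeding the Avogadro constant"
    print(f"log10(sequence space)     = {log10_sequence_space(length):.2f}")
    print(f"log10(relative capacity) >= {log10_relative_capacity(lower_bound, length):.2f}")
```

Because sequence space grows as 20^L, a capacity that rises with protein size can still be a shrinking fraction of all sequences, which is the trade-off the abstract describes.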
Metabolomic strategies to map functions of metabolic pathways

PubMed Central

Mulvihill, Melinda M.

2014-01-01

Genome sequencing efforts have revealed a strikingly large number of unannotated and uncharacterized genes that fall into metabolic enzymes classes, likely indicating that our current knowledge of biochemical pathways in normal physiology, let alone in disease states, remains largely incomplete. This realization presents a daunting challenge for post-genomic-era scientists in deciphering the biochemical and (patho)physiological roles of these enzymes and their metabolites and metabolic networks. This is further complicated by many recent studies showing a rewiring of normal metabolic networks in disease states to give rise to unique pathophysiological functions of enzymes, metabolites, and metabolic pathways. This review focuses on recent discoveries made using metabolic mapping technologies to uncover novel pathways and metabolite-mediated posttranslational modifications and epigenetic alterations and their impact on physiology and disease. PMID:24918200
Chemical Approaches to Probe Metabolic Networks

PubMed Central

Medina-Cleghorn, Daniel; Nomura, Daniel K.

2013-01-01

One of the more provocative realizations that have come out of the genome sequencing projects is that organisms possess a large number of uncharacterized or poorly characterized enzymes. This finding belies the commonly held notion that our knowledge of cell metabolism is nearly complete, underscoring the vast landscape of unannotated metabolic and signaling networks that operate under normal physiological conditions, let alone in disease states where metabolic networks may be rewired, dysregulated, or altered to drive disease progression. Consequently, the functional annotation of enzymatic pathways represents a grand challenge for researchers in the post-genomic era. This review will highlight the chemical technologies that have been successfully used to characterize metabolism, and put forth some of the challenges we face as we expand our map of metabolic pathways. PMID:23296751
'FloraArray' for screening of specific DNA probes representing the characteristics of a certain microbial community.

PubMed

Yokoi, Takahide; Kaku, Yoshiko; Suzuki, Hiroyuki; Ohta, Masayuki; Ikuta, Hajime; Isaka, Kazuichi; Sumino, Tatsuo; Wagatsuma, Masako

2007-08-01

To investigate uncharacterized microbial communities, a custom DNA microarray named 'FloraArray' was developed for screening specific probes that would represent the characteristics of a microbial community. The array was prepared by spotting 2000 plasmid DNAs from a genomic shotgun library of a sludge sample on a DNA microarray. By comparative hybridization of the array with two different samples of genomic DNA, one from the activated sludge and the other from a nonactivated sludge sample of an anaerobic ammonium oxidation (anammox) bacterial community, specific spots were visualized as a definite fluctuating profile in an MA (differential intensity ratio vs. spot intensity) plot. About 300 spots of the array accounted for the candidate probes to represent anammox reaction of the activated sludge. After sequence analysis of the probes and examination of the results of blastn searches against the reported anammox reference sequence, complete matches were found for 161 probes (58.3%) and >90% matches were found for 242 probes (87.1%). These results demonstrate that 'FloraArray' could be a useful tool for screening specific DNA molecules of unknown microbial communities.
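The MA plot mentioned in this abstract can be sketched using the conventional microarray definitions: M is the log2 ratio of the two sample intensities and A is their mean log2 intensity. Whether the authors used exactly these definitions is an assumption, and the `min_abs_m` cutoff and the example intensities are hypothetical.

```python
import math


def ma_values(sample_a: float, sample_b: float) -> tuple[float, float]:
    """Return (M, A) for one spot from two positive intensities.

    M = log2(a) - log2(b)          (differential intensity ratio)
    A = 0.5 * (log2(a) + log2(b))  (average spot intensity)
    """
    if sample_a <= 0 or sample_b <= 0:
        raise ValueError("intensities must be positive")
    log_a, log_b = math.log2(sample_a), math.log2(sample_b)
    return log_a - log_b, 0.5 * (log_a + log_b)


def candidate_spots(intensities: list[tuple[float, float]], min_abs_m: float = 1.0) -> list[int]:
    """Indices of spots whose |M| reaches the (hypothetical) cutoff."""
    hits = []
    for idx, (a, b) in enumerate(intensities):
        m, _ = ma_values(a, b)
        if abs(m) >= min_abs_m:
            hits.append(idx)
    return hits


if __name__ == "__main__":
    # Made-up intensities: (activated sludge, nonactivated sludge) per spot.
    spots = [(8.0, 2.0), (4.0, 4.0), (1.0, 4.0)]
    print([ma_values(a, b) for a, b in spots])
    print(candidate_spots(spots))
```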
Marked alterations in the distal gut microbiome linked to diet-induced obesity

PubMed Central

Turnbaugh, Peter J.; Backhed, Fredrik; Fulton, Lucinda; Gordon, Jeffrey I.

2013-01-01

SUMMARY We have investigated the inter-relationship between diet, gut microbial ecology and energy balance using a mouse model of obesity produced by consumption of a prototypic Western diet. Diet-induced obesity (DIO) produced a bloom in a single uncultured clade within the Mollicutes class of the Firmicutes, which became the dominant lineage within the distal gut microbiota. This bloom was diminished by subsequent dietary manipulations that limit weight gain and reduce adiposity. Transplantation of the microbiota from mice with DIO to lean germ-free recipients produced a significantly greater increase in adiposity than transplants from lean donors. Metagenomic sequencing of the gut microbiome, biochemical assays, plus sequencing and in silico metabolic reconstructions of a related human gut-associated Mollicute (E.dolichum), revealed features that may provide a competitive advantage for members of the bloom in the Western diet nutrient milieu, including genes involved in import and metabolism of simple sugars. Our study illustrates how combining comparative metagenomics with gnotobiotic mouse models and specific dietary manipulations can disclose the niches of previously uncharacterized members of the gut microbiota. PMID:18407065

Regulatory Features for Odorant Receptor Genes in the Mouse Genome.

PubMed

Degl'Innocenti, Andrea; D'Errico, Anna

2017-01-01

The odorant receptor genes, seven-transmembrane receptor genes constituting the vastest mammalian gene multifamily, are expressed monogenically and monoallelically in each sensory neuron in the olfactory epithelium. This characteristic, often referred to as the one neuron-one receptor rule, is driven by mostly uncharacterized molecular dynamics, generally named odorant receptor gene choice. Much attention has been paid by the scientific community to the identification of sequences regulating the expression of odorant receptor genes within their loci, where related genes are usually arranged in genomic clusters. A number of studies identified transcription factor binding sites on odorant receptor promoter sequences. Similar binding sites were also found on a number of enhancers that regulate their transcription in cis but have also been proposed to form interchromosomal networks. Odorant receptor gene choice seems to occur via the local removal of strongly repressive epigenetic markings, put in place during the maturation of the sensory neuron on each odorant receptor locus. Here we review the fast-changing state of the art for the study of regulatory features for odorant receptor genes.
Detecting nitrous oxide reductase (NosZ) genes in soil metagenomes: method development and implications for the nitrogen cycle.

PubMed

Orellana, L H; Rodriguez-R, L M; Higgins, S; Chee-Sanford, J C; Sanford, R A; Ritalahti, K M; Löffler, F E; Konstantinidis, K T

2014-06-03

Microbial activities in soils, such as (incomplete) denitrification, represent major sources of nitrous oxide (N2O), a potent greenhouse gas. The key enzyme for mitigating N2O emissions is NosZ, which catalyzes N2O reduction to N2. We recently described "atypical" functional NosZ proteins encoded by both denitrifiers and nondenitrifiers, which were missed in previous environmental surveys (R. A. Sanford et al., Proc. Natl. Acad. Sci. U. S. A. 109:19709-19714, 2012, doi:10.1073/pnas.1211238109). Here, we analyzed the abundance and diversity of both nosZ types in whole-genome shotgun metagenomes from sandy and silty loam agricultural soils that typify the U.S. Midwest corn belt. First, different search algorithms and parameters for detecting nosZ metagenomic reads were evaluated based on in silico-generated (mock) metagenomes. Using the derived cutoffs, 71 distinct alleles (95% amino acid identity level) encoding typical or atypical NosZ proteins were detected in both soil types. Remarkably, more than 70% of the total nosZ reads in both soils were classified as atypical, emphasizing that prior surveys underestimated nosZ abundance. Approximately 15% of the total nosZ reads were taxonomically related to Anaeromyxobacter, which was the most abundant genus encoding atypical NosZ-type proteins in both soil types. Further analyses revealed that atypical nosZ genes outnumbered typical nosZ genes in most publicly available soil metagenomes, underscoring their potential role in mediating N2O consumption in soils. Therefore, this study provides a bioinformatics strategy to reliably detect target genes in complex short-read metagenomes and suggests that the analysis of both typical and atypical nosZ sequences is required to understand and predict N2O flux in soils. Nitrous oxide (N2O) is a potent greenhouse gas with ozone layer destruction potential. 
Microbial activities control both the production and the consumption of N2O, i.e., its conversion to innocuous dinitrogen gas (N2). Until recently, consumption of N2O was attributed to bacteria encoding "typical" nitrous oxide reductase (NosZ). However, recent phylogenetic and physiological studies have shown that previously uncharacterized, functional, "atypical" NosZ proteins are encoded in genomes of diverse bacterial groups. The present study revealed that atypical nosZ genes outnumbered their typical counterparts, highlighting their potential role in N2O consumption in soils and possibly other environments. These findings advance our understanding of the diversity of microbes and functional genes involved in the nitrogen cycle and provide the means (e.g., gene sequences) to study N2O fluxes to the atmosphere and associated climate change. Copyright © 2014 Orellana et al.
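The abstract counts distinct alleles at a 95% amino acid identity level. One simple way to group sequences at such a threshold is greedy clustering against cluster representatives, sketched below for sequences that are already aligned to equal length. This illustrates the idea only and is not the pipeline used in the study; the function names and toy sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identical positions between two pre-aligned, equal-length sequences.

    Columns where either sequence has a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def greedy_cluster(seqs: list[str], threshold: float = 95.0) -> list[list[str]]:
    """Assign each sequence to the first representative it matches at >= threshold."""
    clusters: list[list[str]] = []  # clusters[i][0] is the representative
    for seq in seqs:
        for cluster in clusters:
            if percent_identity(seq, cluster[0]) >= threshold:
                cluster.append(seq)
                break
        else:
            clusters.append([seq])  # no match: seq starts a new cluster
    return clusters


if __name__ == "__main__":
    ref = "ACDEFGHIKLMNPQRSTVWY"
    one_change = "ACDEFGHIKLMNPQRSTVWA"   # 19/20 identical = 95%
    two_changes = "ACDEFGHIKLMNPQRSTVAA"  # 18/20 identical = 90%
    print(len(greedy_cluster([ref, one_change, two_changes])))
```

Greedy clustering depends on input order, so production tools usually sort sequences (for example by length) before clustering.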
Negative Regulation of Violacein Biosynthesis in Chromobacterium violaceum.

PubMed

Devescovi, Giulia; Kojic, Milan; Covaceuszach, Sonia; Cámara, Miguel; Williams, Paul; Bertani, Iris; Subramoni, Sujatha; Venturi, Vittorio

2017-01-01

In Chromobacterium violaceum, the purple pigment violacein is under positive regulation by the N-acylhomoserine lactone CviI/R quorum sensing system and negative regulation by an uncharacterized putative repressor. In this study we report that the biosynthesis of violacein is negatively controlled by a novel repressor protein, VioS. The violacein operon is regulated negatively by VioS and positively by the CviI/R system in both C. violaceum and in a heterologous Escherichia coli genetic background. VioS does not regulate the CviI/R system and, apart from violacein, VioS and quorum sensing regulate other phenotypes antagonistically. Quorum sensing-regulated phenotypes in C. violaceum are therefore subject to further regulation, providing an additional level of control.
Discovery and characterization of a sulfoquinovose mutarotase using kinetic analysis at equilibrium by exchange spectroscopy

PubMed Central

Abayakoon, Palika; Lingford, James P.; Jin, Yi; Bengt, Christopher; Davies, Gideon J.; Yao, Shenggen; Goddard-Borger, Ethan D.

2018-01-01

Bacterial sulfoglycolytic pathways catabolize sulfoquinovose (SQ), or glycosides thereof, to generate a three-carbon metabolite for primary cellular metabolism and a three-carbon sulfonate that is expelled from the cell. Sulfoglycolytic operons encoding an Embden–Meyerhof–Parnas-like or Entner–Doudoroff (ED)-like pathway harbor an uncharacterized gene (yihR in Escherichia coli; PpSQ1_00415 in Pseudomonas putida) that is up-regulated in the presence of SQ, has been annotated as an aldose-1-epimerase and may encode an SQ mutarotase. Our sequence analyses and structural modeling confirmed that these proteins possess mutarotase-like active sites with conserved catalytic residues. We overexpressed the homolog from the sulfo-ED operon of Herbaspirillum seropedicae (HsSQM) and used it to demonstrate SQ mutarotase activity for the first time. This was accomplished using nuclear magnetic resonance exchange spectroscopy, a method that allows the chemical exchange of magnetization between the two SQ anomers at equilibrium. HsSQM also catalyzed the mutarotation of various aldohexoses with an equatorial 2-hydroxy group, including d-galactose, d-glucose, d-glucose-6-phosphate (Glc-6-P), and d-glucuronic acid, but not d-mannose. HsSQM displayed only 5-fold selectivity in terms of efficiency (kcat/KM) for SQ versus the glycolysis intermediate Glc-6-P; however, its proficiency [kuncat/(kcat/KM)] for SQ was 17 000-fold better than for Glc-6-P, revealing that HsSQM preferentially stabilizes the SQ transition state. PMID:29535276
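The two fold-differences quoted in this abstract can be related by simple arithmetic. The sketch assumes the conventional definition of catalytic proficiency, (kcat/KM)/kuncat, and treats the rounded values from the abstract as exact; the function name is hypothetical, and the resulting ratio is an inference from those two numbers, not a figure reported in the abstract.

```python
def implied_uncatalyzed_ratio(proficiency_fold: float, efficiency_fold: float) -> float:
    """Ratio of uncatalyzed rate constants implied by two fold-differences.

    With proficiency = (kcat/KM) / kuncat for each substrate:
        proficiency_fold = efficiency_fold * (kuncat_other / kuncat_preferred)
    so the ratio of uncatalyzed rates is proficiency_fold / efficiency_fold.
    """
    if efficiency_fold <= 0:
        raise ValueError("efficiency_fold must be positive")
    return proficiency_fold / efficiency_fold


if __name__ == "__main__":
    # Values quoted in the abstract: 17 000-fold proficiency, 5-fold efficiency.
    ratio = implied_uncatalyzed_ratio(17_000, 5)
    print(f"kuncat(Glc-6-P) / kuncat(SQ) is roughly {ratio:.0f}")
```

Read this way, most of the proficiency difference would come from a slower uncatalyzed mutarotation of SQ rather than from a large difference in kcat/KM.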
Loss of a highly conserved sterile alpha motif domain gene (WEEP) results in pendulous branch growth in peach trees.

PubMed

Hollender, Courtney A; Pascal, Thierry; Tabb, Amy; Hadiarto, Toto; Srinivasan, Chinnathambi; Wang, Wanpeng; Liu, Zhongchi; Scorza, Ralph; Dardick, Chris

2018-05-15

Plant shoots typically grow upward in opposition to the pull of gravity. However, exceptions exist throughout the plant kingdom. Most conspicuous are trees with weeping or pendulous branches. While such trees have long been cultivated and appreciated for their ornamental value, the molecular basis behind the weeping habit is not known. Here, we characterized a weeping tree phenotype in Prunus persica (peach) and identified the underlying genetic mutation using a genomic sequencing approach. Weeping peach tree shoots exhibited a downward elliptical growth pattern and did not exhibit an upward bending in response to 90° reorientation. The causative allele was found to be an uncharacterized gene, Ppa013325, having a 1.8-Kb deletion spanning the 5' end. This gene, dubbed WEEP, was predominantly expressed in phloem tissues and encodes a highly conserved 129-amino acid protein containing a sterile alpha motif (SAM) domain. Silencing WEEP in the related tree species Prunus domestica (plum) resulted in more outward, downward, and wandering shoot orientations compared to standard trees, supporting a role for WEEP in directing lateral shoot growth in trees. This previously unknown regulator of branch orientation, which may also be a regulator of gravity perception or response, provides insights into our understanding of how tree branches grow in opposition to gravity and could serve as a critical target for manipulating tree architecture for improved tree shape in agricultural and horticulture applications. Copyright © 2018 the Author(s). Published by PNAS.
Predictive regulatory models in Drosophila melanogaster by integrative inference of transcriptional networks

PubMed Central

Marbach, Daniel; Roy, Sushmita; Ay, Ferhat; Meyer, Patrick E.; Candeias, Rogerio; Kahveci, Tamer; Bristow, Christopher A.; Kellis, Manolis

2012-01-01

Gaining insights on gene regulation from large-scale functional data sets is a grand challenge in systems biology. In this article, we develop and apply methods for transcriptional regulatory network inference from diverse functional genomics data sets and demonstrate their value for gene function and gene expression prediction. We formulate the network inference problem in a machine-learning framework and use both supervised and unsupervised methods to predict regulatory edges by integrating transcription factor (TF) binding, evolutionarily conserved sequence motifs, gene expression, and chromatin modification data sets as input features. Applying these methods to Drosophila melanogaster, we predict ∼300,000 regulatory edges in a network of ∼600 TFs and 12,000 target genes. We validate our predictions using known regulatory interactions, gene functional annotations, tissue-specific expression, protein–protein interactions, and three-dimensional maps of chromosome conformation. We use the inferred network to identify putative functions for hundreds of previously uncharacterized genes, including many in nervous system development, which are independently confirmed based on their tissue-specific expression patterns. Last, we use the regulatory network to predict target gene expression levels as a function of TF expression, and find significantly higher predictive power for integrative networks than for motif or ChIP-based networks. Our work reveals the complementarity between physical evidence of regulatory interactions (TF binding, motif conservation) and functional evidence (coordinated expression or chromatin patterns) and demonstrates the power of data integration for network inference and studies of gene regulation at the systems level. PMID:22456606
Novel approaches in function-driven single-cell genomics.

PubMed

Doud, Devin F R; Woyke, Tanja

2017-07-01

Deeper sequencing and improved bioinformatics in conjunction with single-cell and metagenomic approaches continue to illuminate undercharacterized environmental microbial communities. This has propelled the 'who is there, and what might they be doing' paradigm to the uncultivated and has already radically changed the topology of the tree of life and provided key insights into the microbial contribution to biogeochemistry. While characterization of 'who' based on marker genes can describe a large fraction of the community, answering 'what are they doing' remains the elusive pinnacle for microbiology. Function-driven single-cell genomics provides a solution by using a function-based screen to subsample complex microbial communities in a targeted manner for the isolation and genome sequencing of single cells. This enables single-cell sequencing to be focused on cells with specific phenotypic or metabolic characteristics of interest. Recovered genomes are conclusively implicated for both encoding and exhibiting the feature of interest, improving downstream annotation and revealing activity levels within that environment. This emerging approach has already improved our understanding of microbial community functioning and facilitated the experimental analysis of uncharacterized gene product space. Here we provide a comprehensive review of strategies that have been applied for function-driven single-cell genomics and the future directions we envision. © FEMS 2017.
Novel approaches in function-driven single-cell genomics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Doud, Devin F. R.; Woyke, Tanja

Deeper sequencing and improved bioinformatics in conjunction with single-cell and metagenomic approaches continue to illuminate undercharacterized environmental microbial communities. This has propelled the 'who is there, and what might they be doing' paradigm to the uncultivated and has already radically changed the topology of the tree of life and provided key insights into the microbial contribution to biogeochemistry. While characterization of 'who' based on marker genes can describe a large fraction of the community, answering 'what are they doing' remains the elusive pinnacle for microbiology. Function-driven single-cell genomics provides a solution by using a function-based screen to subsample complex microbial communities in a targeted manner for the isolation and genome sequencing of single cells. This enables single-cell sequencing to be focused on cells with specific phenotypic or metabolic characteristics of interest. Recovered genomes are conclusively implicated for both encoding and exhibiting the feature of interest, improving downstream annotation and revealing activity levels within that environment. This emerging approach has already improved our understanding of microbial community functioning and facilitated the experimental analysis of uncharacterized gene product space. Here we provide a comprehensive review of strategies that have been applied for function-driven single-cell genomics and the future directions we envision.
Novel approaches in function-driven single-cell genomics

DOE PAGES

Doud, Devin F. R.; Woyke, Tanja

2017-06-07

Deeper sequencing and improved bioinformatics in conjunction with single-cell and metagenomic approaches continue to illuminate undercharacterized environmental microbial communities. This has propelled the 'who is there, and what might they be doing' paradigm to the uncultivated and has already radically changed the topology of the tree of life and provided key insights into the microbial contribution to biogeochemistry. While characterization of 'who' based on marker genes can describe a large fraction of the community, answering 'what are they doing' remains the elusive pinnacle for microbiology. Function-driven single-cell genomics provides a solution by using a function-based screen to subsample complex microbial communities in a targeted manner for the isolation and genome sequencing of single cells. This enables single-cell sequencing to be focused on cells with specific phenotypic or metabolic characteristics of interest. Recovered genomes are conclusively implicated for both encoding and exhibiting the feature of interest, improving downstream annotation and revealing activity levels within that environment. This emerging approach has already improved our understanding of microbial community functioning and facilitated the experimental analysis of uncharacterized gene product space. Here we provide a comprehensive review of strategies that have been applied for function-driven single-cell genomics and the future directions we envision.
Toward Understanding Phage:Host Interactions in the Rumen; Complete Genome Sequences of Lytic Phages Infecting Rumen Bacteria

PubMed Central

Gilbert, Rosalind A.; Kelly, William J.; Altermann, Eric; Leahy, Sinead C.; Minchin, Catherine; Ouwerkerk, Diane; Klieve, Athol V.

2017-01-01

The rumen is known to harbor dense populations of bacteriophages (phages) predicted to be capable of infecting a diverse range of rumen bacteria. While bacterial genome sequencing projects are revealing the presence of phages which can integrate their DNA into the genome of their host to form stable, lysogenic associations, little is known of the genetics of phages which utilize lytic replication. These phages infect and replicate within the host, culminating in host lysis, and the release of progeny phage particles. While lytic phages for rumen bacteria have been previously isolated, their genomes have remained largely uncharacterized. Here we report the first complete genome sequences of lytic phage isolates specifically infecting three genera of rumen bacteria: Bacteroides, Ruminococcus, and Streptococcus. All phages were classified within the viral order Caudovirales and include two phage morphotypes, representative of the Siphoviridae and Podoviridae families. The phage genomes displayed modular organization and conserved viral genes were identified which enabled further classification and determination of closest phage relatives. Co-examination of bacterial host genomes led to the identification of several genes responsible for modulating phage:host interactions, including CRISPR/Cas elements and restriction-modification phage defense systems. These findings provide new genetic information and insights into how lytic phages may interact with bacteria of the rumen microbiome. PMID:29259581
Mesocarp localization of a bi-functional resveratrol/hydroxycinnamic acid glucosyltransferase of Concord grape (Vitis labrusca).

PubMed

Hall, Dawn; De Luca, Vincenzo

2007-02-01

Resveratrol is a stilbene with well-known health-promoting effects in humans that is produced constitutively or accumulates as a phytoalexin in several plant species including grape (Vitis sp.). Grape berries accumulate stilbenes in the exocarp as cis- and trans-isomers of resveratrol, together with their respective 3-O-monoglucosides. An enzyme glucosylating cis- and trans-resveratrol was purified to apparent homogeneity from Concord (Vitis labrusca) grape berries, and peptide sequencing associated it with an uncharacterized Vitis vinifera full-length clone (TC38971, TIGR database). A corresponding gene from Vitis labrusca (VLRSgt) had 98% sequence identity to clone TC38971 and 92% sequence identity to a Vitis vinifera p-hydroxybenzoic acid glucosyltransferase that produces glucose esters. The recombinant enzyme was active over a broad pH range (5.5-10), producing glucosides of stilbenes, flavonoids and coumarins at higher pH and glucose esters of several hydroxybenzoic and hydroxycinnamic acids at low pH. Vitis labrusca grape berries accumulated both stilbene glucosides and hydroxycinnamic acid glucose esters, consistent with the bi-functional role of VLRSgt in stilbene and hydroxycinnamic acid modification. While phylogenetic analysis of VLRSgt and other functionally characterized glucosyltransferases places it with other glucose ester-producing enzymes, the present results indicate broader biochemical activities for this class of enzymes.
Generic detection of poleroviruses using an RT-PCR assay targeting the RdRp coding sequence.

PubMed

Lotos, Leonidas; Efthimiou, Konstantinos; Maliogka, Varvara I; Katis, Nikolaos I

2014-03-01

In this study a two-step RT-PCR assay was developed for the generic detection of poleroviruses. The RdRp coding region was selected as the primers' target, since it differs significantly from that of other members in the family Luteoviridae and its sequence can be more informative than other regions in the viral genome. Species specific RT-PCR assays targeting the same region were also developed for the detection of the six most widespread poleroviral species (Beet mild yellowing virus, Beet western yellows virus, Cucurbit aphid-borne virus, Carrot red leaf virus, Potato leafroll virus and Turnip yellows virus) in Greece and the collection of isolates. These isolates along with other characterized ones were used for the evaluation of the generic PCR's detection range. The developed assay efficiently amplified a 593bp RdRp fragment from 46 isolates of 10 different Polerovirus species. Phylogenetic analysis using the generic PCR's amplicon sequence showed that although it cannot accurately infer evolutionary relationships within the genus it can differentiate poleroviruses at the species level. Overall, the described generic assay could be applied for the reliable detection of Polerovirus infections and, in combination with the specific PCRs, for the identification of new and uncharacterized species in the genus. Copyright © 2013 Elsevier B.V. All rights reserved.
Extensive sequence analysis of CFTR, SCNN1A, SCNN1B, SCNN1G and SERPINA1 suggests an oligogenic basis for cystic fibrosis-like phenotypes.

PubMed

Ramos, M D; Trujillano, D; Olivar, R; Sotillo, F; Ossowski, S; Manzanares, J; Costa, J; Gartner, S; Oliva, C; Quintana, E; Gonzalez, M I; Vazquez, C; Estivill, X; Casals, T

2014-07-01

The term cystic fibrosis (CF)-like disease is used to describe patients with a borderline sweat test and suggestive CF clinical features but without two CFTR (cystic fibrosis transmembrane conductance regulator) mutations. We have performed an extensive molecular analysis of four candidate genes (SCNN1A, SCNN1B, SCNN1G and SERPINA1) in a cohort of 10 uncharacterized patients with CF and CF-like disease. We have used whole-exome sequencing to characterize mutations in the CFTR gene and these four candidate genes. CFTR molecular analysis allowed a complete characterization of three of four CF patients. Candidate variants in SCNN1A, SCNN1B, SCNN1G and SERPINA1 in six patients with CF-like phenotypes were confirmed by Sanger sequencing and were further supported by in silico predictive analysis, pedigree studies, sweat test in other family members, and analysis in CF patients and healthy subjects. Our results suggest that CF-like disease probably results from complex genotypes in several genes in an oligogenic form, with rare variants interacting with environmental factors. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
The Viral Gene ORF79 Encodes a Repressor Regulating Induction of the Lytic Life Cycle in the Haloalkaliphilic Virus ϕCh1

PubMed Central

Selb, Regina; Derntl, Christian; Klein, Reinhard; Alte, Beatrix; Hofbauer, Christoph; Kaufmann, Martin; Beraha, Judith; Schöner, Léa

2017-01-01

ABSTRACT In this study, we describe the construction of the first genetically modified mutant of a halovirus infecting haloalkaliphilic Archaea. By random choice, we targeted ORF79, a currently uncharacterized viral gene of the haloalkaliphilic virus ϕCh1. We used a polyethylene glycol (PEG)-mediated transformation method to deliver a disruption cassette into a lysogenic strain of the haloalkaliphilic archaeon Natrialba magadii bearing ϕCh1 as a provirus. This approach yielded mutant virus particles carrying a disrupted version of ORF79. Disruption of ORF79 did not influence morphology of the mature virions. The mutant virus was able to infect cured strains of N. magadii, resulting in a lysogenic, ORF79-disrupted strain. Analysis of this strain carrying the mutant virus revealed a repressor function of ORF79. In the absence of gp79, onset of lysis and expression of viral proteins occurred prematurely compared to their timing in the wild-type strain. Constitutive expression of ORF79 in a cured strain of N. magadii reduced the plating efficiency of ϕCh1 by seven orders of magnitude. Overexpression of ORF79 in a lysogenic strain of N. magadii resulted in an inhibition of lysis and total absence of viral proteins as well as viral progeny. In further experiments, gp79 directly regulated the expression of the tail fiber protein ORF34 but did not influence the methyltransferase gene ORF94. Further, we describe the establishment of an inducible promoter for in vivo studies in N. magadii. IMPORTANCE Genetic analyses of haloalkaliphilic Archaea or haloviruses are only rarely reported. Therefore, only little insight into the in vivo roles of proteins and their functions has been gained so far. We used a reverse genetics approach to identify the function of a yet undescribed gene of ϕCh1. 
We provide evidence that gp79, a currently unknown protein of ϕCh1, acts as a repressor protein of the viral life cycle, affecting the transition from the lysogenic to the lytic state of the virus. Thus, repressor genes in other haloviruses could be identified by sequence homologies to gp79 in the future. Moreover, we describe the use of an inducible promoter of N. magadii. Our work provides valuable tools for the identification of other unknown viral genes by our approach as well as for functional studies of proteins by inducible expression. PMID:28202757
Analysis of sequence repeats of proteins in the PDB.

PubMed

Mary Rajathei, David; Selvaraj, Samuel

2013-12-01

Internal repeats in protein sequences play a significant role in the evolution of protein structure and function. Applications of different bioinformatics tools help in the identification and characterization of these repeats. In the present study, we analyzed sequence repeats in a non-redundant set of proteins available in the Protein Data Bank (PDB). We used RADAR for detecting internal repeats in a protein, PDBeFOLD for assessing structural similarity, PDBsum for finding functional involvement and Pfam for domain assignment of the repeats in a protein. Through the analysis of sequence repeats, we found that the identity of the sequence repeats falls in the range of 20-40% and that the superimposed structures of most of the sequence repeats maintain a similar overall fold. Analysis of sequence repeats at the functional level reveals that most of the sequence repeats are involved in the function of the protein through functionally involved residues in the repeat regions. We also found that sequence repeats in single and two domain proteins often contained conserved sequence motifs for the function of the domain. Copyright © 2013 Elsevier Ltd. All rights reserved.
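
The 20-40% identity range reported above can be made concrete with a small sketch of how percent identity between two already-aligned repeat units might be computed. This is only an illustration: the two repeat strings are hypothetical, the gap-handling convention (ignore any column containing a gap) is an assumption, and it is not the scoring that RADAR or the authors used.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Both inputs must already be aligned (equal length, '-' marks a gap).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences carry a residue.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Hypothetical repeat units, invented purely for illustration.
repeat_1 = "MKTLLVAG-SGIGRA"
repeat_2 = "MRSLAVTGASGLGKE"

identity = percent_identity(repeat_1, repeat_2)
print(f"identity: {identity:.1f}%")
```

Different tools normalise identity differently (by alignment length, by the shorter sequence, with or without gap columns), so values from this sketch are not directly comparable with published figures.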
Plasmodium Helical Interspersed Subtelomeric (PHIST) Proteins, at the Center of Host Cell Remodeling

PubMed Central

Warncke, Jan D.; Vakonakis, Ioannis

2016-01-01

SUMMARY During the asexual cycle, Plasmodium falciparum extensively remodels the human erythrocyte to make it a suitable host cell. A large number of exported proteins facilitate this remodeling process, which causes erythrocytes to become more rigid, cytoadherent, and permeable for nutrients and metabolic products. Among the exported proteins, a family of 89 proteins, called the Plasmodium helical interspersed subtelomeric (PHIST) protein family, has been identified. While also found in other Plasmodium species, the PHIST family is greatly expanded in P. falciparum. Although a decade has passed since their first description, to date, most PHIST proteins remain uncharacterized and are of unknown function and localization within the host cell, and there are few data on their interactions with other host or parasite proteins. However, over the past few years, PHIST proteins have been mentioned in the literature at an increasing rate owing to their presence at various localizations within the infected erythrocyte. Expression of PHIST proteins has been implicated in molecular and cellular processes such as the surface display of PfEMP1, gametocytogenesis, changes in cell rigidity, and also cerebral and pregnancy-associated malaria. Thus, we conclude that PHIST proteins are central to host cell remodeling, but despite their obvious importance in pathology, PHIST proteins seem to be understudied. Here we review current knowledge, shed light on the definition of PHIST proteins, and discuss these proteins with respect to their localization and probable function. We take into consideration interaction studies, microarray analyses, or data from blood samples from naturally infected patients to combine all available information on this protein family. PMID:27582258
New insights into the biogenesis of nuclear RNA polymerases?

PubMed

Cloutier, Philippe; Coulombe, Benoit

2010-04-01

More than 30 years of research on nuclear RNA polymerases (RNAP I, II, and III) has uncovered numerous factors that regulate the activity of these enzymes during the transcription reaction. However, very little is known about the machinery that regulates the fate of RNAPs before or after transcription. In particular, the mechanisms of biogenesis of the 3 nuclear RNAPs, which comprise both common and specific subunits, remain mostly uncharacterized and the proteins involved are yet to be discovered. Using protein affinity purification coupled to mass spectrometry (AP-MS), we recently unraveled a high-density interaction network formed by nuclear RNAP subunits from the soluble fraction of human cell extracts. Validation of the dataset using a machine learning approach trained to minimize the rate of false positives and false negatives yielded a high-confidence dataset and uncovered novel interactors that regulate the RNAP II transcription machinery, including a set of proteins we named the RNAP II-associated proteins (RPAPs). One of the RPAPs, RPAP3, is part of an 11-subunit complex we termed the RPAP3/R2TP/prefoldin-like complex. Here, we review the literature on the subunits of this complex, which points to a role in nuclear RNAP biogenesis.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Phuthong, Witchukorn; Huang, Zubin; Wittkopp, Tyler M.

To investigate the dynamics of photosynthetic pigment-protein complexes in vascular plants at high resolution in an aqueous environment, membrane-protruding oxygen-evolving complexes (OECs) associated with photosystem II (PSII) on spinach (Spinacia oleracea) grana membranes were examined using contact mode atomic force microscopy. This study represents, to our knowledge, the first use of atomic force microscopy to distinguish the putative large extrinsic loop of Photosystem II CP47 reaction center protein (CP47) from the putative oxygen-evolving enhancer proteins 1, 2, and 3 (PsbO, PsbP, and PsbQ) and large extrinsic loop of Photosystem II CP43 reaction center protein (CP43) in the PSII-OEC extrinsic domains of grana membranes under conditions resulting in the disordered arrangement of PSII-OEC particles. Moreover, we observed uncharacterized membrane particles that, based on their physical characteristics and electrophoretic analysis of the polypeptides associated with the grana samples, are hypothesized to be a domain of photosystem I that protrudes from the stromal face of single thylakoid bilayers. Furthermore, our results are interpreted in the context of the results of others that were obtained using cryo-electron microscopy (and single particle analysis), negative staining and freeze-fracture electron microscopy, as well as previous atomic force microscopy studies.
An adenosine triphosphate-independent proteasome activator contributes to the virulence of Mycobacterium tuberculosis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jastrab, Jordan B.; Wang, Tong; Murphy, J. Patrick

Mycobacterium tuberculosis encodes a proteasome that is highly similar to eukaryotic proteasomes and is required to cause lethal infections in animals. The only pathway known to target proteins for proteasomal degradation in bacteria is pupylation, which is functionally analogous to eukaryotic ubiquitylation. However, evidence suggests that the M. tuberculosis proteasome contributes to pupylation-independent pathways as well. To identify new proteasome cofactors that might contribute to such pathways, we isolated proteins that bound to proteasomes overproduced in M. tuberculosis and found a previously uncharacterized protein, Rv3780, which formed rings and capped M. tuberculosis proteasome core particles. Rv3780 enhanced peptide and protein degradation by proteasomes in an adenosine triphosphate (ATP)-independent manner. We identified putative Rv3780-dependent proteasome substrates and found that Rv3780 promoted robust degradation of the heat shock protein repressor, HspR. Importantly, an M. tuberculosis Rv3780 mutant had a general growth defect, was sensitive to heat stress, and was attenuated for growth in mice. Collectively, these data demonstrate that ATP-independent proteasome activators are not confined to eukaryotes and can contribute to the virulence of one of the world’s most devastating pathogens.
Red fluorescent protein responsible for pigmentation in trematode-infected Porites compressa tissues.

PubMed

Palmer, Caroline V; Roth, Melissa S; Gates, Ruth D

2009-02-01

Reports of coral disease have increased dramatically over the last decade; however, the biological mechanisms that corals utilize to limit infection and resist disease remain poorly understood. Compromised coral tissues often display non-normal pigmentation that potentially represents an inflammation-like response, although these pigments remain uncharacterized. Using spectral emission analysis and cryo-histological and electrophoretic techniques, we investigated the pink pigmentation associated with trematodiasis, infection with Podocotyloides stenometre larval trematode, in Porites compressa. Spectral emission analysis reveals that macroscopic areas of pink pigmentation fluoresce under blue light excitation (450 nm) and produce a broad emission peak at 590 nm (+/-6) with a 60-nm full width at half maximum. Electrophoretic protein separation of pigmented tissue extract confirms the red fluorescence to be a protein rather than a low-molecular-weight compound. Histological sections demonstrate green fluorescence in healthy coral tissue and red fluorescence in the trematodiasis-compromised tissue. The red fluorescent protein (FP) is limited to the epidermis, is not associated with cells or granules, and appears unstructured. These data collectively suggest that the red FP is produced and localized in tissue infected by larval trematodes and plays a role in the immune response in corals.
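
The emission figures quoted above (a peak at 590 nm with a 60-nm full width at half maximum) can be turned into a rough spectral model. The sketch below assumes a Gaussian band shape, which the abstract does not state; the function names are illustrative only, and a real emission band may well be asymmetric.

```python
import math


def gaussian_sigma_from_fwhm(fwhm_nm: float) -> float:
    """Convert a full width at half maximum into a Gaussian standard deviation."""
    return fwhm_nm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def gaussian_emission(wavelength_nm: float, peak_nm: float = 590.0, fwhm_nm: float = 60.0) -> float:
    """Relative emission intensity (1.0 at the peak) under the assumed Gaussian shape."""
    sigma = gaussian_sigma_from_fwhm(fwhm_nm)
    return math.exp(-0.5 * ((wavelength_nm - peak_nm) / sigma) ** 2)


sigma_nm = gaussian_sigma_from_fwhm(60.0)
# By construction, intensity falls to half its maximum at peak +/- FWHM/2.
half_max_low = gaussian_emission(560.0)
half_max_high = gaussian_emission(620.0)
print(f"sigma = {sigma_nm:.2f} nm; I(560) = {half_max_low:.2f}; I(620) = {half_max_high:.2f}")
```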

RNA-binding activity of TRIM25 is mediated by its PRY/SPRY domain and is required for ubiquitination.

PubMed

Choudhury, Nila Roy; Heikel, Gregory; Trubitsyna, Maryia; Kubik, Peter; Nowak, Jakub Stanislaw; Webb, Shaun; Granneman, Sander; Spanos, Christos; Rappsilber, Juri; Castello, Alfredo; Michlewski, Gracjan

2017-11-08

TRIM25 is a novel RNA-binding protein and a member of the Tripartite Motif (TRIM) family of E3 ubiquitin ligases, which plays a pivotal role in the innate immune response. However, there is scarce knowledge about its RNA-related roles in cell biology. Furthermore, its RNA-binding domain has not been characterized. Here, we reveal that the RNA-binding activity of TRIM25 is mediated by its PRY/SPRY domain, which we postulate to be a novel RNA-binding domain. Using CLIP-seq and SILAC-based co-immunoprecipitation assays, we uncover TRIM25's endogenous RNA targets and protein binding partners. We demonstrate that TRIM25 controls the levels of Zinc Finger Antiviral Protein (ZAP). Finally, we show that the RNA-binding activity of TRIM25 is important for its ubiquitin ligase activity towards itself (autoubiquitination) and its physiologically relevant target ZAP. Our results suggest that many other proteins with the PRY/SPRY domain could have yet uncharacterized RNA-binding potential. Together, our data reveal new insights into the molecular roles and characteristics of RNA-binding E3 ubiquitin ligases and demonstrate that RNA could be an essential factor in their enzymatic activity.
New insights into the biogenesis of nuclear RNA polymerases?

PubMed Central

Cloutier, Philippe; Coulombe, Benoit

2015-01-01

More than 30 years of research on nuclear RNA polymerases (RNAP I, II, and III) has uncovered numerous factors that regulate the activity of these enzymes during the transcription reaction. However, very little is known about the machinery that regulates the fate of RNAPs before or after transcription. In particular, the mechanisms of biogenesis of the 3 nuclear RNAPs, which comprise both common and specific subunits, remain mostly uncharacterized and the proteins involved are yet to be discovered. Using protein affinity purification coupled to mass spectrometry (AP–MS), we recently unraveled a high-density interaction network formed by nuclear RNAP subunits from the soluble fraction of human cell extracts. Validation of the dataset using a machine learning approach trained to minimize the rate of false positives and false negatives yielded a high-confidence dataset and uncovered novel interactors that regulate the RNAP II transcription machinery, including a set of proteins we named the RNAP II-associated proteins (RPAPs). One of the RPAPs, RPAP3, is part of an 11-subunit complex we termed the RPAP3/R2TP/prefoldin-like complex. Here, we review the literature on the subunits of this complex, which points to a role in nuclear RNAP biogenesis. PMID:20453924
Expression, purification, crystallization and preliminary X-ray crystallographic analysis of a novel plant-type ferredoxin/thioredoxin reductase-like protein from Methanosarcina acetivorans

PubMed Central

Kumar, Adepu K.; Yennawar, Neela H.; Yennawar, Hemant P.; Ferry, James G.

2011-01-01

The genome of Methanosarcina acetivorans contains a gene (ma1659) that is predicted to encode an uncharacterized chimeric protein containing a plant-type ferredoxin/thioredoxin reductase-like catalytic domain in the N-terminal region and a bacterial-like rubredoxin domain in the C-terminal region. To understand the structural and functional properties of the protein, the ma1659 gene was cloned and overexpressed in Escherichia coli. Crystals of the MA1659 protein were grown by the sitting-drop method using 2 M ammonium sulfate, 0.1 M HEPES buffer pH 7.5 and 0.1 M urea. Diffraction data were collected to 2.8 Å resolution using the remote data-collection feature of the Advanced Light Source, Lawrence Berkeley National Laboratory. The crystal belonged to the primitive cubic space group P23 or P2₁3, with unit-cell parameters a = b = c = 92.72 Å. Assuming the presence of one molecule in the asymmetric unit gave a Matthews coefficient (V_M) of 3.55 Å³ Da⁻¹, corresponding to a solvent content of 65%. PMID:21795791
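
The Matthews coefficient and solvent content quoted above are linked by a standard relation, which the following sketch reproduces. It assumes the conventional factor of 1.23 Å³/Da for the protein volume and the 12 asymmetric units per cell shared by P23 and P2₁3. The molecular weight is back-calculated from the reported values rather than taken from the abstract, so it should be read as an implied figure, not a reported one.

```python
# Conventional constant relating protein mass to occupied volume (Å^3 per Da);
# it corresponds to a typical protein partial specific volume of ~0.74 cm^3/g.
PROTEIN_VOLUME_FACTOR = 1.23

# Both candidate space groups, P23 and P2(1)3, have 12 general positions per cell.
ASYMMETRIC_UNITS_PER_CELL = 12


def cubic_cell_volume(a_angstrom: float) -> float:
    """Unit-cell volume in Å^3 for a cubic cell with edge a."""
    return a_angstrom ** 3


def matthews_coefficient(cell_volume: float, mol_weight_da: float,
                         molecules_per_asu: int = 1) -> float:
    """V_M in Å^3/Da: cell volume divided by total protein mass in the cell."""
    return cell_volume / (ASYMMETRIC_UNITS_PER_CELL * molecules_per_asu * mol_weight_da)


def solvent_fraction(v_m: float) -> float:
    """Estimated solvent fraction of the crystal from the Matthews coefficient."""
    return 1.0 - PROTEIN_VOLUME_FACTOR / v_m


volume = cubic_cell_volume(92.72)            # reported cell edge
solvent = solvent_fraction(3.55)             # reported V_M
# Back-calculated (not reported): the molecular weight implied by V_M = 3.55.
implied_mw = volume / (ASYMMETRIC_UNITS_PER_CELL * 3.55)

print(f"cell volume: {volume:.0f} Å^3")
print(f"solvent content: {solvent * 100:.0f}%")
print(f"implied molecular weight: {implied_mw / 1000:.1f} kDa")
```

The same functions can be rerun with `molecules_per_asu=2` to see how quickly the solvent estimate changes when a different asymmetric-unit content is assumed.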
Regulator of G protein signaling 5 (RGS5) inhibits sonic hedgehog function in mouse cortical neurons.

PubMed

Liu, Chuanliang; Hu, Qiongqiong; Jing, Jia; Zhang, Yun; Jin, Jing; Zhang, Liulei; Mu, Lili; Liu, Yumei; Sun, Bo; Zhang, Tongshuai; Kong, Qingfei; Wang, Guangyou; Wang, Dandan; Zhang, Yao; Liu, Xijun; Zhao, Wei; Wang, Jinghua; Feng, Tao; Li, Hulun

2017-09-01

Regulator of G protein signaling 5 (RGS5) acts as a GTPase-activating protein (GAP) for the Gαi subunit and negatively regulates G protein-coupled receptor signaling. However, its presence and function in postmitotic differentiated primary neurons remains largely uncharacterized. During neural development, sonic hedgehog (Shh) signaling is involved in cell signaling pathways via Gαi activity. In particular, Shh signaling is essential for embryonic neural tube patterning, which has been implicated in neuronal polarization involving neurite outgrowth. Here, we examined whether RGS5 regulates Shh signaling in neurons. RGS5 transcripts were found to be expressed in cortical neurons and their expression gradually declined in a time-dependent manner in culture system. When an adenovirus expressing RGS5 was introduced into an in vitro cell culture model of cortical neurons, RGS5 overexpression significantly reduced neurite outgrowth and FM4-64 uptake, while cAMP-PKA signaling was also affected. These findings suggest that RGS5 inhibits Shh function during neurite outgrowth and the presynaptic terminals of primary cortical neurons mature via modulation of cAMP. Copyright © 2017 Elsevier Inc. All rights reserved.
An adenosine triphosphate-independent proteasome activator contributes to the virulence of Mycobacterium tuberculosis

DOE PAGES

Jastrab, Jordan B.; Wang, Tong; Murphy, J. Patrick; ...

2015-03-23

Mycobacterium tuberculosis encodes a proteasome that is highly similar to eukaryotic proteasomes and is required to cause lethal infections in animals. The only pathway known to target proteins for proteasomal degradation in bacteria is pupylation, which is functionally analogous to eukaryotic ubiquitylation. However, evidence suggests that the M. tuberculosis proteasome contributes to pupylation-independent pathways as well. To identify new proteasome cofactors that might contribute to such pathways, we isolated proteins that bound to proteasomes overproduced in M. tuberculosis and found a previously uncharacterized protein, Rv3780, which formed rings and capped M. tuberculosis proteasome core particles. Rv3780 enhanced peptide and protein degradation by proteasomes in an adenosine triphosphate (ATP)-independent manner. We identified putative Rv3780-dependent proteasome substrates and found that Rv3780 promoted robust degradation of the heat shock protein repressor, HspR. Importantly, an M. tuberculosis Rv3780 mutant had a general growth defect, was sensitive to heat stress, and was attenuated for growth in mice. Collectively, these data demonstrate that ATP-independent proteasome activators are not confined to eukaryotes and can contribute to the virulence of one of the world’s most devastating pathogens.
Modular protein domains: an engineering approach toward functional biomaterials.

PubMed

Lin, Charng-Yu; Liu, Julie C

2016-08-01

Protein domains and peptide sequences are a powerful tool for conferring specific functions to engineered biomaterials. Protein sequences with a wide variety of functionalities, including structure, bioactivity, protein-protein interactions, and stimuli responsiveness, have been identified, and advances in molecular biology continue to pinpoint new sequences. Protein domains can be combined to make recombinant proteins with multiple functionalities. The high fidelity of the protein translation machinery results in exquisite control over the sequence of recombinant proteins and the resulting properties of protein-based materials. In this review, we discuss protein domains and peptide sequences in the context of functional protein-based materials, composite materials, and their biological applications. Copyright © 2016 Elsevier Ltd. All rights reserved.
Analysis of the genomic sequences and metabolites of Serratia surfactantfaciens sp. nov. YD25T that simultaneously produces prodigiosin and serrawettin W2.

PubMed

Su, Chun; Xiang, Zhaoju; Liu, Yibo; Zhao, Xinqing; Sun, Yan; Li, Zhi; Li, Lijun; Chang, Fan; Chen, Tianjun; Wen, Xinrong; Zhou, Yidan; Zhao, Furong

2016-11-03

Gram-negative bacteria of the genus Serratia are potential producers of many useful secondary metabolites, such as prodigiosin and serrawettins, which have potential applications in environmental bioremediation or in the pharmaceutical industry. Several Serratia strains produce prodigiosin and serrawettin W1 as the main bioactive compounds, and the biosynthetic pathways are co-regulated by quorum sensing (QS). In contrast, a Serratia strain that can simultaneously produce prodigiosin and serrawettin W2 has not been reported. This study focused on analyzing the genomic sequence of Serratia sp. strain YD25T isolated from rhizosphere soil under continuously planted burley tobacco collected from Yongding, Fujian province, China, which is unique in producing both prodigiosin and serrawettin W2. A hybrid polyketide synthases (PKS)-non-ribosomal peptide synthetases (NRPS) gene cluster putatively involved in biosynthesis of antimicrobial serrawettin W2 was identified in the genome of YD25T, and its biosynthesis pathway was proposed. We found potent antimicrobial activity of serrawettin W2 purified from YD25T against various pathogenic bacteria and fungi as well as antitumor activity against HeLa cells. Subsequently, comparative genomic analyses were performed among a total of 133 Serratia species. The prodigiosin biosynthesis gene cluster in YD25T belongs to the type I pig cluster, which is the main form of pig-encoding genes existing in most of the pigmented Serratia species. In addition, a complete autoinducer-2 (AI-2) system (including luxS, lsrBACDEF, lsrGK, and lsrR) as a conserved bacterial operator is found in the genome of Serratia sp. strain YD25T. Phylogenetic analysis based on concatenated Lsr and LuxS proteins revealed that YD25T formed an independent branch and was clearly distant from the strains that solely produce either prodigiosin or serrawettin W2. 
The Fe (III) ion reduction assay confirmed that strain YD25T could produce an AI-2 signal molecule. Phylogenetic analysis using the genomic sequence of YD25T combined with phylogenetic and phenotypic analyses support this strain as a member of a novel and previously uncharacterized Serratia species. Genomic sequence and metabolite analysis of Serratia surfactantfaciens YD25T indicate that this strain can be further explored for the production of useful metabolites. Unveiling the genomic sequence of S. surfactantfaciens YD25T benefits the usage of this unique strain as a model system for studying the biosynthesis regulation of both prodigiosin and serrawettin W2 by the QS system.
Apocrine Secretion in Drosophila Salivary Glands: Subcellular Origin, Dynamics, and Identification of Secretory Proteins

PubMed Central

Farkaš, Robert; Ďatková, Zuzana; Mentelová, Lucia; Löw, Péter; Beňová-Liszeková, Denisa; Beňo, Milan; Sass, Miklós; Řehulka, Pavel; Řehulková, Helena; Raška, Otakar; Kováčik, Lubomír; Šmigová, Jana; Raška, Ivan; Mechler, Bernard M.

2014-01-01

In contrast to the well defined mechanism of merocrine exocytosis, the mechanism of apocrine secretion, which was first described over 180 years ago, remains relatively uncharacterized. We identified apocrine secretory activity in the late prepupal salivary glands of Drosophila melanogaster just prior to the execution of programmed cell death (PCD). The excellent genetic tools available in Drosophila provide an opportunity to dissect for the first time the molecular and mechanistic aspects of this process. A prerequisite for such an analysis is to have pivotal immunohistochemical, ultrastructural, biochemical and proteomic data that fully characterize the process. Here we present data showing that the Drosophila salivary glands release all kinds of cellular proteins by an apocrine mechanism including cytoskeletal, cytosolic, mitochondrial, nuclear and nucleolar components. Surprisingly, the apocrine release of these proteins displays a temporal pattern with the sequential release of some proteins (e.g. transcription factor BR-C, tumor suppressor p127, cytoskeletal β-tubulin, non-muscle myosin) earlier than others (e.g. filamentous actin, nuclear lamin, mitochondrial pyruvate dehydrogenase). Although the apocrine release of proteins takes place just prior to the execution of an apoptotic program, the nuclear DNA is never released. Western blotting indicates that the secreted proteins remain undegraded in the lumen. Following apocrine secretion, the salivary gland cells remain quite vital, as they retain highly active transcriptional and protein synthetic activity. PMID:24732043
Prediction of vaccine candidates against Pseudomonas aeruginosa: An integrated genomics and proteomics approach.

PubMed

Rashid, Muhammad Ibrahim; Naz, Anam; Ali, Amjad; Andleeb, Saadia

2017-07-01

Pseudomonas aeruginosa is among the top critical nosocomial infectious agents due to its persistent infections and tendency for acquiring drug resistance mechanisms. To date, there is no vaccine available for this pathogen. We attempted to exploit the genomic and proteomic information of P. aeruginosa through reverse-vaccinology approaches to unveil the prospective vaccine candidates. The P. aeruginosa strain PAO1 genome was subjected to a sequential prioritization approach following genomic, proteomic and structural analyses. Among the predicted vaccine candidates, surface components of antibiotic efflux pumps (Q9HY88, PA2837), chaperone-usher pathway components (CupC2, CupB3), penicillin binding protein of bacterial cell wall (PBP1a/mrcA), extracellular component of Type 3 secretory system (PscC) and three uncharacterized secretory proteins (PA0629, PA2822, PA0978) were identified as potential candidates qualifying all the set criteria. These proteins were then analyzed for potential immunogenic surface exposed epitopes. These predicted epitopes may provide a basis for development of a reliable subunit vaccine against P. aeruginosa. Copyright © 2017 Elsevier Inc. All rights reserved.
An early cytoplasmic step of peptidoglycan synthesis is associated to MreB in Bacillus subtilis.

PubMed

Rueff, Anne-Stéphanie; Chastanet, Arnaud; Domínguez-Escobar, Julia; Yao, Zhizhong; Yates, James; Prejean, Maria-Victoria; Delumeau, Olivier; Noirot, Philippe; Wedlich-Söldner, Roland; Filipe, Sergio R; Carballido-López, Rut

2014-01-01

MreB proteins play a major role during morphogenesis of rod-shaped bacteria by organizing biosynthesis of the peptidoglycan cell wall. However, the mechanisms underlying this process are not well understood. In Bacillus subtilis, membrane-associated MreB polymers have been shown to be associated to elongation-specific complexes containing transmembrane morphogenetic factors and extracellular cell wall assembly proteins. We have now found that an early intracellular step of cell wall synthesis is also associated to MreB. We show that the previously uncharacterized protein YkuR (renamed DapI) is required for synthesis of meso-diaminopimelate (m-DAP), an essential constituent of the peptidoglycan precursor, and that it physically interacts with MreB. Highly inclined laminated optical sheet microscopy revealed that YkuR forms uniformly distributed foci that exhibit fast motion in the cytoplasm, and are not detected in cells lacking MreB. We propose a model in which soluble MreB organizes intracellular steps of peptidoglycan synthesis in the cytoplasm to feed the membrane-associated cell wall synthesizing machineries. © 2013 John Wiley & Sons Ltd.
Genome-wide characterization of monomeric transcriptional regulators in Mycobacterium tuberculosis.

PubMed

Feng, Lipeng; Chen, Zhenkang; Wang, Zhongwei; Hu, Yangbo; Chen, Shiyun

2016-05-01

Gene transcription catalysed by RNA polymerase is regulated by transcriptional regulators, which play central roles in the control of gene transcription in both eukaryotes and prokaryotes. In regulating gene transcription, many regulators form dimers that bind to DNA with repeated motifs. However, some regulators function as monomers, but their mechanisms of gene expression control are largely uncharacterized. Here we systematically characterized monomeric versus dimeric regulators in the tuberculosis causative agent Mycobacterium tuberculosis. Of the >160 transcriptional regulators annotated in M. tuberculosis, 154 were tested; 22% of them probably act as monomers, and most are annotated as hypothetical regulators. Notably, all members of the WhiB-like protein family are classified as monomers. To further investigate mechanisms of monomeric regulators, we analysed the actions of these WhiB proteins and found that the majority interact with the principal sigma factor σA, which is also a monomeric protein within the RNA polymerase holoenzyme. Taken together, our study for the first time globally classified monomeric regulators in M. tuberculosis and suggested a mechanism for monomeric regulators in controlling gene transcription through interacting with monomeric sigma factors.
A selfish DNA element engages a meiosis-specific motor and telomeres for germ-line propagation.

PubMed

Sau, Soumitra; Conrad, Michael N; Lee, Chih-Ying; Kaback, David B; Dresser, Michael E; Jayaram, Makkuni

2014-06-09

The chromosome-like mitotic stability of the yeast 2 micron plasmid is conferred by the plasmid proteins Rep1-Rep2 and the cis-acting locus STB, likely by promoting plasmid-chromosome association and segregation by hitchhiking. Our analysis reveals that stable plasmid segregation during meiosis requires the bouquet proteins Ndj1 and Csm4. Plasmid relocalization from the nuclear interior in mitotic cells to the periphery at or proximal to telomeres rises from early meiosis to pachytene. Analogous to chromosomes, the plasmid undergoes Csm4- and Ndj1-dependent rapid prophase movements with speeds comparable to those of telomeres. Lack of Ndj1 partially disrupts plasmid-telomere association without affecting plasmid colocalization with the telomere-binding protein Rap1. The plasmid appears to engage a meiosis-specific motor that orchestrates telomere-led chromosome movements for its telomere-associated segregation during meiosis I. This hitherto uncharacterized mode of germ-line transmission by a selfish genetic element signifies a mechanistic variation within the shared theme of chromosome-coupled plasmid segregation during mitosis and meiosis. © 2014 Sau et al.
Rapid and Scalable Characterization of CRISPR Technologies Using an E. coli Cell-Free Transcription-Translation System.

PubMed

Marshall, Ryan; Maxwell, Colin S; Collins, Scott P; Jacobsen, Thomas; Luo, Michelle L; Begemann, Matthew B; Gray, Benjamin N; January, Emma; Singer, Anna; He, Yonghua; Beisel, Chase L; Noireaux, Vincent

2018-01-04

CRISPR-Cas systems offer versatile technologies for genome engineering, yet their implementation has been outpaced by ongoing discoveries of new Cas nucleases and anti-CRISPR proteins. Here, we present the use of E. coli cell-free transcription-translation (TXTL) systems to vastly improve the speed and scalability of CRISPR characterization and validation. TXTL can express active CRISPR machinery from added plasmids and linear DNA, and TXTL can output quantitative dynamics of DNA cleavage and gene repression, all without protein purification or live cells. We used TXTL to measure the dynamics of DNA cleavage and gene repression for single- and multi-effector CRISPR nucleases, predict gene repression strength in E. coli, determine the specificities of 24 diverse anti-CRISPR proteins, and develop a fast and scalable screen for protospacer-adjacent motifs that was successfully applied to five uncharacterized Cpf1 nucleases. These examples underscore how TXTL can facilitate the characterization and application of CRISPR technologies across their many uses.
New Aminoacyl-tRNA Synthetase-like Protein in Insecta with an Essential Mitochondrial Function

PubMed Central

Guitart, Tanit; Leon Bernardo, Teresa; Sagalés, Jessica; Stratmann, Thomas; Bernués, Jordi; Ribas de Pouplana, Lluís

2010-01-01

Aminoacyl-tRNA synthetases (ARS) are modular enzymes that aminoacylate transfer RNAs (tRNA) for their use by the ribosome during protein synthesis. ARS are essential and universal components of the genetic code that were almost completely established before the appearance of the last common ancestor of all living species. This long evolutionary history explains the growing number of functions being discovered for ARS, and for ARS homologues, beyond their canonical role in gene translation. Here we present a previously uncharacterized paralogue of seryl-tRNA synthetase named SLIMP (seryl-tRNA synthetase-like insect mitochondrial protein). SLIMP is the result of a duplication of a mitochondrial seryl-tRNA synthetase (SRS) gene that took place in early metazoans and was fixed in Insecta. Here we show that SLIMP is localized in the mitochondria, where it carries out an essential function that is unrelated to the aminoacylation of tRNA. The knockdown of SLIMP by RNA interference (RNAi) causes a decrease in respiration capacity and an increase in mitochondrial mass in the form of aberrant mitochondria. PMID:20870726
ISG15 counteracts Listeria monocytogenes infection

PubMed Central

Radoshevich, Lilliana; Impens, Francis; Ribet, David; Quereda, Juan J; Nam Tham, To; Nahori, Marie-Anne; Bierne, Hélène; Dussurget, Olivier; Pizarro-Cerdá, Javier; Knobeloch, Klaus-Peter; Cossart, Pascale

2015-01-01

ISG15 is an interferon-stimulated, linear di-ubiquitin-like protein, with anti-viral activity. The role of ISG15 during bacterial infection remains elusive. We show that ISG15 expression in nonphagocytic cells is dramatically induced upon Listeria infection. Surprisingly this induction can be type I interferon independent and depends on the cytosolic surveillance pathway, which senses bacterial DNA and signals through STING, TBK1, IRF3 and IRF7. Most importantly, we observed that ISG15 expression restricts Listeria infection in vitro and in vivo. We made use of stable isotope labeling in tissue culture (SILAC) to identify ISGylated proteins that could be responsible for the protective effect. Strikingly, infection or overexpression of ISG15 leads to ISGylation of ER and Golgi proteins, which correlates with increased secretion of cytokines known to counteract infection. Together, our data reveal a previously uncharacterized ISG15-dependent restriction of Listeria infection, reinforcing the view that ISG15 is a key component of the innate immune response. DOI: http://dx.doi.org/10.7554/eLife.06848.001 PMID:26259872
A selfish DNA element engages a meiosis-specific motor and telomeres for germ-line propagation

PubMed Central

Sau, Soumitra; Conrad, Michael N.; Lee, Chih-Ying; Kaback, David B.; Dresser, Michael E.

2014-01-01

The chromosome-like mitotic stability of the yeast 2 micron plasmid is conferred by the plasmid proteins Rep1-Rep2 and the cis-acting locus STB, likely by promoting plasmid-chromosome association and segregation by hitchhiking. Our analysis reveals that stable plasmid segregation during meiosis requires the bouquet proteins Ndj1 and Csm4. Plasmid relocalization from the nuclear interior in mitotic cells to the periphery at or proximal to telomeres rises from early meiosis to pachytene. Analogous to chromosomes, the plasmid undergoes Csm4- and Ndj1-dependent rapid prophase movements with speeds comparable to those of telomeres. Lack of Ndj1 partially disrupts plasmid–telomere association without affecting plasmid colocalization with the telomere-binding protein Rap1. The plasmid appears to engage a meiosis-specific motor that orchestrates telomere-led chromosome movements for its telomere-associated segregation during meiosis I. This hitherto uncharacterized mode of germ-line transmission by a selfish genetic element signifies a mechanistic variation within the shared theme of chromosome-coupled plasmid segregation during mitosis and meiosis. PMID:24914236
Screen for mitochondrial DNA copy number maintenance genes reveals essential role for ATP synthase

PubMed Central

Fukuoh, Atsushi; Cannino, Giuseppe; Gerards, Mike; Buckley, Suzanne; Kazancioglu, Selena; Scialo, Filippo; Lihavainen, Eero; Ribeiro, Andre; Dufour, Eric; Jacobs, Howard T

2014-01-01

The machinery of mitochondrial DNA (mtDNA) maintenance is only partially characterized and is of wide interest due to its involvement in disease. To identify novel components of this machinery, plus other cellular pathways required for mtDNA viability, we implemented a genome-wide RNAi screen in Drosophila S2 cells, assaying for loss of fluorescence of mtDNA nucleoids stained with the DNA-intercalating agent PicoGreen. In addition to previously characterized components of the mtDNA replication and transcription machineries, positives included many proteins of the cytosolic proteasome and ribosome (but not the mitoribosome), three proteins involved in vesicle transport, some other factors involved in mitochondrial biogenesis or nuclear gene expression, > 30 mainly uncharacterized proteins and most subunits of ATP synthase (but no other OXPHOS complex). ATP synthase knockdown precipitated a burst of mitochondrial ROS production, followed by copy number depletion involving increased mitochondrial turnover, not dependent on the canonical autophagy machinery. Our findings will inform future studies of the apparatus and regulation of mtDNA maintenance, and the role of mitochondrial bioenergetics and signaling in modulating mtDNA copy number. PMID:24952591
Shotgun protein sequencing: assembly of peptide tandem mass spectra from mixtures of modified proteins.

PubMed

Bandeira, Nuno; Clauser, Karl R; Pevzner, Pavel A

2007-07-01

Despite significant advances in the identification of known proteins, the analysis of unknown proteins by MS/MS still remains a challenging open problem. Although Klaus Biemann recognized the potential of MS/MS for sequencing of unknown proteins in the 1980s, low throughput Edman degradation followed by cloning still remains the main method to sequence unknown proteins. The automated interpretation of MS/MS spectra has been limited by a focus on individual spectra and has not capitalized on the information contained in spectra of overlapping peptides. Indeed the powerful shotgun DNA sequencing strategies have not been extended to automated protein sequencing. We demonstrate, for the first time, the feasibility of automated shotgun protein sequencing of protein mixtures by utilizing MS/MS spectra of overlapping and possibly modified peptides generated via multiple proteases of different specificities. We validate this approach by generating highly accurate de novo reconstructions of multiple regions of various proteins in western diamondback rattlesnake venom. We further argue that shotgun protein sequencing has the potential to overcome the limitations of current protein sequencing approaches and thus catalyze the otherwise impractical applications of proteomics methodologies in studies of unknown proteins.
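The assembly idea in the abstract above, merging overlapping peptides produced by proteases of different specificities, can be illustrated with a toy sketch. This is not the authors' method: their approach works on MS/MS spectra and tolerates modifications, whereas this sketch assumes the peptides are already known as exact strings. The function names, the example peptides and the `min_len` threshold are hypothetical.

```python
from typing import List


def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is a prefix of `b` (0 if below min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(peptides: List[str], min_len: int = 3) -> List[str]:
    """Greedily merge the pair of fragments with the largest overlap until none remains."""
    frags = list(dict.fromkeys(peptides))  # drop exact duplicates, keep order
    # Drop peptides fully contained in another peptide.
    frags = [p for p in frags if not any(p != q and p in q for q in frags)]
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return frags  # no overlaps left: these are the assembled contigs
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)


# Hypothetical peptides from two different digests of one made-up sequence.
peptides = ["MKTAYIAK", "TAYIAKQRQ", "AKQRQISF", "QRQISFVK"]
contigs = greedy_assemble(peptides)
print(contigs)
```

Greedy merging is the simplest possible choice and can misassemble repeats; it is used here only to make the overlap principle concrete.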
Looking the cow in the eye: deletion in the NID1 gene is associated with recessive inherited cataract in Romagnola cattle.

PubMed

Murgiano, Leonardo; Jagannathan, Vidhya; Calderoni, Valerio; Joechler, Monika; Gentile, Arcangelo; Drögemüller, Cord

2014-01-01

Cataract is a known condition leading to opacification of the eye lens causing partial or total blindness. Mutations are known to cause autosomal dominant or recessive inherited forms of cataracts in humans, mice, rats, guinea pigs and dogs. The use of large-sized animal models instead of those using mice for the study of this condition has been discussed due to the small size of rodent lenses. Four juvenile-onset cases of bilateral incomplete immature nuclear cataract were recently observed in Romagnola cattle. Pedigree analysis suggested a monogenic autosomal recessive inheritance. In addition to the cataract, one of the cases displayed abnormal head movements. Genome-wide association and homozygosity mapping and subsequent whole genome sequencing of a single case identified two perfectly associated sequence variants in a critical interval of 7.2 Mb on cattle chromosome 28: a missense point mutation located in an uncharacterized locus and an 855 bp deletion across the exon 19/intron 19 border of the bovine nidogen 1 (NID1) gene (c.3579_3604+829del). RT-PCR showed that NID1 is expressed in bovine lenses while the transcript of the second locus was absent. The NID1 deletion leads to the skipping of exon 19 during transcription and is therefore predicted to cause a frameshift and premature stop codon (p.1164fs27X). The truncated protein lacks a C-terminal domain essential for binding with matrix assembly complexes. Nidogen 1 deficient mice show neurological abnormalities and highly irregular crystal lens alterations. This study adds NID1 to the list of candidate genes for inherited cataract in humans and is the first report of a naturally occurring mutation leading to non-syndromic cataract in cattle, which provides a potential large animal model for human cataract.
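The 855 bp size quoted in the abstract above can be checked against the variant name c.3579_3604+829del. Under the usual reading of this notation, the deletion runs from coding position 3579 through position 3604 and then 829 bases into the following intron. The helper name below is hypothetical.

```python
def deletion_length(first_coding_pos: int, last_coding_pos: int, intron_offset: int) -> int:
    """Size of a deletion spanning coding positions first..last (inclusive)
    plus `intron_offset` bases of the downstream intron."""
    exonic_bases = last_coding_pos - first_coding_pos + 1  # inclusive range
    return exonic_bases + intron_offset


# c.3579_3604+829del: 26 exonic bases plus 829 intronic bases
print(deletion_length(3579, 3604, 829))
```

The 26 exonic bases plus 829 intronic bases give the 855 bp reported for the deletion.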
Gene Unprediction with Spurio: A tool to identify spurious protein sequences.

PubMed

Höps, Wolfram; Jeffryes, Matt; Bateman, Alex

2018-01-01

We now have access to the sequences of tens of millions of proteins. These protein sequences are essential for modern molecular biology and computational biology. The vast majority of protein sequences are derived from gene prediction tools and have no experimental supporting evidence for their translation. Despite the increasing accuracy of gene prediction tools there likely exists a large number of spurious protein predictions in the sequence databases. We have developed the Spurio tool to help identify spurious protein predictions in prokaryotes. Spurio searches the query protein sequence against a prokaryotic nucleotide database using tblastn and identifies homologous sequences. The tblastn matches are used to score the query sequence's likelihood of being a spurious protein prediction using a Gaussian process model. The most informative feature is the appearance of stop codons within the presumed translation of homologous DNA sequences. Benchmarking shows that the Spurio tool is able to distinguish spurious from true proteins. However, transposon proteins are prone to be predicted as spurious because of the frequency of degraded homologs found in the DNA sequence databases. Our initial experiments suggest that less than 1% of the proteins in the UniProtKB sequence database are likely to be spurious and that Spurio is able to identify over 60 times more spurious proteins than the AntiFam resource. The Spurio software and source code are available under an MIT license at the following URL: https://bitbucket.org/bateman-group/spurio.
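The stop-codon feature described in the abstract above can be illustrated with a small sketch. This is not Spurio's code or its scoring model, which the abstract describes as a Gaussian process over tblastn matches. It assumes the homologous DNA regions have already been extracted in the reading frame of the alignment, and the function names and example sequences are hypothetical.

```python
from typing import Iterable

# Stop codons of the standard genetic code (also used by the bacterial table).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def count_inframe_stops(dna: str) -> int:
    """Count stop codons in frame 0 of `dna`; a trailing partial codon is ignored."""
    dna = dna.upper()
    return sum(dna[i:i + 3] in STOP_CODONS for i in range(0, len(dna) - 2, 3))


def fraction_with_stops(regions: Iterable[str]) -> float:
    """Fraction of homologous regions whose presumed translation contains a stop codon."""
    regions = list(regions)
    if not regions:
        return 0.0
    return sum(count_inframe_stops(r) > 0 for r in regions) / len(regions)


# Hypothetical aligned regions: the first carries an in-frame TAG, the second does not.
hits = ["ATGAAATAGGGC", "ATGAAAGGC"]
print(fraction_with_stops(hits))
```

A high fraction would hint that the matching DNA is not under selection to stay translatable, which is the intuition behind the feature; a real classifier would combine it with other evidence.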
