ChIPpeakAnno: a Bioconductor package to annotate ChIP-seq and ChIP-chip data
2010-01-01
Background: Chromatin immunoprecipitation (ChIP) followed by high-throughput sequencing (ChIP-seq) or ChIP followed by genome tiling array analysis (ChIP-chip) have become standard technologies for genome-wide identification of DNA-binding protein target sites. A number of algorithms have been developed in parallel that allow identification of binding sites from ChIP-seq or ChIP-chip datasets and subsequent visualization in the University of California Santa Cruz (UCSC) Genome Browser as custom annotation tracks. However, summarizing these tracks can be a daunting task, particularly if there are a large number of binding sites or the binding sites are distributed widely across the genome. Results: We have developed ChIPpeakAnno as a Bioconductor package within the statistical programming environment R to facilitate batch annotation of enriched peaks identified from ChIP-seq, ChIP-chip, cap analysis of gene expression (CAGE) or any experiments resulting in a large number of enriched genomic regions. The binding sites annotated with ChIPpeakAnno can be viewed easily as a table, a pie chart or plotted in histogram form, i.e., the distribution of distances to the nearest genes for each set of peaks. In addition, we have implemented functionalities for determining the significance of overlap between replicates or binding sites among transcription factors within a complex, and for drawing Venn diagrams to visualize the extent of the overlap between replicates. Furthermore, the package includes functionalities to retrieve sequences flanking putative binding sites for PCR amplification, cloning, or motif discovery, and to identify Gene Ontology (GO) terms associated with adjacent genes. Conclusions: ChIPpeakAnno enables batch annotation of the binding sites identified from ChIP-seq, ChIP-chip, CAGE or any technology that results in a large number of enriched genomic regions within the statistical programming environment R.
Allowing users to pass their own annotation data, such as a different ChIP preparation or a dataset from the literature, or to use existing annotation packages, such as GenomicFeatures and BSgenome, provides flexibility. Tight integration with the biomaRt package enables up-to-date annotation retrieval from the BioMart database. PMID:20459804
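The core annotation step described above, assigning each peak to its nearest gene and recording the distance, can be sketched as follows. This is a minimal illustration in Python, not the ChIPpeakAnno R API: the function name `annotate_peaks`, the tuple layouts, and the choice to measure from the peak midpoint to the transcription start site (TSS) while ignoring strand are all assumptions made for the example.

```python
from bisect import bisect_left


def annotate_peaks(peaks, genes):
    """Assign each peak to the gene with the nearest TSS (hypothetical helper).

    peaks: list of (chrom, start, end) tuples.
    genes: list of (chrom, tss, gene_name) tuples.
    Returns a list of (chrom, start, end, gene_name, distance), where distance
    is the peak midpoint minus the TSS position. Strand is ignored here.
    """
    # Group TSS entries by chromosome and sort them so we can binary-search.
    by_chrom = {}
    for chrom, tss, name in genes:
        by_chrom.setdefault(chrom, []).append((tss, name))
    positions = {}
    for chrom, entries in by_chrom.items():
        entries.sort()
        positions[chrom] = [tss for tss, _ in entries]

    annotated = []
    for chrom, start, end in peaks:
        entries = by_chrom.get(chrom)
        if not entries:
            # No gene annotation on this chromosome.
            annotated.append((chrom, start, end, None, None))
            continue
        mid = (start + end) // 2
        i = bisect_left(positions[chrom], mid)
        # Only the TSS just left and just right of the midpoint can be nearest.
        candidates = entries[max(i - 1, 0):i + 1]
        tss, name = min(candidates, key=lambda entry: abs(mid - entry[0]))
        annotated.append((chrom, start, end, name, mid - tss))
    return annotated
```

The signed distances returned here are the kind of values that would feed a distance-to-nearest-gene histogram such as the one the abstract mentions.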
Incorporating evolution of transcription factor binding sites into annotated alignments.
Bais, Abha S; Grossmann, Stefen; Vingron, Martin
2007-08-01
Identifying transcription factor binding sites (TFBSs) is essential to elucidate putative regulatory mechanisms. A common strategy is to combine cross-species conservation with single sequence TFBS annotation to yield "conserved TFBSs". Most current methods in this field adopt a multi-step approach that segregates the two aspects. Moreover, it is widely accepted that the evolutionary dynamics of binding sites differ from those of the surrounding sequence. Hence, it is desirable to have an approach that explicitly takes this factor into account. Although a plethora of approaches have been proposed for the prediction of conserved TFBSs, very few explicitly model TFBS evolutionary properties, and these are additionally multi-step. Recently, we introduced a novel approach to simultaneously align and annotate conserved TFBSs in a pair of sequences. Building upon the standard Smith-Waterman algorithm for local alignments, SimAnn introduces additional states for profiles to output extended alignments or annotated alignments. That is, alignments with parts annotated as gaplessly aligned TFBSs (pair-profile hits) are generated. Moreover, the pair-profile related parameters are derived in a sound statistical framework. In this article, we extend this approach to explicitly incorporate evolution of binding sites in the SimAnn framework. We demonstrate the extension in the theoretical derivations through two position-specific evolutionary models, previously used for modelling TFBS evolution. In a simulated setting, we provide a proof of concept that the approach works given the underlying assumptions, as compared to the original work. Finally, using a real dataset of experimentally verified binding sites in human-mouse sequence pairs, we compare the new approach (eSimAnn) to an existing multi-step tool that also considers TFBS evolution.
Although it is widely accepted that binding sites evolve differently from the surrounding sequences, most comparative TFBS identification methods do not explicitly consider this. Additionally, prediction of conserved binding sites is carried out in a multi-step approach that segregates alignment from TFBS annotation. In this paper, we demonstrate how the simultaneous alignment and annotation approach of SimAnn can be further extended to incorporate TFBS evolutionary relationships. We study how alignments and binding site predictions interplay at varying evolutionary distances and for various profile qualities.
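For orientation, the standard Smith-Waterman local alignment that SimAnn is said to build upon can be sketched as below. This is only the textbook recurrence with a linear gap penalty. It does not include the additional profile states, pair-profile hits, or evolutionary models of SimAnn or eSimAnn, and the scoring values (`match=2`, `mismatch=-1`, `gap=-2`) are arbitrary assumptions chosen for illustration.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for one best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The 0 option lets an alignment restart anywhere (local alignment).
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero-scoring cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

An annotated-alignment method in the spirit of the abstract would add further dynamic-programming states for gapless profile hits on top of this recurrence; that extension is not attempted here.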
Mudgal, Richa; Srinivasan, Narayanaswamy; Chandra, Nagasuma
2017-07-01
Functional annotation is seldom straightforward with complexities arising due to functional divergence in protein families or functional convergence between non-homologous protein families, leading to mis-annotations. An enzyme may contain multiple domains and not all domains may be involved in a given function, adding to the complexity in function annotation. To address this, we use binding site information from bound cognate ligands and catalytic residues, since it can help in resolving fold-function relationships at a finer level and with higher confidence. A comprehensive database of 2,020 fold-function-binding site relationships has been systematically generated. A network-based approach is employed to capture the complexity in these relationships, from which different types of associations are deciphered, that identify versatile protein folds performing diverse functions, same function associated with multiple folds and one-to-one relationships. Binding site similarity networks integrated with fold, function, and ligand similarity information are generated to understand the depth of these relationships. Apart from the observed continuity in the functional site space, network properties of these revealed versatile families with topologically different or dissimilar binding sites and structural families that perform very similar functions. As a case study, subtle changes in the active site of a set of evolutionarily related superfamilies are studied using these networks. Tracing of such similarities in evolutionarily related proteins provides clues into the transition and evolution of protein functions. Insights from this study will be helpful in accurate and reliable functional annotations of uncharacterized proteins, poly-pharmacology, and designing enzymes with new functional capabilities. Proteins 2017; 85:1319-1335. © 2017 Wiley Periodicals, Inc.
OnTheFly: a database of Drosophila melanogaster transcription factors and their binding sites.
Shazman, Shula; Lee, Hunjoong; Socol, Yakov; Mann, Richard S; Honig, Barry
2014-01-01
We present OnTheFly (http://bhapp.c2b2.columbia.edu/OnTheFly/index.php), a database comprising a systematic collection of transcription factors (TFs) of Drosophila melanogaster and their DNA-binding sites. TFs predicted in the Drosophila melanogaster genome are annotated and classified and their structures, obtained via experiment or homology models, are provided. All known preferred TF DNA-binding sites obtained from the B1H, DNase I and SELEX methodologies are presented. DNA shape parameters predicted for these sites are obtained from a high throughput server or from crystal structures of protein-DNA complexes where available. An important feature of the database is that all DNA-binding domains and their binding sites are fully annotated in a eukaryote using structural criteria and evolutionary homology. OnTheFly thus provides a comprehensive view of TFs and their binding sites that will be a valuable resource for deciphering non-coding regulatory DNA.
Suplatov, Dmitry; Kirilin, Eugeny; Arbatsky, Mikhail; Takhaveev, Vakil; Švedas, Vytas
2014-01-01
The new web-server pocketZebra implements the power of bioinformatics and geometry-based structural approaches to identify and rank subfamily-specific binding sites in proteins by functional significance, and select particular positions in the structure that determine selective accommodation of ligands. A new scoring function has been developed to annotate binding sites by the presence of the subfamily-specific positions in diverse protein families. pocketZebra web-server has multiple input modes to meet the needs of users with different experience in bioinformatics. The server provides on-site visualization of the results as well as off-line version of the output in annotated text format and as PyMol sessions ready for structural analysis. pocketZebra can be used to study structure–function relationship and regulation in large protein superfamilies, classify functionally important binding sites and annotate proteins with unknown function. The server can be used to engineer ligand-binding sites and allosteric regulation of enzymes, or implemented in a drug discovery process to search for potential molecular targets and novel selective inhibitors/effectors. The server, documentation and examples are freely available at http://biokinet.belozersky.msu.ru/pocketzebra and there are no login requirements. PMID:24852248
Meslamani, Jamel; Rognan, Didier; Kellenberger, Esther
2011-05-01
The sc-PDB database is an annotated archive of druggable binding sites extracted from the Protein Data Bank. It contains all-atom coordinates for 8166 protein-ligand complexes, chosen for their geometrical and physico-chemical properties. The sc-PDB provides a functional annotation for proteins, a chemical description for ligands and the detailed intermolecular interactions for complexes. The sc-PDB now includes a hierarchical classification of all the binding sites within a functional class. The sc-PDB entries were first clustered according to the protein name, irrespective of the species. For each cluster, we identified dissimilar sites (e.g. catalytic and allosteric sites of an enzyme). SCOPE AND APPLICATIONS: The classification of sc-PDB targets by binding site diversity was intended to facilitate chemogenomics approaches to drug design. In ligand-based approaches, it avoids comparing ligands that do not share the same binding site. In structure-based approaches, it permits quantitative evaluation of the diversity of the binding site definition (variations in size, sequence and/or structure). The sc-PDB database is freely available at: http://bioinfo-pharma.u-strasbg.fr/scPDB.
Text Mining Improves Prediction of Protein Functional Sites
Cohn, Judith D.; Ravikumar, Komandur E.
2012-01-01
We present an approach that integrates protein structure analysis and text mining for protein functional site prediction, called LEAP-FS (Literature Enhanced Automated Prediction of Functional Sites). The structure analysis was carried out using Dynamics Perturbation Analysis (DPA), which predicts functional sites at control points where interactions greatly perturb protein vibrations. The text mining extracts mentions of residues in the literature, and predicts that residues mentioned are functionally important. We assessed the significance of each of these methods by analyzing their performance in finding known functional sites (specifically, small-molecule binding sites and catalytic sites) in about 100,000 publicly available protein structures. The DPA predictions recapitulated many of the functional site annotations and preferentially recovered binding sites annotated as biologically relevant vs. those annotated as potentially spurious. The text-based predictions were also substantially supported by the functional site annotations: compared to other residues, residues mentioned in text were roughly six times more likely to be found in a functional site. The overlap of predictions with annotations improved when the text-based and structure-based methods agreed. Our analysis also yielded new high-quality predictions of many functional site residues that were not catalogued in the curated data sources we inspected. We conclude that both DPA and text mining independently provide valuable high-throughput protein functional site predictions, and that integrating the two methods using LEAP-FS further improves the quality of these predictions. PMID:22393388
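The text-mining step described above, extracting residue mentions from literature, can be illustrated with a deliberately simple regular expression. This is a sketch only and is not the LEAP-FS extraction method: the pattern, the function name `residue_mentions`, and the restriction to capitalised three-letter amino acid codes followed by a position number are assumptions for the example, and such a pattern will produce false positives and miss other notations.

```python
import re

# The 20 standard amino acids in three-letter code.
THREE_LETTER = (
    "Ala Arg Asn Asp Cys Gln Glu Gly His Ile "
    "Leu Lys Met Phe Pro Ser Thr Trp Tyr Val"
).split()

# Matches forms such as "His57", "Asp-102" or "Ser 195".
RESIDUE_MENTION = re.compile(
    r"\b(" + "|".join(THREE_LETTER) + r")[- ]?(\d{1,4})\b"
)


def residue_mentions(text):
    """Return (amino_acid, position) pairs mentioned in text, in order."""
    return [(m.group(1), int(m.group(2))) for m in RESIDUE_MENTION.finditer(text)]
```

Mapping such mentions onto residues of a specific protein structure, which the abstract's evaluation requires, is a separate and harder step that this sketch does not cover.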
Damienikan, Aliaksandr U.
2016-01-01
The majority of bacterial genome annotations are currently automated and based on a ‘gene by gene’ approach. Regulatory signals and operon structures are rarely taken into account which often results in incomplete and even incorrect gene function assignments. Here we present SigmoID, a cross-platform (OS X, Linux and Windows) open-source application aiming at simplifying the identification of transcription regulatory sites (promoters, transcription factor binding sites and terminators) in bacterial genomes and providing assistance in correcting annotations in accordance with regulatory information. SigmoID combines a user-friendly graphical interface to well known command line tools with a genome browser for visualising regulatory elements in genomic context. Integrated access to online databases with regulatory information (RegPrecise and RegulonDB) and web-based search engines speeds up genome analysis and simplifies correction of genome annotation. We demonstrate some features of SigmoID by constructing a series of regulatory protein binding site profiles for two groups of bacteria: Soft Rot Enterobacteriaceae (Pectobacterium and Dickeya spp.) and Pseudomonas spp. Furthermore, we inferred over 900 transcription factor binding sites and alternative sigma factor promoters in the annotated genome of Pectobacterium atrosepticum. These regulatory signals control putative transcription units covering about 40% of the P. atrosepticum chromosome. Reviewing the annotation in cases where it didn’t fit with regulatory information allowed us to correct product and gene names for over 300 loci. PMID:27257541
Revised annotation of Plutella xylostella microRNAs and their genome-wide target identification.
Etebari, K; Asgari, S
2016-12-01
The diamondback moth, Plutella xylostella, is the most devastating pest of brassica crops worldwide. Although 128 mature microRNAs (miRNAs) have been annotated from this species in miRBase, there is a need to extend and correct the current P. xylostella miRNA repertoire as a result of its recently improved genome assembly and more available small RNA sequence data. We used our new ultra-deep sequence data and bioinformatics to re-annotate the P. xylostella genome for high confidence miRNAs with the correct 5p and 3p arm features. Furthermore, all the P. xylostella annotated genes were also screened to identify potential miRNA binding sites using three target-predicting algorithms. In total, 203 mature miRNAs were annotated, including 33 novel miRNAs. We identified 7691 highly confident binding sites for 160 pxy-miRNAs. The data provided here will facilitate future studies involving functional analyses of P. xylostella miRNAs as a platform to introduce novel approaches for sustainable management of this destructive pest. © 2016 The Royal Entomological Society.
Doppelt-Azeroual, Olivia; Delfaud, François; Moriaud, Fabrice; de Brevern, Alexandre G
2010-04-01
Ligand-protein interactions are essential for biological processes, and precise characterization of protein binding sites is crucial to understand protein functions. MED-SuMo is a powerful technology to localize similar local regions on protein surfaces. Its heuristic is based on a 3D representation of macromolecules using specific surface chemical features associating chemical characteristics with geometrical properties. MED-SMA is an automated and fast method to classify binding sites. It is based on MED-SuMo technology, which builds a similarity graph, and it uses the Markov Clustering algorithm. Purine binding sites are well studied as drug targets. Here, purine binding sites of the Protein DataBank (PDB) are classified. Proteins potentially inhibited or activated through the same mechanism are gathered. Results are analyzed according to PROSITE annotations and to carefully refined functional annotations extracted from the PDB. As expected, binding sites associated with related mechanisms are gathered, for example, the Small GTPases. Nevertheless, protein kinases from different Kinome families are also found together, for example, Aurora-A and CDK2 proteins which are inhibited by the same drugs. Representative examples of different clusters are presented. The effectiveness of the MED-SMA approach is demonstrated as it gathers binding sites of proteins with similar structure-activity relationships. Moreover, an efficient new protocol associates structures absent of cocrystallized ligands to the purine clusters enabling those structures to be associated with a specific binding mechanism. Applications of this classification by binding mode similarity include target-based drug design and prediction of cross-reactivity and therefore potential toxic side effects.
Doppelt-Azeroual, Olivia; Delfaud, François; Moriaud, Fabrice; de Brevern, Alexandre G
2010-01-01
Ligand–protein interactions are essential for biological processes, and precise characterization of protein binding sites is crucial to understand protein functions. MED-SuMo is a powerful technology to localize similar local regions on protein surfaces. Its heuristic is based on a 3D representation of macromolecules using specific surface chemical features associating chemical characteristics with geometrical properties. MED-SMA is an automated and fast method to classify binding sites. It is based on MED-SuMo technology, which builds a similarity graph, and it uses the Markov Clustering algorithm. Purine binding sites are well studied as drug targets. Here, purine binding sites of the Protein DataBank (PDB) are classified. Proteins potentially inhibited or activated through the same mechanism are gathered. Results are analyzed according to PROSITE annotations and to carefully refined functional annotations extracted from the PDB. As expected, binding sites associated with related mechanisms are gathered, for example, the Small GTPases. Nevertheless, protein kinases from different Kinome families are also found together, for example, Aurora-A and CDK2 proteins which are inhibited by the same drugs. Representative examples of different clusters are presented. The effectiveness of the MED-SMA approach is demonstrated as it gathers binding sites of proteins with similar structure-activity relationships. Moreover, an efficient new protocol associates structures absent of cocrystallized ligands to the purine clusters enabling those structures to be associated with a specific binding mechanism. Applications of this classification by binding mode similarity include target-based drug design and prediction of cross-reactivity and therefore potential toxic side effects. PMID:20162627
Thermodynamic Modeling of Donor Splice Site Recognition in pre-mRNA
NASA Astrophysics Data System (ADS)
Aalberts, Daniel P.; Garland, Jeffrey A.
2004-03-01
When eukaryotic genes are edited by the spliceosome, the first step in intron recognition is the binding of a U1 snRNA with the donor (5') splice site. We model this interaction thermodynamically to identify splice sites. Applied to a set of 65 annotated genes, our Finding with Binding method achieves a significant separation between real and false sites. Analyzing binding patterns allows us to discard a large number of decoy sites. Our results improve statistics-based methods for donor site recognition, demonstrating the promise of physical modeling to find functional elements in the genome.
GBshape: a genome browser database for DNA shape annotations
Chiu, Tsu-Pei; Yang, Lin; Zhou, Tianyin; Main, Bradley J.; Parker, Stephen C.J.; Nuzhdin, Sergey V.; Tullius, Thomas D.; Rohs, Remo
2015-01-01
Many regulatory mechanisms require a high degree of specificity in protein-DNA binding. Nucleotide sequence does not provide an answer to the question of why a protein binds only to a small subset of the many putative binding sites in the genome that share the same core motif. Whereas higher-order effects, such as chromatin accessibility, cooperativity and cofactors, have been described, DNA shape recently gained attention as another feature that fine-tunes the DNA binding specificities of some transcription factor families. Our Genome Browser for DNA shape annotations (GBshape; freely available at http://rohslab.cmb.usc.edu/GBshape/) provides minor groove width, propeller twist, roll, helix twist and hydroxyl radical cleavage predictions for the entire genomes of 94 organisms. Additional genomes can easily be added using the GBshape framework. GBshape can be used to visualize DNA shape annotations qualitatively in a genome browser track format, and to download quantitative values of DNA shape features as a function of genomic position at nucleotide resolution. As biological applications, we illustrate the periodicity of DNA shape features that are present in nucleosome-occupied sequences from human, fly and worm, and we demonstrate structural similarities between transcription start sites in the genomes of four Drosophila species. PMID:25326329
Composite Structural Motifs of Binding Sites for Delineating Biological Functions of Proteins
Kinjo, Akira R.; Nakamura, Haruki
2012-01-01
Most biological processes are described as a series of interactions between proteins and other molecules, and interactions are in turn described in terms of atomic structures. To annotate protein functions as sets of interaction states at atomic resolution, and thereby to better understand the relation between protein interactions and biological functions, we conducted exhaustive all-against-all atomic structure comparisons of all known binding sites for ligands including small molecules, proteins and nucleic acids, and identified recurring elementary motifs. By integrating the elementary motifs associated with each subunit, we defined composite motifs that represent context-dependent combinations of elementary motifs. It is demonstrated that function similarity can be better inferred from composite motif similarity compared to the similarity of protein sequences or of individual binding sites. By integrating the composite motifs associated with each protein function, we define meta-composite motifs each of which is regarded as a time-independent diagrammatic representation of a biological process. It is shown that meta-composite motifs provide richer annotations of biological processes than sequence clusters. The present results serve as a basis for bridging atomic structures to higher-order biological phenomena by classification and integration of binding site structures. PMID:22347478
Kumar, Sunil; Ambrosini, Giovanna; Bucher, Philipp
2017-01-04
SNP2TFBS is a computational resource intended to support researchers investigating the molecular mechanisms underlying regulatory variation in the human genome. The database essentially consists of a collection of text files providing specific annotations for human single nucleotide polymorphisms (SNPs), namely whether they are predicted to abolish, create or change the affinity of one or several transcription factor (TF) binding sites. A SNP's effect on TF binding is estimated based on a position weight matrix (PWM) model for the binding specificity of the corresponding factor. These data files are regenerated at regular intervals by an automatic procedure that takes as input a reference genome, a comprehensive SNP catalogue and a collection of PWMs. SNP2TFBS is also accessible over a web interface, enabling users to view the information provided for an individual SNP, to extract SNPs based on various search criteria, to annotate uploaded sets of SNPs or to display statistics about the frequencies of binding sites affected by selected SNPs. Homepage: http://ccg.vital-it.ch/snp2tfbs/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
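The idea of estimating a SNP's effect on TF binding from a position weight matrix can be sketched as follows. This is a toy illustration and not the SNP2TFBS procedure: the function names, the use of a fixed score threshold to call a site, the forward-strand-only scan, and the integer weights in the usage note are all assumptions made for the example.

```python
def pwm_score(pwm, site):
    """Sum the per-position weights of `site`; pwm is a list of base -> weight dicts."""
    return sum(column[base] for column, base in zip(pwm, site))


def snp_effect(pwm, ref_seq, snp_index, alt_base, threshold):
    """Classify a SNP as 'abolish', 'create', 'change' or 'none' (hypothetical helper).

    Only windows that overlap the SNP position are scored, on the forward
    strand, and a window counts as a binding site if its score >= threshold.
    Returns (effect, best_ref_score, best_alt_score).
    """
    width = len(pwm)
    if len(ref_seq) < width:
        raise ValueError("sequence is shorter than the PWM")
    alt_seq = ref_seq[:snp_index] + alt_base + ref_seq[snp_index + 1:]

    # First and last window start positions that still cover the SNP.
    first = max(0, snp_index - width + 1)
    last = min(len(ref_seq) - width, snp_index)
    starts = range(first, last + 1)
    ref_best = max(pwm_score(pwm, ref_seq[i:i + width]) for i in starts)
    alt_best = max(pwm_score(pwm, alt_seq[i:i + width]) for i in starts)

    ref_hit, alt_hit = ref_best >= threshold, alt_best >= threshold
    if ref_hit and not alt_hit:
        effect = "abolish"
    elif alt_hit and not ref_hit:
        effect = "create"
    elif ref_hit and alt_hit and ref_best != alt_best:
        effect = "change"
    else:
        effect = "none"
    return effect, ref_best, alt_best
```

With a three-column toy matrix that rewards T, G, A in turn (weight 2 for the preferred base, -2 otherwise) and a threshold of 4, changing the G in `CCTGACC` to C removes the only window that scores above the threshold, so the call is `abolish`.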
Thermodynamic modeling of donor splice site recognition in pre-mRNA
NASA Astrophysics Data System (ADS)
Garland, Jeffrey A.; Aalberts, Daniel P.
2004-04-01
When eukaryotic genes are edited by the spliceosome, the first step in intron recognition is the binding of a U1 small nuclear RNA with the donor (5') splice site. We model this interaction thermodynamically to identify splice sites. Applied to a set of 65 annotated genes, our “finding with binding” method achieves a significant separation between real and false sites. Analyzing binding patterns allows us to discard a large number of decoy sites. Our results improve statistics-based methods for donor site recognition, demonstrating the promise of physical modeling to find functional elements in the genome.
Cheng, Chia-Yang; Chu, Chia-Han; Hsu, Hung-Wei; Hsu, Fang-Rong; Tang, Chung Yi; Wang, Wen-Ching; Kung, Hsing-Jien; Chang, Pei-Ching
2014-01-01
Post-translational modification (PTM) of transcriptional factors and chromatin remodelling proteins is recognized as a major mechanism by which transcriptional regulation occurs. Chromatin immunoprecipitation (ChIP) in combination with high-throughput sequencing (ChIP-seq) is being applied as a gold standard when studying the genome-wide binding sites of transcription factors (TFs). This has greatly improved our understanding of protein-DNA interactions on a genome-wide scale. However, current ChIP-seq peak calling tools are not sufficiently sensitive and are unable to simultaneously identify post-translationally modified TFs based on ChIP-seq analysis; this is largely due to the wide-spread presence of multiple modified TFs. Using SUMO-1 modification as an example, we describe here an improved approach that allows the simultaneous identification of the particular genomic binding regions of all TFs with SUMO-1 modification. Traditional peak calling methods are inadequate when identifying multiple TF binding sites that involve long genomic regions and therefore we designed a ChIP-seq processing pipeline for the detection of peaks via a combinatorial fusion method. Then, we annotate the peaks with known transcription factor binding sites (TFBS) using the Transfac Matrix Database (v7.0), which predicts potential SUMOylated TFs. Next, the peak calling result was further analyzed based on the promoter proximity, TFBS annotation, a literature review, and was validated by ChIP-real-time quantitative PCR (qPCR) and ChIP-reChIP real-time qPCR. The results show clearly that SUMOylated TFs are able to be pinpointed using our pipeline. A methodology is presented that analyzes SUMO-1 ChIP-seq patterns and predicts related TFs. Our analysis uses three peak calling tools. The fusion of these different tools increases the precision of the peak calling results. The TFBS annotation method is able to predict potential SUMOylated TFs.
Here, we offer a new approach that enhances ChIP-seq data analysis and allows the identification of multiple SUMOylated TF binding sites simultaneously, which can then be utilized for other functional PTM binding site prediction in the future.
sc-PDB: an annotated database of druggable binding sites from the Protein Data Bank.
Kellenberger, Esther; Muller, Pascal; Schalon, Claire; Bret, Guillaume; Foata, Nicolas; Rognan, Didier
2006-01-01
The sc-PDB is a collection of 6 415 three-dimensional structures of binding sites found in the Protein Data Bank (PDB). Binding sites were extracted from all high-resolution crystal structures in which a complex between a protein cavity and a small-molecular-weight ligand could be identified. Importantly, ligands are considered from a pharmacological and not a structural point of view. Therefore, solvents, detergents, and most metal ions are not stored in the sc-PDB. Ligands are classified into four main categories: nucleotides (< 4-mer), peptides (< 9-mer), cofactors, and organic compounds. The corresponding binding site is formed by all protein residues (including amino acids, cofactors, and important metal ions) with at least one atom within 6.5 angstroms of any ligand atom. The database was carefully annotated by browsing several protein databases (PDB, UniProt, and GO) and storing, for every sc-PDB entry, the following features: protein name, function, source, domain and mutations, ligand name, and structure. The repository of ligands has also been archived by diversity analysis of molecular scaffolds, and several chemoinformatics descriptors were computed to better understand the chemical space covered by stored ligands. The sc-PDB may be used for several purposes: (i) screening a collection of binding sites for predicting the most likely target(s) of any ligand, (ii) analyzing the molecular similarity between different cavities, and (iii) deriving rules that describe the relationship between ligand pharmacophoric points and active-site properties. The database is periodically updated and accessible on the web at http://bioinfo-pharma.u-strasbg.fr/scPDB/.
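The binding site definition given above (all protein residues with at least one atom within 6.5 angstroms of any ligand atom) translates directly into a distance filter. The following is a minimal sketch, not sc-PDB code: the function name, the input layout of `(residue_id, (x, y, z))` tuples, and the brute-force all-pairs comparison are assumptions for illustration, and parsing of PDB files is left out.

```python
import math


def binding_site_residues(protein_atoms, ligand_atoms, cutoff=6.5):
    """Return sorted IDs of residues with any atom within `cutoff` of the ligand.

    protein_atoms: list of (residue_id, (x, y, z)) tuples, one per atom.
    ligand_atoms: list of (x, y, z) tuples.
    Coordinates and cutoff are in angstroms.
    """
    site = set()
    for residue_id, coord in protein_atoms:
        if residue_id in site:
            # One qualifying atom is enough to include the whole residue.
            continue
        if any(math.dist(coord, lig) <= cutoff for lig in ligand_atoms):
            site.add(residue_id)
    return sorted(site)
```

For full-size structures a spatial index (for example a grid or k-d tree) would replace the all-pairs loop, but the selection rule stays the same.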
Brylinski, Michal; Skolnick, Jeffrey
2010-01-01
The rapid accumulation of gene sequences, many of which are hypothetical proteins with unknown function, has stimulated the development of accurate computational tools for protein function prediction with evolution/structure-based approaches showing considerable promise. In this paper, we present FINDSITE-metal, a new threading-based method designed specifically to detect metal binding sites in modeled protein structures. Comprehensive benchmarks using different quality protein structures show that weakly homologous protein models provide sufficient structural information for quite accurate annotation by FINDSITE-metal. Combining structure/evolutionary information with machine learning results in highly accurate metal binding annotations; for protein models constructed by TASSER, whose average Cα RMSD from the native structure is 8.9 Å, 59.5% (71.9%) of the best of top five predicted metal locations are within 4 Å (8 Å) from a bound metal in the crystal structure. For most of the targets, multiple metal binding sites are detected with the best predicted binding site at rank 1 and within the top 2 ranks in 65.6% and 83.1% of the cases, respectively. Furthermore, for iron, copper, zinc, calcium and magnesium ions, the binding metal can be predicted with high, typically 70-90%, accuracy. FINDSITE-metal also provides a set of confidence indexes that help assess the reliability of predictions. Finally, we describe the proteome-wide application of FINDSITE-metal that quantifies the metal binding complement of the human proteome. FINDSITE-metal is freely available to the academic community at http://cssb.biology.gatech.edu/findsite-metal/. PMID:21287609
VASP-E: Specificity Annotation with a Volumetric Analysis of Electrostatic Isopotentials
Chen, Brian Y.
2014-01-01
Algorithms for comparing protein structure are frequently used for function annotation. By searching for subtle similarities among very different proteins, these algorithms can identify remote homologs with similar biological functions. In contrast, few comparison algorithms focus on specificity annotation, where the identification of subtle differences among very similar proteins can assist in finding small structural variations that create differences in binding specificity. Few specificity annotation methods consider electrostatic fields, which play a critical role in molecular recognition. To fill this gap, this paper describes VASP-E (Volumetric Analysis of Surface Properties with Electrostatics), a novel volumetric comparison tool based on the electrostatic comparison of protein-ligand and protein-protein binding sites. VASP-E exploits the central observation that three dimensional solids can be used to fully represent and compare both electrostatic isopotentials and molecular surfaces. With this integrated representation, VASP-E is able to dissect the electrostatic environments of protein-ligand and protein-protein binding interfaces, identifying individual amino acids that have an electrostatic influence on binding specificity. VASP-E was used to examine a nonredundant subset of the serine and cysteine proteases as well as the barnase-barstar and Rap1a-raf complexes. Based on amino acids established by various experimental studies to have an electrostatic influence on binding specificity, VASP-E identified electrostatically influential amino acids with 100% precision and 83.3% recall. We also show that VASP-E can accurately classify closely related ligand binding cavities into groups with different binding preferences. These results suggest that VASP-E should prove a useful tool for the characterization of specific binding and the engineering of binding preferences in proteins. PMID:25166865
Kılıç, Sefa; Sagitova, Dinara M; Wolfish, Shoshannah; Bely, Benoit; Courtot, Mélanie; Ciufo, Stacy; Tatusova, Tatiana; O'Donovan, Claire; Chibucos, Marcus C; Martin, Maria J; Erill, Ivan
2016-01-01
Domain-specific databases are essential resources for the biomedical community, leveraging expert knowledge to curate published literature and provide access to referenced data and knowledge. The limited scope of these databases, however, poses important challenges on their infrastructure, visibility, funding and usefulness to the broader scientific community. CollecTF is a community-oriented database documenting experimentally validated transcription factor (TF)-binding sites in the Bacteria domain. In its quest to become a community resource for the annotation of transcriptional regulatory elements in bacterial genomes, CollecTF aims to move away from the conventional data-repository paradigm of domain-specific databases. Through the adoption of well-established ontologies, identifiers and collaborations, CollecTF has progressively become also a portal for the annotation and submission of information on transcriptional regulatory elements to major biological sequence resources (RefSeq, UniProtKB and the Gene Ontology Consortium). This fundamental change in database conception capitalizes on the domain-specific knowledge of contributing communities to provide high-quality annotations, while leveraging the availability of stable information hubs to promote long-term access and provide high visibility to the data. As a submission portal, CollecTF generates TF-binding site information through direct annotation of RefSeq genome records, definition of TF-based regulatory networks in UniProtKB entries and submission of functional annotations to the Gene Ontology. As a database, CollecTF provides enhanced search and browsing, targeted data exports, binding motif analysis tools and integration with motif discovery and search platforms. This innovative approach will allow CollecTF to focus its limited resources on the generation of high-quality information and the provision of specialized access to the data. Database URL: http://www.collectf.org/. © The Author(s) 2016. Published by Oxford University Press.
footprintDB: a database of transcription factors with annotated cis elements and binding interfaces.
Sebastian, Alvaro; Contreras-Moreira, Bruno
2014-01-15
Traditional and high-throughput techniques for determining transcription factor (TF) binding specificities are generating large volumes of data of uneven quality, which are scattered across individual databases. FootprintDB integrates some of the most comprehensive freely available libraries of curated DNA binding sites and systematically annotates the binding interfaces of the corresponding TFs. The first release contains 2422 unique TF sequences, 10 112 DNA binding sites and 3662 DNA motifs. A survey of the included data sources, organisms and TF families was performed together with proprietary database TRANSFAC, finding that footprintDB has a similar coverage of multicellular organisms, while also containing bacterial regulatory data. A search engine has been designed that drives the prediction of DNA motifs for input TFs, or conversely of TF sequences that might recognize input regulatory sequences, by comparison with database entries. Such predictions can also be extended to a single proteome chosen by the user, and results are ranked in terms of interface similarity. Benchmark experiments with bacterial, plant and human data were performed to measure the predictive power of footprintDB searches, which were able to correctly recover 10, 55 and 90% of the tested sequences, respectively. Correctly predicted TFs had a higher interface similarity than the average, confirming its diagnostic value. Web site implemented in PHP, Perl, MySQL and Apache. Freely available from http://floresta.eead.csic.es/footprintdb.
sc-PDB: a 3D-database of ligandable binding sites--10 years on.
Desaphy, Jérémy; Bret, Guillaume; Rognan, Didier; Kellenberger, Esther
2015-01-01
The sc-PDB database (available at http://bioinfo-pharma.u-strasbg.fr/scPDB/) is a comprehensive and up-to-date selection of ligandable binding sites of the Protein Data Bank. Sites are defined from complexes between a protein and a pharmacological ligand. The database provides the all-atom description of the protein, its ligand, their binding site and their binding mode. Currently, the sc-PDB archive registers 9283 binding sites from 3678 unique proteins and 5608 unique ligands. The sc-PDB database was publicly launched in 2004 with the aim of providing structure files suitable for computational approaches to drug design, such as docking. During the last 10 years we have improved and standardized the processes for (i) identifying binding sites, (ii) correcting structures, (iii) annotating protein function and ligand properties and (iv) characterizing their binding mode. This paper presents the latest enhancements in the database, specifically pertaining to the representation of molecular interaction and to the similarity between ligand/protein binding patterns. The new website puts emphasis in pictorial analysis of data. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research. PMID:25300483
ePIANNO: ePIgenomics ANNOtation tool.
Liu, Chia-Hsin; Ho, Bing-Ching; Chen, Chun-Ling; Chang, Ya-Hsuan; Hsu, Yi-Chiung; Li, Yu-Cheng; Yuan, Shin-Sheng; Huang, Yi-Huan; Chang, Chi-Sheng; Li, Ker-Chau; Chen, Hsuan-Yu
2016-01-01
Recently, with the development of next generation sequencing (NGS), the combination of chromatin immunoprecipitation (ChIP) and NGS, namely ChIP-seq, has become a powerful technique to capture potential genomic binding sites of regulatory factors, histone modifications and chromatin accessible regions. For most researchers, additional information including genomic variations on the TF binding site, allele frequency of variation between different populations, variation-associated disease, and other neighbouring TF binding sites is essential to generate a proper hypothesis or a meaningful conclusion. Many ChIP-seq datasets have been deposited in the public domain to help researchers make new discoveries. However, researchers are often intimidated by the complexity of the data structure and the size of the data volume. Such information would be more useful if it could be combined or downloaded with ChIP-seq data. To meet such demands, we built a web tool: ePIgenomic ANNOtation tool (ePIANNO, http://epianno.stat.sinica.edu.tw/index.html). ePIANNO is a web server that combines SNP information of populations (1000 Genomes Project) and gene-disease association information of GWAS (NHGRI) with ChIP-seq (hmChIP, ENCODE, and ROADMAP epigenomics) data. ePIANNO has a user-friendly website interface allowing researchers to explore, navigate, and extract data quickly. We use two examples to demonstrate how users could use functions of the ePIANNO web server to explore useful information about TF-related genomic variants. Users could use our query functions to search target regions, transcription factors, or annotations. ePIANNO may help users to generate hypotheses or explore potential biological functions for their studies.
CaMELS: In silico prediction of calmodulin binding proteins and their binding sites.
Abbasi, Wajid Arshad; Asif, Amina; Andleeb, Saiqa; Minhas, Fayyaz Ul Amir Afsar
2017-09-01
Due to Ca2+-dependent binding and the sequence diversity of Calmodulin (CaM) binding proteins, identifying CaM interactions and binding sites in the wet-lab is tedious and costly. Therefore, computational methods for this purpose are crucial to the design of such wet-lab experiments. We present an algorithm suite called CaMELS (CalModulin intEraction Learning System) for predicting proteins that interact with CaM as well as their binding sites using sequence information alone. CaMELS offers state of the art accuracy for both CaM interaction and binding site prediction and can aid biologists in studying CaM binding proteins. For CaM interaction prediction, CaMELS uses protein sequence features coupled with a large-margin classifier. CaMELS models the binding site prediction problem using multiple instance machine learning with a custom optimization algorithm which allows more effective learning over imprecisely annotated CaM-binding sites during training. CaMELS has been extensively benchmarked using a variety of data sets, mutagenic studies, proteome-wide Gene Ontology enrichment analyses and protein structures. Our experiments indicate that CaMELS outperforms simple motif-based search and other existing methods for interaction and binding site prediction. We have also found that the whole sequence of a protein, rather than just its binding site, is important for predicting its interaction with CaM. Using the machine learning model in CaMELS, we have identified important features of protein sequences for CaM interaction prediction as well as characteristic amino acid sub-sequences and their relative position for identifying CaM binding sites. Python code for training and evaluating CaMELS together with a webserver implementation is available at the URL: http://faculty.pieas.edu.pk/fayyaz/software.html#camels. © 2017 Wiley Periodicals, Inc.
Benchmarking database performance for genomic data.
Khushi, Matloob
2015-06-01
Genomic regions represent features such as gene annotations, transcription factor binding sites and epigenetic modifications. Genomic operations such as identifying overlapping/non-overlapping regions or nearest gene annotations are common research needs. The data can be saved in a database system for easy management; however, there is at present no comprehensive built-in database algorithm to identify overlapping regions. Therefore I have developed a novel region-mapping (RegMap) SQL-based algorithm to perform genomic operations and have benchmarked the performance of different databases. Benchmarking identified that PostgreSQL extracts overlapping regions much faster than MySQL. Insertion and data uploads in PostgreSQL were also better, although the general searching capability of both databases was almost equivalent. In addition, using the algorithm, pair-wise overlaps of >1000 datasets of transcription factor binding sites and histone marks, collected from previous publications, were reported and it was found that HNF4G significantly co-locates with cohesin subunit STAG1 (SA1). © 2015 Wiley Periodicals, Inc.
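The overlap operation discussed in this abstract rests on a simple interval test, sketched below in Python. This is the generic closed-interval predicate with a naive per-chromosome comparison, not the RegMap SQL algorithm itself, whose details the abstract does not give; the function names and the toy regions are assumptions made for illustration.

```python
from collections import defaultdict


def overlaps(a_start, a_end, b_start, b_end):
    """True if two closed intervals [a_start, a_end] and [b_start, b_end] share a position."""
    return a_start <= b_end and b_start <= a_end


def find_overlaps(regions_a, regions_b):
    """Return every pair (a, b) of overlapping regions on the same chromosome.

    Each region is a (chrom, start, end) tuple. The comparison is quadratic
    per chromosome, which is adequate for a sketch but not for large datasets.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in regions_b:
        by_chrom[chrom].append((start, end))

    hits = []
    for chrom, start, end in regions_a:
        for b_start, b_end in by_chrom.get(chrom, ()):
            if overlaps(start, end, b_start, b_end):
                hits.append(((chrom, start, end), (chrom, b_start, b_end)))
    return hits


# Toy regions, invented for the example.
peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
marks = [("chr1", 150, 250), ("chr2", 300, 400)]

print(find_overlaps(peaks, marks))
```

In SQL the same predicate would typically appear as a join condition on chromosome plus the two inequalities, which is where database indexing and query planning start to matter.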
Gold, Nicola D; Jackson, Richard M
2006-02-03
The rapid growth in protein structural data and the emergence of structural genomics projects have increased the need for automatic structure analysis and tools for function prediction. Small molecule recognition is critical to the function of many proteins; therefore, determination of ligand binding site similarity is important for understanding ligand interactions and may allow their functional classification. Here, we present a binding sites database (SitesBase) that given a known protein-ligand binding site allows rapid retrieval of other binding sites with similar structure independent of overall sequence or fold similarity. However, each match is also annotated with sequence similarity and fold information to aid interpretation of structure and functional similarity. Similarity in ligand binding sites can indicate common binding modes and recognition of similar molecules, allowing potential inference of function for an uncharacterised protein or providing additional evidence of common function where sequence or fold similarity is already known. Alternatively, the resource can provide valuable information for detailed studies of molecular recognition including structure-based ligand design and in understanding ligand cross-reactivity. Here, we show examples of atomic similarity between superfamily or more distant fold relatives as well as between seemingly unrelated proteins. Assignment of unclassified proteins to structural superfamilies is also undertaken and in most cases substantiates assignments made using sequence similarity. Correct assignment is also possible where sequence similarity fails to find significant matches, illustrating the potential use of binding site comparisons for newly determined proteins.
FALDO: a semantic standard for describing the location of nucleotide and protein feature annotation.
Bolleman, Jerven T; Mungall, Christopher J; Strozzi, Francesco; Baran, Joachim; Dumontier, Michel; Bonnal, Raoul J P; Buels, Robert; Hoehndorf, Robert; Fujisawa, Takatomo; Katayama, Toshiaki; Cock, Peter J A
2016-06-13
Nucleotide and protein sequence feature annotations are essential to understand biology on the genomic, transcriptomic, and proteomic level. Using Semantic Web technologies to query biological annotations, there was no standard that described this potentially complex location information as subject-predicate-object triples. We have developed an ontology, the Feature Annotation Location Description Ontology (FALDO), to describe the positions of annotated features on linear and circular sequences. FALDO can be used to describe nucleotide features in sequence records, protein annotations, and glycan binding sites, among other features in coordinate systems of the aforementioned "omics" areas. Using the same data format to represent sequence positions that are independent of file formats allows us to integrate sequence data from multiple sources and data types. The genome browser JBrowse is used to demonstrate accessing multiple SPARQL endpoints to display genomic feature annotations, as well as protein annotations from UniProt mapped to genomic locations. Our ontology allows users to uniformly describe - and potentially merge - sequence annotations from multiple sources. Data sources using FALDO can prospectively be retrieved using federalised SPARQL queries against public SPARQL endpoints and/or local private triple stores.
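To make the idea of describing a feature location as subject-predicate-object triples concrete, the following Python sketch builds such triples as plain tuples. The `faldo:` term names used here (Region, ExactPosition, location, begin, end, position, reference) are written as recalled from the ontology and should be checked against the published FALDO vocabulary; the `ex:` identifiers, the helper name, and the coordinates are hypothetical.

```python
def faldo_region_triples(feature, reference, begin, end):
    """Describe a feature spanning begin..end on a reference sequence as triples.

    All identifiers are prefixed names kept as plain strings; no RDF
    library is used, so this only illustrates the shape of the data.
    """
    region = f"{feature}_region"
    begin_node = f"{feature}_begin"
    end_node = f"{feature}_end"
    return [
        (feature, "faldo:location", region),
        (region, "rdf:type", "faldo:Region"),
        (region, "faldo:begin", begin_node),
        (region, "faldo:end", end_node),
        (begin_node, "rdf:type", "faldo:ExactPosition"),
        (begin_node, "faldo:position", begin),
        (begin_node, "faldo:reference", reference),
        (end_node, "rdf:type", "faldo:ExactPosition"),
        (end_node, "faldo:position", end),
        (end_node, "faldo:reference", reference),
    ]


# Hypothetical feature and reference identifiers.
triples = faldo_region_triples("ex:site1", "ex:chr1", 1200, 1215)
for subject, predicate, obj in triples:
    print(subject, predicate, obj)
```

Because each position carries its own reference, the same pattern extends to features whose start and end are expressed against different sequences or with different certainty, which is the kind of complexity the abstract alludes to.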
MetalPDB in 2018: a database of metal sites in biological macromolecular structures.
Putignano, Valeria; Rosato, Antonio; Banci, Lucia; Andreini, Claudia
2018-01-04
MetalPDB (http://metalweb.cerm.unifi.it/) is a database providing information on metal-binding sites detected in the three-dimensional (3D) structures of biological macromolecules. MetalPDB represents such sites as 3D templates, called Minimal Functional Sites (MFSs), which describe the local environment around the metal(s) independently of the larger context of the macromolecular structure. The 2018 update of MetalPDB includes new contents and tools. A major extension is the inclusion of proteins whose structures do not contain metal ions although their sequences potentially contain a known MFS. In addition, MetalPDB now provides extensive statistical analyses addressing several aspects of general metal usage within the PDB, across protein families and in catalysis. Users can also query MetalPDB to extract statistical information on structural aspects associated with individual metals, such as preferred coordination geometries or aminoacidic environment. A further major improvement is the functional annotation of MFSs; the annotation is manually performed via a password-protected annotator interface. At present, ∼50% of all MFSs have such a functional annotation. Other noteworthy improvements are bulk query functionality, through the upload of a list of PDB identifiers, and ftp access to MetalPDB contents, allowing users to carry out in-depth analyses on their own computational infrastructure. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
A Graph Approach to Mining Biological Patterns in the Binding Interfaces.
Cheng, Wen; Yan, Changhui
2017-01-01
Protein-RNA interactions play important roles in biological systems. Searching for regular patterns in protein-RNA binding interfaces is important for understanding how protein and RNA recognize each other and bind to form a complex. Herein, we present a graph-mining method for discovering biological patterns in protein-RNA interfaces. We represented known protein-RNA interfaces using graphs and then discovered graph patterns enriched in the interfaces. Comparison of the discovered graph patterns with UniProt annotations showed that the graph patterns had a significant overlap with residue sites that had been proven crucial for RNA binding by experimental methods. Using 200 patterns as input features, a support vector machine method was able to classify protein surface patches into RNA-binding sites and non-RNA-binding sites with 84.0% accuracy and 88.9% precision. We built a simple scoring function that calculated the total number of the graph patterns that occurred in a protein-RNA interface. That scoring function was able to discriminate near-native protein-RNA complexes from docking decoys with a performance comparable with that of a state-of-the-art complex scoring function. Our work also revealed possible patterns that might be important for binding affinity.
Jian, Jhih-Wei; Elumalai, Pavadai; Pitti, Thejkiran; Wu, Chih Yuan; Tsai, Keng-Chang; Chang, Jeng-Yih; Peng, Hung-Pin; Yang, An-Suei
2016-01-01
Predicting ligand binding sites (LBSs) on protein structures, which are obtained either from experimental or computational methods, is a useful first step in functional annotation or structure-based drug design for the protein structures. In this work, the structure-based machine learning algorithm ISMBLab-LIG was developed to predict LBSs on protein surfaces with input attributes derived from the three-dimensional probability density maps of interacting atoms, which were reconstructed on the query protein surfaces and were relatively insensitive to local conformational variations of the tentative ligand binding sites. The prediction accuracy of the ISMBLab-LIG predictors is comparable to that of the best LBS predictors benchmarked on several well-established testing datasets. More importantly, the ISMBLab-LIG algorithm has substantial tolerance to the prediction uncertainties of computationally derived protein structure models. As such, the method is particularly useful for predicting LBSs not only on experimental protein structures without known LBS templates in the database but also on computationally predicted model protein structures with structural uncertainties in the tentative ligand binding sites. PMID:27513851
Identification of metal ion binding sites based on amino acid sequences.
Cao, Xiaoyong; Hu, Xiuzhen; Zhang, Xiaojin; Gao, Sujuan; Ding, Changjiang; Feng, Yonge; Bao, Weihua
2017-01-01
The identification of metal ion binding sites is important for protein function annotation and the design of new drug molecules. This study presents an effective method of analyzing and identifying the binding residues of metal ions based solely on sequence information. Ten metal ions were extracted from the BioLip database: Zn2+, Cu2+, Fe2+, Fe3+, Ca2+, Mg2+, Mn2+, Na+, K+ and Co2+. The analysis showed that Zn2+, Cu2+, Fe2+, Fe3+, and Co2+ were sensitive to the conservation of amino acids at binding sites, and promising results can be achieved using the Position Weight Scoring Matrix algorithm, with an accuracy of over 79.9% and a Matthews correlation coefficient of over 0.6. The binding sites of other metals can also be accurately identified using the Support Vector Machine algorithm with multifeature parameters as input. In addition, we found that Ca2+ was insensitive to hydrophobicity and hydrophilicity information and Mn2+ was insensitive to polarization charge information. An online server was constructed based on the framework of the proposed method and is freely available at http://60.31.198.140:8081/metal/HomePage/HomePage.html. PMID:28854211
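The two evaluation measures quoted in this abstract, accuracy and the Matthews correlation coefficient (MCC), can be computed from a binary confusion matrix as in the Python sketch below. The formulas are the standard definitions; the confusion-matrix counts are invented for illustration and are not results from the study.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Made-up counts for binding (positive) and non-binding (negative) residues.
tp, tn, fp, fn = 80, 85, 15, 20

print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
print(f"MCC      = {matthews_corrcoef(tp, tn, fp, fn):.3f}")
```

MCC is often reported next to accuracy for binding-site prediction because non-binding residues usually far outnumber binding residues, and accuracy alone can look high for a predictor that rarely predicts the positive class.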
Identification of Candidate Transcription Factor Binding Sites in the Cattle Genome
Bickhart, Derek M.; Liu, George E.
2013-01-01
A resource that provides candidate transcription factor binding sites (TFBSs) does not currently exist for cattle. Such data is necessary, as predicted sites may serve as excellent starting locations for future omics studies to develop transcriptional regulation hypotheses. In order to generate this resource, we employed a phylogenetic footprinting approach—using sequence conservation across cattle, human and dog—and position-specific scoring matrices to identify 379,333 putative TFBSs upstream of nearly 8000 Mammalian Gene Collection (MGC) annotated genes within the cattle genome. Comparisons of our predictions to known binding site loci within the PCK1, ACTA1 and G6PC promoter regions revealed 75% sensitivity for our method of discovery. Additionally, we intersected our predictions with known cattle SNP variants in dbSNP and on the Illumina BovineHD 770k and Bos 1 SNP chips, finding 7534, 444 and 346 overlaps, respectively. Due to our stringent filtering criteria, these results represent high quality predictions of putative TFBSs within the cattle genome. All binding site predictions are freely available at http://bfgl.anri.barc.usda.gov/BovineTFBS/ or http://199.133.54.77/BovineTFBS. PMID:23433959
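Scanning a sequence with a position-specific scoring matrix, one of the two ingredients named in this abstract, can be sketched in Python as follows. The three-column matrix, the integer scores, and the threshold are toy values chosen for illustration; the study's actual matrices, scoring scheme, and conservation filter are not described in the abstract and are not reproduced here.

```python
def scan_pssm(sequence, pssm, threshold):
    """Slide a PSSM along a sequence and return windows scoring >= threshold.

    pssm: list of per-position dicts mapping base -> score.
    Assumes the sequence contains only the uppercase bases A, C, G and T.
    Returns a list of (start, window, score) tuples with 0-based starts.
    """
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(column[base] for column, base in zip(pssm, window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


# Toy matrix favouring the motif "ACG" (hypothetical scores).
toy_pssm = [
    {"A": 2, "C": -1, "G": -1, "T": -1},
    {"A": -1, "C": 2, "G": -1, "T": -1},
    {"A": -1, "C": -1, "G": 2, "T": -1},
]

print(scan_pssm("TTACGTACG", toy_pssm, threshold=6))
```

A phylogenetic footprinting step would then keep only those hits that fall in regions conserved across the aligned species, which is one way such predictions are made more stringent.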
Hu, Xiao-Qian; Guo, Peng-Chao; Ma, Jin-Di; Li, Wei-Fang
2013-11-01
The primary role of yeast Ara1, previously mis-annotated as a D-arabinose dehydrogenase, is to catalyze the reduction of a variety of toxic α,β-dicarbonyl compounds using NADPH as a cofactor at physiological pH levels. Here, crystal structures of Ara1 in apo and NADPH-complexed forms are presented at 2.10 and 2.00 Å resolution, respectively. Ara1 exists as a homodimer, each subunit of which adopts an (α/β)8-barrel structure and has a highly conserved cofactor-binding pocket. Structural comparison revealed that induced fit upon NADPH binding yielded an intact active-site pocket that recognizes the substrate. Moreover, the crystal structures combined with computational simulation defined an open substrate-binding site to accommodate various substrates that possess a dicarbonyl group.
BioSAVE: display of scored annotation within a sequence context.
Pollock, Richard F; Adryan, Boris
2008-03-20
Visualization of sequence annotation is a common feature in many bioinformatics tools. For many applications it is desirable to restrict the display of such annotation according to a score cutoff, as biological interpretation can be difficult in the presence of the entire data. Unfortunately, many visualisation solutions are somewhat static in the way they handle such score cutoffs. We present BioSAVE, a sequence annotation viewer with on-the-fly selection of visualisation thresholds for each feature. BioSAVE is a versatile OS X program for visual display of scored features (annotation) within a sequence context. The program reads sequence and additional supplementary annotation data (e.g., position weight matrix matches, conservation scores, structural domains) from a variety of commonly used file formats and displays them graphically. Onscreen controls then allow for live customisation of these graphics, including on-the-fly selection of visualisation thresholds for each feature. Possible applications of the program include display of transcription factor binding sites in a genomic context or the visualisation of structural domain assignments in protein sequences and many more. The dynamic visualisation of these annotations is useful, e.g., for the determination of cutoff values of predicted features to match experimental data. Program, source code and exemplary files are freely available at the BioSAVE homepage. PMID:18366701
Hwang, Hun-Way; Park, Christopher Y.; Goodarzi, Hani; Fak, John J.; Mele, Aldo; Moore, Michael J.; Saito, Yuhki; Darnell, Robert B.
2016-01-01
Accurate and precise annotation of the 3′ untranslated regions (3′ UTRs) is critical in understanding how mRNAs are regulated by microRNAs (miRNAs) and RNA-binding proteins (RBPs). Here we describe a method, PAPERCLIP (Poly(A) binding Protein-mediated mRNA 3′ End Retrieval by CrossLinking ImmunoPrecipitation), which shows high specificity for the mRNA 3′ ends and compares favorably to existing 3′ end mapping methods. PAPERCLIP uncovers a previously unrecognized role of CstF64/64tau in promoting the usage of a selected group of non-canonical poly(A) sites, the majority of them containing a downstream GUKKU motif. Furthermore, in mouse brain, PAPERCLIP discovers extended 3′ UTR sequences harboring functional miRNA binding sites and reveals developmentally regulated APA shifts including one in Atp2b2 that is evolutionarily conserved in human and results in a gain of a functional binding site of miR-137. PAPERCLIP provides a powerful tool to decipher post-transcriptional regulation of mRNAs through APA in vivo. PMID:27050522
Nucleosome regulatory dynamics in response to TGFβ
Enroth, Stefan; Andersson, Robin; Bysani, Madhusudhan; Wallerman, Ola; Termén, Stefan; Tuch, Brian B.; De La Vega, Francisco M.; Heldin, Carl-Henrik; Moustakas, Aristidis; Komorowski, Jan; Wadelius, Claes
2014-01-01
Nucleosomes play important roles in a cell beyond their basal functionality in chromatin compaction. Their placement affects all steps in transcriptional regulation, from transcription factor (TF) binding to messenger ribonucleic acid (mRNA) synthesis. Careful profiling of their locations and dynamics in response to stimuli is important to further our understanding of transcriptional regulation by the state of chromatin. We measured nucleosome occupancy in human hepatic cells before and after treatment with transforming growth factor beta 1 (TGFβ1), using massively parallel sequencing. With a newly developed method, SuMMIt, for precise positioning of nucleosomes we inferred dynamics of the nucleosomal landscape. Distinct nucleosome positioning has previously been described at transcription start sites and flanking TF binding sites. We found that the average pattern is present at very few sites and, in the case of TF binding, the double peak surrounding the sites is just an artifact of averaging over many loci. We systematically searched for depleted nucleosomes in stimulated cells compared to unstimulated cells and identified 24 318 loci. Depending on genomic annotation, 44–78% of them were over-represented in binding motifs for TFs. Changes in binding affinity were verified for HNF4α by qPCR. Strikingly, many of these loci were associated with expression changes, as measured by RNA sequencing. PMID:24771338
Gilad, Yoav; Pritchard, Jonathan K.; Stephens, Matthew
2015-01-01
Understanding global gene regulation depends critically on accurate annotation of regulatory elements that are functional in a given cell type. CENTIPEDE, a powerful, probabilistic framework for identifying transcription factor binding sites from tissue-specific DNase I cleavage patterns and genomic sequence content, leverages the hypersensitivity of factor-bound chromatin and the information in the DNase I spatial cleavage profile characteristic of each DNA binding protein to accurately infer functional factor binding sites. However, the model for the spatial profile in this framework fails to account for the substantial variation in the DNase I cleavage profiles across different binding sites. Neither does it account for variation in the profiles at the same binding site across multiple replicate DNase I experiments, which are increasingly available. In this work, we introduce new methods, based on multi-scale models for inhomogeneous Poisson processes, to account for such variation in DNase I cleavage patterns both within and across binding sites. These models account for the spatial structure in the heterogeneity in DNase I cleavage patterns for each factor. Using DNase-seq measurements assayed in a lymphoblastoid cell line, we demonstrate the improved performance of this model for several transcription factors by comparing against the ChIP-seq peaks for those factors. Finally, we explore the effects of DNase I sequence bias on inference of factor binding using a simple extension to our framework that allows for a more flexible background model. The proposed model can also be easily applied to paired-end ATAC-seq and DNase-seq data. msCentipede, a Python implementation of our algorithm, is available at http://rajanil.github.io/msCentipede. PMID:26406244
Raj, Anil; Shim, Heejung; Gilad, Yoav; Pritchard, Jonathan K; Stephens, Matthew
2015-01-01
Understanding global gene regulation depends critically on accurate annotation of regulatory elements that are functional in a given cell type. CENTIPEDE, a powerful, probabilistic framework for identifying transcription factor binding sites from tissue-specific DNase I cleavage patterns and genomic sequence content, leverages the hypersensitivity of factor-bound chromatin and the information in the DNase I spatial cleavage profile characteristic of each DNA binding protein to accurately infer functional factor binding sites. However, the model for the spatial profile in this framework fails to account for the substantial variation in the DNase I cleavage profiles across different binding sites. Neither does it account for variation in the profiles at the same binding site across multiple replicate DNase I experiments, which are increasingly available. In this work, we introduce new methods, based on multi-scale models for inhomogeneous Poisson processes, to account for such variation in DNase I cleavage patterns both within and across binding sites. These models account for the spatial structure in the heterogeneity in DNase I cleavage patterns for each factor. Using DNase-seq measurements assayed in a lymphoblastoid cell line, we demonstrate the improved performance of this model for several transcription factors by comparing against the ChIP-seq peaks for those factors. Finally, we explore the effects of DNase I sequence bias on inference of factor binding using a simple extension to our framework that allows for a more flexible background model. The proposed model can also be easily applied to paired-end ATAC-seq and DNase-seq data. msCentipede, a Python implementation of our algorithm, is available at http://rajanil.github.io/msCentipede.
Pattern similarity study of functional sites in protein sequences: lysozymes and cystatins
Nakai, Shuryo; Li-Chan, Eunice CY; Dou, Jinglie
2005-01-01
Background Although it is generally agreed that topography is more conserved than sequences, proteins sharing the same fold can have different functions, while there are protein families with low sequence similarity. An alternative method for profile analysis of characteristic conserved positions of the motifs within the 3D structures may be needed for functional annotation of protein sequences. Using the approach of quantitative structure-activity relationships (QSAR), we have proposed a new algorithm for postulating functional mechanisms on the basis of pattern similarity and average of property values of side-chains in segments within sequences. This approach was used to search for functional sites of proteins belonging to the lysozyme and cystatin families. Results Hydrophobicity and β-turn propensity of reference segments with 3–7 residues were used for the homology similarity search (HSS) for active sites. Hydrogen bonding was used as the side-chain property for searching the binding sites of lysozymes. The profiles of similarity constants and average values of these parameters as functions of their positions in the sequences could identify both active and substrate binding sites of the lysozyme of Streptomyces coelicolor, which has been reported as a new fold enzyme (Cellosyl). The same approach was successfully applied to cystatins, especially for postulating the mechanisms of amyloidosis of human cystatin C as well as human lysozyme. Conclusion Pattern similarity and average index values of structure-related properties of side chains in short segments of three residues or longer were, for the first time, successfully applied for predicting functional sites in sequences. This new approach may be applicable to studying functional sites in un-annotated proteins, for which complete 3D structures are not yet available. PMID:15904486
Biological and functional relevance of CASP predictions
Liu, Tianyun; Ish‐Shalom, Shirbi; Torng, Wen; Lafita, Aleix; Bock, Christian; Mort, Matthew; Cooper, David N; Bliven, Spencer; Capitani, Guido; Mooney, Sean D.
2017-01-01
Abstract Our goal is to answer the question: compared with experimental structures, how useful are predicted models for functional annotation? We assessed the functional utility of predicted models by comparing the performances of a suite of methods for functional characterization on the predictions and the experimental structures. We identified 28 sites in 25 protein targets to perform functional assessment. These 28 sites included nine sites with known ligand binding (holo‐sites), nine sites that are expected or suggested by experimental authors for small molecule binding (apo‐sites), and ten sites containing important motifs, loops, or key residues with important disease‐associated mutations. We evaluated the utility of the predictions by comparing their microenvironments to the experimental structures. Overall structural quality correlates with functional utility. However, the best‐ranked predictions (global) may not have the best functional quality (local). Our assessment provides an ability to discriminate between predictions with high structural quality. When assessing ligand‐binding sites, most prediction methods have higher performance on apo‐sites than holo‐sites. Some servers show consistently high performance for certain types of functional sites. Finally, many functional sites are associated with protein‐protein interaction. We also analyzed biologically relevant features from the protein assemblies of two targets where the active site spanned the protein‐protein interface. For the assembly targets, we find that the features in the models are mainly determined by the choice of template. PMID:28975675
The Innate Immune Database (IIDB)
Korb, Martin; Rust, Aistair G; Thorsson, Vesteinn; Battail, Christophe; Li, Bin; Hwang, Daehee; Kennedy, Kathleen A; Roach, Jared C; Rosenberger, Carrie M; Gilchrist, Mark; Zak, Daniel; Johnson, Carrie; Marzolf, Bruz; Aderem, Alan; Shmulevich, Ilya; Bolouri, Hamid
2008-01-01
Background As part of a National Institute of Allergy and Infectious Diseases funded collaborative project, we have performed over 150 microarray experiments measuring the response of C57/BL6 mouse bone marrow macrophages to toll-like receptor stimuli. These microarray expression profiles are available freely from our project web site . Here, we report the development of a database of computationally predicted transcription factor binding sites and related genomic features for a set of over 2000 murine immune genes of interest. Our database, which includes microarray co-expression clusters and a host of web-based query, analysis and visualization facilities, is available freely via the internet. It provides a broad resource to the research community, and a stepping stone towards the delineation of the network of transcriptional regulatory interactions underlying the integrated response of macrophages to pathogens. Description We constructed a database indexed on genes and annotations of the immediate surrounding genomic regions. To facilitate both gene-specific and systems biology oriented research, our database provides the means to analyze individual genes or an entire genomic locus. Although our focus to date has been on mammalian toll-like receptor signaling pathways, our database structure is not limited to this subject, and is intended to be broadly applicable to immunology. By focusing on selected immune-active genes, we were able to perform computationally intensive expression and sequence analyses that would currently be prohibitive if applied to the entire genome. Using six complementary computational algorithms and methodologies, we identified transcription factor binding sites based on the Position Weight Matrices available in TRANSFAC. For one example transcription factor (ATF3) for which experimental data is available, over 50% of our predicted binding sites coincide with genome-wide chromatin immunoprecipitation (ChIP-chip) results.
Our database can be interrogated via a web interface. Genomic annotations and binding site predictions can be automatically viewed with a customized version of the Argo genome browser. Conclusion We present the Innate Immune Database (IIDB) as a community resource for immunologists interested in gene regulatory systems underlying innate responses to pathogens. The database website can be freely accessed at . PMID:18321385
Kirshner, Daniel A.; Nilmeier, Jerome P.; Lightstone, Felice C.
2013-01-01
The catalytic site identification web server provides the innovative capability to find structural matches to a user-specified catalytic site among all Protein Data Bank proteins rapidly (in less than a minute). The server also can examine a user-specified protein structure or model to identify structural matches to a library of catalytic sites. Finally, the server provides a database of pre-calculated matches between all Protein Data Bank proteins and the library of catalytic sites. The database has been used to derive a set of hypothesized novel enzymatic function annotations. In all cases, matches and putative binding sites (protein structure and surfaces) can be visualized interactively online. The website can be accessed at http://catsid.llnl.gov. PMID:23680785
Kirshner, Daniel A; Nilmeier, Jerome P; Lightstone, Felice C
2013-07-01
The catalytic site identification web server provides the innovative capability to find structural matches to a user-specified catalytic site among all Protein Data Bank proteins rapidly (in less than a minute). The server also can examine a user-specified protein structure or model to identify structural matches to a library of catalytic sites. Finally, the server provides a database of pre-calculated matches between all Protein Data Bank proteins and the library of catalytic sites. The database has been used to derive a set of hypothesized novel enzymatic function annotations. In all cases, matches and putative binding sites (protein structure and surfaces) can be visualized interactively online. The website can be accessed at http://catsid.llnl.gov.
Identification of widespread adenosine nucleotide binding in Mycobacterium tuberculosis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ansong, Charles; Ortega, Corrie; Payne, Samuel H.
The annotation of protein function is almost completely performed by in silico approaches. However, computational prediction of protein function is frequently incomplete and error prone. In Mycobacterium tuberculosis (Mtb), ~25% of all genes have no predicted function and are annotated as hypothetical proteins. This lack of functional information severely limits our understanding of Mtb pathogenicity. Current tools for experimental functional annotation are limited and often do not scale to entire protein families. Here, we report a generally applicable chemical biology platform to functionally annotate bacterial proteins by combining activity-based protein profiling (ABPP) and quantitative LC-MS-based proteomics. As an example of this approach for high-throughput protein functional validation and discovery, we experimentally annotate the families of ATP-binding proteins in Mtb. Our data experimentally validate prior in silico predictions of >250 ATPases and adenosine nucleotide-binding proteins, and reveal 73 hypothetical proteins as novel ATP-binding proteins. We identify adenosine cofactor interactions with many hypothetical proteins containing a diversity of unrelated sequences, providing a new and expanded view of adenosine nucleotide binding in Mtb. Furthermore, many of these hypothetical proteins are both unique to Mycobacteria and essential for infection, suggesting specialized functions in mycobacterial physiology and pathogenicity. Thus, we provide a generally applicable approach for high throughput protein function discovery and validation, and highlight several ways in which application of activity-based proteomics data can improve the quality of functional annotations to facilitate novel biological insights.
PFAAT version 2.0: a tool for editing, annotating, and analyzing multiple sequence alignments.
Caffrey, Daniel R; Dana, Paul H; Mathur, Vidhya; Ocano, Marco; Hong, Eun-Jong; Wang, Yaoyu E; Somaroo, Shyamal; Caffrey, Brian E; Potluri, Shobha; Huang, Enoch S
2007-10-11
By virtue of their shared ancestry, homologous sequences are similar in their structure and function. Consequently, multiple sequence alignments are routinely used to identify trends that relate to function. This type of analysis is particularly productive when it is combined with structural and phylogenetic analysis. Here we describe the release of PFAAT version 2.0, a tool for editing, analyzing, and annotating multiple sequence alignments. Support for multiple annotations is a key component of this release as it provides a framework for most of the new functionalities. The sequence annotations are accessible from the alignment and tree, where they are typically used to label sequences or hyperlink them to related databases. Sequence annotations can be created manually or extracted automatically from UniProt entries. Once a multiple sequence alignment is populated with sequence annotations, sequences can be easily selected and sorted through a sophisticated search dialog. The selected sequences can be further analyzed using statistical methods that explicitly model relationships between the sequence annotations and residue properties. Residue annotations are accessible from the alignment viewer and are typically used to designate binding sites or properties for a particular residue. Residue annotations are also searchable, and allow one to quickly select alignment columns for further sequence analysis, e.g. computing percent identities. Other features include: novel algorithms to compute sequence conservation, mapping conservation scores to a 3D structure in Jmol, displaying secondary structure elements, and sorting sequences by residue composition. PFAAT provides a framework whereby end-users can specify knowledge for a protein family in the form of annotation. The annotations can be combined with sophisticated analysis to test hypotheses that relate to sequence, structure and function.
The Biomolecular Interaction Network Database and related tools 2005 update
Alfarano, C.; Andrade, C. E.; Anthony, K.; Bahroos, N.; Bajec, M.; Bantoft, K.; Betel, D.; Bobechko, B.; Boutilier, K.; Burgess, E.; Buzadzija, K.; Cavero, R.; D'Abreo, C.; Donaldson, I.; Dorairajoo, D.; Dumontier, M. J.; Dumontier, M. R.; Earles, V.; Farrall, R.; Feldman, H.; Garderman, E.; Gong, Y.; Gonzaga, R.; Grytsan, V.; Gryz, E.; Gu, V.; Haldorsen, E.; Halupa, A.; Haw, R.; Hrvojic, A.; Hurrell, L.; Isserlin, R.; Jack, F.; Juma, F.; Khan, A.; Kon, T.; Konopinsky, S.; Le, V.; Lee, E.; Ling, S.; Magidin, M.; Moniakis, J.; Montojo, J.; Moore, S.; Muskat, B.; Ng, I.; Paraiso, J. P.; Parker, B.; Pintilie, G.; Pirone, R.; Salama, J. J.; Sgro, S.; Shan, T.; Shu, Y.; Siew, J.; Skinner, D.; Snyder, K.; Stasiuk, R.; Strumpf, D.; Tuekam, B.; Tao, S.; Wang, Z.; White, M.; Willis, R.; Wolting, C.; Wong, S.; Wrong, A.; Xin, C.; Yao, R.; Yates, B.; Zhang, S.; Zheng, K.; Pawson, T.; Ouellette, B. F. F.; Hogue, C. W. V.
2005-01-01
The Biomolecular Interaction Network Database (BIND) (http://bind.ca) archives biomolecular interaction, reaction, complex and pathway information. Our aim is to curate the details about molecular interactions that arise from published experimental research and to provide this information, as well as tools to enable data analysis, freely to researchers worldwide. BIND data are curated into a comprehensive machine-readable archive of computable information and provide users with methods to discover interactions and molecular mechanisms. BIND has worked to develop new methods for visualization that amplify the underlying annotation of genes and proteins to facilitate the study of molecular interaction networks. BIND has maintained an open database policy since its inception in 1999. Data growth has proceeded at a tremendous rate, approaching over 100 000 records. New services provided include a new BIND Query and Submission interface, a Standard Object Access Protocol service and the Small Molecule Interaction Database (http://smid.blueprint.org) that allows users to determine probable small molecule binding sites of new sequences and examine conserved binding residues. PMID:15608229
Regad, Leslie; Martin, Juliette; Camproux, Anne-Claude
2011-06-20
One of the strategies for protein function annotation is to search for particular structural motifs that are known to be shared by proteins with a given function. Here, we present a systematic extraction of structural motifs of seven residues from protein loops and we explore their correspondence with functional sites. Our approach is based on the structural alphabet HMM-SA (Hidden Markov Model - Structural Alphabet), which allows simplification of protein structures into uni-dimensional sequences, and advanced pattern statistics adapted to short sequences. Structural motifs of interest are selected by looking for structural motifs significantly over-represented in SCOP superfamilies in protein loops. We discovered two types of structural motifs significantly over-represented in SCOP superfamilies: (i) ubiquitous motifs, shared by several superfamilies and (ii) superfamily-specific motifs, over-represented in few superfamilies. A comparison of ubiquitous words with known small structural motifs shows that they contain well-described motifs such as turn, niche or nest motifs. A comparison between superfamily-specific motifs and biological annotations of Swiss-Prot reveals that some of them actually correspond to functional sites involved in the binding sites of small ligands, such as ATP/GTP, NAD(P) and SAH/SAM. Our findings show that statistical over-representation in SCOP superfamilies is linked to functional features. The detection of over-represented motifs within structures simplified by HMM-SA is therefore a promising approach for prediction of functional sites and annotation of uncharacterized proteins.
2011-01-01
Background One of the strategies for protein function annotation is to search for particular structural motifs that are known to be shared by proteins with a given function. Results Here, we present a systematic extraction of structural motifs of seven residues from protein loops and we explore their correspondence with functional sites. Our approach is based on the structural alphabet HMM-SA (Hidden Markov Model - Structural Alphabet), which allows simplification of protein structures into uni-dimensional sequences, and advanced pattern statistics adapted to short sequences. Structural motifs of interest are selected by looking for structural motifs significantly over-represented in SCOP superfamilies in protein loops. We discovered two types of structural motifs significantly over-represented in SCOP superfamilies: (i) ubiquitous motifs, shared by several superfamilies and (ii) superfamily-specific motifs, over-represented in few superfamilies. A comparison of ubiquitous words with known small structural motifs shows that they contain well-described motifs such as turn, niche or nest motifs. A comparison between superfamily-specific motifs and biological annotations of Swiss-Prot reveals that some of them actually correspond to functional sites involved in the binding sites of small ligands, such as ATP/GTP, NAD(P) and SAH/SAM. Conclusions Our findings show that statistical over-representation in SCOP superfamilies is linked to functional features. The detection of over-represented motifs within structures simplified by HMM-SA is therefore a promising approach for prediction of functional sites and annotation of uncharacterized proteins. PMID:21689388
Biological and functional relevance of CASP predictions.
Liu, Tianyun; Ish-Shalom, Shirbi; Torng, Wen; Lafita, Aleix; Bock, Christian; Mort, Matthew; Cooper, David N; Bliven, Spencer; Capitani, Guido; Mooney, Sean D; Altman, Russ B
2018-03-01
Our goal is to answer the question: compared with experimental structures, how useful are predicted models for functional annotation? We assessed the functional utility of predicted models by comparing the performances of a suite of methods for functional characterization on the predictions and the experimental structures. We identified 28 sites in 25 protein targets to perform functional assessment. These 28 sites included nine sites with known ligand binding (holo-sites), nine sites that are expected or suggested by experimental authors for small molecule binding (apo-sites), and ten sites containing important motifs, loops, or key residues with important disease-associated mutations. We evaluated the utility of the predictions by comparing their microenvironments to the experimental structures. Overall structural quality correlates with functional utility. However, the best-ranked predictions (global) may not have the best functional quality (local). Our assessment provides an ability to discriminate between predictions with high structural quality. When assessing ligand-binding sites, most prediction methods have higher performance on apo-sites than holo-sites. Some servers show consistently high performance for certain types of functional sites. Finally, many functional sites are associated with protein-protein interaction. We also analyzed biologically relevant features from the protein assemblies of two targets where the active site spanned the protein-protein interface. For the assembly targets, we find that the features in the models are mainly determined by the choice of template. © 2017 The Authors Proteins: Structure, Function and Bioinformatics Published by Wiley Periodicals, Inc.
Bhagavat, Raghu; Sankar, Santhosh; Srinivasan, Narayanaswamy; Chandra, Nagasuma
2018-03-06
Protein-ligand interactions form the basis of most cellular events. Identifying ligand binding pockets in proteins will greatly facilitate rationalizing and predicting protein function. Ligand binding sites are unknown for many proteins of known three-dimensional (3D) structure, creating a gap in our understanding of protein structure-function relationships. To bridge this gap, we detect pockets in proteins of known 3D structures, using computational techniques. This augmented pocketome (PocketDB) consists of 249,096 pockets, which is about seven times larger than what is currently known. We deduce possible ligand associations for about 46% of the newly identified pockets. The augmented pocketome, when subjected to clustering based on similarities among pockets, yielded 2,161 site types, which are associated with 1,037 ligand types, together providing fold-site-type-ligand-type associations. The PocketDB resource facilitates a structure-based function annotation, delineation of the structural basis of ligand recognition, and provides functional clues for domains of unknown functions, allosteric proteins, and druggable pockets. Copyright © 2018 Elsevier Ltd. All rights reserved.
G-LoSA for Prediction of Protein-Ligand Binding Sites and Structures.
Lee, Hui Sun; Im, Wonpil
2017-01-01
Recent advances in high-throughput structure determination and computational protein structure prediction have significantly enriched the universe of protein structure. However, there is still a large gap between the number of available protein structures and the number of proteins whose function is annotated with high accuracy. Computational structure-based protein function prediction has emerged to reduce this knowledge gap. The identification of a ligand binding site and its structure is critical to the determination of a protein's molecular function. We present a computational methodology for predicting small molecule ligand binding sites and ligand structures using G-LoSA, our protein local structure alignment and similarity measurement tool. All the computational procedures described here can be easily implemented using G-LoSA Toolkit, a package of standalone software programs and preprocessed PDB structure libraries. G-LoSA and G-LoSA Toolkit are freely available to academic users at http://compbio.lehigh.edu/GLoSA. We also present a case study to show the potential of our template-based approach harnessing G-LoSA for protein function prediction.
Dictionary-driven protein annotation.
Rigoutsos, Isidore; Huynh, Tien; Floratos, Aris; Parida, Laxmi; Platt, Daniel
2002-09-01
Computational methods seeking to automatically determine the properties (functional, structural, physicochemical, etc.) of a protein directly from the sequence have long been the focus of numerous research groups. With the advent of advanced sequencing methods and systems, the number of amino acid sequences that are being deposited in the public databases has been increasing steadily. This has in turn generated a renewed demand for automated approaches that can annotate individual sequences and complete genomes quickly, exhaustively and objectively. In this paper, we present one such approach that is centered around and exploits the Bio-Dictionary, a collection of amino acid patterns that completely covers the natural sequence space and can capture functional and structural signals that have been reused during evolution, within and across protein families. Our annotation approach also makes use of a weighted, position-specific scoring scheme that is unaffected by the over-representation of well-conserved proteins and protein fragments in the databases used. For a given query sequence, the method permits one to determine, in a single pass, the following: local and global similarities between the query and any protein already present in a public database; the likeness of the query to all available archaeal/ bacterial/eukaryotic/viral sequences in the database as a function of amino acid position within the query; the character of secondary structure of the query as a function of amino acid position within the query; the cytoplasmic, transmembrane or extracellular behavior of the query; the nature and position of binding domains, active sites, post-translationally modified sites, signal peptides, etc. In terms of performance, the proposed method is exhaustive, objective and allows for the rapid annotation of individual sequences and full genomes. 
Annotation examples are presented and discussed in Results, including individual queries and complete genomes that were released publicly after we built the Bio-Dictionary that is used in our experiments. Finally, we have computed the annotations of more than 70 complete genomes and made them available on the World Wide Web at http://cbcsrv.watson.ibm.com/Annotations/.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Giulliani, S. E.; Frank, A. E.; Collart, F. R.
2008-12-08
We have used a fluorescence-based thermal shift (FTS) assay to identify amino acids that bind to solute-binding proteins in the bacterial ABC transporter family. The assay was validated with a set of six proteins with known binding specificity and was consistently able to map proteins with their known binding ligands. The assay also identified additional candidate binding ligands for several of the amino acid-binding proteins in the validation set. We extended this approach to additional targets and demonstrated the ability of the FTS assay to unambiguously identify preferential binding for several homologues of amino acid-binding proteins with known specificity and to functionally annotate proteins of unknown binding specificity. The assay is implemented in a microwell plate format and provides a rapid approach to validate an anticipated function or to screen proteins of unknown function. The ABC-type transporter family is ubiquitous and transports a variety of biological compounds, but the current annotation of the ligand-binding proteins is limited to mostly generic descriptions of function. The results illustrate the feasibility of the FTS assay to improve the functional annotation of binding proteins associated with ABC-type transporters and suggest that this approach can also be extended to other protein families.
Bioinformatics approaches to predict target genes from transcription factor binding data.
Essebier, Alexandra; Lamprecht, Marnie; Piper, Michael; Bodén, Mikael
2017-12-01
Transcription factors regulate gene expression and play an essential role in development by maintaining proliferative states, driving cellular differentiation and determining cell fate. Transcription factors are capable of regulating multiple genes over potentially long distances making target gene identification challenging. Currently available experimental approaches to detect distal interactions have multiple weaknesses that have motivated the development of computational approaches. Although an improvement over experimental approaches, existing computational approaches are still limited in their application, with different weaknesses depending on the approach. Here, we review computational approaches with a focus on data dependency, cell type specificity and usability. With the aim of identifying transcription factor target genes, we apply available approaches to typical transcription factor experimental datasets. We show that approaches are not always capable of annotating all transcription factor binding sites; binding sites should be treated disparately; and a combination of approaches can increase the biological relevance of the set of genes identified as targets. Copyright © 2017 Elsevier Inc. All rights reserved.
Bromberg, Yana; Yachdav, Guy; Ofran, Yanay; Schneider, Reinhard; Rost, Burkhard
2009-05-01
The rapidly increasing quantity of protein sequence data continues to widen the gap between available sequences and annotations. Comparative modeling suggests some aspects of the 3D structures of approximately half of all known proteins; homology- and network-based inferences annotate some aspect of function for a similar fraction of the proteome. For most known protein sequences, however, there is detailed knowledge about neither their function nor their structure. Comprehensive efforts towards the expert curation of sequence annotations have failed to meet the demand of the rapidly increasing number of available sequences. Only the automated prediction of protein function in the absence of homology can close the gap between available sequences and annotations in the foreseeable future. This review focuses on two novel methods for automated annotation, and briefly presents an outlook on how modern web software may revolutionize the field of protein sequence annotation. First, predictions of protein binding sites and functional hotspots, and the evolution of these into the most successful type of prediction of protein function from sequence will be discussed. Second, a new tool, comprehensive in silico mutagenesis, which contributes important novel predictions of function and at the same time prepares for the onset of the next sequencing revolution, will be described. While these two new sub-fields of protein prediction represent the breakthroughs that have been achieved methodologically, it will then be argued that a different development might further change the way biomedical researchers benefit from annotations: modern web software can connect the worldwide web in any browser with the 'Deep Web' (ie, proprietary data resources). The availability of this direct connection, and the resulting access to a wealth of data, may impact drug discovery and development more than any existing method that contributes to protein annotation.
Efficient Integrative Multi-SNP Association Analysis via Deterministic Approximation of Posteriors.
Wen, Xiaoquan; Lee, Yeji; Luca, Francesca; Pique-Regi, Roger
2016-06-02
With the increasing availability of functional genomic data, incorporating genomic annotations into genetic association analysis has become a standard procedure. However, the existing methods often lack rigor and/or computational efficiency and consequently do not maximize the utility of functional annotations. In this paper, we propose a rigorous inference procedure to perform integrative association analysis incorporating genomic annotations for both traditional GWASs and emerging molecular QTL mapping studies. In particular, we propose an algorithm, named deterministic approximation of posteriors (DAP), which enables highly efficient and accurate joint enrichment analysis and identification of multiple causal variants. We use a series of simulation studies to highlight the power and computational efficiency of our proposed approach and further demonstrate it by analyzing the cross-population eQTL data from the GEUVADIS project and the multi-tissue eQTL data from the GTEx project. In particular, we find that genetic variants predicted to disrupt transcription factor binding sites are enriched in cis-eQTLs across all tissues. Moreover, the enrichment estimates obtained across the tissues are correlated with the cell types for which the annotations are derived. Copyright © 2016 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Liu, Shijia; Shao, Shangjin; Li, Linlin; Cheng, Zhi; Tian, Li; Gao, Peiji; Wang, Lushan
2015-12-11
Chitinases and chitosanases, referred to as chitinolytic enzymes, are two important categories of glycoside hydrolases (GH) that play a key role in degrading chitin and chitosan, two naturally abundant polysaccharides. Here, we investigate the active site architecture of the major chitosanase (GH8, GH46) and chitinase families (GH18, GH19). Both charged (Glu, His, Arg, Asp) and aromatic amino acids (Tyr, Trp, Phe) are observed with higher frequency within chitinolytic active sites as compared to elsewhere in the enzyme structure, indicating significant roles related to enzyme function. Hydrogen bonds between chitinolytic enzymes and the substrate C2 functional groups, i.e. amino groups and N-acetyl groups, drive substrate recognition, while non-specific CH-π interactions between aromatic residues and substrate mainly contribute to tighter binding and enhanced processivity evident in GH8 and GH18 enzymes. For different families of chitinolytic enzymes, the number, type, and position of substrate atoms bound in the active site vary, resulting in different substrate-binding specificities. The data presented here explain the synergistic action of multiple enzyme families at a molecular level and provide a more reasonable method for functional annotation, which can be further applied toward the practical engineering of chitinases and chitosanases. Copyright © 2015 Elsevier Ltd. All rights reserved.
Massive GGAAs in genomic repetitive sequences serve as a nuclear reservoir of NF-κB.
Wu, Jian; Wang, Qiao; Dai, Wei; Wang, Wei; Yue, Ming; Wang, Jinke
2018-04-13
Nuclear factor κB (NF-κB) is a DNA-binding transcription factor. Characterizing its genomic binding sites is crucial for understanding its gene regulatory function and mechanism in cells. This study characterized the binding sites of NF-κB RelA/p65 in tumor necrosis factor-α (TNFα)-stimulated HeLa cells by precise chromatin immunoprecipitation-sequencing (ChIP-seq). The results revealed that NF-κB binds nontraditional motifs (nt-motifs) containing a conserved GGAA quadruplet. Moreover, nt-motifs are mainly distributed in peaks near centromeres that contain a larger number of repetitive elements such as satellites, simple repeats and short interspersed nuclear elements (SINEs). This intracellular binding pattern was then confirmed by in vitro detection, indicating that NF-κB dimers can bind the nontraditional κB (nt-κB) sites with low affinity. However, this binding hardly activates transcription. This study thus deduced that NF-κB binding to nt-motifs may serve functions other than the gene regulation carried out by NF-κB binding to traditional motifs (t-motifs). To test this deduction, ChIP-seq data from many other cell lines were then analyzed. The results indicate that NF-κB binding to nt-motifs is also widely present in other cells. The ChIP-seq data analysis also revealed that nt-motifs are more widely distributed in peaks with low fold enrichment. Importantly, it was also found that NF-κB binding to nt-motifs is mainly present in resting cells, whereas NF-κB binding to t-motifs is mainly present in stimulated cells. Astonishingly, no known function was enriched in the gene annotation of nt-motif peaks. Based on these results, this study proposed that the nt-κB sites that are extensively distributed in large numbers of repeat elements function as a nuclear reservoir of NF-κB. The nuclear NF-κB proteins stored at nt-κB sites in resting cells may be recruited to t-κB sites to regulate target genes upon stimulation.
Copyright © 2018 Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, and Genetics Society of China. Published by Elsevier Ltd. All rights reserved.
NaviSE: superenhancer navigator integrating epigenomics signal algebra.
Ascensión, Alex M; Arrospide-Elgarresta, Mikel; Izeta, Ander; Araúzo-Bravo, Marcos J
2017-06-06
Superenhancers are crucial structural genomic elements determining cell fate, and they are also involved in the determination of several diseases, such as cancer or neurodegeneration. Although there are pipelines which use independent pieces of software to predict the presence of superenhancers from genome-wide chromatin marks or DNA-interaction protein binding sites, there is not yet an integrated software tool that processes automatically algebra combinations of raw data sequencing into a comprehensive final annotated report of predicted superenhancers. We have developed NaviSE, a user-friendly streamlined tool which performs a fully-automated parallel processing of genome-wide epigenomics data from sequencing files into a final report, built with a comprehensive set of annotated files that are navigated through a graphic user interface dynamically generated by NaviSE. NaviSE also implements an 'epigenomics signal algebra' that allows the combination of multiple activation and repression epigenomics signals. NaviSE provides an interactive chromosomal landscaping of the locations of superenhancers, which can be navigated to obtain annotated information about superenhancer signal profile, associated genes, gene ontology enrichment analysis, motifs of transcription factor binding sites enriched in superenhancers, graphs of the metrics evaluating the superenhancers quality, protein-protein interaction networks and enriched metabolic pathways among other features. We have parallelised the most time-consuming tasks achieving a reduction up to 30% for a 15 CPUs machine. We have optimized the default parameters of NaviSE to facilitate its use. NaviSE allows different entry levels of data processing, from sra-fastq files to bed files; and unifies the processing of multiple replicates. NaviSE outperforms the more time-consuming processes required in a non-integrated pipeline. 
Alongside its high performance, NaviSE is able to provide biological insights, predicting cell type specific markers, such as SOX2 and ZIC3 in embryonic stem cells, CDK5R1 and REST in neurons and CD86 and TLR2 in monocytes. NaviSE is a user-friendly streamlined solution for superenhancer analysis, annotation and navigation, requiring only basic computer and next generation sequencing knowledge. NaviSE binaries and documentation are available at: https://sourceforge.net/projects/navise-superenhancer/ .
Pan, Xiaoyong; Shen, Hong-Bin
2018-05-02
RNA-binding proteins (RBPs) account for 5-10% of the eukaryotic proteome and play key roles in many biological processes, e.g. gene regulation. Experimental detection of RBP binding sites is still time-intensive and costly. Computational prediction of RBP binding sites using patterns learned from existing annotation knowledge is a faster alternative. From the biological point of view, the local structure context derived from local sequences will be recognized by specific RBPs. However, in computational modeling using deep learning, to the best of our knowledge, only global representations of entire RNA sequences are employed. So far, the local sequence information is ignored in the deep model construction process. In this study, we present a computational method, iDeepE, to predict RNA-protein binding sites from RNA sequences by combining global and local convolutional neural networks (CNNs). For the global CNN, we pad the RNA sequences to the same length. For the local CNN, we split an RNA sequence into multiple overlapping fixed-length subsequences, where each subsequence is a signal channel of the whole sequence. Next, we train deep CNNs for the multiple subsequences and the padded sequences, respectively, to learn high-level features. Finally, the outputs from the local and global CNNs are combined to improve the prediction. iDeepE demonstrates better performance than state-of-the-art methods on two large-scale datasets derived from CLIP-seq. We also find that the local CNN runs 1.8 times faster than the global CNN with comparable performance when using GPUs. Our results show that iDeepE has captured experimentally verified binding motifs. Availability: https://github.com/xypan1232/iDeepE. Contact: xypan172436@gmail.com or hbshen@sjtu.edu.cn. Supplementary data are available at Bioinformatics online.
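The two input preparations named in the iDeepE abstract above (padding to a common length for the global CNN, and splitting into overlapping fixed-length subsequences for the local CNN) can be illustrated with a minimal sketch. The function names, the pad character `N`, and the window and overlap values are assumptions chosen for illustration; they are not taken from the iDeepE code base.

```python
def pad_sequence(seq, length, pad_char="N"):
    """Right-pad (or truncate) `seq` to exactly `length` characters.

    Hypothetical helper: the pad character and truncation rule are assumptions.
    """
    return seq[:length].ljust(length, pad_char)


def split_overlapping(seq, window, overlap, pad_char="N"):
    """Split `seq` into fixed-length windows that share `overlap` characters.

    The final window is right-padded so that every window has length `window`.
    """
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    step = window - overlap
    windows = []
    start = 0
    while True:
        chunk = seq[start:start + window]
        windows.append(chunk.ljust(window, pad_char))
        # Stop once the current window reaches the end of the sequence.
        if start + window >= len(seq):
            break
        start += step
    return windows


if __name__ == "__main__":
    rna = "ACGUACGUAC"
    print(pad_sequence(rna, 12))         # global input: one padded sequence
    print(split_overlapping(rna, 4, 2))  # local input: overlapping windows
```

In a real model each window would then be one-hot encoded and fed to the network as a separate channel; that step is omitted here.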
DOE Office of Scientific and Technical Information (OSTI.GOV)
Giuliani, Sarah E; Frank, Ashley M; Corgliano, Danielle M
Abstract Background: Transporter proteins are one of an organism's primary interfaces with the environment. The expressed set of transporters mediates cellular metabolic capabilities and influences signal transduction pathways and regulatory networks. The functional annotation of most transporters is currently limited to general classification into families. The development of capabilities to map ligands with specific transporters would improve our knowledge of the function of these proteins, improve the annotation of related genomes, and facilitate predictions for their role in cellular responses to environmental changes. Results: To improve the utility of the functional annotation for ABC transporters, we expressed and purified the set of solute binding proteins from Rhodopseudomonas palustris and characterized their ligand-binding specificity. Our approach utilized ligand libraries consisting of environmental and cellular metabolic compounds, and fluorescence thermal shift based high throughput ligand binding screens. This process resulted in the identification of specific binding ligands for approximately 64% of the purified and screened proteins. The collection of binding ligands is representative of common functionalities associated with many bacterial organisms as well as specific capabilities linked to the ecological niche occupied by R. palustris. Conclusion: The functional screen identified specific ligands that bound to ABC transporter periplasmic binding subunits from R. palustris. These assignments provide unique insight for the metabolic capabilities of this organism and are consistent with the ecological niche of strain isolation. This functional insight can be used to improve the annotation of related organisms and provides a route to evaluate the evolution of this important and diverse group of transporter proteins.
Dictionary-driven protein annotation
Rigoutsos, Isidore; Huynh, Tien; Floratos, Aris; Parida, Laxmi; Platt, Daniel
2002-01-01
Computational methods seeking to automatically determine the properties (functional, structural, physicochemical, etc.) of a protein directly from the sequence have long been the focus of numerous research groups. With the advent of advanced sequencing methods and systems, the number of amino acid sequences that are being deposited in the public databases has been increasing steadily. This has in turn generated a renewed demand for automated approaches that can annotate individual sequences and complete genomes quickly, exhaustively and objectively. In this paper, we present one such approach that is centered around and exploits the Bio-Dictionary, a collection of amino acid patterns that completely covers the natural sequence space and can capture functional and structural signals that have been reused during evolution, within and across protein families. Our annotation approach also makes use of a weighted, position-specific scoring scheme that is unaffected by the over-representation of well-conserved proteins and protein fragments in the databases used. For a given query sequence, the method permits one to determine, in a single pass, the following: local and global similarities between the query and any protein already present in a public database; the likeness of the query to all available archaeal/bacterial/eukaryotic/viral sequences in the database as a function of amino acid position within the query; the character of secondary structure of the query as a function of amino acid position within the query; the cytoplasmic, transmembrane or extracellular behavior of the query; the nature and position of binding domains, active sites, post-translationally modified sites, signal peptides, etc. In terms of performance, the proposed method is exhaustive, objective and allows for the rapid annotation of individual sequences and full genomes. 
Annotation examples are presented and discussed in Results, including individual queries and complete genomes that were released publicly after we built the Bio-Dictionary that is used in our experiments. Finally, we have computed the annotations of more than 70 complete genomes and made them available on the World Wide Web at http://cbcsrv.watson.ibm.com/Annotations/. PMID:12202776
Sequence-based model of gap gene regulatory network.
Kozlov, Konstantin; Gursky, Vitaly; Kulakovskiy, Ivan; Samsonova, Maria
2014-01-01
The detailed analysis of transcriptional regulation is crucially important for understanding biological processes. The gap gene network in Drosophila attracts great interest among researchers studying mechanisms of transcriptional regulation. It implements the most upstream regulatory layer of the segmentation gene network. The knowledge of molecular mechanisms involved in gap gene regulation is far less complete than that of the genetics of the system. Mathematical modeling goes beyond insights gained by genetics and molecular approaches. It allows us to reconstruct wild-type gene expression patterns in silico, infer the underlying regulatory mechanism and prove its sufficiency. We developed a new model that provides a dynamical description of gap gene regulatory systems, using detailed DNA-based information, as well as spatial transcription factor concentration data at varying time points. We showed that this model correctly reproduces gap gene expression patterns in wild-type embryos and is able to predict gap expression patterns in Kr mutants and four reporter constructs. We used a four-fold cross-validation test and fitting to a random dataset to validate the model and prove its sufficiency in data description. The identifiability analysis showed that most model parameters are well identifiable. We reconstructed the gap gene network topology and studied the impact of individual transcription factor binding sites on the model output. We measured this impact by calculating the site regulatory weight as a normalized difference between the residual sum of squares error for the set of all annotated sites and for the set with the site of interest excluded. The reconstructed topology of the gap gene network is in agreement with previous modeling results and data from the literature.
We showed that 1) the regulatory weights of transcription factor binding sites show very weak correlation with their PWM score; 2) sites with low regulatory weight are important for the model output; 3) functionally important sites are not exclusively located in cis-regulatory elements, but are rather dispersed throughout the regulatory region. Importantly, some of the sites with high functional impact in the hb, Kr and kni regulatory regions coincide with strong sites annotated and verified in DNase I footprint assays.
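The site regulatory weight described in the abstract above can be written down as a short sketch. The abstract only states that the weight is a normalized difference between two residual sums of squares; dividing by the full-model RSS, as done here, is an assumption made for illustration, and the function and parameter names are hypothetical.

```python
def site_regulatory_weight(rss_all_sites, rss_site_excluded):
    """Relative change in model error when one binding site is removed.

    rss_all_sites     -- residual sum of squares with all annotated sites
    rss_site_excluded -- residual sum of squares with the site of interest removed
    Normalizing by rss_all_sites is an assumption; the abstract does not
    specify the normalization.
    """
    if rss_all_sites <= 0:
        raise ValueError("rss_all_sites must be positive")
    return (rss_site_excluded - rss_all_sites) / rss_all_sites


if __name__ == "__main__":
    # Removing the site raises the error from 2.0 to 3.0, so the weight is 0.5.
    print(site_regulatory_weight(2.0, 3.0))
```

Under this convention a weight near zero means the fit barely changes when the site is dropped, and a large positive weight marks a site the model depends on.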
Integrative analysis of the Caenorhabditis elegans genome by the modENCODE project.
Gerstein, Mark B; Lu, Zhi John; Van Nostrand, Eric L; Cheng, Chao; Arshinoff, Bradley I; Liu, Tao; Yip, Kevin Y; Robilotto, Rebecca; Rechtsteiner, Andreas; Ikegami, Kohta; Alves, Pedro; Chateigner, Aurelien; Perry, Marc; Morris, Mitzi; Auerbach, Raymond K; Feng, Xin; Leng, Jing; Vielle, Anne; Niu, Wei; Rhrissorrakrai, Kahn; Agarwal, Ashish; Alexander, Roger P; Barber, Galt; Brdlik, Cathleen M; Brennan, Jennifer; Brouillet, Jeremy Jean; Carr, Adrian; Cheung, Ming-Sin; Clawson, Hiram; Contrino, Sergio; Dannenberg, Luke O; Dernburg, Abby F; Desai, Arshad; Dick, Lindsay; Dosé, Andréa C; Du, Jiang; Egelhofer, Thea; Ercan, Sevinc; Euskirchen, Ghia; Ewing, Brent; Feingold, Elise A; Gassmann, Reto; Good, Peter J; Green, Phil; Gullier, Francois; Gutwein, Michelle; Guyer, Mark S; Habegger, Lukas; Han, Ting; Henikoff, Jorja G; Henz, Stefan R; Hinrichs, Angie; Holster, Heather; Hyman, Tony; Iniguez, A Leo; Janette, Judith; Jensen, Morten; Kato, Masaomi; Kent, W James; Kephart, Ellen; Khivansara, Vishal; Khurana, Ekta; Kim, John K; Kolasinska-Zwierz, Paulina; Lai, Eric C; Latorre, Isabel; Leahey, Amber; Lewis, Suzanna; Lloyd, Paul; Lochovsky, Lucas; Lowdon, Rebecca F; Lubling, Yaniv; Lyne, Rachel; MacCoss, Michael; Mackowiak, Sebastian D; Mangone, Marco; McKay, Sheldon; Mecenas, Desirea; Merrihew, Gennifer; Miller, David M; Muroyama, Andrew; Murray, John I; Ooi, Siew-Loon; Pham, Hoang; Phippen, Taryn; Preston, Elicia A; Rajewsky, Nikolaus; Rätsch, Gunnar; Rosenbaum, Heidi; Rozowsky, Joel; Rutherford, Kim; Ruzanov, Peter; Sarov, Mihail; Sasidharan, Rajkumar; Sboner, Andrea; Scheid, Paul; Segal, Eran; Shin, Hyunjin; Shou, Chong; Slack, Frank J; Slightam, Cindie; Smith, Richard; Spencer, William C; Stinson, E O; Taing, Scott; Takasaki, Teruaki; Vafeados, Dionne; Voronina, Ksenia; Wang, Guilin; Washington, Nicole L; Whittle, Christina M; Wu, Beijing; Yan, Koon-Kiu; Zeller, Georg; Zha, Zheng; Zhong, Mei; Zhou, Xingliang; Ahringer, Julie; Strome, Susan; Gunsalus, Kristin C; Micklem, 
Gos; Liu, X Shirley; Reinke, Valerie; Kim, Stuart K; Hillier, LaDeana W; Henikoff, Steven; Piano, Fabio; Snyder, Michael; Stein, Lincoln; Lieb, Jason D; Waterston, Robert H
2010-12-24
We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor-binding sites, and maps of chromatin organization. From this, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor-binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors. Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome.
Zhou, Shiyong; Liu, Pengfei; Zhang, Huilai
2017-01-01
Acute myeloid leukemia (AML) is a frequently occurring malignant disease of the blood and may result from a variety of genetic disorders. The present study aimed to identify the underlying mechanisms associated with the therapeutic effects of decitabine and cytarabine on AML, using microarray analysis. The microarray datasets GSE40442 and GSE40870 were downloaded from the Gene Expression Omnibus database. Differentially expressed genes (DEGs) and differentially methylated sites were identified in AML cells treated with decitabine compared with those treated with cytarabine via the Linear Models for Microarray Data package, following data pre-processing. Gene Ontology (GO) analysis of DEGs was performed using the Database for Annotation, Visualization and Integrated Discovery. Genes corresponding to the differentially methylated sites were obtained using the annotation package of the methylation microarray platform. The overlapping genes were identified, which exhibited the opposite variation trend between gene expression and DNA methylation. Important transcription factor (TF)-gene pairs were screened out, and a regulated network subsequently constructed. A total of 190 DEGs and 540 differentially methylated sites were identified in AML cells treated with decitabine compared with those treated with cytarabine. A total of 36 GO terms of DEGs were enriched, including nucleosomes, protein-DNA complexes and the nucleosome assembly. The 540 differentially methylated sites were located on 240 genes, including the acid-repeat containing protein (ACRC) gene that was additionally differentially expressed. In addition, 60 TF pairs and overlapped methylated sites, and 140 TF-pairs and DEGs were screened out. The regulated network included 68 nodes and 140 TF-gene pairs.
The present study identified various genes including ACRC and proliferating cell nuclear antigen, in addition to various TFs, including TATA-box binding protein associated factor 1 and CCCTC-binding factor, which may be potential therapeutic targets of AML. PMID:28498449
SInCRe—structural interactome computational resource for Mycobacterium tuberculosis
Metri, Rahul; Hariharaputran, Sridhar; Ramakrishnan, Gayatri; Anand, Praveen; Raghavender, Upadhyayula S.; Ochoa-Montaño, Bernardo; Higueruelo, Alicia P.; Sowdhamini, Ramanathan; Chandra, Nagasuma R.; Blundell, Tom L.; Srinivasan, Narayanaswamy
2015-01-01
We have developed an integrated database for Mycobacterium tuberculosis H37Rv (Mtb) that collates information on protein sequences, domain assignments, functional annotation and 3D structural information along with protein–protein and protein–small molecule interactions. SInCRe (Structural Interactome Computational Resource) was developed out of the CamBan (Cambridge and Bangalore) collaboration. The motivation for developing this database is to provide an integrated platform that allows easy access to, and interpretation of, data and results obtained by all the groups in CamBan in the field of Mtb informatics. In-house algorithms and databases developed independently by various academic groups in CamBan are used to generate Mtb-specific datasets and are integrated in this database to provide a structural dimension to studies on tuberculosis. The SInCRe database readily provides information on identification of functional domains, genome-scale modelling of structures of Mtb proteins and characterization of the small-molecule binding sites within Mtb. The resource also provides structure-based function annotation, information on small-molecule binders including FDA (Food and Drug Administration)-approved drugs, protein–protein interactions (PPIs) and natural compounds that potentially bind to pathogen proteins and result in weakening or elimination of host–pathogen protein–protein interactions. Together they provide prerequisites for identification of off-target binding. Database URL: http://proline.biochem.iisc.ernet.in/sincre PMID:26130660
dbPTM 2016: 10-year anniversary of a resource for post-translational modification of proteins.
Huang, Kai-Yao; Su, Min-Gang; Kao, Hui-Ju; Hsieh, Yun-Chung; Jhong, Jhih-Hua; Cheng, Kuang-Hao; Huang, Hsien-Da; Lee, Tzong-Yi
2016-01-04
Owing to the importance of the post-translational modifications (PTMs) of proteins in regulating biological processes, the dbPTM (http://dbPTM.mbc.nctu.edu.tw/) was developed as a comprehensive database of experimentally verified PTMs from several databases with annotations of potential PTMs for all UniProtKB protein entries. For this 10th anniversary of dbPTM, the updated resource provides not only a comprehensive dataset of experimentally verified PTMs, supported by the literature, but also an integrative interface for accessing all available databases and tools that are associated with PTM analysis. As well as collecting experimental PTM data from 14 public databases, this update manually curates over 12 000 modified peptides, including the emerging S-nitrosylation, S-glutathionylation and succinylation, from approximately 500 research articles, which were retrieved by text mining. As the number of available PTM prediction methods increases, this work compiles a non-homologous benchmark dataset to evaluate the predictive power of online PTM prediction tools. An increasing interest in the structural investigation of PTM substrate sites motivated the mapping of all experimental PTM peptides to protein entries of Protein Data Bank (PDB) based on database identifier and sequence identity, which enables users to examine spatially neighboring amino acids, solvent-accessible surface area and side-chain orientations for PTM substrate sites on tertiary structures. Since drug binding in PDB is annotated, this update identified over 1100 PTM sites that are associated with drug binding. The update also integrates metabolic pathways and protein-protein interactions to support the PTM network analysis for a group of proteins. Finally, the web interface is redesigned and enhanced to facilitate access to this resource. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Mahalingam, Rajasekaran; Peng, Hung-Pin; Yang, An-Suei
2014-08-01
Protein-fatty acid interaction is vital for many cellular processes and understanding this interaction is important for functional annotation as well as drug discovery. In this work, we present a method for predicting the fatty acid (FA)-binding residues by using three-dimensional probability density distributions of interacting atoms of FAs on protein surfaces which are derived from the known protein-FA complex structures. A machine learning algorithm was established to learn the characteristic patterns of the probability density maps specific to the FA-binding sites. The predictor was trained with five-fold cross validation on a non-redundant training set and then evaluated with an independent test set as well as on a dataset of holo-apo pairs. The results showed good accuracy in predicting the FA-binding residues. Further, the predictor developed in this study is implemented as an online server which is freely accessible at http://ismblab.genomics.sinica.edu.tw/. Copyright © 2014 Elsevier B.V. All rights reserved.
Mapping small molecule binding data to structural domains
2012-01-01
Background Large-scale bioactivity/SAR Open Data has recently become available, and this has allowed new analyses and approaches to be developed to help address the productivity and translational gaps of current drug discovery. One of the current limitations of these data is the relative sparsity of reported interactions per protein target, and complexities in establishing clear relationships between bioactivity and targets using bioinformatics tools. We detail in this paper the indexing of targets by the structural domains that bind (or are likely to bind) the ligand within a full-length protein. Specifically, we present a simple heuristic to map small molecule binding to Pfam domains. This profiling can be applied to all proteins within a genome to give some indications of the potential pharmacological modulation and regulation of all proteins. Results In this implementation of our heuristic, ligand binding to protein targets from the ChEMBL database was mapped to structural domains as defined by profiles contained within the Pfam-A database. Our mapping suggests that the majority of assay targets within the current version of the ChEMBL database bind ligands through a small number of highly prevalent domains, and conversely the majority of Pfam domains sampled by our data play no currently established role in ligand binding. Validation studies, carried out firstly against Uniprot entries with expert binding-site annotation and secondly against entries in the wwPDB repository of crystallographic protein structures, demonstrate that our simple heuristic maps ligand binding to the correct domain in about 90 percent of all assessed cases. Using the mappings obtained with our heuristic, we have assembled ligand sets associated with each Pfam domain. Conclusions Small molecule binding has been mapped to Pfam-A domains of protein targets in the ChEMBL bioactivity database. 
The result of this mapping is an enriched annotation of small molecule bioactivity data and a grouping of activity classes following the Pfam-A specifications of protein domains. This is valuable for data-focused approaches in drug discovery, for example when extrapolating potential targets of a small molecule with known activity against one or few targets, or in the assessment of a potential target for drug discovery or screening studies. PMID:23282026
Recent advances in ChIP-seq analysis: from quality management to whole-genome annotation.
Nakato, Ryuichiro; Shirahige, Katsuhiko
2017-03-01
Chromatin immunoprecipitation followed by sequencing (ChIP-seq) analysis can detect protein/DNA-binding and histone-modification sites across an entire genome. Recent advances in sequencing technologies and analyses enable us to compare hundreds of samples simultaneously; such large-scale analysis has potential to reveal the high-dimensional interrelationship level for regulatory elements and annotate novel functional genomic regions de novo. Because many experimental considerations are relevant to the choice of a method in a ChIP-seq analysis, the overall design and quality management of the experiment are of critical importance. This review offers guiding principles of computation and sample preparation for ChIP-seq analyses, highlighting the validity and limitations of the state-of-the-art procedures at each step. We also discuss the latest challenges of single-cell analysis that will encourage a new era in this field. © The Author 2016. Published by Oxford University Press.
Detection of functionally important regions in "hypothetical proteins" of known structure.
Nimrod, Guy; Schushan, Maya; Steinberg, David M; Ben-Tal, Nir
2008-12-10
Structural genomics initiatives provide ample structures of "hypothetical proteins" (i.e., proteins of unknown function) at an ever increasing rate. However, without function annotation, this structural goldmine is of little use to biologists who are interested in particular molecular systems. To this end, we used (an improved version of) the PatchFinder algorithm for the detection of functional regions on the protein surface, which could mediate its interactions with, e.g., substrates, ligands, and other proteins. Examination, using a data set of annotated proteins, showed that PatchFinder outperforms similar methods. We collected 757 structures of hypothetical proteins and their predicted functional regions in the N-Func database. Inspection of several of these regions demonstrated that they are useful for function prediction. For example, we suggested an interprotein interface and a putative nucleotide-binding site. A web-server implementation of PatchFinder and the N-Func database are available at http://patchfinder.tau.ac.il/.
Data on the genome-wide identification of CNL R-genes in Setaria italica (L.) P. Beauv.
Andersen, Ethan J; Nepal, Madhav P
2017-08-01
We report data associated with the identification of 242 disease resistance genes (R-genes) in the genome of Setaria italica as presented in "Genetic diversity of disease resistance genes in foxtail millet (Setaria italica L.)" (Andersen and Nepal, 2017) [1]. Our data describe the structure and evolution of the Coiled-coil, Nucleotide-binding site, Leucine-rich repeat (CNL) R-genes in foxtail millet. The CNL genes were identified through rigorous extraction and analysis of recently available plant genome sequences using cutting-edge analytical software. Data visualization includes gene structure diagrams, chromosomal syntenic maps, a chromosomal density plot, and a maximum-likelihood phylogenetic tree comparing Sorghum bicolor, Panicum virgatum, Setaria italica, and Arabidopsis thaliana. Compilation of InterProScan annotations, Gene Ontology (GO) annotations, and Basic Local Alignment Search Tool (BLAST) results for the 242 R-genes identified in the foxtail millet genome are also included in tabular format.
Publication Production: An Annotated Bibliography.
ERIC Educational Resources Information Center
Firman, Anthony H.
1994-01-01
Offers brief annotations of 52 articles and papers on document production (from the Society for Technical Communication's journal and proceedings) on 9 topics: information processing, document design, using color, typography, tables, illustrations, photography, printing and binding, and production management. (SR)
NASA Technical Reports Server (NTRS)
Dominiak, P.; Ciszak, Ewa
2004-01-01
Thiamin pyrophosphate (TPP)-dependent enzymes are a divergent family of TPP and metal ion binding proteins that perform a wide range of functions with the common decarboxylation steps of a -(O=)C-C(OH)- fragment of alpha-ketoacids and alpha-hydroxyaldehydes. To determine how structure and catalytic action are conserved in the context of large sequence differences existing within this family of enzymes, we have carried out an analysis of TPP-dependent enzymes of known structures. The common structure of TPP-dependent enzymes is formed at the interface of four alpha/beta domains from at least two subunits, which provide for two metal and TPP-binding sites. Residues around these catalytic sites are conserved for functional purpose, while those further away from TPP are conserved for structural reasons. Together they provide a network of contacts required for flip-flop catalytic action within TPP-dependent enzymes. Thus our analysis defines a TPP-action motif that is proposed for annotating TPP-dependent enzymes for advancing functional proteomics.
Contains annotated index of site specific documents for the American Drum & Pallet Co. Removal Site in Memphis, Shelby County, Tennessee, January 9, 2008 Region ID: 04 DocID: 10517016, DocDate: 01-09-2008
SpliceRover: Interpretable Convolutional Neural Networks for Improved Splice Site Prediction.
Zuallaert, Jasper; Godin, Fréderic; Kim, Mijung; Soete, Arne; Saeys, Yvan; De Neve, Wesley
2018-06-21
During the last decade, improvements in high-throughput sequencing have generated a wealth of genomic data. Functionally interpreting these sequences and finding the biological signals that are hallmarks of gene function and regulation is currently mostly done using automated genome annotation platforms, which mainly rely on integrated machine learning frameworks to identify different functional sites of interest, including splice sites. Splicing is an essential step in the gene regulation process, and the correct identification of splice sites is a major cornerstone in a genome annotation system. In this paper, we present SpliceRover, a predictive deep learning approach that outperforms the state-of-the-art in splice site prediction. SpliceRover uses convolutional neural networks (CNNs), which have been shown to obtain cutting edge performance on a wide variety of prediction tasks. We adapted this approach to deal with genomic sequence inputs, and show it consistently outperforms already existing approaches, with relative improvements in prediction effectiveness of up to 80.9% when measured in terms of false discovery rate. However, a major criticism of CNNs concerns their "black box" nature, as mechanisms to obtain insight into their reasoning processes are limited. To facilitate interpretability of the SpliceRover models, we introduce an approach to visualize the biologically relevant information learnt. We show that our visualization approach is able to recover features known to be important for splice site prediction (binding motifs around the splice site, presence of polypyrimidine tracts and branch points), as well as reveal new features (e.g., several types of exclusion patterns near splice sites). SpliceRover is available as a web service. The prediction tool and instructions can be found at http://bioit2.irc.ugent.be/splicerover/. Supplementary materials are available at Bioinformatics online.
A domain-centric solution to functional genomics via dcGO Predictor
2013-01-01
Background Computational/manual annotations of protein functions are one of the first routes to making sense of a newly sequenced genome. Protein domain predictions form an essential part of this annotation process. This is due to the natural modularity of proteins with domains as structural, evolutionary and functional units. Sometimes two, three, or more adjacent domains (called supra-domains) are the operational unit responsible for a function, e.g. via a binding site at the interface. These supra-domains have contributed to functional diversification in higher organisms. Traditionally functional ontologies have been applied to individual proteins, rather than families of related domains and supra-domains. We expect, however, to some extent functional signals can be carried by protein domains and supra-domains, and consequently used in function prediction and functional genomics. Results Here we present a domain-centric Gene Ontology (dcGO) perspective. We generalize a framework for automatically inferring ontological terms associated with domains and supra-domains from full-length sequence annotations. This general framework has been applied specifically to primary protein-level annotations from UniProtKB-GOA, generating GO term associations with SCOP domains and supra-domains. The resulting 'dcGO Predictor', can be used to provide functional annotation to protein sequences. The functional annotation of sequences in the Critical Assessment of Function Annotation (CAFA) has been used as a valuable opportunity to validate our method and to be assessed by the community. The functional annotation of all completely sequenced genomes has demonstrated the potential for domain-centric GO enrichment analysis to yield functional insights into newly sequenced or yet-to-be-annotated genomes. 
This generalized framework we have presented has also been applied to other domain classifications such as InterPro and Pfam, and other ontologies such as mammalian phenotype and disease ontology. The dcGO and its predictor are available at http://supfam.org/SUPERFAMILY/dcGO including an enrichment analysis tool. Conclusions As functional units, domains offer a unique perspective on function prediction regardless of whether proteins are multi-domain or single-domain. The 'dcGO Predictor' holds great promise for contributing to a domain-centric functional understanding of genomes in the next generation sequencing era. PMID:23514627
Characterization of the Saccharomyces cerevisiae ATP-Interactome using the iTRAQ-SPROX Technique
NASA Astrophysics Data System (ADS)
Geer, M. Ariel; Fitzgerald, Michael C.
2016-02-01
The stability of proteins from rates of oxidation (SPROX) technique was used in combination with an isobaric mass tagging strategy to identify adenosine triphosphate (ATP) interacting proteins in the Saccharomyces cerevisiae proteome. The SPROX methodology utilized in this work enabled 373 proteins in a yeast cell lysate to be assayed for ATP interactions (both direct and indirect) using the non-hydrolyzable ATP analog, adenylyl imidodiphosphate (AMP-PNP). A total of 28 proteins were identified with AMP-PNP-induced thermodynamic stability changes. These protein hits included 14 proteins that were previously annotated as ATP-binding proteins in the Saccharomyces Genome Database (SGD). The 14 non-annotated ATP-binding proteins included nine proteins that were previously found to be ATP-sensitive in an earlier SPROX study using a stable isotope labeling with amino acids in cell culture (SILAC)-based approach. A bioinformatics analysis of the protein hits identified here and in the earlier SILAC-SPROX experiments revealed that many of the previously annotated ATP-binding protein hits were kinases, ligases, and chaperones. In contrast, many of the newly discovered ATP-sensitive proteins were not from these protein classes, but rather were hydrolases, oxidoreductases, and nucleic acid-binding proteins.
Kirby, Marie K; Ramaker, Ryne C; Roberts, Brian S; Lasseigne, Brittany N; Gunther, David S; Burwell, Todd C; Davis, Nicholas S; Gulzar, Zulfiqar G; Absher, Devin M; Cooper, Sara J; Brooks, James D; Myers, Richard M
2017-04-17
Current diagnostic tools for prostate cancer lack specificity and sensitivity for detecting very early lesions. DNA methylation is a stable genomic modification that is detectable in peripheral patient fluids such as urine and blood plasma that could serve as a non-invasive diagnostic biomarker for prostate cancer. We measured genome-wide DNA methylation patterns in 73 clinically annotated fresh-frozen prostate cancers and 63 benign-adjacent prostate tissues using the Illumina Infinium HumanMethylation450 BeadChip array. We overlaid the most significantly differentially methylated sites in the genome with transcription factor binding sites measured by the Encyclopedia of DNA Elements consortium. We used logistic regression and receiver operating characteristic curves to assess the performance of candidate diagnostic models. We identified methylation patterns that have a high predictive power for distinguishing malignant prostate tissue from benign-adjacent prostate tissue, and these methylation signatures were validated using data from The Cancer Genome Atlas Project. Furthermore, by overlaying ENCODE transcription factor binding data, we observed an enrichment of enhancer of zeste homolog 2 binding in gene regulatory regions with higher DNA methylation in malignant prostate tissues. DNA methylation patterns are greatly altered in prostate cancer tissue in comparison to benign-adjacent tissue. We have discovered patterns of DNA methylation marks that can distinguish prostate cancers with high specificity and sensitivity in multiple patient tissue cohorts, and we have identified transcription factors binding in these differentially methylated regions that may play important roles in prostate cancer development.
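As an illustration of the receiver operating characteristic assessment mentioned in this abstract, the following minimal sketch computes the area under the ROC curve from classifier scores by pairwise comparison. The function name, scores and labels are hypothetical toy values, not data or code from the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the
    positive sample receives the higher score; ties count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical model scores (e.g. predicted probability of malignancy)
# and true labels (1 = malignant, 0 = benign-adjacent).
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two tissue classes.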
Mathelier, Anthony; Fornes, Oriol; Arenillas, David J.; Chen, Chih-yu; Denay, Grégoire; Lee, Jessica; Shi, Wenqiang; Shyr, Casper; Tan, Ge; Worsley-Hunt, Rebecca; Zhang, Allen W.; Parcy, François; Lenhard, Boris; Sandelin, Albin; Wasserman, Wyeth W.
2016-01-01
JASPAR (http://jaspar.genereg.net) is an open-access database storing curated, non-redundant transcription factor (TF) binding profiles representing transcription factor binding preferences as position frequency matrices for multiple species in six taxonomic groups. For this 2016 release, we expanded the JASPAR CORE collection with 494 new TF binding profiles (315 in vertebrates, 11 in nematodes, 3 in insects, 1 in fungi and 164 in plants) and updated 59 profiles (58 in vertebrates and 1 in fungi). The introduced profiles represent an 83% expansion and 10% update when compared to the previous release. We updated the structural annotation of the TF DNA binding domains (DBDs) following a published hierarchical structural classification. In addition, we introduced 130 transcription factor flexible models trained on ChIP-seq data for vertebrates, which capture dinucleotide dependencies within TF binding sites. This new JASPAR release is accompanied by a new web tool to infer JASPAR TF binding profiles recognized by a given TF protein sequence. Moreover, we provide the users with a Ruby module complementing the JASPAR API to ease programmatic access and use of the JASPAR collection of profiles. Finally, we provide the JASPAR2016 R/Bioconductor data package with the data of this release. PMID:26531826
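To make the notion of a position frequency matrix concrete, the sketch below converts a small count matrix into log2-odds weights. It assumes a uniform background and a pseudocount of 1; the function name and the toy matrix are hypothetical, and the code does not use the JASPAR API or the JASPAR2016 package.

```python
import math


def pfm_to_pwm(pfm, pseudocount=1.0, background=0.25):
    """Convert a position frequency matrix to log2-odds weights.

    pfm maps each base to a list of counts, one entry per motif position.
    """
    bases = "ACGT"
    length = len(pfm["A"])
    pwm = {b: [] for b in bases}
    for i in range(length):
        # Column total including one pseudocount per base.
        total = sum(pfm[b][i] for b in bases) + 4 * pseudocount
        for b in bases:
            freq = (pfm[b][i] + pseudocount) / total
            pwm[b].append(math.log2(freq / background))
    return pwm


# Hypothetical three-position profile built from eight aligned sites.
toy_pfm = {"A": [8, 0, 2], "C": [0, 8, 2], "G": [0, 0, 2], "T": [0, 0, 2]}
toy_pwm = pfm_to_pwm(toy_pfm)
```

Positions where one base dominates receive a positive weight for that base and negative weights for the others, while an uninformative column (the third one here) scores zero for every base.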
Cao, Xueyan; Song, Dahe; Yang, Mei; Yang, Ning; Ye, Qing; Tao, Dongbing; Liu, Biao; Wu, Rina; Yue, Xiqing
2017-11-29
Glycosylation is a ubiquitous post-translational protein modification that plays a substantial role in various processes. However, whey glycoproteins in human milk have not been completely profiled. Herein, we used quantitative glycoproteomics to quantify whey N-glycosylation sites and their alteration in human milk during lactation; 110 N-glycosylation sites on 63 proteins and 91 N-glycosylation sites on 53 proteins were quantified in colostrum and mature milk whey, respectively. Among these, 68 glycosylation sites on 38 proteins were differentially expressed in human colostrum and mature milk whey. These differentially expressed N-glycoproteins were highly enriched in "localization", "extracellular region part", and "modified amino acid binding" according to gene ontology annotation and mainly involved in complement and coagulation cascades pathway. These results shed light on the glycosylation sites, composition and biological functions of whey N-glycoproteins in human colostrum and mature milk, and provide substantial insight into the role of protein glycosylation during infant development.
Staufen1 senses overall transcript secondary structure to regulate translation
Ricci, Emiliano P; Kucukural, Alper; Cenik, Can; Mercier, Blandine C; Singh, Guramrit; Heyer, Erin E; Ashar-Patel, Ami; Peng, Lingtao; Moore, Melissa J
2015-01-01
Human Staufen1 (Stau1) is a double-stranded RNA (dsRNA)-binding protein implicated in multiple post-transcriptional gene-regulatory processes. Here we combined RNA immunoprecipitation in tandem (RIPiT) with RNase footprinting, formaldehyde cross-linking, sonication-mediated RNA fragmentation and deep sequencing to map Staufen1-binding sites transcriptome wide. We find that Stau1 binds complex secondary structures containing multiple short helices, many of which are formed by inverted Alu elements in annotated 3′ untranslated regions (UTRs) or in ‘strongly distal’ 3′ UTRs. Stau1 also interacts with actively translating ribosomes and with mRNA coding sequences (CDSs) and 3′ UTRs in proportion to their GC content and propensity to form internal secondary structure. On mRNAs with high CDS GC content, higher Stau1 levels lead to greater ribosome densities, thus suggesting a general role for Stau1 in modulating translation elongation through structured CDS regions. Our results also indicate that Stau1 regulates translation of transcription-regulatory proteins. PMID:24336223
BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements.
De Witte, Dieter; Van de Velde, Jan; Decap, Dries; Van Bel, Michiel; Audenaert, Pieter; Demeester, Piet; Dhoedt, Bart; Vandepoele, Klaas; Fostier, Jan
2015-12-01
The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected. We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O.sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z.mays. BLSSpeller was written in Java. Source code and manual are available at http://bioinformatics.intec.ugent.be/blsspeller Contact: Klaas.Vandepoele@psb.vib-ugent.be or jan.fostier@intec.ugent.be. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements
De Witte, Dieter; Van de Velde, Jan; Decap, Dries; Van Bel, Michiel; Audenaert, Pieter; Demeester, Piet; Dhoedt, Bart; Vandepoele, Klaas; Fostier, Jan
2015-01-01
Motivation: The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected. Results: We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O.sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z.mays. Availability and implementation: BLSSpeller was written in Java. Source code and manual are available at http://bioinformatics.intec.ugent.be/blsspeller Contact: Klaas.Vandepoele@psb.vib-ugent.be or jan.fostier@intec.ugent.be Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26254488
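The idea of treating motifs as words over the IUPAC alphabet can be sketched as follows. This is not BLSSpeller itself (which is written in Java and scores conservation with the branch length score over a phylogenetic tree); as a simplification it only reports in which species a degenerate word occurs. The function names and promoter sequences are hypothetical.

```python
# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches(word, site):
    """True if the concrete site is an instance of the degenerate word."""
    return len(word) == len(site) and all(
        base in IUPAC[code] for code, base in zip(word, site)
    )


def occurs_in(word, sequence):
    """True if any window of the sequence matches the word."""
    k = len(word)
    return any(
        matches(word, sequence[i:i + k]) for i in range(len(sequence) - k + 1)
    )


def conserved_species(word, promoters):
    """Names of the species whose promoter contains the word."""
    return sorted(name for name, seq in promoters.items() if occurs_in(word, seq))


# Hypothetical orthologous promoter fragments.
toy_promoters = {"sp1": "TTGACGTGA", "sp2": "AAGACATGC", "sp3": "CCCCCCCC"}
toy_hits = conserved_species("GACRTG", toy_promoters)
```

Here the code R (A or G) lets one word cover both GACGTG and GACATG; a tree-aware score would then weight the set of matching species by the branch lengths connecting them.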
Ozyurt, A Sinem; Selby, Thomas L
2008-07-01
This study describes a method to computationally assess the function of homologous enzymes through small molecule binding interaction energy. Three experimentally determined X-ray structures and four enzyme models from ornithine cyclo-deaminase, alanine dehydrogenase, and mu-crystallin were used in combination with nine small molecules to derive a function score (FS) for each enzyme-model combination. While energy values varied for a single molecule-enzyme combination due to differences in the active sites, we observe that the binding energies for the entire pathway were proportional for each set of small molecules investigated. This proportionality of energies for a reaction pathway appears to be dependent on the amino acids in the active site and their direct interactions with the small molecules, which allows a function score (FS) to be calculated to assess the specificity of each enzyme. Potential of mean force (PMF) calculations were used to obtain the energies, and the resulting FS values demonstrate that a measurement of function may be obtained using differences between these PMF values. Additionally, limitations of this method are discussed based on: (a) larger substrates with significant conformational flexibility; (b) low homology enzymes; and (c) open active sites. This method should be useful in accurately predicting specificity for single enzymes that have multiple steps in their reactions and in high throughput computational methods to accurately annotate uncharacterized proteins based on active site interaction analysis. 2008 Wiley-Liss, Inc.
PTMScout, a Web Resource for Analysis of High Throughput Post-translational Proteomics Studies*
Naegle, Kristen M.; Gymrek, Melissa; Joughin, Brian A.; Wagner, Joel P.; Welsch, Roy E.; Yaffe, Michael B.; Lauffenburger, Douglas A.; White, Forest M.
2010-01-01
The rate of discovery of post-translational modification (PTM) sites is increasing rapidly and is significantly outpacing our biological understanding of the function and regulation of those modifications. To help meet this challenge, we have created PTMScout, a web-based interface for viewing, manipulating, and analyzing high throughput experimental measurements of PTMs in an effort to facilitate biological understanding of protein modifications in signaling networks. PTMScout is constructed around a custom database of PTM experiments and contains information from external protein and post-translational resources, including gene ontology annotations, Pfam domains, and Scansite predictions of kinase and phosphopeptide binding domain interactions. PTMScout functionality comprises data set comparison tools, data set summary views, and tools for protein assignments of peptides identified by mass spectrometry. Analysis tools in PTMScout focus on informed subset selection via common criteria and on automated hypothesis generation through subset labeling derived from identification of statistically significant enrichment of other annotations in the experiment. Subset selection can be applied through the PTMScout flexible query interface available for quantitative data measurements and data annotations as well as an interface for importing data set groupings by external means, such as unsupervised learning. We exemplify the various functions of PTMScout in application to data sets that contain relative quantitative measurements as well as data sets lacking quantitative measurements, producing a set of interesting biological hypotheses. PTMScout is designed to be a widely accessible tool, enabling generation of multiple types of biological hypotheses from high throughput PTM experiments and advancing functional assignment of novel PTM sites. PTMScout is available at http://ptmscout.mit.edu. PMID:20631208
Liu, Ching-Ti; Raghavan, Sridharan; Maruthur, Nisa; Kabagambe, Edmond Kato; Hong, Jaeyoung; Ng, Maggie C Y; Hivert, Marie-France; Lu, Yingchang; An, Ping; Bentley, Amy R; Drolet, Anne M; Gaulton, Kyle J; Guo, Xiuqing; Armstrong, Loren L; Irvin, Marguerite R; Li, Man; Lipovich, Leonard; Rybin, Denis V; Taylor, Kent D; Agyemang, Charles; Palmer, Nicholette D; Cade, Brian E; Chen, Wei-Min; Dauriz, Marco; Delaney, Joseph A C; Edwards, Todd L; Evans, Daniel S; Evans, Michele K; Lange, Leslie A; Leong, Aaron; Liu, Jingmin; Liu, Yongmei; Nayak, Uma; Patel, Sanjay R; Porneala, Bianca C; Rasmussen-Torvik, Laura J; Snijder, Marieke B; Stallings, Sarah C; Tanaka, Toshiko; Yanek, Lisa R; Zhao, Wei; Becker, Diane M; Bielak, Lawrence F; Biggs, Mary L; Bottinger, Erwin P; Bowden, Donald W; Chen, Guanjie; Correa, Adolfo; Couper, David J; Crawford, Dana C; Cushman, Mary; Eicher, John D; Fornage, Myriam; Franceschini, Nora; Fu, Yi-Ping; Goodarzi, Mark O; Gottesman, Omri; Hara, Kazuo; Harris, Tamara B; Jensen, Richard A; Johnson, Andrew D; Jhun, Min A; Karter, Andrew J; Keller, Margaux F; Kho, Abel N; Kizer, Jorge R; Krauss, Ronald M; Langefeld, Carl D; Li, Xiaohui; Liang, Jingling; Liu, Simin; Lowe, William L; Mosley, Thomas H; North, Kari E; Pacheco, Jennifer A; Peyser, Patricia A; Patrick, Alan L; Rice, Kenneth M; Selvin, Elizabeth; Sims, Mario; Smith, Jennifer A; Tajuddin, Salman M; Vaidya, Dhananjay; Wren, Mary P; Yao, Jie; Zhu, Xiaofeng; Ziegler, Julie T; Zmuda, Joseph M; Zonderman, Alan B; Zwinderman, Aeilko H; Adeyemo, Adebowale; Boerwinkle, Eric; Ferrucci, Luigi; Hayes, M Geoffrey; Kardia, Sharon L R; Miljkovic, Iva; Pankow, James S; Rotimi, Charles N; Sale, Michele M; Wagenknecht, Lynne E; Arnett, Donna K; Chen, Yii-Der Ida; Nalls, Michael A; Province, Michael A; Kao, W H Linda; Siscovick, David S; Psaty, Bruce M; Wilson, James G; Loos, Ruth J F; Dupuis, Josée; Rich, Stephen S; Florez, Jose C; Rotter, Jerome I; Morris, Andrew P; Meigs, James B
2016-07-07
Knowledge of the genetic basis of the type 2 diabetes (T2D)-related quantitative traits fasting glucose (FG) and insulin (FI) in African ancestry (AA) individuals has been limited. In non-diabetic subjects of AA (n = 20,209) and European ancestry (EA; n = 57,292), we performed trans-ethnic (AA+EA) fine-mapping of 54 established EA FG or FI loci with detailed functional annotation, assessed their relevance in AA individuals, and sought previously undescribed loci through trans-ethnic (AA+EA) meta-analysis. We narrowed credible sets of variants driving association signals for 22/54 EA-associated loci; 18/22 credible sets overlapped with active islet-specific enhancers or transcription factor (TF) binding sites, and 21/22 contained at least one TF motif. Of the 54 EA-associated loci, 23 were shared between EA and AA. Replication with an additional 10,096 AA individuals identified two previously undescribed FI loci, chrX FAM133A (rs213676) and chr5 PELO (rs6450057). Trans-ethnic analyses with regulatory annotation illuminate the genetic architecture of glycemic traits and suggest gene regulation as a target to advance precision medicine for T2D. Our approach to utilize state-of-the-art functional annotation and implement trans-ethnic association analysis for discovery and fine-mapping offers a framework for further follow-up and characterization of GWAS signals of complex trait loci. Copyright © 2016 American Society of Human Genetics. All rights reserved.
Transcription Factor Map Alignment of Promoter Regions
Blanco, Enrique; Messeguer, Xavier; Smith, Temple F; Guigó, Roderic
2006-01-01
We address the problem of comparing and characterizing the promoter regions of genes with similar expression patterns. This remains a challenging problem in sequence analysis, because often the promoter regions of co-expressed genes do not show discernible sequence conservation. In our approach, thus, we have not directly compared the nucleotide sequence of promoters. Instead, we have obtained predictions of transcription factor binding sites, annotated the predicted sites with the labels of the corresponding binding factors, and aligned the resulting sequences of labels—to which we refer here as transcription factor maps (TF-maps). To obtain the global pairwise alignment of two TF-maps, we have adapted an algorithm initially developed to align restriction enzyme maps. We have optimized the parameters of the algorithm in a small, but well-curated, collection of human–mouse orthologous gene pairs. Results in this dataset, as well as in an independent much larger dataset from the CISRED database, indicate that TF-map alignments are able to uncover conserved regulatory elements, which cannot be detected by the typical sequence alignments. PMID:16733547
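The idea of aligning label sequences rather than nucleotides can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract describes an adaptation of a restriction-enzyme-map alignment method, whereas the code below swaps in plain Needleman-Wunsch global alignment over lists of factor labels and ignores site positions and prediction scores. The function name `align_tf_maps`, the scoring values, and the example labels are all hypothetical.

```python
def align_tf_maps(map_a, map_b, match=2, mismatch=-1, gap=-1):
    """Globally align two lists of TF labels (simplified stand-in, see lead-in).

    Returns (score, pairs), where pairs holds aligned labels and None marks a gap.
    """
    n, m = len(map_a), len(map_b)

    def sub(i, j):
        # Substitution score for aligning map_a[i-1] with map_b[j-1].
        return match if map_a[i - 1] == map_b[j - 1] else mismatch

    # score[i][j] = best score for aligning map_a[:i] with map_b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two labels
                score[i - 1][j] + gap,            # gap in map_b
                score[i][j - 1] + gap,            # gap in map_a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            pairs.append((map_a[i - 1], map_b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            pairs.append((map_a[i - 1], None))
            i -= 1
        else:
            pairs.append((None, map_b[j - 1]))
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


# Hypothetical TF-maps for two promoters (labels only, ordered by position).
example_score, example_pairs = align_tf_maps(
    ["SP1", "TATA", "CEBP"], ["SP1", "GATA", "TATA", "CEBP"]
)
```

Two promoters with no detectable sequence conservation can still share the same order of predicted factors, which is what an alignment over labels is able to pick up.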
T-Reg Comparator: an analysis tool for the comparison of position weight matrices
Roepcke, Stefan; Grossmann, Steffen; Rahmann, Sven; Vingron, Martin
2005-01-01
T-Reg Comparator is a novel software tool designed to support research into transcriptional regulation. Sequence motifs representing transcription factor binding sites are usually encoded as position weight matrices. The user inputs a set of such weight matrices or binding site sequences and our program matches them against the T-Reg database, which is presently built on data from the Transfac [E. Wingender (2004) In Silico Biol., 4, 55–61] and Jaspar [A. Sandelin, W. Alkema, P. Engstrom, W. W. Wasserman and B. Lenhard (2004) Nucleic Acids Res., 32, D91–D94]. Our tool delivers a detailed report on similarities between user-supplied motifs and motifs in the database. Apart from simple one-to-one relationships, T-Reg Comparator is also able to detect similarities between submatrices. In addition, we provide a user interface to a program for sequence scanning with weight matrices. Typical areas of application for T-Reg Comparator are motif and regulatory module finding and annotation of regulatory genomic regions. T-Reg Comparator is available at http://treg.molgen.mpg.de. PMID:15980506
Diehl, Adam G
2018-01-01
The mouse is widely used as a system to study human genetic mechanisms. However, extensive rewiring of transcriptional regulatory networks often confounds translation of findings between human and mouse. Site-specific gain and loss of individual transcription factor binding sites (TFBS) has caused functional divergence of orthologous regulatory loci, and so we must look beyond this positional conservation to understand common themes of regulatory control. Fortunately, transcription factor co-binding patterns shared across species often perform conserved regulatory functions. These can be compared to ‘regulatory sentences’ that retain the same meanings regardless of sequence and species context. By analyzing TFBS co-occupancy patterns observed in four human and mouse cell types, we learned a regulatory grammar: the rules by which TFBS are combined into meaningful regulatory sentences. Different parts of this grammar associate with specific sets of functional annotations regardless of sequence conservation and predict functional signatures more accurately than positional conservation. We further show that both species-specific and conserved portions of this grammar are involved in gene expression divergence and human disease risk. These findings expand our understanding of transcriptional regulatory mechanisms, suggesting that phenotypic divergence and disease risk are driven by a complex interplay between deeply conserved and species-specific transcriptional regulatory pathways. PMID:29361190
Small molecule annotation for the Protein Data Bank
Sen, Sanchayita; Young, Jasmine; Berrisford, John M.; Chen, Minyu; Conroy, Matthew J.; Dutta, Shuchismita; Di Costanzo, Luigi; Gao, Guanghua; Ghosh, Sutapa; Hudson, Brian P.; Igarashi, Reiko; Kengaku, Yumiko; Liang, Yuhe; Peisach, Ezra; Persikova, Irina; Mukhopadhyay, Abhik; Narayanan, Buvaneswari Coimbatore; Sahni, Gaurav; Sato, Junko; Sekharan, Monica; Shao, Chenghua; Tan, Lihua; Zhuravleva, Marina A.
2014-01-01
The Protein Data Bank (PDB) is the single global repository for three-dimensional structures of biological macromolecules and their complexes, and its more than 100 000 structures contain more than 20 000 distinct ligands or small molecules bound to proteins and nucleic acids. Information about these small molecules and their interactions with proteins and nucleic acids is crucial for our understanding of biochemical processes and vital for structure-based drug design. Small molecules present in a deposited structure may be attached to a polymer or may occur as a separate, non-covalently linked ligand. During curation of a newly deposited structure by wwPDB annotation staff, each molecule is cross-referenced to the PDB Chemical Component Dictionary (CCD). If the molecule is new to the PDB, a dictionary description is created for it. The information about all small molecule components found in the PDB is distributed via the ftp archive as an external reference file. Small molecule annotation in the PDB also includes information about ligand-binding sites and about covalent and other linkages between ligands and macromolecules. During the remediation of the peptide-like antibiotics and inhibitors present in the PDB archive in 2011, it became clear that additional annotation was required for consistent representation of these molecules, which are quite often composed of several sequential subcomponents including modified amino acids and other chemical groups. The connectivity information of the modified amino acids is necessary for correct representation of these biologically interesting molecules. The combined information is made available via a new resource called the Biologically Interesting molecules Reference Dictionary, which is complementary to the CCD and is now routinely used for annotation of peptide-like antibiotics and inhibitors. PMID:25425036
Cell Context Dependent p53 Genome-Wide Binding Patterns and Enrichment at Repeats
Botcheva, Krassimira; McCorkle, Sean R.
2014-11-21
The ability of p53 to elicit stress-specific and cell type-specific responses is well recognized, but how that specificity is established remains to be defined. Whether upon activation p53 binds to its genomic targets in a cell type- and stress type-dependent manner is still an open question. Here we show that p53 binding to the human genome is selective and cell context-dependent. We mapped the genomic binding sites for the endogenous wild-type p53 protein in the human cancer cell line HCT116 and compared them to those we previously determined in the normal cell line IMR90. We report distinct p53 genome-wide binding landscapes in two different cell lines, analyzed under the same treatment and experimental conditions, using the same ChIP-seq approach. This is evidence for cell context-dependent p53 genomic binding. The observed differences affect the distribution of p53 binding sites with respect to major genomic and epigenomic elements (promoter regions, CpG islands and repeats). We correlated the positions of the high-confidence p53 ChIP-seq peaks with the annotated human repeats (UCSC Human Genome Browser) and observed both common and cell line-specific trends. In HCT116, p53 binding was specifically enriched at LINE repeats, compared to IMR90 cells. The p53 genome-wide binding patterns in HCT116 and IMR90 likely reflect the different epigenetic landscapes in these two cell lines, resulting from cancer-associated changes (accumulated in HCT116) superimposed on tissue-specific differences (HCT116 has epithelial, while IMR90 has mesenchymal origin). In conclusion, our data support the model for p53 binding to the human genome in a highly selective manner, mobilizing distinct sets of genes, contributing to distinct pathways.
SplicingTypesAnno: annotating and quantifying alternative splicing events for RNA-Seq data.
Sun, Xiaoyong; Zuo, Fenghua; Ru, Yuanbin; Guo, Jiqiang; Yan, Xiaoyan; Sablok, Gaurav
2015-04-01
Alternative splicing plays a key role in the regulation of the central dogma. Four major types of alternative splicing have been classified as intron retention, exon skipping, alternative 5' splice sites or alternative donor sites, and alternative 3' splice sites or alternative acceptor sites. A few algorithms have been developed to detect splice junctions from RNA-Seq reads. However, there are few tools targeting the major alternative splicing types at the exon/intron level. This type of analysis may reveal subtle, yet important events of alternative splicing, and thus help gain a deeper understanding of the mechanism of alternative splicing. This paper describes a user-friendly R package for extracting, annotating and analyzing alternative splicing types for sequence alignment files from RNA-Seq. SplicingTypesAnno can: (1) provide annotation for major alternative splicing at the exon/intron level; by comparing against the annotation from a GTF/GFF file, it identifies novel alternative splicing sites; (2) offer a convenient two-level analysis: genome-scale annotation for users with a high-performance computing environment, and gene-scale annotation for users with personal computers; (3) generate a user-friendly web report and additional BED files for IGV visualization. SplicingTypesAnno is a user-friendly R package for extracting, annotating and analyzing alternative splicing types at the exon/intron level for sequence alignment files from RNA-Seq. It is publicly available at https://sourceforge.net/projects/splicingtypes/files/ or http://genome.sdau.edu.cn/research/software/SplicingTypesAnno.html. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Banderas, Alvaro; Guiliani, Nicolas
2013-08-16
The biomining bacterium Acidithiobacillus ferrooxidans oxidizes sulfide ores and promotes metal solubilization. The efficiency of this process depends on the attachment of cells to surfaces, a process regulated by quorum sensing (QS) cell-to-cell signalling in many Gram-negative bacteria. At. ferrooxidans has a functional QS system and the presence of AHLs enhances its attachment to pyrite. However, direct targets of the QS transcription factor AfeR remain unknown. In this study, a bioinformatic approach was used to infer possible AfeR direct targets based on the particular palindromic features of the AfeR binding site. A set of Hidden Markov Models designed to maintain palindromic regions and vary non-palindromic regions was used to screen for putative binding sites. By annotating the context of each predicted binding site (PBS), we classified them according to their positional coherence relative to other putative genomic structures such as start codons, RNA polymerase promoter elements and intergenic regions. We further used the Multiple EM for Motif Elicitation algorithm (MEME) to further filter out low homology PBSs. In summary, 75 target-genes were identified, 34 of which have a higher confidence level. Among the identified genes, we found afeR itself, zwf, genes encoding glycosyltransferase activities, metallo-beta lactamases, and active transport-related proteins. Glycosyltransferases and Zwf (Glucose 6-phosphate-1-dehydrogenase) might be directly involved in polysaccharide biosynthesis and attachment to minerals by At. ferrooxidans cells during the bioleaching process. PMID:23959118
Genome-wide transcription start site profiling in biofilm-grown Burkholderia cenocepacia J2315.
Sass, Andrea M; Van Acker, Heleen; Förstner, Konrad U; Van Nieuwerburgh, Filip; Deforce, Dieter; Vogel, Jörg; Coenye, Tom
2015-10-13
Burkholderia cenocepacia is a soil-dwelling Gram-negative Betaproteobacterium with an important role as an opportunistic pathogen in humans. Infections with B. cenocepacia are very difficult to treat due to their high intrinsic resistance to most antibiotics. Biofilm formation further adds to their antibiotic resistance. B. cenocepacia harbours a large, multi-replicon genome with a high GC-content; the reference genome of strain J2315 includes 7374 annotated genes. This study aims to annotate transcription start sites and identify novel transcripts on a whole genome scale. RNA extracted from B. cenocepacia J2315 biofilms was analysed by differential RNA-sequencing and the resulting dataset compared to data derived from conventional, global RNA-sequencing. Transcription start sites were annotated and further analysed according to their position relative to annotated genes. Four thousand ten transcription start sites were mapped over the whole B. cenocepacia genome and the primary transcription start sites of 2089 genes expressed in B. cenocepacia biofilms were defined. For 64 genes a start codon alternative to the annotated one was proposed. Substantial antisense transcription for 105 genes and two novel protein coding sequences were identified. The distribution of internal transcription start sites can be used to identify genomic islands in B. cenocepacia. A potassium pump strongly induced only under biofilm conditions was found and 15 non-coding small RNAs highly expressed in biofilms were discovered. Mapping transcription start sites across the B. cenocepacia genome added relevant information to the J2315 annotation. Genes and novel regulatory RNAs putatively involved in B. cenocepacia biofilm formation were identified. These findings will help in understanding regulation of B. cenocepacia biofilm formation.
Mathelier, Anthony; Fornes, Oriol; Arenillas, David J; Chen, Chih-Yu; Denay, Grégoire; Lee, Jessica; Shi, Wenqiang; Shyr, Casper; Tan, Ge; Worsley-Hunt, Rebecca; Zhang, Allen W; Parcy, François; Lenhard, Boris; Sandelin, Albin; Wasserman, Wyeth W
2016-01-04
JASPAR (http://jaspar.genereg.net) is an open-access database storing curated, non-redundant transcription factor (TF) binding profiles representing transcription factor binding preferences as position frequency matrices for multiple species in six taxonomic groups. For this 2016 release, we expanded the JASPAR CORE collection with 494 new TF binding profiles (315 in vertebrates, 11 in nematodes, 3 in insects, 1 in fungi and 164 in plants) and updated 59 profiles (58 in vertebrates and 1 in fungi). The introduced profiles represent an 83% expansion and 10% update when compared to the previous release. We updated the structural annotation of the TF DNA binding domains (DBDs) following a published hierarchical structural classification. In addition, we introduced 130 transcription factor flexible models trained on ChIP-seq data for vertebrates, which capture dinucleotide dependencies within TF binding sites. This new JASPAR release is accompanied by a new web tool to infer JASPAR TF binding profiles recognized by a given TF protein sequence. Moreover, we provide the users with a Ruby module complementing the JASPAR API to ease programmatic access and use of the JASPAR collection of profiles. Finally, we provide the JASPAR2016 R/Bioconductor data package with the data of this release. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Chow, Chi-Nga; Zheng, Han-Qin; Wu, Nai-Yun; Chien, Chia-Hung; Huang, Hsien-Da; Lee, Tzong-Yi; Chiang-Hsieh, Yi-Fan; Hou, Ping-Fu; Yang, Tien-Yi; Chang, Wen-Chi
2016-01-04
Transcription factors (TFs) are sequence-specific DNA-binding proteins acting as critical regulators of gene expression. The Plant Promoter Analysis Navigator (PlantPAN; http://PlantPAN2.itps.ncku.edu.tw) provides an informative resource for detecting transcription factor binding sites (TFBSs), corresponding TFs, and other important regulatory elements (CpG islands and tandem repeats) in a promoter or a set of plant promoters. Additionally, TFBSs, CpG islands, and tandem repeats in the conserved regions between similar gene promoters are also identified. The current PlantPAN release (version 2.0) contains 16 960 TFs and 1143 TF binding site matrices among 76 plant species. In addition to updating the annotation information, adding experimentally verified TF matrices, and making improvements in the visualization of transcriptional regulatory networks, several new features and functions are incorporated. These features include: (i) comprehensive curation of TF information (response conditions, target genes, and sequence logos of binding motifs, etc.), (ii) co-expression profiles of TFs and their target genes under various conditions, (iii) protein-protein interactions among TFs and their co-factors, (iv) TF-target networks, and (v) downstream promoter elements. Furthermore, a dynamic transcriptional regulatory network under various conditions is provided in PlantPAN 2.0. PlantPAN 2.0 is a systematic platform for plant promoter analysis and reconstructing transcriptional regulatory networks. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
oPOSSUM: identification of over-represented transcription factor binding sites in co-expressed genes
Ho Sui, Shannan J.; Mortimer, James R.; Arenillas, David J.; Brumm, Jochen; Walsh, Christopher J.; Kennedy, Brian P.; Wasserman, Wyeth W.
2005-01-01
Targeted transcript profiling studies can identify sets of co-expressed genes; however, identification of the underlying functional mechanism(s) is a significant challenge. Established methods for the analysis of gene annotations, particularly those based on the Gene Ontology, can identify functional linkages between genes. Similar methods for the identification of over-represented transcription factor binding sites (TFBSs) have been successful in yeast, but extension to human genomics has largely proved ineffective. Creation of a system for the efficient identification of common regulatory mechanisms in a subset of co-expressed human genes promises to break a roadblock in functional genomics research. We have developed an integrated system that searches for evidence of co-regulation by one or more transcription factors (TFs). oPOSSUM combines a pre-computed database of conserved TFBSs in human and mouse promoters with statistical methods for identification of sites over-represented in a set of co-expressed genes. The algorithm successfully identified mediating TFs in control sets of tissue-specific genes and in sets of co-expressed genes from three transcript profiling studies. Simulation studies indicate that oPOSSUM produces few false positives using empirically defined thresholds and can tolerate up to 50% noise in a set of co-expressed genes. PMID:15933209
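To make the notion of an over-represented binding site concrete, here is a minimal sketch of one common way to score it: a one-sided hypergeometric (Fisher-type) tail probability. The abstract does not name the statistical methods oPOSSUM uses, so this test is an assumption chosen for illustration, and the function name `overrep_pvalue`, its parameters, and the example counts are hypothetical.

```python
from math import comb


def overrep_pvalue(hits_in_set, set_size, hits_in_background, background_size):
    """One-sided hypergeometric p-value (illustrative assumption, see lead-in).

    Probability of seeing at least `hits_in_set` genes carrying a given TFBS
    when `set_size` genes are drawn without replacement from a background of
    `background_size` genes, of which `hits_in_background` carry the site.
    """
    max_hits = min(set_size, hits_in_background)
    total = comb(background_size, set_size)
    tail = sum(
        comb(hits_in_background, k)
        * comb(background_size - hits_in_background, set_size - k)
        for k in range(hits_in_set, max_hits + 1)
    )
    return tail / total


# Hypothetical counts: 12 of 40 co-expressed genes carry the site,
# against 300 of 5000 genes in the background.
p_example = overrep_pvalue(12, 40, 300, 5000)
```

A small p-value means the site occurs in the co-expressed set more often than random sampling from the background would explain; in practice the threshold would be set empirically and corrected for the number of binding profiles tested.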
Roche, Daniel B; Buenavista, Maria T; Tetchner, Stuart J; McGuffin, Liam J
2011-07-01
The IntFOLD server is a novel independent server that integrates several cutting edge methods for the prediction of structure and function from sequence. Our guiding principles behind the server development were as follows: (i) to provide a simple unified resource that makes our prediction software accessible to all and (ii) to produce integrated output for predictions that can be easily interpreted. The output for predictions is presented as a simple table that summarizes all results graphically via plots and annotated 3D models. The raw machine readable data files for each set of predictions are also provided for developers, which comply with the Critical Assessment of Methods for Protein Structure Prediction (CASP) data standards. The server comprises an integrated suite of five novel methods: nFOLD4, for tertiary structure prediction; ModFOLD 3.0, for model quality assessment; DISOclust 2.0, for disorder prediction; DomFOLD 2.0 for domain prediction; and FunFOLD 1.0, for ligand binding site prediction. Predictions from the IntFOLD server were found to be competitive in several categories in the recent CASP9 experiment. The IntFOLD server is available at the following web site: http://www.reading.ac.uk/bioinf/IntFOLD/.
Bioinformatic investigation of the role of ubiquitins in cucumber flower morphogenesis
NASA Astrophysics Data System (ADS)
Pawełkowicz, Magdalena; Osipowski, Paweł; Wojcieszek, Michał; Kowalczuk, Cezary; Pląder, Wojciech; Przybecki, Zbigniew
2016-09-01
Three cDNA clones were used to screen the cucumber genome in order to find genes and proteins. Functional annotation reveals that they are correlated with ubiquitination pathways. Various bioinformatics tools were used to screen and check protein sequence features such as the presence of specific domains, transmembrane regions, cleavage sites and cellular placement. Computational analysis of the promoter region shows many binding sites for transcription factors, which could regulate the expression of the genes. To check gene expression levels in developing flower buds of monoecious (B10) and gynoecious (2gg) cucumber lines, the real-time PCR technique was applied. Expression was checked for whole buds and for only the 3rd and 4th whorls of the bud, where the generative organs form; the latter were obtained by the Laser Capture Microdissection (LCM) technique.
Identification of Small RNAs in Desulfovibrio vulgaris Hildenborough
DOE Office of Scientific and Technical Information (OSTI.GOV)
Burns, Andrew; Joachimiak, Marcin; Deutschbauer, Adam
2010-05-17
Desulfovibrio vulgaris is an anaerobic sulfate-reducing bacterium capable of facilitating the removal of toxic metals such as uranium from contaminated sites via reduction. As such, it is essential to understand the intricate regulatory cascades involved in how D. vulgaris and its relatives respond to stressors in such sites. One approach is the identification and analysis of small non-coding RNAs (sRNAs); molecules ranging in size from 20-200 nucleotides that predominantly affect gene regulation by binding to complementary mRNA in an anti-sense fashion and therefore provide an immediate regulatory response. To identify sRNAs in D. vulgaris, a bacterium that does not possess an annotated hfq gene, RNA was pooled from stationary and exponential phases, nitrate exposure, and biofilm conditions. The subsequent RNA was size fractionated, modified, and converted to cDNA for high throughput transcriptomic deep sequencing. A computational approach to identify sRNAs via the alignment of seven separate Desulfovibrio genomes was also performed. From the deep sequencing analysis, 2,296 reads between 20 and 250 nt were identified with expression above genome background. Analysis of those reads limited the number of candidates to ~87 intergenic, while ~140 appeared to be antisense to annotated open reading frames (ORFs). Further BLAST analysis of the intergenic candidates and other Desulfovibrio genomes indicated that eight candidates were likely portions of ORFs not previously annotated in the D. vulgaris genome. Comparison of the intergenic and antisense data sets to the bioinformatically predicted candidates resulted in ~54 common candidates. Current approaches using Northern analysis and qRT-PCR are being used to verify expression of the candidates and to further develop the role these sRNAs play in D. vulgaris regulation.
Burt, Andrew J; William, H Manilal; Perry, Gregory; Khanal, Raja; Pauls, K Peter; Kelly, James D; Navabi, Alireza
2015-01-01
Anthracnose, caused by Colletotrichum lindemuthianum, is an important fungal disease of common bean (Phaseolus vulgaris). Alleles at the Co-4 locus confer resistance to a number of races of C. lindemuthianum. A population of 94 F4:5 recombinant inbred lines of a cross between resistant black bean genotype B09197 and susceptible navy bean cultivar Nautica was used to identify markers associated with resistance in bean chromosome 8 (Pv08) where Co-4 is localized. Three SCAR markers with known linkage to Co-4 and a panel of single nucleotide markers were used for genotyping. A refined physical region on Pv08 with significant association with anthracnose resistance identified by markers was used in BLAST searches with the genomic sequence of common bean accession G19833. Thirty-two unique annotated candidate genes were identified that spanned a physical region of 936.46 kb. A majority of the annotated genes identified had functional similarity to leucine-rich repeats/receptor-like kinase domains. Three annotated genes had similarity to 1,3-β-glucanase domains. There were sequence similarities between some of the annotated genes found in the study and the genes associated with phosphoinositide-specific phospholipases C associated with Co-x and the COK-4 loci found in previous studies. It is possible that the Co-4 locus is structured as a group of genes with functional domains dominated by protein tyrosine kinase along with leucine-rich repeats/nucleotide binding site, phospholipases C as well as β-glucanases. PMID:26431031
Localization of TFIIB binding regions using serial analysis of chromatin occupancy
Yochum, Gregory S; Rajaraman, Veena; Cleland, Ryan; McWeeney, Shannon
2007-01-01
Background: RNA Polymerase II (RNAP II) is recruited to core promoters by the pre-initiation complex (PIC) of general transcription factors. Within the PIC, transcription factor for RNA polymerase IIB (TFIIB) determines the start site of transcription. TFIIB binding has not been localized, genome-wide, in metazoans. Serial analysis of chromatin occupancy (SACO) is an unbiased methodology used to empirically identify transcription factor binding regions. In this report, we use TFIIB and SACO to localize TFIIB binding regions across the rat genome. Results: A sample of the TFIIB SACO library was sequenced and 12,968 TFIIB genomic signature tags (GSTs) were assigned to the rat genome. GSTs are 20–22 base pair fragments that are derived from TFIIB bound chromatin. TFIIB localized to both non-protein coding and protein-coding loci. For 21% of the 1783 protein-coding genes in this sample of the SACO library, TFIIB binding mapped near the characterized 5' promoter that is upstream of the transcription start site (TSS). However, internal TFIIB binding positions were identified in 57% of the 1783 protein-coding genes. Internal positions are defined as those within an inclusive region greater than 2.5 kb downstream from the 5' TSS and 2.5 kb upstream from the transcription stop. We demonstrate that both TFIIB and TFIID (an additional component of PICs) bound to internal regions using chromatin immunoprecipitation (ChIP). The 5' cap of transcripts associated with internal TFIIB binding positions were identified using a cap-trapping assay. The 5' TSSs for internal transcripts were confirmed by primer extension. Additionally, an analysis of the functional annotation of mouse 3 (FANTOM3) databases indicates that internally initiated transcripts identified by TFIIB SACO in rat are conserved in mouse. 
Conclusion: Our finding that TFIIB binding is not restricted to the 5' upstream region indicates that the propensity for PIC to contribute to transcript diversity is far greater than previously appreciated. PMID:17997859
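The definition of an internal binding position given in the abstract above can be sketched in Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `is_internal`, the single-chromosome integer coordinates, the strand handling, and the use of strict inequalities at the 2.5 kb boundaries are all hypothetical choices.

```python
FLANK_BP = 2500  # 2.5 kb, the distance quoted in the abstract


def is_internal(pos, tss, stop, strand="+", flank=FLANK_BP):
    """Return True if `pos` is more than `flank` bp downstream of the TSS
    and more than `flank` bp upstream of the transcription stop.

    Coordinates are integers on one chromosome (an assumption). On the
    minus strand the TSS has the larger coordinate, so the test is mirrored.
    """
    if strand == "+":
        return tss + flank < pos < stop - flank
    return stop + flank < pos < tss - flank


# Hypothetical gene running from 1,000 to 10,000 on the plus strand.
print(is_internal(5000, tss=1000, stop=10000))  # inside the internal window
print(is_internal(3000, tss=1000, stop=10000))  # within 2.5 kb of the TSS
```

A position that fails this test is not necessarily promoter-proximal; it may simply fall in one of the 2.5 kb margins or outside the gene.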
A stochastic context free grammar based framework for analysis of protein sequences
Dyrka, Witold; Nebel, Jean-Christophe
2009-01-01
Background In the last decade, there have been many applications of formal language theory in bioinformatics such as RNA structure prediction and detection of patterns in DNA. However, in the field of proteomics, the size of the protein alphabet and the complexity of relationship between amino acids have mainly limited the application of formal language theory to the production of grammars whose expressive power is not higher than stochastic regular grammars. However, these grammars, like other state of the art methods, cannot cover any higher-order dependencies such as nested and crossing relationships that are common in proteins. In order to overcome some of these limitations, we propose a Stochastic Context Free Grammar based framework for the analysis of protein sequences where grammars are induced using a genetic algorithm. Results This framework was implemented in a system aiming at the production of binding site descriptors. These descriptors not only allow detection of protein regions that are involved in these sites, but also provide insight in their structure. Grammars were induced using quantitative properties of amino acids to deal with the size of the protein alphabet. Moreover, we imposed some structural constraints on grammars to reduce the extent of the rule search space. Finally, grammars based on different properties were combined to convey as much information as possible. Evaluation was performed on sites of various sizes and complexity described either by PROSITE patterns, domain profiles or a set of patterns. Results show the produced binding site descriptors are human-readable and, hence, highlight biologically meaningful features. Moreover, they achieve good accuracy in both annotation and detection. In addition, findings suggest that, unlike current state-of-the-art methods, our system may be particularly suited to deal with patterns shared by non-homologous proteins. 
Conclusion A new Stochastic Context Free Grammar based framework has been introduced allowing the production of binding site descriptors for analysis of protein sequences. Experiments have shown that this new approach is not only valid but also produces human-readable descriptors for binding sites, which have been beyond the capability of current machine learning techniques. PMID:19814800
Velagapudi, Sai Pradeep; Disney, Matthew D
2013-10-15
RNA is an extremely important target for the development of chemical probes of function or small molecule therapeutics. Aminoglycosides are the most well studied class of small molecules to target RNA. However, the RNA motifs outside of the bacterial rRNA A-site that are likely to be bound by these compounds in biological systems are largely unknown. If such information were known, it could allow for aminoglycosides to be exploited to target other RNAs and, in addition, could provide invaluable insights into potential bystander targets of these clinically used drugs. We utilized two-dimensional combinatorial screening (2DCS), a library-versus-library screening approach, to select the motifs displayed in a 3×3 nucleotide internal loop library and in a 6-nucleotide hairpin library that bind with high affinity and selectivity to six aminoglycoside derivatives. The selected RNA motifs were then analyzed using structure-activity relationships through sequencing (StARTS), a statistical approach that defines the privileged RNA motif space that binds a small molecule. StARTS allowed for the facile annotation of the selected RNA motif-aminoglycoside interactions in terms of affinity and selectivity. The interactions selected by 2DCS generally have nanomolar affinities, which is higher affinity than the binding of aminoglycosides to a mimic of their therapeutic target, the bacterial rRNA A-site.
Velagapudi, Sai Pradeep; Disney, Matthew D.
2013-01-01
RNA is an extremely important target for the development of chemical probes of function or small molecule therapeutics. Aminoglycosides are the most well studied class of small molecules to target RNA. However, the RNA motifs outside of the bacterial rRNA A-site that are likely to be bound by these compounds in biological systems are largely unknown. If such information were known, it could allow for aminoglycosides to be exploited to target other RNAs and, in addition, could provide invaluable insights into potential bystander targets of these clinically used drugs. We utilized two-dimensional combinatorial screening (2DCS), a library-versus-library screening approach, to select the motifs displayed in a 3 × 3 nucleotide internal loop library and in a 6-nucleotide hairpin library that bind with high affinity and selectivity to six aminoglycoside derivatives. The selected RNA motifs were then analyzed using structure–activity relationships through sequencing (StARTS), a statistical approach that defines the privileged RNA motif space that binds a small molecule. StARTS allowed for the facile annotation of the selected RNA motif–aminoglycoside interactions in terms of affinity and selectivity. The interactions selected by 2DCS generally have nanomolar affinities, which is higher affinity than the binding of aminoglycosides to a mimic of their therapeutic target, the bacterial rRNA A-site. PMID:23719281
Transcription Factor Binding Site Enrichment Analysis in Co-Expression Modules in Celiac Disease
Romero-Garmendia, Irati; Jauregi-Miguel, Amaia; Plaza-Izurieta, Leticia; Cros, Marie-Pierre; Legarda, Maria; Irastorza, Iñaki; Herceg, Zdenko; Fernandez-Jimenez, Nora
2018-01-01
The aim of this study was to construct celiac co-expression patterns at a whole genome level and to identify transcription factors (TFs) that could drive the gliadin-related changes in coordination of gene expression observed in celiac disease (CD). Differential co-expression modules were identified in the acute and chronic responses to gliadin using expression data from a previous microarray study in duodenal biopsies. Transcription factor binding site (TFBS) and Gene Ontology (GO) annotation enrichment analyses were performed in differentially co-expressed genes (DCGs) and selection of candidate regulators was performed. Expression of candidates was measured in clinical samples and the activation of the TFs was further characterized in C2BBe1 cells upon gliadin challenge. Enrichment analyses of the DCGs identified 10 TFs and five were selected for further investigation. Expression changes related to active CD were detected in four TFs, as well as in several of their in silico predicted targets. The activation of TFs was further characterized in C2BBe1 cells upon gliadin challenge, and an increase in nuclear translocation of CAMP Responsive Element Binding Protein 1 (CREB1) and IFN regulatory factor-1 (IRF1) in response to gliadin was observed. Using transcriptome-wide co-expression analyses we are able to propose novel genes involved in CD pathogenesis that respond upon gliadin stimulation, also in non-celiac models. PMID:29748492
Transcription Factor Binding Site Enrichment Analysis in Co-Expression Modules in Celiac Disease.
Romero-Garmendia, Irati; Garcia-Etxebarria, Koldo; Hernandez-Vargas, Hector; Santin, Izortze; Jauregi-Miguel, Amaia; Plaza-Izurieta, Leticia; Cros, Marie-Pierre; Legarda, Maria; Irastorza, Iñaki; Herceg, Zdenko; Fernandez-Jimenez, Nora; Bilbao, Jose Ramon
2018-05-10
The aim of this study was to construct celiac co-expression patterns at a whole genome level and to identify transcription factors (TFs) that could drive the gliadin-related changes in coordination of gene expression observed in celiac disease (CD). Differential co-expression modules were identified in the acute and chronic responses to gliadin using expression data from a previous microarray study in duodenal biopsies. Transcription factor binding site (TFBS) and Gene Ontology (GO) annotation enrichment analyses were performed in differentially co-expressed genes (DCGs) and selection of candidate regulators was performed. Expression of candidates was measured in clinical samples and the activation of the TFs was further characterized in C2BBe1 cells upon gliadin challenge. Enrichment analyses of the DCGs identified 10 TFs and five were selected for further investigation. Expression changes related to active CD were detected in four TFs, as well as in several of their in silico predicted targets. The activation of TFs was further characterized in C2BBe1 cells upon gliadin challenge, and an increase in nuclear translocation of CAMP Responsive Element Binding Protein 1 (CREB1) and IFN regulatory factor-1 (IRF1) in response to gliadin was observed. Using transcriptome-wide co-expression analyses we are able to propose novel genes involved in CD pathogenesis that respond upon gliadin stimulation, also in non-celiac models.
Wan, Emily S.; Qiu, Weiliang; Morrow, Jarrett; Beaty, Terri H.; Hetmanski, Jacqueline; Make, Barry J.; Lomas, David A.; Silverman, Edwin K.; DeMeo, Dawn L.
2015-01-01
Klinefelter syndrome (KS) (47 XXY) is a common sex-chromosome aneuploidy with an estimated prevalence of 1 in every 660 male births. Investigations into the associations between DNA methylation and the highly variable clinical manifestations of KS have largely focused on the supernumerary X chromosome; systematic investigations of the epigenome have been limited. We obtained genome-wide DNA methylation data from peripheral blood using the Illumina HumanMethylation450K platform in 5 KS (47 XXY), 102 male (46 XY), and 113 female (46 XX) control subjects participating in the chronic obstructive pulmonary disease (COPD) Gene Study. Empirical Bayes-mediated models were used to test for differential methylation by KS status. CpG sites with a false-discovery rate <0.05 from the first-generation HumanMethylation27K platform were further examined in an independent replication cohort of 2 KS subjects, 590 male, and 495 female controls drawn from the International COPD Genetics Network (ICGN). Differential methylation at sites throughout the genome was identified, including 86 CpG sites that were differentially methylated in KS subjects relative to both male and female controls. CpG sites annotated to the HEN1 methyltransferase homolog 1 (HENMT1), calcyclin-binding protein (CACYBP), and GTPase-activating protein (SH3 domain)-binding protein 1 (G3BP1) genes were among the “KS-specific” loci that were replicated in ICGN. We therefore conclude that site-specific differential methylation exists throughout the genome in KS. The functional impact and clinical relevance of these differentially methylated loci should be explored in future studies. PMID:25988574
Defining Transcriptional Regulatory Mechanisms for Primary let-7 miRNAs
Gaeta, Xavier; Le, Luat; Lin, Ying; Xie, Yuan; Lowry, William E.
2017-01-01
The let-7 family of miRNAs has been shown to control developmental timing in organisms from C. elegans to humans; their function in several essential cell processes throughout development is also well conserved. Numerous studies have defined several steps of post-transcriptional regulation of let-7 production, from pri-miRNA through pre-miRNA to the mature miRNA that targets endogenous mRNAs for degradation or translational inhibition. Less well defined are modes of transcriptional regulation of the pri-miRNAs for let-7. let-7 pri-miRNAs are expressed in polycistronic fashion, in long transcripts newly annotated based on chromatin-associated RNA-sequencing. Upon differentiation, we found that some let-7 pri-miRNAs are regulated at the transcriptional level, while others appear to be constitutively transcribed. Using the Epigenetic Roadmap database, we further annotated regulatory elements of each polycistron and identified putative promoters and enhancers. Probing these regulatory elements for transcription factor binding sites identified factors that regulate transcription of let-7 in both promoter and enhancer regions, and identified novel regulatory mechanisms for this important class of miRNAs. PMID:28052101
Osypov, Alexander A; Krutinin, Gleb G; Krutinina, Eugenia A; Kamzolova, Svetlana G
2012-04-01
Electrostatic properties of genome DNA are important to its interactions with different proteins, in particular those related to transcription. DEPPDB - DNA Electrostatic Potential (and other Physical) Properties Database - provides information on the electrostatic and other physical properties of genome DNA combined with its sequence and annotation of biological and structural properties of genomes and their elements. Genomes are organized on a taxonomical basis, supporting comparative and evolutionary studies. Currently, DEPPDB contains all completely sequenced bacterial, viral, mitochondrial, and plastid genomes according to the NCBI RefSeq, and some model eukaryotic genomes. Data for promoters, regulation sites, binding proteins, etc., are incorporated from established DBs and literature. The database is complemented by analytical tools. Calculations on user-supplied sequences are available. Case studies discovered electrostatics complementing DNA bending in the functioning of the E. coli plasmid BNT2 promoter, possibly affecting the host-environment metabolic switch. Transcription factor binding sites gravitate to high potential regions, confirming the universal importance of electrostatics in protein-DNA interactions beyond the classical promoter-RNA polymerase recognition and regulation. Other genome elements, such as terminators, also show electrostatic peculiarities. Most intriguing are gene starts, exhibiting taxonomic correlations. The need for studies of genome electrostatic properties is discussed.
Regulation of neural macroRNAs by the transcriptional repressor REST
Johnson, Rory; Teh, Christina Hui-Leng; Jia, Hui; Vanisri, Ravi Raj; Pandey, Tridansh; Lu, Zhong-Hao; Buckley, Noel J.; Stanton, Lawrence W.; Lipovich, Leonard
2009-01-01
The essential transcriptional repressor REST (repressor element 1-silencing transcription factor) plays central roles in development and human disease by regulating a large cohort of neural genes. These have conventionally fallen into the class of known, protein-coding genes; recently, however, several noncoding microRNA genes were identified as REST targets. Given the widespread transcription of messenger RNA-like, noncoding RNAs (“macroRNAs”), some of which are functional and implicated in disease in mammalian genomes, we sought to determine whether this class of noncoding RNAs can also be regulated by REST. By applying a new, unbiased target gene annotation pipeline to computationally discovered REST binding sites, we find that 23% of mammalian REST genomic binding sites are within 10 kb of a macroRNA gene. These putative target genes were overlooked by previous studies. Focusing on a set of 18 candidate macroRNA targets from mouse, we experimentally demonstrate that two are regulated by REST in neural stem cells. Flanking protein-coding genes are, at most, weakly repressed, suggesting specific targeting of the macroRNAs by REST. Similar to the majority of known REST target genes, both of these macroRNAs are induced during nervous system development and have neurally restricted expression profiles in adult mouse. We observe a similar phenomenon in human: the DiGeorge syndrome-associated noncoding RNA, DGCR5, is repressed by REST through a proximal upstream binding site. Therefore neural macroRNAs represent an additional component of the REST regulatory network. These macroRNAs are new candidates for understanding the role of REST in neuronal development, neurodegeneration, and cancer. PMID:19050060
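The proximity criterion in the abstract above (a binding site within 10 kb of a macroRNA gene) can be illustrated with a short Python sketch. It is not the authors' annotation pipeline: the names `distance_to_interval` and `fraction_near`, the tuple layouts, and the toy coordinates are hypothetical, and a brute-force scan is used purely for clarity.

```python
def distance_to_interval(pos, start, end):
    """Distance in bp from a point to a closed interval (0 if inside it)."""
    if pos < start:
        return start - pos
    if pos > end:
        return pos - end
    return 0


def fraction_near(sites, genes, window=10_000):
    """Fraction of sites lying within `window` bp of any gene.

    sites: list of (chrom, position)
    genes: list of (chrom, start, end)
    """
    if not sites:
        return 0.0
    near = 0
    for chrom, pos in sites:
        if any(
            g_chrom == chrom and distance_to_interval(pos, start, end) <= window
            for g_chrom, start, end in genes
        ):
            near += 1
    return near / len(sites)


# Toy data, invented for illustration only.
genes = [("chr1", 5_000, 10_000)]
sites = [("chr1", 100), ("chr1", 50_000), ("chr2", 100), ("chr1", 20_500)]
print(fraction_near(sites, genes))  # one of four sites is within 10 kb
```

For genome-scale inputs an interval tree or sorted-coordinate search would replace the nested scan; the counting logic stays the same.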
Regulation of neural macroRNAs by the transcriptional repressor REST.
Johnson, Rory; Teh, Christina Hui-Leng; Jia, Hui; Vanisri, Ravi Raj; Pandey, Tridansh; Lu, Zhong-Hao; Buckley, Noel J; Stanton, Lawrence W; Lipovich, Leonard
2009-01-01
The essential transcriptional repressor REST (repressor element 1-silencing transcription factor) plays central roles in development and human disease by regulating a large cohort of neural genes. These have conventionally fallen into the class of known, protein-coding genes; recently, however, several noncoding microRNA genes were identified as REST targets. Given the widespread transcription of messenger RNA-like, noncoding RNAs ("macroRNAs"), some of which are functional and implicated in disease in mammalian genomes, we sought to determine whether this class of noncoding RNAs can also be regulated by REST. By applying a new, unbiased target gene annotation pipeline to computationally discovered REST binding sites, we find that 23% of mammalian REST genomic binding sites are within 10 kb of a macroRNA gene. These putative target genes were overlooked by previous studies. Focusing on a set of 18 candidate macroRNA targets from mouse, we experimentally demonstrate that two are regulated by REST in neural stem cells. Flanking protein-coding genes are, at most, weakly repressed, suggesting specific targeting of the macroRNAs by REST. Similar to the majority of known REST target genes, both of these macroRNAs are induced during nervous system development and have neurally restricted expression profiles in adult mouse. We observe a similar phenomenon in human: the DiGeorge syndrome-associated noncoding RNA, DGCR5, is repressed by REST through a proximal upstream binding site. Therefore neural macroRNAs represent an additional component of the REST regulatory network. These macroRNAs are new candidates for understanding the role of REST in neuronal development, neurodegeneration, and cancer.
Prostate Cancer Biospecimen Cohort Study
2016-10-01
goal of the study is development of a Prostate Cancer Biorepository Network (PCBN) resource site with high quality and well-annotated urine, blood...with no coordinating center and each site will be responsible for maintaining/storing their own data/samples. 15. SUBJECT TERMS Prostate cancer...Biorepository Network (PCBN) resource site with high quality and well-annotated urine, blood, and tissue specimens as part of a multi-institutional Department of
Yang, Jian-Hua; Li, Jun-Hao; Jiang, Shan; Zhou, Hui; Qu, Liang-Hu
2013-01-01
Long non-coding RNAs (lncRNAs) and microRNAs (miRNAs) represent two classes of important non-coding RNAs in eukaryotes. Although these non-coding RNAs have been implicated in organismal development and in various human diseases, surprisingly little is known about their transcriptional regulation. Recent advances in chromatin immunoprecipitation with next-generation DNA sequencing (ChIP-Seq) have provided methods of detecting transcription factor binding sites (TFBSs) with unprecedented sensitivity. In this study, we describe ChIPBase (http://deepbase.sysu.edu.cn/chipbase/), a novel database that we have developed to facilitate the comprehensive annotation and discovery of transcription factor binding maps and transcriptional regulatory relationships of lncRNAs and miRNAs from ChIP-Seq data. The current release of ChIPBase includes high-throughput sequencing data that were generated by 543 ChIP-Seq experiments in diverse tissues and cell lines from six organisms. By analysing millions of TFBSs, we identified tens of thousands of TF-lncRNA and TF-miRNA regulatory relationships. Furthermore, two web-based servers were developed to annotate and discover transcriptional regulatory relationships of lncRNAs and miRNAs from ChIP-Seq data. In addition, we developed two genome browsers, deepView and genomeView, to provide integrated views of multidimensional data. Moreover, our web implementation supports diverse query types and the exploration of TFs, lncRNAs, miRNAs, gene ontologies and pathways.
Nebula--a web-server for advanced ChIP-seq data analysis.
Boeva, Valentina; Lermine, Alban; Barette, Camille; Guillouf, Christel; Barillot, Emmanuel
2012-10-01
ChIP-seq consists of chromatin immunoprecipitation and deep sequencing of the extracted DNA fragments. It is the technique of choice for accurate characterization of the binding sites of transcription factors and other DNA-associated proteins. We present a web service, Nebula, which allows inexperienced users to perform a complete bioinformatics analysis of ChIP-seq data. Nebula was designed for both bioinformaticians and biologists. It is based on the Galaxy open source framework. Galaxy already includes a large number of functionalities for mapping reads and peak calling. We added the following to Galaxy: (i) peak calling with FindPeaks and a module for immunoprecipitation quality control, (ii) de novo motif discovery with ChIPMunk, (iii) calculation of the density and the cumulative distribution of peak locations relative to gene transcription start sites, (iv) annotation of peaks with genomic features and (v) annotation of genes with peak information. Nebula generates the graphs and the enrichment statistics at each step of the process. During Steps 3-5, Nebula optionally repeats the analysis on a control dataset and compares these results with those from the main dataset. Nebula can also incorporate gene expression (or gene modulation) data during these steps. In summary, Nebula is an innovative web service that provides an advanced ChIP-seq analysis pipeline providing ready-to-publish results. Nebula is available at http://nebula.curie.fr/ Supplementary data are available at Bioinformatics online.
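Step (iii) in the abstract above, the cumulative distribution of peak locations relative to transcription start sites, can be sketched as follows. This is a minimal Python illustration, not Nebula's implementation: the function names, the restriction to one chromosome, the use of unsigned distances, and the toy coordinates are assumptions, and the TSS list is assumed to be non-empty.

```python
from bisect import bisect_left


def nearest_tss_distance(peak, tss_sorted):
    """Unsigned distance in bp from a peak position to the closest TSS.

    `tss_sorted` must be a non-empty, ascending list of TSS coordinates.
    """
    i = bisect_left(tss_sorted, peak)
    candidates = []
    if i < len(tss_sorted):
        candidates.append(tss_sorted[i] - peak)   # nearest TSS at or right of peak
    if i > 0:
        candidates.append(peak - tss_sorted[i - 1])  # nearest TSS left of peak
    return min(candidates)


def cumulative_fraction(peaks, tss_positions, thresholds):
    """For each threshold, the fraction of peaks within that distance of a TSS."""
    tss_sorted = sorted(tss_positions)
    dists = [nearest_tss_distance(p, tss_sorted) for p in peaks]
    return [sum(d <= t for d in dists) / len(dists) for t in thresholds]


# Toy coordinates, invented for illustration only.
print(cumulative_fraction([900, 1200, 3000, 9000], [1000, 5000], [100, 1000, 5000]))
```

Running the same function on a control peak set gives the kind of side-by-side comparison the abstract describes for the control dataset.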
Model annotation for synthetic biology: automating model to nucleotide sequence conversion
Misirli, Goksel; Hallinan, Jennifer S.; Yu, Tommy; Lawson, James R.; Wimalaratne, Sarala M.; Cooling, Michael T.; Wipat, Anil
2011-01-01
Motivation: The need for the automated computational design of genetic circuits is becoming increasingly apparent with the advent of ever more complex and ambitious synthetic biology projects. Currently, most circuits are designed through the assembly of models of individual parts such as promoters, ribosome binding sites and coding sequences. These low level models are combined to produce a dynamic model of a larger device that exhibits a desired behaviour. The larger model then acts as a blueprint for physical implementation at the DNA level. However, the conversion of models of complex genetic circuits into DNA sequences is a non-trivial undertaking due to the complexity of mapping the model parts to their physical manifestation. Automating this process is further hampered by the lack of computationally tractable information in most models. Results: We describe a method for automatically generating DNA sequences from dynamic models implemented in CellML and Systems Biology Markup Language (SBML). We also identify the metadata needed to annotate models to facilitate automated conversion, and propose and demonstrate a method for the markup of these models using RDF. Our algorithm has been implemented in a software tool called MoSeC. Availability: The software is available from the authors' web site http://research.ncl.ac.uk/synthetic_biology/downloads.html. Contact: anil.wipat@ncl.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21296753
Cheng, Feng; Pan, Ying; Lu, Yi-Min; Zhu, Lei; Chen, Shuzheng
2017-01-01
RNA-binding proteins (RBPs) and miRNAs are capable of controlling processes in normal development and cancer. Both can determine the fate of RNA transcripts from synthesis to decay. One such RBP, Dead end (Dnd1), is essential for regulating germ-cell viability and suppresses the development of germ-cell tumors, yet how it exerts its functions in breast cancer has remained unresolved. The level of Dnd1 was measured in 21 cancerous tissues paired with neighboring normal tissues by qRT-PCR. We further annotated TCGA (The Cancer Genome Atlas) mRNA expression profiles and found that the expression of Dnd1 and Bim is positively correlated (p = 0.04). Patients with higher Dnd1 expression levels had longer overall survival (p = 0.0014) according to the KM Plotter tool. Dnd1 knockdown in MCF-7 cells decreased Bim expression levels and inhibited apoptosis. While knockdown of Dnd1 promoted the decay of Bim mRNA 3'UTR, the stability of Bim-5'UTR was not affected. In addition, mutation of the miR-221-binding site in Bim-3'UTR abolished the effect of Dnd1 on Bim mRNA. Knockdown of Dnd1 in MCF-7 cells confirmed that Dnd1 antagonized the inhibitory effects of miR-221 on Bim expression. Overall, our findings indicate that Dnd1 facilitates apoptosis by increasing the expression of Bim via competitive binding with miR-221 in Bim-3'UTR. This new function of Dnd1 may play a vital role in breast cancer development.
Wang, Min; Hancock, Timothy P; Chamberlain, Amanda J; Vander Jagt, Christy J; Pryce, Jennie E; Cocks, Benjamin G; Goddard, Mike E; Hayes, Benjamin J
2018-05-24
Topological association domains (TADs) are chromosomal domains characterised by frequent internal DNA-DNA interactions. The transcription factor CTCF binds to conserved DNA sequence patterns called CTCF binding motifs to either prohibit or facilitate chromosomal interactions. TADs and CTCF binding motifs control gene expression, but they are not yet well defined in the bovine genome. In this paper, we sought to improve the annotation of bovine TADs and CTCF binding motifs, and assess whether the new annotation can reduce the search space for cis-regulatory variants. We used genomic synteny to map TADs and CTCF binding motifs from humans, mice, dogs and macaques to the bovine genome. We found that our mapped TADs exhibited the same hallmark properties of those sourced from experimental data, such as housekeeping genes, transfer RNA genes, CTCF binding motifs, short interspersed elements, H3K4me3 and H3K27ac. We showed that runs of genes with the same pattern of allele-specific expression (ASE) (either favouring paternal or maternal allele) were often located in the same TAD or between the same conserved CTCF binding motifs. Analyses of variance showed that when averaged across all bovine tissues tested, TADs explained 14% of ASE variation (standard deviation, SD: 0.056), while CTCF explained 27% (SD: 0.078). Furthermore, we showed that the quantitative trait loci (QTLs) associated with gene expression variation (eQTLs) or ASE variation (aseQTLs), which were identified from mRNA transcripts from 141 lactating cows' white blood and milk cells, were highly enriched at putative bovine CTCF binding motifs. The linearly-furthermost, and most-significant aseQTL and eQTL for each genic target were located within the same TAD as the gene more often than expected (Chi-Squared test P-value < 0.001). 
Our results suggest that genomic synteny can be used to functionally annotate conserved transcriptional components, and provides a tool to reduce the search space for causative regulatory variants in the bovine genome.
Accelerating the Pace of Protein Functional Annotation With Intel Xeon Phi Coprocessors.
Feinstein, Wei P; Moreno, Juana; Jarrell, Mark; Brylinski, Michal
2015-06-01
Intel Xeon Phi is a new addition to the family of powerful parallel accelerators. The range of its potential applications in computationally driven research is broad; however, at present, the repository of scientific codes is still relatively limited. In this study, we describe the development and benchmarking of a parallel version of eFindSite, a structural bioinformatics algorithm for the prediction of ligand-binding sites in proteins. Implemented for the Intel Xeon Phi platform, the parallelization of the structure alignment portion of eFindSite using pragma-based OpenMP brings about the desired performance improvements, which scale well with the number of computing cores. Compared to a serial version, the parallel code runs 11.8 and 10.1 times faster on the CPU and the coprocessor, respectively; when both resources are utilized simultaneously, the speedup is 17.6. For example, ligand-binding predictions for 501 benchmarking proteins are completed in 2.1 hours on a single Stampede node equipped with the Intel Xeon Phi card compared to 3.1 hours without the accelerator and 36.8 hours required by a serial version. In addition to the satisfactory parallel performance, porting existing scientific codes to the Intel Xeon Phi architecture is relatively straightforward with a short development time due to the support of common parallel programming models by the coprocessor. The parallel version of eFindSite is freely available to the academic community at www.brylinski.org/efindsite.
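The wall-clock times quoted in the abstract above can be related to its speedup figures with a small calculation. The sketch below only divides the reported hours; the helper name `speedup` is hypothetical, and small differences from the reported 11.8 and 17.6 are to be expected because the published times are rounded.

```python
def speedup(serial_hours, parallel_hours):
    """Speedup is the serial run time divided by the parallel run time."""
    return serial_hours / parallel_hours


SERIAL_HOURS = 36.8        # serial version, 501 benchmarking proteins
CPU_ONLY_HOURS = 3.1       # parallel code without the accelerator
CPU_PLUS_PHI_HOURS = 2.1   # parallel code with the Xeon Phi card

print(round(speedup(SERIAL_HOURS, CPU_ONLY_HOURS), 1))      # 11.9
print(round(speedup(SERIAL_HOURS, CPU_PLUS_PHI_HOURS), 1))  # 17.5
```

Both values agree with the abstract's figures to within rounding of the reported hours.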
High Precision Prediction of Functional Sites in Protein Structures
Buturovic, Ljubomir; Wong, Mike; Tang, Grace W.; Altman, Russ B.; Petkovic, Dragutin
2014-01-01
We address the problem of assigning biological function to solved protein structures. Computational tools play a critical role in identifying potential active sites and informing screening decisions for further lab analysis. A critical parameter in the practical application of computational methods is the precision, or positive predictive value. Precision measures the level of confidence the user should have in a particular computed functional assignment. Low precision annotations lead to futile laboratory investigations and waste scarce research resources. In this paper we describe an advanced version of the protein function annotation system FEATURE, which achieved 99% precision and average recall of 95% across 20 representative functional sites. The system uses a Support Vector Machine classifier operating on the microenvironment of physicochemical features around an amino acid. We also compared performance of our method with state-of-the-art sequence-level annotator Pfam in terms of precision, recall and localization. To our knowledge, no other functional site annotator has been rigorously evaluated against these key criteria. The software and predictive models are incorporated into the WebFEATURE service at http://feature.stanford.edu/wf4.0-beta. PMID:24632601
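Precision (positive predictive value) and recall, the two metrics emphasized in the abstract above, follow directly from counts of true positives, false positives, and false negatives. The Python sketch below uses the standard definitions; the function name and the example counts are hypothetical and are not taken from the FEATURE evaluation.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from prediction counts.

    precision = TP / (TP + FP): how many predicted sites are real
    recall    = TP / (TP + FN): how many real sites were predicted
    A zero denominator yields 0.0 (a convention chosen for this sketch).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Invented counts: 95 correct predictions, 1 false alarm, 5 missed sites.
p, r = precision_recall(tp=95, fp=1, fn=5)
print(f"precision={p:.3f} recall={r:.3f}")
```

High precision with lower recall means few wasted laboratory follow-ups at the cost of some missed sites, which is the trade-off the abstract argues for.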
Feld, Christine; Sahu, Peeyush; Frech, Miriam; Finkernagel, Florian; Nist, Andrea; Stiewe, Thorsten; Bauer, Uta-Maria; Neubauer, Andreas
2018-01-01
SKI is a transcriptional co-regulator that is overexpressed in various human tumors, for example in acute myeloid leukemia (AML). SKI contributes to the origin and maintenance of the leukemic phenotype. Here, we use ChIP-seq and RNA-seq analysis to identify the epigenetic alterations induced by SKI overexpression in AML cells. We show that approximately two thirds of differentially expressed genes are up-regulated upon SKI deletion, of which >40% harbor SKI binding sites in their proximity, primarily in enhancer regions. Gene ontology analysis reveals that many of the differentially expressed genes are annotated to hematopoietic cell differentiation and inflammatory response, corroborating our finding that SKI contributes to a myeloid differentiation block in HL60 cells. We find that SKI peaks are enriched for RUNX1 consensus motifs, particularly in SKI targets that are up-regulated upon SKI deletion. RUNX1 ChIP-seq shows that nearly 70% of RUNX1 binding sites overlap with SKI peaks, mainly at enhancer regions. SKI and RUNX1 occupy the same genomic sites and cooperate in gene silencing. Our work demonstrates for the first time the predominant co-repressive function of SKI in AML cells on a genome-wide scale and uncovers the transcription factor RUNX1 as an important mediator of SKI-dependent transcriptional repression. PMID:29471413
SiteBinder: an improved approach for comparing multiple protein structural motifs.
Sehnal, David; Vařeková, Radka Svobodová; Huber, Heinrich J; Geidl, Stanislav; Ionescu, Crina-Maria; Wimmerová, Michaela; Koča, Jaroslav
2012-02-27
There is a paramount need to develop new techniques and tools that will extract as much information as possible from the ever-growing repository of protein 3D structures. We report here on the development of a software tool for the multiple superimposition of large sets of protein structural motifs. Our superimposition methodology performs a systematic search for the atom pairing that provides the best fit. During this search, the RMSD values for all chemically relevant pairings are calculated by quaternion algebra. The number of evaluated pairings is markedly decreased by using PDB annotations for atoms. This approach guarantees that the best fit will be found and can be applied even when sequence similarity is low or does not exist at all. We have implemented this methodology in the Web application SiteBinder, which is able to process up to thousands of protein structural motifs in a very short time, and which provides an intuitive and user-friendly interface. Our benchmarking analysis has shown the robustness, efficiency, and versatility of our methodology and its implementation by the successful superimposition of 1000 experimentally determined structures for each of 32 eukaryotic linear motifs. We also demonstrate the applicability of SiteBinder using three case studies. We first compared the structures of 61 PA-IIL sugar binding sites containing nine different sugars, and we found that the sugar binding sites of PA-IIL and its mutants have a conserved structure despite binding different sugars. We then superimposed over 300 zinc finger central motifs and revealed that the molecular structure in the vicinity of the Zn atom is highly conserved. Finally, we superimposed 12 BH3 domains from pro-apoptotic proteins. Our findings support the hypothesis that there is a structural basis for the functional segregation of BH3-only proteins into activators and enablers.
DIBS: a repository of disordered binding sites mediating interactions with ordered proteins.
Schad, Eva; Fichó, Erzsébet; Pancsa, Rita; Simon, István; Dosztányi, Zsuzsanna; Mészáros, Bálint
2018-02-01
Intrinsically Disordered Proteins (IDPs) mediate crucial protein-protein interactions, most notably in signaling and regulation. As their importance is increasingly recognized, detailed analyses of specific IDP interactions have opened up new opportunities for therapeutic targeting. Yet, large-scale information about the structural and functional details of IDP-mediated interactions is lacking, hindering the understanding of the mechanisms underlying this distinct binding mode. Here, we present DIBS, the first comprehensive, curated collection of complexes between IDPs and ordered proteins. DIBS not only describes by far the highest number of cases, it also provides the dissociation constants of their interactions, as well as descriptions of potential post-translational modifications modulating the binding strength and of linear motifs involved in the binding. Together with the wide range of structural and functional annotations, DIBS will provide the cornerstone for structural and functional studies of IDP complexes. DIBS is freely accessible at http://dibs.enzim.ttk.mta.hu/. The DIBS application is hosted by an Apache web server and was implemented in PHP. To enrich querying features and to enhance backend performance, a MySQL database was also created.
GermOnline 4.0 is a genomics gateway for germline development, meiosis and the mitotic cell cycle
Lardenois, Aurélie; Gattiker, Alexandre; Collin, Olivier; Chalmel, Frédéric; Primig, Michael
2010-01-01
GermOnline 4.0 is a cross-species database portal focusing on high-throughput expression data relevant for germline development, the meiotic cell cycle and mitosis in healthy versus malignant cells. It is thus a source of information for life scientists as well as clinicians who are interested in gene expression and regulatory networks. The GermOnline gateway provides unlimited access to information produced with high-density oligonucleotide microarrays (3′-UTR GeneChips), genome-wide protein–DNA binding assays and protein–protein interaction studies in the context of Ensembl genome annotation. Samples used to produce high-throughput expression data and to carry out genome-wide in vivo DNA binding assays are annotated via the MIAME-compliant Multiomics Information Management and Annotation System (MIMAS 3.0). Furthermore, the Saccharomyces Genomics Viewer (SGV) was developed and integrated into the gateway. SGV is a visualization tool that outputs genome annotation and DNA-strand specific expression data produced with high-density oligonucleotide tiling microarrays (Sc_tlg GeneChips) which cover the complete budding yeast genome on both DNA strands. It facilitates the interpretation of expression levels and transcript structures determined for various cell types cultured under different growth and differentiation conditions. Database URL: www.germonline.org/ PMID:21149299
Suplatov, Dmitry; Sharapova, Yana; Timonina, Daria; Kopylov, Kirill; Švedas, Vytas
2018-04-01
The visualCMAT web-server was designed to assist experimental research in the fields of protein/enzyme biochemistry, protein engineering, and drug discovery by providing an intuitive and easy-to-use interface to the analysis of correlated mutations/co-evolving residues. Sequence and structural information describing homologous proteins is used to predict correlated substitutions by the mutual information-based CMAT approach; to classify them into spatially close co-evolving pairs, which either form a direct physical contact or interact with the same ligand (e.g. a substrate or a crystallographic water molecule), and long-range correlations; and to annotate and rank binding sites on the protein surface by the presence of statistically significant co-evolving positions. The results of visualCMAT are organized for convenient visual analysis and can be downloaded to a local computer as a content-rich all-in-one PyMol session file with multiple layers of annotation corresponding to bioinformatic, statistical and structural analyses of the predicted co-evolution, or further studied online using the built-in interactive analysis tools. The online interactivity is implemented in HTML5 and therefore neither plugins nor Java are required. The visualCMAT web-server is integrated with the Mustguseal web-server capable of constructing large structure-guided sequence alignments of protein families and superfamilies using all available information about their structures and sequences in public databases. The visualCMAT web-server can be used to understand the relationship between structure and function in proteins, to select hotspots and compensatory mutations for rational design and directed evolution experiments aimed at producing novel enzymes with improved properties, and to study the mechanisms of selective ligand binding and allosteric communication between topologically independent sites in protein structures. The web-server is freely available at https://biokinet.belozersky.msu.ru/visualcmat and there are no login requirements.
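The mutual-information idea behind correlated-mutation analysis can be illustrated on a toy alignment. This is a textbook sketch only: the actual CMAT approach may apply additional corrections and statistical filtering not shown here, and the alignment and function name below are invented for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(col_a: str, col_b: str) -> float:
    """Mutual information (in bits) between two alignment columns."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must have the same number of sequences")
    n = len(col_a)
    if n == 0:
        raise ValueError("columns must not be empty")
    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), count in freq_ab.items():
        p_ab = count / n
        p_a = freq_a[a] / n
        p_b = freq_b[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Toy alignment (assumed): positions 0 and 1 change together, position 2 is invariant.
alignment = ["AKL", "AKL", "GEL", "GEL"]
columns = ["".join(seq[i] for seq in alignment) for i in range(3)]

mi_covarying = mutual_information(columns[0], columns[1])
mi_invariant = mutual_information(columns[0], columns[2])

print(f"MI(0,1) = {mi_covarying:.2f} bits, MI(0,2) = {mi_invariant:.2f} bits")
```

Positions that always substitute together carry high mutual information, whereas a fully conserved column carries none, which is why conservation and co-evolution are analyzed separately.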
The Thiamin Pyrophosphate-Motif
NASA Technical Reports Server (NTRS)
Dominiak, Paulina M.; Ciszak, Ewa M.
2003-01-01
Using databases, the authors have identified a common thiamin pyrophosphate (TPP)-motif in the family of functionally diverse TPP-dependent enzymes. This common motif consists of multimeric organization of subunits, two catalytic centers, a common amino acid sequence, and specific contacts to provide a flip-flop, or alternate site, mechanism of action. Each catalytic center [PP:PYR] is formed at the interface of the PP-domain binding the magnesium ion, pyrophosphate and aminopyrimidine ring of TPP, and the PYR-domain binding the aminopyrimidine ring of that cofactor. A pair of these catalytic centers constitutes the catalytic core [PP:PYR]* within these enzymes. Analysis of the structural elements of this catalytic core reveals a novel definition of the common amino acid sequences, which are GX@&(G)@XXGQ, and GDGX25-30 within the PP-domain, and the E&(G)@XXG@ within the PYR-domain, where Q corresponds to a hydrophobic amino acid. This TPP-motif provides a novel tool for annotation of TPP-dependent enzymes useful in advancing functional proteomics.
Garamszegi, Sara; Franzosa, Eric A.; Xia, Yu
2013-01-01
A central challenge in host-pathogen systems biology is the elucidation of general, systems-level principles that distinguish host-pathogen interactions from within-host interactions. Current analyses of host-pathogen and within-host protein-protein interaction networks are largely limited by their resolution, treating proteins as nodes and interactions as edges. Here, we construct a domain-resolved map of human-virus and within-human protein-protein interaction networks by annotating protein interactions with high-coverage, high-accuracy, domain-centric interaction mechanisms: (1) domain-domain interactions, in which a domain in one protein binds to a domain in a second protein, and (2) domain-motif interactions, in which a domain in one protein binds to a short, linear peptide motif in a second protein. Analysis of these domain-resolved networks reveals, for the first time, significant mechanistic differences between virus-human and within-human interactions at the resolution of single domains. While human proteins tend to compete with each other for domain binding sites by means of sequence similarity, viral proteins tend to compete with human proteins for domain binding sites in the absence of sequence similarity. Independent of their previously established preference for targeting human protein hubs, viral proteins also preferentially target human proteins containing linear motif-binding domains. Compared to human proteins, viral proteins participate in more domain-motif interactions, target more unique linear motif-binding domains per residue, and contain more unique linear motifs per residue. Together, these results suggest that viruses surmount genome size constraints by convergently evolving multiple short linear motifs in order to effectively mimic, hijack, and manipulate complex host processes for their survival. 
Our domain-resolved analyses reveal unique signatures of pleiotropy, economy, and convergent evolution in viral-host interactions that are otherwise hidden in the traditional binary network, highlighting the power and necessity of high-resolution approaches in host-pathogen systems biology. PMID:24339775
Li, Minghui; Goncearenco, Alexander; Panchenko, Anna R
2017-01-01
In this review we describe a protocol to annotate the effects of missense mutations on proteins, their functions, stability, and binding. For this purpose we present a collection of the most comprehensive databases that store different types of sequencing data on missense mutations, and we discuss their relationships, possible intersections, and unique features. Next, we suggest an annotation workflow using state-of-the-art methods and highlight their usability, advantages, and limitations for different cases. Finally, we address the particularly difficult problem of deciphering the molecular mechanisms by which mutations affect proteins and protein complexes, in order to understand the origins and mechanisms of diseases.
Suprafenacine, an Indazole-Hydrazide Agent, Targets Cancer Cells Through Microtubule Destabilization
Choi, Bo-Hwa; Chattopadhaya, Souvik; Thanh, Le Nguyen; Feng, Lin; Nguyen, Quoc Toan; Lim, Chuan Bian; Harikishore, Amaravadhi; Nanga, Ravi Prakash Reddy; Bharatham, Nagakumar; Zhao, Yan; Liu, Xuewei; Yoon, Ho Sup
2014-01-01
Microtubules are a highly validated target in cancer therapy. However, the clinical development of tubulin binding agents (TBAs) has been hampered by toxicity and chemoresistance issues, which has necessitated the search for new TBAs. Here, we report the identification of a novel cell-permeable, tubulin-destabilizing molecule, 4,5,6,7-tetrahydro-1H-indazole-3-carboxylic acid [1p-tolyl-meth-(E)-ylidene]-hydrazide (termed Suprafenacine, SRF). SRF, identified by in silico screening of annotated chemical libraries, was shown to bind microtubules at the colchicine-binding site and inhibit polymerization. This led to G2/M cell cycle arrest and cell death via a mitochondria-mediated apoptotic pathway. Cell death was preceded by loss of mitochondrial membrane potential, JNK-mediated phosphorylation of Bcl-2 and Bad, and activation of caspase-3. Intriguingly, SRF was found to selectively inhibit cancer cell proliferation and was effective against drug-resistant cancer cells by virtue of its ability to bypass the multidrug resistance transporter P-glycoprotein. Taken together, our results suggest that SRF has potential as a chemotherapeutic agent for cancer treatment and provides an alternate scaffold for the development of improved anti-cancer agents. PMID:25354194
ERIC Educational Resources Information Center
Lee, John K.; Calandra, Brendan
2004-01-01
Two versions of a Web site on the United States Constitution were used by students in separate high school history classes to solve problems that emerged from four constitutional scenarios. One site contained embedded conceptual scaffolding devices in the form of textual annotations; the other did not. The results of our study demonstrated the…
Computer systems for annotation of single molecule fragments
Schwartz, David Charles; Severin, Jessica
2016-07-19
There are provided computer systems for visualizing and annotating single molecule images. Annotation systems in accordance with this disclosure allow a user to mark and annotate single molecules of interest and their restriction enzyme cut sites, thereby determining the restriction fragments of single nucleic acid molecules. The markings and annotations may be automatically generated by the system in certain embodiments and they may be overlaid translucently onto the single molecule images. An image caching system may be implemented in the computer annotation systems to reduce image processing time. The annotation systems include one or more connectors connecting to one or more databases capable of storing single molecule data as well as other biomedical data. Such a diverse array of data can be retrieved and used to validate the markings and annotations. The annotation systems may be implemented and deployed over a computer network. They may be ergonomically optimized to facilitate user interactions.
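The step from marked cut sites to restriction fragments amounts to splitting a linear molecule at each annotated position. The sketch below illustrates that arithmetic only; the function name, the coordinate convention (positions in base pairs from one end), and the example values are assumptions, not part of the disclosed systems.

```python
def fragment_lengths(molecule_length: int, cut_sites: list[int]) -> list[int]:
    """Return fragment lengths for a linear molecule cut at the given positions."""
    sites = sorted(set(cut_sites))  # ignore duplicate marks, order along the molecule
    for site in sites:
        if not 0 < site < molecule_length:
            raise ValueError(f"cut site {site} lies outside the molecule")
    boundaries = [0] + sites + [molecule_length]
    return [end - start for start, end in zip(boundaries, boundaries[1:])]


# Hypothetical 10 kb molecule with three annotated cut sites (marked in any order).
fragments = fragment_lengths(10_000, [2_500, 7_000, 4_000])
print(fragments)
```

Fragment lengths always sum to the molecule length, which gives a simple consistency check on a set of annotations.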
Irla, Marta; Neshat, Armin; Brautaset, Trygve; Rückert, Christian; Kalinowski, Jörn; Wendisch, Volker F
2015-02-14
Bacillus methanolicus MGA3 is a thermophilic, facultative ribulose monophosphate (RuMP) cycle methylotroph. Together with its ability to produce high yields of amino acids, the relevance of this microorganism as a promising candidate for biotechnological applications is evident. The B. methanolicus MGA3 genome consists of a circular chromosome of 3,337,035 nucleotides (nt), the 19,174 nt plasmid pBM19 and the 68,999 nt plasmid pBM69. A total of 3,218 protein-coding regions were annotated on the chromosome, 22 on pBM19 and 82 on pBM69. In the present study, the RNA-seq approach was used to comprehensively investigate the transcriptome of B. methanolicus MGA3 in order to improve the genome annotation, identify novel transcripts, analyze conserved sequence motifs involved in gene expression and reveal operon structures. To this end, two different cDNA library preparation methods were applied: one which allows characterization of the whole transcriptome and another which includes enrichment of primary transcript 5'-ends. Analysis of the primary transcriptome data enabled the detection of 2,167 putative transcription start sites (TSSs), which were categorized into 1,642 TSSs located in the upstream region (5'-UTR) of known protein-coding genes and 525 TSSs of novel antisense, intragenic, or intergenic transcripts. First, 14 wrongly annotated translation start sites (TLSs) were corrected based on primary transcriptome data. Further investigation of the identified 5'-UTRs resulted in the detailed characterization of their length distribution and the detection of 75 hitherto unknown cis-regulatory RNA elements. Moreover, the exact TSS positions were utilized to define conserved sequence motifs for translation start sites, ribosome binding sites and promoters in B. methanolicus MGA3. Based on the whole transcriptome data set, novel transcripts, operon structures and mRNA abundances were determined. 
The analysis of the operon structures revealed that almost half of the genes are transcribed monocistronically (940), whereas 1,164 genes are organized in 381 operons. Several of the genes related to methylotrophy had highly abundant transcripts. The extensive insights into the transcriptional landscape of B. methanolicus MGA3, gained in this study, represent a valuable foundation for further comparative quantitative transcriptome analyses and possibly also for the development of molecular biology tools which at present are very limited for this organism.
GENCODE: the reference human genome annotation for The ENCODE Project.
Harrow, Jennifer; Frankish, Adam; Gonzalez, Jose M; Tapanari, Electra; Diekhans, Mark; Kokocinski, Felix; Aken, Bronwen L; Barrell, Daniel; Zadissa, Amonida; Searle, Stephen; Barnes, If; Bignell, Alexandra; Boychenko, Veronika; Hunt, Toby; Kay, Mike; Mukherjee, Gaurab; Rajan, Jeena; Despacio-Reyes, Gloria; Saunders, Gary; Steward, Charles; Harte, Rachel; Lin, Michael; Howald, Cédric; Tanzer, Andrea; Derrien, Thomas; Chrast, Jacqueline; Walters, Nathalie; Balasubramanian, Suganthi; Pei, Baikang; Tress, Michael; Rodriguez, Jose Manuel; Ezkurdia, Iakes; van Baren, Jeltje; Brent, Michael; Haussler, David; Kellis, Manolis; Valencia, Alfonso; Reymond, Alexandre; Gerstein, Mark; Guigó, Roderic; Hubbard, Tim J
2012-09-01
The GENCODE Consortium aims to identify all gene features in the human genome using a combination of computational analysis, manual annotation, and experimental validation. Since the first public release of this annotation data set, few new protein-coding loci have been added, yet the number of alternative splicing transcripts annotated has steadily increased. The GENCODE 7 release contains 20,687 protein-coding and 9640 long noncoding RNA loci and has 33,977 coding transcripts not represented in UCSC genes and RefSeq. It also has the most comprehensive annotation of long noncoding RNA (lncRNA) loci publicly available with the predominant transcript form consisting of two exons. We have examined the completeness of the transcript annotation and found that 35% of transcriptional start sites are supported by CAGE clusters and 62% of protein-coding genes have annotated polyA sites. Over one-third of GENCODE protein-coding genes are supported by peptide hits derived from mass spectrometry spectra submitted to Peptide Atlas. New models derived from the Illumina Body Map 2.0 RNA-seq data identify 3689 new loci not currently in GENCODE, of which 3127 consist of two exon models indicating that they are possibly unannotated long noncoding loci. GENCODE 7 is publicly available from gencodegenes.org and via the Ensembl and UCSC Genome Browsers.
The standard operating procedure of the DOE-JGI Microbial Genome Annotation Pipeline (MGAP v.4).
Huntemann, Marcel; Ivanova, Natalia N; Mavromatis, Konstantinos; Tripp, H James; Paez-Espino, David; Palaniappan, Krishnaveni; Szeto, Ernest; Pillay, Manoj; Chen, I-Min A; Pati, Amrita; Nielsen, Torben; Markowitz, Victor M; Kyrpides, Nikos C
2015-01-01
The DOE-JGI Microbial Genome Annotation Pipeline performs structural and functional annotation of microbial genomes that are further included into the Integrated Microbial Genome comparative analysis system. MGAP is applied to assembled nucleotide sequence datasets that are provided via the IMG submission site. Dataset submission for annotation first requires project and associated metadata description in GOLD. The MGAP sequence data processing consists of feature prediction including identification of protein-coding genes, non-coding RNAs and regulatory RNA features, as well as CRISPR elements. Structural annotation is followed by assignment of protein product names and functions.
The RNA-binding protein repertoire of embryonic stem cells.
Kwon, S Chul; Yi, Hyerim; Eichelbaum, Katrin; Föhr, Sophia; Fischer, Bernd; You, Kwon Tae; Castello, Alfredo; Krijgsveld, Jeroen; Hentze, Matthias W; Kim, V Narry
2013-09-01
RNA-binding proteins (RBPs) have essential roles in RNA-mediated gene regulation, and yet annotation of RBPs is limited mainly to those with known RNA-binding domains. To systematically identify the RBPs of embryonic stem cells (ESCs), we here employ interactome capture, which combines UV cross-linking of RBP to RNA in living cells, oligo(dT) capture and MS. From mouse ESCs (mESCs), we have defined 555 proteins constituting the mESC mRNA interactome, including 283 proteins not previously annotated as RBPs. Of these, 68 new RBP candidates are highly expressed in ESCs compared to differentiated cells, implicating a role in stem-cell physiology. Two well-known E3 ubiquitin ligases, Trim25 (also called Efp) and Trim71 (also called Lin41), are validated as RBPs, revealing a potential link between RNA biology and protein-modification pathways. Our study confirms and expands the atlas of RBPs, providing a useful resource for the study of the RNA-RBP network in stem cells.
ERIC Educational Resources Information Center
Web Feet, 2001
2001-01-01
This annotated subject guide to Web sites for grades K-8 focuses on biography, dinosaurs, fairy tales and folk tales, history, math, science, and calendar connections for December observances. Specific grade levels are indicated for each annotation. (LRW)
Oliveira, Alberto; Bleicher, Lucas; Schrago, Carlos G; Silva Junior, Floriano Paes
2018-05-01
Phospholipases A2 (PLA2s) comprise a superfamily of glycerophospholipid-hydrolyzing enzymes present in many organisms in nature, whose catalytic activity was largely unveiled by analysis of snake venoms. The latter have pharmaceutical and biotechnological interests and can be divided into different functional sub-classes. Our goal was to identify important residues and their relation to the functional and class-specific characteristics in the PLA2s family with special emphasis on snake venom PLA2s (svPLA2s). We identified such residues by conservation analysis and decomposition of residue coevolution networks (DRCN), annotated the results based on the available literature on PLA2s, structural analysis and molecular dynamics simulations, and related the results to the phylogenetic distribution of these proteins. A filtered alignment of PLA2s revealed 14 highly conserved positions and 3 sets of coevolved residues, which were annotated according to their structural or functional role. These residues are mostly involved in ligand binding and catalysis, calcium-binding, the formation of disulfide bridges and a hydrophobic cluster close to the binding site. An independent validation of the inference of structure-function relationships from our co-evolution analysis on the svPLA2s family was obtained by the analysis of the pattern of selection acting on the Viperidae and Elapidae lineages. Additionally, a molecular dynamics simulation on the Lys49 PLA2 from Agkistrodon contortrix laticinctus was carried out to further investigate the correlation of the Lys49-Glu69 pair. Our results suggest this configuration can result in a novel conformation where the binding cavity collapses due to the approximation of two loops caused by a strong salt bridge between Glu69 and Arg34. Finally, phylogenetic analysis indicated a correlation between the presence of residues in the coevolved sets found in this analysis and the clade localization. 
The results provide a guide for important positions in the family of PLA2s, and potential new objects of investigation.
Mitrasinovic, Petar M
2006-03-01
RNA structure can be viewed as both a construct composed of various structural motifs and a flexible polymer that is substantially influenced by its environment. In this light, the present paper represents an attempt to reconcile the two standpoints. By using the 3D structures both of four (16S and 23S) portions of unbound 50S, H50S, and T30S ribosomal subunits and of 38 large ribonucleoligand complexes as the starting point, the ligand-binding-induced behavior of 73 hairpin triloops with closing g-c and c-g base pairs was investigated using the root-mean-square deviation (RMSD) approach and the pseudotorsional (eta,theta) convention at the nucleotide-by-nucleotide level. Triloops were annotated in accordance with a recent proposal of geometric nomenclature. A simple measure for the determination of the strain of a triloop is introduced. It is believed that a possible classification of the interior triloops, based on the 2D eta-theta unique path, will aid in understanding their local behavior upon ligand binding. All rRNA residues in contact with ligands as well as regions of considerable conformational changes upon complex formation were identified. The analysis addresses the question of how close to and how far from the actual ligand-binding sites the structural changes occur.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Card, G L; Peterson, N A; Smith, C A
2005-02-15
Mycobacterium tuberculosis, the cause of TB, is a devastating human pathogen. The emergence of multi-drug resistance in recent years has prompted a search for new drug targets and for a better understanding of mechanisms of resistance. Here we focus on the gene product of an open reading frame from M. tuberculosis, Rv1347c, which is annotated as a putative aminoglycoside N-acetyltransferase. The Rv1347c protein does not show this activity, however, and we show from its crystal structure, coupled with functional and bioinformatic data, that its most likely role is in the biosynthesis of mycobactin, the M. tuberculosis siderophore. The crystal structure of Rv1347c was determined by MAD phasing from selenomethionine-substituted protein and refined at 2.2 Å resolution (R = 0.227, Rfree = 0.257). The protein is monomeric, with a fold that places it in the GCN5-related N-acetyltransferase (GNAT) family of acyltransferases. Features of the structure are an acylCoA binding site that is shared with other GNAT family members, and an adjacent hydrophobic channel leading to the surface that could accommodate long-chain acyl groups. Modeling the postulated substrate, the Nε-hydroxylysine side chain of mycobactin, into the acceptor substrate binding groove identifies two residues at the active site, His130 and Asp168, that have putative roles in substrate binding and catalysis.
2012-01-01
Background Discovering a compound that inhibits multiple proteins (i.e. polypharmacological targets) is a new paradigm for complex diseases (e.g. cancers and diabetes). In general, polypharmacological proteins often share similar local binding environments and motifs. With the exponential growth in the number of protein structures, finding similar structural binding motifs (pharma-motifs) has become an urgent task for drug discovery (e.g. side effects and new uses for old drugs) and for understanding protein functions. Results We have developed a Space-Related Pharmamotifs (SRPmotif) method to recognize binding motifs by searching against a protein structure database. SRPmotif is able to recognize conserved binding environments containing spatially discontinuous pharma-motifs, which are often short conserved peptides with specific physico-chemical properties for protein functions. Among 356 pharma-motifs, 56.5% of interacting residues are highly conserved. Experimental results indicate that 81.1% and 92.7% of the polypharmacological targets of each protein-ligand complex are annotated with the same biological process (BP) and molecular function (MF) terms, respectively, based on Gene Ontology (GO). Our experimental results show that the identified pharma-motifs often consist of key residues in functional (active) sites and play key roles in protein functions. SRPmotif is available at http://gemdock.life.nctu.edu.tw/SRP/. Conclusions SRPmotif is able to identify similar pharma-interfaces and pharma-motifs sharing similar binding environments for polypharmacological targets by rapidly searching against the protein structure database. Pharma-motifs describe the conservation of binding environments for drug discovery and protein functions. Additionally, these pharma-motifs provide clues for discovering new sequence-based motifs to predict protein functions from protein sequence databases. We believe that SRPmotif is useful for elucidating protein functions and for drug discovery.
PMID:23281852
Chiu, Yi-Yuan; Lin, Chun-Yu; Lin, Chih-Ta; Hsu, Kai-Cheng; Chang, Li-Zen; Yang, Jinn-Moon
2012-01-01
Discovering a compound that inhibits multiple proteins (i.e. polypharmacological targets) is a new paradigm for complex diseases (e.g. cancers and diabetes). In general, polypharmacological proteins often share similar local binding environments and motifs. With the exponential growth in the number of protein structures, finding similar structural binding motifs (pharma-motifs) has become an urgent task for drug discovery (e.g. side effects and new uses for old drugs) and for understanding protein functions. We have developed a Space-Related Pharmamotifs (SRPmotif) method to recognize binding motifs by searching against a protein structure database. SRPmotif is able to recognize conserved binding environments containing spatially discontinuous pharma-motifs, which are often short conserved peptides with specific physico-chemical properties for protein functions. Among 356 pharma-motifs, 56.5% of interacting residues are highly conserved. Experimental results indicate that 81.1% and 92.7% of the polypharmacological targets of each protein-ligand complex are annotated with the same biological process (BP) and molecular function (MF) terms, respectively, based on Gene Ontology (GO). Our experimental results show that the identified pharma-motifs often consist of key residues in functional (active) sites and play key roles in protein functions. SRPmotif is available at http://gemdock.life.nctu.edu.tw/SRP/. SRPmotif is able to identify similar pharma-interfaces and pharma-motifs sharing similar binding environments for polypharmacological targets by rapidly searching against the protein structure database. Pharma-motifs describe the conservation of binding environments for drug discovery and protein functions. Additionally, these pharma-motifs provide clues for discovering new sequence-based motifs to predict protein functions from protein sequence databases. We believe that SRPmotif is useful for elucidating protein functions and for drug discovery.
Generative Research on Second Language Acquisition.
ERIC Educational Resources Information Center
Eubank, Lynn
1995-01-01
Reviews recent trends in generative research on second language acquisition, focusing on the role of universal grammar, parameter resetting, and anaphoric binding. An annotated bibliography discusses five important works in the field. (61 references) (MDM)
A Noncoding, Regulatory Mutation Implicates HCFC1 in Nonsyndromic Intellectual Disability
Huang, Lingli; Jolly, Lachlan A.; Willis-Owen, Saffron; Gardner, Alison; Kumar, Raman; Douglas, Evelyn; Shoubridge, Cheryl; Wieczorek, Dagmar; Tzschach, Andreas; Cohen, Monika; Hackett, Anna; Field, Michael; Froyen, Guy; Hu, Hao; Haas, Stefan A.; Ropers, Hans-Hilger; Kalscheuer, Vera M.; Corbett, Mark A.; Gecz, Jozef
2012-01-01
The discovery of mutations causing human disease has so far been biased toward protein-coding regions. Having excluded all annotated coding regions, we performed targeted massively parallel resequencing of the nonrepetitive genomic linkage interval at Xq28 of family MRX3. We identified in the binding site of transcription factor YY1 a regulatory mutation that leads to overexpression of the chromatin-associated transcriptional regulator HCFC1. When tested on embryonic murine neural stem cells and embryonic hippocampal neurons, HCFC1 overexpression led to a significant increase of the production of astrocytes and a considerable reduction in neurite growth. Two other nonsynonymous, potentially deleterious changes have been identified by X-exome sequencing in individuals with intellectual disability, implicating HCFC1 in normal brain function. PMID:23000143
Mahajan, Anubha; Locke, Adam; Rayner, N William; Robertson, Neil; Scott, Robert A; Prokopenko, Inga; Scott, Laura J; Green, Todd; Sparso, Thomas; Thuillier, Dorothee; Yengo, Loic; Grallert, Harald; Wahl, Simone; Frånberg, Mattias; Strawbridge, Rona J; Kestler, Hans; Chheda, Himanshu; Eisele, Lewin; Gustafsson, Stefan; Steinthorsdottir, Valgerdur; Thorleifsson, Gudmar; Qi, Lu; Karssen, Lennart C; van Leeuwen, Elisabeth M; Willems, Sara M; Li, Man; Chen, Han; Fuchsberger, Christian; Kwan, Phoenix; Ma, Clement; Linderman, Michael; Lu, Yingchang; Thomsen, Soren K; Rundle, Jana K; Beer, Nicola L; van de Bunt, Martijn; Chalisey, Anil; Kang, Hyun Min; Voight, Benjamin F; Abecasis, Goncalo R; Almgren, Peter; Baldassarre, Damiano; Balkau, Beverley; Benediktsson, Rafn; Blüher, Matthias; Boeing, Heiner; Bonnycastle, Lori L; Borringer, Erwin P; Burtt, Noël P; Carey, Jason; Charpentier, Guillaume; Chines, Peter S; Cornelis, Marilyn C; Couper, David J; Crenshaw, Andrew T; van Dam, Rob M; Doney, Alex SF; Dorkhan, Mozhgan; Edkins, Sarah; Eriksson, Johan G; Esko, Tonu; Eury, Elodie; Fadista, João; Flannick, Jason; Fontanillas, Pierre; Fox, Caroline; Franks, Paul W; Gertow, Karl; Gieger, Christian; Gigante, Bruna; Gottesman, Omri; Grant, George B; Grarup, Niels; Groves, Christopher J; Hassinen, Maija; Have, Christian T; Herder, Christian; Holmen, Oddgeir L; Hreidarsson, Astradur B; Humphries, Steve E; Hunter, David J; Jackson, Anne U; Jonsson, Anna; Jørgensen, Marit E; Jørgensen, Torben; Kerrison, Nicola D; Kinnunen, Leena; Klopp, Norman; Kong, Augustine; Kovacs, Peter; Kraft, Peter; Kravic, Jasmina; Langford, Cordelia; Leander, Karin; Liang, Liming; Lichtner, Peter; Lindgren, Cecilia M; Lindholm, Eero; Linneberg, Allan; Liu, Ching-Ti; Lobbens, Stéphane; Luan, Jian’an; Lyssenko, Valeriya; Männistö, Satu; McLeod, Olga; Meyer, Julia; Mihailov, Evelin; Mirza, Ghazala; Mühleisen, Thomas W; Müller-Nurasyid, Martina; Navarro, Carmen; Nöthen, Markus M; Oskolkov, Nikolay N; Owen, Katharine R; Palli, Domenico; Pechlivanis, Sonali; Perry, John RB; Platou, Carl GP; Roden, Michael; Ruderfer, Douglas; Rybin, Denis; van der Schouw, Yvonne T; Sennblad, Bengt; Sigurðsson, Gunnar; Stančáková, Alena; Steinbach, Gerald; Storm, Petter; Strauch, Konstantin; Stringham, Heather M; Sun, Qi; Thorand, Barbara; Tikkanen, Emmi; Tonjes, Anke; Trakalo, Joseph; Tremoli, Elena; Tuomi, Tiinamaija; Wennauer, Roman; Wood, Andrew R; Zeggini, Eleftheria; Dunham, Ian; Birney, Ewan; Pasquali, Lorenzo; Ferrer, Jorge; Loos, Ruth JF; Dupuis, Josée; Florez, Jose C; Boerwinkle, Eric; Pankow, James S; van Duijn, Cornelia; Sijbrands, Eric; Meigs, James B; Hu, Frank B; Thorsteinsdottir, Unnur; Stefansson, Kari; Lakka, Timo A; Rauramaa, Rainer; Stumvoll, Michael; Pedersen, Nancy L; Lind, Lars; Keinanen-Kiukaanniemi, Sirkka M; Korpi-Hyövälti, Eeva; Saaristo, Timo E; Saltevo, Juha; Kuusisto, Johanna; Laakso, Markku; Metspalu, Andres; Erbel, Raimund; Jöckel, Karl-Heinz; Moebus, Susanne; Ripatti, Samuli; Salomaa, Veikko; Ingelsson, Erik; Boehm, Bernhard O; Bergman, Richard N; Collins, Francis S; Mohlke, Karen L; Koistinen, Heikki; Tuomilehto, Jaakko; Hveem, Kristian; Njølstad, Inger; Deloukas, Panagiotis; Donnelly, Peter J; Frayling, Timothy M; Hattersley, Andrew T; de Faire, Ulf; Hamsten, Anders; Illig, Thomas; Peters, Annette; Cauchi, Stephane; Sladek, Rob; Froguel, Philippe; Hansen, Torben; Pedersen, Oluf; Morris, Andrew D; Palmer, Collin NA; Kathiresan, Sekar; Melander, Olle; Nilsson, Peter M; Groop, Leif C; Barroso, Inês; Langenberg, Claudia; Wareham, Nicholas J; O’Callaghan, Christopher A; Gloyn, Anna L; Altshuler, David; Boehnke, Michael; Teslovich, Tanya M; McCarthy, Mark I; Morris, Andrew P
2015-01-01
We performed fine-mapping of 39 established type 2 diabetes (T2D) loci in 27,206 cases and 57,574 controls of European ancestry. We identified 49 distinct association signals at these loci, including five mapping in/near KCNQ1. “Credible sets” of variants most likely to drive each distinct signal mapped predominantly to non-coding sequence, implying that T2D association is mediated through gene regulation. Credible set variants were enriched for overlap with FOXA2 chromatin immunoprecipitation binding sites in human islet and liver cells, including at MTNR1B, where fine-mapping implicated rs10830963 as driving T2D association. We confirmed that this T2D-risk allele increases FOXA2-bound enhancer activity in islet- and liver-derived cells. We observed allele-specific differences in NEUROD1 binding in islet-derived cells, consistent with evidence that the T2D-risk allele increases islet MTNR1B expression. Our study demonstrates how integration of genetic and genomic information can define molecular mechanisms through which variants underlying association signals exert their effects on disease. PMID:26551672
Leuthaeuser, Janelle B; Knutson, Stacy T; Kumar, Kiran; Babbitt, Patricia C; Fetrow, Jacquelyn S
2015-09-01
The development of accurate protein function annotation methods has emerged as a major unsolved biological problem. Protein similarity networks, one approach to function annotation via annotation transfer, group proteins into similarity-based clusters. An underlying assumption is that the edge metric used to identify such clusters correlates with functional information. In this contribution, this assumption is evaluated by observing topologies in similarity networks using three different edge metrics: sequence (BLAST), structure (TM-Align), and active site similarity (active site profiling, implemented in DASP). Network topologies for four well-studied protein superfamilies (enolase, peroxiredoxin (Prx), glutathione transferase (GST), and crotonase) were compared with curated functional hierarchies and structure. As expected, network topology differs, depending on edge metric; comparison of topologies provides valuable information on structure/function relationships. Subnetworks based on active site similarity correlate with known functional hierarchies at a single edge threshold more often than sequence- or structure-based networks. Sequence- and structure-based networks are useful for identifying sequence and domain similarities and differences; therefore, it is important to consider the clustering goal before deciding appropriate edge metric. Further, conserved active site residues identified in enolase and GST active site subnetworks correspond with published functionally important residues. Extension of this analysis yields predictions of functionally determinant residues for GST subgroups. These results support the hypothesis that active site similarity-based networks reveal clusters that share functional details and lay the foundation for capturing functionally relevant hierarchies using an approach that is both automatable and can deliver greater precision in function annotation than current similarity-based methods. 
© 2015 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.
Leuthaeuser, Janelle B; Knutson, Stacy T; Kumar, Kiran; Babbitt, Patricia C; Fetrow, Jacquelyn S
2015-01-01
The development of accurate protein function annotation methods has emerged as a major unsolved biological problem. Protein similarity networks, one approach to function annotation via annotation transfer, group proteins into similarity-based clusters. An underlying assumption is that the edge metric used to identify such clusters correlates with functional information. In this contribution, this assumption is evaluated by observing topologies in similarity networks using three different edge metrics: sequence (BLAST), structure (TM-Align), and active site similarity (active site profiling, implemented in DASP). Network topologies for four well-studied protein superfamilies (enolase, peroxiredoxin (Prx), glutathione transferase (GST), and crotonase) were compared with curated functional hierarchies and structure. As expected, network topology differs, depending on edge metric; comparison of topologies provides valuable information on structure/function relationships. Subnetworks based on active site similarity correlate with known functional hierarchies at a single edge threshold more often than sequence- or structure-based networks. Sequence- and structure-based networks are useful for identifying sequence and domain similarities and differences; therefore, it is important to consider the clustering goal before deciding appropriate edge metric. Further, conserved active site residues identified in enolase and GST active site subnetworks correspond with published functionally important residues. Extension of this analysis yields predictions of functionally determinant residues for GST subgroups. These results support the hypothesis that active site similarity-based networks reveal clusters that share functional details and lay the foundation for capturing functionally relevant hierarchies using an approach that is both automatable and can deliver greater precision in function annotation than current similarity-based methods. PMID:26073648
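The idea of grouping proteins into clusters at a single edge threshold, as described in the abstract above, can be sketched as connected components over a thresholded similarity graph. This is only an illustrative sketch under stated assumptions: the function name `threshold_clusters`, the protein labels, and the similarity scores are hypothetical, and the scores stand in for whichever edge metric (sequence, structure, or active site similarity) is being evaluated. It is not the authors' implementation.

```python
def threshold_clusters(nodes, scored_edges, threshold):
    """Group nodes into clusters connected by edges scoring >= threshold.

    nodes: iterable of hashable protein identifiers.
    scored_edges: iterable of (node_a, node_b, similarity_score) tuples.
    Returns a sorted list of sorted clusters (lists of identifiers).
    """
    nodes = list(nodes)
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b, score in scored_edges:
        if score >= threshold:  # keep only edges at or above the threshold
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = {}
    for n in nodes:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(members) for members in clusters.values())


# Toy data (invented): four proteins and three pairwise similarity scores.
proteins = ["p1", "p2", "p3", "p4"]
edges = [("p1", "p2", 0.9), ("p2", "p3", 0.4), ("p3", "p4", 0.8)]
strict = threshold_clusters(proteins, edges, threshold=0.5)
relaxed = threshold_clusters(proteins, edges, threshold=0.3)
```

Comparing the clusters obtained at different thresholds, or with different edge metrics, against a curated functional hierarchy is the kind of evaluation the abstract describes.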
Brozovic, Matija; Dantec, Christelle; Dardaillon, Justine; Dauga, Delphine; Faure, Emmanuel; Gineste, Mathieu; Louis, Alexandra; Naville, Magali; Nitta, Kazuhiro R; Piette, Jacques; Reeves, Wendy; Scornavacca, Céline; Simion, Paul; Vincentelli, Renaud; Bellec, Maelle; Aicha, Sameh Ben; Fagotto, Marie; Guéroult-Bellone, Marion; Haeussler, Maximilian; Jacox, Edwin; Lowe, Elijah K; Mendez, Mickael; Roberge, Alexis; Stolfi, Alberto; Yokomori, Rui; Cambillau, Christian; Christiaen, Lionel; Delsuc, Frédéric; Douzery, Emmanuel; Dumollard, Rémi; Kusakabe, Takehiro; Nakai, Kenta; Nishida, Hiroki; Satou, Yutaka; Swalla, Billie; Veeman, Michael; Volff, Jean-Nicolas
2018-01-01
ANISEED (www.aniseed.cnrs.fr) is the main model organism database for tunicates, the sister-group of vertebrates. This release gives access to annotated genomes, gene expression patterns, and anatomical descriptions for nine ascidian species. It provides increased integration with external molecular and taxonomy databases, better support for epigenomics datasets, in particular RNA-seq, ChIP-seq and SELEX-seq, and features novel interactive interfaces for existing and novel datatypes. In particular, the cross-species navigation and comparison is enhanced through a novel taxonomy section describing each represented species and through the implementation of interactive phylogenetic gene trees for 60% of tunicate genes. The gene expression section displays the results of RNA-seq experiments for the three major model species of solitary ascidians. Gene expression is controlled by the binding of transcription factors to cis-regulatory sequences. A high-resolution description of the DNA-binding specificity for 131 Ciona robusta (formerly C. intestinalis type A) transcription factors by SELEX-seq is provided and used to map candidate binding sites across the Ciona robusta and Phallusia mammillata genomes. Finally, use of a WashU Epigenome browser enhances genome navigation, while a Genomicus server was set up to explore microsynteny relationships within tunicates and with vertebrates, Amphioxus, echinoderms and hemichordates. PMID:29149270
Protein Information Resource: a community resource for expert annotation of protein data
Barker, Winona C.; Garavelli, John S.; Hou, Zhenglin; Huang, Hongzhan; Ledley, Robert S.; McGarvey, Peter B.; Mewes, Hans-Werner; Orcutt, Bruce C.; Pfeiffer, Friedhelm; Tsugita, Akira; Vinayaka, C. R.; Xiao, Chunlin; Yeh, Lai-Su L.; Wu, Cathy
2001-01-01
The Protein Information Resource, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), produces the most comprehensive and expertly annotated protein sequence database in the public domain, the PIR-International Protein Sequence Database. To provide timely and high quality annotation and promote database interoperability, the PIR-International employs rule-based and classification-driven procedures based on controlled vocabulary and standard nomenclature and includes status tags to distinguish experimentally determined from predicted protein features. The database contains about 200 000 non-redundant protein sequences, which are classified into families and superfamilies and their domains and motifs identified. Entries are extensively cross-referenced to other sequence, classification, genome, structure and activity databases. The PIR web site features search engines that use sequence similarity and database annotation to facilitate the analysis and functional identification of proteins. The PIR-International databases and search tools are accessible on the PIR web site at http://pir.georgetown.edu/ and at the MIPS web site at http://www.mips.biochem.mpg.de. The PIR-International Protein Sequence Database and other files are also available by FTP. PMID:11125041
Stein, Matthias; Pilli, Manohar; Bernauer, Sabine; Habermann, Bianca H.; Zerial, Marino; Wade, Rebecca C.
2012-01-01
Background Rab GTPases constitute the largest subfamily of the Ras protein superfamily. Rab proteins regulate organelle biogenesis and transport, and display distinct binding preferences for effector and activator proteins, many of which have not been elucidated yet. The underlying molecular recognition motifs, binding partner preferences and selectivities are not well understood. Methodology/Principal Findings Comparative analysis of the amino acid sequences and the three-dimensional electrostatic and hydrophobic molecular interaction fields of 62 human Rab proteins revealed a wide range of binding properties with large differences between some Rab proteins. This analysis assists the functional annotation of Rab proteins 12, 14, 26, 37 and 41 and provided an explanation for the shared function of Rab3 and 27. Rab7a and 7b have very different electrostatic potentials, indicating that they may bind to different effector proteins and thus, exert different functions. The subfamily V Rab GTPases which are associated with endosome differ subtly in the interaction properties of their switch regions, and this may explain exchange factor specificity and exchange kinetics. Conclusions/Significance We have analysed conservation of sequence and of molecular interaction fields to cluster and annotate the human Rab proteins. The analysis of three dimensional molecular interaction fields provides detailed insight that is not available from a sequence-based approach alone. Based on our results, we predict novel functions for some Rab proteins and provide insights into their divergent functions and the determinants of their binding partner selectivity. PMID:22523562
PhyloGibbs-MP: Module Prediction and Discriminative Motif-Finding by Gibbs Sampling
Siddharthan, Rahul
2008-01-01
PhyloGibbs, our recent Gibbs-sampling motif-finder, takes phylogeny into account in detecting binding sites for transcription factors in DNA and assigns posterior probabilities to its predictions obtained by sampling the entire configuration space. Here, in an extension called PhyloGibbs-MP, we widen the scope of the program, addressing two major problems in computational regulatory genomics. First, PhyloGibbs-MP can localise predictions to small, undetermined regions of a large input sequence, thus effectively predicting cis-regulatory modules (CRMs) ab initio while simultaneously predicting binding sites in those modules—tasks that are usually done by two separate programs. PhyloGibbs-MP's performance at such ab initio CRM prediction is comparable with or superior to dedicated module-prediction software that use prior knowledge of previously characterised transcription factors. Second, PhyloGibbs-MP can predict motifs that differentiate between two (or more) different groups of regulatory regions, that is, motifs that occur preferentially in one group over the others. While other “discriminative motif-finders” have been published in the literature, PhyloGibbs-MP's implementation has some unique features and flexibility. Benchmarks on synthetic and actual genomic data show that this algorithm is successful at enhancing predictions of differentiating sites and suppressing predictions of common sites and compares with or outperforms other discriminative motif-finders on actual genomic data. Additional enhancements include significant performance and speed improvements, the ability to use “informative priors” on known transcription factors, and the ability to output annotations in a format that can be visualised with the Generic Genome Browser. In stand-alone motif-finding, PhyloGibbs-MP remains competitive, outperforming PhyloGibbs-1.0 and other programs on benchmark data. PMID:18769735
Spliceman2: a computational web server that predicts defects in pre-mRNA splicing.
Cygan, Kamil Jan; Sanford, Clayton Hendrick; Fairbrother, William Guy
2017-09-15
Most pre-mRNA transcripts in eukaryotic cells must undergo splicing to remove introns and join exons, and splicing elements present a large mutational target for disease-causing mutations. Splicing elements are strongly position dependent with respect to the transcript annotations. In 2012, we presented Spliceman, an online tool that used positional dependence to predict how likely distant mutations around annotated splice sites were to disrupt splicing. Here, we present an improved version of the previous tool that will be more useful for predicting the likelihood of splicing mutations. We have added industry-standard input options (i.e. Spliceman now accepts variant call format files), which allow much larger inputs than previously available. The tool can also visualize the locations, within exons and introns, of sequence variants to be analyzed and the predicted effects on splicing of the pre-mRNA transcript. In addition, Spliceman2 integrates with RNAcompete motif libraries to provide a prediction of which trans-acting factor binding sites are disrupted/created and links out to the UCSC genome browser. In summary, the new features in Spliceman2 will allow scientists and physicians to better understand the effects of single nucleotide variations on splicing. Freely available on the web at http://fairbrother.biomed.brown.edu/spliceman2. Website implemented in PHP framework-Laravel 5, PostgreSQL, Apache, and Perl, with all major browsers supported. Contact: william_fairbrother@brown.edu. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Zimmerman, Carl-Ulrich R; Rosengarten, Renate; Spergser, Joachim
2013-01-01
Phase variation of two loci (‘mba locus’ and ‘UU172 phase-variable element’) in Ureaplasma parvum serovar 3 has been suggested to result from site-specific DNA inversion occurring at short inverted repeats. Three potential tyrosine recombinases (RipX, XerC, and CodV encoded by the genes UU145, UU222, and UU529) have been annotated in the genome of U. parvum serovar 3, which could be mediators in the proposed recombination event. We document that only orthologs of the gene xerC are present in all strains that show phase variation in the two loci. We demonstrate in vitro binding of recombinant maltose-binding protein fusions of XerC to the inverted repeats of the phase-variable loci, of RipX to a direct repeat that flanks a 20-kbp region, which has been proposed as a putative pathogenicity island, and of CodV to a putative dif site. Co-transformation of the model organism Mycoplasma pneumoniae M129 with both the ‘mba locus’ and the recombinase gene xerC behind an active promoter region resulted in DNA inversion in the ‘mba locus’. The results suggest that XerC of U. parvum serovar 3 is a mediator in the proposed DNA inversion event of the two phase-variable loci. PMID:23305333
2011-01-01
Background The endometrium is a dynamic tissue whose changes are driven by the ovarian steroidal hormones. Its main function is to provide an adequate substrate for embryo implantation. Using microarray technology, several reports have provided the gene expression patterns of human endometrial tissue during the window of implantation. However it is required that biological connections be made across these genomic datasets to take full advantage of them. The objective of this work was to perform a research synthesis of available gene expression profiles related to acquisition of endometrial receptivity for embryo implantation, in order to gain insights into its molecular basis and regulation. Methods Gene expression datasets were intersected to determine a consensus endometrial receptivity transcript list (CERTL). For this cluster of genes we determined their functional annotations using available web-based databases. In addition, promoter sequences were analyzed to identify putative transcription factor binding sites using bioinformatics tools and determined over-represented features. Results We found 40 up- and 21 down-regulated transcripts in the CERTL. Those more consistently increased were C4BPA, SPP1, APOD, CD55, CFD, CLDN4, DKK1, ID4, IL15 and MAP3K5 whereas the more consistently decreased were OLFM1, CCNB1, CRABP2, EDN3, FGFR1, MSX1 and MSX2. Functional annotation of CERTL showed it was enriched with transcripts related to the immune response, complement activation and cell cycle regulation. Promoter sequence analysis of genes revealed that DNA binding sites for E47, E2F1 and SREBP1 transcription factors were the most consistently over-represented and in both up- and down-regulated genes during the window of implantation. Conclusions Our research synthesis allowed organizing and mining high throughput data to explore endometrial receptivity and focus future research efforts on specific genes and pathways. 
The discovery of possible new transcription factors orchestrating the CERTL opens new alternatives for understanding gene expression regulation in uterine function. PMID:21272326
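The Methods of the abstract above describe intersecting gene expression datasets to obtain a consensus transcript list. A minimal sketch of that kind of intersection is given below, assuming each dataset has already been reduced to a set of gene symbols. The function name `consensus_transcripts`, the optional `min_support` parameter, and the toy gene sets are hypothetical illustrations and do not reproduce the study's actual data or criteria.

```python
from collections import Counter


def consensus_transcripts(datasets, min_support=None):
    """Return genes reported in at least `min_support` of the datasets.

    datasets: list of iterables of gene symbols, one per study.
    min_support: how many datasets must report a gene; defaults to all of
    them, which gives a strict intersection. (Hypothetical parameter.)
    """
    datasets = [set(ds) for ds in datasets]
    if not datasets:
        return set()
    if min_support is None:
        min_support = len(datasets)
    # Count in how many datasets each gene appears.
    counts = Counter(gene for ds in datasets for gene in ds)
    return {gene for gene, n in counts.items() if n >= min_support}


# Toy gene sets (invented membership, placeholder names).
study_1 = {"geneA", "geneB", "geneC"}
study_2 = {"geneB", "geneC", "geneD"}
study_3 = {"geneC", "geneB", "geneE"}

strict_consensus = consensus_transcripts([study_1, study_2, study_3])
relaxed_consensus = consensus_transcripts([study_1, study_2, study_3], min_support=2)
```

Relaxing `min_support` trades specificity for coverage; which cutoff a given synthesis should use is a design choice that depends on how many studies are available.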
ASD: a comprehensive database of allosteric proteins and modulators
Huang, Zhimin; Zhu, Liang; Cao, Yan; Wu, Geng; Liu, Xinyi; Chen, Yingyi; Wang, Qi; Shi, Ting; Zhao, Yaxue; Wang, Yuefei; Li, Weihua; Li, Yixue; Chen, Haifeng; Chen, Guoqiang; Zhang, Jian
2011-01-01
Allostery is the most direct, rapid and efficient way of regulating protein function, ranging from the control of metabolic mechanisms to signal-transduction pathways. However, an enormous amount of unsystematic allostery information has deterred scientists who could benefit from this field. Here, we present the AlloSteric Database (ASD), the first online database that provides a central resource for the display, search and analysis of structure, function and related annotation for allosteric molecules. Currently, ASD contains 336 allosteric proteins from 101 species and 8095 modulators in three categories (activators, inhibitors and regulators). Proteins are annotated with a detailed description of allostery, biological process and related diseases, and modulators with binding affinity, physicochemical properties and therapeutic area. Integrating the information of allosteric proteins in ASD should allow for the identification of specific allosteric sites of a given subtype among proteins of the same family that can potentially serve as ideal targets for experimental validation. In addition, modulators curated in ASD can be used to investigate potent allosteric targets for the query compound, and also help chemists to implement structure modifications for novel allosteric drug design. Therefore, ASD could be a platform and a starting point for biologists and medicinal chemists for furthering allosteric research. ASD is freely available at http://mdl.shsmu.edu.cn/ASD/. PMID:21051350
SAbDab: the structural antibody database
Dunbar, James; Krawczyk, Konrad; Leem, Jinwoo; Baker, Terry; Fuchs, Angelika; Georges, Guy; Shi, Jiye; Deane, Charlotte M.
2014-01-01
Structural antibody database (SAbDab; http://opig.stats.ox.ac.uk/webapps/sabdab) is an online resource containing all the publicly available antibody structures annotated and presented in a consistent fashion. The data are annotated with several properties including experimental information, gene details, correct heavy and light chain pairings, antigen details and, where available, antibody–antigen binding affinity. The user can select structures, according to these attributes as well as structural properties such as complementarity determining region loop conformation and variable domain orientation. Individual structures, datasets and the complete database can be downloaded. PMID:24214988
Identifying potential maternal genes of Bombyx mori using digital gene expression profiling
Xu, Pingzhen
2018-01-01
Maternal genes present in mature oocytes play a crucial role in the early development of silkworm. Although maternal genes have been widely studied in many other species, there has been limited research in Bombyx mori. High-throughput next generation sequencing provides a practical method for gene discovery on a genome-wide level. Herein, a transcriptome study was used to identify maternal-related genes from silkworm eggs. Unfertilized eggs from five different stages of early development were used to detect changes in gene expression. The expressed genes showed different patterns over time. Seventy-six maternal genes were annotated according to homology analysis with Drosophila melanogaster. More than half of the differentially expressed maternal genes fell into four expression patterns, while the expression patterns showed a downward trend over time. The functional annotation of these maternal genes was mainly related to transcription factor activity, growth factor activity, nucleic acid binding, RNA binding, ATP binding, and ion binding. Additionally, twenty-two gene clusters including maternal genes were identified from 18 scaffolds. Altogether, we plotted a profile for the maternal genes of Bombyx mori using a digital gene expression profiling method. This will provide the basis for maternal-specific signature research and improve the understanding of the early development of silkworm. PMID:29462160
Osato, Naoki
2018-01-19
Transcriptional target genes show functional enrichment of genes. However, how many and how significantly transcriptional target genes include functional enrichments are still unclear. To address these issues, I predicted human transcriptional target genes using open chromatin regions, ChIP-seq data and DNA binding sequences of transcription factors in databases, and examined functional enrichment and gene expression level of putative transcriptional target genes. Gene Ontology annotations showed four times larger numbers of functional enrichments in putative transcriptional target genes than gene expression information alone, independent of transcriptional target genes. To compare the number of functional enrichments of putative transcriptional target genes between cells or search conditions, I normalized the number of functional enrichment by calculating its ratios in the total number of transcriptional target genes. With this analysis, native putative transcriptional target genes showed the largest normalized number of functional enrichments, compared with target genes including 5-60% of randomly selected genes. The normalized number of functional enrichments was changed according to the criteria of enhancer-promoter interactions such as distance from transcriptional start sites and orientation of CTCF-binding sites. Forward-reverse orientation of CTCF-binding sites showed significantly higher normalized number of functional enrichments than the other orientations. Journal papers showed that the top five frequent functional enrichments were related to the cellular functions in the three cell types. The median expression level of transcriptional target genes changed according to the criteria of enhancer-promoter assignments (i.e. interactions) and was correlated with the changes of the normalized number of functional enrichments of transcriptional target genes. Human putative transcriptional target genes showed significant functional enrichments. 
Functional enrichments were related to the cellular functions. The normalized number of functional enrichments of human putative transcriptional target genes changed according to the criteria of enhancer-promoter assignments and correlated with the median expression level of the target genes. These analyses and characters of human putative transcriptional target genes would be useful to examine the criteria of enhancer-promoter assignments and to predict the novel mechanisms and factors such as DNA binding proteins and DNA sequences of enhancer-promoter interactions.
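The normalization mentioned in this abstract (number of functional enrichments divided by the total number of transcriptional target genes) can be sketched as a plain ratio. This is only a minimal illustration: the function name `normalized_enrichment`, the condition labels and all counts are hypothetical and do not come from the paper.

```python
def normalized_enrichment(num_enrichments: int, num_target_genes: int) -> float:
    """Return the number of functional enrichments per target gene."""
    if num_target_genes <= 0:
        raise ValueError("num_target_genes must be positive")
    return num_enrichments / num_target_genes


# Hypothetical (enrichments, target genes) counts for two search conditions.
conditions = {
    "condition_a": (120, 4000),
    "condition_b": (90, 2000),
}

# Normalizing makes conditions with different numbers of target genes comparable.
ratios = {
    name: normalized_enrichment(enrichments, genes)
    for name, (enrichments, genes) in conditions.items()
}
```

With these made-up counts, the second condition has fewer enrichments in absolute terms but a higher normalized value, which is the kind of comparison the ratio is meant to support.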
iDNA-Prot: Identification of DNA Binding Proteins Using Random Forest with Grey Model
Lin, Wei-Zhong; Fang, Jian-An; Xiao, Xuan; Chou, Kuo-Chen
2011-01-01
DNA-binding proteins play crucial roles in various cellular processes. Developing high throughput tools for rapidly and effectively identifying DNA-binding proteins is one of the major challenges in the field of genome annotation. Although many efforts have been made in this regard, further effort is needed to enhance the prediction power. By incorporating the features into the general form of pseudo amino acid composition that were extracted from protein sequences via the “grey model” and by adopting the random forest operation engine, we proposed a new predictor, called iDNA-Prot, for identifying uncharacterized proteins as DNA-binding proteins or non-DNA binding proteins based on their amino acid sequences information alone. The overall success rate by iDNA-Prot was 83.96% that was obtained via jackknife tests on a newly constructed stringent benchmark dataset in which none of the proteins included has pairwise sequence identity to any other in a same subset. In addition to achieving high success rate, the computational time for iDNA-Prot is remarkably shorter in comparison with the relevant existing predictors. Hence it is anticipated that iDNA-Prot may become a useful high throughput tool for large-scale analysis of DNA-binding proteins. As a user-friendly web-server, iDNA-Prot is freely accessible to the public at the web-site on http://icpr.jci.edu.cn/bioinfo/iDNA-Prot or http://www.jci-bioinfo.cn/iDNA-Prot. Moreover, for the convenience of the vast majority of experimental scientists, a step-by-step guide is provided on how to use the web-server to get the desired results. PMID:21935457
Oellrich, Anika; Collier, Nigel; Smedley, Damian; Groza, Tudor
2015-01-01
Electronic health records and scientific articles possess differing linguistic characteristics that may impact the performance of natural language processing tools developed for one or the other. In this paper, we investigate the performance of four extant concept recognition tools: the clinical Text Analysis and Knowledge Extraction System (cTAKES), the National Center for Biomedical Ontology (NCBO) Annotator, the Biomedical Concept Annotation System (BeCAS) and MetaMap. Each of the four concept recognition systems is applied to four different corpora: the i2b2 corpus of clinical documents, a PubMed corpus of Medline abstracts, a clinical trials corpus and the ShARe/CLEF corpus. In addition, we assess the individual system performances with respect to one gold standard annotation set, available for the ShARe/CLEF corpus. Furthermore, we built a silver standard annotation set from the individual systems' output and assess the quality as well as the contribution of individual systems to the quality of the silver standard. Our results demonstrate that mainly the NCBO annotator and cTAKES contribute to the silver standard corpora (F1-measures in the range of 21% to 74%) and their quality (best F1-measure of 33%), independent from the type of text investigated. While BeCAS and MetaMap can contribute to the precision of silver standard annotations (precision of up to 42%), the F1-measure drops when combined with NCBO Annotator and cTAKES due to a low recall. In conclusion, the performances of individual systems need to be improved independently from the text types, and the leveraging strategies to best take advantage of individual systems' annotations need to be revised. The textual content of the PubMed corpus, accession numbers for the clinical trials corpus, and assigned annotations of the four concept recognition systems as well as the generated silver standard annotation sets are available from http://purl.org/phenotype/resources. 
The textual content of the ShARe/CLEF (https://sites.google.com/site/shareclefehealth/data) and i2b2 (https://i2b2.org/NLP/DataSets/) corpora needs to be requested with the individual corpus providers.
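The precision, recall and F1-measure figures quoted in this abstract follow the usual definitions, which can be sketched as below. This is a minimal illustration under stated assumptions: annotations are modelled as (document, start, end, concept) tuples that must match exactly, and the function name and example annotations are hypothetical rather than taken from the study.

```python
from typing import Set, Tuple

Annotation = Tuple[str, int, int, str]  # (document id, start, end, concept id)


def precision_recall_f1(predicted: Set[Annotation],
                        gold: Set[Annotation]) -> Tuple[float, float, float]:
    """Score a system's annotations against a reference set (exact match)."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical reference annotations and system output.
gold = {
    ("doc1", 0, 5, "C001"),
    ("doc1", 10, 18, "C002"),
    ("doc2", 3, 9, "C003"),
    ("doc2", 20, 27, "C004"),
}
system = {
    ("doc1", 0, 5, "C001"),
    ("doc2", 3, 9, "C003"),
    ("doc2", 30, 35, "C009"),
}

p, r, f1 = precision_recall_f1(system, gold)
```

Because F1 is the harmonic mean of precision and recall, a system with good precision but low recall still receives a low F1, which is consistent with the drop the abstract reports.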
American Revolution, Fitness, Presidents, U.S., Rocks & Minerals, Spelling, Vocabulary.
ERIC Educational Resources Information Center
Web Feet, 2002
2002-01-01
This annotated subject guide to Web sites for grades K-8 focuses on the American Revolution, fitness, U.S. Presidents, rocks and minerals, spelling, vocabulary, and calendar connections for Women's History Month and other March observations. Specific grade levels are indicated for each annotation. (LRW)
Identification of novel alleles of the rice blast resistance gene Pi54
NASA Astrophysics Data System (ADS)
Vasudevan, Kumar; Gruissem, Wilhelm; Bhullar, Navreet K.
2015-10-01
Rice blast is one of the most devastating rice diseases and continuous resistance breeding is required to control the disease. The rice blast resistance gene Pi54 initially identified in an Indian cultivar confers broad-spectrum resistance in India. We explored the allelic diversity of the Pi54 gene among 885 Indian rice genotypes that were found resistant in our screening against field mixture of naturally existing M. oryzae strains as well as against five unique strains. These genotypes are also annotated as rice blast resistant in the International Rice Genebank database. Sequence-based allele mining was used to amplify and clone the Pi54 allelic variants. Nine new alleles of Pi54 were identified based on the nucleotide sequence comparison to the Pi54 reference sequence as well as to already known Pi54 alleles. DNA sequence analysis of the newly identified Pi54 alleles revealed several single polymorphic sites, three double deletions and an eight base pair deletion. A SNP-rich region was found between a tyrosine kinase phosphorylation site and the nucleotide binding site (NBS) domain. Together, the newly identified Pi54 alleles expand the allelic series and are candidates for rice blast resistance breeding programs.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Willis, Mark A.; Zhuang, Zhihao; Song, Feng
2008-04-02
The crystal structure of HI0827 from Haemophilus influenzae Rd KW20, initially annotated 'hypothetical protein' in sequence databases, exhibits an acyl-coenzyme A (acyl-CoA) thioesterase 'hot dog' fold with a trimer of dimers oligomeric association, a novel assembly for this enzyme family. In studies described in the preceding paper [Zhuang, Z., Song, F., Zhao, H., Li, L., Cao, J., Eisenstein, E., Herzberg, O., and Dunaway-Mariano, D. (2008) Biochemistry 47, 2789-2796], HI0827 is shown to be an acyl-CoA thioesterase that acts on a wide range of acyl-CoA compounds. Two substrate binding sites are located across the dimer interface. The binding sites are occupied by two CoA molecules, one with full occupancy and the second only partially occupied. The CoA molecules, acquired from HI0827-expressing Escherichia coli cells, remained tightly bound to the enzyme through the protein purification steps. The difference in CoA occupancies indicates a different substrate affinity for each of the binding sites, which in turn implies that the enzyme might be subject to allosteric regulation. Mutagenesis studies have shown that the replacement of the putative catalytic carboxylate Asp44 with an alanine residue abolishes activity. The impact of this mutation is seen in the crystal structure of D44A HI0827. Whereas the overall fold and assembly of the mutant protein are the same as those of the wild-type enzyme, the CoA ligands are absent. The dimer interface is perturbed, and the channel that accommodates the thioester acyl chain is more open and wider than that observed in the wild-type enzyme. A model of intact substrate bound to wild-type HI0827 provides a structural rationale for the broad substrate range.
Deep sequencing of cardiac microRNA-mRNA interactomes in clinical and experimental cardiomyopathy
Matkovich, Scot J.; Dorn, Gerald W.
2018-01-01
Summary MicroRNAs are a family of short (~21 nucleotide) noncoding RNAs that serve key roles in cellular growth and differentiation and the response of the heart to stress stimuli. As the sequence-specific recognition element of RNA-induced silencing complexes (RISCs), microRNAs bind mRNAs and prevent their translation via mechanisms that may include transcript degradation and/or prevention of ribosome binding. Short microRNA sequences and the ability of microRNAs to bind to mRNA sites having only partial/imperfect sequence complementarity complicate purely computational analyses of microRNA-mRNA interactomes. Furthermore, computational microRNA target prediction programs typically ignore biological context, and therefore the principal determinants of microRNA-mRNA binding: the presence and quantity of each. To address these deficiencies we describe an empirical method, developed via studies of stressed and failing hearts, to determine disease-induced changes in microRNAs, mRNAs, and the mRNAs targeted to the RISC, without cross-linking mRNAs to RISC proteins. Deep sequencing methods are used to determine RNA abundances, delivering unbiased, quantitative RNA data limited only by their annotation in the genome of interest. We describe the laboratory bench steps required to perform these experiments, experimental design strategies to achieve an appropriate number of sequencing reads per biological replicate, and computer-based processing tools and procedures to convert large raw sequencing data files into gene expression measures useful for differential expression analyses. PMID:25836573
Deep sequencing of cardiac microRNA-mRNA interactomes in clinical and experimental cardiomyopathy.
Matkovich, Scot J; Dorn, Gerald W
2015-01-01
MicroRNAs are a family of short (~21 nucleotide) noncoding RNAs that serve key roles in cellular growth and differentiation and the response of the heart to stress stimuli. As the sequence-specific recognition element of RNA-induced silencing complexes (RISCs), microRNAs bind mRNAs and prevent their translation via mechanisms that may include transcript degradation and/or prevention of ribosome binding. Short microRNA sequences and the ability of microRNAs to bind to mRNA sites having only partial/imperfect sequence complementarity complicate purely computational analyses of microRNA-mRNA interactomes. Furthermore, computational microRNA target prediction programs typically ignore biological context, and therefore the principal determinants of microRNA-mRNA binding: the presence and quantity of each. To address these deficiencies we describe an empirical method, developed via studies of stressed and failing hearts, to determine disease-induced changes in microRNAs, mRNAs, and the mRNAs targeted to the RISC, without cross-linking mRNAs to RISC proteins. Deep sequencing methods are used to determine RNA abundances, delivering unbiased, quantitative RNA data limited only by their annotation in the genome of interest. We describe the laboratory bench steps required to perform these experiments, experimental design strategies to achieve an appropriate number of sequencing reads per biological replicate, and computer-based processing tools and procedures to convert large raw sequencing data files into gene expression measures useful for differential expression analyses.
Mapping Polymerization and Allostery of Hemoglobin S Using Point Mutations
Weinkam, Patrick; Sali, Andrej
2014-01-01
Hemoglobin is a complex system that undergoes conformational changes in response to oxygen, allosteric effectors, mutations, and environmental changes. Here, we study allostery and polymerization of hemoglobin and its variants by application of two previously described methods: (i) AllosMod for simulating allostery dynamics given two allosterically related input structures and (ii) a machine-learning method for dynamics- and structure-based prediction of the mutation impact on allostery (Weinkam et al. J. Mol. Biol. 2013), now applicable to systems with multiple coupled binding sites such as hemoglobin. First, we predict the relative stabilities of substates and microstates of hemoglobin, which are determined primarily by entropy within our model. Next, we predict the impact of 866 annotated mutations on hemoglobin’s oxygen binding equilibrium. We then discuss a subset of 30 mutations that occur in the presence of the sickle cell mutation and whose effects on polymerization have been measured. Seven of these HbS mutations occur in three predicted druggable binding pockets that might be exploited to directly inhibit polymerization; one of these binding pockets is not apparent in the crystal structure but only in structures generated by AllosMod. For the 30 mutations, we predict that mutation-induced conformational changes within a single tetramer tend not to significantly impact polymerization; instead, these mutations more likely impact polymerization by directly perturbing a polymerization interface. Finally, our analysis of allostery allows us to hypothesize why hemoglobin evolved to have multiple subunits and a persistent low frequency sickle cell mutation. PMID:23957820
APAtrap: identification and quantification of alternative polyadenylation sites from RNA-seq data.
Ye, Congting; Long, Yuqi; Ji, Guoli; Li, Qingshun Quinn; Wu, Xiaohui
2018-06-01
Alternative polyadenylation (APA) has been increasingly recognized as a crucial mechanism that contributes to transcriptome diversity and gene expression regulation. As RNA-seq has become a routine protocol for transcriptome analysis, it is of great interest to leverage such unprecedented collection of RNA-seq data by new computational methods to extract and quantify APA dynamics in these transcriptomes. However, research progress in this area has been relatively limited. Conventional methods rely on either transcript assembly to determine transcript 3' ends or annotated poly(A) sites. Moreover, they can neither identify more than two poly(A) sites in a gene nor detect dynamic APA site usage considering more than two poly(A) sites. We developed an approach called APAtrap based on the mean squared error model to identify and quantify APA sites from RNA-seq data. APAtrap is capable of identifying novel 3' UTRs and 3' UTR extensions, which contributes to locating potential poly(A) sites in previously overlooked regions and improving genome annotations. APAtrap also aims to tally all potential poly(A) sites and detect genes with differential APA site usages between conditions. Extensive comparisons of APAtrap with two other latest methods, ChangePoint and DaPars, using various RNA-seq datasets from simulation studies, human and Arabidopsis demonstrate the efficacy and flexibility of APAtrap for any organisms with an annotated genome. Freely available for download at https://apatrap.sourceforge.io. liqq@xmu.edu.cn or xhuister@xmu.edu.cn. Supplementary data are available at Bioinformatics online.
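The abstract states only that APAtrap is "based on the mean squared error model" and gives no further detail. As a generic illustration of that idea, and explicitly not the APAtrap implementation, the sketch below finds the single position that best splits a read-coverage profile into two flat segments by minimizing the mean squared error of a two-level fit. The function name and the coverage values are hypothetical.

```python
from typing import Sequence


def best_change_point(coverage: Sequence[float]) -> int:
    """Return the index i that minimizes the mean squared error when
    coverage[:i] and coverage[i:] are each approximated by their own mean."""
    n = len(coverage)
    if n < 2:
        raise ValueError("need at least two coverage values")

    def squared_error(values: Sequence[float]) -> float:
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best_index, best_error = 1, float("inf")
    for i in range(1, n):
        # Error of a two-segment, piecewise-constant fit with a break at i.
        error = (squared_error(coverage[:i]) + squared_error(coverage[i:])) / n
        if error < best_error:
            best_index, best_error = i, error
    return best_index


# Hypothetical per-base coverage along a 3' UTR with a drop after position 4,
# the kind of pattern a proximal poly(A) site might produce.
coverage = [50, 52, 49, 51, 20, 19, 21, 20]
split = best_change_point(coverage)
```

This brute-force version is quadratic in the profile length and handles a single break only. Handling more than two poly(A) sites per gene, as the abstract describes, would require extending the search to multiple breaks.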
Recognition of Protein-coding Genes Based on Z-curve Algorithms
Guo, Feng-Biao; Lin, Yan; Chen, Ling-Ling
2014-01-01
Recognition of protein-coding genes, a classical bioinformatics issue, is an essential step in annotating newly sequenced genomes. The Z-curve algorithm, one of the most effective methods for this problem, has been successfully applied in annotating or re-annotating many genomes, including those of bacteria, archaea and viruses. Two Z-curve based ab initio gene-finding programs have been developed: ZCURVE (for bacteria and archaea) and ZCURVE_V (for viruses and phages). ZCURVE_C (for 57 bacteria) and Zfisher (for any bacterium) are web servers for re-annotation of bacterial and archaeal genomes. The above four tools can be used for genome annotation or re-annotation, either independently or combined with other gene-finding programs. In addition to recognizing protein-coding genes and exons, Z-curve algorithms are also effective in recognizing promoters and translation start sites. Here, we summarize the applications of Z-curve algorithms in gene finding and genome annotation. PMID:24822027
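For readers unfamiliar with the representation, the Z-curve maps a DNA sequence onto a three-dimensional curve built from cumulative base counts. The sketch below follows the commonly cited definition, with x tracking purine versus pyrimidine, y amino versus keto, and z weak versus strong hydrogen bonding. It illustrates the transform only, not the ZCURVE gene-finding programs, and the function name `z_curve` is hypothetical.

```python
from typing import List, Tuple


def z_curve(sequence: str) -> List[Tuple[int, int, int]]:
    """Return the cumulative Z-curve coordinates (x_n, y_n, z_n) for a DNA string.

    x_n = (A_n + G_n) - (C_n + T_n)   purine vs. pyrimidine
    y_n = (A_n + C_n) - (G_n + T_n)   amino vs. keto
    z_n = (A_n + T_n) - (C_n + G_n)   weak vs. strong hydrogen bonding
    where A_n, C_n, G_n, T_n count each base in the first n positions.
    """
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    points = []
    for base in sequence.upper():
        if base not in counts:
            raise ValueError(f"unexpected base: {base!r}")
        counts[base] += 1
        a, c, g, t = counts["A"], counts["C"], counts["G"], counts["T"]
        points.append(((a + g) - (c + t), (a + c) - (g + t), (a + t) - (c + g)))
    return points


points = z_curve("ATGC")
```

Gene finders built on this idea typically derive features from such coordinates (for example, per codon position) and feed them to a classifier. That step is outside the scope of this sketch.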
Bovine Genome Database: supporting community annotation and analysis of the Bos taurus genome
2010-01-01
Background A goal of the Bovine Genome Database (BGD; http://BovineGenome.org) has been to support the Bovine Genome Sequencing and Analysis Consortium (BGSAC) in the annotation and analysis of the bovine genome. We were faced with several challenges, including the need to maintain consistent quality despite diversity in annotation expertise in the research community, the need to maintain consistent data formats, and the need to minimize the potential duplication of annotation effort. With new sequencing technologies allowing many more eukaryotic genomes to be sequenced, the demand for collaborative annotation is likely to increase. Here we present our approach, challenges and solutions facilitating a large distributed annotation project. Results and Discussion BGD has provided annotation tools that supported 147 members of the BGSAC in contributing 3,871 gene models over a fifteen-week period, and these annotations have been integrated into the bovine Official Gene Set. Our approach has been to provide an annotation system, which includes a BLAST site, multiple genome browsers, an annotation portal, and the Apollo Annotation Editor configured to connect directly to our Chado database. In addition to implementing and integrating components of the annotation system, we have performed computational analyses to create gene evidence tracks and a consensus gene set, which can be viewed on individual gene pages at BGD. Conclusions We have provided annotation tools that alleviate challenges associated with distributed annotation. Our system provides a consistent set of data to all annotators and eliminates the need for annotators to format data. Involving the bovine research community in genome annotation has allowed us to leverage expertise in various areas of bovine biology to provide biological insight into the genome sequence. PMID:21092105
Protein Structure and Function Prediction Using I-TASSER
Yang, Jianyi; Zhang, Yang
2016-01-01
I-TASSER is a hierarchical protocol for automated protein structure prediction and structure-based function annotation. Starting from the amino acid sequence of target proteins, I-TASSER first generates full-length atomic structural models from multiple threading alignments and iterative structural assembly simulations followed by atomic-level structure refinement. The biological functions of the protein, including ligand-binding sites, enzyme commission number, and gene ontology terms, are then inferred from known protein function databases based on sequence and structure profile comparisons. I-TASSER is freely available as both an on-line server and a stand-alone package. This unit describes how to use the I-TASSER protocol to generate structure and function prediction and how to interpret the prediction results, as well as alternative approaches for further improving the I-TASSER modeling quality for distant-homologous and multi-domain protein targets. PMID:26678386
Annual Editions: Early Childhood Education 06/07
ERIC Educational Resources Information Center
Paciorek, Karen Menke, Ed.
2006-01-01
This 27th edition of "Annual Editions: Early Childhood Education" provides convenient, inexpensive access to current articles selected from the best of the public press. Organizational features include: an annotated listing of selected World Wide Web sites; an annotated table of contents; a topic guide; a general introduction; brief overviews for…
Health Communication and Literacy: An Annotated Bibliography.
ERIC Educational Resources Information Center
Beveridge, Jennifer
This annotated bibliography lists publications and World Wide Web sites dealing with health communication and literacy. The 51 publications, which were all published between 1982 and 1998, contain information about and/or for use in the following areas: assessment, assessment tools, elderly adults, empowerment, maternal and child health, patient…
Careers; Current Events; Government, U.S.; Insects; Science Experiments; Terrorism, War on.
ERIC Educational Resources Information Center
Web Feet, 2001
2001-01-01
This annotated subject guide to Web sites for grades K-8 focuses on careers, current events, government, insects, science experiments, the war on terrorism, and calendar connections for Martin Luther King Day and other January observances. Specific grade levels are indicated for each annotation. (LRW)
LS-SNP: large-scale annotation of coding non-synonymous SNPs based on multiple information sources.
Karchin, Rachel; Diekhans, Mark; Kelly, Libusha; Thomas, Daryl J; Pieper, Ursula; Eswar, Narayanan; Haussler, David; Sali, Andrej
2005-06-15
The NCBI dbSNP database lists over 9 million single nucleotide polymorphisms (SNPs) in the human genome, but currently contains limited annotation information. SNPs that result in amino acid residue changes (nsSNPs) are of critical importance in variation between individuals, including disease and drug sensitivity. We have developed LS-SNP, a genomic scale software pipeline to annotate nsSNPs. LS-SNP comprehensively maps nsSNPs onto protein sequences, functional pathways and comparative protein structure models, and predicts positions where nsSNPs destabilize proteins, interfere with the formation of domain-domain interfaces, have an effect on protein-ligand binding or severely impact human health. It currently annotates 28,043 validated SNPs that produce amino acid residue substitutions in human proteins from the SwissProt/TrEMBL database. Annotations can be viewed via a web interface either in the context of a genomic region or by selecting sets of SNPs, genes, proteins or pathways. These results are useful for identifying candidate functional SNPs within a gene, haplotype or pathway and in probing molecular mechanisms responsible for functional impacts of nsSNPs. http://www.salilab.org/LS-SNP CONTACT: rachelk@salilab.org http://salilab.org/LS-SNP/supp-info.pdf.
SeqHound: biological sequence and structure database as a platform for bioinformatics research
2002-01-01
Background SeqHound has been developed as an integrated biological sequence, taxonomy, annotation and 3-D structure database system. It provides a high-performance server platform for bioinformatics research in a locally-hosted environment. Results SeqHound is based on the National Center for Biotechnology Information data model and programming tools. It offers daily updated contents of all Entrez sequence databases in addition to 3-D structural data and information about sequence redundancies, sequence neighbours, taxonomy, complete genomes, functional annotation including Gene Ontology terms and literature links to PubMed. SeqHound is accessible via a web server through a Perl, C or C++ remote API or an optimized local API. It provides functionality necessary to retrieve specialized subsets of sequences, structures and structural domains. Sequences may be retrieved in FASTA, GenBank, ASN.1 and XML formats. Structures are available in ASN.1, XML and PDB formats. Emphasis has been placed on complete genomes, taxonomy, domain and functional annotation as well as 3-D structural functionality in the API, while fielded text indexing functionality remains under development. SeqHound also offers a streamlined WWW interface for simple web-user queries. Conclusions The system has proven useful in several published bioinformatics projects such as the BIND database and offers a cost-effective infrastructure for research. SeqHound will continue to develop and be provided as a service of the Blueprint Initiative at the Samuel Lunenfeld Research Institute. The source code and examples are available under the terms of the GNU public license at the Sourceforge site http://sourceforge.net/projects/slritools/ in the SLRI Toolkit. PMID:12401134
The standard operating procedure of the DOE-JGI Metagenome Annotation Pipeline (MAP v.4)
Huntemann, Marcel; Ivanova, Natalia N.; Mavromatis, Konstantinos; ...
2016-02-24
The DOE-JGI Metagenome Annotation Pipeline (MAP v.4) performs structural and functional annotation for metagenomic sequences that are submitted to the Integrated Microbial Genomes with Microbiomes (IMG/M) system for comparative analysis. The pipeline runs on nucleotide sequences provided via the IMG submission site. Users must first define their analysis projects in GOLD and then submit the associated sequence datasets consisting of scaffolds/contigs with optional coverage information and/or unassembled reads in fasta and fastq file formats. The MAP processing consists of feature prediction including identification of protein-coding genes, non-coding RNAs and regulatory RNAs, as well as CRISPR elements. Structural annotation is followed by functional annotation including assignment of protein product names and connection to various protein family databases.
The standard operating procedure of the DOE-JGI Metagenome Annotation Pipeline (MAP v.4)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Huntemann, Marcel; Ivanova, Natalia N.; Mavromatis, Konstantinos
The DOE-JGI Metagenome Annotation Pipeline (MAP v.4) performs structural and functional annotation for metagenomic sequences that are submitted to the Integrated Microbial Genomes with Microbiomes (IMG/M) system for comparative analysis. The pipeline runs on nucleotide sequences provided via the IMG submission site. Users must first define their analysis projects in GOLD and then submit the associated sequence datasets consisting of scaffolds/contigs with optional coverage information and/or unassembled reads in fasta and fastq file formats. The MAP processing consists of feature prediction including identification of protein-coding genes, non-coding RNAs and regulatory RNAs, as well as CRISPR elements. Structural annotation is followed by functional annotation including assignment of protein product names and connection to various protein family databases.
Schwientek, Patrick; Neshat, Armin; Kalinowski, Jörn; Klein, Andreas; Rückert, Christian; Schneiker-Bekel, Susanne; Wendler, Sergej; Stoye, Jens; Pühler, Alfred
2014-11-20
Actinoplanes sp. SE50/110 is the producer of the alpha-glucosidase inhibitor acarbose, which is an economically relevant and potent drug in the treatment of type-2 diabetes mellitus. In this study, we present the detection of transcription start sites on this genome by sequencing enriched 5'-ends of primary transcripts. Altogether, 1427 putative transcription start sites were initially identified. With the help of the annotated genome sequence, 661 transcription start sites were found to belong to the leader region of protein-coding genes with the surprising result that roughly 20% of these genes rank among the class of leaderless transcripts. Next, conserved promoter motifs were identified for protein-coding genes with and without leader sequences. The mapped transcription start sites were finally used to improve the annotation of the Actinoplanes sp. SE50/110 genome sequence. Concerning protein-coding genes, 41 translation start sites were corrected and 9 novel protein-coding genes could be identified. In addition to this, 122 previously undetermined non-coding RNA (ncRNA) genes of Actinoplanes sp. SE50/110 were defined. Focusing on antisense transcription start sites located within coding genes or their leader sequences, it was discovered that 96 of those ncRNA genes belong to the class of antisense RNA (asRNA) genes. The remaining 26 ncRNA genes were found outside of known protein-coding genes. Four chosen examples of prominent ncRNA genes, namely the transfer messenger RNA gene ssrA, the ribonuclease P class A RNA gene rnpB, the cobalamin riboswitch RNA gene cobRS, and the selenocysteine-specific tRNA gene selC, are presented in more detail. This study demonstrates that sequencing of enriched 5'-ends of primary transcripts and the identification of transcription start sites are valuable tools for advanced genome annotation of Actinoplanes sp. SE50/110 and most probably also for other bacteria. Copyright © 2014 Elsevier B.V. All rights reserved.
Functional Evolution of PLP-dependent Enzymes based on Active-Site Structural Similarities
Catazaro, Jonathan; Caprez, Adam; Guru, Ashu; Swanson, David; Powers, Robert
2014-01-01
Families of distantly related proteins typically have very low sequence identity, which hinders evolutionary analysis and functional annotation. Slowly evolving features of proteins, such as an active site, are therefore valuable for annotating putative and distantly related proteins. To date, a complete evolutionary analysis of the functional relationship of an entire enzyme family based on active-site structural similarities has not yet been undertaken. Pyridoxal-5’-phosphate (PLP) dependent enzymes are primordial enzymes that diversified in the last universal ancestor. Using the Comparison of Protein Active Site Structures (CPASS) software and database, we show that the active site structures of PLP-dependent enzymes can be used to infer evolutionary relationships based on functional similarity. The enzymes successfully clustered together based on substrate specificity, function, and three-dimensional fold. This study demonstrates the value of using active site structures for functional evolutionary analysis and the effectiveness of CPASS. PMID:24920327
Functional evolution of PLP-dependent enzymes based on active-site structural similarities.
Catazaro, Jonathan; Caprez, Adam; Guru, Ashu; Swanson, David; Powers, Robert
2014-10-01
Families of distantly related proteins typically have very low sequence identity, which hinders evolutionary analysis and functional annotation. Slowly evolving features of proteins, such as an active site, are therefore valuable for annotating putative and distantly related proteins. To date, a complete evolutionary analysis of the functional relationship of an entire enzyme family based on active-site structural similarities has not yet been undertaken. Pyridoxal-5'-phosphate (PLP) dependent enzymes are primordial enzymes that diversified in the last universal ancestor. Using the comparison of protein active site structures (CPASS) software and database, we show that the active site structures of PLP-dependent enzymes can be used to infer evolutionary relationships based on functional similarity. The enzymes successfully clustered together based on substrate specificity, function, and three-dimensional-fold. This study demonstrates the value of using active site structures for functional evolutionary analysis and the effectiveness of CPASS. © 2014 Wiley Periodicals, Inc.
Classic Religious Books for Children: An Annotated Bibliography.
ERIC Educational Resources Information Center
Campbell, Carol, Comp.
This annotated bibliography of religious books for children contains approximately 450 books, one-fifth of which are Judaic. The books' current availability has been verified using Web sites such as those of individual publishers, the Library of Congress, Amazon.com, or Barnes&Noble.com. New subject headings have been added, such as Kwanza,…
The Great War: Online Resources.
ERIC Educational Resources Information Center
Duncanson, Bruce
2002-01-01
Presents an annotated bibliography of Web sites about World War I. Includes: (1) general Web sites; (2) Web sites with information during the war; (3) Web sites with information about post-World War I; (4) Web sites that provide photos, sound files of speeches, and propaganda posters; and (5) Web sites with lesson plans. (CMK)
2013-01-01
Background The binding of transcription factors to DNA plays an essential role in the regulation of gene expression. Numerous experiments elucidated binding sequences which subsequently have been used to derive statistical models for predicting potential transcription factor binding sites (TFBS). The rapidly increasing number of genome sequence data requires sophisticated computational approaches to manage and query experimental and predicted TFBS data in the context of other epigenetic factors and across different organisms. Results We have developed D-Light, a novel client-server software package to store and query large amounts of TFBS data for any number of genomes. Users can add small-scale data to the server database and query them in a large scale, genome-wide promoter context. The client is implemented in Java and provides simple graphical user interfaces and data visualization. Here we also performed a statistical analysis showing what a user can expect for certain parameter settings and we illustrate the usage of D-Light with the help of a microarray data set. Conclusions D-Light is an easy to use software tool to integrate, store and query annotation data for promoters. A public D-Light server, the client and server software for local installation and the source code under GNU GPL license are available at http://biwww.che.sbg.ac.at/dlight. PMID:23617301
Brozovic, Matija; Dantec, Christelle; Dardaillon, Justine; Dauga, Delphine; Faure, Emmanuel; Gineste, Mathieu; Louis, Alexandra; Naville, Magali; Nitta, Kazuhiro R; Piette, Jacques; Reeves, Wendy; Scornavacca, Céline; Simion, Paul; Vincentelli, Renaud; Bellec, Maelle; Aicha, Sameh Ben; Fagotto, Marie; Guéroult-Bellone, Marion; Haeussler, Maximilian; Jacox, Edwin; Lowe, Elijah K; Mendez, Mickael; Roberge, Alexis; Stolfi, Alberto; Yokomori, Rui; Brown, C Titus; Cambillau, Christian; Christiaen, Lionel; Delsuc, Frédéric; Douzery, Emmanuel; Dumollard, Rémi; Kusakabe, Takehiro; Nakai, Kenta; Nishida, Hiroki; Satou, Yutaka; Swalla, Billie; Veeman, Michael; Volff, Jean-Nicolas; Lemaire, Patrick
2018-01-04
ANISEED (www.aniseed.cnrs.fr) is the main model organism database for tunicates, the sister-group of vertebrates. This release gives access to annotated genomes, gene expression patterns, and anatomical descriptions for nine ascidian species. It provides increased integration with external molecular and taxonomy databases, better support for epigenomics datasets, in particular RNA-seq, ChIP-seq and SELEX-seq, and features novel interactive interfaces for existing and novel datatypes. In particular, the cross-species navigation and comparison is enhanced through a novel taxonomy section describing each represented species and through the implementation of interactive phylogenetic gene trees for 60% of tunicate genes. The gene expression section displays the results of RNA-seq experiments for the three major model species of solitary ascidians. Gene expression is controlled by the binding of transcription factors to cis-regulatory sequences. A high-resolution description of the DNA-binding specificity for 131 Ciona robusta (formerly C. intestinalis type A) transcription factors by SELEX-seq is provided and used to map candidate binding sites across the Ciona robusta and Phallusia mammillata genomes. Finally, use of a WashU Epigenome browser enhances genome navigation, while a Genomicus server was set up to explore microsynteny relationships within tunicates and with vertebrates, Amphioxus, echinoderms and hemichordates. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Markunas, Christina A; Johnson, Eric O; Hancock, Dana B
2017-07-01
Genome-wide association study (GWAS)-identified variants are enriched for functional elements. However, we have limited knowledge of how functional enrichment may differ by disease/trait and tissue type. We tested a broad set of eight functional elements for enrichment among GWAS-identified SNPs (p < 5×10⁻⁸) from the NHGRI-EBI Catalog across seven disease/trait categories: cancer, cardiovascular disease, diabetes, autoimmune disease, psychiatric disease, neurological disease, and anthropometric traits. SNPs were annotated using HaploReg for the eight functional elements across any tissue: DNase sites, expression quantitative trait loci (eQTL), sequence conservation, enhancers, promoters, missense variants, sequence motifs, and protein binding sites. In addition, tissue-specific annotations were considered for brain vs. blood. Disease/trait SNPs were compared to a control set of 4809 SNPs matched to the GWAS SNPs (N = 1639) on allele frequency, gene density, distance to nearest gene, and linkage disequilibrium at ~3:1 ratio. Enrichment analyses were conducted using logistic regression, with Bonferroni correction. Overall, a significant enrichment was observed for all functional elements, except sequence motifs. Missense SNPs showed the strongest magnitude of enrichment. eQTLs were the only functional element significantly enriched across all diseases/traits. Magnitudes of enrichment were generally similar across diseases/traits, where enrichment was statistically significant. Blood vs. brain tissue effects on enrichment were dependent on disease/trait and functional element (e.g., cardiovascular disease: eQTLs P_TissueDifference = 1.28 × 10⁻⁶ vs. enhancers P_TissueDifference = 0.94). Identifying disease/trait-relevant functional elements and tissue types could provide new insight into the underlying biology, by guiding a priori GWAS analyses (e.g., brain enhancer elements for psychiatric disease) or facilitating post hoc interpretation.
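As an illustrative aside on the arithmetic in the preceding abstract, the sketch below shows how a Bonferroni-adjusted significance threshold and the control-to-GWAS SNP ratio can be computed. The function names are hypothetical. The choice of eight tests (one per functional element) is an assumption made for illustration only, since the abstract does not state how many tests were corrected for.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def control_ratio(n_controls: int, n_gwas: int) -> float:
    """Number of matched control SNPs per GWAS SNP."""
    if n_gwas < 1:
        raise ValueError("n_gwas must be at least 1")
    return n_controls / n_gwas


# Assumption: one test per functional element (eight in total).
threshold = bonferroni_threshold(0.05, 8)

# Counts taken from the abstract: 4809 control SNPs, 1639 GWAS SNPs.
ratio = control_ratio(4809, 1639)

print(f"Bonferroni threshold: {threshold}")
print(f"Control-to-GWAS ratio: {ratio:.2f}")
```

The ratio comes out slightly below 3, which agrees with the "~3:1" matching described in the abstract.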
Updated regulation curation model at the Saccharomyces Genome Database
Engel, Stacia R; Skrzypek, Marek S; Hellerstedt, Sage T; Wong, Edith D; Nash, Robert S; Weng, Shuai; Binkley, Gail; Sheppard, Travis K; Karra, Kalpana; Cherry, J Michael
2018-01-01
The Saccharomyces Genome Database (SGD) provides comprehensive, integrated biological information for the budding yeast Saccharomyces cerevisiae, along with search and analysis tools to explore these data, enabling the discovery of functional relationships between sequence and gene products in fungi and higher organisms. We have recently expanded our data model for regulation curation to address regulation at the protein level in addition to transcription, and are presenting the expanded data on the ‘Regulation’ pages at SGD. These pages include a summary describing the context under which the regulator acts, manually curated and high-throughput annotations showing the regulatory relationships for that gene and a graphical visualization of its regulatory network and connected networks. For genes whose products regulate other genes or proteins, the Regulation page includes Gene Ontology enrichment analysis of the biological processes in which those targets participate. For DNA-binding transcription factors, we also provide other information relevant to their regulatory function, such as DNA binding site motifs and protein domains. As with other data types at SGD, all regulatory relationships and accompanying data are available through YeastMine, SGD’s data warehouse based on InterMine. Database URL: http://www.yeastgenome.org PMID:29688362
Gaulton, Kyle J; Ferreira, Teresa; Lee, Yeji; Raimondo, Anne; Mägi, Reedik; Reschen, Michael E; Mahajan, Anubha; Locke, Adam; Rayner, N William; Robertson, Neil; Scott, Robert A; Prokopenko, Inga; Scott, Laura J; Green, Todd; Sparso, Thomas; Thuillier, Dorothee; Yengo, Loic; Grallert, Harald; Wahl, Simone; Frånberg, Mattias; Strawbridge, Rona J; Kestler, Hans; Chheda, Himanshu; Eisele, Lewin; Gustafsson, Stefan; Steinthorsdottir, Valgerdur; Thorleifsson, Gudmar; Qi, Lu; Karssen, Lennart C; van Leeuwen, Elisabeth M; Willems, Sara M; Li, Man; Chen, Han; Fuchsberger, Christian; Kwan, Phoenix; Ma, Clement; Linderman, Michael; Lu, Yingchang; Thomsen, Soren K; Rundle, Jana K; Beer, Nicola L; van de Bunt, Martijn; Chalisey, Anil; Kang, Hyun Min; Voight, Benjamin F; Abecasis, Gonçalo R; Almgren, Peter; Baldassarre, Damiano; Balkau, Beverley; Benediktsson, Rafn; Blüher, Matthias; Boeing, Heiner; Bonnycastle, Lori L; Bottinger, Erwin P; Burtt, Noël P; Carey, Jason; Charpentier, Guillaume; Chines, Peter S; Cornelis, Marilyn C; Couper, David J; Crenshaw, Andrew T; van Dam, Rob M; Doney, Alex S F; Dorkhan, Mozhgan; Edkins, Sarah; Eriksson, Johan G; Esko, Tonu; Eury, Elodie; Fadista, João; Flannick, Jason; Fontanillas, Pierre; Fox, Caroline; Franks, Paul W; Gertow, Karl; Gieger, Christian; Gigante, Bruna; Gottesman, Omri; Grant, George B; Grarup, Niels; Groves, Christopher J; Hassinen, Maija; Have, Christian T; Herder, Christian; Holmen, Oddgeir L; Hreidarsson, Astradur B; Humphries, Steve E; Hunter, David J; Jackson, Anne U; Jonsson, Anna; Jørgensen, Marit E; Jørgensen, Torben; Kao, Wen-Hong L; Kerrison, Nicola D; Kinnunen, Leena; Klopp, Norman; Kong, Augustine; Kovacs, Peter; Kraft, Peter; Kravic, Jasmina; Langford, Cordelia; Leander, Karin; Liang, Liming; Lichtner, Peter; Lindgren, Cecilia M; Lindholm, Eero; Linneberg, Allan; Liu, Ching-Ti; Lobbens, Stéphane; Luan, Jian'an; Lyssenko, Valeriya; Männistö, Satu; McLeod, Olga; Meyer, Julia; Mihailov, Evelin; Mirza, Ghazala; 
Mühleisen, Thomas W; Müller-Nurasyid, Martina; Navarro, Carmen; Nöthen, Markus M; Oskolkov, Nikolay N; Owen, Katharine R; Palli, Domenico; Pechlivanis, Sonali; Peltonen, Leena; Perry, John R B; Platou, Carl G P; Roden, Michael; Ruderfer, Douglas; Rybin, Denis; van der Schouw, Yvonne T; Sennblad, Bengt; Sigurðsson, Gunnar; Stančáková, Alena; Steinbach, Gerald; Storm, Petter; Strauch, Konstantin; Stringham, Heather M; Sun, Qi; Thorand, Barbara; Tikkanen, Emmi; Tonjes, Anke; Trakalo, Joseph; Tremoli, Elena; Tuomi, Tiinamaija; Wennauer, Roman; Wiltshire, Steven; Wood, Andrew R; Zeggini, Eleftheria; Dunham, Ian; Birney, Ewan; Pasquali, Lorenzo; Ferrer, Jorge; Loos, Ruth J F; Dupuis, Josée; Florez, Jose C; Boerwinkle, Eric; Pankow, James S; van Duijn, Cornelia; Sijbrands, Eric; Meigs, James B; Hu, Frank B; Thorsteinsdottir, Unnur; Stefansson, Kari; Lakka, Timo A; Rauramaa, Rainer; Stumvoll, Michael; Pedersen, Nancy L; Lind, Lars; Keinanen-Kiukaanniemi, Sirkka M; Korpi-Hyövälti, Eeva; Saaristo, Timo E; Saltevo, Juha; Kuusisto, Johanna; Laakso, Markku; Metspalu, Andres; Erbel, Raimund; Jöcke, Karl-Heinz; Moebus, Susanne; Ripatti, Samuli; Salomaa, Veikko; Ingelsson, Erik; Boehm, Bernhard O; Bergman, Richard N; Collins, Francis S; Mohlke, Karen L; Koistinen, Heikki; Tuomilehto, Jaakko; Hveem, Kristian; Njølstad, Inger; Deloukas, Panagiotis; Donnelly, Peter J; Frayling, Timothy M; Hattersley, Andrew T; de Faire, Ulf; Hamsten, Anders; Illig, Thomas; Peters, Annette; Cauchi, Stephane; Sladek, Rob; Froguel, Philippe; Hansen, Torben; Pedersen, Oluf; Morris, Andrew D; Palmer, Collin N A; Kathiresan, Sekar; Melander, Olle; Nilsson, Peter M; Groop, Leif C; Barroso, Inês; Langenberg, Claudia; Wareham, Nicholas J; O'Callaghan, Christopher A; Gloyn, Anna L; Altshuler, David; Boehnke, Michael; Teslovich, Tanya M; McCarthy, Mark I; Morris, Andrew P
2015-12-01
We performed fine mapping of 39 established type 2 diabetes (T2D) loci in 27,206 cases and 57,574 controls of European ancestry. We identified 49 distinct association signals at these loci, including five mapping in or near KCNQ1. 'Credible sets' of the variants most likely to drive each distinct signal mapped predominantly to noncoding sequence, implying that association with T2D is mediated through gene regulation. Credible set variants were enriched for overlap with FOXA2 chromatin immunoprecipitation binding sites in human islet and liver cells, including at MTNR1B, where fine mapping implicated rs10830963 as driving T2D association. We confirmed that the T2D risk allele for this SNP increases FOXA2-bound enhancer activity in islet- and liver-derived cells. We observed allele-specific differences in NEUROD1 binding in islet-derived cells, consistent with evidence that the T2D risk allele increases islet MTNR1B expression. Our study demonstrates how integration of genetic and genomic information can define molecular mechanisms through which variants underlying association signals exert their effects on disease.
The identification and functional annotation of RNA structures conserved in vertebrates
Seemann, Stefan E.; Mirza, Aashiq H.; Hansen, Claus; Bang-Berthelsen, Claus H.; Garde, Christian; Christensen-Dalsgaard, Mikkel; Torarinsson, Elfar; Yao, Zizhen; Workman, Christopher T.; Pociot, Flemming; Nielsen, Henrik; Tommerup, Niels; Ruzzo, Walter L.; Gorodkin, Jan
2017-01-01
Structured elements of RNA molecules are essential in, e.g., RNA stabilization, localization, and protein interaction, and their conservation across species suggests a common functional role. We computationally screened vertebrate genomes for conserved RNA structures (CRSs), leveraging structure-based, rather than sequence-based, alignments. After careful correction for sequence identity and GC content, we predict ∼516,000 human genomic regions containing CRSs. We find that a substantial fraction of human–mouse CRS regions (1) colocalize consistently with binding sites of the same RNA binding proteins (RBPs) or (2) are transcribed in corresponding tissues. Additionally, a CaptureSeq experiment revealed expression of many of our CRS regions in human fetal brain, including 662 novel ones. For selected human and mouse candidate pairs, qRT-PCR and in vitro RNA structure probing supported both shared expression and shared structure despite low abundance and low sequence identity. About 30,000 CRS regions are located near coding or long noncoding RNA genes or within enhancers. Structured (CRS overlapping) enhancer RNAs and extended 3′ ends have significantly increased expression levels over their nonstructured counterparts. Our findings of transcribed uncharacterized regulatory regions that contain CRSs support their RNA-mediated functionality. PMID:28487280
Barakat, Mohamed; Ortet, Philippe; Whitworth, David E
2013-04-20
Regulatory proteins (RPs) such as transcription factors (TFs) and two-component system (TCS) proteins control how prokaryotic cells respond to changes in their external and/or internal state. Identification and annotation of TFs and TCSs is non-trivial, and between-genome comparisons are often confounded by different standards in annotation. There is a need for user-friendly, fast and convenient tools to allow researchers to overcome the inherent variability in annotation between genome sequences. We have developed the web-server P2RP (Predicted Prokaryotic Regulatory Proteins), which enables users to identify and annotate TFs and TCS proteins within their sequences of interest. Users can input amino acid or genomic DNA sequences, and predicted proteins therein are scanned for the possession of DNA-binding domains and/or TCS domains. RPs identified in this manner are categorised into families, unambiguously annotated, and a detailed description of their features generated, using an integrated software pipeline. P2RP results can then be outputted in user-specified formats. Biologists have an increasing need for fast and intuitively usable tools, which is why P2RP has been developed as an interactive system. As well as assisting experimental biologists to interrogate novel sequence data, it is hoped that P2RP will be built into genome annotation pipelines and re-annotation processes, to increase the consistency of RP annotation in public genomic sequences. P2RP is the first publicly available tool for predicting and analysing RP proteins in users' sequences. The server is freely available and can be accessed along with documentation at http://www.p2rp.org.
Many human accelerated regions are developmental enhancers
Capra, John A.; Erwin, Genevieve D.; McKinsey, Gabriel; Rubenstein, John L. R.; Pollard, Katherine S.
2013-01-01
The genetic changes underlying the dramatic differences in form and function between humans and other primates are largely unknown, although it is clear that gene regulatory changes play an important role. To identify regulatory sequences with potentially human-specific functions, we and others used comparative genomics to find non-coding regions conserved across mammals that have acquired many sequence changes in humans since divergence from chimpanzees. These regions are good candidates for performing human-specific regulatory functions. Here, we analysed the DNA sequence, evolutionary history, histone modifications, chromatin state and transcription factor (TF) binding sites of a combined set of 2649 non-coding human accelerated regions (ncHARs) and predicted that at least 30% of them function as developmental enhancers. We prioritized the predicted ncHAR enhancers using analysis of TF binding site gain and loss, along with the functional annotations and expression patterns of nearby genes. We then tested both the human and chimpanzee sequence for 29 ncHARs in transgenic mice, and found 24 novel developmental enhancers active in both species, 17 of which had very consistent patterns of activity in specific embryonic tissues. Of these ncHAR enhancers, five drove expression patterns suggestive of different activity for the human and chimpanzee sequence at embryonic day 11.5. The changes to human non-coding DNA in these ncHAR enhancers may modify the complex patterns of gene expression necessary for proper development in a human-specific manner and are thus promising candidates for understanding the genetic basis of human-specific biology. PMID:24218637
Lumsden, Amanda L; Ma, Yuefang; Ashander, Liam M; Stempel, Andrew J; Keating, Damien J; Smith, Justine R; Appukuttan, Binoy
2018-05-09
Regulation of intercellular adhesion molecule (ICAM)-1 in retinal endothelial cells is a promising druggable target for retinal vascular diseases. The ICAM-1-related (ICR) long non-coding RNA stabilizes ICAM-1 transcript, increasing protein expression. However, studies of ICR involvement in disease have been limited as the promoter is uncharacterized. To address this issue, we undertook a comprehensive in silico analysis of the human ICR gene promoter region. We used genomic evolutionary rate profiling to identify a 115 base pair (bp) sequence within 500 bp upstream of the transcription start site of the annotated human ICR gene that was conserved across 25 eutherian genomes. A second constrained sequence upstream of the orthologous mouse gene (68 bp; conserved across 27 Eutherian genomes including human) was also discovered. Searching these elements identified 33 matrices predictive of binding sites for transcription factors known to be responsive to a broad range of pathological stimuli, including hypoxia, and metabolic and inflammatory proteins. Five phenotype-associated single nucleotide polymorphisms (SNPs) in the immediate vicinity of these elements included four SNPs (i.e. rs2569693, rs281439, rs281440 and rs11575074) predicted to impact binding motifs of transcription factors, and thus the expression of ICR and ICAM-1 genes, with potential to influence disease susceptibility. We verified that human retinal endothelial cells expressed ICR, and observed induction of expression by tumor necrosis factor-α.
Active site and laminarin binding in glycoside hydrolase family 55
Bianchetti, Christopher M.; Takasuka, Taichi E.; Deutsch, Sam; ...
2015-03-09
The Carbohydrate Active Enzyme (CAZy) database indicates that glycoside hydrolase family 55 (GH55) contains both endo- and exo-β-1,3-glucanases. The founding structure in the GH55 is PcLam55A from the white rot fungus Phanerochaete chrysosporium. Here, we present high resolution crystal structures of bacterial SacteLam55A from the highly cellulolytic Streptomyces sp. SirexAA-E with bound substrates and product. These structures, along with mutagenesis and kinetic studies, implicate Glu-502 as the catalytic acid (as proposed earlier for Glu-663 in PcLam55A) and a proton relay network of four residues in activating water as the nucleophile. Further, a set of conserved aromatic residues that define the active site apparently enforce an exo-glucanase reactivity as demonstrated by exhaustive hydrolysis reactions with purified laminarioligosaccharides. Two additional aromatic residues that line the substrate-binding channel show substrate-dependent conformational flexibility that may promote processive reactivity of the bound oligosaccharide in the bacterial enzymes. Gene synthesis carried out on ~30% of the GH55 family gave 34 active enzymes (19% functional coverage of the nonredundant members of GH55). These active enzymes reacted with only laminarin from a panel of 10 different soluble and insoluble polysaccharides and displayed a broad range of specific activities and optima for pH and temperature. Furthermore, application of this experimental method provides a new, systematic way to annotate glycoside hydrolase phylogenetic space for functional properties.
CRAVAT is an easy to use web-based tool for analysis of cancer variants (missense, nonsense, in-frame indel, frameshift indel, splice site). CRAVAT provides scores and a variety of annotations that assist in identification of important variants. Results are provided in an interactive, highly graphical webpage and include annotated 3D structure visualization. CRAVAT is also available for local or cloud-based installation as a Docker container. MuPIT provides 3D visualization of mutation clusters and functional annotation and is now integrated with CRAVAT.
Vadigepalli, Rajanikanth; Chakravarthula, Praveen; Zak, Daniel E; Schwaber, James S; Gonye, Gregory E
2003-01-01
We have developed a bioinformatics tool named PAINT that automates the promoter analysis of a given set of genes for the presence of transcription factor binding sites. Based on coincidence of regulatory sites, this tool produces an interaction matrix that represents a candidate transcriptional regulatory network. This tool currently consists of (1) a database of promoter sequences of known or predicted genes in the Ensembl annotated mouse genome database, (2) various modules that can retrieve and process the promoter sequences for binding sites of known transcription factors, and (3) modules for visualization and analysis of the resulting set of candidate network connections. This information provides a substantially pruned list of genes and transcription factors that can be examined in detail in further experimental studies on gene regulation. Also, the candidate network can be incorporated into network identification methods in the form of constraints on feasible structures in order to render the algorithms tractable for large-scale systems. The tool can also produce output in various formats suitable for use in external visualization and analysis software. In this manuscript, PAINT is demonstrated in two case studies involving analysis of differentially regulated genes chosen from two microarray data sets. The first set is from a neuroblastoma N1E-115 cell differentiation experiment, and the second set is from neuroblastoma N1E-115 cells at different time intervals following exposure to neuropeptide angiotensin II. PAINT is available for use as an agent in BioSPICE simulation and analysis framework (www.biospice.org), and can also be accessed via a WWW interface at www.dbi.tju.edu/dbi/tools/paint/.
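To make the idea of an interaction matrix built from the coincidence of regulatory sites more concrete, here is a minimal sketch. It is not PAINT's implementation: the function name, the input layout (a mapping from gene to the set of transcription factors with a predicted site in its promoter), and the gene and factor labels are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def build_interaction_matrix(
    sites_by_gene: Dict[str, Set[str]]
) -> Tuple[List[str], List[str], List[List[int]]]:
    """Return (genes, factors, matrix).

    matrix[i][j] is 1 when factor j has a predicted binding site in the
    promoter of gene i, and 0 otherwise. Rows and columns are sorted by name.
    """
    genes = sorted(sites_by_gene)
    factors = sorted({tf for hits in sites_by_gene.values() for tf in hits})
    matrix = [
        [1 if tf in sites_by_gene[gene] else 0 for tf in factors]
        for gene in genes
    ]
    return genes, factors, matrix


# Hypothetical input: predicted sites per promoter.
example_sites = {
    "geneA": {"TF1", "TF2"},
    "geneB": {"TF2"},
    "geneC": set(),
}

genes, factors, matrix = build_interaction_matrix(example_sites)
for gene, row in zip(genes, matrix):
    print(gene, row)
```

Genes whose rows share a 1 in the same column are candidates for co-regulation by that factor, which is the sense in which such a matrix can be read as a candidate regulatory network.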
De novo-based transcriptome profiling of male-sterile and fertile watermelon lines
Seo, Minseok; Jang, Yoon Jeong; Sim, Tae Yong; Cho, Seoae; Han, Sang-Wook
2017-01-01
The whole-genome sequence of watermelon (Citrullus lanatus (Thunb.) Matsum. & Nakai), a valuable horticultural crop worldwide, was released in 2013. Here, we compared a de novo-based approach (DBA) to a reference-based approach (RBA) using RNA-seq data, to aid in efforts to improve the annotation of the watermelon reference genome and to obtain biological insight into male-sterility in watermelon. We applied these techniques to available data from two watermelon lines: the male-sterile line DAH3615-MS and the male-fertile line DAH3615. Using DBA, we newly annotated 855 watermelon transcripts, and found gene functional clusters predicted to be related to stimulus responses, nucleic acid binding, transmembrane transport, homeostasis, and Golgi/vesicles. Among the DBA-annotated transcripts, 138 de novo-exclusive differentially-expressed genes (DEDEGs) related to male sterility were detected. Out of 33 randomly selected newly annotated transcripts and DEDEGs, 32 were validated by RT-qPCR. This study demonstrates the usefulness and reliability of the de novo transcriptome assembly in watermelon, and provides new insights for researchers exploring transcriptional blueprints with regard to the male sterility. PMID:29095876
ERIC Educational Resources Information Center
Milevski, Robert J.
1995-01-01
This book repair manual developed for the Illinois Cooperative Conservation Program includes book structure and book problems, book repair procedures for 4 specific problems, a description of adhesive bindings, a glossary, an annotated list of 11 additional readings, book repair supplies and suppliers, and specifications for book repair kits. (LRW)
Vařeková, Radka Svobodová; Jaiswal, Deepti; Sehnal, David; Ionescu, Crina-Maria; Geidl, Stanislav; Pravda, Lukáš; Horský, Vladimír; Wimmerová, Michaela; Koča, Jaroslav
2014-07-01
Structure validation has become a major issue in the structural biology community, and an essential step is checking the ligand structure. This paper introduces MotiveValidator, a web-based application for the validation of ligands and residues in PDB or PDBx/mmCIF format files provided by the user. Specifically, MotiveValidator is able to evaluate in a straightforward manner whether the ligand or residue being studied has a correct annotation (3-letter code), i.e. if it has the same topology and stereochemistry as the model ligand or residue with this annotation. If not, MotiveValidator explicitly describes the differences. MotiveValidator offers a user-friendly, interactive and platform-independent environment for validating structures obtained by any type of experiment. The results of the validation are presented in both tabular and graphical form, facilitating their interpretation. MotiveValidator can process thousands of ligands or residues in a single validation run that takes no more than a few minutes. MotiveValidator can be used for testing single structures, or the analysis of large sets of ligands or fragments prepared for binding site analysis, docking or virtual screening. MotiveValidator is freely available via the Internet at http://ncbr.muni.cz/MotiveValidator. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Du, Yushen; Wu, Nicholas C.; Jiang, Lin; Zhang, Tianhao; Gong, Danyang; Shu, Sara; Wu, Ting-Ting
2016-01-01
Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. PMID:27803181
Lozano, Roberto; Ponce, Olga; Ramirez, Manuel; Mostajo, Nelly; Orjeda, Gisella
2012-01-01
The majority of disease resistance (R) genes identified to date in plants encode a nucleotide-binding site (NBS) and leucine-rich repeat (LRR) domain containing protein. Additional domains such as coiled-coil (CC) and TOLL/interleukin-1 receptor (TIR) domains can also be present. In the recently sequenced Solanum tuberosum group phureja genome we used HMM models and manual curation to annotate 435 NBS-encoding R gene homologs and 142 NBS-derived genes that lack the NBS domain. Highly similar homologs for most previously documented Solanaceae R genes were identified. A surprising ∼41% (179) of the 435 NBS-encoding genes are pseudogenes primarily caused by premature stop codons or frameshift mutations. Alignment of 81.80% of the 577 homologs to S. tuberosum group phureja pseudomolecules revealed non-random distribution of the R-genes; 362 of 470 genes were found in high density clusters on 11 chromosomes. PMID:22493716
Identification of genetic elements in metabolism by high-throughput mouse phenotyping.
Rozman, Jan; Rathkolb, Birgit; Oestereicher, Manuela A; Schütt, Christine; Ravindranath, Aakash Chavan; Leuchtenberger, Stefanie; Sharma, Sapna; Kistler, Martin; Willershäuser, Monja; Brommage, Robert; Meehan, Terrence F; Mason, Jeremy; Haselimashhadi, Hamed; Hough, Tertius; Mallon, Ann-Marie; Wells, Sara; Santos, Luis; Lelliott, Christopher J; White, Jacqueline K; Sorg, Tania; Champy, Marie-France; Bower, Lynette R; Reynolds, Corey L; Flenniken, Ann M; Murray, Stephen A; Nutter, Lauryl M J; Svenson, Karen L; West, David; Tocchini-Valentini, Glauco P; Beaudet, Arthur L; Bosch, Fatima; Braun, Robert B; Dobbie, Michael S; Gao, Xiang; Herault, Yann; Moshiri, Ala; Moore, Bret A; Kent Lloyd, K C; McKerlie, Colin; Masuya, Hiroshi; Tanaka, Nobuhiko; Flicek, Paul; Parkinson, Helen E; Sedlacek, Radislav; Seong, Je Kyung; Wang, Chi-Kuang Leo; Moore, Mark; Brown, Steve D; Tschöp, Matthias H; Wurst, Wolfgang; Klingenspor, Martin; Wolf, Eckhard; Beckers, Johannes; Machicao, Fausto; Peter, Andreas; Staiger, Harald; Häring, Hans-Ulrich; Grallert, Harald; Campillos, Monica; Maier, Holger; Fuchs, Helmut; Gailus-Durner, Valerie; Werner, Thomas; Hrabe de Angelis, Martin
2018-01-18
Metabolic diseases are a worldwide problem but the underlying genetic factors and their relevance to metabolic disease remain incompletely understood. Genome-wide research is needed to characterize so-far unannotated mammalian metabolic genes. Here, we generate and analyze metabolic phenotypic data of 2016 knockout mouse strains under the aegis of the International Mouse Phenotyping Consortium (IMPC) and find 974 gene knockouts with strong metabolic phenotypes. 429 of those had no previous link to metabolism and 51 genes remain functionally completely unannotated. We compared human orthologues of these uncharacterized genes in five GWAS consortia and indeed 23 candidate genes are associated with metabolic disease. We further identify common regulatory elements in promoters of candidate genes. As each regulatory element is composed of several transcription factor binding sites, our data reveal an extensive metabolic phenotype-associated network of co-regulated genes. Our systematic mouse phenotype analysis thus paves the way for full functional annotation of the genome.
Epigenetics, chromatin and genome organization: recent advances from the ENCODE project.
Siggens, L; Ekwall, K
2014-09-01
The organization of the genome into functional units, such as enhancers and active or repressed promoters, is associated with distinct patterns of DNA and histone modifications. The Encyclopedia of DNA Elements (ENCODE) project has advanced our understanding of the principles of genome, epigenome and chromatin organization, identifying hundreds of thousands of potential regulatory regions and transcription factor binding sites. Part of the ENCODE consortium, GENCODE, has annotated the human genome with novel transcripts including new noncoding RNAs and pseudogenes, highlighting transcriptional complexity. Many disease variants identified in genome-wide association studies are located within putative enhancer regions defined by the ENCODE project. Understanding the principles of chromatin and epigenome organization will help to identify new disease mechanisms, biomarkers and drug targets, particularly as ongoing epigenome mapping projects generate data for primary human cell types that play important roles in disease. © 2014 The Association for the Publication of the Journal of Internal Medicine.
Ensembl 2002: accommodating comparative genomics.
Clamp, M; Andrews, D; Barker, D; Bevan, P; Cameron, G; Chen, Y; Clark, L; Cox, T; Cuff, J; Curwen, V; Down, T; Durbin, R; Eyras, E; Gilbert, J; Hammond, M; Hubbard, T; Kasprzyk, A; Keefe, D; Lehvaslaiho, H; Iyer, V; Melsopp, C; Mongin, E; Pettett, R; Potter, S; Rust, A; Schmidt, E; Searle, S; Slater, G; Smith, J; Spooner, W; Stabenau, A; Stalker, J; Stupka, E; Ureta-Vidal, A; Vastrik, I; Birney, E
2003-01-01
The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of human, mouse and other genome sequences, available as either an interactive web site or as flat files. Ensembl also integrates manually annotated gene structures from external sources where available. As well as being one of the leading sources of genome annotation, Ensembl is an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements. These range from sequence analysis to data storage and visualisation, and installations exist around the world both in companies and at academic sites. With both human and mouse genome sequences available and more vertebrate sequences to follow, many of the recent developments in Ensembl have focused on developing automatic comparative genome analysis and visualisation.
Narancic, Tanja; Scollica, Elisa; Cagney, Gerard; O'Connor, Kevin E
2018-04-01
Polyhydroxybutyrate (PHB), a biodegradable polymer accumulated by bacteria, is deposited intracellularly in the form of inclusion bodies often called granules. The granules are supramolecular complexes harbouring a varied number of proteins on their surface, which have specific but incompletely characterised functions. By comparison with other organisms that produce biodegradable polymers, only two phasins have been described to date for Rhodospirillum rubrum, raising the possibility that more await discovery. Using a comparative proteomics strategy to compare the granules of wild-type R. rubrum with a PHB-negative mutant housing artificial PHB granules, we identified four potential PHB granule-associated proteins. These were: Q2RSI4, an uncharacterised protein; Q2RWU9, annotated as an extracellular solute-binding protein; Q2RQL4, annotated as basic membrane lipoprotein; and Q2RQ51, annotated as glucose-6-phosphate isomerase. In silico analysis revealed that Q2RSI4 harbours a Phasin_2 family domain and shares low identity with a single-strand DNA-binding protein from Sphaerochaeta coccoides. Fluorescence microscopy found that three proteins, Q2RSI4, Q2RWU9 and Q2RQL4, co-localised with PHB granules. This work adds three potential new granule-associated proteins to the repertoire of factors involved in bacterial storage granule formation, and confirms that proteomics screens are an effective strategy for discovery of novel granule-associated proteins.
Vetting, Matthew W.; Al-Obaidi, Nawar; Zhao, Suwen; ...
2014-12-25
The rate at which genome sequencing data is accruing demands enhanced methods for functional annotation and metabolism discovery. Solute binding proteins (SBPs) facilitate the transport of the first reactant in a metabolic pathway, thereby constraining the regions of chemical space and the chemistries that must be considered for pathway reconstruction. In this paper, we describe high-throughput protein production and differential scanning fluorimetry platforms, which enabled the screening of 158 SBPs against a 189-component library specifically tailored for this class of proteins. Like all screening efforts, this approach is limited by the practical constraints imposed by construction of the library, i.e., we can study only those metabolites that are known to exist and which can be made in sufficient quantities for experimentation. To move beyond these inherent limitations, we illustrate the promise of crystallographic- and mass spectrometric-based approaches for the unbiased use of entire metabolomes as screening libraries. Together, our approaches identified 40 new SBP ligands, generated experiment-based annotations for 2084 SBPs in 71 isofunctional clusters, and defined numerous metabolic pathways, including novel catabolic pathways for the utilization of ethanolamine as sole nitrogen source and the use of D-Ala-D-Ala as sole carbon source. These efforts begin to define an integrated strategy for realizing the full value of amassing genome sequence data.
Creating reference gene annotation for the mouse C57BL6/J genome assembly.
Mudge, Jonathan M; Harrow, Jennifer
2015-10-01
Annotation of the reference genome of the C57BL6/J mouse has been an ongoing project ever since the draft genome was first published. Initially, the principal focus was on the identification of all protein-coding genes, although today the importance of describing long non-coding RNAs, small RNAs, and pseudogenes is recognized. Here, we describe the progress of the GENCODE mouse annotation project, which combines manual annotation from the HAVANA group with Ensembl computational annotation, alongside experimental and in silico validation pipelines from other members of the consortium. We discuss the more recent incorporation of next-generation sequencing datasets into this workflow, including the usage of mass-spectrometry data to potentially identify novel protein-coding genes. Finally, we will outline how the C57BL6/J genebuild can be used to gain insights into the variant sites that distinguish different mouse strains and species.
Functional analysis of iodotyrosine deiodinase from Drosophila melanogaster
Phatarphekar, Abhishek
2016-01-01
The flavoprotein iodotyrosine deiodinase (IYD) was first discovered in mammals through its ability to salvage iodide from mono‐ and diiodotyrosine, the by‐products of thyroid hormone synthesis. Genomic information indicates that invertebrates contain homologous enzymes although their iodide requirements are unknown. The catalytic domain of IYD from Drosophila melanogaster has now been cloned, expressed and characterized to determine the scope of its potential catalytic function as a model for organisms that are not associated with thyroid hormone production. Little discrimination between iodo‐, bromo‐, and chlorotyrosine was detected. Their affinity for IYD ranges from 0.46 to 0.62 μM (Kd) and their efficiency of dehalogenation ranges from 2.4 to 9 × 10³ M⁻¹ s⁻¹ (kcat/Km). These values fall within the variations described for IYDs from other organisms for which a physiological function has been confirmed. The relative contribution of three active site residues that coordinate to the amino acid substrates was subsequently determined by mutagenesis of IYD from Drosophila to refine future annotations of genomic and meta‐genomic data for dehalogenation of halotyrosines. Substitution of the active site glutamate to glutamine was most detrimental to catalysis. Alternative substitution of an active site lysine to glutamine affected substrate affinity to the greatest extent but only moderately affected catalytic turnover. Substitution of phenylalanine for an active site tyrosine was least perturbing for binding and catalysis. PMID:27643701
Jeyapalan, Jennie N; Doctor, Gabriel T; Jones, Tania A; Alberman, Samuel N; Tep, Alexander; Haria, Chirag M; Schwalbe, Edward C; Morley, Isabel C F; Hill, Alfred A; LeCain, Magdalena; Ottaviani, Diego; Clifford, Steven C; Qaddoumi, Ibrahim; Tatevossian, Ruth G; Ellison, David W; Sheer, Denise
2016-05-27
Low-grade gliomas (LGGs) account for about a third of all brain tumours in children. We conducted a detailed study of DNA methylation and gene expression to improve our understanding of the biology of pilocytic and diffuse astrocytomas. Pilocytic astrocytomas were found to have a distinctive signature at 315 CpG sites, of which 312 were hypomethylated and 3 were hypermethylated. Genomic analysis revealed that 182 of these sites are within annotated enhancers. The signature was not present in diffuse astrocytomas, or in published profiles of other brain tumours and normal brain tissue. The AP-1 transcription factor was predicted to bind within 200 bp of a subset of the 315 differentially methylated CpG sites; the AP-1 factors, FOS and FOSL1 were found to be up-regulated in pilocytic astrocytomas. We also analysed splice variants of the AP-1 target gene, CCND1, which encodes cell cycle regulator cyclin D1. CCND1a was found to be highly expressed in both pilocytic and diffuse astrocytomas, but diffuse astrocytomas have far higher expression of the oncogenic variant, CCND1b. These findings highlight novel genetic and epigenetic differences between pilocytic and diffuse astrocytoma, in addition to well-described alterations involving BRAF, MYB and FGFR1.
GONUTS: the Gene Ontology Normal Usage Tracking System
Renfro, Daniel P.; McIntosh, Brenley K.; Venkatraman, Anand; Siegele, Deborah A.; Hu, James C.
2012-01-01
The Gene Ontology Normal Usage Tracking System (GONUTS) is a community-based browser and usage guide for Gene Ontology (GO) terms and a community system for general GO annotation of proteins. GONUTS uses wiki technology to allow registered users to share and edit notes on the use of each term in GO, and to contribute annotations for specific genes of interest. By providing a site for generation of third-party documentation at the granularity of individual terms, GONUTS complements the official documentation of the Gene Ontology Consortium. To provide examples for community users, GONUTS displays the complete GO annotations from seven model organisms: Saccharomyces cerevisiae, Dictyostelium discoideum, Caenorhabditis elegans, Drosophila melanogaster, Danio rerio, Mus musculus and Arabidopsis thaliana. To support community annotation, GONUTS allows automated creation of gene pages for gene products in UniProt. GONUTS will improve the consistency of annotation efforts across genome projects, and should be useful in training new annotators and consumers in the production of GO annotations and the use of GO terms. GONUTS can be accessed at http://gowiki.tamu.edu. The source code for generating the content of GONUTS is available upon request. PMID:22110029
ERIC Educational Resources Information Center
Web Feet K-8, 2001
2001-01-01
This annotated subject guide to Web sites and additional resources focuses on biomes. Specifies age levels for resources that include Web sites, CD-ROMs and software, videos, books, audios, and magazines; includes professional resources; and presents a relevant class activity. (LRW)
2017-01-01
Peptide binding to MHC class I molecules is the single most selective step in antigen presentation and the strongest single correlate to peptide cellular immunogenicity. The cost of experimentally characterizing the rules of peptide presentation for a given MHC-I molecule is extensive, and predictors of peptide–MHC interactions constitute an attractive alternative. Recently, an increasing amount of MHC presented peptides identified by mass spectrometry (MS ligands) has been published. Handling and interpretation of MS ligand data is, in general, challenging due to the polyspecificity nature of the data. We here outline a general pipeline for dealing with this challenge and accurately annotate ligands to the relevant MHC-I molecule they were eluted from by use of GibbsClustering and binding motif information inferred from in silico models. We illustrate the approach here in the context of MHC-I molecules (BoLA) of cattle. Next, we demonstrate how such annotated BoLA MS ligand data can readily be integrated with in vitro binding affinity data in a prediction model with very high and unprecedented performance for identification of BoLA-I restricted T-cell epitopes. The prediction model is freely available at http://www.cbs.dtu.dk/services/NetMHCpan/NetBoLApan. The approach has here been applied to the BoLA-I system, but the pipeline is readily applicable to MHC systems in other species. PMID:29115832
Nielsen, Morten; Connelley, Tim; Ternette, Nicola
2018-01-05
Peptide binding to MHC class I molecules is the single most selective step in antigen presentation and the strongest single correlate to peptide cellular immunogenicity. The cost of experimentally characterizing the rules of peptide presentation for a given MHC-I molecule is extensive, and predictors of peptide-MHC interactions constitute an attractive alternative. Recently, an increasing amount of MHC presented peptides identified by mass spectrometry (MS ligands) has been published. Handling and interpretation of MS ligand data is, in general, challenging due to the polyspecificity nature of the data. We here outline a general pipeline for dealing with this challenge and accurately annotate ligands to the relevant MHC-I molecule they were eluted from by use of GibbsClustering and binding motif information inferred from in silico models. We illustrate the approach here in the context of MHC-I molecules (BoLA) of cattle. Next, we demonstrate how such annotated BoLA MS ligand data can readily be integrated with in vitro binding affinity data in a prediction model with very high and unprecedented performance for identification of BoLA-I restricted T-cell epitopes. The prediction model is freely available at http://www.cbs.dtu.dk/services/NetMHCpan/NetBoLApan . The approach has here been applied to the BoLA-I system, but the pipeline is readily applicable to MHC systems in other species.
Using the Web To Explore the Great Depression.
ERIC Educational Resources Information Center
Chamberlin, Paul
2001-01-01
Presents an annotated list of Web sites that focus on the Great Depression. Includes the American Experience, American Memory, the National Archives and Records Administration, and the New Deal Network Web sites. Offers additional sites covering topics such as the Jersey homesteads and labor history. (CMK)
Gamma-aminobutyric acid-modulated benzodiazepine binding sites in bacteria
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lummis, S.C.R.; Johnston, G.A.R.; Nicoletti, G.
1991-01-01
Benzodiazepine binding sites, which were once considered to exist only in higher vertebrates, are here demonstrated in the bacterium E. coli. The bacterial [³H]diazepam binding sites are modulated by GABA; the modulation is dose dependent and is reduced at high concentrations. The most potent competitors of E. coli [³H]diazepam binding are those that are active in displacing [³H]benzodiazepines from vertebrate peripheral benzodiazepine binding sites. These vertebrate sites are not modulated by GABA, in contrast to vertebrate neuronal benzodiazepine binding sites. The E. coli benzodiazepine binding sites therefore differ from both classes of vertebrate benzodiazepine binding sites; however, the ligand spectrum and GABA-modulatory properties of the E. coli sites are similar to those found in insects. This intermediate type of receptor in lower species suggests that a precursor for at least one class of vertebrate benzodiazepine binding sites may have existed.
NASA Astrophysics Data System (ADS)
Lengyel, Iván M.; Morelli, Luis G.
2017-04-01
Cells may control fluctuations in protein levels by means of negative autoregulation, where transcription factors bind DNA sites to repress their own production. Theoretical studies have assumed a single binding site for the repressor, while in most species it is found that multiple binding sites are arranged in clusters. We study a stochastic description of negative autoregulation with multiple binding sites for the repressor. We find that increasing the number of binding sites induces regular bursting of gene products. By tuning the threshold for repression, we show that multiple binding sites can also suppress fluctuations. Our results highlight possible roles for the presence of multiple binding sites of negative autoregulators.
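As a rough illustration of the kind of stochastic description summarised in the abstract above, the following is a minimal Gillespie-style sketch of negative autoregulation with several repressor binding sites. It is not the authors' model: the function name `simulate`, all rate constants, the rule that production stops once `threshold` sites are occupied, and the simplification that binding does not remove protein from the free pool are assumptions made here purely for illustration.

```python
import random


def simulate(n_sites=3, threshold=2, beta=20.0, gamma=1.0,
             k_on=0.05, k_off=1.0, t_end=50.0, seed=0):
    """Gillespie simulation of a self-repressing gene (illustrative only).

    State: free protein count and number of occupied binding sites.
    All parameters are hypothetical; production is switched off when
    at least `threshold` of the `n_sites` sites are occupied.
    """
    rng = random.Random(seed)
    t, protein, bound = 0.0, 0, 0
    trajectory = [(t, protein, bound)]
    while True:
        rates = [
            beta if bound < threshold else 0.0,   # production
            gamma * protein,                      # degradation
            k_on * protein * (n_sites - bound),   # repressor binding
            k_off * bound,                        # repressor unbinding
        ]
        total = sum(rates)
        if total <= 0.0:
            break
        t += rng.expovariate(total)               # waiting time to next event
        if t > t_end:
            break
        # Pick a reaction with probability proportional to its rate.
        r = rng.random() * total
        acc, chosen = 0.0, None
        for i, rate in enumerate(rates):
            acc += rate
            if rate > 0.0 and r < acc:
                chosen = i
                break
        if chosen is None:                        # guard against rounding
            chosen = max(i for i, rate in enumerate(rates) if rate > 0.0)
        if chosen == 0:
            protein += 1
        elif chosen == 1:
            protein -= 1
        elif chosen == 2:
            bound += 1
        else:
            bound -= 1
        trajectory.append((t, protein, bound))
    return trajectory
```

Calling `simulate(n_sites=1, threshold=1)` and `simulate(n_sites=5, threshold=3)` and comparing the protein traces is one way to explore, under these assumed rates, how the number of sites and the repression threshold shape the fluctuations.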
ERIC Educational Resources Information Center
Web Feet K-8, 2001
2001-01-01
This annotated subject guide to Web sites and additional resources focuses on mythology. Specific age levels are given for resources that include Web sites, CD-ROMs and software, videos, books, audios, and magazines; offers professional resources; and presents a relevant class activity. (LRW)
ERIC Educational Resources Information Center
Web Feet K-8, 2001
2001-01-01
This annotated subject guide to Web sites and additional resources focuses on space and astronomy. Specifies age levels for resources that include Web sites, CD-ROMS and software, videos, books, audios, and magazines; offers professional resources; and presents a relevant class activity. (LRW)
NegGOA: negative GO annotations selection using ontology structure.
Fu, Guangyuan; Wang, Jun; Yang, Bo; Yu, Guoxian
2016-10-01
Predicting the biological functions of proteins is one of the key challenges in the post-genomic era. Computational models have demonstrated the utility of applying machine learning methods to predict protein function. Most prediction methods explicitly require a set of negative examples, i.e., proteins that are known not to carry out a particular function. However, Gene Ontology (GO) almost always only provides the knowledge that proteins carry out a particular function, and functional annotations of proteins are incomplete. GO structurally organizes tens of thousands of GO terms, and a protein is annotated with several (or dozens) of these terms. For these reasons, the negative examples of a protein can greatly help to distinguish true positive examples of the protein from such a large candidate GO space. In this paper, we present a novel approach (called NegGOA) to select negative examples. Specifically, NegGOA takes advantage of the ontology structure, available annotations and potentiality of additional annotations of a protein to choose negative examples of the protein. We compare NegGOA with other negative example selection algorithms and find that NegGOA produces far fewer false negatives. We incorporate the selected negative examples into an efficient function prediction model to predict the functions of proteins in Yeast, Human, Mouse and Fly. NegGOA also demonstrates higher accuracy than the compared algorithms across various evaluation metrics. In addition, NegGOA suffers less from incomplete protein annotations than the compared methods. The Matlab and R code is available at https://sites.google.com/site/guoxian85/neggoa. Contact: gxyu@swu.edu.cn. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Genome-wide mapping and analysis of active promoters in mouse embryonic stem cells and adult organs
Barrera, Leah O.; Li, Zirong; Smith, Andrew D.; Arden, Karen C.; Cavenee, Webster K.; Zhang, Michael Q.; Green, Roland D.; Ren, Bing
2008-01-01
By integrating genome-wide maps of RNA polymerase II (Polr2a) binding with gene expression data and H3ac and H3K4me3 profiles, we characterized promoters with enriched activity in mouse embryonic stem cells (mES) as well as adult brain, heart, kidney, and liver. We identified ∼24,000 promoters across these samples, including 16,976 annotated mRNA 5′ ends and 5153 additional sites validating cap-analysis of gene expression (CAGE) 5′ end data. We showed that promoters with CpG islands are typically non-tissue specific, with the majority associated with Polr2a and the active chromatin modifications in nearly all the tissues examined. By contrast, the promoters without CpG islands are generally associated with Polr2a and the active chromatin marks in a tissue-dependent way. We defined 4396 tissue-specific promoters by adapting a quantitative index of tissue-specificity based on Polr2a occupancy. While there is a general correspondence between Polr2a occupancy and active chromatin modifications at the tissue-specific promoters, a subset of them appear to be persistently marked by active chromatin modifications in the absence of detectable Polr2a binding, highlighting the complexity of the functional relationship between chromatin modification and gene expression. Our results provide a resource for exploring promoter Polr2a binding and epigenetic states across pluripotent and differentiated cell types in mammals. PMID:18042645
ERIC Educational Resources Information Center
Huang, T. K.
2018-01-01
The study makes use of the photo-hosting site, namely Flickr, for students to upload screenshots to demonstrate computer software problems and troubleshooting software. By creating non-text stickers and text-based annotations above the screenshots, students are able to help one another to diagnose and solve problems with greater certainty. In…
Shakespeare Goes Online: Web Resources for Teaching Shakespeare.
ERIC Educational Resources Information Center
Schuetz, Carol L.
This annotated bibliography contains five sections and 62 items. The first section lists general resources including six Web site addresses; the second section, on Shakespeare's works, contains five Web site addresses; the third section, on Shakespeare and the Globe Theatre, provides five Web site addresses; the fourth section presents classroom…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Beatley, J C
1965-04-01
A checklist of vascular plants of the Nevada Test Site is presented for use in studies of plant ecology. Data on the occurrence and distribution of plant species are included. Collections were made from both undisturbed and disturbed sites.
Lebow, Mahria
2014-04-01
The Arctic Health web site is a portal to Arctic-specific, health related content. The site provides expertly organized and annotated resources pertinent to northern peoples and places, including health information, research publications and environmental information. This site also features the Arctic Health Publications Database, which indexes an array of Arctic-related resources.
Cultivating Critical Mindsets in the Digital Information Age: Teaching Meaningful Web Evaluation
ERIC Educational Resources Information Center
Johnson, Angela Kwasnik
2017-01-01
This dissertation examines the use of dialogic discussion to improve young adolescents' ability to critically evaluate web sites. An intervention unit comprised three iterations of an instructional cycle in which students independently annotated web sites about controversial issues and discussed the reliability of those sites in dialogic…
An Annotated Bibliography for the Development and Operation of Historic Sites.
ERIC Educational Resources Information Center
American Association of Museums, Washington, DC.
Over 340 books, articles, manuals, newsletters, and other publications concerning the development and operation of historic sites are listed. Most cited materials were published since 1972 and are arranged under four major categories: site development and planning, documentation and preservation of structures and objects, interpretation of…
Countries: General, Electricity, Geography, Health, Literature: Children's, Plants.
ERIC Educational Resources Information Center
Web Feet, 2002
2002-01-01
Presents an annotated list of Web site educational resources kindergarten through eighth grade. The Web sites this month cover the following subjects: countries (general); electricity; geography; health; children's literature; and plants. Includes a list of "Calendar Connections" to Web site sources of information on Earth Day in April…
Wolf, Timo; Schneiker-Bekel, Susanne; Neshat, Armin; Ortseifen, Vera; Wibberg, Daniel; Zemke, Till; Pühler, Alfred; Kalinowski, Jörn
2017-06-10
Actinoplanes sp. SE50/110 is the natural producer of acarbose, which is used in the treatment of diabetes mellitus type II. However, until now the transcriptional organization and regulation of the acarbose biosynthesis are only understood rudimentarily. The genome sequence of Actinoplanes sp. SE50/110 was known before, but was resequenced in this study to remove assembly artifacts and incorrect base callings. The annotation of the genome was refined in a multi-step approach, including modern bioinformatic pipelines, transcriptome and proteome data. A whole transcriptome RNA-seq library as well as an RNA-seq library enriched for primary 5'-ends were used for the detection of transcription start sites, to correct tRNA predictions, to identify novel transcripts like small RNAs and to improve the annotation through the correction of falsely annotated translation start sites. The transcriptome data sets were also applied to identify 31 cis-regulatory RNA structures, such as riboswitches or RNA thermometers as well as three leaderless transcribed short peptides found in putative attenuators upstream of genes for amino acid biosynthesis. The transcriptional organization of the acarbose biosynthetic gene cluster was elucidated in detail and fourteen novel biosynthetic gene clusters were suggested. The accurate genome sequence and precise annotation of the Actinoplanes sp. SE50/110 genome will be the foundation for future genetic engineering and systems biology studies. Copyright © 2017 Elsevier B.V. All rights reserved.
Annotating Cancer Variants and Anti-Cancer Therapeutics in Reactome
Milacic, Marija; Haw, Robin; Rothfels, Karen; Wu, Guanming; Croft, David; Hermjakob, Henning; D’Eustachio, Peter; Stein, Lincoln
2012-01-01
Reactome describes biological pathways as chemical reactions that closely mirror the actual physical interactions that occur in the cell. Recent extensions of our data model accommodate the annotation of cancer and other disease processes. First, we have extended our class of protein modifications to accommodate annotation of changes in amino acid sequence and the formation of fusion proteins to describe the proteins involved in disease processes. Second, we have added a disease attribute to reaction, pathway, and physical entity classes that uses disease ontology terms. To support the graphical representation of “cancer” pathways, we have adapted our Pathway Browser to display disease variants and events in a way that allows comparison with the wild type pathway, and shows connections between perturbations in cancer and other biological pathways. The curation of pathways associated with cancer, coupled with our efforts to create other disease-specific pathways, will interoperate with our existing pathway and network analysis tools. Using the Epidermal Growth Factor Receptor (EGFR) signaling pathway as an example, we show how Reactome annotates and presents the altered biological behavior of EGFR variants due to their altered kinase and ligand-binding properties, and the mode of action and specificity of anti-cancer therapeutics. PMID:24213504
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sloan, J.W.
1984-01-01
These studies show that nicotine binds to the rat brain P₂ preparation by saturable and reversible processes. Multiple binding sites were revealed by the configuration of saturation, kinetic and Scatchard plots. A least squares best fit of Scatchard data using nonlinear curve fitting programs confirmed the presence of a very high affinity site, an up-regulatory site, a high affinity site and one or two low affinity sites. Stereospecificity was demonstrated for the up-regulatory site where (+)-nicotine was more effective and for the high affinity site where (-)-nicotine had a higher affinity. Drugs which selectively up-regulate nicotine binding site(s) have been identified. Further, separate very high and high affinity sites were identified for (-)- and (+)-[³H]nicotine, based on evidence that the site density for the (-)-isomer is 10 times greater than that for the (+)-isomer at these sites. Enhanced nicotine binding has been shown to be a statistically significant phenomenon which appears to be a consequence of drugs binding to specific site(s) which up-regulate binding at other site(s). Although Scatchard and Hill plots indicate positive cooperativity, up-regulation more adequately describes the function of these site(s). A separate up-regulatory site is suggested by the following: (1) Drugs vary markedly in their ability to up-regulate binding. (2) Both the affinity and the degree of up-regulation can be altered by structural changes in ligands. (3) Drugs with specificity for up-regulation have been identified. (4) Some drugs enhance binding in a dose-related manner. (5) Competition studies employing cold (-)- and (+)-nicotine against (-)- and (+)-[³H]nicotine show that the isomers bind to separate sites which up-regulate binding at the (-)- and (+)-nicotine high affinity sites and in this regard (+)-nicotine is more specific and efficacious than (-)-nicotine.
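For readers unfamiliar with the Scatchard analysis mentioned in the abstract above, the standard textbook relations are sketched here. The symbols (B for bound ligand, F for free ligand, B_max for site density, K_d for the dissociation constant) are notation assumed for this note and do not come from the report itself; the two-site form is the usual independent-sites model, not necessarily the one the author fitted.

```latex
% Single class of independent sites: the Scatchard plot of B/F against B
% is a straight line with slope -1/K_d and x-intercept B_max.
\frac{B}{F} = \frac{B_{\max} - B}{K_d}

% Two independent classes of sites: the plot becomes curved, which is why
% nonlinear least-squares fitting is used instead of a straight-line fit.
B = \frac{B_{\max,1}\, F}{K_{d,1} + F} + \frac{B_{\max,2}\, F}{K_{d,2} + F}
```

A concave-up curve on the Scatchard plot is commonly read as multiple site classes (or negative cooperativity), whereas a concave-down curve is the classical signature of positive cooperativity.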
Bate, Paul; Warwicker, Jim
2004-07-02
Calculations of charge interactions complement analysis of a characterised active site, rationalising pH-dependence of activity and transition state stabilisation. Prediction of active site location through large DeltapK(a)s or electrostatic strain is relevant for structural genomics. We report a study of ionisable groups in a set of 20 enzymes, finding that false positives obscure predictive potential. In a larger set of 156 enzymes, peaks in solvent-space electrostatic properties are calculated. Both electric field and potential match well to active site location. The best correlation is found with electrostatic potential calculated from uniform charge density over enzyme volume, rather than from assignment of a standard atom-specific charge set. Studying a shell around each molecule, for 77% of enzymes the potential peak is within that 5% of the shell closest to the active site centre, and 86% within 10%. Active site identification by largest cleft, also with projection onto a shell, gives 58% of enzymes for which the centre of the largest cleft lies within 5% of the active site, and 70% within 10%. Dielectric boundary conditions emphasise clefts in the uniform charge density method, which is suited to recognition of binding pockets embedded within larger clefts. The variation of peak potential with distance from active site, and comparison between enzyme and non-enzyme sets, gives an optimal threshold distinguishing enzyme from non-enzyme. We find that 87% of the enzyme set exceeds the threshold as compared to 29% of the non-enzyme set. Enzyme/non-enzyme homologues, "structural genomics" annotated proteins and catalytic/non-catalytic RNAs are studied in this context.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bernard, Steffen M.; Akey, David L.; Tripathi, Ashootosh
Sugar moieties in natural products are frequently modified by O-methylation. In the biosynthesis of the macrolide antibiotic mycinamicin, methylation of a 6'-deoxyallose substituent occurs in a stepwise manner first at the 2'- and then the 3'-hydroxyl groups to produce the mycinose moiety in the final product. The timing and placement of the O-methylations impact final stage C-H functionalization reactions mediated by the P450 monooxygenase MycG. The structural basis of pathway ordering and substrate specificity is unknown. A series of crystal structures of MycF, the 3'-O-methyltransferase, including the free enzyme and complexes with S-adenosyl homocysteine (SAH), substrate, product, and unnatural substrates, show that SAM binding induces substantial ordering that creates the binding site for the natural substrate, and a bound metal ion positions the substrate for catalysis. A single amino acid substitution relaxed the 2'-methoxy specificity but retained regiospecificity. The engineered variant produced a new mycinamicin analog, demonstrating the utility of structural information to facilitate bioengineering approaches for the chemoenzymatic synthesis of complex small molecules containing modified sugars. Using the MycF substrate complex and the modeled substrate complex of a 4'-specific homolog, active site residues were identified that correlate with the 3'- or 4'- specificity of MycF family members and define the protein and substrate features that direct the regiochemistry of methyltransfer. Lastly, this classification scheme will be useful in the annotation of new secondary metabolite pathways that utilize this family of enzymes.
NASA Astrophysics Data System (ADS)
Chenge, Jude; Kavanagh, Madeline E.; Driscoll, Max D.; McLean, Kirsty J.; Young, Douglas B.; Cortes, Teresa; Matak-Vinkovic, Dijana; Levy, Colin W.; Rigby, Stephen E. J.; Leys, David; Abell, Chris; Munro, Andrew W.
2016-05-01
Mycobacterium tuberculosis (Mtb) causes the disease tuberculosis (TB). The virulent Mtb H37Rv strain encodes 20 cytochrome P450 (CYP) enzymes, many of which are implicated in Mtb survival and pathogenicity in the human host. Bioinformatics analysis revealed that CYP144A1 is retained exclusively within the Mycobacterium genus, particularly in species causing human and animal disease. Transcriptomic annotation revealed two possible CYP144A1 start codons, leading to expression of (i) a “full-length” 434 amino acid version (CYP144A1-FLV) and (ii) a “truncated” 404 amino acid version (CYP144A1-TRV). Computational analysis predicted that the extended N-terminal region of CYP144A1-FLV is largely unstructured. CYP144A1 FLV and TRV forms were purified in heme-bound states. Mass spectrometry confirmed production of intact, His6-tagged forms of CYP144A1-FLV and -TRV, with EPR demonstrating cysteine thiolate coordination of heme iron in both cases. Hydrodynamic analysis indicated that both CYP144A1 forms are monomeric. CYP144A1-TRV was crystallized and the first structure of a CYP144 family P450 protein determined. CYP144A1-TRV has an open structure primed for substrate binding, with a large active site cavity. Our data provide the first evidence that Mtb produces two different forms of CYP144A1 from alternative transcripts, with CYP144A1-TRV generated from a leaderless transcript lacking a 5'-untranslated region and Shine-Dalgarno ribosome binding site.
Bernard, Steffen M.; Akey, David L.; Tripathi, Ashootosh; ...
2015-02-18
Sugar moieties in natural products are frequently modified by O-methylation. In the biosynthesis of the macrolide antibiotic mycinamicin, methylation of a 6'-deoxyallose substituent occurs in a stepwise manner first at the 2'- and then the 3'-hydroxyl groups to produce the mycinose moiety in the final product. The timing and placement of the O-methylations impact final stage C-H functionalization reactions mediated by the P450 monooxygenase MycG. The structural basis of pathway ordering and substrate specificity is unknown. A series of crystal structures of MycF, the 3'-O-methyltransferase, including the free enzyme and complexes with S-adenosyl homocysteine (SAH), substrate, product, and unnatural substrates, show that SAM binding induces substantial ordering that creates the binding site for the natural substrate, and a bound metal ion positions the substrate for catalysis. A single amino acid substitution relaxed the 2'-methoxy specificity but retained regiospecificity. The engineered variant produced a new mycinamicin analog, demonstrating the utility of structural information to facilitate bioengineering approaches for the chemoenzymatic synthesis of complex small molecules containing modified sugars. Using the MycF substrate complex and the modeled substrate complex of a 4'-specific homolog, active site residues were identified that correlate with the 3'- or 4'- specificity of MycF family members and define the protein and substrate features that direct the regiochemistry of methyltransfer. Lastly, this classification scheme will be useful in the annotation of new secondary metabolite pathways that utilize this family of enzymes.
Loughran, Gary; Jungreis, Irwin; Tzani, Ioanna; Power, Michael; Dmitriev, Ruslan I.; Ivanov, Ivaylo P.; Kellis, Manolis; Atkins, John F.
2018-01-01
Although stop codon readthrough is used extensively by viruses to expand their gene expression, verified instances of mammalian readthrough have only recently been uncovered by systems biology and comparative genomics approaches. Previously, our analysis of conserved protein coding signatures that extend beyond annotated stop codons predicted stop codon readthrough of several mammalian genes, all of which have been validated experimentally. Four mRNAs display highly efficient stop codon readthrough, and these mRNAs have a UGA stop codon immediately followed by CUAG (UGA_CUAG) that is conserved throughout vertebrates. Extending on the identification of this readthrough motif, we here investigated stop codon readthrough, using tissue culture reporter assays, for all previously untested human genes containing UGA_CUAG. The readthrough efficiency of the annotated stop codon for the sequence encoding vitamin D receptor (VDR) was 6.7%. It was the highest of those tested but all showed notable levels of readthrough. The VDR is a member of the nuclear receptor superfamily of ligand-inducible transcription factors, and it binds its major ligand, calcitriol, via its C-terminal ligand-binding domain. Readthrough of the annotated VDR mRNA results in a 67 amino acid–long C-terminal extension that generates a VDR proteoform named VDRx. VDRx may form homodimers and heterodimers with VDR but, compared with VDR, VDRx displayed a reduced transcriptional response to calcitriol even in the presence of its partner retinoid X receptor. PMID:29386352
Defining functional DNA elements in the human genome
Kellis, Manolis; Wold, Barbara; Snyder, Michael P.; Bernstein, Bradley E.; Kundaje, Anshul; Marinov, Georgi K.; Ward, Lucas D.; Birney, Ewan; Crawford, Gregory E.; Dekker, Job; Dunham, Ian; Elnitski, Laura L.; Farnham, Peggy J.; Feingold, Elise A.; Gerstein, Mark; Giddings, Morgan C.; Gilbert, David M.; Gingeras, Thomas R.; Green, Eric D.; Guigo, Roderic; Hubbard, Tim; Kent, Jim; Lieb, Jason D.; Myers, Richard M.; Pazin, Michael J.; Ren, Bing; Stamatoyannopoulos, John A.; Weng, Zhiping; White, Kevin P.; Hardison, Ross C.
2014-01-01
With the completion of the human genome sequence, attention turned to identifying and annotating its functional DNA elements. As a complement to genetic and comparative genomics approaches, the Encyclopedia of DNA Elements Project was launched to contribute maps of RNA transcripts, transcriptional regulator binding sites, and chromatin states in many cell types. The resulting genome-wide data reveal sites of biochemical activity with high positional resolution and cell type specificity that facilitate studies of gene regulation and interpretation of noncoding variants associated with human disease. However, the biochemically active regions cover a much larger fraction of the genome than do evolutionarily conserved regions, raising the question of whether nonconserved but biochemically active regions are truly functional. Here, we review the strengths and limitations of biochemical, evolutionary, and genetic approaches for defining functional DNA segments, potential sources for the observed differences in estimated genomic coverage, and the biological implications of these discrepancies. We also analyze the relationship between signal intensity, genomic coverage, and evolutionary conservation. Our results reinforce the principle that each approach provides complementary information and that we need to use combinations of all three to elucidate genome function in human biology and disease. PMID:24753594
2009-01-01
Background Bacillus anthracis, the etiologic agent of anthrax, has recently been used as an agent of bioterrorism. The innate immune system initially appears to contain the pathogen at the site of entry. Because the human alveolar macrophage (HAM) plays a key role in lung innate immune responses, studying the HAM response to B. anthracis is important in understanding the pathogenesis of the pulmonary form of this disease. Methods In this paper, the transcriptional profile of B. anthracis spore-treated HAM was compared with that of mock-infected cells, and differentially expressed genes were identified by Affymetrix microarray analysis. A portion of the results were verified by Luminex protein analysis. Results The majority of genes modulated by spores were upregulated, and a lesser number were downregulated. The differentially expressed genes were subjected to Ingenuity Pathway analysis, the Database for Annotation, Visualization and Integrated Discovery (DAVID) analysis, the Promoter Analysis and Interaction Network Toolset (PAINT) and Oncomine analysis. Among the upregulated genes, we identified a group of chemokine ligand, apoptosis, and, interestingly, keratin filament genes. Central hubs regulating the activated genes were TNF-α, NF-κB and their ligands/receptors. In addition to TNF-α, a broad range of cytokines was induced, and this was confirmed at the level of translation by Luminex multiplex protein analysis. PAINT analysis revealed that many of the genes affected by spores contain the binding site for c-Rel, a member of the NF-κB family of transcription factors. Other transcription regulatory elements contained in many of the upregulated genes were c-Myb, CP2, Barbie Box, E2F and CRE-BP1. However, many of the genes are poorly annotated, indicating that they represent novel functions. Four of the genes most highly regulated by spores have only previously been associated with head and neck and lung carcinomas. 
Conclusion The results demonstrate not only that TNF-α and NF-κB are key components of the innate immune response to the pathogen, but also that a large part of the mechanisms by which the alveolar macrophage responds to B. anthracis are still unknown as many of the genes involved are poorly annotated. PMID:19744333
ERIC Educational Resources Information Center
Schaber, Robin L.
2002-01-01
Provides an annotated bibliography of Web sites that focus on using film to teach history. Includes Web sites in five areas: (1) film and education; (2) history of cinema; (3) film and history resources; (4) film and women; and (5) film organizations. (CMK)
Environmental Information Sources on the Net.
ERIC Educational Resources Information Center
Raeder, Aggi
1997-01-01
Discusses environmental information needs of business professionals and provides an annotated list of Web sites serving as information sources. Highlights include "meta sites", government, health, law, engineering, education, organizations, and environmental news, as well as selected environmental "hot topics." (AEF)
Non-Coding RNA Analysis Using the Rfam Database.
Kalvari, Ioanna; Nawrocki, Eric P; Argasinska, Joanna; Quinones-Olvera, Natalia; Finn, Robert D; Bateman, Alex; Petrov, Anton I
2018-06-01
Rfam is a database of non-coding RNA families in which each family is represented by a multiple sequence alignment, a consensus secondary structure, and a covariance model. Using a combination of manual and literature-based curation and a custom software pipeline, Rfam converts descriptions of RNA families found in the scientific literature into computational models that can be used to annotate RNAs belonging to those families in any DNA or RNA sequence. Valuable research outputs that are often locked up in figures and supplementary information files are encapsulated in Rfam entries and made accessible through the Rfam Web site. The data produced by Rfam have a broad application, from genome annotation to providing training sets for algorithm development. This article gives an overview of how to search and navigate the Rfam Web site, and how to annotate sequences with RNA families. The Rfam database is freely available at http://rfam.org. © 2018 by John Wiley & Sons, Inc. Copyright © 2018 John Wiley & Sons, Inc.
Clifford, Jacob; Adami, Christoph
2015-09-02
Transcription factor binding to the surface of DNA regulatory regions is one of the primary causes of regulating gene expression levels. A probabilistic approach to model protein-DNA interactions at the sequence level is through position weight matrices (PWMs) that estimate the joint probability of a DNA binding site sequence by assuming positional independence within the DNA sequence. Here we construct conditional PWMs that depend on the motif signatures in the flanking DNA sequence, by conditioning known binding site loci on the presence or absence of additional binding sites in the flanking sequence of each site's locus. Pooling known sites with similar flanking sequence patterns allows for the estimation of the conditional distribution function over the binding site sequences. We apply our model to the Dorsal transcription factor binding sites active in patterning the Dorsal-Ventral axis of Drosophila development. We find that those binding sites that cooperate with nearby Twist sites on average contain about 0.5 bits of information about the presence of Twist transcription factor binding sites in the flanking sequence. We also find that Dorsal binding site detectors conditioned on flanking sequence information make better predictions about what is a Dorsal site relative to background DNA than detection without information about flanking sequence features.
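The positional-independence assumption and the idea of pooling sites by flanking context can be illustrated with a minimal Python sketch. This is not the authors' code: the toy sequences, the pseudocount of 1, the uniform background of 0.25, and the function names are all assumptions made for illustration.

```python
import math

def build_pwm(sites, pseudocount=1.0, alphabet="ACGT"):
    """Estimate per-position base probabilities from equal-length aligned sites."""
    pwm = []
    for pos in range(len(sites[0])):
        counts = {base: pseudocount for base in alphabet}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({base: counts[base] / total for base in alphabet})
    return pwm

def log_odds_score(pwm, seq, background=0.25):
    """Positional independence turns the joint probability into a product,
    so the log-odds score is a sum over positions."""
    return sum(math.log2(col[base] / background) for col, base in zip(pwm, seq))

def conditional_pwms(sites, has_flanking_site):
    """Pool sites by a flanking-context flag and fit one PWM per pool."""
    with_flank = [s for s, f in zip(sites, has_flanking_site) if f]
    without_flank = [s for s, f in zip(sites, has_flanking_site) if not f]
    return build_pwm(with_flank), build_pwm(without_flank)

# Hypothetical toy data: four aligned 4-mers, two of them with a flanking partner site.
TOY_SITES = ["ACGT", "ACGT", "ACGA", "TCGT"]
TOY_FLAGS = [True, True, False, False]
pwm_all = build_pwm(TOY_SITES)
pwm_with, pwm_without = conditional_pwms(TOY_SITES, TOY_FLAGS)
```

Scoring a candidate sequence with `pwm_with` versus `pwm_without` then depends on whether a partner site was detected in its flank, which is the conditioning step the abstract describes.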
Johnson, Matthew E; Deliard, Sandra; Zhu, Fengchang; Xia, Qianghua; Wells, Andrew D; Hankenson, Kurt D; Grant, Struan F A
2014-04-01
Genome-wide association studies (GWAS) have demonstrated that genetic variation at the MADS box transcription enhancer factor 2, polypeptide C (MEF2C) locus is robustly associated with bone mineral density, primarily at the femoral neck. MEF2C is a transcription factor known to operate via the Wnt signaling pathway. Our hypothesis was that MEF2C regulates the expression of a set of molecular pathways critical to skeletal function. Drawing on our laboratory and bioinformatic experience with ChIP-seq, we analyzed ChIP-seq data for MEF2C available via the ENCODE project to gain insight into its global genomic binding pattern. We aligned the ChIP-seq data generated for GM12878 (an established lymphoblastoid cell line) and, using the analysis package HOMER, observed a total of 17,611 binding sites corresponding to 8,118 known genes. We then performed a pathway analysis of the gene list using Ingenuity. At 5 kb, the gene list yielded 'EIF2 Signaling' as the most significant annotation, with a P value of 5.01 × 10^-26. Moving further out, this category remained the top pathway at 50 and 100 kb, then dropped to second place, behind 'Molecular Mechanisms of Cancer', at 500 kb and beyond. In addition, at 50 kb and beyond 'RANK Signaling in Osteoclasts' was a consistent feature and resonates with the main general finding from GWAS of bone density. We also observed that MEF2C binding sites were significantly enriched primarily near inflammation associated genes identified from GWAS; indeed, a similar enrichment for inflammation genes has been reported previously using a similar approach for the vitamin D receptor, an established key regulator of bone turnover. Our analyses point to known connective tissue and skeletal processes but also provide novel insights into networks involved in skeletal regulation. The fact that a specific GWAS category is enriched points to a possible role of inflammation through which it impacts bone mineral density.
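The distance-window step described above (gene lists at 5 kb, 50 kb and beyond) can be sketched in Python as follows. This is a simplified illustration, not HOMER's assignment logic: it assumes each gene is represented by a single transcription start site, and the coordinates and gene names are hypothetical.

```python
def genes_near_sites(sites, genes, window):
    """Return names of genes whose TSS lies within `window` bp of any binding site.

    sites: list of (chrom, position) tuples
    genes: list of (name, chrom, tss) tuples
    """
    nearby = set()
    for name, chrom, tss in genes:
        if any(c == chrom and abs(pos - tss) <= window for c, pos in sites):
            nearby.add(name)
    return nearby

# Hypothetical toy coordinates, for illustration only.
toy_sites = [("chr1", 1000), ("chr2", 50000)]
toy_genes = [("GENE_A", "chr1", 4000),
             ("GENE_B", "chr1", 40000),
             ("GENE_C", "chr2", 52000)]

genes_5kb = genes_near_sites(toy_sites, toy_genes, 5_000)
genes_50kb = genes_near_sites(toy_sites, toy_genes, 50_000)
```

Widening the window can only add genes to the list, which is one reason the top-ranked pathway may change between 5 kb and 500 kb.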
The identification and functional annotation of RNA structures conserved in vertebrates.
Seemann, Stefan E; Mirza, Aashiq H; Hansen, Claus; Bang-Berthelsen, Claus H; Garde, Christian; Christensen-Dalsgaard, Mikkel; Torarinsson, Elfar; Yao, Zizhen; Workman, Christopher T; Pociot, Flemming; Nielsen, Henrik; Tommerup, Niels; Ruzzo, Walter L; Gorodkin, Jan
2017-08-01
Structured elements of RNA molecules are essential in, e.g., RNA stabilization, localization, and protein interaction, and their conservation across species suggests a common functional role. We computationally screened vertebrate genomes for conserved RNA structures (CRSs), leveraging structure-based, rather than sequence-based, alignments. After careful correction for sequence identity and GC content, we predict ∼516,000 human genomic regions containing CRSs. We find that a substantial fraction of human-mouse CRS regions (1) colocalize consistently with binding sites of the same RNA binding proteins (RBPs) or (2) are transcribed in corresponding tissues. Additionally, a CaptureSeq experiment revealed expression of many of our CRS regions in human fetal brain, including 662 novel ones. For selected human and mouse candidate pairs, qRT-PCR and in vitro RNA structure probing supported both shared expression and shared structure despite low abundance and low sequence identity. About 30,000 CRS regions are located near coding or long noncoding RNA genes or within enhancers. Structured (CRS overlapping) enhancer RNAs and extended 3' ends have significantly increased expression levels over their nonstructured counterparts. Our findings of transcribed uncharacterized regulatory regions that contain CRSs support their RNA-mediated functionality. © 2017 Seemann et al.; Published by Cold Spring Harbor Laboratory Press.
Deconvoluting AMP-activated protein kinase (AMPK) adenine nucleotide binding and sensing
Gu, Xin; Yan, Yan; Novick, Scott J.; Kovach, Amanda; Goswami, Devrishi; Ke, Jiyuan; Tan, M. H. Eileen; Wang, Lili; Li, Xiaodan; de Waal, Parker W.; Webb, Martin R.; Griffin, Patrick R.; Xu, H. Eric
2017-01-01
AMP-activated protein kinase (AMPK) is a central cellular energy sensor that adapts metabolism and growth to the energy state of the cell. AMPK senses the ratio of adenine nucleotides (adenylate energy charge) by competitive binding of AMP, ADP, and ATP to three sites (CBS1, CBS3, and CBS4) in its γ-subunit. Because these three binding sites are functionally interconnected, it remains unclear how nucleotides bind to individual sites, which nucleotides occupy each site under physiological conditions, and how binding to one site affects binding to the other sites. Here, we comprehensively analyze nucleotide binding to wild-type and mutant AMPK protein complexes by quantitative competition assays and by hydrogen-deuterium exchange MS. We also demonstrate that NADPH, in addition to the known AMPK ligand NADH, directly and competitively binds AMPK at the AMP-sensing CBS3 site. Our findings reveal how AMP binding to one site affects the conformation and adenine nucleotide binding at the other two sites and establish CBS3, and not CBS1, as the high affinity exchangeable AMP/ADP/ATP-binding site. We further show that AMP binding at CBS4 increases AMP binding at CBS3 by 2 orders of magnitude and reverses the AMP/ATP preference of CBS3. Together, these results illustrate how the three CBS sites collaborate to enable highly sensitive detection of cellular energy states to maintain the tight ATP homeostasis required for cellular metabolism. PMID:28615457
Standardized description of scientific evidence using the Evidence Ontology (ECO)
Chibucos, Marcus C.; Mungall, Christopher J.; Balakrishnan, Rama; Christie, Karen R.; Huntley, Rachael P.; White, Owen; Blake, Judith A.; Lewis, Suzanna E.; Giglio, Michelle
2014-01-01
The Evidence Ontology (ECO) is a structured, controlled vocabulary for capturing evidence in biological research. ECO includes diverse terms for categorizing evidence that supports annotation assertions including experimental types, computational methods, author statements and curator inferences. Using ECO, annotation assertions can be distinguished according to the evidence they are based on such as those made by curators versus those automatically computed or those made via high-throughput data review versus single test experiments. Originally created for capturing evidence associated with Gene Ontology annotations, ECO is now used in other capacities by many additional annotation resources including UniProt, Mouse Genome Informatics, Saccharomyces Genome Database, PomBase, the Protein Information Resource and others. Information on the development and use of ECO can be found at http://evidenceontology.org. The ontology is freely available under Creative Commons license (CC BY-SA 3.0), and can be downloaded in both Open Biological Ontologies and Web Ontology Language formats at http://code.google.com/p/evidenceontology. Also at this site is a tracker for user submission of term requests and questions. ECO remains under active development in response to user-requested terms and in collaborations with other ontologies and database resources. Database URL: Evidence Ontology Web site: http://evidenceontology.org PMID:25052702
Naqvi, Ahmad Abu Turab; Shahbaaz, Mohd; Ahmad, Faizan; Hassan, Md Imtaiyaz
2015-01-01
Syphilis is a globally occurring venereal disease, and its infection is propagated through sexual contact. The causative agent of syphilis, Treponema pallidum ssp. pallidum, a Gram-negative spirochaete, is an obligate human parasite. The genome of the T. pallidum ssp. pallidum SS14 strain (RefSeq NC_010741.1) encodes 1,027 proteins, of which 444 are known as hypothetical proteins (HPs), i.e., proteins of unknown function. Here, we performed functional annotation of the HPs of T. pallidum ssp. pallidum using various databases, domain architecture predictors, protein function annotators and clustering tools. We analyzed the sequences of the 444 HPs and predicted the function of 207 HPs with a high level of confidence. However, the functions of the remaining 237 HPs are predicted with less accuracy. We found various enzymes, transporters, and binding proteins in the annotated group of HPs that may be possible molecular targets and may facilitate the survival of the pathogen. Our comprehensive analysis helps in understanding the mechanism of pathogenesis and may provide many novel potential therapeutic interventions.
An Electrostatic Funnel in the GABA-Binding Pathway
Lightstone, Felice C.
2016-01-01
The γ-aminobutyric acid type A receptor (GABAA-R) is a major inhibitory neuroreceptor that is activated by the binding of GABA. The structure of the GABAA-R is well characterized, and many of the binding site residues have been identified. However, most of these residues are obscured behind the C-loop that acts as a cover to the binding site. Thus, the mechanism by which the GABA molecule recognizes the binding site, and the pathway it takes to enter the binding site are both unclear. Through the completion and detailed analysis of 100 short, unbiased, independent molecular dynamics simulations, we have investigated this phenomenon of GABA entering the binding site. In each system, GABA was placed quasi-randomly near the binding site of a GABAA-R homology model, and atomistic simulations were carried out to observe the behavior of the GABA molecules. GABA fully entered the binding site in 19 of the 100 simulations. The pathway taken by these molecules was consistent and non-random; the GABA molecules approach the binding site from below, before passing up behind the C-loop and into the binding site. This binding pathway is driven by long-range electrostatic interactions, whereby the electrostatic field acts as a ‘funnel’ that sweeps the GABA molecules towards the binding site, at which point more specific atomic interactions take over. These findings define a nuanced mechanism whereby the GABAA-R uses the general zwitterionic features of the GABA molecule to identify a potential ligand some 2 nm away from the binding site. PMID:27119953
Career and Employment Resources on the Internet.
ERIC Educational Resources Information Center
Fenske, Rachel F.
1997-01-01
Presents an annotated list of career and employment Web sites to assist librarians and job seekers with locating information on all aspects of career and job searching. Provides general indexes and sites specializing in career fairs, resume services, relocation, and newsgroups. (AEF)
Le, Vu H.; Buscaglia, Robert; Chaires, Jonathan B.; Lewis, Edwin A.
2013-01-01
Isothermal Titration Calorimetry (ITC) is a powerful technique that can be used to estimate a complete set of thermodynamic parameters (e.g. Keq (or ΔG), ΔH, ΔS, and n) for a ligand binding interaction described by a thermodynamic model. Thermodynamic models are constructed by combination of equilibrium constant, mass balance, and charge balance equations for the system under study. Commercial ITC instruments are supplied with software that includes a number of simple interaction models, for example one binding site, two binding sites, sequential sites, and n-independent binding sites. More complex models (for example, three or more binding sites, one site with multiple binding mechanisms, linked equilibria, or equilibria involving macromolecular conformational selection through ligand binding) need to be developed on a case-by-case basis by the ITC user. In this paper we provide an algorithm (and a link to our MATLAB program) for the non-linear regression analysis of a multiple binding site model with up to four overlapping binding equilibria. Error analysis demonstrates that fitting ITC data for multiple parameters (e.g. up to nine parameters in the three binding site model) yields thermodynamic parameters with acceptable accuracy. PMID:23262283
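How an equilibrium expression and a mass balance combine into a binding model can be shown with a short Python sketch for independent site classes. This is not the authors' MATLAB program or algorithm: the independent-sites form, the bisection solver, and all parameter values are assumptions chosen for illustration.

```python
def bound_per_macromolecule(free_ligand, site_classes):
    """Moles of ligand bound per mole of macromolecule.

    site_classes: list of (n, K) pairs, where n is the stoichiometry and
    K the association constant (1/M) of each independent class of sites.
    """
    return sum(n * K * free_ligand / (1.0 + K * free_ligand)
               for n, K in site_classes)

def solve_free_ligand(total_ligand, total_macromolecule, site_classes):
    """Solve the mass balance L_total = L_free + M_total * bound(L_free).

    The right-hand side increases monotonically with L_free, so plain
    bisection on [0, L_total] is sufficient.
    """
    lo, hi = 0.0, total_ligand
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        residual = (mid
                    + total_macromolecule * bound_per_macromolecule(mid, site_classes)
                    - total_ligand)
        if residual > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Hypothetical example: one site class, K = 1e6 1/M, 10 uM ligand and macromolecule.
one_site = [(1.0, 1.0e6)]
free = solve_free_ligand(1.0e-5, 1.0e-5, one_site)
```

In a fitting routine, the heat of each injection would be computed from the change in bound ligand multiplied by ΔH for each class, and n, K and ΔH would be adjusted by non-linear regression.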
A tool for calculating binding-site residues on proteins from PDB structures.
Hu, Jing; Yan, Changhui
2009-08-03
In the research on protein functional sites, researchers often need to identify binding-site residues on a protein. A commonly used strategy is to find a complex structure from the Protein Data Bank (PDB) that consists of the protein of interest and its interacting partner(s) and calculate binding-site residues based on the complex structure. However, since a protein may participate in multiple interactions, the binding-site residues calculated based on one complex structure usually do not reveal all binding sites on a protein. Thus, this requires researchers to find all PDB complexes that contain the protein of interest and combine the binding-site information gleaned from them. This process is very time-consuming. In particular, combining binding-site information obtained from different PDB structures requires tedious work to align protein sequences. The process becomes overwhelmingly difficult when researchers have a large set of proteins to analyze, which is usually the case in practice. In this study, we have developed a tool for calculating binding-site residues on proteins, TCBRP (http://yanbioinformatics.cs.usu.edu:8080/ppbindingsubmit). For an input protein, TCBRP can quickly find all binding-site residues on the protein by automatically combining the information obtained from all PDB structures that consist of the protein of interest. Additionally, TCBRP presents the binding-site residues in different categories according to the interaction type. TCBRP also allows researchers to set the definition of binding-site residues. The developed tool is very useful for the research on protein binding site analysis and prediction.
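The two steps described above, calculating binding-site residues from one complex and combining them across complexes, can be sketched in Python. This is not TCBRP's implementation: the distance-cutoff definition, the 5 Å default, and the assumption that residue numbers are already mapped onto a common sequence are all illustrative choices.

```python
import math

def binding_site_residues(protein_atoms, partner_atoms, cutoff=5.0):
    """Residues with at least one atom within `cutoff` angstroms of the partner.

    protein_atoms: list of (residue_id, (x, y, z)) tuples
    partner_atoms: list of (x, y, z) tuples
    """
    hits = set()
    for residue_id, coord in protein_atoms:
        if residue_id in hits:
            continue  # residue already classified as binding-site
        if any(math.dist(coord, p) <= cutoff for p in partner_atoms):
            hits.add(residue_id)
    return hits

def merge_across_complexes(per_complex_sites):
    """Union of binding-site residues found in each complex structure."""
    merged = set()
    for residues in per_complex_sites:
        merged |= residues
    return sorted(merged)

# Hypothetical coordinates for illustration only.
toy_protein = [(10, (0.0, 0.0, 0.0)), (11, (10.0, 0.0, 0.0)), (12, (3.0, 0.0, 0.0))]
toy_partner = [(0.0, 0.0, 4.0)]
site_a = binding_site_residues(toy_protein, toy_partner)
all_sites = merge_across_complexes([site_a, {11}])
```

In practice the sequence-alignment step mentioned in the abstract would be needed before merging, because residue numbering often differs between PDB entries.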
The Ensembl genome database project.
Hubbard, T; Barker, D; Birney, E; Cameron, G; Chen, Y; Clark, L; Cox, T; Cuff, J; Curwen, V; Down, T; Durbin, R; Eyras, E; Gilbert, J; Hammond, M; Huminiecki, L; Kasprzyk, A; Lehvaslaiho, H; Lijnzaad, P; Melsopp, C; Mongin, E; Pettett, R; Pocock, M; Potter, S; Rust, A; Schmidt, E; Searle, S; Slater, G; Smith, J; Spooner, W; Stabenau, A; Stalker, J; Stupka, E; Ureta-Vidal, A; Vastrik, I; Clamp, M
2002-01-01
The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of the human genome sequence, with confirmed gene predictions that have been integrated with external data sources, and is available as either an interactive web site or as flat files. It is also an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements from sequence analysis to data storage and visualisation. The Ensembl site is one of the leading sources of human genome sequence annotation and provided much of the analysis for publication by the international human genome project of the draft genome. The Ensembl system is being installed around the world in both companies and academic sites on machines ranging from supercomputers to laptops.
Next Generation Models for Storage and Representation of Microbial Biological Annotation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Quest, Daniel J; Land, Miriam L; Brettin, Thomas S
2010-01-01
Background Traditional genome annotation systems were developed in a very different computing era, one where the World Wide Web was just emerging. Consequently, these systems are built as centralized black boxes focused on generating high quality annotation submissions to GenBank/EMBL supported by expert manual curation. The exponential growth of sequence data drives a growing need for increasingly higher quality and automatically generated annotation. Typical annotation pipelines utilize traditional database technologies, clustered computing resources, Perl, C, and UNIX file systems to process raw sequence data, identify genes, and predict and categorize gene function. These technologies tightly couple the annotation software system to hardware and third party software (e.g. relational database systems and schemas). This makes annotation systems hard to reproduce, inflexible to modification over time, difficult to assess, difficult to partition across multiple geographic sites, and difficult to understand for those who are not domain experts. These systems are not readily open to scrutiny and therefore not scientifically tractable. The advent of Semantic Web standards such as Resource Description Framework (RDF) and OWL Web Ontology Language (OWL) enables us to construct systems that address these challenges in a new comprehensive way. Results Here, we develop a framework for linking traditional data to OWL-based ontologies in genome annotation. We show how data standards can decouple hardware and third party software tools from annotation pipelines, thereby making annotation pipelines easier to reproduce and assess. An illustrative example shows how TURTLE (Terse RDF Triple Language) can be used as a human readable, but also semantically-aware, equivalent to GenBank/EMBL files.
Conclusions The power of this approach lies in its ability to assemble annotation data from multiple databases across multiple locations into a representation that is understandable to researchers. In this way, all researchers, experimental and computational, will more easily understand the informatics processes constructing genome annotation and ultimately be able to help improve the systems that produce them.
Ryden, T A; de Mars, M; Beemon, K
1993-01-01
Several C/EBP binding sites within the Rous sarcoma virus (RSV) long terminal repeat (LTR) and gag enhancers were mutated, and the effect of these mutations on viral gene expression was assessed. Minimal site-specific mutations in each of three adjacent C/EBP binding sites in the LTR reduced steady-state viral RNA levels. Double mutation of the two 5' proximal LTR binding sites resulted in production of 30% of wild-type levels of virus. DNase I footprinting analysis of mutant DNAs indicated that the mutations blocked C/EBP binding at the affected sites. Additional C/EBP binding sites were identified upstream of the 3' LTR and within the 5' end of the LTRs. Point mutations in the RSV gag intragenic enhancer region, which blocked binding of C/EBP at two of three adjacent C/EBP sites, also reduced virus production significantly. Nuclear extracts prepared from both chicken embryo fibroblasts (CEFs) and chicken muscle contained proteins binding to the same RSV DNA sites as did C/EBP, and mutations that prevented C/EBP binding also blocked binding of these chicken proteins. It appears that CEFs and chicken muscle contain distinct proteins binding to these RSV DNA sites; the CEF binding protein was heat stable, as is C/EBP, while the chicken muscle protein was heat sensitive. PMID:8386280
The Binding Sites of miR-619-5p in the mRNAs of Human and Orthologous Genes.
Atambayeva, Shara; Niyazova, Raigul; Ivashchenko, Anatoliy; Pyrkova, Anna; Pinsky, Ilya; Akimniyazova, Aigul; Labeit, Siegfried
2017-06-01
Normally, one miRNA interacts with the mRNA of one gene. However, there are miRNAs that can bind to many mRNAs, and one mRNA can be the target of many miRNAs. This significantly complicates the study of the properties of miRNAs and their diagnostic and medical applications. The search of 2,750 human microRNAs (miRNAs) binding sites in 12,175 mRNAs of human genes using the MirTarget program has been completed. For the binding sites of the miR-619-5p the hybridization free energy of the bonds was equal to 100% of the maximum potential free energy. The mRNAs of 201 human genes have complete complementary binding sites of miR-619-5p in the 3'UTR (214 sites), CDS (3 sites), and 5'UTR (4 sites). The mRNAs of CATAD1, ICA1L, GK5, POLH, and PRR11 genes have six miR-619-5p binding sites, and the mRNAs of OPA3 and CYP20A1 genes have eight and ten binding sites, respectively. All of these miR-619-5p binding sites are located in the 3'UTRs. The miR-619-5p binding site in the 5'UTR of mRNA of human USP29 gene is found in the mRNAs of orthologous genes of primates. Binding sites of miR-619-5p in the coding regions of mRNAs of C8H8orf44, C8orf44, and ISY1 genes encode the WLMPVIP oligopeptide, which is present in the orthologous proteins. Binding sites of miR-619-5p in the mRNAs of transcription factor genes ZNF429 and ZNF429 encode the AHACNP oligopeptide in another reading frame. Binding sites of miR-619-5p in the 3'UTRs of all human target genes are also present in the 3'UTRs of orthologous genes of mammals. The completely complementary binding sites for miR-619-5p are conservative in the orthologous mammalian genes. The majority of miR-619-5p binding sites are located in the 3'UTRs but some genes have miRNA binding sites in the 5'UTRs of mRNAs. Several genes have binding sites for miRNAs in the CDSs that are read in different open reading frames. Identical nucleotide sequences of binding sites encode different amino acids in different proteins. 
The binding sites of miR-619-5p in 3'UTRs, 5'UTRs and CDSs are conservative in the orthologous mammalian genes.
Song, Junfang; Duc, Céline; Storey, Kate G.; McLean, W. H. Irwin; Brown, Sara J.; Simpson, Gordon G.; Barton, Geoffrey J.
2014-01-01
The reference annotations made for a genome sequence provide the framework for all subsequent analyses of the genome. Correct and complete annotation in addition to the underlying genomic sequence is particularly important when interpreting the results of RNA-seq experiments where short sequence reads are mapped against the genome and assigned to genes according to the annotation. Inconsistencies in annotations between the reference and the experimental system can lead to incorrect interpretation of the effect on RNA expression of an experimental treatment or mutation in the system under study. Until recently, the genome-wide annotation of 3′ untranslated regions received less attention than coding regions and the delineation of intron/exon boundaries. In this paper, data produced for samples in Human, Chicken and A. thaliana by the novel single-molecule, strand-specific, Direct RNA Sequencing technology from Helicos Biosciences which locates 3′ polyadenylation sites to within +/− 2 nt, were combined with archival EST and RNA-Seq data. Nine examples are illustrated where this combination of data allowed: (1) gene and 3′ UTR re-annotation (including extension of one 3′ UTR by 5.9 kb); (2) disentangling of gene expression in complex regions; (3) clearer interpretation of small RNA expression and (4) identification of novel genes. While the specific examples displayed here may become obsolete as genome sequences and their annotations are refined, the principles laid out in this paper will be of general use both to those annotating genomes and those seeking to interpret existing publically available annotations in the context of their own experimental data. PMID:24722185
NASA Technical Reports Server (NTRS)
Winchester, S. K.; Selvamurugan, N.; D'Alonzo, R. C.; Partridge, N. C.
2000-01-01
Collagenase-3 mRNA is initially detectable when osteoblasts cease proliferation, increasing during differentiation and mineralization. We showed that this developmental expression is due to an increase in collagenase-3 gene transcription. Mutation of either the activator protein-1 or the runt domain binding site decreased collagenase-3 promoter activity, demonstrating that these sites are responsible for collagenase-3 gene transcription. The activator protein-1 and runt domain binding sites bind members of the activator protein-1 and core-binding factor family of transcription factors, respectively. We identified core-binding factor a1 binding to the runt domain binding site and JunD in addition to a Fos-related antigen binding to the activator protein-1 site. Overexpression of both c-Fos and c-Jun in osteoblasts or core-binding factor a1 increased collagenase-3 promoter activity. Furthermore, overexpression of c-Fos, c-Jun, and core-binding factor a1 synergistically increased collagenase-3 promoter activity. Mutation of either the activator protein-1 or the runt domain binding site resulted in the inability of c-Fos and c-Jun or core-binding factor a1 to increase collagenase-3 promoter activity, suggesting that there is cooperative interaction between the sites and the proteins. Overexpression of Fra-2 and JunD repressed core-binding factor a1-induced collagenase-3 promoter activity. Our results suggest that members of the activator protein-1 and core-binding factor families, binding to the activator protein-1 and runt domain binding sites are responsible for the developmental regulation of collagenase-3 gene expression in osteoblasts.
New insight into the binding modes of TNP-AMP to human liver fructose-1,6-bisphosphatase
NASA Astrophysics Data System (ADS)
Han, Xinya; Huang, Yunyuan; Zhang, Rui; Xiao, San; Zhu, Shuaihuan; Qin, Nian; Hong, Zongqin; Wei, Lin; Feng, Jiangtao; Ren, Yanliang; Feng, Lingling; Wan, Jian
2016-08-01
Human liver fructose-1,6-bisphosphatase (FBPase) contains two binding sites, a substrate fructose-1,6-bisphosphate (FBP) active site and an adenosine monophosphate (AMP) allosteric site. The FBP active site works by stabilizing the FBPase, and the allosteric site impairs the activity of FBPase through its binding of a nonsubstrate molecule. The fluorescent AMP analogue, 2',3'-O-(2,4,6-trinitrophenyl)adenosine 5'-monophosphate (TNP-AMP) has been used as a fluorescent probe as it is able to competitively inhibit AMP binding to the AMP allosteric site and, therefore, could be used for exploring the binding modes of inhibitors targeted on the allosteric site. In this study, we have re-examined the binding modes of TNP-AMP to FBPase. However, our present enzyme kinetic assays show that AMP and FBP both can reduce the fluorescence from the bound TNP-AMP through competition for FBPase, suggesting that TNP-AMP binds not only to the AMP allosteric site but also to the FBP active site. Mutagenesis assays of K274L (located in the FBP active site) show that the residue K274 is very important for TNP-AMP to bind to the active site of FBPase. The results further prove that TNP-AMP is able to bind individually to both sites. Our present study provides a new insight into the binding mechanism of TNP-AMP to the FBPase. The TNP-AMP fluorescent probe can be used to examine the binding site of an inhibitor (the active site or the allosteric site) using FBPase saturated by AMP and FBP, respectively, or the K274L mutant FBPase.
New insight into the binding modes of TNP-AMP to human liver fructose-1,6-bisphosphatase.
Han, Xinya; Huang, Yunyuan; Zhang, Rui; Xiao, San; Zhu, Shuaihuan; Qin, Nian; Hong, Zongqin; Wei, Lin; Feng, Jiangtao; Ren, Yanliang; Feng, Lingling; Wan, Jian
2016-08-05
Human liver fructose-1,6-bisphosphatase (FBPase) contains two binding sites, a substrate fructose-1,6-bisphosphate (FBP) active site and an adenosine monophosphate (AMP) allosteric site. The FBP active site works by stabilizing the FBPase, and the allosteric site impairs the activity of FBPase through its binding of a nonsubstrate molecule. The fluorescent AMP analogue, 2',3'-O-(2,4,6-trinitrophenyl)adenosine 5'-monophosphate (TNP-AMP) has been used as a fluorescent probe as it is able to competitively inhibit AMP binding to the AMP allosteric site and, therefore, could be used for exploring the binding modes of inhibitors targeted on the allosteric site. In this study, we have re-examined the binding modes of TNP-AMP to FBPase. However, our present enzyme kinetic assays show that AMP and FBP both can reduce the fluorescence from the bound TNP-AMP through competition for FBPase, suggesting that TNP-AMP binds not only to the AMP allosteric site but also to the FBP active site. Mutagenesis assays of K274L (located in the FBP active site) show that the residue K274 is very important for TNP-AMP to bind to the active site of FBPase. The results further prove that TNP-AMP is able to bind individually to both sites. Our present study provides a new insight into the binding mechanism of TNP-AMP to the FBPase. The TNP-AMP fluorescent probe can be used to examine the binding site of an inhibitor (the active site or the allosteric site) using FBPase saturated by AMP and FBP, respectively, or the K274L mutant FBPase. Copyright © 2016 Elsevier B.V. All rights reserved.
Mechanism of Metal Ion Activation of the Diphtheria Toxin Repressor DtxR
NASA Astrophysics Data System (ADS)
D'Aquino, J. Alejandro; Ringe, Dagmar
2006-08-01
The diphtheria toxin repressor, DtxR, is a metal ion-activated transcriptional regulator that has been linked to the virulence of Corynebacterium diphtheriae. Structure determination has shown that there are two metal ion binding sites per repressor monomer, and site-directed mutagenesis has demonstrated that binding site 2 (primary) is essential for recognition of the target DNA repressor, leaving the role of binding site 1 (ancillary) unclear (1-3). Calorimetric techniques have demonstrated that while binding site 1 (ancillary) has high affinity for metal ion with a binding constant of 2 × 10⁻⁷, binding site 2 (primary) is a low affinity binding site with a binding constant of 6.3 × 10⁻⁴. These two binding sites act independently and their contribution can be easily dissected by traditional mutational analysis. Our results clearly demonstrate that binding site 1 (ancillary) is the first one to be occupied during metal ion activation, playing a critical role in stabilization of the repressor. In addition, structural data obtained for the mutants Ni-DtxR(H79A,C102D), reported here and the previously reported DtxR(H79A) (4) has allowed us to propose a mechanism of metal ion activation for DtxR.
Allosteric binding sites in Rab11 for potential drug candidates
2018-01-01
Rab11 is an important protein subfamily in the RabGTPase family. These proteins physiologically function as key regulators of intracellular membrane trafficking processes. Pathologically, Rab11 proteins are implicated in many diseases including cancers, neurodegenerative diseases and type 2 diabetes. Although they are medically important, no previous study has found Rab11 allosteric binding sites where potential drug candidates can bind to. In this study, by employing multiple clustering approaches integrating principal component analysis, independent component analysis and locally linear embedding, we performed structural analyses of Rab11 and identified eight representative structures. Using these representatives to perform binding site mapping and virtual screening, we identified two novel binding sites in Rab11 and small molecules that can preferentially bind to different conformations of these sites with high affinities. After identifying the binding sites and the residue interaction networks in the representatives, we computationally showed that these binding sites may allosterically regulate Rab11, as these sites communicate with switch 2 region that binds to GTP/GDP. These two allosteric binding sites in Rab11 are also similar to two allosteric pockets in Ras that we discovered previously. PMID:29874286
Gene Ontology annotation of the rice blast fungus, Magnaporthe oryzae
Meng, Shaowu; Brown, Douglas E; Ebbole, Daniel J; Torto-Alalibo, Trudy; Oh, Yeon Yee; Deng, Jixin; Mitchell, Thomas K; Dean, Ralph A
2009-01-01
Background Magnaporthe oryzae, the causal agent of blast disease of rice, is the most destructive disease of rice worldwide. The genome of this fungal pathogen has been sequenced and an automated annotation has recently been updated to Version 6. However, a comprehensive manual curation remains to be performed. Gene Ontology (GO) annotation is a valuable means of assigning functional information using standardized vocabulary. We report an overview of the GO annotation for Version 5 of M. oryzae genome assembly. Methods A similarity-based (i.e., computational) GO annotation with manual review was conducted, which was then integrated with a literature-based GO annotation with computational assistance. For similarity-based GO annotation a stringent reciprocal best hits method was used to identify similarity between predicted proteins of M. oryzae and GO proteins from multiple organisms with published associations to GO terms. Significant alignment pairs were manually reviewed. Functional assignments were further cross-validated with manually reviewed data, conserved domains, or data determined by wet lab experiments. Additionally, biological appropriateness of the functional assignments was manually checked. Results In total, 6,286 proteins received GO term assignment via the homology-based annotation, including 2,870 hypothetical proteins. Literature-based experimental evidence, such as microarray, MPSS, T-DNA insertion mutation, or gene knockout mutation, resulted in 2,810 proteins being annotated with GO terms. Of these, 1,673 proteins were annotated with new terms developed for Plant-Associated Microbe Gene Ontology (PAMGO). In addition, 67 experiment-determined secreted proteins were annotated with PAMGO terms. Integration of the two data sets resulted in 7,412 proteins (57%) being annotated with 1,957 distinct and specific GO terms. Unannotated proteins were assigned to the 3 root terms. The Version 5 GO annotation is publicly queryable via the GO site.
Additionally, the genome of M. oryzae is constantly being refined and updated as new information is incorporated. For the latest GO annotation of Version 6 genome, please visit our website. The preliminary GO annotation of Version 6 genome is placed at a local MySQL database that is publicly queryable via a user-friendly interface Adhoc Query System. Conclusion Our analysis provides comprehensive and robust GO annotations of the M. oryzae genome assemblies that will be solid foundations for further functional interrogation of M. oryzae. PMID:19278556
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rosier, A.M.; Vandesande, F.; Orban, G.A.
1991-03-08
The distribution of galanin (GAL) binding sites in the visual cortex of cat and monkey was determined by autoradiographic visualization of (¹²⁵I)-GAL binding to tissue sections. Binding conditions were optimized and, as a result, the binding was saturable and specific. In cat visual cortex, GAL binding sites were concentrated in layers I, IVc, V, and VI. Areas 17, 18, and 19 exhibited a similar distribution pattern. In monkey primary visual cortex, the highest density of GAL binding sites was observed in layers II/III, lower IVc, and upper V. Layers IVA and VI contained moderate numbers of GAL binding sites, while layer I and the remaining parts of layer IV displayed the lowest density. In monkey secondary visual cortex, GAL binding sites were mainly concentrated in layers V-VI. Layer IV exhibited a moderate density, while the supragranular layers contained the lowest proportion of GAL binding sites. In both cat and monkey, we found little difference between regions subserving central and those subserving peripheral vision. Similarities in the distribution of GAL and acetylcholine binding sites are discussed.
Global View of Mars Topography
NASA Technical Reports Server (NTRS)
2007-01-01
This global map of Mars is based on topographical information collected by the Mars Orbiter Laser Altimeter instrument on NASA's Mars Global Surveyor orbiter. Illumination is from the upper right. The image width is approximately 18,000 kilometers (11,185 miles). Candor Chasma forms part of the large Martian canyon system named Valles Marineris. The location of Southwest Candor Chasma is indicated in the annotated version.
Enhancement of the Shared Graphics Workspace.
1987-12-31
participants to share videodisc images and computer graphics displayed in color and text and facsimile information displayed in black on amber. They...could annotate the information in up to five colors and print the annotated version at both sites, using a standard fax machine. The SGWS also used a fax...system to display a document, whether text or photo, the camera scans the document, digitizes the data, and sends it via direct memory access (DMA) to
Hansen, M R; Simorre, J P; Hanson, P; Mokler, V; Bellon, L; Beigelman, L; Pardi, A
1999-01-01
A novel metal-binding site has been identified in the hammerhead ribozyme by 31P NMR. The metal-binding site is associated with the A13 phosphate in the catalytic core of the hammerhead ribozyme and is distinct from any previously identified metal-binding sites. 31P NMR spectroscopy was used to measure the metal-binding affinity for this site and leads to an apparent dissociation constant of 250-570 microM at 25 degrees C for binding of a single Mg2+ ion. The NMR data also show evidence of a structural change at this site upon metal binding and these results are compared with previous data on metal-induced structural changes in the core of the hammerhead ribozyme. These NMR data were combined with the X-ray structure of the hammerhead ribozyme (Pley HW, Flaherty KM, McKay DB. 1994. Nature 372:68-74) to model RNA ligands involved in binding the metal at this A13 site. In this model, the A13 metal-binding site is structurally similar to the previously identified A(g) metal-binding site and illustrates the symmetrical nature of the tandem G x A base pairs in domain 2 of the hammerhead ribozyme. These results demonstrate that 31P NMR represents an important method for both identification and characterization of metal-binding sites in nucleic acids. PMID:10445883
Ge, Yushu; van der Kamp, Marc; Malaisree, Maturos; Liu, Dan; Liu, Yi; Mulholland, Adrian J
2017-11-01
Cdc25 phosphatase B, a potential target for cancer therapy, is inhibited by a series of quinones. The binding site and mode of quinone inhibitors to Cdc25B remains unclear, whereas this information is important for structure-based drug design. We investigated the potential binding site of NSC663284 [DA3003-1 or 6-chloro-7-(2-morpholin-4-yl-ethylamino)-quinoline-5, 8-dione] through docking and molecular dynamics simulations. Of the two main binding sites suggested by docking, the molecular dynamics simulations only support one site for stable binding of the inhibitor. Binding sites in and near the Cdc25B catalytic site that have been suggested previously do not lead to stable binding in 50 ns molecular dynamics (MD) simulations. In contrast, a shallow pocket between the C-terminal helix and the catalytic site provides a favourable binding site that shows high stability. Two similar binding modes featuring protein-inhibitor interactions involving Tyr428, Arg482, Thr547 and Ser549 are identified by clustering analysis of all stable MD trajectories. The relatively flexible C-terminal region of Cdc25B contributes to inhibitor binding. The binding mode of NSC663284, identified through MD simulation, likely prevents the binding of protein substrates to Cdc25B. The present results provide useful information for the design of quinone inhibitors and their mechanism of inhibition.
NASA Astrophysics Data System (ADS)
Ge, Yushu; van der Kamp, Marc; Malaisree, Maturos; Liu, Dan; Liu, Yi; Mulholland, Adrian J.
2017-11-01
Cdc25 phosphatase B, a potential target for cancer therapy, is inhibited by a series of quinones. The binding site and mode of quinone inhibitors to Cdc25B remains unclear, whereas this information is important for structure-based drug design. We investigated the potential binding site of NSC663284 [DA3003-1 or 6-chloro-7-(2-morpholin-4-yl-ethylamino)-quinoline-5, 8-dione] through docking and molecular dynamics simulations. Of the two main binding sites suggested by docking, the molecular dynamics simulations only support one site for stable binding of the inhibitor. Binding sites in and near the Cdc25B catalytic site that have been suggested previously do not lead to stable binding in 50 ns molecular dynamics (MD) simulations. In contrast, a shallow pocket between the C-terminal helix and the catalytic site provides a favourable binding site that shows high stability. Two similar binding modes featuring protein-inhibitor interactions involving Tyr428, Arg482, Thr547 and Ser549 are identified by clustering analysis of all stable MD trajectories. The relatively flexible C-terminal region of Cdc25B contributes to inhibitor binding. The binding mode of NSC663284, identified through MD simulation, likely prevents the binding of protein substrates to Cdc25B. The present results provide useful information for the design of quinone inhibitors and their mechanism of inhibition.
Proteopedia: 3D Visualization and Annotation of Transcription Factor-DNA Readout Modes
ERIC Educational Resources Information Center
Dantas Machado, Ana Carolina; Saleebyan, Skyler B.; Holmes, Bailey T.; Karelina, Maria; Tam, Julia; Kim, Sharon Y.; Kim, Keziah H.; Dror, Iris; Hodis, Eran; Martz, Eric; Compeau, Patricia A.; Rohs, Remo
2012-01-01
3D visualization assists in identifying diverse mechanisms of protein-DNA recognition that can be observed for transcription factors and other DNA binding proteins. We used Proteopedia to illustrate transcription factor-DNA readout modes with a focus on DNA shape, which can be a function of either nucleotide sequence (Hox proteins) or base pairing…
New Virtual Field Trips. Revised Edition.
ERIC Educational Resources Information Center
Cooper, Gail; Cooper, Garry
This book is an annotated guidebook, arranged by subject matter, of World Wide Web sites for K-12 students. The following chapters are included: (1) Virtual Time Machine (i.e., sites that cover topics in world history); (2) Tour the World (i.e., sites that include information about countries); (3) Outer Space; (4) The Great Outdoors; (5) Aquatic…
Randak, Christoph O.; Dong, Qian; Ver Heul, Amanda R.; Elcock, Adrian H.; Welsh, Michael J.
2013-01-01
Cystic fibrosis transmembrane conductance regulator (CFTR) is an anion channel in the ATP-binding cassette (ABC) transporter protein family. In the presence of ATP and physiologically relevant concentrations of AMP, CFTR exhibits adenylate kinase activity (ATP + AMP ⇆ 2 ADP). Previous studies suggested that the interaction of nucleotide triphosphate with CFTR at ATP-binding site 2 is required for this activity. Two other ABC proteins, Rad50 and a structural maintenance of chromosome protein, also have adenylate kinase activity. All three ABC adenylate kinases bind and hydrolyze ATP in the absence of other nucleotides. However, little is known about how an ABC adenylate kinase interacts with ATP and AMP when both are present. Based on data from non-ABC adenylate kinases, we hypothesized that ATP and AMP mutually influence their interaction with CFTR at separate binding sites. We further hypothesized that only one of the two CFTR ATP-binding sites is involved in the adenylate kinase reaction. We found that 8-azidoadenosine 5′-triphosphate (8-N3-ATP) and 8-azidoadenosine 5′-monophosphate (8-N3-AMP) photolabeled separate sites in CFTR. Labeling of the AMP-binding site with 8-N3-AMP required the presence of ATP. Conversely, AMP enhanced photolabeling with 8-N3-ATP at ATP-binding site 2. The adenylate kinase active center probe P1,P5-di(adenosine-5′) pentaphosphate interacted simultaneously with an AMP-binding site and ATP-binding site 2. These results show that ATP and AMP interact with separate binding sites but mutually influence their interaction with the ABC adenylate kinase CFTR. They further indicate that the active center of the adenylate kinase comprises ATP-binding site 2. PMID:23921386
[Three-dimensional genome organization: a lesson from the Polycomb-Group proteins].
Bantignies, Frédéric
2013-01-01
As more and more genomes are being explored and annotated, important features of three-dimensional (3D) genome organization are just being uncovered. In the light of what we know about Polycomb group (PcG) proteins, we will present the latest findings on this topic. The PcG proteins are well-conserved chromatin factors that repress transcription of numerous target genes. They bind the genome at specific sites, forming chromatin domains of associated histone modifications as well as higher-order chromatin structures. These 3D chromatin structures involve the interactions between PcG-bound regulatory regions at short- and long-range distances, and may significantly contribute to PcG function. Recent high throughput "Chromosome Conformation Capture" (3C) analyses have revealed many other higher order structures along the chromatin fiber, partitioning the genomes into well demarcated topological domains. This revealed an unprecedented link between linear epigenetic domains and chromosome architecture, which might be intimately connected to genome function. © Société de Biologie, 2013.
Krastel, Philipp; Roggo, Silvio; Schirle, Markus; Ross, Nathan T; Perruccio, Francesca; Aspesi, Peter; Aust, Thomas; Buntin, Kathrin; Estoppey, David; Liechty, Brigitta; Mapa, Felipa; Memmert, Klaus; Miller, Howard; Pan, Xuewen; Riedl, Ralph; Thibaut, Christian; Thomas, Jason; Wagner, Trixie; Weber, Eric; Xie, Xiaobing; Schmitt, Esther K; Hoepfner, Dominic
2015-08-24
Cultivation of myxobacteria of the Nannocystis genus led to the isolation and structure elucidation of a class of novel cyclic lactone inhibitors of elongation factor 1. Whole genome sequence analysis and annotation enabled identification of the putative biosynthetic cluster and synthesis process. In biological assays the compounds displayed anti-fungal and cytotoxic activity. Combined genetic and proteomic approaches identified the eukaryotic translation elongation factor 1α (EF-1α) as the primary target for this compound class. Nannocystin A (1) displayed differential activity across various cancer cell lines and EEF1A1 expression levels appear to be the main differentiating factor. Biochemical and genetic evidence support an overlapping binding site of 1 with the anti-cancer compound didemnin B on EF-1α. This myxobacterial chemotype thus offers an interesting starting point for further investigations of the potential of therapeutics targeting elongation factor 1. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Araya, Carlos L.; Cenik, Can; Reuter, Jason A.; Kiss, Gert; Pande, Vijay S.; Snyder, Michael P.; Greenleaf, William J.
2015-01-01
Cancer sequencing studies have primarily identified cancer-driver genes by the accumulation of protein-altering mutations. An improved method would be annotation-independent, sensitive to unknown distributions of functions within proteins, and inclusive of non-coding drivers. We employed density-based clustering methods in 21 tumor types to detect variably-sized significantly mutated regions (SMRs). SMRs reveal recurrent alterations across a spectrum of coding and non-coding elements, including transcription factor binding sites and untranslated regions mutated in up to ∼15% of specific tumor types. SMRs reveal spatial clustering of mutations at molecular domains and interfaces, often with associated changes in signaling. Mutation frequencies in SMRs demonstrate that distinct protein regions are differentially mutated among tumor types, as exemplified by a linker region of PIK3CA in which biophysical simulations suggest mutations affect regulatory interactions. The functional diversity of SMRs underscores both the varied mechanisms of oncogenic misregulation and the advantage of functionally-agnostic driver identification. PMID:26691984
Geographic Resources on the Web: Bringing the World to Your Classroom.
ERIC Educational Resources Information Center
Green, Tim
2001-01-01
Presents an annotated bibliography of Web sites that can be useful for geography classroom teachers and of interest to students. Includes Web sites for the United States Geological Survey, the Central Intelligence Agency, University of Wisconsin-Stevens Point, and GlobeXplorer. (CMK)
Sharma, Virag; Hiller, Michael
2017-08-21
Genome alignments provide a powerful basis to transfer gene annotations from a well-annotated reference genome to many other aligned genomes. The completeness of these annotations crucially depends on the sensitivity of the underlying genome alignment. Here, we investigated the impact of the genome alignment parameters and found that parameters with a higher sensitivity allow the detection of thousands of novel alignments between orthologous exons that have been missed before. In particular, comparisons between species separated by an evolutionary distance of >0.75 substitutions per neutral site, like human and other non-placental vertebrates, benefit from increased sensitivity. To systematically test if increased sensitivity improves comparative gene annotations, we built a multiple alignment of 144 vertebrate genomes and used this alignment to map human genes to the other 143 vertebrates with CESAR. We found that higher alignment sensitivity substantially improves the completeness of comparative gene annotations by adding on average 2382 and 7440 novel exons and 117 and 317 novel genes for mammalian and non-mammalian species, respectively. Our results suggest a more sensitive alignment strategy that should generally be used for genome alignments between distantly-related species. Our 144-vertebrate genome alignment and the comparative gene annotations (https://bds.mpi-cbg.de/hillerlab/144VertebrateAlignment_CESAR/) are a valuable resource for comparative genomics. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Shen, Hong-Bin; Chou, Kuo-Chen
2007-04-20
Proteins may simultaneously exist at, or move between, two or more different subcellular locations. Proteins with multiple locations or dynamic features of this kind are particularly interesting because they may have some very special biological functions intriguing to investigators in both basic research and drug discovery. For instance, among the 6408 human protein entries that have experimentally observed subcellular location annotations in the Swiss-Prot database (version 50.7, released 19-Sept-2006), 973 (approximately 15%) have multiple location sites. The number of total human protein entries (except those annotated with "fragment" or those with less than 50 amino acids) in the same database is 14,370, meaning a gap of (14,370-6408)=7962 entries for which no knowledge is available about their subcellular locations. Although one can use a computational approach to predict the desired information for the gap, so far all the existing methods for predicting human protein subcellular localization are limited to the case of a single location site. To overcome this barrier, a new ensemble classifier, named Hum-mPLoc, was developed that can also deal with the case of multiple location sites. Hum-mPLoc is freely accessible to the public as a web server at http://202.120.37.186/bioinf/hum-multi. Meanwhile, for the convenience of people working in the relevant areas, Hum-mPLoc has been used to identify all human protein entries in the Swiss-Prot database that do not have subcellular location annotations or are annotated as being uncertain. The large-scale results thus obtained have been deposited in a downloadable file prepared with Microsoft Excel and named "Tab_Hum-mPLoc.xls". This file is available at the same website and will be updated twice a year to include new entries of human proteins and reflect the continuous development of Hum-mPLoc.
Chromatin-Specific Regulation of Mammalian rDNA Transcription by Clustered TTF-I Binding Sites
Diermeier, Sarah D.; Németh, Attila; Rehli, Michael; Grummt, Ingrid; Längst, Gernot
2013-01-01
Enhancers and promoters often contain multiple binding sites for the same transcription factor, suggesting that homotypic clustering of binding sites may serve a role in transcription regulation. Here we show that clustering of binding sites for the transcription termination factor TTF-I downstream of the pre-rRNA coding region specifies transcription termination, increases the efficiency of transcription initiation and affects the three-dimensional structure of rRNA genes. On chromatin templates, but not on free rDNA, clustered binding sites promote cooperative binding of TTF-I, loading TTF-I to the downstream terminators before it binds to the rDNA promoter. Interaction of TTF-I with target sites upstream and downstream of the rDNA transcription unit connects these distal DNA elements by forming a chromatin loop between the rDNA promoter and the terminators. The results imply that clustered binding sites increase the binding affinity of transcription factors in chromatin, thus influencing the timing and strength of DNA-dependent processes. PMID:24068958
PlantRNA, a database for tRNAs of photosynthetic eukaryotes.
Cognat, Valérie; Pawlak, Gaël; Duchêne, Anne-Marie; Daujat, Magali; Gigant, Anaïs; Salinas, Thalia; Michaud, Morgane; Gutmann, Bernard; Giegé, Philippe; Gobert, Anthony; Maréchal-Drouard, Laurence
2013-01-01
PlantRNA database (http://plantrna.ibmp.cnrs.fr/) compiles transfer RNA (tRNA) gene sequences retrieved from fully annotated plant nuclear, plastidial and mitochondrial genomes. The set of annotated tRNA gene sequences has been manually curated for maximum quality and confidence. The novelty of this database resides in the inclusion of biological information relevant to the function of all the tRNAs entered in the library. This includes 5'- and 3'-flanking sequences, A and B box sequences, region of transcription initiation and poly(T) transcription termination stretches, tRNA intron sequences, aminoacyl-tRNA synthetases and enzymes responsible for tRNA maturation and modification. Finally, data on mitochondrial import of nuclear-encoded tRNAs as well as the bibliome for the respective tRNAs and tRNA-binding proteins are also included. The current annotation concerns complete genomes from 11 organisms: five flowering plants (Arabidopsis thaliana, Oryza sativa, Populus trichocarpa, Medicago truncatula and Brachypodium distachyon), a moss (Physcomitrella patens), two green algae (Chlamydomonas reinhardtii and Ostreococcus tauri), one glaucophyte (Cyanophora paradoxa), one brown alga (Ectocarpus siliculosus) and a pennate diatom (Phaeodactylum tricornutum). The database will be regularly updated and implemented with new plant genome annotations so as to provide extensive information on tRNA biology to the research community.
Automated Gene Ontology annotation for anonymous sequence data.
Hennig, Steffen; Groth, Detlef; Lehrach, Hans
2003-07-01
Gene Ontology (GO) is the most widely accepted attempt to construct a unified and structured vocabulary for the description of genes and their products in any organism. Annotation by GO terms is performed in most of the current genome projects, which besides generality has the advantage of being very convenient for computer based classification methods. However, direct use of GO in small sequencing projects is not easy, especially for species not commonly represented in public databases. We present a software package (GOblet), which performs annotation based on GO terms for anonymous cDNA or protein sequences. It uses the species independent GO structure and vocabulary together with a series of protein databases collected from various sites, to perform a detailed GO annotation by sequence similarity searches. The sensitivity and the reference protein sets can be selected by the user. GOblet runs automatically and is available as a public service on our web server. The paper also addresses the reliability of automated GO annotations by using a reference set of more than 6000 human proteins. The GOblet server is accessible at http://goblet.molgen.mpg.de.
Comparative Omics-Driven Genome Annotation Refinement: Application across Yersiniae
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rutledge, Alexandra C.; Jones, Marcus B.; Chauhan, Sadhana
2012-03-27
Genome sequencing continues to be a rapidly evolving technology, yet most downstream aspects of genome annotation pipelines remain relatively stable or are even being abandoned. To date, the perceived value of manual curation for genome annotations is not offset by the real cost and time associated with the process. In order to balance the large number of sequences generated, the annotation process is now performed almost exclusively in an automated fashion for most genome sequencing projects. One possible way to reduce errors inherent to automated computational annotations is to apply data from 'omics' measurements (i.e. transcriptional and proteomic) to the un-annotated genome with a proteogenomic-based approach. This approach does require additional experimental and bioinformatics methods to include omics technologies; however, the approach is readily automatable and can benefit from rapid developments occurring in those research domains as well. The annotation process can be improved by experimental validation of transcription and translation and aid in the discovery of annotation errors. Here the concept of annotation refinement has been extended to include a comparative assessment of genomes across closely related species, as is becoming common in sequencing efforts. Transcriptomic and proteomic data derived from three highly similar pathogenic Yersiniae (Y. pestis CO92, Y. pestis pestoides F, and Y. pseudotuberculosis PB1/+) was used to demonstrate a comprehensive comparative omic-based annotation methodology. Peptide and oligo measurements experimentally validated the expression of nearly 40% of each strain's predicted proteome and revealed the identification of 28 novel and 68 previously incorrect protein-coding sequences (e.g., observed frameshifts, extended start sites, and translated pseudogenes) within the three current Yersinia genome annotations. Gene loss is presumed to play a major role in Y. pestis acquiring its niche as a virulent pathogen, thus the discovery of many translated pseudogenes underscores a need for functional analyses to investigate hypotheses related to divergence. Refinements included the discovery of a seemingly essential ribosomal protein, several virulence-associated factors, and a transcriptional regulator, among other proteins, most of which are annotated as hypothetical, that were missed during annotation.
A ternary metal binding site in the C2 domain of phosphoinositide-specific phospholipase C-delta1.
Essen, L O; Perisic, O; Lynch, D E; Katan, M; Williams, R L
1997-03-11
We have determined the crystal structures of complexes of phosphoinositide-specific phospholipase C-delta1 from rat with calcium, barium, and lanthanum at 2.5-2.6 A resolution. Binding of these metal ions is observed in the active site of the catalytic TIM barrel and in the calcium binding region (CBR) of the C2 domain. The C2 domain of PLC-delta1 is a circularly permuted topological variant (P-variant) of the synaptotagmin I C2A domain (S-variant). On the basis of sequence analysis, we propose that both the S-variant and P-variant topologies are present among other C2 domains. Multiple adjacent binding sites in the C2 domain were observed for calcium and the other metal/enzyme complexes. The maximum number of binding sites observed was for the calcium analogue lanthanum. This complex shows an array-like binding of three lanthanum ions (sites I-III) in a crevice on one end of the C2 beta-sandwich. Residues involved in metal binding are contained in three loops, CBR1, CBR2, and CBR3. Sites I and II are maintained in the calcium and barium complexes, whereas sites II and III coincide with a binary calcium binding site in the C2A domain of synaptotagmin I. Several conformers for CBR1 are observed. The conformation of CBR1 does not appear to be strictly dependent on metal binding; however, metal binding may stabilize certain conformers. No significant structural changes are observed for CBR2 or CBR3. The surface of this ternary binding site provides a cluster of freely accessible liganding positions for putative phospholipid ligands of the C2 domain. It may be that the ternary metal binding site is also a feature of calcium-dependent phospholipid binding in solution. A ternary metal binding site might be a conserved feature among C2 domains that contain the critical calcium ligands in their CBR's. The high cooperativity of calcium-mediated lipid binding by C2 domains described previously is explained by this novel type of calcium binding site.
Srivastava, Gaurava; Tripathi, Shubhandra; Kumar, Akhil; Sharma, Ashok
2017-07-01
Multidrug-resistant tuberculosis (MDR TB) is a major threat to mankind. Resistance to isoniazid (INH), which targets the MtKatG protein, is one of the most common resistances in MDR TB strains. The S315T-MtKatG mutation is widely reported in INH resistance. Although the mechanism of INH is known, its exact binding site on MtKatG is still uncertain, and three candidate binding sites (site-1, site-2, and site-3) have been proposed. In the current study, docking, molecular dynamics (MD) simulation, binding free energy estimation, principal component analysis (PCA) and free energy landscape (FEL) analysis were performed to obtain molecular-level details of the INH binding site on MtKatG and to probe the effect of the S315T mutation on INH binding. Molecular docking and MD analysis suggested site-1 as the active binding site of INH, where the effects of the S315T mutation were observed on both the access tunnel and the molecular interactions between INH and its neighboring residues. MMPBSA also supported site-1 as the potential binding site, with the lowest binding energy of -44.201 kJ/mol. Moreover, PCA and FEL revealed that the S315T mutation reduces the dimension of the heme access tunnel and that the extra methyl group at position 315 alters the heme cavity, forcing the heme group away from INH and thus preventing INH activation. The present study not only investigated the active binding site of INH but also provides new insight into the conformational changes in the binding site of S315T-MtKatG. Copyright © 2017 Elsevier Ltd. All rights reserved.
Mechanism of Metal Ion Activation of the Diphtheria Toxin Repressor DtxR
DOE Office of Scientific and Technical Information (OSTI.GOV)
D'Aquino,J.; Tetenbaum-Novatt, J.; White, A.
2005-01-01
The diphtheria toxin repressor (DtxR) is a metal ion-activated transcriptional regulator that has been linked to the virulence of Corynebacterium diphtheriae. Structure determination has shown that there are two metal ion binding sites per repressor monomer, and site-directed mutagenesis has demonstrated that binding site 2 (primary) is essential for recognition of the target DNA repressor, leaving the role of binding site 1 (ancillary) unclear. Calorimetric techniques have demonstrated that although binding site 1 (ancillary) has high affinity for metal ion with a binding constant of 2 x 10^-7, binding site 2 (primary) is a low-affinity binding site with a binding constant of 6.3 x 10^-4. These two binding sites act in an independent fashion, and their contribution can be easily dissected by traditional mutational analysis. Our results clearly demonstrate that binding site 1 (ancillary) is the first one to be occupied during metal ion activation, playing a critical role in stabilization of the repressor. In addition, structural data obtained for the mutants Ni-DtxR(H79A, C102D), reported here, and the previously reported DtxR(H79A) have allowed us to propose a mechanism of metal activation for DtxR.
Identification of a Second Substrate-binding Site in Solute-Sodium Symporters
Li, Zheng; Lee, Ashley S. E.; Bracher, Susanne; Jung, Heinrich; Paz, Aviv; Kumar, Jay P.; Abramson, Jeff; Quick, Matthias; Shi, Lei
2015-01-01
The structure of the sodium/galactose transporter (vSGLT), a solute-sodium symporter (SSS) from Vibrio parahaemolyticus, shares a common structural fold with LeuT of the neurotransmitter-sodium symporter family. Structural alignments between LeuT and vSGLT reveal that the crystallographically identified galactose-binding site in vSGLT is located in a more extracellular location relative to the central substrate-binding site (S1) in LeuT. Our computational analyses suggest the existence of an additional galactose-binding site in vSGLT that aligns to the S1 site of LeuT. Radiolabeled galactose saturation binding experiments indicate that, like LeuT, vSGLT can simultaneously bind two substrate molecules under equilibrium conditions. Mutating key residues in the individual substrate-binding sites reduced the molar substrate-to-protein binding stoichiometry to ∼1. In addition, the related and more experimentally tractable SSS member PutP (the Na+/proline transporter) also exhibits a binding stoichiometry of 2. Targeting residues in the proposed sites with mutations results in the reduction of the binding stoichiometry and is accompanied by severely impaired translocation of proline. Our data suggest that substrate transport by SSS members requires both substrate-binding sites, thereby implying that SSSs and neurotransmitter-sodium symporters share common mechanistic elements in substrate transport. PMID:25398883
Nelson, Christopher S; Fuller, Chris K; Fordyce, Polly M; Greninger, Alexander L; Li, Hao; DeRisi, Joseph L
2013-07-01
The transcription factor forkhead box P2 (FOXP2) is believed to be important in the evolution of human speech. A mutation in its DNA-binding domain causes severe speech impairment. Humans have acquired two coding changes relative to the conserved mammalian sequence. Despite intense interest in FOXP2, it has remained an open question whether the human protein’s DNA-binding specificity and chromatin localization are conserved. Previous in vitro and ChIP-chip studies have provided conflicting consensus sequences for the FOXP2-binding site. Using MITOMI 2.0 microfluidic affinity assays, we describe the binding site of FOXP2 and its affinity profile in base-specific detail for all substitutions of the strongest binding site. We find that human and chimp FOXP2 have similar binding sites that are distinct from previously suggested consensus binding sites. Additionally, through analysis of FOXP2 ChIP-seq data from cultured neurons, we find strong overrepresentation of a motif that matches our in vitro results and identifies a set of genes with FOXP2 binding sites. The FOXP2-binding sites tend to be conserved, yet we identified 38 instances of evolutionarily novel sites in humans. Combined, these data present a comprehensive portrait of FOXP2’s-binding properties and imply that although its sequence specificity has been conserved, some of its genomic binding sites are newly evolved. PMID:23625967
Evolution of Metal(Loid) Binding Sites in Transcriptional Regulators
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ordonez, E.; Thiyagarajan, S.; Cook, J.D.
2009-05-22
Expression of the genes for resistance to heavy metals and metalloids is transcriptionally regulated by the toxic ions themselves. Members of the ArsR/SmtB family of small metalloregulatory proteins respond to transition metals, heavy metals, and metalloids, including As(III), Sb(III), Cd(II), Pb(II), Zn(II), Co(II), and Ni(II). These homodimeric repressors bind to DNA in the absence of inducing metal(loid) ion and dissociate from the DNA when inducer is bound. The regulatory sites are often three- or four-coordinate metal binding sites composed of cysteine thiolates. Surprisingly, in two different As(III)-responsive regulators, the metalloid binding sites were in different locations in the repressor, and the Cd(II) binding sites were in two different locations in two Cd(II)-responsive regulators. We hypothesize that ArsR/SmtB repressors have a common backbone structure, that of a winged helix DNA-binding protein, but have considerable plasticity in the location of inducer binding sites. Here we show that an As(III)-responsive member of the family, CgArsR1 from Corynebacterium glutamicum, binds As(III) to a cysteine triad composed of Cys15, Cys16, and Cys55. This binding site is clearly unrelated to the binding sites of other characterized ArsR/SmtB family members. This is consistent with our hypothesis that metal(loid) binding sites in DNA binding proteins evolve convergently in response to persistent environmental pressures.
Conservation of Fold and Topology of Functional Elements in Thiamin Pyrophosphate Enzymes
NASA Technical Reports Server (NTRS)
Dominiak, P.; Ciszak, E. M.
2005-01-01
Thiamin pyrophosphate (TPP)-dependent enzymes are a highly divergent family of proteins binding both TPP and metal ions. They perform decarboxylation of α-ketoacids and of a common -(O=)C-C(OH)- fragment of α-hydroxyaldehydes. Prior to knowledge of three-dimensional structures of these enzymes, the GDGY25-30NN sequence was used to identify these enzymes. Subsequently, a number of structural studies on those enzymes revealed multi-subunit organization and the features of the two duplicate cofactor binding sites. Analyzing the structures of 44 structurally known enzymes, we found that the common structure of these enzymes is reduced to 180-220 amino acid long fragments of two PP and two PYR domains that form the [PP:PYR]2 binding center of two cofactor molecules. The structures of PP and PYR are arranged in a similar fold consisting of a six-stranded β-sheet with triplets of helices on both sides. Residues surrounding the cofactors are not strictly conserved, but they provide the same interatomic contacts required for the catalytic functions that these enzymes perform while maintaining interactive structural integrity. These structural and functional amino acids are topological counterparts located in the same positions of the conserved fold of sets of PP and PYR domains. Additional parallels include short fragments of sequences that link these amino acids to the fold and function. This report on the structural commonalities amongst TPP dependent enzymes is thought to contribute new approaches to annotation that may assist in advancing the functional proteomics of TPP dependent enzymes, and trace their complexity within evolutionary context.
NASA Astrophysics Data System (ADS)
Pang, ChunLi; Cao, TianGuang; Li, JunWei; Jia, MengWen; Zhang, SuHua; Ren, ShuXi; An, HaiLong; Zhan, Yong
2013-08-01
The family of calcium-binding proteins (CaBPs) consists of dozens of members and contributes to all aspects of the cell's function, from homeostasis to learning and memory. However, the Ca2+-binding mechanism is still unclear for most CaBPs. To identify the Ca2+-binding sites of CaBPs, this study presents a computational approach that combines fragment homology modeling with molecular dynamics simulation. For validation, we followed a two-step strategy. First, the approach was used to identify the Ca2+-binding sites of CaBPs that have an EF-hand Ca2+-binding site and a known detailed binding mechanism. To accomplish this, eighteen crystal structures of CaBPs with 49 Ca2+-binding sites, including calmodulin, were selected for analysis. The computational method identified 43 of the 49 Ca2+-binding sites. Second, we applied the approach to large-conductance Ca2+-activated K+ (BK) channels, which do not have a clear Ca2+-binding mechanism. The simulated results are consistent with the experimental data. The computational approach may shed some light on the identification of Ca2+-binding sites in CaBPs.
Worley, K C; Wiese, B A; Smith, R F
1995-09-01
BEAUTY (BLAST enhanced alignment utility) is an enhanced version of the NCBI's BLAST data base search tool that facilitates identification of the functions of matched sequences. We have created new data bases of conserved regions and functional domains for protein sequences in NCBI's Entrez data base, and BEAUTY allows this information to be incorporated directly into BLAST search results. A Conserved Regions Data Base, containing the locations of conserved regions within Entrez protein sequences, was constructed by (1) clustering the entire data base into families, (2) aligning each family using our PIMA multiple sequence alignment program, and (3) scanning the multiple alignments to locate the conserved regions within each aligned sequence. A separate Annotated Domains Data Base was constructed by extracting the locations of all annotated domains and sites from sequences represented in the Entrez, PROSITE, BLOCKS, and PRINTS data bases. BEAUTY performs a BLAST search of those Entrez sequences with conserved regions and/or annotated domains. BEAUTY then uses the information from the Conserved Regions and Annotated Domains data bases to generate, for each matched sequence, a schematic display that allows one to directly compare the relative locations of (1) the conserved regions, (2) annotated domains and sites, and (3) the locally aligned regions matched in the BLAST search. In addition, BEAUTY search results include World-Wide Web hypertext links to a number of external data bases that provide a variety of additional types of information on the function of matched sequences. This convenient integration of protein families, conserved regions, annotated domains, alignment displays, and World-Wide Web resources greatly enhances the biological informativeness of sequence similarity searches. 
BEAUTY searches can be performed remotely on our system using the "BCM Search Launcher" World-Wide Web pages (URL is http://gc.bcm.tmc.edu:8088/search-launcher/launcher.html).
Kamalakaran, Sitharthan; Radhakrishnan, Senthil K; Beck, William T
2005-06-03
We developed a pipeline to identify novel genes regulated by the steroid hormone-dependent transcription factor, estrogen receptor, through a systematic analysis of upstream regions of all human and mouse genes. We built a data base of putative promoter regions for 23,077 human and 19,984 mouse transcripts from National Center for Biotechnology Information annotation and 8793 human and 6785 mouse promoters from the Data Base of Transcriptional Start Sites. We used this data base of putative promoters to identify potential targets of estrogen receptor by identifying estrogen response elements (EREs) in their promoters. Our program correctly identified EREs in genes known to be regulated by estrogen in addition to several new genes whose putative promoters contained EREs. We validated six genes (KIAA1243, NRIP1, MADH9, NME3, TPD52L, and ABCG2) to be estrogen-responsive in MCF7 cells using reverse transcription PCR. To allow for extensibility of our program in identifying targets of other transcription factors, we have built a Web interface to access our data base and programs. Our Web-based program for Promoter Analysis of Genome, PAGen@UIC, allows a user to identify putative target genes for vertebrate transcription factors through the analysis of their upstream sequences. The interface allows the user to search the human and mouse promoter data bases for potential target genes containing one or more listed transcription factor binding sites (TFBSs) in their upstream elements, using either regular expression-based consensus or position weight matrices. The data base can also be searched for promoters harboring user-defined TFBSs given as a consensus or a position weight matrix. Furthermore, the user can retrieve putative promoter sequences for any given gene together with identified TFBSs located on its promoter. Orthologous promoters are also analyzed to determine conserved elements.
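The consensus-based search described in the abstract above can be illustrated with a minimal Python sketch. It is not the PAGen@UIC implementation: the function names, the toy promoter sequences, and the ERE consensus string GGTCANNNTGACC (a commonly cited palindromic form, not given in the abstract) are all assumptions made for illustration, and position weight matrix scoring is not covered.

```python
import re

# Assumed ERE consensus for illustration only; the abstract does not state
# which pattern the authors' program uses.
ERE_CONSENSUS = "GGTCANNNTGACC"

# Map a subset of IUPAC nucleotide codes to regular-expression fragments.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]",
}

def consensus_to_regex(consensus):
    """Compile a consensus string into a regex; the lookahead finds overlapping hits."""
    body = "".join(IUPAC[base] for base in consensus.upper())
    return re.compile("(?=(" + body + "))")

def find_sites(promoters, consensus):
    """Return {gene: [0-based start positions]} for promoters with at least one hit."""
    pattern = consensus_to_regex(consensus)
    hits = {}
    for gene, seq in promoters.items():
        positions = [m.start() for m in pattern.finditer(seq.upper())]
        if positions:
            hits[gene] = positions
    return hits

# Hypothetical promoter sequences, invented for this example.
promoters = {
    "geneA": "TTAGGTCACAGTGACCTTA",
    "geneB": "ACGTACGTACGTACGT",
}
hits = find_sites(promoters, ERE_CONSENSUS)
print(hits)
```

Only forward-strand matches are reported here; a fuller search would also scan the reverse complement, which matters for non-palindromic consensus sequences.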
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rothman, R.B.; Jacobson, A.E.; Rice, K.C.
1987-11-01
Previous studies demonstrated that pretreatment of brain membranes with the irreversible mu antagonist, beta-funaltrexamine (beta-FNA), partially eliminated mu binding sites (25,35), consistent with the existence of two mu binding sites distinguished by beta-FNA. This paper tests the hypothesis that the FNA-sensitive and FNA-insensitive mu binding sites have different anatomical distributions in rat brain. Prior to autoradiographic visualization of mu binding sites, [3H]oxymorphone, [3H]D-ala2-MePhe4, Gly-ol5-enkephalin (DAGO), and [125I][D-ala2-Me-Phe4-met(o)-ol]enkephalin (FK33824) were shown to selectively label mu binding sites using slide mounted sections of molded minced rat brain. As found using membranes, beta-FNA eliminated only a portion of mu binding sites. Autoradiographic visualization of mu binding sites using the mu-selective ligand [125I]FK33824 in control and FNA-treated sections of rat brain demonstrated that the proportion of mu binding sites sensitive to beta-FNA varied across regions of the brain, particularly the dorsal thalamus, ventrobasal complex and the hypothalamus, providing anatomical data supporting the existence of two classes of mu binding sites in rat brain.
Widespread evidence of cooperative DNA binding by transcription factors in Drosophila development
Kazemian, Majid; Pham, Hannah; Wolfe, Scot A.; Brodsky, Michael H.; Sinha, Saurabh
2013-01-01
Regulation of eukaryotic gene transcription is often combinatorial in nature, with multiple transcription factors (TFs) regulating common target genes, often through direct or indirect mutual interactions. Many individual examples of cooperative binding by directly interacting TFs have been identified, but it remains unclear how pervasive this mechanism is during animal development. Cooperative TF binding should be manifest in genomic sequences as biased arrangements of TF-binding sites. Here, we explore the extent and diversity of such arrangements related to gene regulation during Drosophila embryogenesis. We used the DNA-binding specificities of 322 TFs along with chromatin accessibility information to identify enriched spacing and orientation patterns of TF-binding site pairs. We developed a new statistical approach for this task, specifically designed to accurately assess inter-site spacing biases while accounting for the phenomenon of homotypic site clustering commonly observed in developmental regulatory regions. We observed a large number of short-range distance preferences between TF-binding site pairs, including examples where the preference depends on the relative orientation of the binding sites. To test whether these binding site patterns reflect physical interactions between the corresponding TFs, we analyzed 27 TF pairs whose binding sites exhibited short distance preferences. In vitro protein–protein binding experiments revealed that >65% of these TF pairs can directly interact with each other. For five pairs, we further demonstrate that they bind cooperatively to DNA if both sites are present with the preferred spacing. This study demonstrates how DNA-binding motifs can be used to produce a comprehensive map of sequence signatures for different mechanisms of combinatorial TF action. PMID:23847101
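The idea of looking for preferred spacings between binding sites of two TFs can be sketched as a simple distance tally. This is only a raw count under assumed inputs (hypothetical site coordinates, a hypothetical `max_gap` cutoff); it is not the statistical approach of the study above, which additionally corrects for homotypic site clustering and considers site orientation.

```python
from collections import Counter

def spacing_counts(sites_a, sites_b, max_gap=50):
    """Tally start-to-start distances between every site of TF A and every
    site of TF B, keeping only pairs at most max_gap bases apart."""
    counts = Counter()
    for a in sites_a:
        for b in sites_b:
            distance = abs(b - a)
            if 0 < distance <= max_gap:
                counts[distance] += 1
    return counts

# Hypothetical site start coordinates on one sequence, invented for this example.
sites_a = [100, 400, 900]
sites_b = [112, 412, 950, 2000]

counts = spacing_counts(sites_a, sites_b)
for distance, n in sorted(counts.items()):
    print(distance, n)
```

A spacing that recurs far more often than expected under a suitable background model would be a candidate for a short-range distance preference; choosing that background model is the hard part and is left out of this sketch.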
Cooperative activation of cardiac transcription through myocardin bridging of paired MEF2 sites
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anderson, Courtney M.; Hu, Jianxin; Thomas, Reuben
2017-03-28
Enhancers frequently contain multiple binding sites for the same transcription factor. These homotypic binding sites often exhibit synergy, whereby the transcriptional output from two or more binding sites is greater than the sum of the contributions of the individual binding sites alone. Although this phenomenon is frequently observed, the mechanistic basis for homotypic binding site synergy is poorly understood. Here, we identify a bona fide cardiac-specific Prkaa2 enhancer that is synergistically activated by homotypic MEF2 binding sites. We show that two MEF2 sites in the enhancer function cooperatively due to bridging of the MEF2C-bound sites by the SAP domain-containing co-activator protein myocardin, and we show that paired sites buffer the enhancer from integration site-dependent effects on transcription in vivo. Paired MEF2 sites are prevalent in cardiac enhancers, suggesting that this might be a common mechanism underlying synergy in the control of cardiac gene expression in vivo.
Abou-Zied, Osama K
2015-01-01
Human serum albumin (HSA) is one of the major carrier proteins in the body and constitutes approximately half of the protein found in blood plasma. It plays an important role in lipid metabolism, and its ability to reversibly bind a large variety of pharmaceutical compounds makes it a crucial determinant of drug pharmacokinetics and pharmacodynamics. This review deals with one of the protein's major binding sites "Sudlow I" which includes a binding pocket for the drug warfarin (WAR). The binding nature of this important site can be characterized by measuring the spectroscopic changes when a ligand is bound. Using several drugs, including WAR, and other drug-like molecules as ligands, the results emphasize the nature of Sudlow I as a flexible binding site, capable of binding a variety of ligands by adapting its binding pockets. The high affinity of the WAR pocket for binding versatile molecular structures stems from the flexibility of the amino acids forming the pocket. The binding site is shown to have an ionization ability which is important to consider when using drugs that are known to bind in Sudlow I. Several studies point to the important role of water molecules trapped inside the binding site in molecular recognition and ligand binding. Water inside the protein's cavity is crucial in maintaining the balance between the hydrophobic and hydrophilic nature of the binding site. Upon the unfolding and refolding of HSA, more water molecules are trapped inside the binding site which cause some swelling that prevents a full recovery from the denatured state. Better understanding of the mechanism of binding in macromolecules such as HSA and other proteins can be achieved by combining experimental and theoretical studies which produce significant synergies in studying complex biochemical phenomena.
Nuclear binding of progesterone in hen oviduct. Binding to multiple sites in vitro.
Pikler, G M; Webster, R A; Spelsberg, T C
1976-01-01
Steroid hormones, including progesterone, are known to bind with high affinity (Kd approximately 1x10(-10)M) to receptor proteins once they enter target cells. This complex (the progesterone-receptor) then undergoes a temperature- and/or salt-dependent activation which allows it to migrate to the cell nucleus and to bind to the deoxyribonucleoproteins. The present studies demonstrate that binding the hormone-receptor complex in vitro to isolated nuclei from the oviducts of laying hens required the same conditions as do other studies of binding in vitro reported previously, e.g. the hormone must be complexed to intact and activated receptor. The assay of the nuclear binding by using multiple concentrations of progesterone receptor reveals the presence of more than one class of binding site in the oviduct nuclei. The affinities of these classes of binding sites range from Kd approximately 1x10(-9)-1x10(-8)M. Assays using free steroid (not complexed with receptor) show no binding to these sites. The binding to each of the classes of sites displays a differential stability to increasing ionic concentrations, suggesting primarily an ionic-type interaction for all classes. Only the highest-affinity class of binding site is capable of binding progesterone receptor under physiological-saline conditions. This class represents 6000-10000 sites per cell nucleus and resembles the sites detected in vivo (Spelsberg, 1976, Biochem. J. 156, 391-398) which cause maximal transcriptional response when saturated with the progesterone receptor. The multiple binding sites for the progesterone receptor either are not present or are found in limited numbers in the nuclei of non-target organs. Differences in extent of binding to the nuclear material between a target tissue (oviduct) and other tissues (spleen or erythrocyte) are markedly dependent on the ionic conditions, and are probably due to binding to different classes of sites in the nuclei. PMID:182147
Alignment-Annotator web server: rendering and annotating sequence alignments.
Gille, Christoph; Fähling, Michael; Weyand, Birgit; Wieland, Thomas; Gille, Andreas
2014-07-01
Alignment-Annotator is a novel web service designed to generate interactive views of annotated nucleotide and amino acid sequence alignments (i) de novo and (ii) embedded in other software. All computations are performed at server side. Interactivity is implemented in HTML5, a language native to web browsers. The alignment is initially displayed using default settings and can be modified with the graphical user interfaces. For example, individual sequences can be reordered or deleted using drag and drop, amino acid color code schemes can be applied and annotations can be added. Annotations can be made manually or imported (BioDAS servers, the UniProt, the Catalytic Site Atlas and the PDB). Some edits take immediate effect while others require server interaction and may take a few seconds to execute. The final alignment document can be downloaded as a zip-archive containing the HTML files. Because of the use of HTML the resulting interactive alignment can be viewed on any platform including Windows, Mac OS X, Linux, Android and iOS in any standard web browser. Importantly, neither plugins nor Java are required and therefore Alignment-Annotator represents the first interactive browser-based alignment visualization. http://www.bioinformatics.org/strap/aa/ and http://strap.charite.de/aa/. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Alignment-Annotator web server: rendering and annotating sequence alignments
Gille, Christoph; Fähling, Michael; Weyand, Birgit; Wieland, Thomas; Gille, Andreas
2014-01-01
Alignment-Annotator is a novel web service designed to generate interactive views of annotated nucleotide and amino acid sequence alignments (i) de novo and (ii) embedded in other software. All computations are performed at server side. Interactivity is implemented in HTML5, a language native to web browsers. The alignment is initially displayed using default settings and can be modified with the graphical user interfaces. For example, individual sequences can be reordered or deleted using drag and drop, amino acid color code schemes can be applied and annotations can be added. Annotations can be made manually or imported (BioDAS servers, the UniProt, the Catalytic Site Atlas and the PDB). Some edits take immediate effect while others require server interaction and may take a few seconds to execute. The final alignment document can be downloaded as a zip-archive containing the HTML files. Because of the use of HTML the resulting interactive alignment can be viewed on any platform including Windows, Mac OS X, Linux, Android and iOS in any standard web browser. Importantly, neither plugins nor Java are required and therefore Alignment-Annotator represents the first interactive browser-based alignment visualization. Availability: http://www.bioinformatics.org/strap/aa/ and http://strap.charite.de/aa/. PMID:24813445
Lee, Donald W; Khavrutskii, Ilja V; Wallqvist, Anders; Bavari, Sina; Cooper, Christopher L; Chaudhury, Sidhartha
2016-01-01
The somatic diversity of antigen-recognizing B-cell receptors (BCRs) arises from Variable (V), Diversity (D), and Joining (J) (VDJ) recombination and somatic hypermutation (SHM) during B-cell development and affinity maturation. The VDJ junction of the BCR heavy chain forms the highly variable complementarity determining region 3 (CDR3), which plays a critical role in antigen specificity and binding affinity. Tracking the selection and mutation of the CDR3 can be useful in characterizing humoral responses to infection and vaccination. Although tens to hundreds of thousands of unique BCR genes within an expressed B-cell repertoire can now be resolved with high-throughput sequencing, tracking SHMs is still challenging because existing annotation methods are often limited by poor annotation coverage, inconsistent SHM identification across the VDJ junction, or lack of B-cell lineage data. Here, we present B-cell repertoire inductive lineage and immunosequence annotator (BRILIA), an algorithm that leverages repertoire-wide sequencing data to globally improve the VDJ annotation coverage, lineage tree assembly, and SHM identification. On benchmark tests against simulated human and mouse BCR repertoires, BRILIA correctly annotated germline and clonally expanded sequences with 94 and 70% accuracy, respectively, and it has a 90% SHM-positive prediction rate in the CDR3 of heavily mutated sequences; these are substantial improvements over existing methods. We used BRILIA to process BCR sequences obtained from splenic germinal center B cells extracted from C57BL/6 mice. BRILIA returned robust B-cell lineage trees and yielded SHM patterns that are consistent across the VDJ junction and agree with known biological mechanisms of SHM. By contrast, existing BCR annotation tools, which do not account for repertoire-wide clonal relationships, systematically underestimated the size of clonally related B-cell clusters and yielded inconsistent SHM frequencies. We demonstrate BRILIA's utility in B-cell repertoire studies related to VDJ gene usage, mechanisms for adenosine mutations, and SHM hot spot motifs. Furthermore, we show that the complete gene usage annotation and SHM identification across the entire CDR3 are essential for studying the B-cell affinity maturation process through immunosequencing methods.
Gauss, George H.; Reott, Michael A.; Rocha, Edson R.; Young, Mark J.; Douglas, Trevor
2012-01-01
A factor contributing to the pathogenicity of Bacteroides fragilis, the most common anaerobic species isolated from clinical infections, is the bacterium's extreme aerotolerance, which allows survival in oxygenated tissues prior to anaerobic abscess formation. We investigated the role of the bacterioferritin-related (bfr) gene in the B. fragilis oxidative stress response. The bfr mRNA levels are increased in stationary phase or in response to O2 or iron. In addition, bfr null mutants exhibit reduced aerotolerance, and the bfr gene product protects DNA from hydroxyl radical cleavage in vitro. Crystallographic studies revealed a protein with a dodecameric structure and greater similarity to an archaeal DNA protection in starved cells (DPS)-like protein than to the 24-subunit bacterioferritins. Similarity to the DPS-like (DPSL) protein extends to the subunit and includes a pair of conserved cysteine residues juxtaposed to a buried dimetal binding site within the four-helix bundle. Compared to archaeal DPSLs, however, this bacterial DPSL protein contains several unique features, including a significantly different conformation in the C-terminal tail that alters the number and location of pores leading to the central cavity and a conserved metal binding site on the interior surface of the dodecamer. Combined, these characteristics confirm this new class of miniferritin in the bacterial domain, delineate the similarities and differences between bacterial DPSL proteins and their archaeal homologs, allow corrected annotations for B. fragilis bfr and other dpsl genes within the bacterial domain, and suggest an evolutionary link within the ferritin superfamily that connects dodecameric DPS to the (bacterio)ferritin 24-mer. PMID:22020642
Dozmorov, Mikhail G
2015-01-01
Although age-associated gene expression and methylation changes have been reported throughout the literature, the unifying epigenomic principles of aging remain poorly understood. Recent explosion in availability and resolution of functional/regulatory genome annotation data (epigenomic data), such as that provided by the ENCODE and Roadmap Epigenomics projects, provides an opportunity for the identification of epigenomic mechanisms potentially altered by age-associated differentially methylated regions (aDMRs) and regulatory signatures in the promoters of age-associated genes (aGENs). In this study we found that aDMRs and aGENs identified in multiple independent studies share a common Polycomb Repressive Complex 2 signature marked by EZH2, SUZ12, CTCF binding sites, repressive H3K27me3, and activating H3K4me1 histone modification marks, and a “poised promoter” chromatin state. This signature is depleted in RNA Polymerase II-associated transcription factor binding sites, activating H3K79me2, H3K36me3, H3K27ac marks, and an “active promoter” chromatin state. The PRC2 signature was shown to be generally stable across cell types. When considering the directionality of methylation changes, we found the PRC2 signature to be associated with aDMRs hypermethylated with age, while hypomethylated aDMRs were associated with enhancers. In contrast, aGENs were associated with the PRC2 signature independently of the directionality of gene expression changes. In this study we demonstrate that the PRC2 signature is the common epigenomic context of genomic regions associated with hypermethylation and gene expression changes in aging. PMID:25880792
Jia, Dong; Cai, Lun; He, Housheng; Skogerbø, Geir; Li, Tiantian; Aftab, Muhammad Nauman; Chen, Runsheng
2007-01-01
Background The 2,2,7-trimethylguanosine (TMG) cap structure is an important functional characteristic of ncRNAs with critical cellular roles, such as some snRNAs. Here we used immunoprecipitation with both K121 and R1131 anti-TMG antibodies to systematically identify the TMG cap structures for all presently characterized ncRNAs in C. elegans. Results The two anti-TMG antibodies precipitated a similar group of the C. elegans ncRNAs. All snRNAs known to have a TMG cap structure were found in the precipitate, indicating that our identification system was efficient. Other ncRNA families related to splicing, such as SL RNAs and Sm Y RNAs, were also found in the precipitate, as were 7 C/D box snoRNAs. Further analysis showed that the SL RNAs and the Sm Y RNAs shared a very similar Sm binding site element (AAU4–5GGA), whose sequence composition differed somewhat from those of other U snRNAs. There were also 16 ncRNAs without an Sm binding site element in the precipitate, suggesting that for these ncRNAs, TMG formation may occur independently of Sm proteins. Conclusion Our results showed that most ncRNAs predicted to be transcribed by RNA polymerase II had a TMG cap, while those predicted to be transcribed by RNA polymerase III or located in introns did not have a TMG cap structure. Compared to ncRNAs without a TMG cap, TMG-capped ncRNAs tended to have higher expression levels. Five functionally non-annotated ncRNAs also have a TMG cap structure, which might be helpful for identifying the cellular roles of these ncRNAs. PMID:17903271
Jia, Dong; Cai, Lun; He, Housheng; Skogerbø, Geir; Li, Tiantian; Aftab, Muhammad Nauman; Chen, Runsheng
2007-09-29
The 2,2,7-trimethylguanosine (TMG) cap structure is an important functional characteristic of ncRNAs with critical cellular roles, such as some snRNAs. Here we used immunoprecipitation with both K121 and R1131 anti-TMG antibodies to systematically identify the TMG cap structures for all presently characterized ncRNAs in C. elegans. The two anti-TMG antibodies precipitated a similar group of the C. elegans ncRNAs. All snRNAs known to have a TMG cap structure were found in the precipitate, indicating that our identification system was efficient. Other ncRNA families related to splicing, such as SL RNAs and Sm Y RNAs, were also found in the precipitate, as were 7 C/D box snoRNAs. Further analysis showed that the SL RNAs and the Sm Y RNAs shared a very similar Sm binding site element (AAU4-5GGA), whose sequence composition differed somewhat from those of other U snRNAs. There were also 16 ncRNAs without an Sm binding site element in the precipitate, suggesting that for these ncRNAs, TMG formation may occur independently of Sm proteins. Our results showed that most ncRNAs predicted to be transcribed by RNA polymerase II had a TMG cap, while those predicted to be transcribed by RNA polymerase III or located in introns did not have a TMG cap structure. Compared to ncRNAs without a TMG cap, TMG-capped ncRNAs tended to have higher expression levels. Five functionally non-annotated ncRNAs also have a TMG cap structure, which might be helpful for identifying the cellular roles of these ncRNAs.
Liu, Yichuan; Li, Yun; March, Michael E; Nguyen, Kenny; Kenny, Nguyen; Xu, Kexiang; Wang, Fengxiang; Guo, Yiran; Keating, Brendan; Glessner, Joseph; Li, Jiankang; Ganley, Theodore J; Zhang, Jianguo; Deardorff, Matthew A; Xu, Xun; Hakonarson, Hakon
2015-11-11
Absence of the anterior (ACL) or posterior cruciate ligament (PCL) is a rare congenital malformation that results in knee joint instability, with a prevalence of 1.7 per 100,000 live births, and can be associated with other lower-limb abnormalities such as ACL agenesis and absence of the menisci of the knee. While a few cases of absence of ACL/PCL are reported in the literature, a number of large familial case series of related conditions such as ACL agenesis suggest a potential underlying monogenic etiology. We performed whole exome sequencing of a family with two individuals affected by absence of ACL/PCL. We identified a copy number variation (CNV) deletion impacting the exon sequences of CEP57L1, present in the affected mother and her affected daughter based on the exome sequencing data. The deletion was validated using quantitative PCR (qPCR), and the gene was confirmed to be expressed in ACL ligament tissue. Interestingly, we detected reduced expression of CEP57L1 in Epstein-Barr virus (EBV) cells from the two patients in comparison with healthy controls. Evaluation of 3D protein structure showed that the helix-binding sites of the protein remain intact with the deletion, but other functional binding sites related to microtubule attachment are missing. The specificity of the CNV deletion was confirmed by showing that it was absent in ~700 exome sequencing samples as well as in the database of genomic variations (DGV), a database containing large numbers of annotated CNVs from previous scientific reports. We identified a novel CNV deletion that was inherited through an autosomal dominant transmission from an affected mother to her affected daughter, both of whom suffered from the absence of the anterior and posterior cruciate ligaments of the knees.
Zumaraga, Mark Pretzel; Medina, Paul Julius; Recto, Juan Miguel; Abrahan, Lauro; Azurin, Edelyn; Tanchoco, Celeste C; Jimeno, Cecilia A; Palmes-Saloma, Cynthia
2017-03-01
This study aimed to discover genetic variants in the entire 101 kB vitamin D receptor (VDR) gene for vitamin D deficiency in a group of postmenopausal Filipino women using targeted next generation sequencing (TNGS) approach in a case-control study design. A total of 50 women with and without osteoporotic fracture seen at the Philippine Orthopedic Center were included. Blood samples were collected for determination of serum vitamin D, calcium, phosphorus, glucose, blood urea nitrogen, creatinine, aspartate aminotransferase, alanine aminotransferase and as primary source for targeted VDR gene sequencing using the Ion Torrent Personal Genome Machine. The variant calling was based on the GATK best practice workflow and annotated using Annovar tool. A total of 1496 unique variants in the whole 101-kb VDR gene were identified. Novel sequence variations not registered in the dbSNP database were found among cases and controls at a rate of 23.1% and 16.6% of total discovered variants, respectively. One disease-associated enhancer showed statistically significant association to low serum 25-hydroxy vitamin D levels (Pearson chi-square P-value=0.009). The transcription factor binding site prediction program PROMO predicted the disruption of three transcription factor binding sites in this enhancer region. These findings show the power of TNGS in identifying sequence variations in a very large gene and the surprising results obtained in this study greatly expand the catalog of known VDR sequence variants that may represent an important clue in the emergence of vitamin D deficiency. Such information will also provide the additional guidance necessary toward a personalized nutritional advice to reach sufficient vitamin D status. Copyright © 2016 Elsevier Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ansong, Charles; Tolic, Nikola; Purvine, Samuel O.
Complete and accurate genome annotation is crucial for comprehensive and systematic studies of biological systems. For example, systems biology-oriented genome scale modeling efforts greatly benefit from accurate annotation of protein-coding genes to develop proper functioning models. However, determining protein-coding genes for most new genomes is almost completely performed by inference, using computational predictions with significant documented error rates (> 15%). Furthermore, gene prediction programs provide no information on biologically important post-translational processing events critical for protein function. With the ability to directly measure peptides arising from expressed proteins, mass spectrometry-based proteomics approaches can be used to augment and verify coding regions of a genomic sequence and importantly detect post-translational processing events. In this study we utilized “shotgun” proteomics to guide accurate primary genome annotation of the bacterial pathogen Salmonella Typhimurium 14028 to facilitate a systems-level understanding of Salmonella biology. The data provide protein-level experimental confirmation for 44% of predicted protein-coding genes, suggest revisions to 48 genes assigned incorrect translational start sites, and uncover 13 non-annotated genes missed by gene prediction programs. We also present a comprehensive analysis of post-translational processing events in Salmonella, revealing a wide range of complex chemical modifications (70 distinct modifications) and confirming more than 130 signal peptide and N-terminal methionine cleavage events in Salmonella. This study highlights several ways in which proteomics data applied during the primary stages of annotation can improve the quality of genome annotations, especially with regards to the annotation of mature protein products.
Nicotinic Cholinergic Receptor Binding Sites in the Brain: Regulation in vivo
NASA Astrophysics Data System (ADS)
Schwartz, Rochelle D.; Kellar, Kenneth J.
1983-04-01
Tritiated acetylcholine was used to measure binding sites with characteristics of nicotinic cholinergic receptors in rat brain. Regulation of the binding sites in vivo was examined by administering two drugs that stimulate nicotinic receptors directly or indirectly. After 10 days of exposure to the cholinesterase inhibitor diisopropyl fluorophosphate, binding of tritiated acetylcholine in the cerebral cortex was decreased. However, after repeated administration of nicotine for 10 days, binding of tritiated acetylcholine in the cortex was increased. Saturation analysis of tritiated acetylcholine binding in the cortices of rats treated with diisopropyl fluorophosphate or nicotine indicated that the number of binding sites decreased and increased, respectively, while the affinity of the sites was unaltered.
Substance P binding sites in the nucleus tractus solitarius of the cat
DOE Office of Scientific and Technical Information (OSTI.GOV)
Maley, B.E.; Sasek, C.A.; Seybold, V.S.
1988-11-01
Substance P binding sites in the nucleus tractus solitarius were visualized with receptor autoradiography using Bolton-Hunter (125I)substance P. Substance P binding sites were found to have distinct patterns within the cat nucleus tractus solitarius. The majority of substance P binding sites were present in the medial, intermediate and the peripheral rim of the parvocellular subdivisions. Lower amounts of substance P binding sites were present in the commissural, ventrolateral, interstitial and dorsolateral subdivisions. No substance P binding sites were present in the central region of the parvocellular subdivision or the solitary tract. The localization of substance P binding sites in the nucleus tractus solitarius is very similar to the patterns of substance P immunoreactive fibers previously described for this region. Results of this study add further support for a functional role of substance P in synaptic circuits of the nucleus tractus solitarius.
Bae, Ji-Eun; Hwang, Kwang Yeon; Nam, Ki Hyun
2018-06-16
Glucose isomerase (GI) catalyzes the reversible enzymatic isomerization of d-glucose and d-xylose to d-fructose and d-xylulose, respectively. This is one of the most important enzymes in the production of high-fructose corn syrup (HFCS) and biofuel. We recently determined the crystal structure of GI from S. rubiginosus (SruGI) complexed with a xylitol inhibitor in one metal binding mode. Although we assessed inhibitor binding at the M1 site, the metal binding at the M2 site and the substrate recognition mechanism for SruGI remain unclear. Here, we report the crystal structure of the two metal binding modes of SruGI and its complex with glucose. This study provides a snapshot of metal binding at the SruGI M2 site in the presence of Mn2+, but not in the presence of Mg2+. Metal binding at the M2 site elicits a configuration change at the M1 site. A glucose molecule can bind to the M1 site only in the presence of Mn2+ at the M2 site. Glucose and Mn2+ at the M2 site were bridged by water molecules through a hydrogen bonding network. The metal binding geometry of the M2 site indicates a distorted octahedral coordination with an angle of 55-110°, whereas the M1 site has a relatively stable octahedral coordination with an angle of 85-95°. We suggest a two-step sequential process for SruGI substrate recognition, in Mn2+ binding mode, at the M2 site. Our results provide a better understanding of the molecular role of the M2 site in GI substrate recognition. Copyright © 2018. Published by Elsevier Inc.
Pintor, J.; Torres, M.; Castro, E.; Miras-Portugal, M. T.
1991-01-01
1. Diadenosine tetraphosphate (Ap4A) a dinucleotide, which is stored in secretory granules, presents two types of high affinity binding sites in chromaffin cells. A Kd value of 8 +/- 0.65 x 10(-11) M and Bmax value of 5420 +/- 450 sites per cell were obtained for the high affinity binding site. A Kd value of 5.6 +/- 0.53 x 10(-9) M and a Bmax value close to 70,000 sites per cell were obtained for the second binding site with high affinity. 2. The diadenosine polyphosphates, Ap3A, Ap4A, Ap5A and Ap6A, displaced [3H]-Ap4A from the two binding sites, the Ki values being 1.0 nM, 0.013 nM, 0.013 nM and 0.013 nM for the very high affinity binding site and 0.5 microM, 0.13 microM, 0.062 microM and 0.75 microM for the second binding site. 3. The ATP analogues displaced [3H]-Ap4A with the potency order of the P2y receptors, adenosine 5'-O-(2 thiodiphosphate) (ADP-beta-S) greater than 5'-adenylyl imidodiphosphate (AMP-PNP) greater than alpha, beta-methylene ATP (alpha, beta-MeATP), in both binding sites. The Ki values were respectively 0.075 nM, 0.2 nM and 0.75 nM for the very high affinity binding site and 0.125 microM, 0.5 microM and 0.9 microM for the second binding site. PMID:1912985
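The two classes of Ap4A binding sites quoted above can be put side by side with a short calculation. The Python sketch below assumes independent, non-interacting sites that each follow a single-site saturation isotherm, B = Bmax * L / (Kd + L); that model and the function name `two_site_bound` are illustrative assumptions and are not stated in the abstract. Only the Kd and Bmax values come from the text, with the second Bmax taken at the quoted approximate value of 70,000.

```python
def two_site_bound(free_nM, sites):
    """Total occupied sites per cell for independent binding-site classes.

    sites is a list of (Kd in nM, Bmax in sites per cell) pairs; each
    class contributes Bmax * L / (Kd + L) at free ligand concentration L.
    """
    return sum(bmax * free_nM / (kd + free_nM) for kd, bmax in sites)

# Values from the abstract: Kd = 8e-11 M (0.08 nM) with 5420 sites per cell,
# and Kd = 5.6e-9 M (5.6 nM) with roughly 70,000 sites per cell.
SITES = [(0.08, 5420.0), (5.6, 70000.0)]

for conc_nM in (0.01, 0.08, 1.0, 5.6, 100.0):
    occupied = two_site_bound(conc_nM, SITES)
    print(f"{conc_nM:7.2f} nM free Ap4A -> {occupied:9.0f} occupied sites per cell")
```

Under this assumed model, each class is half-saturated when the free ligand concentration equals its Kd, which is why the roughly 70-fold difference in Kd separates the two classes so clearly in saturation experiments.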
PlantTFDB 4.0: toward a central hub for transcription factors and regulatory interactions in plants.
Jin, Jinpu; Tian, Feng; Yang, De-Chang; Meng, Yu-Qi; Kong, Lei; Luo, Jingchu; Gao, Ge
2017-01-04
With the goal of providing a comprehensive, high-quality resource for both plant transcription factors (TFs) and their regulatory interactions with target genes, we upgraded the plant TF database PlantTFDB to version 4.0 (http://planttfdb.cbi.pku.edu.cn/). In the new version, we identified 320 370 TFs from 165 species, presenting a more comprehensive genomic TF repertoire of green plants. Besides updating the pre-existing abundant functional and evolutionary annotation for identified TFs, we generated three new types of annotation which provide more direct clues for investigating the underlying functional mechanisms: (i) a set of high-quality, non-redundant TF binding motifs derived from experiments; (ii) multiple types of regulatory elements identified from high-throughput sequencing data; (iii) regulatory interactions curated from literature and inferred by combining TF binding motifs and regulatory elements. In addition, we upgraded the previous TF prediction server, and set up four novel tools for regulation prediction and functional enrichment analyses. Finally, we set up a novel companion portal PlantRegMap (http://plantregmap.cbi.pku.edu.cn) for users to access the regulation resource and analysis tools conveniently. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Valdramidou, Dimitra; Humphries, Martin J; Mould, A Paul
2008-11-21
Integrin-ligand interactions are regulated in a complex manner by divalent cations, and previous studies have identified ligand-competent, stimulatory, and inhibitory cation-binding sites. In collagen-binding integrins, such as alpha2beta1, ligand recognition takes place exclusively at the alpha subunit I domain. However, activation of the alphaI domain depends on its interaction with a structurally similar domain in the beta subunit known as the I-like or betaI domain. The top face of the betaI domain contains three cation-binding sites: the metal-ion dependent adhesion site (MIDAS), the ADMIDAS (adjacent to MIDAS), and LIMBS (ligand-associated metal-binding site). The role of these sites in controlling ligand binding to the alphaI domain has yet to be elucidated. Mutation of the MIDAS or LIMBS completely blocked collagen binding to alpha2beta1; in contrast mutation of the ADMIDAS reduced ligand recognition but this effect could be overcome by the activating monoclonal antibody TS2/16. Hence, the MIDAS and LIMBS appear to be essential for the interaction between alphaI and betaI, whereas occupancy of the ADMIDAS has an allosteric effect on the conformation of betaI. An activating mutation in the alpha2 I domain partially restored ligand binding to the MIDAS and LIMBS mutants. Analysis of the effects of Ca(2+), Mg(2+), and Mn(2+) on ligand binding to these mutants showed that the MIDAS is a ligand-competent site through which Mn(2+) stimulates ligand binding, whereas the LIMBS is a stimulatory Ca(2+)-binding site, occupancy of which increases the affinity of Mg(2+) for the MIDAS.
Wright, J F; Pernollet, M; Reboul, A; Aude, C; Colomb, M G
1992-05-05
Tetanus toxin was shown to contain a metal-binding site for zinc and copper. Equilibrium dialysis binding experiments using 65Zn indicated an association constant of 9-15 microM, with one zinc-binding site/toxin molecule. The zinc-binding site was localized to the toxin light chain as determined by binding of 65Zn to the light chain but not to the heavy chain after separation by sodium dodecyl sulfate-polyacrylamide gel electrophoresis and transfer to Immobilon membranes. Copper was an efficient inhibitor of 65Zn binding to tetanus toxin and caused two peptide bond cleavages in the toxin light chain in the presence of ascorbate. These metal-catalyzed oxidative cleavages were inhibited by the presence of zinc. Partial characterization of metal-catalyzed oxidative modifications of a peptide based on a putative metal-binding site (HELIH) in the toxin light chain was used to map the metal-binding site in the protein.
ERIC Educational Resources Information Center
Galica, Carol
1997-01-01
Provides an annotated bibliography of selected NASA Web sites for K-12 math and science teachers: the NASA Lewis Research Center Learning Technologies K-12 Home Page, Spacelink, NASA Quest, Basic Aircraft Design Page, International Space Station, NASA Shuttle Web Site, LIFTOFF to Space Education, Telescopes in Education, and Space Educator's…
Terrorism: Online Resources for Helping Students Understand and Cope.
ERIC Educational Resources Information Center
Green, Tim; Ramirez, Fred
2002-01-01
Presents an annotated bibliography of Web sites that focus on the issue of terrorism. Aims to assist teachers in educating their students and helping them cope with terrorism since the September 11, 2001 attack on the United States. Offers sites on other terrorist attacks on the U.S. (CMK)
Goonesekere, Nalin C W; Shipely, Krysten; O'Connor, Kevin
2010-06-01
The Pfam database is an important tool in genome annotation, since it provides a collection of curated protein families. However, a subset of these families, known as domains of unknown function (DUFs), remains poorly characterized. We have related sequences from DUF404, DUF407, DUF482, DUF608, DUF810, DUF853, DUF976 and DUF1111 to homologs in PDB, within the midnight zone (9-20%) of sequence identity. These relationships were extended to provide functional annotation by sequence analysis and model building. Also described are examples of residue plasticity within enzyme active sites, and change of function within homologous sequences of a DUF. Copyright 2010 Elsevier Ltd. All rights reserved.
Hestand, Matthew S; van Galen, Michiel; Villerius, Michel P; van Ommen, Gert-Jan B; den Dunnen, Johan T; 't Hoen, Peter AC
2008-01-01
Background The identification of transcription factor binding sites is difficult since they are only a small number of nucleotides in size, resulting in large numbers of false positives and false negatives in current approaches. Computational methods to reduce false positives look for over-representation of transcription factor binding sites in a set of similarly regulated promoters or for conservation in orthologous promoter alignments. Results We have developed a novel tool, "CORE_TF" (Conserved and Over-REpresented Transcription Factor binding sites) that identifies common transcription factor binding sites in promoters of co-regulated genes. To improve upon existing binding site predictions, the tool searches for position weight matrices from the TRANSFAC® database that are over-represented in an experimental set compared to a random set of promoters and identifies cross-species conservation of the predicted transcription factor binding sites. The algorithm has been evaluated with expression and chromatin-immunoprecipitation on microarray data. We also implement and demonstrate the importance of matching the random set of promoters to the experimental promoters by GC content, which is a unique feature of our tool. Conclusion The program CORE_TF is accessible in a user-friendly web interface at . It provides a table of over-represented transcription factor binding sites in the promoters of the user's input genes and a graphical view of evolutionarily conserved transcription factor binding sites. In our test data sets it successfully predicts target transcription factors and their binding sites. PMID:19036135
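The two ideas in this abstract (a GC-matched random promoter set and an over-representation test) can be sketched as follows. This is a minimal illustration under stated assumptions, not CORE_TF's actual implementation: the function names, the 0.05 GC bin width, and the use of a hypergeometric tail probability are all choices made here for the example.

```python
import math
import random

def gc_content(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0

def gc_matched_background(experimental, pool, bin_width=0.05, seed=0):
    """For each experimental promoter, draw one pool promoter from the same GC bin.

    Hypothetical helper: promoters with no pool sequence in their bin are skipped.
    """
    rng = random.Random(seed)
    bins = {}
    for seq in pool:
        bins.setdefault(int(gc_content(seq) / bin_width), []).append(seq)
    matched = []
    for seq in experimental:
        candidates = bins.get(int(gc_content(seq) / bin_width))
        if candidates:
            matched.append(rng.choice(candidates))
    return matched

def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k): n experimental promoters drawn from N total, K of which carry a motif hit."""
    total = math.comb(N, n)
    hits = sum(math.comb(K, i) * math.comb(N - K, n - i)
               for i in range(k, min(n, K) + 1))
    return hits / total
```

A small tail probability for a given position weight matrix would suggest that its hits are enriched in the experimental promoters relative to the GC-matched background.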
Kawasaki, Kazuyoshi; Ogawa, Seturou
2003-01-01
The NMDA receptor contributes to neuronal death under anoxic conditions. It is not known how the parts of the NMDA receptor, the NMDA-binding site and/or the glycine-binding site, influence neuronal damage in rat hippocampus in vitro. Rat hippocampus, labeled with tritiated norepinephrine (3H-NE), was incubated in artificial cerebrospinal fluid (aCSF), and 3H-NE was measured in the superfusion solution and the remaining tissue. Glucose was eliminated from the aCSF, and 95% N2 + 5% CO2 produced the anoxic state. The amount of 3H-NE released increased in anoxia with NMDA (NMDA-binding site agonist), while there was no influence on the NMDA receptor in the non-anoxic state even after D-serine (glycine-binding site agonist) had been administered. More 3H-NE was released when D-serine (100 µM) and NMDA (100 µM) were administered together than when only D-serine (10 µM, 100 µM, 1000 µM) or only NMDA (10 µM, 100 µM, 1000 µM) was administered in anoxia. A glycine-binding site agonist alone does not act significantly, but ion channels in the NMDA receptor open more and become more effective when both a glycine-binding site agonist and an NMDA-binding site agonist are present, suggesting that the NMDA-binding site and the glycine-binding site of the NMDA receptor interact during anoxia.
Rapid comparison of protein binding site surfaces with Property Encoded Shape Distributions (PESD)
Das, Sourav; Kokardekar, Arshad
2009-01-01
Patterns in shape and property distributions on the surface of binding sites are often conserved across functional proteins without significant conservation of the underlying amino-acid residues. To explore similarities of these sites from the viewpoint of a ligand, a sequence and fold-independent method was created to rapidly and accurately compare binding sites of proteins represented by property-mapped triangulated Gauss-Connolly surfaces. Within this paradigm, signatures for each binding site surface are produced by calculating their property-encoded shape distributions (PESD), a measure of the probability that a particular property will be at a specific distance to another on the molecular surface. Similarity between the signatures can then be treated as a measure of similarity between binding sites. As postulated, the PESD method rapidly detected high levels of similarity in binding site surface characteristics even in cases where there was very low similarity at the sequence level. In a screening experiment involving each member of the PDBBind 2005 dataset as a query against the rest of the set, PESD was able to retrieve a binding site with identical E.C. (Enzyme Commission) numbers as the top match in 79.5% of cases. The ability of the method to detect similarity in binding sites with low sequence conservation was compared with that of state-of-the-art binding site comparison methods. PMID:19919089
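The signature described in this abstract, a distribution of distances between pairs of surface properties, can be illustrated with a toy version. The sketch below is an assumption-laden simplification and not the published PESD algorithm: it takes labelled 3D points instead of a triangulated Gauss-Connolly surface, and the property labels, bin width, distance cutoff, and L1 comparison are all hypothetical choices.

```python
import math
from collections import defaultdict
from itertools import combinations

def shape_distribution(points, bin_width=1.0, max_dist=10.0):
    """Build a normalized distance histogram for each pair of property labels.

    points: list of ((x, y, z), property_label) tuples (hypothetical input format).
    """
    n_bins = int(max_dist / bin_width)
    hists = defaultdict(lambda: [0] * n_bins)
    for (p, a), (q, b) in combinations(points, 2):
        d = math.dist(p, q)
        if d < max_dist:
            # Sort the labels so (a, b) and (b, a) share one histogram.
            hists[tuple(sorted((a, b)))][int(d / bin_width)] += 1
    total = sum(sum(h) for h in hists.values())
    if total == 0:
        return {}
    return {key: [count / total for count in h] for key, h in hists.items()}

def signature_distance(sig_a, sig_b):
    """L1 distance between two signatures; a missing property pair counts as all zeros."""
    dist = 0.0
    for key in set(sig_a) | set(sig_b):
        ha, hb = sig_a.get(key, []), sig_b.get(key, [])
        n = max(len(ha), len(hb))
        ha = ha + [0.0] * (n - len(ha))
        hb = hb + [0.0] * (n - len(hb))
        dist += sum(abs(x - y) for x, y in zip(ha, hb))
    return dist
```

Because only pairwise distances enter the histograms, the signature does not change when a site is translated or rotated, which is why no structural alignment is needed before comparing two sites.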
Naqvi, Ahmad Abu Turab; Shahbaaz, Mohd; Ahmad, Faizan; Hassan, Md. Imtaiyaz
2015-01-01
Syphilis is a globally occurring venereal disease, and its infection is propagated through sexual contact. The causative agent of syphilis, Treponema pallidum ssp. pallidum, a Gram-negative spirochaete, is an obligate human parasite. The genome of the T. pallidum ssp. pallidum SS14 strain (RefSeq NC_010741.1) encodes 1,027 proteins, of which 444 are known as hypothetical proteins (HPs), i.e., proteins of unknown function. Here, we performed functional annotation of the HPs of T. pallidum ssp. pallidum using various databases, domain architecture predictors, protein function annotators and clustering tools. We analyzed the sequences of the 444 HPs of T. pallidum ssp. pallidum and predicted the function of 207 HPs with a high level of confidence. However, the functions of 237 HPs are predicted with less accuracy. Among the annotated HPs we found various enzymes, transporters and binding proteins that facilitate the survival of the pathogen and may be possible molecular targets. Our comprehensive analysis helps to understand the mechanism of pathogenesis and suggests many novel potential therapeutic interventions. PMID:25894582
A new family of β-helix proteins with similarities to the polysaccharide lyases
Close, Devin W.; D'Angelo, Sara; Bradbury, Andrew R. M.
2014-09-27
Microorganisms that degrade biomass produce diverse assortments of carbohydrate-active enzymes and binding modules. Despite tremendous advances in the genomic sequencing of these organisms, many genes do not have an ascribed function owing to low sequence identity to genes that have been annotated. Consequently, biochemical and structural characterization of genes with unknown function is required to complement the rapidly growing pool of genomic sequencing data. A protein with previously unknown function (Cthe_2159) was recently isolated in a genome-wide screen using phage display to identify cellulose-binding protein domains from the biomass-degrading bacterium Clostridium thermocellum. Here, the crystal structure of Cthe_2159 is presented and it is shown that it is a unique right-handed parallel β-helix protein. Despite very low sequence identity to known β-helix or carbohydrate-active proteins, Cthe_2159 displays structural features that are very similar to those of polysaccharide lyase (PL) families 1, 3, 6 and 9. Cthe_2159 is conserved across bacteria and some archaea and is a member of the domain of unknown function family DUF4353. This suggests that Cthe_2159 is the first representative of a previously unknown family of cellulose and/or acid-sugar binding β-helix proteins that share structural similarities with PLs. More importantly, these results demonstrate how functional annotation by biochemical and structural analysis remains a critical tool in the characterization of new gene products.
Platelet binding sites for factor VIII in relation to fibrin and phosphatidylserine
Novakovic, Valerie A.; Shi, Jialan; Rasmussen, Jan; Pipe, Steven W.
2015-01-01
Thrombin-stimulated platelets expose very little phosphatidylserine (PS) but express binding sites for factor VIII (fVIII), casting doubt on the role of exposed PS as the determinant of binding sites. We previously reported that fVIII binding sites are increased three- to sixfold when soluble fibrin (SF) binds the αIIbβ3 integrin. This study focuses on the hypothesis that platelet-bound SF is the major source of fVIII binding sites. Less than 10% of fVIII was displaced from thrombin-stimulated platelets by lactadherin, a PS-binding protein, and an fVIII mutant defective in PS-dependent binding retained platelet affinity. Therefore, PS is not the determinant of most binding sites. FVIII bound immobilized SF and paralleled platelet binding in affinity, dependence on separation from von Willebrand factor, and mediation by the C2 domain. SF also enhanced activity of fVIII in the factor Xase complex by two- to fourfold. Monoclonal antibody (mAb) ESH8, against the fVIII C2 domain, inhibited binding of fVIII to SF and platelets but not to PS-containing vesicles. Similarly, mAb ESH4 against the C2 domain, inhibited >90% of platelet-dependent fVIII activity vs 35% of vesicle-supported activity. These results imply that platelet-bound SF is a component of functional fVIII binding sites. PMID:26162408
DOE Office of Scientific and Technical Information (OSTI.GOV)
Poat, J.A.; Cripps, H.E.; Iversen, L.L.
1988-05-01
Forskolin labelled with [3H] bound to high- and low-affinity sites in the rat brain. The high-affinity site was discretely located, with highest densities in the striatum, nucleus accumbens, olfactory tubercle, substantia nigra, hippocampus, and the molecular layers of the cerebellum. This site did not correlate well with the distribution of adenylate cyclase. The high-affinity striatal binding site may be associated with a stimulatory guanine nucleotide-binding protein. Thus, the number of sites was increased by the addition of Mg2+ and guanylyl imidodiphosphate. Cholera toxin stereotaxically injected into rat striatum increased the number of binding sites, and no further increase was noted following the subsequent addition of guanyl nucleotide. High-affinity forskolin binding sites in non-dopamine-rich brain areas (hippocampus and cerebellum) were modulated in a qualitatively different manner by guanyl nucleotides. In these areas the number of binding sites was significantly reduced by the addition of guanyl nucleotide. These results suggest that forskolin may have a potential role in identifying different functional/structural guanine nucleotide-binding proteins.
HEMD: an integrated tool of human epigenetic enzymes and chemical modulators for therapeutics.
Huang, Zhimin; Jiang, Haiming; Liu, Xinyi; Chen, Yingyi; Wong, Jiemin; Wang, Qi; Huang, Wenkang; Shi, Ting; Zhang, Jian
2012-01-01
Epigenetic mechanisms mainly include DNA methylation, post-translational modifications of histones, chromatin remodeling and non-coding RNAs. All of these processes are mediated and controlled by enzymes. Abnormalities of the enzymes are involved in a variety of complex human diseases. Recently, potent natural or synthetic chemicals are utilized to establish the quantitative contributions of epigenetic regulation through the enzymes and provide novel insight for developing new therapeutics. However, the development of more specific and effective epigenetic therapeutics requires a more complete understanding of the chemical epigenomic landscape. Here, we present a human epigenetic enzyme and modulator database (HEMD), the database which provides a central resource for the display, search, and analysis of the structure, function, and related annotation for human epigenetic enzymes and chemical modulators focused on epigenetic therapeutics. Currently, HEMD contains 269 epigenetic enzymes and 4377 modulators in three categories (activators, inhibitors, and regulators). Enzymes are annotated with detailed description of epigenetic mechanisms, catalytic processes, and related diseases, and chemical modulators with binding sites, pharmacological effect, and therapeutic uses. Integrating the information of epigenetic enzymes in HEMD should allow for the prediction of conserved features for proteins and could potentially classify them as ideal targets for experimental validation. In addition, modulators curated in HEMD can be used to investigate potent epigenetic targets for the query compound and also help chemists to implement structural modifications for the design of novel epigenetic drugs. HEMD could be a platform and a starting point for biologists and medicinal chemists for furthering research on epigenetic therapeutics. HEMD is freely available at http://mdl.shsmu.edu.cn/HEMD/.
DOE Office of Scientific and Technical Information (OSTI.GOV)
McCann, D.J.; Su, T.P.
1991-05-01
The zwitterionic detergent 3-((3-cholamidopropyl)dimethylamino)-1-propanesulfonate (CHAPS) produced optimal solubilization of (+)-[3H]SKF-10,047 binding sites from rat liver membranes at a concentration of 0.2%, well below the critical micellar concentration of the detergent. The pharmacological selectivity of the liver (+)-[3H]SKF-10,047 binding sites corresponds to that of sigma sites from rat and guinea pig brain. When the affinities of 18 different drugs at (+)-[3H]SKF-10,047 binding sites in membranes and solubilized preparations were compared, a correlation coefficient of 0.99 and a slope of 1.03 were obtained, indicating that the pharmacological selectivity of rat liver sigma sites is retained after solubilization. In addition, the binding of 20 nM [3H]progesterone to solubilized rat liver preparations was found to exhibit a pharmacological selectivity appropriate for sigma sites. A stimulatory effect of phenytoin on (+)-[3H]SKF-10,047 binding to sigma sites persisted after solubilization. When the solubilized preparation was subjected to molecular sizing chromatography, a single peak exhibiting specific (+)-[3H]SKF-10,047 binding was obtained. The binding activity of this peak was stimulated symmetrically when assays were performed in the presence of 300 microM phenytoin. The molecular weight of the CHAPS-solubilized sigma site complex was estimated to be 450,000 daltons. After solubilization with CHAPS, rat liver sigma sites were enriched to 12 pmol/mg of protein. The present results demonstrate a successful solubilization of sigma sites from rat liver membranes and provide direct evidence that the gonadal steroid progesterone binds to sigma sites. The results also suggest that the anticonvulsant phenytoin binds to an associated allosteric site on the sigma site complex.
POLYVIEW-MM: web-based platform for animation and analysis of molecular simulations
Porollo, Aleksey; Meller, Jaroslaw
2010-01-01
Molecular simulations offer important mechanistic and functional clues in studies of proteins and other macromolecules. However, interpreting the results of such simulations increasingly requires tools that can combine information from multiple structural databases and other web resources, and provide highly integrated and versatile analysis tools. Here, we present a new web server that integrates high-quality animation of molecular motion (MM) with structural and functional analysis of macromolecules. The new tool, dubbed POLYVIEW-MM, enables animation of trajectories generated by molecular dynamics and related simulation techniques, as well as visualization of alternative conformers, e.g. obtained as a result of protein structure prediction methods or small molecule docking. To facilitate structural analysis, POLYVIEW-MM combines interactive view and analysis of conformational changes using Jmol and its tailored extensions, publication quality animation using PyMol, and customizable 2D summary plots that provide an overview of MM, e.g. in terms of changes in secondary structure states and relative solvent accessibility of individual residues in proteins. Furthermore, POLYVIEW-MM integrates visualization with various structural annotations, including automated mapping of known interaction sites from structural homologs, mapping of cavities and ligand binding sites, transmembrane regions and protein domains. URL: http://polyview.cchmc.org/conform.html. PMID:20504857
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kalra, Rajkumar S.; Wadhwa, Renu
2015-02-27
Epithelial membrane antigen (EMA or MUC1) is a heavily glycosylated, type I transmembrane glycoprotein commonly expressed by epithelial cells of duct organs. It has been shown to be aberrantly glycosylated in several diseases including cancer. Protein sequence based annotation and analysis of the glycosylation profile of glycoproteins by robust computational and comprehensive algorithms provides possible insights into the mechanism(s) of anomalous glycosylation. In the present report, using a number of bioinformatics applications, we studied EMA/MUC1 and explored its trans-membrane structural domain sequence that is widely subjected to glycosylation. Exploration of different extracellular motifs led to the prediction of N- and O-linked glycosylation target sites. Based on the putative O-linked target sites, glycosylated moieties and pathways were envisaged. Furthermore, protein network analysis demonstrated physical interaction of EMA with a number of proteins and confirmed its functional involvement in cell growth and proliferation pathways. Gene Ontology analysis suggested an involvement of EMA in a number of functions including signal transduction, protein binding, processing and transport along with glycosylation. Thus, the present study explored the potential of a bioinformatics prediction approach in analyzing glycosylation, co-expression and interaction patterns of the EMA/MUC1 glycoprotein.
Binding mode of cytochalasin B to F-actin is altered by lateral binding of regulatory proteins.
Suzuki, N; Mihashi, K
1991-01-01
The binding of cytochalasin B (CB) to F-actin was studied using a trace amount of [3H]-cytochalasin B. F-Actin-bound CB was separated from free CB by ultracentrifugation and the amount of F-actin-bound CB was determined by comparing the radioactivity both in the supernatant and in the precipitate. A filament of pure F-actin possessed one high-affinity binding site for CB (Kd = 5.0 nM) at the B-end. When the filament was bound to native tropomyosin (complex of tropomyosin and troponin), two low-affinity binding sites for CB (Kd = 230 nM) were created, while the high-affinity binding site was reserved (Kd = 3.4 nM). It was concluded that the creation of low-affinity binding sites was primarily due to binding of tropomyosin to F-actin, as judged from the following two observations: (1) a filament of F-actin/tropomyosin complex possessed one high-affinity binding site (Kd = 3.9 nM) plus two low-affinity binding sites (Kd = 550 nM); (2) the Ca2(+)-receptive state of troponin C in F-actin/native tropomyosin complex did not affect CB binding.
m6ASNP: a tool for annotating genetic variants by m6A function.
Jiang, Shuai; Xie, Yubin; He, Zhihao; Zhang, Ya; Zhao, Yuli; Chen, Li; Zheng, Yueyuan; Miao, Yanyan; Zuo, Zhixiang; Ren, Jian
2018-05-01
Large-scale genome sequencing projects have identified many genetic variants for diverse diseases. A major goal of these projects is to characterize these genetic variants to provide insight into their function and roles in diseases. N6-methyladenosine (m6A) is one of the most abundant RNA modifications in eukaryotes. Recent studies have revealed that aberrant m6A modifications are involved in many diseases. In this study, we present a user-friendly web server called "m6ASNP" that is dedicated to the identification of genetic variants that target m6A modification sites. A random forest model was implemented in m6ASNP to predict whether the methylation status of an m6A site is altered by the variants that surround the site. In m6ASNP, genetic variants in a standard variant call format (VCF) are accepted as the input data, and the output includes an interactive table that contains the genetic variants annotated by m6A function. In addition, statistical diagrams and a genome browser are provided to visualize the characteristics and to annotate the genetic variants. We believe that m6ASNP is a very convenient tool that can be used to boost further functional studies investigating genetic variants. The web server "m6ASNP" is implemented in JAVA and PHP and is freely available at [60].
Use of Annotations for Component and Framework Interoperability
NASA Astrophysics Data System (ADS)
David, O.; Lloyd, W.; Carlson, J.; Leavesley, G. H.; Geter, F.
2009-12-01
The popular programming languages Java and C# provide annotations, a form of meta-data construct. Software frameworks for web integration, web services, database access, and unit testing now take advantage of annotations to reduce the complexity of APIs and the quantity of integration code between the application and framework infrastructure. Adopting annotation features in frameworks has been observed to lead to cleaner and leaner application code. The USDA Object Modeling System (OMS) version 3.0 fully embraces the annotation approach and additionally defines a meta-data standard for components and models. In version 3.0 framework/model integration previously accomplished using API calls is now achieved using descriptive annotations. This enables the framework to provide additional functionality non-invasively such as implicit multithreading, and auto-documenting capabilities while achieving a significant reduction in the size of the model source code. Using a non-invasive methodology leads to models and modeling components with only minimal dependencies on the modeling framework. Since models and modeling components are not directly bound to framework by the use of specific APIs and/or data types they can more easily be reused both within the framework as well as outside of it. To study the effectiveness of an annotation based framework approach with other modeling frameworks, a framework-invasiveness study was conducted to evaluate the effects of framework design on model code quality. A monthly water balance model was implemented across several modeling frameworks and several software metrics were collected. The metrics selected were measures of non-invasive design methods for modeling frameworks from a software engineering perspective. It appears that the use of annotations positively impacts several software quality measures. 
In a next step, the PRMS model was implemented in OMS 3.0 and is currently being implemented for water supply forecasting in the western United States at the USDA NRCS National Water and Climate Center. PRMS is a component based modular precipitation-runoff model developed to evaluate the impacts of various combinations of precipitation, climate, and land use on streamflow and general basin hydrology. The new OMS 3.0 PRMS model source code is more concise and flexible as a result of using the new framework’s annotation based approach. The fully annotated components are now providing information directly for (i) model assembly and building, (ii) dataflow analysis for implicit multithreading, (iii) automated and comprehensive model documentation of component dependencies, physical data properties, (iv) automated model and component testing, and (v) automated audit-traceability to account for all model resources leading to a particular simulation result. Experience to date has demonstrated the multi-purpose value of using annotations. Annotations are also a feasible and practical method to enable interoperability among models and modeling frameworks. As a prototype example, model code annotations were used to generate binding and mediation code to allow the use of OMS 3.0 model components within the OpenMI context.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Luthin, G.R.; Wolfe, B.B.
The properties of [3H]quinuclidinylbenzilate ([3H]QNB) binding and [3H]pirenzepine ([3H]PZ) binding to various regions of rat brain were compared. [3H]PZ appeared to bind with high affinity to a single site, with a Kd value of approximately 15 nM in the cerebral cortex. The rank order of potencies of muscarinic drugs to inhibit binding of either [3H]QNB or [3H]PZ was QNB greater than atropine . scopolamine greater than pirenzepine greater than oxotremorine greater than bethanechol. Muscarinic antagonists (except PZ) inhibited both [3H]PZ and [3H]QNB binding with Hill coefficients of approximately 1. PZ inhibited [3H]QNB binding in cortex with a Hill coefficient of 0.7, but inhibited [3H]PZ binding with a Hill coefficient of 1.0. Hill coefficients for agonists were less than 1. The density of [3H]PZ binding sites was approximately half the density of [3H]QNB binding sites in cortex, striatum and hippocampus. In pons-medulla and cerebellum, the densities of [3H]PZ binding sites were 20 and 0%, respectively, relative to the densities of [3H]QNB binding sites. When unlabeled PZ was used to compete for [3H]QNB binding, the relative number of high-affinity PZ binding sites in cortex, pons and cerebellum agreed with the relative number of [3H]PZ binding sites in those regions. The binding of [3H]PZ and [3H]QNB was nonadditive in cortex. GTP inhibited high-affinity oxotremorine binding, but not PZ binding. Together, these data suggest that [3H]PZ binds to a subset of [3H]QNB binding sites. Whether this subset reflects the existence of subtypes of muscarinic receptors or is a consequence of coupling to another membrane protein remains to be seen.
Price, D J; Rivnay, B; Fu, Y; Jiang, S; Avraham, S; Avraham, H
1997-02-28
The Csk homologous kinase (CHK), formerly MATK, has previously been shown to bind to activated c-KIT. In this report, we characterize the binding of SH2(CHK) to specific phosphotyrosine sites on the c-KIT protein sequence. Phosphopeptide inhibition of the in vitro interaction of SH2(CHK)-glutathione S-transferase fusion protein/c-KIT from SCF/KL-treated Mo7e megakaryocytic cells indicated that two sites on c-KIT were able to bind SH2(CHK). These sites were the Tyr568/570 diphosphorylated sequence and the monophosphorylated Tyr721 sequence. To confirm this, we precipitated native CHK from cellular extracts using phosphorylated peptides linked to Affi-Gel 15. In addition, purified SH2(CHK)-glutathione S-transferase fusion protein was precipitated with the same peptide beads. All of the peptide bead-binding studies were consistent with the direct binding of SH2(CHK) to phosphorylated Tyr568/570 and Tyr721 sites. Binding of FYN and SHC to the diphosphorylated Tyr568/570 site was observed, while binding of Csk to this site was not observed. The SH2(CHK) binding to the two sites is direct and not through phosphorylated intermediates such as FYN or SHC. Site-directed mutagenesis of the full-length c-KIT cDNA followed by transient transfection indicated that only the Tyr568/570, and not the Tyr721, is able to bind SH2(CHK). This indicates that CHK binds to the same site on c-KIT to which FYN binds, possibly bringing the two into proximity on associated c-KIT subunits and leading to the down-regulation of FYN by CHK.
Hebner, Christy; Lasanen, Julie; Battle, Scott; Aiyar, Ashok
2003-07-05
Epstein-Barr virus (EBV) and the closely related Herpesvirus papio (HVP) are stably replicated as episomes in proliferating latently infected cells. Maintenance and partitioning of these viral plasmids requires a viral sequence in cis, termed the family of repeats (FR), that is bound by a viral protein, Epstein-Barr nuclear antigen 1 (EBNA1). Upon binding FR, EBNA1 maintains viral genomes in proliferating cells and activates transcription from viral promoters required for immortalization. FR from either virus encodes multiple binding sites for the viral maintenance protein, EBNA1, with the FR from the prototypic B95-8 strain of EBV containing 20 binding sites, and FR from HVP containing 8 binding sites. In addition to differences in the number of EBNA1-binding sites, adjacent binding sites in the EBV FR are typically separated by 14 base pairs (bp), but are separated by 10 bp in HVP. We tested whether the number of binding sites, as well as the distance between adjacent binding sites, affects the function of EBNA1 in transcription activation or plasmid maintenance. Our results indicate that EBNA1 activates transcription more efficiently when adjacent binding sites are separated by 10 bp, the spacing observed in HVP. In contrast, using two separate assays, we demonstrate that plasmid maintenance is greatly augmented when adjacent EBNA1-binding sites are separated by 14 bp, and therefore, presumably lie on the same face of the DNA double helix. These results provide indication that the functions of EBNA1 in transcription activation and plasmid maintenance are separable.
ERIC Educational Resources Information Center
Lindroth, Linda K.
1996-01-01
Annotates 16 World Wide Web (WWW) sites dealing with math and science education matters covered in feature articles for this journal issue. Topics include math fairs, classroom restructuring, and hands-on science. (JW)
Existence of three subtypes of bradykinin B2 receptors in guinea pig.
Seguin, L; Widdowson, P S; Giesen-Crouse, E
1992-12-01
We describe the binding of [3H]bradykinin to homogenates of guinea pig brain, lung, and ileum. Analysis of [3H]bradykinin binding kinetics in guinea pig brain, lung, and ileum suggests the existence of two binding sites in each tissue. The finding of two binding sites for [3H]bradykinin in ileum, lung, and brain was further supported by Scatchard analysis of equilibrium binding in each tissue. [3H]Bradykinin binds to a high-affinity site in brain, lung, and ileum (KD = 70-200 pM), which constitutes approximately 20% of the bradykinin binding, and to a second, lower-affinity site (0.63-0.95 nM), which constitutes the remaining 80% of binding. Displacement studies with various bradykinin analogues led us to subdivide the high- and lower-affinity sites in each tissue and to suggest the existence of three subtypes of B2 receptors in the guinea pig, which we classify as B2a, B2b, and B2c. Binding of [3H]bradykinin is largely to a B2b receptor subtype, which constitutes the majority of binding in brain, lung, and ileum and represents the lower-affinity site in our binding studies. Receptor subtype B2c constitutes approximately 20% of binding sites in the brain and lung and is equivalent to the high-affinity site in brain and lung. We suggest that a third subtype of B2 receptor (high-affinity site in ileum), B2a, is found only in the ileum. All three subtypes of B2 receptors display a high affinity for bradykinin, whereas they show different affinities for various bradykinin analogues displaying agonist or antagonist activities.(ABSTRACT TRUNCATED AT 250 WORDS)
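The two-site interpretation above can be illustrated with a minimal sketch. It assumes two independent, non-interacting sites, so that total binding is the sum of two Langmuir isotherms; the function names and the parameter values in the usage note are illustrative choices (picked inside the KD ranges quoted in the abstract), not the authors' fitted values or analysis code.

```python
# Minimal sketch (assumption: two independent binding sites).
# All names and numbers here are hypothetical illustrations.

def two_site_bound(free_nM, bmax_hi, kd_hi_nM, bmax_lo, kd_lo_nM):
    """Specific binding as the sum of a high- and a lower-affinity site."""
    high = bmax_hi * free_nM / (kd_hi_nM + free_nM)
    low = bmax_lo * free_nM / (kd_lo_nM + free_nM)
    return high + low


def scatchard_points(free_concs_nM, **site_params):
    """Return (bound, bound/free) pairs, i.e. the two axes of a Scatchard plot."""
    points = []
    for free in free_concs_nM:
        bound = two_site_bound(free, **site_params)
        points.append((bound, bound / free))
    return points
```

For example, `scatchard_points([0.01, 0.1, 1.0, 10.0], bmax_hi=20.0, kd_hi_nM=0.1, bmax_lo=80.0, kd_lo_nM=0.8)` models a 20%/80% split between the two sites. With two sites of different affinity the resulting Scatchard plot is curved rather than a single straight line, which is the kind of evidence the abstract refers to.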
Denys, A; Allain, F; Carpentier, M; Spik, G
1998-12-15
Cyclophilin B (CyPB) is a cyclosporin A (CsA)-binding protein, mainly associated with the secretory pathway, and is released in biological fluids. We recently reported that CyPB specifically binds to T-lymphocytes and promotes enhanced incorporation of CsA. The interactions with cellular binding sites involved, at least in part, the specific N-terminal extension of the protein. In this study, we intended to specify further the nature of the CyPB-binding sites on peripheral blood T-lymphocytes. We first provide evidence that the CyPB binding to heparin-Sepharose is prevented by soluble sulphated glycosaminoglycans (GAG), raising the interesting possibility that such interactions may occur on the T-cell surface. We then characterized CyPB binding to T-cell surface GAG and found that these interactions involved the N-terminal extension of CyPB, but not its conserved CsA-binding domain. In addition, we determined the presence of a second CyPB binding site, which we termed a type I site, in contrast with type II for GAG interactions. The two binding sites exhibit a similar affinity but the expression of the type I site was 3-fold lower. The conclusion that CyPB binding to the type I site is distinct from the interactions with GAG was based on the findings that it was (1) resistant to NaCl wash and GAG-degrading enzyme treatments, (2) reduced in the presence of CsA or cyclophilin C, and (3) unmodified in the presence of either the N-terminal peptide of CyPB or protamine. Finally, we showed that the type I binding sites were involved in an endocytosis process, supporting the hypothesis that they may correspond to a functional receptor for CyPB.
Nguyen, T V; Juorio, A V
1989-10-01
The present study assessed changes of tryptamine, dopamine D2, 5-HT1 and 5-HT2 binding sites in rat brain following chronic treatment with low (5 mg/kg/day) and high (40 mg/kg/day) doses of molindone, a clinically effective psychotropic drug. The high-dose molindone treatment produced a decrease in the number of tryptamine binding sites while both high and low doses caused an increase in the number of dopamine D2 binding sites in the striatum. No significant changes were observed in either 5-HT1 or 5-HT2 binding sites in the cerebral cortex. Competition binding experiments showed that molindone was a potent inhibitor at dopamine D2 but less effective at tryptamine, 5-HT1 and 5-HT2 binding sites. The inhibition activity of molindone towards type A monoamine oxidase produced a significant increase in endogenous tryptamine accumulation rate which was much higher than that of dopamine and 5-HT. These findings suggest that the reduction in the number of tryptamine binding sites produced by chronic molindone administration is related to monoamine oxidase inhibition and that the increase in the number of dopamine D2 binding sites is correlated to receptor blocking activity of the drug.
Chou, Kuo-Chen; Shen, Hong-Bin
2007-05-01
One of the critical challenges in predicting protein subcellular localization is how to deal with the case of multiple location sites. Unfortunately, so far, no efforts have been made in this regard except for one focused only on proteins in budding yeast. For most existing predictors, multiple-site proteins are either excluded from consideration or assumed not to exist. Actually, proteins may simultaneously exist at, or move between, two or more different subcellular locations. For instance, according to the Swiss-Prot database (version 50.7, released 19-Sept-2006), among the 33,925 eukaryotic protein entries that have experimentally observed subcellular location annotations, 2715 have multiple location sites, meaning that about 8% bear the multiplex feature. Proteins with multiple locations or dynamic features of this kind are particularly interesting because they may have some very special biological functions intriguing to investigators in both basic research and drug discovery. Meanwhile, according to the same Swiss-Prot database, the number of total eukaryotic protein entries (except those annotated with "fragment" or those with fewer than 50 amino acids) is 90,909, meaning a gap of (90,909-33,925) = 56,984 entries for which no knowledge is available about their subcellular locations. Although one can use the computational approach to predict the desired information for the blank, so far, all the existing methods for predicting eukaryotic protein subcellular localization are limited to the case of a single location site. To overcome such a barrier, a new ensemble classifier, named Euk-mPLoc, was developed that can be used to deal with the case of multiple location sites as well. Euk-mPLoc is freely accessible to the public as a Web server at http://202.120.37.186/bioinf/euk-multi.
Meanwhile, to support people working in the relevant areas, Euk-mPLoc has been used to identify all eukaryotic protein entries in the Swiss-Prot database that do not have subcellular location annotations or are annotated as being uncertain. The large-scale results thus obtained have been deposited at the same Web site via a downloadable file prepared with Microsoft Excel and named "Tab_Euk-mPLoc.xls". Furthermore, to include new entries of eukaryotic proteins and reflect the continuous development of Euk-mPLoc in both the coverage scope and prediction accuracy, we will update the downloadable file as well as the predictor in a timely manner, and keep users informed by publishing a short note in the Journal and making an announcement on the Web page.
Impact of germline and somatic missense variations on drug binding sites.
Yan, C; Pattabiraman, N; Goecks, J; Lam, P; Nayak, A; Pan, Y; Torcivia-Rodriguez, J; Voskanian, A; Wan, Q; Mazumder, R
2017-03-01
Advancements in next-generation sequencing (NGS) technologies are generating a vast amount of data. This exacerbates the current challenge of translating NGS data into actionable clinical interpretations. We have comprehensively combined germline and somatic nonsynonymous single-nucleotide variations (nsSNVs) that affect drug binding sites in order to investigate their prevalence. The integrated data thus generated in conjunction with exome or whole-genome sequencing can be used to identify patients who may not respond to a specific drug because of alterations in drug binding efficacy due to nsSNVs in the target protein's gene. To identify the nsSNVs that may affect drug binding, protein-drug complex structures were retrieved from Protein Data Bank (PDB) followed by identification of amino acids in the protein-drug binding sites using an occluded surface method. Then, the germline and somatic mutations were mapped to these amino acids to identify which of these alter protein-drug binding sites. Using this method we identified 12 993 amino acid-drug binding sites across 253 unique proteins bound to 235 unique drugs. The integration of amino acid-drug binding sites data with both germline and somatic nsSNVs data sets revealed 3133 nsSNVs affecting amino acid-drug binding sites. In addition, a comprehensive drug target discovery was conducted based on protein structure similarity and conservation of amino acid-drug binding sites. Using this method, 81 paralogs were identified that could serve as alternative drug targets. In addition, non-human mammalian proteins bound to drugs were used to identify 142 homologs in humans that can potentially bind to drugs. In the current protein-drug pairs that contain somatic mutations within their binding site, we identified 85 proteins with significant differential gene expression changes associated with specific cancer types. 
Information on protein-drug binding, predicted drug target proteins, and the prevalence of both somatic and germline nsSNVs that disrupt these binding sites can provide valuable knowledge for personalized medicine treatment. A web portal is available where nsSNVs from an individual patient can be checked by scanning against DrugVar to determine whether any of the SNVs affect the binding of any drug in the database.
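The mapping step described in this abstract (variants mapped onto binding-site residues) can be pictured with a small lookup sketch. The data layout, the function name and the placeholder identifiers (`PROT_A`, `drug_x`) are assumptions made for illustration; this is not the DrugVar schema or the authors' pipeline.

```python
# Minimal sketch, assuming binding sites are stored per protein as
# {residue_position: set_of_drug_names}. All identifiers are hypothetical.

def variants_in_binding_sites(binding_sites, variants):
    """Return the missense variants that fall on a drug-contacting residue.

    variants: iterable of (protein, position, ref_aa, alt_aa) tuples.
    """
    hits = []
    for protein, position, ref_aa, alt_aa in variants:
        drugs = binding_sites.get(protein, {}).get(position)
        # Keep only amino acid changes at residues with at least one bound drug.
        if drugs and ref_aa != alt_aa:
            hits.append((protein, position, ref_aa, alt_aa, sorted(drugs)))
    return hits
```

Called with a patient's variant list, the function returns only those variants that coincide with a binding-site residue, together with the drugs whose binding might be affected.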
Escherichia coli K-12: a cooperatively developed annotation snapshot—2005
Riley, Monica; Abe, Takashi; Arnaud, Martha B.; Berlyn, Mary K.B.; Blattner, Frederick R.; Chaudhuri, Roy R.; Glasner, Jeremy D.; Horiuchi, Takashi; Keseler, Ingrid M.; Kosuge, Takehide; Mori, Hirotada; Perna, Nicole T.; Plunkett, Guy; Rudd, Kenneth E.; Serres, Margrethe H.; Thomas, Gavin H.; Thomson, Nicholas R.; Wishart, David; Wanner, Barry L.
2006-01-01
The goal of this group project has been to coordinate and bring up-to-date information on all genes of Escherichia coli K-12. Annotation of the genome of an organism entails identification of genes, the boundaries of genes in terms of precise start and end sites, and description of the gene products. Known and predicted functions were assigned to each gene product on the basis of experimental evidence or sequence analysis. Since both kinds of evidence are constantly expanding, no annotation is complete at any moment in time. This is a snapshot analysis based on the most recent genome sequences of two E.coli K-12 bacteria. An accurate and up-to-date description of E.coli K-12 genes is of particular importance to the scientific community because experimentally determined properties of its gene products provide fundamental information for annotation of innumerable genes of other organisms. Availability of the complete genome sequence of two K-12 strains allows comparison of their genotypes and mutant status of alleles. PMID:16397293
Martin, Tiphaine; Sherman, David J; Durrens, Pascal
2011-01-01
The Génolevures online database (URL: http://www.genolevures.org) stores and provides the data and results obtained by the Génolevures Consortium through several campaigns of genome annotation of the yeasts in the Saccharomycotina subphylum (hemiascomycetes). This database is dedicated to large-scale comparison of these genomes, storing not only the different chromosomal elements detected in the sequences, but also the logical relations between them. The database is divided into a public part, accessible to anyone through Internet, and a private part where the Consortium members make genome annotations with our Magus annotation system; this system is used to annotate several related genomes in parallel. The public database is widely consulted and offers structured data, organized using a REST web site architecture that allows for automated requests. The implementation of the database, as well as its associated tools and methods, is evolving to cope with the influx of genome sequences produced by Next Generation Sequencing (NGS). Copyright © 2011 Académie des sciences. Published by Elsevier SAS. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Valley, Cary T.; Porter, Douglas F.; Qiu, Chen
2012-06-28
mRNA control hinges on the specificity and affinity of proteins for their RNA binding sites. Regulatory proteins must bind their own sites and reject even closely related noncognate sites. In the PUF [Pumilio and fem-3 binding factor (FBF)] family of RNA binding proteins, individual proteins discriminate differences in the length and sequence of binding sites, allowing each PUF to bind a distinct battery of mRNAs. Here, we show that despite these differences, the pattern of RNA interactions is conserved among PUF proteins: the two ends of the PUF protein make critical contacts with the two ends of the RNA sites. Despite this conserved 'two-handed' pattern of recognition, the RNA sequence is flexible. Among the binding sites of yeast Puf4p, RNA sequence dictates the pattern in which RNA bases are flipped away from the binding surface of the protein. Small differences in RNA sequence allow new modes of control, recruiting Puf5p in addition to Puf4p to a single site. This embedded information adds a new layer of biological meaning to the connections between RNA targets and PUF proteins.
Thermodynamic compensation upon binding to exosite 1 and the active site of thrombin.
Treuheit, Nicholas A; Beach, Muneera A; Komives, Elizabeth A
2011-05-31
Several lines of experimental evidence including amide exchange and NMR suggest that ligands binding to thrombin cause reduced backbone dynamics. Binding of the covalent inhibitor dPhe-Pro-Arg chloromethyl ketone to the active site serine, as well as noncovalent binding of a fragment of the regulatory protein, thrombomodulin, to exosite 1 on the back side of the thrombin molecule both cause reduced dynamics. However, the reduced dynamics do not appear to be accompanied by significant conformational changes. In addition, binding of ligands to the active site does not change the affinity of thrombomodulin fragments binding to exosite 1; however, the thermodynamic coupling between exosite 1 and the active site has not been fully explored. We present isothermal titration calorimetry experiments that probe changes in enthalpy and entropy upon formation of binary ligand complexes. The approach relies on stringent thrombin preparation methods and on the use of dansyl-L-arginine-(3-methyl-1,5-pentanediyl)amide (DAPA) and a DNA aptamer as ligands with ideal thermodynamic signatures for binding to the active site and to exosite 1. Using this approach, the binding thermodynamic signatures of each ligand alone as well as the binding signatures of each ligand when the other binding site was occupied were measured. Different exosite 1 ligands with widely varied thermodynamic signatures cause a similar reduction in ΔH and a concomitantly lower entropy cost upon DAPA binding at the active site. The results suggest a general phenomenon of enthalpy-entropy compensation consistent with reduction of dynamics/increased folding of thrombin upon ligand binding to either the active site or exosite 1.
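The compensation argument can be written out with the standard binding relations. This is a textbook-level sketch rather than the authors' derivation; the double-difference notation (a quantity measured with the other site occupied minus the same quantity measured with it empty) is introduced here for illustration.

```latex
\begin{align}
\Delta G^{\circ} &= \Delta H - T\,\Delta S = RT \ln K_d \\
\Delta\Delta G &= \Delta\Delta H - T\,\Delta\Delta S \\
\Delta\Delta G \approx 0 \quad &\Longleftrightarrow \quad \Delta\Delta H \approx T\,\Delta\Delta S
\end{align}
```

Under these relations, a change in the enthalpy of binding that is matched by a change in the entropy term leaves the binding free energy, and therefore the affinity, roughly unchanged, which is consistent with the abstract's observation that occupying one site does not alter affinity at the other.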
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dissanayake, V.U.; Hughes, J.; Hunter, J.C.
The specific binding of the selective μ-, δ-, and κ-opioid ligands [3H][D-Ala2,MePhe4,Gly-ol5]enkephalin ([3H]DAGOL), [3H][D-Pen2,D-Pen5]enkephalin ([3H]DPDPE), and [3H]U69593, respectively, to crude membranes of the guinea pig and rat whole kidney, kidney cortex, and kidney medulla was investigated. In addition, the distribution of specific 3H-opioid binding sites in the guinea pig and rat kidney was visualized by autoradiography. Homogenate binding and autoradiography demonstrated the absence of μ- and κ-opioid binding sites in the guinea pig kidney. No opioid binding sites were demonstrable in the rat kidney. In the guinea pig whole kidney, cortex, and medulla, saturation studies demonstrated that [3H]DPDPE bound with high affinity (KD = 2.6-3.5 nM) to an apparently homogeneous population of binding sites (Bmax = 8.4-30 fmol/mg of protein). Competition studies using several opioid compounds confirmed the nature of the δ-opioid binding site. Autoradiography experiments demonstrated that specific [3H]DPDPE binding sites were distributed radially in regions of the inner and outer medulla and at the corticomedullary junction of the guinea pig kidney. Computer-assisted image analysis of saturation data yielded KD values (4.5-5.0 nM) that were in good agreement with those obtained from the homogenate binding studies. Further investigation of the δ-opioid binding site in medulla homogenates, using agonist ([3H]DPDPE) and antagonist ([3H]diprenorphine) binding in the presence of Na+, Mg2+, and nucleotides, suggested that the δ-opioid site is linked to a second messenger system via a GTP-binding protein. Further studies are required to establish the precise localization of the δ binding site in the guinea pig kidney and to determine the nature of the second messenger linked to the GTP-binding protein in the medulla.
Ni, Ming; Ye, Fuqiang; Zhu, Juanjuan; Li, Zongwei; Yang, Shuai; Yang, Bite; Han, Lu; Wu, Yongge; Chen, Ying; Li, Fei; Wang, Shengqi; Bo, Xiaochen
2014-12-01
Numerous public microarray datasets are valuable resources for the scientific communities. Several online tools have made great strides in using these data by querying related datasets with users' own gene signatures or expression profiles. However, dataset annotation and result exhibition still need to be improved. ExpTreeDB is a database that allows for queries on human and mouse microarray experiments from Gene Expression Omnibus with gene signatures or profiles. Compared with similar applications, ExpTreeDB pays more attention to dataset annotations and result visualization. We introduced a multiple-level annotation system to depict and organize original experiments. For example, a tamoxifen-treated cell line experiment is hierarchically annotated as 'agent→drug→estrogen receptor antagonist→tamoxifen'. Consequently, retrieved results are displayed as an interactive tree-structured graphic, which provides an overview of related experiments and might enlighten users on key items of interest. The database is freely available at http://biotech.bmi.ac.cn/ExpTreeDB. The web site is implemented in Perl, PHP, R, MySQL and Apache. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
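The multiple-level annotation described above can be pictured as a tree keyed by annotation levels. The sketch below is a hypothetical in-memory model using nested dictionaries; it is not ExpTreeDB's actual schema (the abstract only states that the site uses Perl, PHP, R and MySQL), and the experiment identifiers are placeholders.

```python
# Minimal sketch: a hierarchical annotation tree built from nested dicts.
# Function names and experiment IDs are hypothetical.

def add_path(tree, path, experiment_id):
    """Attach an experiment at the end of an annotation path, creating levels as needed."""
    node = tree
    for level in path:
        node = node.setdefault("children", {}).setdefault(level, {})
    node.setdefault("experiments", []).append(experiment_id)


def collect(node):
    """Gather every experiment at or below a node."""
    found = list(node.get("experiments", []))
    for child in node.get("children", {}).values():
        found.extend(collect(child))
    return found


def experiments_under(tree, prefix):
    """Return all experiments annotated at or below the given path prefix."""
    node = tree
    for level in prefix:
        node = node.get("children", {}).get(level)
        if node is None:
            return []
    return collect(node)
```

With the abstract's example path `["agent", "drug", "estrogen receptor antagonist", "tamoxifen"]` stored via `add_path`, a query on the prefix `["agent", "drug"]` returns that experiment along with any others filed deeper in the same branch, which is the overview behaviour a tree display provides.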
Murase, Hirotaka; Noguchi, Tomoharu; Sasaki, Shigeki
2018-06-01
Chromomycin A3 (CMA3) is an aureolic acid-type antitumor antibiotic. CMA3 forms dimeric complexes with divalent cations, such as Mg2+, which strongly bind to GC-rich sequences of DNA to inhibit DNA replication and transcription. In this study, the binding property of CMA3 to a DNA sequence containing multiple GC-rich binding sites was investigated by measuring the protection from hydrolysis by the restriction enzymes AccII and Fnu4HI for the center of the CGCG site and the 5'-GC↓GGC site, respectively. In contrast to the standard DNase I footprinting method, the DNA substrates are fully hydrolyzed by the restriction enzymes; therefore, full protection of the DNA at all the cleavable sites indicates that CMA3 simultaneously binds to all the binding sites. The restriction enzyme assay has suggested that CMA3 has a high tendency to bind the successive CGCG sites and the CGG repeat. Copyright © 2018 Elsevier Ltd. All rights reserved.
Distinct p53 genomic binding patterns in normal and cancer-derived human cells
McCorkle, Sean R; McCombie, WR; Dunn, John J
2011-01-01
Here, we report genome-wide analysis of the tumor suppressor p53 binding sites in normal human cells. 743 high-confidence ChIP-seq peaks representing putative genomic binding sites were identified in normal IMR90 fibroblasts using a reference chromatin sample. More than 40% were located within 2 kb of a transcription start site (TSS), a distribution similar to that documented for individually studied, functional p53 binding sites and, to date, not observed by previous p53 genome-wide studies. Nearly half of the high-confidence binding sites in the IMR90 cells reside in CpG islands in marked contrast to sites reported in cancer-derived cells. The distinct genomic features of the IMR90 binding sites do not reflect a distinct preference for specific sequences, since the de novo developed p53 motif based on our study is similar to those reported by genome-wide studies of cancer cells. More likely, the different chromatin landscape in normal, compared with cancer-derived cells, influences p53 binding via modulating availability of the sites. We compared the IMR90 ChIP-seq peaks to the recently published IMR90 methylome and demonstrated that they are enriched at hypomethylated DNA. Our study represents the first genome-wide, de novo mapping of p53 binding sites in normal human cells and reveals that p53 binding sites reside in distinct genomic landscapes in normal and cancer-derived human cells. PMID:22127205
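A statistic such as "more than 40% within 2 kb of a TSS" comes from a nearest-feature distance calculation. The sketch below shows one simple way to compute it with the Python standard library; it assumes peaks are reduced to a single coordinate (for example the summit) and that TSS positions are pre-sorted per chromosome. The function names and toy coordinates are illustrative and are not taken from the study.

```python
from bisect import bisect_left

# Minimal sketch. Assumptions: one coordinate per peak, and a sorted list of
# TSS coordinates per chromosome. All names and data are hypothetical.

def distance_to_nearest_tss(peak_pos, sorted_tss):
    """Absolute distance from a peak to the closest TSS on the same chromosome."""
    i = bisect_left(sorted_tss, peak_pos)
    # Only the TSS just before and just after the peak can be the nearest one.
    candidates = sorted_tss[max(i - 1, 0): i + 1]
    return min(abs(peak_pos - tss) for tss in candidates)


def fraction_within(peaks, tss_by_chrom, window=2000):
    """Fraction of (chrom, position) peaks lying within `window` bp of any TSS."""
    if not peaks:
        return 0.0
    hits = 0
    for chrom, pos in peaks:
        tss = tss_by_chrom.get(chrom)
        if tss and distance_to_nearest_tss(pos, tss) <= window:
            hits += 1
    return hits / len(peaks)
```

Because each lookup is a binary search, the calculation stays fast even for many thousands of peaks. Strand-aware distances (upstream versus downstream of the TSS) would need the TSS orientation as well, which this sketch leaves out.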
Ethylene binding site affinity in ripening apples
DOE Office of Scientific and Technical Information (OSTI.GOV)
Blankenship, S.M.; Sisler, E.C.
1993-09-01
Scatchard plots for ethylene binding in apples (Malus domestica Borkh.), which were harvested weekly for 5 weeks to include the ethylene climacteric rise, showed C50 values (concentration of ethylene needed to occupy 50% of the ethylene binding sites) of 0.10, 0.11, 0.34, 0.40, and 0.57 μl ethylene/liter⁻¹, respectively, for each of the 5 weeks. Higher ethylene concentrations were required to saturate the binding sites during the climacteric rise than at other times. Diffusion of 14C-ethylene from the binding sites was curvilinear and did not show any indication of multiple binding sites. Ethylene was not metabolized by apple tissue.
NASA Technical Reports Server (NTRS)
D'Alonzo, Richard C.; Selvamurugan, Nagarajan; Karsenty, Gerard; Partridge, Nicola C.
2002-01-01
Previously, we determined that the activator protein-1 (AP-1)-binding site and the runt domain (RD)-binding site and their binding proteins, c-Fos.c-Jun and Cbfa, regulate the collagenase-3 promoter in parathyroid hormone-treated and differentiating osteoblasts. Here we show that Cbfa1 and c-Fos.c-Jun appear to cooperatively bind the RD- and AP-1-binding sites and form ternary structures in vitro. Both in vitro and in vivo co-immunoprecipitation and yeast two-hybrid studies further demonstrate interaction between Cbfa1 with c-Fos and c-Jun in the absence of phosphorylation and without binding to DNA. Additionally, only the runt domain of Cbfa1 was required for interaction with c-Jun and c-Fos. In mammalian cells, overexpression of Cbfa1 enhanced c-Jun activation of AP-1-binding site promoter activity, demonstrating functional interaction. Finally, insertion of base pairs that disrupted the helical phasing between the AP-1- and RD-binding sites also inhibited collagenase-3 promoter activation. Thus, we provide direct evidence that Cbfa1 and c-Fos.c-Jun physically interact and cooperatively bind the AP-1- and RD-binding sites in the collagenase-3 promoter. Moreover, the AP-1- and RD-binding sites appear to be organized in a specific required helical arrangement that facilitates transcription factor interaction and enables promoter activation.
Functional identification and characterization of sodium binding sites in Na symporters
Loo, Donald D. F.; Jiang, Xuan; Gorraitz, Edurne; Hirayama, Bruce A.; Wright, Ernest M.
2013-01-01
Sodium cotransporters from several different gene families belong to the leucine transporter (LeuT) structural family. Although the identification of Na+ in binding sites is beyond the resolution of the structures, two Na+ binding sites (Na1 and Na2) have been proposed in LeuT. Na2 is conserved in the LeuT family but Na1 is not. A biophysical method has been used to measure sodium dissociation constants (Kd) of wild-type and mutant human sodium glucose cotransport (hSGLT1) proteins to identify the Na+ binding sites in hSGLT1. The Na1 site is formed by residues in the sugar binding pocket, and their mutation influences sodium binding to Na1 but not to Na2. For the canonical Na2 site formed by two –OH side chains, S392 and S393, and three backbone carbonyls, mutation of S392 to cysteine increased the sodium Kd by sixfold. This was accompanied by a dramatic reduction in the apparent sugar and phlorizin affinities. We suggest that mutation of S392 in the Na2 site produces a structural rearrangement of the sugar binding pocket to disrupt both the binding of the second Na+ and the binding of sugar. In contrast, the S393 mutations produce no significant changes in sodium, sugar, and phlorizin affinities. We conclude that the Na2 site is conserved in hSGLT1, the side chain of S392 and the backbone carbonyl of S393 are important in the first Na+ binding, and that Na+ binding to Na2 promotes binding to Na1 and also sugar binding. PMID:24191006
Navé, Jean-François; Benveniste, Pierre
1984-01-01
The specific binding of 1-[3H]naphthyl acetic acid (NAA) to membrane-bound binding sites from maize (Zea mays cv INRA 258) coleoptiles is inactivated by phenylglyoxal. The inactivation obeys pseudo first-order kinetics. The rate of inactivation is proportional to phenylglyoxal concentration. Under conditions at which significant binding occurs, NAA, R and S-1-naphthyl 2-propionic acids protect the auxin binding site against inactivation by phenylglyoxal. Scatchard analysis shows that the inhibition of binding corresponds to a decrease in the concentration of sites but not in the affinity. The results of the present chemical modification study indicate that at least one arginyl residue is involved in the positively charged recognition site of the carboxylate anion of NAA. PMID:16663499
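The kinetic statements above (pseudo first-order decay with a rate proportional to reagent concentration) correspond to a simple exponential model. The sketch below assumes the inactivating reagent is in large excess so that its concentration stays effectively constant; the function names are illustrative and any rate constant passed in is a placeholder, since the abstract reports no numerical values.

```python
import math

# Minimal sketch of pseudo-first-order inactivation.
# Assumption: reagent in large excess, so k_obs is constant during the reaction.

def observed_rate(k2_per_M_per_min, reagent_M):
    """k_obs = k2 * [reagent]: the observed rate scales linearly with reagent concentration."""
    return k2_per_M_per_min * reagent_M


def fraction_active(t_min, k_obs_per_min):
    """Fraction of binding activity remaining after t minutes."""
    return math.exp(-k_obs_per_min * t_min)


def half_life(k_obs_per_min):
    """Time for half of the binding activity to be lost."""
    return math.log(2) / k_obs_per_min
```

In practice, a plot of the natural log of remaining activity against time should be linear with slope `-k_obs`, and plotting `k_obs` against reagent concentration should give a straight line through the origin if the model holds.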
Spurny, Radovan; Debaveye, Sarah; Farinha, Ana; Veys, Ken; Vos, Ann M.; Gossas, Thomas; Atack, John; Bertrand, Sonia; Bertrand, Daniel; Danielson, U. Helena; Tresadern, Gary; Ulens, Chris
2015-01-01
The α7 nicotinic acetylcholine receptor (nAChR) belongs to the family of pentameric ligand-gated ion channels and is involved in fast synaptic signaling. In this study, we take advantage of a recently identified chimera of the extracellular domain of the native α7 nicotinic acetylcholine receptor and acetylcholine binding protein, termed α7-AChBP. This chimeric receptor was used to conduct an innovative fragment-library screening in combination with X-ray crystallography to identify allosteric binding sites. One allosteric site is surface-exposed and is located near the N-terminal α-helix of the extracellular domain. Ligand binding at this site causes a conformational change of the α-helix as the fragment wedges between the α-helix and a loop homologous to the main immunogenic region of the muscle α1 subunit. A second site is located in the vestibule of the receptor, in a preexisting intrasubunit pocket opposite the agonist binding site and corresponds to a previously identified site involved in positive allosteric modulation of the bacterial homolog ELIC. A third site is located at a pocket right below the agonist binding site. Using electrophysiological recordings on the human α7 nAChR we demonstrate that the identified fragments, which bind at these sites, can modulate receptor activation. This work presents a structural framework for different allosteric binding sites in the α7 nAChR and paves the way for future development of novel allosteric modulators with therapeutic potential. PMID:25918415
Krystkowiak, Izabella; Manguy, Jean; Davey, Norman E
2018-06-05
There is a pressing need for in silico tools that can aid in the identification of the complete repertoire of protein binding (SLiMs, MoRFs, miniMotifs) and modification (moiety attachment/removal, isomerization, cleavage) motifs. We have created PSSMSearch, an interactive web-based tool for rapid statistical modeling, visualization, discovery and annotation of protein motif specificity determinants to discover novel motifs in a proteome-wide manner. PSSMSearch analyses proteomes for regions with significant similarity to a motif specificity determinant model built from a set of aligned motif-containing peptides. Multiple scoring methods are available to build a position-specific scoring matrix (PSSM) describing the motif specificity determinant model. This model can then be modified by a user to add prior knowledge of specificity determinants through an interactive PSSM heatmap. PSSMSearch includes a statistical framework to calculate the significance of specificity determinant model matches against a proteome of interest. PSSMSearch also includes the SLiMSearch framework's annotation, motif functional analysis and filtering tools to highlight relevant discriminatory information. Additional tools to annotate statistically significant shared keywords and GO terms, or experimental evidence of interaction with a motif-recognizing protein have been added. Finally, PSSM-based conservation metrics have been created for taxonomic range analyses. The PSSMSearch web server is available at http://slim.ucd.ie/pssmsearch/.
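To make the idea of a position-specific scoring matrix concrete, the sketch below builds a log-odds PSSM from aligned peptides and slides it along a sequence. It assumes a uniform amino acid background and a fixed pseudocount, which is only one of many possible scoring schemes and is not claimed to be any of the methods PSSMSearch offers; the function names are illustrative.

```python
import math

# Minimal sketch: log-odds PSSM with a uniform background (an assumption).
# Sequences must use only the 20 standard residues below.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(peptides, pseudocount=1.0):
    """Build one {residue: log2-odds score} dict per alignment column."""
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("peptides must be aligned to the same length")
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for col in range(length):
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for peptide in peptides:
            counts[peptide[col]] += 1
        total = sum(counts.values())
        pssm.append({aa: math.log2((counts[aa] / total) / background)
                     for aa in AMINO_ACIDS})
    return pssm


def scan(sequence, pssm):
    """Score every window of the sequence; returns a list of (start, score)."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        hits.append((start, sum(pssm[i][aa] for i, aa in enumerate(window))))
    return hits


def best_hit(sequence, pssm):
    """Return the (start, score) of the highest-scoring window."""
    return max(scan(sequence, pssm), key=lambda hit: hit[1])
```

A proteome-wide search repeats the scan over every protein and then needs a statistical model to decide which scores are significant, which is the part the abstract attributes to the tool's statistical framework and which this sketch does not attempt.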
Muscarinic binding sites in cultured bovine pulmonary arterial endothelial cells
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aronstam, R.S.; Catravas, J.D.; Ryan, U.S.
The authors have previously reported a) the presence of muscarinic binding sites on cultured bovine pulmonary arterial endothelial cells (BPAE; 2,000 sites/cell) and b) that acetylcholine inhibits the release of thromboxane B2 from BPAE. Since the authors' findings could reflect muscarinic receptors (mAChR) on BPAE, they have further investigated the nature of BPAE muscarinic binding sites and contrasted them with those of known functional mAChR. Muscarinic binding sites on BPAE resembled mAChR in that a) the binding of 3 nM [3H]QNB was inhibited by muscarinic agonists and antagonists; b) [3H]QNB binding was 30 times more sensitive to R(-)- than to S(+)-QNB; c) carbamylcholine binding was resolved into high and low affinity components (IC50's = 0.04 and 2 μM); d) 5'-guanylylimidodiphosphate (100 μM) shifted agonist binding curves to the right by a factor of 3; e) the atropine-sensitive binding of [3H]oxotremorine-M ([3H]OXO-M) was depressed by the guanine nucleotide (IC50 = 60 μM). However, although gallamine allosterically regulates mAChR binding in other tissues, it did not affect the rates of dissociation of [3H]QNB, [3H]methylscopolamine or [3H]OXO-M from BPAE binding sites. Thus, BPAE muscarinic binding sites possess many but not all of the properties associated with functional mAChR.
Websites for Primary Sources and Civics Education
ERIC Educational Resources Information Center
Rulli, Daniel
2005-01-01
This article features a list of websites for primary sources and civics education. The World Wide Web has become an excellent source for facsimiles, images, and transcriptions of primary sources. As it would be impossible to provide a comprehensive list of all the sites, this annotated list highlights selective sites that provide access to…
Autoradiographic localization of endothelin-1 binding sites in porcine skin
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhao, Y.D.; Springall, D.R.; Wharton, J.
Autoradiographic techniques and 125I-labeled endothelin-1 were used to study the distribution of endothelin-1 binding sites in porcine skin. Specific endothelin-1 binding sites were localized to blood vessels (capillaries, deep cutaneous vascular plexus, arteries, and arterioles), the deep dermal and connective tissue sheath of hair follicles, sebaceous and sweat glands, and arrector pili muscle. Specific binding was inhibited by endothelin-2 and endothelin-3 as well as endothelin-1. Non-specific binding was found in the epidermis and the medulla of hair follicles. No binding was found in connective tissue or fat. These vascular binding sites may represent endothelin receptors, in keeping with the known cutaneous vasoconstrictor actions of the peptide. If all binding sites are receptors, the results suggest that endothelin could also regulate the function of sweat glands and may have trophic effects in the skin.
Transcriptomic Responses to Salinity Stress in the Pacific Oyster Crassostrea gigas
Zhao, Xuelin; Yu, Hong; Kong, Lingfeng; Li, Qi
2012-01-01
Background Low salinity is one of the main factors limiting the distribution and survival of marine species. As a euryhaline species, the Pacific oyster Crassostrea gigas is considered to be tolerant to relatively low salinity. The genes that regulate C. gigas responses to osmotic stress were monitored using next-generation sequencing of the whole transcriptome with samples taken from gills. By RNAseq technology, transcript catalogs of up- and down-regulated genes were generated from oysters exposed to low and optimal salinity seawater. Methodology/Principal Findings Through Illumina sequencing, we reported 1665 up-regulated transcripts and 1815 down-regulated transcripts. A total of 45771 protein-coding contigs were identified from two groups based on sequence similarities with known proteins. As determined by GO annotation and KEGG pathway mapping, functional annotation of the genes recovered diverse biological functions and processes. The genes that changed expression significantly were highly represented in cellular process and regulation of biological process, intracellular and cell, binding and protein binding according to GO annotation. The results highlighted genes related to osmoregulation, signaling and interactions of osmotic stress response, anti-apoptotic reactions as well as immune response, cell adhesion and communication, cytoskeleton and cell cycle. Conclusions/Significance Through more than 1.5 million sequence reads and the expression data of the two libraries, the study provided some useful insights into signal transduction pathways in oysters and offered a number of candidate genes as potential markers of tolerance to hypoosmotic stress for oysters. In addition, the characterization of the C. gigas transcriptome will not only provide a better understanding of the molecular mechanisms underlying the response of oysters to osmotic stress, but also facilitate research into biological processes to find underlying physiological adaptations to hypoosmotic shock for marine invertebrates. PMID:23029449
Withey, Jeffrey H; DiRita, Victor J
2005-05-01
The Gram-negative bacterium Vibrio cholerae is the infectious agent responsible for the disease Asiatic cholera. The genes required for V. cholerae virulence, such as those encoding the cholera toxin (CT) and toxin-coregulated pilus (TCP), are controlled by a cascade of transcriptional activators. Ultimately, the direct transcriptional activator of the majority of V. cholerae virulence genes is the AraC/XylS family member ToxT protein, the expression of which is activated by the ToxR and TcpP proteins. Previous studies have identified the DNA sites to which ToxT binds upstream of the ctx operon, encoding CT, and the tcpA operon, encoding, among other products, the major subunit of the TCP. These known ToxT binding sites are seemingly dissimilar in sequence other than being A/T rich. Further results suggested that ctx and tcpA each has a pair of ToxT binding sites arranged in a direct repeat orientation upstream of the core promoter elements. In this work, using both transcriptional lacZ fusions and in vitro copper-phenanthroline footprinting experiments, we have identified the ToxT binding sites between the divergently transcribed acfA and acfD genes, which encode components of the accessory colonization factor required for efficient intestinal colonization by V. cholerae. Our results indicate that ToxT binds to a pair of DNA sites between acfA and acfD in an inverted repeat orientation. Moreover, a mutational analysis of the ToxT binding sites indicates that both binding sites are required by ToxT for transcriptional activation of both acfA and acfD. Using copper-phenanthroline footprinting to assess the occupancy of ToxT on DNA having mutations in one of these binding sites, we found that protection by ToxT of the unaltered binding site was not affected, whereas protection by ToxT of the mutant binding site was significantly reduced in the region of the mutations. 
The results of further footprinting experiments using DNA templates having +5 bp and +10 bp insertions between the two ToxT binding sites indicate that both binding sites are occupied by ToxT regardless of their positions relative to each other. Based on these results, we propose that ToxT binds independently to two DNA sites between acfA and acfD to activate transcription of both genes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moses, Alan M.; Chiang, Derek Y.; Pollard, Daniel A.
2004-10-28
We introduce a method (MONKEY) to identify conserved transcription-factor binding sites in multispecies alignments. MONKEY employs probabilistic models of factor specificity and binding site evolution, on which basis we compute the likelihood that putative sites are conserved and assign statistical significance to each hit. Using genomes from the genus Saccharomyces, we illustrate how the significance of real sites increases with evolutionary distance and explore the relationship between conservation and function.
Chertkova, Aleksandra A; Schiffman, Joshua S; Nuzhdin, Sergey V; Kozlov, Konstantin N; Samsonova, Maria G; Gursky, Vitaly V
2017-02-07
Cis-regulatory sequences are often composed of many low-affinity transcription factor binding sites (TFBSs). Determining the evolutionary and functional importance of regulatory sequence composition is impeded without a detailed knowledge of the genotype-phenotype map. We simulate the evolution of regulatory sequences involved in Drosophila melanogaster embryo segmentation during early development. Natural selection evaluates gene expression dynamics produced by a computational model of the developmental network. We observe a dramatic decrease in the total number of transcription factor binding sites through the course of evolution. Despite a decrease in average sequence binding energies through time, the regulatory sequences tend towards organisations containing increased high affinity transcription factor binding sites. Additionally, the binding energies of separate sequence segments demonstrate ubiquitous mutual correlations through time. Fewer than 10% of initial TFBSs are maintained throughout the entire simulation, deemed 'core' sites. These sites have increased functional importance as assessed under wild-type conditions and their binding energy distributions are highly conserved. Furthermore, TFBSs within close proximity of core sites exhibit increased longevity, reflecting functional regulatory interactions with core sites. In response to elevated mutational pressure, evolution tends to sample regulatory sequence organisations with fewer, albeit on average, stronger functional transcription factor binding sites. These organisations are also shaped by the regulatory interactions among core binding sites with sites in their local vicinity.
Ndah, Elvis; Jonckheere, Veronique
2017-01-01
Proteogenomics is an emerging research field yet lacking a uniform method of analysis. Proteogenomic studies in which N-terminal proteomics and ribosome profiling are combined, suggest that a high number of protein start sites are currently missing in genome annotations. We constructed a proteogenomic pipeline specific for the analysis of N-terminal proteomics data, with the aim of discovering novel translational start sites outside annotated protein coding regions. In summary, unidentified MS/MS spectra were matched to a specific N-terminal peptide library encompassing protein N termini encoded in the Arabidopsis thaliana genome. After a stringent false discovery rate filtering, 117 protein N termini compliant with N-terminal methionine excision specificity and indicative of translation initiation were found. These include N-terminal protein extensions and translation from transposable elements and pseudogenes. Gene prediction provided supporting protein-coding models for approximately half of the protein N termini. Besides the prediction of functional domains (partially) contained within the newly predicted ORFs, further supporting evidence of translation was found in the recently released Araport11 genome re-annotation of Arabidopsis and computational translations of sequences stored in public repositories. Most interestingly, complementary evidence by ribosome profiling was found for 23 protein N termini. Finally, by analyzing protein N-terminal peptides, an in silico analysis demonstrates the applicability of our N-terminal proteogenomics strategy in revealing protein-coding potential in species with well- and poorly-annotated genomes. PMID:28432195
Willems, Patrick; Ndah, Elvis; Jonckheere, Veronique; Stael, Simon; Sticker, Adriaan; Martens, Lennart; Van Breusegem, Frank; Gevaert, Kris; Van Damme, Petra
2017-06-01
Proteogenomics is an emerging research field yet lacking a uniform method of analysis. Proteogenomic studies in which N-terminal proteomics and ribosome profiling are combined, suggest that a high number of protein start sites are currently missing in genome annotations. We constructed a proteogenomic pipeline specific for the analysis of N-terminal proteomics data, with the aim of discovering novel translational start sites outside annotated protein coding regions. In summary, unidentified MS/MS spectra were matched to a specific N-terminal peptide library encompassing protein N termini encoded in the Arabidopsis thaliana genome. After a stringent false discovery rate filtering, 117 protein N termini compliant with N-terminal methionine excision specificity and indicative of translation initiation were found. These include N-terminal protein extensions and translation from transposable elements and pseudogenes. Gene prediction provided supporting protein-coding models for approximately half of the protein N termini. Besides the prediction of functional domains (partially) contained within the newly predicted ORFs, further supporting evidence of translation was found in the recently released Araport11 genome re-annotation of Arabidopsis and computational translations of sequences stored in public repositories. Most interestingly, complementary evidence by ribosome profiling was found for 23 protein N termini. Finally, by analyzing protein N-terminal peptides, an in silico analysis demonstrates the applicability of our N-terminal proteogenomics strategy in revealing protein-coding potential in species with well- and poorly-annotated genomes. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
Zhang, Jia; Yang, Ming-Kun; Zeng, Honghui; Ge, Feng
2016-11-01
Although the number of sequenced prokaryotic genomes is growing rapidly, experimentally verified annotation of prokaryotic genome remains patchy and challenging. To facilitate genome annotation efforts for prokaryotes, we developed an open source software called GAPP for genome annotation and global profiling of post-translational modifications (PTMs) in prokaryotes. With a single command, it provides a standard workflow to validate and refine predicted genetic models and discover diverse PTM events. We demonstrated the utility of GAPP using proteomic data from Helicobacter pylori, one of the major human pathogens that is responsible for many gastric diseases. Our results confirmed 84.9% of the existing predicted H. pylori proteins, identified 20 novel protein coding genes, and corrected four existing gene models with regard to translation initiation sites. In particular, GAPP revealed a large repertoire of PTMs using the same proteomic data and provided a rich resource that can be used to examine the functions of reversible modifications in this human pathogen. This software is a powerful tool for genome annotation and global discovery of PTMs and is applicable to any sequenced prokaryotic organism; we expect that it will become an integral part of ongoing genome annotation efforts for prokaryotes. GAPP is freely available at https://sourceforge.net/projects/gappproteogenomic/. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
Tsai, Keng-Chang; Jian, Jhih-Wei; Yang, Ei-Wen; Hsu, Po-Chiang; Peng, Hung-Pin; Chen, Ching-Tai; Chen, Jun-Bo; Chang, Jeng-Yih; Hsu, Wen-Lian; Yang, An-Suei
2012-01-01
Non-covalent protein-carbohydrate interactions mediate molecular targeting in many biological processes. Prediction of non-covalent carbohydrate binding sites on protein surfaces not only provides insights into the functions of the query proteins; information on key carbohydrate-binding residues could suggest site-directed mutagenesis experiments, design therapeutics targeting carbohydrate-binding proteins, and provide guidance in engineering protein-carbohydrate interactions. In this work, we show that non-covalent carbohydrate binding sites on protein surfaces can be predicted with relatively high accuracy when the query protein structures are known. The prediction capabilities were based on a novel encoding scheme of the three-dimensional probability density maps describing the distributions of 36 non-covalent interacting atom types around protein surfaces. One machine learning model was trained for each of the 30 protein atom types. The machine learning algorithms predicted tentative carbohydrate binding sites on query proteins by recognizing the characteristic interacting atom distribution patterns specific for carbohydrate binding sites from known protein structures. The prediction results for all protein atom types were integrated into surface patches as tentative carbohydrate binding sites based on normalized prediction confidence level. The prediction capabilities of the predictors were benchmarked by a 10-fold cross validation on 497 non-redundant proteins with known carbohydrate binding sites. The predictors were further tested on an independent test set with 108 proteins. The residue-based Matthews correlation coefficient (MCC) for the independent test was 0.45, with prediction precision and sensitivity (or recall) of 0.45 and 0.49 respectively. In addition, 111 unbound carbohydrate-binding protein structures for which the structures were determined in the absence of the carbohydrate ligands were predicted with the trained predictors. 
The overall prediction MCC was 0.49. Independent tests on anti-carbohydrate antibodies showed that the carbohydrate antigen binding sites were predicted with comparable accuracy. These results demonstrate that the predictors are among the best in carbohydrate binding site predictions to date. PMID:22848404
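The residue-based figures quoted above (precision, sensitivity and MCC) all derive from a confusion matrix of predicted versus known binding residues. A minimal Python sketch of those formulas follows; the function name and the counts are hypothetical and chosen only for illustration, they are not the study's data.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Return (precision, recall, mcc) from confusion-matrix counts.

    tp/fp/tn/fn are counts of residues: predicted-binding and truly binding,
    predicted-binding but not binding, and so on.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally reported as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return precision, recall, mcc


# Hypothetical counts for a protein set with few binding residues.
precision, recall, mcc = classification_metrics(tp=50, fp=50, tn=850, fn=50)
print(f"precision={precision:.2f} recall={recall:.2f} MCC={mcc:.2f}")
```

Because binding residues are a small minority of all surface residues, MCC is a more informative summary than raw accuracy, which is presumably why the abstract reports it.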
DOE Office of Scientific and Technical Information (OSTI.GOV)
D'Amato, R.J.; Largent, B.L.; Snowman, A.M.
1987-07-01
Citalopram is a potent and selective inhibitor of neuronal serotonin uptake. In rat brain membranes [3H]citalopram demonstrates saturable and reversible binding with a KD of 0.8 nM and a maximal number of binding sites (Bmax) of 570 fmol/mg of protein. The drug specificity for [3H]citalopram binding and synaptosomal serotonin uptake are closely correlated. Inhibition of [3H]citalopram binding by both serotonin and imipramine is consistent with a competitive interaction in both equilibrium and kinetic analyses. The autoradiographic pattern of [3H]citalopram binding sites closely resembles the distribution of serotonin. By contrast, detailed equilibrium-saturation analysis of [3H]imipramine binding reveals two binding components, i.e., high affinity (KD = 9 nM, Bmax = 420 fmol/mg of protein) and low affinity (KD = 553 nM, Bmax = 8560 fmol/mg of protein) sites. Specific [3H]imipramine binding, defined as the binding inhibited by 100 microM desipramine, is displaced only partially by serotonin. Various studies reveal that the serotonin-sensitive portion of binding corresponds to the high affinity sites of [3H]imipramine binding whereas the serotonin-insensitive binding corresponds to the low affinity sites. Lesioning of serotonin neurons with p-chloroamphetamine causes a large decrease in [3H]citalopram and serotonin-sensitive [3H]imipramine binding with only a small effect on serotonin-insensitive [3H]imipramine binding. The dissociation rate of [3H]imipramine or [3H]citalopram is not altered by citalopram, imipramine or serotonin up to concentrations of 10 microM. The regional distribution of serotonin sensitive [3H]imipramine high affinity binding sites closely resembles that of [3H]citalopram binding.
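The one-site and two-site saturation results above can be illustrated with a short Python sketch. It assumes the standard model of independent, non-interacting sites, where each site contributes Bmax × L / (KD + L); the abstract does not state which fitting model was used, and the function and constant names are hypothetical. Only the KD and Bmax values are taken from the abstract.

```python
def specific_binding(ligand_nM, sites):
    """Total specific binding (fmol/mg protein) at a free ligand concentration.

    `sites` is a list of (KD in nM, Bmax in fmol/mg protein) pairs; each site
    is assumed to follow a simple Langmuir isotherm, Bmax * L / (KD + L).
    """
    return sum(bmax * ligand_nM / (kd + ligand_nM) for kd, bmax in sites)


# Parameters quoted in the abstract.
CITALOPRAM_SITES = [(0.8, 570.0)]                  # single class of sites
IMIPRAMINE_SITES = [(9.0, 420.0), (553.0, 8560.0)]  # high and low affinity

for conc in (1.0, 10.0, 100.0, 1000.0):
    print(conc,
          round(specific_binding(conc, CITALOPRAM_SITES), 1),
          round(specific_binding(conc, IMIPRAMINE_SITES), 1))
```

Under this assumed model, half of the citalopram sites are occupied at a ligand concentration equal to the KD, whereas imipramine binding keeps rising at high concentrations because the low-affinity component dominates the total capacity.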
Jiang, Peng; Singh, Mona; Coller, Hilary A
2013-01-01
Transcript degradation is a widespread and important mechanism for regulating protein abundance. Two major regulators of transcript degradation are RNA Binding Proteins (RBPs) and microRNAs (miRNAs). We computationally explored whether RBPs and miRNAs cooperate to promote transcript decay. We defined five RBP motifs based on the evolutionary conservation of their recognition sites in 3'UTRs as the binding motifs for Pumilio (PUM), U1A, Fox-1, Nova, and UAUUUAU. Recognition sites for some of these RBPs tended to localize at the end of long 3'UTRs. A specific group of miRNA recognition sites were enriched within 50 nts from the RBP recognition sites for PUM and UAUUUAU. The presence of both a PUM recognition site and a recognition site for preferentially co-occurring miRNAs was associated with faster decay of the associated transcripts. For PUM and its co-occurring miRNAs, binding of the RBP to its recognition sites was predicted to release nearby miRNA recognition sites from RNA secondary structures. The mammalian miRNAs that preferentially co-occur with PUM binding sites have recognition seeds that are reverse complements to the PUM recognition motif. Their binding sites have the potential to form hairpin secondary structures with proximal PUM binding sites that would normally limit RISC accessibility, but would be more accessible to miRNAs in response to the binding of PUM. In sum, our computational analyses suggest that a specific set of RBPs and miRNAs work together to affect transcript decay, with the rescue of miRNA recognition sites via RBP binding as one possible mechanism of cooperativity.
Stapleton, Brian; Walker, Lawrence R; Logan, Timothy M
2013-03-19
Thermodynamic measurements of Fe(II) binding and activation of repressor function in the iron-dependent repressor from Mycobacterium tuberculosis (IdeR) are reported. IdeR, a member of the diphtheria toxin repressor family of proteins, regulates iron homeostasis and contributes to the virulence response in M. tuberculosis. Although iron is the physiological ligand, this is the first detailed analysis of iron binding and activation in this protein. The results showed that IdeR binds 2 equiv of Fe(II) with dissociation constants that differ by a factor of 25. The high- and low-affinity iron binding sites were assigned to physical binding sites I and II, respectively, using metal binding site mutants. IdeR was also found to contain a high-affinity Zn(II) binding site that was assigned to physical metal binding site II through the use of binding site mutants and metal competition assays. Fe(II) binding was modestly weaker in the presence of Zn(II), but the coupled metal binding-DNA binding affinity was significantly stronger, requiring 30-fold less Fe(II) to activate DNA binding compared to Fe(II) alone. Together, these results suggest that IdeR is a mixed-metal repressor, where Zn(II) acts as a structural metal and Fe(II) acts to trigger the physiologically relevant promoter binding. This new model for IdeR activation provides a better understanding of IdeR and the biology of iron homeostasis in M. tuberculosis.
Tam, S W; Cook, L
1984-01-01
The relationship between binding of antipsychotic drugs and sigma psychotomimetic opiates to binding sites for the sigma agonist (+)-[3H]SKF 10,047 (N-allylnormetazocine) and to dopamine D2 sites was investigated. In guinea pig brain membranes, (+)-[3H]SKF 10,047 bound to a single class of sites with a Kd of 4 × 10⁻⁸ M and a Bmax of 333 fmol/mg of protein. This binding was different from mu, kappa, or delta opiate receptor binding. It was inhibited by opiates that produce psychotomimetic activities but not by opiates that lack such activities. Some antipsychotic drugs inhibited (+)-[3H]SKF 10,047 binding with high to moderate affinities in the following order of potency: haloperidol > perphenazine > fluphenazine > acetophenazine > trifluoperazine > molindone ≥ pimozide ≥ thioridazine ≥ chlorpromazine ≥ triflupromazine. However, there were other antipsychotic drugs such as spiperone and clozapine that showed low affinity for the (+)-[3H]SKF 10,047 binding sites. Affinities of antipsychotic drugs for (+)-[3H]SKF 10,047 binding sites did not correlate with those for [3H]spiperone (dopamine D2) sites. [3H]-Haloperidol binding in whole brain membranes was also inhibited by the sigma opiates pentazocine, cyclazocine, and (+)-SKF 10,047. In the striatum, about half of the saturable [3H]haloperidol binding was to [3H]spiperone (D2) sites and the other half was to sites similar to (+)-[3H]SKF 10,047 binding sites. PMID:6147851
Uncoupling metallonuclease metal ion binding sites via nudge mutagenesis.
Papadakos, Grigorios A; Nastri, Horacio; Riggs, Paul; Dupureur, Cynthia M
2007-05-01
The hydrolysis of phosphodiester bonds by nucleases is critical to nucleic acid processing. Many nucleases utilize metal ion cofactors, and for a number of these enzymes two active-site metal ions have been detected. Testing proposed mechanistic roles for individual bound metal ions has been hampered by the similarity between the sites and cooperative behavior. In the homodimeric PvuII restriction endonuclease, the metal ion dependence of DNA binding is sigmoidal and consistent with two classes of coupled metal ion binding sites. We reasoned that a conservative active-site mutation would perturb the ligand field sufficiently to observe the titration of individual metal ion binding sites without significantly disturbing enzyme function. Indeed, mutation of a Tyr residue 5.5 Å from both metal ions in the enzyme-substrate crystal structure (Y94F) renders the metal ion dependence of DNA binding biphasic: two classes of metal ion binding sites become distinct in the presence of DNA. The perturbation in metal ion coordination is supported by 1H-15N heteronuclear single quantum coherence spectra of enzyme-Ca(II) and enzyme-Ca(II)-DNA complexes. Metal ion binding by free Y94F is basically unperturbed: through multiple experiments with different metal ions, the data are consistent with two alkaline earth metal ion binding sites per subunit of low millimolar affinity, behavior which is very similar to that of the wild type. The results presented here indicate a role for the hydroxyl group of Tyr94 in the coupling of metal ion binding sites in the presence of DNA. Its removal causes the affinities for the two metal ion binding sites to be resolved in the presence of substrate. Such tuning of metal ion affinities will be invaluable to efforts to ascertain the contributions of individual bound metal ions to metallonuclease function.
Identification of the heparin binding site on adeno-associated virus serotype 3B (AAV-3B)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lerch, Thomas F.; Chapman, Michael S.
2012-02-05
Adeno-associated virus is a promising vector for gene therapy. In the current study, the binding site on AAV serotype 3B for the heparan sulfate proteoglycan (HSPG) receptor has been characterized. X-ray diffraction identified a disaccharide binding site at the most positively charged region on the virus surface. The contributions of basic amino acids at this and other sites were characterized using site-directed mutagenesis. Both heparin and cell binding are correlated to positive charge at the disaccharide binding site, and transduction is significantly decreased in AAV-3B vectors mutated at this site to reduce heparin binding. While the receptor attachment sites of AAV-3B and AAV-2 are both in the general vicinity of the viral spikes, the exact amino acids that participate in electrostatic interactions are distinct. Diversity in the mechanisms of cell attachment by AAV serotypes will be an important consideration for the rational design of improved gene therapy vectors.
Identification of the heparin binding site on adeno-associated virus serotype 3B (AAV-3B)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lerch, Thomas F.; Chapman, Michael S.
2012-05-24
Adeno-associated virus is a promising vector for gene therapy. In the current study, the binding site on AAV serotype 3B for the heparan sulfate proteoglycan (HSPG) receptor has been characterized. X-ray diffraction identified a disaccharide binding site at the most positively charged region on the virus surface. The contributions of basic amino acids at this and other sites were characterized using site-directed mutagenesis. Both heparin and cell binding are correlated to positive charge at the disaccharide binding site, and transduction is significantly decreased in AAV-3B vectors mutated at this site to reduce heparin binding. While the receptor attachment sites of AAV-3B and AAV-2 are both in the general vicinity of the viral spikes, the exact amino acids that participate in electrostatic interactions are distinct. Diversity in the mechanisms of cell attachment by AAV serotypes will be an important consideration for the rational design of improved gene therapy vectors.
Location of Bromide Ions in Tetragonal Lysozyme Crystals
NASA Technical Reports Server (NTRS)
Lim, Kap; Nadarajah, Arunan; Forsythe, Elizabeth L.; Pusey, Marc L.
1998-01-01
Anions have been shown to play a dominant role in the crystallization of chicken egg white lysozyme from salt solutions. Previous studies employing X-ray crystallography had found one chloride ion binding site in the tetragonal crystal form of the protein and four nitrate ion binding sites in the monoclinic form. In this study the anion positions in the tetragonal form were determined from the difference Fourier map obtained from lysozyme crystals grown in bromide and chloride solutions. Five possible anion binding sites were found in this manner. Some of these sites were in pockets containing basic residues while others were near neutral, but polar, residues. The sole chloride ion binding site found in previous studies was confirmed, while four of these sites corresponded to the four binding sites found for nitrate ions in monoclinic crystals. The study suggests that most of the anion binding sites in lysozyme remain unchanged, even when different anions and different crystal forms of lysozyme are employed.
Locations of Bromide Ions in Tetragonal Lysozyme Crystals
NASA Technical Reports Server (NTRS)
Lim, Kap; Nadarajah, Arunan; Forsythe, Elizabeth L.; Pusey, Marc L.
1998-01-01
Anions have been shown to play a dominant role in the crystallization of chicken egg-white lysozyme from salt solutions. Previous studies employing X-ray crystallography have found one chloride ion binding site in the tetragonal crystal form of the protein and four nitrate ion binding sites in the monoclinic form. In this study the anion positions in the tetragonal form were determined from the difference Fourier map obtained from lysozyme crystals grown in bromide and chloride solutions. Five possible anion-binding sites were found in this manner. Some of these sites were in pockets containing basic residues while others were near neutral, but polar, residues. The sole chloride ion binding site found in previous studies was confirmed, while four further sites were found which corresponded to the four binding sites found for nitrate ions in monoclinic crystals. The study suggests that most of the anion-binding sites in lysozyme remain unchanged even when different anions and different crystal forms of lysozyme are employed.
Discovering amino acid patterns on binding sites in protein complexes
Kuo, Huang-Cheng; Ong, Ping-Lin; Lin, Jung-Chang; Huang, Jen-Peng
2011-01-01
Discovering amino acid (AA) patterns on protein binding sites has recently become popular. We propose a method to discover the association relationship among AAs on binding sites. Such knowledge of binding sites is very helpful in predicting protein-protein interactions. In this paper, we focus on protein complexes which have protein-protein recognition. The association rule mining technique is used to discover geographically adjacent amino acids on a binding site of a protein complex. When mining, instead of treating all AAs of binding sites as a transaction, we geographically partition AAs of binding sites in a protein complex. AAs in a partition are treated as a transaction. For the partition process, AAs on a binding site are projected from three-dimensional to two-dimensional. And then, assisted with a circular grid, AAs on the binding site are placed into grid cells. A circular grid has ten rings: a central ring, the second ring with 6 sectors, the third ring with 12 sectors, and later rings are added to four sectors in order. As for the radius of each ring, we examined the complexes and found that 10Å is a suitable range, which can be set by the user. After placing these recognition complexes on the circular grid, we obtain mining records (i.e. transactions) from each sector. A sector is regarded as a record. Finally, we use the association rule to mine these records for frequent AA patterns. If the support of an AA pattern is larger than the predetermined minimum support (i.e. threshold), it is called a frequent pattern. With these discovered patterns, we offer the biologists a novel point of view, which will improve the prediction accuracy of protein-protein recognition. In our experiments, we produced the AA patterns by data mining. As a result, we found that arginine (arg) most frequently appears on the binding sites of two proteins in the recognition protein complexes, while cysteine (cys) appears the fewest. 
In addition, if we further distinguish concave from convex binding-site shapes, we discover that the patterns {arg, glu, asp} and {arg, ser, asp} on the concave surface of a binding site in one protein more frequently (i.e. with higher probability) make contact with {lys} or {arg} on the convex surface of a binding site in another protein, with a confidence of at least 78%. On the other hand, {val, gly, lys} on the convex surface of binding sites is more frequently in contact with {asp} on the concave site of another protein, with a confidence of over 81%. Applying data mining in biology can reveal facts that may otherwise be ignored or not easily discovered by the naked eye. Furthermore, we can discover more relationships among AAs on binding sites by appropriately rotating these residues from a three-dimensional to a two-dimensional perspective. We designed a circular grid to hold the data, which total 463 records consisting of AAs. Then we used association rules to mine these records for relationships. The proposed method provides an insight into the characteristics of binding sites for recognition complexes. PMID:21464838
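The support and confidence measures that the abstract relies on can be sketched in a few lines of Python. This is a simplified illustration, not the authors' pipeline: each grid sector is treated as a plain set of residue names, the concave/convex pairing between two proteins is ignored, and the toy records and function names are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) over the transactions."""
    antecedent = frozenset(antecedent)
    joint = antecedent | frozenset(consequent)
    base = support(transactions, antecedent)
    return support(transactions, joint) / base if base else 0.0


# Toy records: one set of amino acids per grid sector (hypothetical data).
records = [frozenset(r) for r in (
    {"arg", "glu", "asp", "lys"},
    {"arg", "glu", "asp"},
    {"arg", "ser"},
    {"val", "gly", "lys", "asp"},
)]

min_support = 0.5  # hypothetical threshold
pattern = {"arg", "glu", "asp"}
if support(records, pattern) >= min_support:
    print("frequent pattern:", sorted(pattern))
print("confidence of pattern -> lys:", confidence(records, pattern, {"lys"}))
```

A pattern is called frequent when its support reaches the chosen minimum, and a rule's confidence is the share of records containing the antecedent that also contain the consequent, which is the sense in which the 78% and 81% figures above are to be read.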
Duggin, Iain G; Matthews, Jacqueline M; Dixon, Nicholas E; Wake, R Gerry; Mackay, Joel P
2005-04-01
Two dimers of the replication terminator protein (RTP) of Bacillus subtilis bind to a chromosomal DNA terminator site to effect polar replication fork arrest. Cooperative binding of the dimers to overlapping half-sites within the terminator is essential for arrest. It was suggested previously that polarity of fork arrest is the result of the RTP dimer at the blocking (proximal) side within the complex binding very tightly and the permissive-side RTP dimer binding relatively weakly. In order to investigate this "differential binding affinity" model, we have constructed a series of mutant terminators that contain half-sites of widely different RTP binding affinities in various combinations. Although there appeared to be a correlation between binding affinity at the proximal half-site and fork arrest efficiency in vivo for some terminators, several deviated significantly from this correlation. Some terminators exhibited greatly reduced binding cooperativity (and therefore have reduced affinity at each half-site) but were highly efficient in fork arrest, whereas one terminator had normal affinity over the proximal half-site, yet had low fork arrest efficiency. The results show clearly that there is no direct correlation between the RTP binding affinity (either within the full complex or at the proximal half-site within the full complex) and the efficiency of replication fork arrest in vivo. Thus, the differential binding affinity over the proximal and distal half-sites cannot be solely responsible for functional polarity of fork arrest. Furthermore, efficient fork arrest relies on features in addition to the tight binding of RTP to terminator DNA.
Konuma, Tsuyoshi; Lee, Young-Ho; Goto, Yuji; Sakurai, Kazumasa
2013-01-01
Chemical shift perturbations (CSPs) in NMR spectra provide useful information about the interaction of a protein with its ligands. However, in a multiple-ligand-binding system, determining quantitative parameters such as a dissociation constant (K(d)) is difficult. Here, we used a method we named CS-PCA, a principal component analysis (PCA) of chemical shift (CS) data, to analyze the interaction between bovine β-lactoglobulin (βLG) and 1-anilinonaphthalene-8-sulfonate (ANS), which is a multiple-ligand-binding system. The CSP on the binding of ANS involved contributions from two distinct binding sites. PCA of the titration data successfully separated the CSP pattern into contributions from each site. Docking simulations based on the separated CSP patterns provided the structures of βLG-ANS complexes for each binding site. In addition, we determined the K(d) values as 3.42 × 10⁻⁴ M² and 2.51 × 10⁻³ M for Sites 1 and 2, respectively. In contrast, it was difficult to obtain reliable K(d) values for respective sites from the isothermal titration calorimetry experiments. Two ANS molecules were found to bind at Site 1 simultaneously, suggesting that the binding occurs cooperatively with a partial unfolding of the βLG structure. On the other hand, the binding of ANS to Site 2 was a simple attachment without a significant conformational change. From the present results, CS-PCA was confirmed to provide not only the positions and the K(d) values of binding sites but also information about the binding mechanism. Thus, it is anticipated to be a general method to investigate protein-ligand interactions. Copyright © 2012 Wiley Periodicals, Inc.
Thermodynamic compensation upon binding to exosite 1 and the active site of thrombin
Treuheit, Nicholas A.; Beach, Muneera A.; Komives, Elizabeth A.
2011-01-01
Several lines of experimental evidence including amide exchange and NMR suggest that ligands binding to thrombin cause reduced backbone dynamics. Binding of the covalent inhibitor dPhe-Pro-Arg chloromethylketone to the active site serine, as well as non-covalent binding of a fragment of the regulatory protein, thrombomodulin, to exosite 1 on the back side of the thrombin molecule both cause reduced dynamics. However, the reduced dynamics do not appear to be accompanied by significant conformational changes. In addition, binding of ligands to the active site does not change the affinity of thrombomodulin fragments binding to exosite 1; however, the thermodynamic coupling between exosite 1 and the active site has not been fully explored. We present isothermal titration calorimetry experiments that probe changes in enthalpy and entropy upon formation of binary ligand complexes. The approach relies on stringent thrombin preparation methods and on the use of dansyl-L-arginine-(3-methyl-1,5-pentanediyl) amide (DAPA) and a DNA aptamer as ligands with ideal thermodynamic signatures for binding to the active site and to exosite 1. Using this approach, the binding thermodynamic signatures of each ligand alone as well as the binding signatures of each ligand when the other binding site was occupied were measured. Different exosite 1 ligands with widely varied thermodynamic signatures cause the same reduction in ΔH and a concomitantly lower entropy cost upon DAPA binding at the active site. The results suggest a general phenomenon of enthalpy-entropy compensation consistent with reduction of dynamics/increased folding of thrombin upon ligand binding to either the active site or to exosite 1. PMID:21526769
Using Carbohydrate Interaction Assays to Reveal Novel Binding Sites in Carbohydrate Active Enzymes
Cockburn, Darrell; Wilkens, Casper; Dilokpimol, Adiphol; Nakai, Hiroyuki; Lewińska, Anna; Abou Hachem, Maher; Svensson, Birte
2016-01-01
Carbohydrate active enzymes often contain auxiliary binding sites located either on independent domains termed carbohydrate binding modules (CBMs) or as so-called surface binding sites (SBSs) on the catalytic module at a certain distance from the active site. The SBSs are usually critical for the activity of their cognate enzyme, though they are not readily detected in the sequence of a protein, but normally require a crystal structure of a complex for their identification. A variety of methods, including affinity electrophoresis (AE), insoluble polysaccharide pulldown (IPP) and surface plasmon resonance (SPR) have been used to study auxiliary binding sites. These techniques are complementary as AE allows monitoring of binding to soluble polysaccharides, IPP to insoluble polysaccharides and SPR to oligosaccharides. Here we show that these methods are useful not only for analyzing known binding sites, but also for identifying new ones, even without structural data available. We further verify the chosen assays discriminate between known SBS/CBM containing enzymes and negative controls. Altogether 35 enzymes are screened for the presence of SBSs or CBMs and several novel binding sites are identified, including the first SBS ever reported in a cellulase. This work demonstrates that combinations of these methods can be used as a part of routine enzyme characterization to identify new binding sites and advance the study of SBSs and CBMs, allowing them to be detected in the absence of structural data. PMID:27504624
Volatile anesthetics compete for common binding sites on bovine serum albumin: a 19F-NMR study.
Dubois, B W; Cherian, S F; Evers, A S
1993-01-01
There is controversy as to the molecular nature of volatile anesthetic target sites. One proposal is that volatile anesthetics bind directly to hydrophobic binding sites on certain sensitive target proteins. Consistent with this hypothesis, we have previously shown that a fluorinated volatile anesthetic, isoflurane, binds saturably [Kd (dissociation constant) = 1.4 ± 0.2 mM, Bmax = 4.2 ± 0.3 sites] to fatty acid-displaceable domains on serum albumin. In the current study, we used 19F-NMR T2 relaxation to examine whether other volatile anesthetics bind to the same sites on albumin and, if so, whether they vary in their affinity for these sites. We show that three other fluorinated volatile anesthetics bind with varying affinity to fatty acid-displaceable domains on serum albumin: halothane, Kd = 1.3 ± 0.2 mM; methoxyflurane, Kd = 2.6 ± 0.3 mM; and sevoflurane, Kd = 4.5 ± 0.6 mM. These three anesthetics inhibit isoflurane binding in a competitive manner: halothane, K(i) (inhibition constant) = 1.3 ± 0.2 mM; methoxyflurane, K(i) = 2.5 ± 0.4 mM; and sevoflurane, K(i) = 5.4 ± 0.7 mM, similar to each anesthetic's respective Kd of binding to fatty acid-displaceable sites. These results illustrate that a variety of volatile anesthetics can compete for binding to specific sites on a protein. PMID:8341659
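The competitive behaviour reported above can be sketched with the standard single-site competitive-inhibition expression, in which an inhibitor raises the apparent Kd by the factor (1 + [I]/Ki). The function name and the example concentrations are hypothetical; the Kd and K(i) values are taken from the abstract, and the sketch assumes free ligand concentrations are approximately equal to the totals.

```python
def fraction_bound(ligand_mM, kd_mM, inhibitor_mM=0.0, ki_mM=None):
    """Fractional occupancy of one class of sites with a competitive inhibitor.

    occupancy = [L] / (Kd * (1 + [I]/Ki) + [L])
    """
    if inhibitor_mM and ki_mM is None:
        raise ValueError("ki_mM is required when an inhibitor is present")
    kd_apparent = kd_mM * (1.0 + inhibitor_mM / ki_mM) if inhibitor_mM else kd_mM
    return ligand_mM / (kd_apparent + ligand_mM)


# Isoflurane alone at a concentration equal to its Kd (1.4 mM): half occupancy.
alone = fraction_bound(1.4, kd_mM=1.4)

# Hypothetical case: halothane present at its K(i) (1.3 mM) doubles the
# apparent Kd, so 2.8 mM isoflurane is needed to reach the same occupancy.
with_halothane = fraction_bound(2.8, kd_mM=1.4, inhibitor_mM=1.3, ki_mM=1.3)
```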
Characterization of melatonin binding sites in the Harderian gland and median eminence of the rat
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lopez-Gonzalez, M.A.; Calvo, J.R.; Rubio, A.
The characterization of specific melatonin binding sites in the Harderian gland (HG) and median eminence (ME) of the rat was studied using [¹²⁵I]melatonin. Binding of melatonin to membrane crude preparations of both tissues was dependent on time and temperature. Thus, maximal binding was obtained at 37°C after 30-60 min incubation. Binding was also dependent on protein concentration. The specific binding of [¹²⁵I]melatonin was saturable, exhibiting only one class of binding sites in both tissues. The dissociation constants (Kd) were 170 and 190 pM for ME and HG, respectively. The concentration of the binding sites in ME was 8 fmol/mg protein, and in the HG 4 fmol/mg protein. In competition studies, binding of [¹²⁵I]melatonin to ME or HG was inhibited by increasing concentration of native melatonin; 50% inhibition was observed at about 702 and 422 nM for ME and HG, respectively. Additionally, the [¹²⁵I]melatonin binding to the crude membranes was not affected by the addition of different drugs such as norepinephrine, isoproterenol, phenylephrine, propranolol, or prazosin. The results confirm the presence of melatonin binding sites in median eminence and show, for the first time, the existence of melatonin binding sites in the Harderian gland.
In vivo binding of PRDM9 reveals interactions with noncanonical genomic sites
Grey, Corinne; Clément, Julie A.J.; Buard, Jérôme; Leblanc, Benjamin; Gut, Ivo; Gut, Marta; Duret, Laurent
2017-01-01
In mouse and human meiosis, DNA double-strand breaks (DSBs) initiate homologous recombination and occur at specific sites called hotspots. The localization of these sites is determined by the sequence-specific DNA binding domain of the PRDM9 histone methyl transferase. Here, we performed an extensive analysis of PRDM9 binding in mouse spermatocytes. Unexpectedly, we identified a noncanonical recruitment of PRDM9 to sites that lack recombination activity and the PRDM9 binding consensus motif. These sites include gene promoters, where PRDM9 is recruited in a DSB-dependent manner. Another subset reveals DSB-independent interactions between PRDM9 and genomic sites, such as the binding sites for the insulator protein CTCF. We propose that these DSB-independent sites result from interactions between hotspot-bound PRDM9 and genomic sequences located on the chromosome axis. PMID:28336543
DOE Office of Scientific and Technical Information (OSTI.GOV)
Biegon, A.; Rainbow, T.C.
1983-05-01
The high affinity binding sites for the antidepressant desmethylimipramine (DMI) have been localized in rat brain by quantitative autoradiography. There are high concentrations of binding sites in the locus ceruleus, the anterior ventral thalamus, the ventral portion of the bed nucleus of the stria terminalis, the paraventricular and the dorsomedial nuclei of the hypothalamus. The distribution of DMI binding sites is in striking accord with the distribution of norepinephrine terminals. Pretreatment of rats with the neurotoxin 6-hydroxydopamine, which causes a selective degeneration of catecholamine terminals, results in a 60 to 90% decrease in DMI binding. These data support the idea that high affinity binding sites for DMI are located on presynaptic noradrenergic terminals.
Klingl, Stefan; Sandmann, Achim; Taccardi, Nicola; Sticht, Heinrich; Muller, Yves A.; Hensel, Michael
2017-01-01
The giant non-fimbrial adhesin SiiE of Salmonella enterica mediates the first contact to the apical site of epithelial cells and enables subsequent invasion. SiiE is a 595 kDa protein composed of 53 repetitive bacterial immunoglobulin (BIg) domains and the only known substrate of the SPI4-encoded type 1 secretion system (T1SS). The crystal structure of BIg50-52 of SiiE revealed two distinct Ca2+-binding sites per BIg domain formed by conserved aspartate or glutamate residues. In a mutational analysis Ca2+-binding sites were disrupted by aspartate to serine exchange at various positions in the BIg domains of SiiE. Amounts of secreted SiiE diminish with a decreasing number of intact Ca2+-binding sites. BIg domains of SiiE contain distinct Ca2+-binding sites, with type I sites being similar to other T1SS-secreted proteins and type II sites newly identified in SiiE. We functionally and structurally dissected the roles of type I and type II Ca2+-binding sites in SiiE, as well as the importance of Ca2+-binding sites in various positions of SiiE. Type I Ca2+-binding sites were critical for efficient secretion of SiiE and a decreasing number of type I sites correlated with reduced secretion. Type II sites were less important for secretion, stability and surface expression of SiiE, however integrity of type II sites in the C-terminal portion was required for the function of SiiE in mediating adhesion and invasion. PMID:28558023
Binding of N-methylscopolamine to the extracellular domain of muscarinic acetylcholine receptors
NASA Astrophysics Data System (ADS)
Jakubík, Jan; Randáková, Alena; Zimčík, Pavel; El-Fakahany, Esam E.; Doležal, Vladimír
2017-01-01
Interaction of orthosteric ligands with extracellular domain was described at several aminergic G protein-coupled receptors, including muscarinic acetylcholine receptors. The orthosteric antagonists quinuclidinyl benzilate (QNB) and N-methylscopolamine (NMS) bind to the binding pocket of the muscarinic acetylcholine receptor formed by transmembrane α-helices. We show that high concentrations of either QNB or NMS slow down dissociation of their radiolabeled species from all five subtypes of muscarinic acetylcholine receptors, suggesting allosteric binding. The affinity of NMS at the allosteric site is in the micromolar range for all receptor subtypes. Using molecular modelling of the M2 receptor we found that E172 and E175 in the second extracellular loop and N419 in the third extracellular loop are involved in allosteric binding of NMS. Mutation of these amino acids to alanine decreased affinity of NMS for the allosteric binding site confirming results of molecular modelling. The allosteric binding site of NMS overlaps with the binding site of some allosteric, ectopic and bitopic ligands. Understanding of interactions of NMS at the allosteric binding site is essential for correct analysis of binding and action of these ligands.
STUDIES OF VERAPAMIL BINDING TO HUMAN SERUM ALBUMIN BY HIGH-PERFORMANCE AFFINITY CHROMATOGRAPHY
Mallik, Rangan; Yoo, Michelle J.; Chen, Sike; Hage, David S.
2008-01-01
The binding of verapamil to the protein human serum albumin (HSA) was examined by using high-performance affinity chromatography. Many previous reports have investigated the binding of verapamil with HSA, but the exact strength and nature of this interaction (e.g., the number and location of binding sites) is still unclear. In this study, frontal analysis indicated that at least one major binding site was present for R- and S-verapamil on HSA, with estimated association equilibrium constants on the order of 10⁴ M⁻¹ and a 1.4-fold difference in these values for the verapamil enantiomers at pH 7.4 and 37°C. The presence of a second, weaker group of binding sites on HSA was also suggested by these results. Competitive binding studies using zonal elution were carried out between verapamil and various probe compounds that have known interactions with several major and minor sites on HSA. R/S-Verapamil was found to have direct competition with S-warfarin, indicating that verapamil was binding to Sudlow site I (i.e., the warfarin-azapropazone site of HSA). The average association equilibrium constant for R- and S-verapamil at this site was 1.4 (±0.1) × 10⁴ M⁻¹. Verapamil did not have any notable binding to Sudlow site II of HSA but did appear to have some weak allosteric interactions with L-tryptophan, a probe for this site. An allosteric interaction between verapamil and tamoxifen (a probe for the tamoxifen site) was also noted, which was consistent with the binding of verapamil at Sudlow site I. No interaction was seen between verapamil and digitoxin, a probe for the digitoxin site of HSA. These results gave good agreement with previous observations made in the literature and help provide a more detailed description of how verapamil is transported in blood and of how it may interact with other drugs in the body. PMID:18980867
Ho, Daniel W H; Sze, Karen M F; Ng, Irene O L
2015-08-28
Viral integration into the human genome upon infection is an important risk factor for various human malignancies. We developed a viral integration site detection tool called Virus-Clip, which makes use of information extracted from soft-clipped sequencing reads to identify exact positions of human and virus breakpoints of integration events. With initial read alignment to the virus reference genome and streamlined procedures, Virus-Clip delivers a simple, fast and memory-efficient solution to viral integration site detection. Moreover, it can also automatically annotate the integration events with the corresponding affected human genes. Virus-Clip has been verified using whole-transcriptome sequencing data and its detection was validated to have satisfactory sensitivity and specificity. Marked advancement in performance was detected, compared to existing tools. It is applicable to versatile types of data including whole-genome sequencing, whole-transcriptome sequencing, and targeted sequencing. Virus-Clip is available at http://web.hku.hk/~dwhho/Virus-Clip.zip.
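The idea of reading a breakpoint off a soft-clipped alignment can be sketched as follows. This is not Virus-Clip's code: the function names are hypothetical, and the sketch only assumes the SAM conventions that POS is the 1-based leftmost aligned reference position and that `S` operations in the CIGAR string mark read bases left out of the alignment. In a pipeline of the kind described, the clipped sequence would then be aligned to the other genome in a separate step.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    return [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]


def soft_clip_breakpoints(pos, cigar, seq):
    """Return (side, reference_breakpoint, clipped_sequence) for each soft clip.

    pos: 1-based leftmost aligned reference position (SAM POS field).
    seq: read sequence as stored in the SAM record (hard clips are absent).
    """
    # Hard-clipped bases are not present in SEQ, so drop them first.
    core = [(n, op) for n, op in parse_cigar(cigar) if op != "H"]
    found = []
    if not core:
        return found
    if core[0][1] == "S":
        # Left clip: the breakpoint is the first aligned reference base.
        found.append(("left", pos, seq[: core[0][0]]))
    if len(core) > 1 and core[-1][1] == "S":
        # Right clip: the breakpoint is the last aligned reference base.
        ref_len = sum(n for n, op in core if op in "MDN=X")
        found.append(("right", pos + ref_len - 1, seq[-core[-1][0]:]))
    return found


# Hypothetical read: 60 bases aligned to the virus, 40 bases soft-clipped.
example = soft_clip_breakpoints(100, "60M40S", "A" * 60 + "C" * 40)
```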
DOE Office of Scientific and Technical Information (OSTI.GOV)
Palacios, J.M.; Chinaglia, G.; Rigo, M.
1991-02-01
Autoradiographic techniques were used to examine the distribution and levels of neurotensin receptor binding sites in the basal ganglia and related regions of the human brain. Monoiodo [¹²⁵I-Tyr3]neurotensin was used as a ligand. High amounts of neurotensin receptor binding sites were found in the substantia nigra pars compacta. Lower but significant quantities of neurotensin receptor binding sites characterized the caudate, putamen, and nucleus accumbens, while very low quantities were seen in both medial and lateral segments of the globus pallidus. In Huntington's chorea, the levels of neurotensin receptor binding sites were found to be comparable to those of control cases. Only slight but not statistically significant decreases in amounts of receptor binding sites were detected in the dorsal part of the head and in the body of caudate nucleus. No alterations in the levels of neurotensin receptor binding sites were observed in the substantia nigra pars compacta and reticulata. These results suggest that a large proportion of neurotensin receptor binding sites in the basal ganglia are located on intrinsic neurons and on extrinsic afferent fibers that do not degenerate in Huntington's disease.
The Accuracy and Reliability of Crowdsource Annotations of Digital Retinal Images
Mitry, Danny; Zutis, Kris; Dhillon, Baljean; Peto, Tunde; Hayat, Shabina; Khaw, Kay-Tee; Morgan, James E.; Moncur, Wendy; Trucco, Emanuele; Foster, Paul J.
2016-01-01
Purpose Crowdsourcing is based on outsourcing computationally intensive tasks to numerous individuals in the online community who have no formal training. Our aim was to develop a novel online tool designed to facilitate large-scale annotation of digital retinal images, and to assess the accuracy of crowdsource grading using this tool, comparing it to expert classification. Methods We used 100 retinal fundus photograph images with predetermined disease criteria selected by two experts from a large cohort study. The Amazon Mechanical Turk Web platform was used to drive traffic to our site so anonymous workers could perform a classification and annotation task of the fundus photographs in our dataset after a short training exercise. Three groups were assessed: masters only, nonmasters only and nonmasters with compulsory training. We calculated the sensitivity, specificity, and area under the curve (AUC) of receiver operating characteristic (ROC) plots for all classifications compared to expert grading, and used the Dice coefficient and consensus threshold to assess annotation accuracy. Results In total, we received 5389 annotations for 84 images (excluding 16 training images) in 2 weeks. A specificity and sensitivity of 71% (95% confidence interval [CI], 69%–74%) and 87% (95% CI, 86%–88%) was achieved for all classifications. The AUC in this study for all classifications combined was 0.93 (95% CI, 0.91–0.96). For image annotation, a maximal Dice coefficient (∼0.6) was achieved with a consensus threshold of 0.25. Conclusions This study supports the hypothesis that annotation of abnormalities in retinal images by ophthalmologically naive individuals is comparable to expert annotation. The highest AUC and agreement with expert annotation was achieved in the nonmasters with compulsory training group. 
Translational Relevance The use of crowdsourcing as a technique for retinal image analysis may be comparable to expert graders and has the potential to deliver timely, accurate, and cost-effective image analysis. PMID:27668130
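The accuracy measures named in the abstract above (sensitivity, specificity, and a Dice coefficient computed against a consensus of crowd annotations) can be sketched as below. The pixel-set representation, the function names, and the toy data are assumptions for illustration; the study's own pipeline is not described in enough detail here to reproduce.

```python
from collections import Counter


def sensitivity_specificity(predicted, truth):
    """Sensitivity and specificity of binary labels against a reference."""
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p and t)
    tn = sum(1 for p, t in pairs if not p and not t)
    fp = sum(1 for p, t in pairs if p and not t)
    fn = sum(1 for p, t in pairs if not p and t)
    return tp / (tp + fn), tn / (tn + fp)


def dice(a, b):
    """Dice coefficient 2|A and B| / (|A| + |B|) of two pixel sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def consensus_mask(annotations, threshold):
    """Pixels marked by at least `threshold` (a fraction) of the annotators."""
    counts = Counter(px for ann in annotations for px in set(ann))
    n = len(annotations)
    return {px for px, c in counts.items() if c / n >= threshold}


# Hypothetical toy data: four crowd annotations and one expert annotation.
crowd = [{(0, 0), (0, 1)}, {(0, 1)}, {(0, 1), (1, 1)}, {(5, 5)}]
expert = {(0, 0), (0, 1)}

strict = consensus_mask(crowd, threshold=0.5)
score = dice(strict, expert)
sens, spec = sensitivity_specificity([1, 1, 0, 0, 1], [1, 0, 0, 0, 1])
```

Lowering the threshold admits pixels marked by fewer annotators, which trades precision of the consensus region for coverage.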
n-Dodecyl β-D-maltoside specifically competes with general anesthetics for anesthetic binding sites.
Xu, Longhe; Matsunaga, Felipe; Xi, Jin; Li, Min; Ma, Jingyuan; Liu, Renyu
2014-01-01
We recently demonstrated that the anionic detergent sodium dodecyl sulfate (SDS) specifically interacts with the anesthetic binding site in horse spleen apoferritin, a soluble protein which models anesthetic binding sites in receptors. This raises the possibility of other detergents similarly interacting with and occluding such sites from anesthetics, thereby preventing the proper identification of novel anesthetic binding sites. n-Dodecyl β-D-maltoside (DDM) is a non-ionic detergent commonly used during protein-anesthetic studies because of its mild and non-denaturing properties. In this study, we demonstrate that SDS and DDM occupy anesthetic binding sites in the model proteins human serum albumin (HSA) and horse spleen apoferritin and thereby inhibit the binding of the general anesthetics propofol and isoflurane. DDM specifically interacts with HSA (Kd = 40 μM) with a lower affinity than SDS (Kd = 2 μM). DDM exerts all these effects while not perturbing the native structures of either model protein. Computational calculations corroborated the experimental results by demonstrating that the binding sites for DDM and both anesthetics on the model proteins overlapped. Collectively, our results indicate that DDM and SDS specifically interact with anesthetic binding sites and may thus prevent the identification of novel anesthetic sites. Special precaution should be taken when undertaking and interpreting results from protein-anesthetic investigations utilizing detergents like SDS and DDM.
High-Affinity Quasi-Specific Sites in the Genome: How the DNA-Binding Proteins Cope with Them
Chakrabarti, J.; Chandra, Navin; Raha, Paromita; Roy, Siddhartha
2011-01-01
Many prokaryotic transcription factors home in on one or a few target sites in the presence of a huge number of nonspecific sites. Our analysis of λ-repressor in the Escherichia coli genome based on single basepair substitution experiments shows the presence of hundreds of sites having binding energy within 3 kcal/mol of the OR1 binding energy, and thousands of sites with binding energy above the nonspecific binding energy. The effect of such sites on DNA-based processes has not been fully explored. The presence of such sites dramatically lowers the occupation probability of the specific site far more than if the genome were composed of nonspecific sites only. Our Brownian dynamics studies show that the presence of quasi-specific sites results in very significant kinetic effects as well. In contrast to λ-repressor, the E. coli genome has orders of magnitude fewer quasi-specific sites for GalR, an integral transcription factor, thus causing little competition for the specific site. We propose that GalR and perhaps repressors of the same family have evolved binding modes that lead to much smaller numbers of quasi-specific sites to remove the untoward effects of genomic DNA. PMID:21889449
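The claim that quasi-specific sites lower the occupation probability of the specific site can be illustrated with a simple equilibrium sketch: a single protein molecule distributed over all sites with Boltzmann weights. This is an assumed textbook-style model, not the authors' calculation, and every energy and site count below is a hypothetical placeholder.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def specific_site_occupancy(dg_specific, competitors, temperature=298.15):
    """Probability that one protein molecule sits on the specific site.

    dg_specific: binding free energy of the specific site (kcal/mol, negative).
    competitors: iterable of (binding free energy, number of such sites).
    """
    rt = R_KCAL * temperature
    w_specific = math.exp(-dg_specific / rt)
    z = w_specific + sum(n * math.exp(-dg / rt) for dg, n in competitors)
    return w_specific / z


# Hypothetical numbers: one specific site at -12 kcal/mol and a genome of
# nonspecific sites at -5 kcal/mol, with or without 300 quasi-specific sites
# lying within 3 kcal/mol of the specific site.
nonspecific_only = specific_site_occupancy(-12.0, [(-5.0, 4_600_000)])
with_quasi = specific_site_occupancy(-12.0, [(-5.0, 4_600_000), (-10.0, 300)])
```

In this toy setting the second value is smaller than the first, which is the direction of the effect described in the abstract.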
Kinze, S; Schöneberg, T; Meyer, R; Martin, H; Kaufmann, R
1996-10-11
In this paper, cholecystokinin (CCK) B-type binding sites were characterized with receptor binding studies in different human brain regions (various parts of cerebral cortex, basal ganglia, hippocampus, thalamus, cerebellar cortex) collected from 22 human postmortem brains. With the exception of the thalamus, where no specific CCK binding sites were found, a pharmacological characterization demonstrated a single class of high affinity CCK sites in all brain areas investigated. Receptor densities ranged from 0.5 fmol/mg protein (hippocampus) to 8.4 fmol/mg protein (nucleus caudatus). These CCK binding sites displayed a typical CCKB binding profile as shown in competition studies by using different CCK-related compounds and nonpeptide CCK antagonists discriminating between CCKA and CCKB sites. The rank order of agonist or antagonist potency in inhibiting specific sulphated [propionyl-3H]cholecystokinin octapeptide binding was similar and highly correlated for the brain regions investigated as demonstrated by a computer-assisted analysis. Therefore it is concluded that CCKB binding sites in human cerebral cortex, basal ganglia, and cerebellar cortex share identical ligand binding characteristics.
Six independent fucose-binding sites in the crystal structure of Aspergillus oryzae lectin
DOE Office of Scientific and Technical Information (OSTI.GOV)
Makyio, Hisayoshi; Shimabukuro, Junpei; Suzuki, Tatsuya
The crystal structure of AOL (a fucose-specific lectin of Aspergillus oryzae) has been solved by SAD (single-wavelength anomalous diffraction) and MAD (multi-wavelength anomalous diffraction) phasing of seleno-fucosides. The overall structure is a six-bladed β-propeller similar to that of other fucose-specific lectins. The fucose moieties of the seleno-fucosides are located in six fucose-binding sites. Although the Arg and Glu/Gln residues bound to the fucose moiety are common to all fucose-binding sites, the amino-acid residues involved in fucose binding at each site are not identical. The varying peak heights of the seleniums in the electron density map suggest that each fucose-binding site has a different carbohydrate binding affinity. Highlights: • The six-bladed β-propeller structure of AOL was solved by seleno-sugar phasing. • The mode of fucose binding is essentially conserved at all six binding sites. • The seleno-fucosides exhibit slightly different interactions and electron densities. • These findings suggest that the affinity for fucose is not identical at each site.
Binding Leverage as a Molecular Basis for Allosteric Regulation
Mitternacht, Simon; Berezovsky, Igor N.
2011-01-01
Allosteric regulation involves conformational transitions or fluctuations between a few closely related states, caused by the binding of effector molecules. We introduce a quantity called binding leverage that measures the ability of a binding site to couple to the intrinsic motions of a protein. We use Monte Carlo simulations to generate potential binding sites and either normal modes or pairs of crystal structures to describe relevant motions. We analyze single catalytic domains and multimeric allosteric enzymes with complex regulation. For the majority of the analyzed proteins, we find that both catalytic and allosteric sites have high binding leverage. Furthermore, our analysis of the catabolite activator protein, which is allosteric without conformational change, shows that its regulation involves other types of motion than those modulated at sites with high binding leverage. Our results point to the importance of incorporating dynamic information when predicting functional sites. Because it is possible to calculate binding leverage from a single crystal structure it can be used for characterizing proteins of unknown function and predicting latent allosteric sites in any protein, with implications for drug design. PMID:21935347
Kimura, Yukihiro; Yura, Yuki; Hayashi, Yusuke; Li, Yong; Onoda, Moe; Yu, Long-Jiang; Wang-Otomo, Zheng-Yu; Ohno, Takashi
2016-12-15
The light-harvesting 1 reaction center (LH1-RC) complex from thermophilic photosynthetic bacterium Thermochromatium (Tch.) tepidum exhibits enhanced thermostability and an unusual LH1 Qy transition, both induced by Ca2+ binding. In this study, metal-binding sites and metal-protein interactions in the LH1-RC complexes from wild-type (B915) and biosynthetically Sr2+-substituted (B888) Tch. tepidum were investigated by isothermal titration calorimetry (ITC), atomic absorption (AA), and attenuated total reflection (ATR) Fourier transform infrared (FTIR) spectroscopies. The ITC measurements revealed stoichiometric ratios of approximately 1:1 for binding of Ca2+, Sr2+, or Ba2+ to the LH1 αβ-subunit, indicating the presence of 16 binding sites in both B915 and B888. The AA analysis provided direct evidence for Ca2+ and Sr2+ binding to B915 and B888, respectively, in their purified states. Metal-binding experiments supported that Ca2+ and Sr2+ (or Ba2+) competitively associate with the binding sites in both species. The ATR-FTIR difference spectra upon Ca2+ depletion and Sr2+ substitution demonstrated that dissociation and binding of Ca2+ are predominantly responsible for metal-dependent conformational changes of B915 and B888. The present results are largely compatible with the recent structural evidence that another binding site for Sr2+ (or Ba2+) exists in the vicinity of the Ca2+-binding site, a part of which is shared in both metal-binding sites.
2011-01-01
Background Along with high affinity binding of epibatidine (Kd1≈10 pM) to α4β2 nicotinic acetylcholine receptor (nAChR), low affinity binding of epibatidine (Kd2≈1-10 nM) to an independent binding site has been reported. Studying this low affinity binding is important because it might contribute understanding about the structure and synthesis of α4β2 nAChR. The binding behavior of epibatidine and α4β2 AChR raises a question about interpreting binding data from two independent sites with ligand depletion and nonspecific binding, both of which can affect equilibrium binding of [3H]epibatidine and α4β2 nAChR. If modeled incorrectly, ligand depletion and nonspecific binding lead to inaccurate estimates of binding constants. Fitting total equilibrium binding as a function of total ligand accurately characterizes a single site with ligand depletion and nonspecific binding. The goal of this study was to determine whether this approach is sufficient with two independent high and low affinity sites. Results Computer simulations of binding revealed complexities beyond fitting total binding for characterizing the second, low affinity site of α4β2 nAChR. First, distinguishing low-affinity specific binding from nonspecific binding was a potential problem with saturation data. Varying the maximum concentration of [3H]epibatidine, simultaneously fitting independently measured nonspecific binding, and varying α4β2 nAChR concentration were effective remedies. Second, ligand depletion helped identify the low affinity site when nonspecific binding was significant in saturation or competition data, contrary to a common belief that ligand depletion always is detrimental. Third, measuring nonspecific binding without α4β2 nAChR distinguished better between nonspecific binding and low-affinity specific binding under some circumstances of competitive binding than did presuming nonspecific binding to be residual [3H]epibatidine binding after adding a large concentration of cold competitor. 
Fourth, nonspecific binding of a heterologous competitor changed estimates of high and low inhibition constants but did not change the ratio of those estimates. Conclusions Investigating the low affinity site of α4β2 nAChR with equilibrium binding when ligand depletion and nonspecific binding are present likely needs special attention to experimental design and data interpretation beyond fitting total binding data. Manipulation of maximum ligand and receptor concentrations and intentionally increasing ligand depletion are potentially helpful approaches. PMID:22112852
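The two-site equilibrium described in the abstract above can be illustrated with a minimal sketch. It assumes the standard mass-balance model, total ligand = free + specific binding at two independent sites + linear nonspecific binding, and solves for free ligand by bisection. The function name, the site concentrations and the nonspecific coefficient are hypothetical choices for illustration; only the order of magnitude of the two Kd values (about 10 pM and 1-10 nM) comes from the abstract, and this is not the authors' simulation code.

```python
def free_ligand(total_ligand, sites, nonspecific=0.0):
    """Solve total = F + sum(Bmax*F/(Kd+F)) + nonspecific*F for free ligand F.

    total_ligand : total ligand concentration (nM)
    sites        : list of (Bmax, Kd) pairs, both in nM (hypothetical values)
    nonspecific  : dimensionless slope of linear nonspecific binding (assumed)
    Returns (free, bound), where bound includes nonspecific binding.
    """
    def bound(f):
        specific = sum(bmax * f / (kd + f) for bmax, kd in sites)
        return specific + nonspecific * f

    # f + bound(f) increases monotonically with f, so bisection on [0, total] works.
    lo, hi = 0.0, total_ligand
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid + bound(mid) > total_ligand:
            hi = mid
        else:
            lo = mid
    f = 0.5 * (lo + hi)
    return f, bound(f)


# Illustrative parameters only: a high-affinity site (Kd = 0.01 nM) and a
# low-affinity site (Kd = 5 nM), each at 0.1 nM, with 5% nonspecific binding.
example_sites = [(0.1, 0.01), (0.1, 5.0)]
for total in (0.05, 0.5, 5.0, 50.0):
    free, total_bound = free_ligand(total, example_sites, nonspecific=0.05)
    print(f"total={total:6.2f} nM  free={free:8.4f} nM  bound={total_bound:8.4f} nM")
```

Because the free concentration is solved rather than assumed equal to the total, ligand depletion is built into the predicted curve, which is the situation the abstract argues must be modeled explicitly.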
Evaluation of the Significance of Starch Surface Binding Sites on Human Pancreatic α-Amylase.
Zhang, Xiaohua; Caner, Sami; Kwan, Emily; Li, Chunmin; Brayer, Gary D; Withers, Stephen G
2016-11-01
Starch provides the major source of caloric intake in many diets. Cleavage of starch into malto-oligosaccharides in the gut is catalyzed by pancreatic α-amylase. These oligosaccharides are then further cleaved by gut wall α-glucosidases to release glucose, which is absorbed into the bloodstream. Potential surface binding sites for starch on the pancreatic amylase, distinct from the active site of the amylase, have been identified through X-ray crystallographic analyses. The role of these sites in the degradation of both starch granules and soluble starch was probed by the generation of a series of surface variants modified at each site to disrupt binding. Kinetic analysis of the binding and/or cleavage of substrates ranging from simple maltotriosides to soluble starch and insoluble starch granules has allowed evaluation of the potential role of each such surface site. In this way, two key surface binding sites, on the same face as the active site, are identified. One site, containing a pair of aromatic residues, is responsible for attachment to starch granules, while a second site featuring a tryptophan residue around which a malto-oligosaccharide wraps is shown to heavily influence soluble starch binding and hydrolysis. These studies provide insights into the mechanisms by which enzymes tackle the degradation of largely insoluble polymers and also present some new approaches to the interrogation of the binding sites involved.
Cooperative DNA binding and sequence discrimination by the Opaque2 bZIP factor.
Yunes, J A; Vettore, A L; da Silva, M J; Leite, A; Arruda, P
1998-01-01
The maize Opaque2 (O2) protein is a basic leucine zipper transcription factor that controls the expression of distinct classes of endosperm genes through the recognition of different cis-acting elements in their promoters. The O2 target region in the promoter of the alpha-coixin gene was analyzed in detail and shown to comprise two closely adjacent binding sites, named O2u and O2d, which are related in sequence to the GCN4 binding site. Quantitative DNase footprint analysis indicated that O2 binding to alpha-coixin target sites is best described by a cooperative model. Transient expression assays showed that the two adjacent sites act synergistically. This synergy is mediated in part by cooperative DNA binding. In tobacco protoplasts, O2 binding at the O2u site is more important for enhancer activity than is binding at the O2d site, suggesting that the architecture of the O2-DNA complex is important for interaction with the transcriptional machinery. PMID:9811800
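A cooperative model for two adjacent sites, such as the one the footprinting data above are said to favor, is commonly written as a four-state partition function. The sketch below is a generic textbook form, not the authors' fitted model; the association constants and the cooperativity factor `omega` are hypothetical parameters.

```python
def two_site_occupancy(conc, k_u, k_d, omega=1.0):
    """Fractional occupancy of two adjacent sites (here labeled O2u and O2d).

    conc  : free protein concentration
    k_u   : association constant for the upstream site (1/conc units, assumed)
    k_d   : association constant for the downstream site (assumed)
    omega : cooperativity factor; 1 means independent sites, >1 positive cooperativity
    """
    w_u = k_u * conc            # statistical weight: only upstream site bound
    w_d = k_d * conc            # statistical weight: only downstream site bound
    w_ud = omega * w_u * w_d    # both sites bound, scaled by cooperativity
    z = 1.0 + w_u + w_d + w_ud  # partition function over the four states
    return (w_u + w_ud) / z, (w_d + w_ud) / z


# With omega > 1, each site is more occupied than it would be on its own.
independent = two_site_occupancy(1.0, k_u=0.5, k_d=0.2, omega=1.0)
cooperative = two_site_occupancy(1.0, k_u=0.5, k_d=0.2, omega=20.0)
print(independent, cooperative)
```

Setting `omega=1` recovers two independent binding isotherms, which gives a simple null model against which a cooperative fit can be compared.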
Cooperative DNA binding and sequence discrimination by the Opaque2 bZIP factor.
Yunes, J A; Vettore, A L; da Silva, M J; Leite, A; Arruda, P
1998-11-01
The maize Opaque2 (O2) protein is a basic leucine zipper transcription factor that controls the expression of distinct classes of endosperm genes through the recognition of different cis-acting elements in their promoters. The O2 target region in the promoter of the alpha-coixin gene was analyzed in detail and shown to comprise two closely adjacent binding sites, named O2u and O2d, which are related in sequence to the GCN4 binding site. Quantitative DNase footprint analysis indicated that O2 binding to alpha-coixin target sites is best described by a cooperative model. Transient expression assays showed that the two adjacent sites act synergistically. This synergy is mediated in part by cooperative DNA binding. In tobacco protoplasts, O2 binding at the O2u site is more important for enhancer activity than is binding at the O2d site, suggesting that the architecture of the O2-DNA complex is important for interaction with the transcriptional machinery.
Position specific variation in the rate of evolution in transcription factor binding sites
Moses, Alan M; Chiang, Derek Y; Kellis, Manolis; Lander, Eric S; Eisen, Michael B
2003-01-01
Background The binding sites of sequence specific transcription factors are an important and relatively well-understood class of functional non-coding DNAs. Although a wide variety of experimental and computational methods have been developed to characterize transcription factor binding sites, they remain difficult to identify. Comparison of non-coding DNA from related species has shown considerable promise in identifying these functional non-coding sequences, even though relatively little is known about their evolution. Results Here we analyse the genome sequences of the budding yeasts Saccharomyces cerevisiae, S. bayanus, S. paradoxus and S. mikatae to study the evolution of transcription factor binding sites. As expected, we find that both experimentally characterized and computationally predicted binding sites evolve slower than surrounding sequence, consistent with the hypothesis that they are under purifying selection. We also observe position-specific variation in the rate of evolution within binding sites. We find that the position-specific rate of evolution is positively correlated with degeneracy among binding sites within S. cerevisiae. We test theoretical predictions for the rate of evolution at positions where the base frequencies deviate from background due to purifying selection and find reasonable agreement with the observed rates of evolution. Finally, we show how the evolutionary characteristics of real binding motifs can be used to distinguish them from artefacts of computational motif finding algorithms. Conclusion As has been observed for protein sequences, the rate of evolution in transcription factor binding sites varies with position, suggesting that some regions are under stronger functional constraint than others. This variation likely reflects the varying importance of different positions in the formation of the protein-DNA complex. 
The characterization of the pattern of evolution in known binding sites will likely contribute to the effective use of comparative sequence data in the identification of transcription factor binding sites and is an important step toward understanding the evolution of functional non-coding DNA. PMID:12946282
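The notion of position-specific degeneracy used above can be made concrete with a short sketch. It assumes degeneracy is measured as the Shannon entropy of each column in a set of aligned binding sites, which is one common choice and not necessarily the measure used by the authors; the example sequences are invented.

```python
import math
from collections import Counter


def column_entropy(sites):
    """Shannon entropy (bits) of each column in equal-length aligned sites.

    0 bits means a fully conserved position; 2 bits means all four bases
    are equally frequent (maximal degeneracy for DNA).
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    n = len(sites)
    entropies = []
    for i in range(length):
        counts = Counter(s[i] for s in sites)
        h = sum(-(c / n) * math.log2(c / n) for c in counts.values())
        entropies.append(abs(h))  # abs() avoids printing -0.0 for conserved columns
    return entropies


# Invented alignment: the first two positions are conserved, the last two vary.
example = ["ACGT", "ACGA", "ACCT", "ACTT"]
print(column_entropy(example))
```

Per-position entropies computed this way could then be compared with per-position substitution rates estimated from aligned orthologous sequences, which is the kind of correlation the abstract reports.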
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zoghbi, M. E.; Altenberg, G. A.
The functional unit of ATP-binding cassette (ABC) transporters consists of two transmembrane domains and two nucleotide-binding domains (NBDs). ATP binding elicits association of the two NBDs, forming a dimer in a head-to-tail arrangement, with two nucleotides “sandwiched” at the dimer interface. Each of the two nucleotide-binding sites is formed by residues from the two NBDs. We recently found that the prototypical NBD MJ0796 from Methanocaldococcus jannaschii dimerizes in response to ATP binding and dissociates completely following ATP hydrolysis. However, it is still unknown whether dissociation of NBD dimers follows ATP hydrolysis at one or both nucleotide-binding sites. Here, we used luminescence resonance energy transfer to study heterodimers formed by one active (donor-labeled) and one catalytically defective (acceptor-labeled) NBD. Rapid mixing experiments in a stop-flow chamber showed that NBD heterodimers with one functional and one inactive site dissociated at a rate indistinguishable from that of dimers with two hydrolysis-competent sites. Comparison of the rates of NBD dimer dissociation and ATP hydrolysis indicated that dissociation followed hydrolysis of one ATP. We conclude that ATP hydrolysis at one nucleotide-binding site drives NBD dimer dissociation.
The 2nd DBCLS BioHackathon: interoperable bioinformatics Web services for integrated applications
2011-01-01
Background The interaction between biological researchers and the bioinformatics tools they use is still hampered by incomplete interoperability between such tools. To ensure interoperability initiatives are effectively deployed, end-user applications need to be aware of, and support, best practices and standards. Here, we report on an initiative in which software developers and genome biologists came together to explore and raise awareness of these issues: BioHackathon 2009. Results Developers in attendance came from diverse backgrounds, with experts in Web services, workflow tools, text mining and visualization. Genome biologists provided expertise and exemplar data from the domains of sequence and pathway analysis and glyco-informatics. One goal of the meeting was to evaluate the ability to address real world use cases in these domains using the tools that the developers represented. This resulted in i) a workflow to annotate 100,000 sequences from an invertebrate species; ii) an integrated system for analysis of the transcription factor binding sites (TFBSs) enriched based on differential gene expression data obtained from a microarray experiment; iii) a workflow to enumerate putative physical protein interactions among enzymes in a metabolic pathway using protein structure data; iv) a workflow to analyze glyco-gene-related diseases by searching for human homologs of glyco-genes in other species, such as fruit flies, and retrieving their phenotype-annotated SNPs. Conclusions Beyond deriving prototype solutions for each use-case, a second major purpose of the BioHackathon was to highlight areas of insufficiency. 
We discuss the issues raised by our exploration of the problem/solution space, concluding that there are still problems with the way Web services are modeled and annotated, including: i) the absence of several useful data or analysis functions in the Web service "space"; ii) the lack of documentation of methods; iii) lack of compliance with the SOAP/WSDL specification among and between various programming-language libraries; and iv) incompatibility between various bioinformatics data formats. Although it was still difficult to solve real world problems posed to the developers by the biological researchers in attendance because of these problems, we note the promise of addressing these issues within a semantic framework. PMID:21806842
The 2nd DBCLS BioHackathon: interoperable bioinformatics Web services for integrated applications.
Katayama, Toshiaki; Wilkinson, Mark D; Vos, Rutger; Kawashima, Takeshi; Kawashima, Shuichi; Nakao, Mitsuteru; Yamamoto, Yasunori; Chun, Hong-Woo; Yamaguchi, Atsuko; Kawano, Shin; Aerts, Jan; Aoki-Kinoshita, Kiyoko F; Arakawa, Kazuharu; Aranda, Bruno; Bonnal, Raoul Jp; Fernández, José M; Fujisawa, Takatomo; Gordon, Paul Mk; Goto, Naohisa; Haider, Syed; Harris, Todd; Hatakeyama, Takashi; Ho, Isaac; Itoh, Masumi; Kasprzyk, Arek; Kido, Nobuhiro; Kim, Young-Joo; Kinjo, Akira R; Konishi, Fumikazu; Kovarskaya, Yulia; von Kuster, Greg; Labarga, Alberto; Limviphuvadh, Vachiranee; McCarthy, Luke; Nakamura, Yasukazu; Nam, Yunsun; Nishida, Kozo; Nishimura, Kunihiro; Nishizawa, Tatsuya; Ogishima, Soichi; Oinn, Tom; Okamoto, Shinobu; Okuda, Shujiro; Ono, Keiichiro; Oshita, Kazuki; Park, Keun-Joon; Putnam, Nicholas; Senger, Martin; Severin, Jessica; Shigemoto, Yasumasa; Sugawara, Hideaki; Taylor, James; Trelles, Oswaldo; Yamasaki, Chisato; Yamashita, Riu; Satoh, Noriyuki; Takagi, Toshihisa
2011-08-02
The interaction between biological researchers and the bioinformatics tools they use is still hampered by incomplete interoperability between such tools. To ensure interoperability initiatives are effectively deployed, end-user applications need to be aware of, and support, best practices and standards. Here, we report on an initiative in which software developers and genome biologists came together to explore and raise awareness of these issues: BioHackathon 2009. Developers in attendance came from diverse backgrounds, with experts in Web services, workflow tools, text mining and visualization. Genome biologists provided expertise and exemplar data from the domains of sequence and pathway analysis and glyco-informatics. One goal of the meeting was to evaluate the ability to address real world use cases in these domains using the tools that the developers represented. This resulted in i) a workflow to annotate 100,000 sequences from an invertebrate species; ii) an integrated system for analysis of the transcription factor binding sites (TFBSs) enriched based on differential gene expression data obtained from a microarray experiment; iii) a workflow to enumerate putative physical protein interactions among enzymes in a metabolic pathway using protein structure data; iv) a workflow to analyze glyco-gene-related diseases by searching for human homologs of glyco-genes in other species, such as fruit flies, and retrieving their phenotype-annotated SNPs. Beyond deriving prototype solutions for each use-case, a second major purpose of the BioHackathon was to highlight areas of insufficiency. 
We discuss the issues raised by our exploration of the problem/solution space, concluding that there are still problems with the way Web services are modeled and annotated, including: i) the absence of several useful data or analysis functions in the Web service "space"; ii) the lack of documentation of methods; iii) lack of compliance with the SOAP/WSDL specification among and between various programming-language libraries; and iv) incompatibility between various bioinformatics data formats. Although it was still difficult to solve real world problems posed to the developers by the biological researchers in attendance because of these problems, we note the promise of addressing these issues within a semantic framework.
Hughes, Samantha J; Tanner, Julian A; Hindley, Alison D; Miller, Andrew D; Gould, Ian R
2003-01-01
Background Charging of transfer-RNA with cognate amino acid is accomplished by the aminoacyl-tRNA synthetases, and proceeds through an aminoacyl adenylate intermediate. The lysyl-tRNA synthetase has evolved an active site that specifically binds lysine and ATP. Previous molecular dynamics simulations of the heat-inducible Escherichia coli lysyl-tRNA synthetase, LysU, have revealed differences in the binding of ATP and aspects of asymmetry between the nominally equivalent active sites of this dimeric enzyme. The possibility that this asymmetry results in different binding affinities for the ligands is addressed here by a parallel computational and biochemical study. Results Biochemical experiments employing isothermal calorimetry, steady-state fluorescence and circular dichroism are used to determine the order and stoichiometries of the lysine and nucleotide binding events, and the associated thermodynamic parameters. An ordered mechanism of substrate addition is found, with lysine having to bind prior to the nucleotide in a magnesium dependent process. Two lysines are found to bind per dimer, and trigger a large conformational change. Subsequent nucleotide binding causes little structural rearrangement and crucially only occurs at a single catalytic site, in accord with the simulations. Molecular dynamics based free energy calculations of the ATP binding process are used to determine the binding affinities of each site. Significant differences in ATP binding affinities are observed, with only one active site capable of realizing the experimental binding free energy. Half-of-the-sites models in which the nucleotide is only present at one active site achieve their full binding potential irrespective of the subunit choice. This strongly suggests the involvement of an anti-cooperative mechanism. Pathways for relaying information between the two active sites are proposed. 
Conclusions The asymmetry uncovered here appears to be a common feature of oligomeric aminoacyl-tRNA synthetases, and may play an important functional role. We suggest a manner in which catalytic efficiency could be improved by LysU operating in an alternating sites mechanism. PMID:12787471
DOE Office of Scientific and Technical Information (OSTI.GOV)
Larsen, A.
This study examines various energy resources in Utah including oil-impregnated rocks (oil shale and oil sand deposits), geothermal, coal, uranium, oil and natural gas in terms of the following dimensions: resource potential and location; resource technology, development and production status; resource development requirements; potential environmental and socio-economic impacts; and transportation tradeoffs. The advantages of minemouth power plants in comparison to combined cycle or hybrid power plants are also examined. Annotative bibliographies of the energy resources are presented in the appendices. Specific topics summarized in these annotative bibliographies include: economics, environmental impacts, water requirements, production technology, and siting requirements.
Carbohydrate binding properties of the stinging nettle (Urtica dioica) rhizome lectin.
Shibuya, N; Goldstein, I J; Shafer, J A; Peumans, W J; Broekaert, W F
1986-08-15
The interaction of the stinging nettle rhizome lectin (UDA) with carbohydrates was studied by using the techniques of quantitative precipitation, hapten inhibition, equilibrium dialysis, and UV difference spectroscopy. The carbohydrate binding site of UDA was determined to be complementary to an N,N',N"-triacetylchitotriose unit and proposed to consist of three subsites, each of which has a slightly different binding specificity. UDA also has a hydrophobic interacting region adjacent to the carbohydrate binding site. Equilibrium dialysis and UV difference spectroscopy revealed that UDA has two carbohydrate binding sites per molecule consisting of a single polypeptide chain. These binding sites either have intrinsically different affinities for ligand molecules, or they may display negative cooperativity toward ligand binding.
Cerisier, Natacha; Regad, Leslie; Triki, Dhoha; Petitjean, Michel; Flatters, Delphine; Camproux, Anne-Claude
2017-10-01
While recent literature focuses on drug promiscuity, the characterization of promiscuous binding sites (ability to bind several ligands) remains to be explored. Here, we present a proteochemometric modeling approach to analyze diverse ligands and corresponding multiple binding sub-pockets associated with one promiscuous binding site to characterize protein-ligand recognition. We analyze both geometrical and physicochemical profile correspondences. This approach was applied to examine the well-studied druggable urokinase catalytic domain inhibitor binding site, which results in a large number of complex structures bound to various ligands. This approach emphasizes the importance of jointly characterizing pocket and ligand spaces to explore the impact of ligand diversity on sub-pocket properties and to establish their main profile correspondences. This work supports an interest in mining available 3D holo structures associated with a promiscuous binding site to explore its main protein-ligand recognition tendency. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
RBind: computational network method to predict RNA binding sites.
Wang, Kaili; Jian, Yiren; Wang, Huiwen; Zeng, Chen; Zhao, Yunjie
2018-04-26
Non-coding RNA molecules play essential roles by interacting with other molecules to perform various biological functions. However, it is difficult to determine RNA structures due to their flexibility. At present, the number of experimentally solved RNA-ligand and RNA-protein structures is still insufficient. Therefore, binding site prediction for non-coding RNA is required to understand their functions. Current RNA binding site prediction algorithms produce many false positive nucleotides that are far from the binding sites. Here, we present a network approach, RBind, to predict the RNA binding sites. We benchmarked RBind in RNA-ligand and RNA-protein datasets. The average accuracy of 0.82 in RNA-ligand and 0.63 in RNA-protein testing showed that this network strategy has a reliable accuracy for binding site prediction. The code and datasets are available at https://zhaolab.com.cn/RBind. Contact: yjzhaowh@mail.ccnu.edu.cn. Supplementary data are available at Bioinformatics online.
Fang, Chong; Nagy-Staroń, Anna; Grafe, Martin; Heermann, Ralf; Jung, Kirsten; Gebhard, Susanne; Mascher, Thorsten
2017-04-01
BceRS and PsdRS are paralogous two-component systems in Bacillus subtilis controlling the response to antimicrobial peptides. In the presence of extracellular bacitracin and nisin, respectively, the two response regulators (RRs) bind their target promoters, PbceA or PpsdA, resulting in a strong up-regulation of target gene expression and ultimately antibiotic resistance. Despite high sequence similarity between the RRs BceR and PsdR and their known binding sites, no cross-regulation has been observed between them. We therefore investigated the specificity determinants of PbceA and PpsdA that ensure the insulation of these two paralogous pathways at the RR-promoter interface. In vivo and in vitro analyses demonstrate that the regulatory regions within these two promoters contain three important elements: in addition to the known (main) binding site, we identified a linker region and a secondary binding site that are crucial for functionality. Initial binding to the high-affinity, low-specificity main binding site is a prerequisite for the subsequent highly specific binding of a second RR dimer to the low-affinity secondary binding site. In addition to this hierarchical cooperative binding, discrimination requires a competition of the two RRs for their respective binding site mediated by only slight differences in binding affinities. © 2016 John Wiley & Sons Ltd.
A novel substance P binding site in bovine adrenal medulla.
Geraghty, D P; Livett, B G; Rogerson, F M; Burcher, E
1990-05-04
Radioligand binding techniques were used to characterize the substance P (SP) binding site on membranes prepared from bovine adrenal medullae. 125I-labelled Bolton-Hunter substance P (BHSP), which recognises the C-terminally directed, SP-preferring NK1 receptor, showed no specific binding. In contrast, binding of [3H]SP was saturable (at 6 nM) and reversible, with an equilibrium dissociation constant (Kd) of 1.46 ± 0.73 nM, Bmax of 0.73 ± 0.06 pmol/g wet weight and Hill coefficient of 0.98 ± 0.01. Specific binding of [3H]SP was displaced by SP > neurokinin A (NKA) > SP(3-11) ≈ SP(1-9) > SP(1-7) ≈ SP(1-4) ≈ SP(1-6), with neurokinin B (NKB) and SP(1-3) very weak competitors and SP(5-11), SP(7-11) and SP(9-11) causing negligible inhibition (up to 10 µM). This potency order is quite distinct from that seen with binding to an NK1 site, a conclusion confirmed by the lack of BHSP binding. It appears that Lys3 and/or Pro4 are critical for binding, suggesting an anionic binding site. These data suggest the existence of an unusual binding site which may represent a novel SP receptor. This site appears to require the entire sequence of the SP molecule for full recognition.
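The saturation parameters reported above (Kd, Bmax and Hill coefficient) can be plugged into a Hill-type binding isotherm to see what they imply. The sketch below assumes the common form B = Bmax·F^n / (Kd^n + F^n); the function name is illustrative, the default values are the point estimates from the abstract, and the abstract does not state which fitting equation the authors used.

```python
def hill_binding(free_nm, bmax=0.73, kd_nm=1.46, n=0.98):
    """Specific binding (pmol/g wet weight) predicted by a Hill isotherm.

    free_nm : free radioligand concentration (nM)
    bmax    : maximal binding (pmol/g wet weight), point estimate from the abstract
    kd_nm   : equilibrium dissociation constant (nM), point estimate from the abstract
    n       : Hill coefficient; a value near 1 is consistent with a single
              class of non-interacting sites
    """
    x = (free_nm / kd_nm) ** n
    return bmax * x / (1.0 + x)


# Predicted binding at a few free-ligand concentrations (nM).
for conc in (0.1, 0.5, 1.46, 3.0, 6.0):
    print(f"{conc:5.2f} nM -> {hill_binding(conc):.3f} pmol/g")
```

By construction, binding at a free concentration equal to Kd is half of Bmax, whatever the Hill coefficient.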
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tam, S.W.; Cook, L.
1984-09-01
The relationship between binding of antipsychotic drugs and sigma psychotomimetic opiates to binding sites for the sigma agonist (+)-[3H]SKF 10,047 (N-allylnormetazocine) and to dopamine D2 sites was investigated. In guinea pig brain membranes, (+)-[3H]SKF 10,047 bound to a single class of sites with a Kd of 4 × 10⁻⁸ M and a Bmax of 333 fmol/mg of protein. This binding was different from μ, kappa, or delta opiate receptor binding. It was inhibited by opiates that produce psychotomimetic activities but not by opiates that lack such activities. Some antipsychotic drugs inhibited (+)-[3H]SKF 10,047 binding with high to moderate affinities in the following order of potency: haloperidol > perphenazine > fluphenazine > acetophenazine > trifluoperazine > molindone ≥ pimozide ≥ thioridazine ≥ chlorpromazine ≥ triflupromazine. However, there were other antipsychotic drugs such as spiperone and clozapine that showed low affinity for the (+)-[3H]SKF 10,047 binding sites. Affinities of antipsychotic drugs for (+)-[3H]SKF 10,047 binding sites did not correlate with those for [3H]spiperone (dopamine D2) sites. [3H]Haloperidol binding in whole brain membranes was also inhibited by the sigma opiates pentazocine, cyclazocine, and (+)-[3H]SKF 10,047. In the striatum, about half of the saturable [3H]haloperidol binding was to [3H]spiperone (D2) sites and the other half was to sites similar to (+)-[3H]SKF 10,047 binding sites. 15 references, 4 figures, 1 table.
Principal short-term findings of the National Fire and Fire Surrogate study
James McIver; Karen Erickson; Andrew Youngblood
2012-01-01
Principal findings of the National Fire and Fire Surrogate (FFS) study are presented in an annotated bibliography and summarized in tabular form by site, discipline (ecosystem component), treatment type, and major theme. Composed of 12 sites, the FFS is a comprehensive multidisciplinary experiment designed to evaluate the costs and ecological consequences of...
Tech Talk for Social Studies Teachers Lest We Forget: Remembering Pearl Harbor.
ERIC Educational Resources Information Center
Green, Tim
2001-01-01
Presents an annotated bibliography that provides Web sites about Pearl Harbor (Hawaii). Includes Web sites that cover Pearl Harbor history, a live view of Pearl Harbor, stories from people who remember where they were during the attack, information on the naval station at Pearl Harbor, and a virtual tour of the USS Arizona. (CMK)
Coupry, I; Armsby, C C; Alper, S L; Brugnara, C; Parini, A
1996-01-04
In the present report, we investigated the potential involvement of imidazoline I1 and I2 binding sites in the inhibition of the Ca(2+)-activated K+ channel (Gardos channel) by clotrimazole in human red cells. Ca(2+)-activated 86Rb influx was inhibited by clotrimazole and efaroxan but not by the imidazoline binding site ligands clonidine, moxonidine, cirazoline and idazoxan (100 microM). Binding studies with [3H]idazoxan and [3H]p-aminoclonidine did not reveal the expression of I1 and I2 binding sites in erythrocytes. These data indicate that the effects of clotrimazole and efaroxan on the erythrocyte Ca(2+)-activated K+ channel may be mediated by a 'non-I1/non-I2' binding site.
Accurate and sensitive quantification of protein-DNA binding affinity.
Rastogi, Chaitanya; Rube, H Tomas; Kribelbauer, Judith F; Crocker, Justin; Loker, Ryan E; Martini, Gabriella D; Laptenko, Oleg; Freed-Pastor, William A; Prives, Carol; Stern, David L; Mann, Richard S; Bussemaker, Harmen J
2018-04-17
Transcription factors (TFs) control gene expression by binding to genomic DNA in a sequence-specific manner. Mutations in TF binding sites are increasingly found to be associated with human disease, yet we currently lack robust methods to predict these sites. Here, we developed a versatile maximum likelihood framework named No Read Left Behind (NRLB) that infers a biophysical model of protein-DNA recognition across the full affinity range from a library of in vitro selected DNA binding sites. NRLB predicts human Max homodimer binding in near-perfect agreement with existing low-throughput measurements. It can capture the specificity of the p53 tetramer and distinguish multiple binding modes within a single sample. Additionally, we confirm that newly identified low-affinity enhancer binding sites are functional in vivo, and that their contribution to gene expression matches their predicted affinity. Our results establish a powerful paradigm for identifying protein binding sites and interpreting gene regulatory sequences in eukaryotic genomes. Copyright © 2018 the Author(s). Published by PNAS.
Accurate and sensitive quantification of protein-DNA binding affinity
Rastogi, Chaitanya; Rube, H. Tomas; Kribelbauer, Judith F.; Crocker, Justin; Loker, Ryan E.; Martini, Gabriella D.; Laptenko, Oleg; Freed-Pastor, William A.; Prives, Carol; Stern, David L.; Mann, Richard S.; Bussemaker, Harmen J.
2018-01-01
Transcription factors (TFs) control gene expression by binding to genomic DNA in a sequence-specific manner. Mutations in TF binding sites are increasingly found to be associated with human disease, yet we currently lack robust methods to predict these sites. Here, we developed a versatile maximum likelihood framework named No Read Left Behind (NRLB) that infers a biophysical model of protein-DNA recognition across the full affinity range from a library of in vitro selected DNA binding sites. NRLB predicts human Max homodimer binding in near-perfect agreement with existing low-throughput measurements. It can capture the specificity of the p53 tetramer and distinguish multiple binding modes within a single sample. Additionally, we confirm that newly identified low-affinity enhancer binding sites are functional in vivo, and that their contribution to gene expression matches their predicted affinity. Our results establish a powerful paradigm for identifying protein binding sites and interpreting gene regulatory sequences in eukaryotic genomes. PMID:29610332
Valdramidou, Dimitra; Humphries, Martin J.; Mould, A. Paul
2012-01-01
Integrin-ligand interactions are regulated in a complex manner by divalent cations, and previous studies have identified ligand-competent, stimulatory, and inhibitory cation-binding sites. In collagen-binding integrins, such as α2β1, ligand recognition takes place exclusively at the α subunit I domain. However, activation of the αI domain depends on its interaction with a structurally similar domain in the β subunit known as the I-like or βI domain. The top face of the βI domain contains three cation-binding sites: the metal-ion dependent adhesion site (MIDAS), the ADMIDAS (adjacent to MIDAS) and LIMBS (ligand-associated metal binding site). The role of these sites in controlling ligand binding to the αI domain has yet to be elucidated. Mutation of the MIDAS or LIMBS completely blocked collagen binding to α2β1; in contrast mutation of the ADMIDAS reduced ligand recognition but this effect could be overcome by the activating mAb TS2/16. Hence, the MIDAS and LIMBS appear to be essential for the interaction between αI and βI whereas occupancy of the ADMIDAS has an allosteric effect on the conformation of βI. An activating mutation in the α2 I domain partially restored ligand binding to the MIDAS and LIMBS mutants. Analysis of the effects of Ca2+, Mg2+ and Mn2+ on ligand binding to these mutants showed that the MIDAS is a ligand-competent site through which Mn2+ stimulates ligand binding, whereas the LIMBS is a stimulatory Ca2+-binding site, occupancy of which increases the affinity of Mg2+ for the MIDAS. PMID:18820259
Rehman, Md Tabish; Shamsi, Hira; Khan, Asad U
2014-06-02
The mechanism of interaction between imipenem and HSA was investigated by various techniques like fluorescence, UV-vis absorbance, FRET, circular dichroism, urea denaturation, enzyme kinetics, ITC, and molecular docking. We found that imipenem binds to HSA at a high affinity site located in subdomain IIIA (Sudlow's site I) and a low affinity site located in subdomain IIA-IIB. Electrostatic interactions played a vital role along with hydrogen bonding and hydrophobic interactions in stabilizing the imipenem-HSA complex at subdomain IIIA, while only electrostatic and hydrophobic interactions were present at subdomain IIA-IIB. The binding and thermodynamic parameters obtained by ITC showed that the binding of imipenem to HSA was a spontaneous process (ΔG° = -32.31 kJ mol(-1) for high affinity site and ΔG° = -23.02 kJ mol(-1) for low affinity site) with binding constants in the range of 10(4)-10(5) M(-1). Spectroscopic investigation revealed only one binding site of imipenem on HSA (Ka∼10(4) M(-1)). FRET analysis showed that the binding distance between imipenem and HSA (Trp-214) was optimal (r = 4.32 nm) for quenching to occur. Decrease in esterase-like activity of HSA in the presence of imipenem showed that Arg-410 and Tyr-411 of subdomain IIIA (Sudlow's site II) were directly involved in the binding process. CD spectral analysis showed altered conformation of HSA upon imipenem binding. Moreover, the binding of imipenem to subdomain IIIA (Sudlow's site II) of HSA also affected its folding pathway as clear from urea-induced denaturation studies.
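The free energies and binding constants quoted in the imipenem-HSA abstract above are linked by the standard relation ΔG° = -RT ln Ka. The following is a minimal sketch of that conversion, not the authors' analysis: the temperature of 298.15 K is an assumption (the abstract does not state it), and the function names are hypothetical.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1


def delta_g_kj(ka, temp_k=298.15):
    """Standard free energy (kJ/mol) from an association constant Ka (M^-1).

    temp_k defaults to 298.15 K, which is an assumption, not a value
    taken from the abstract.
    """
    return -R * temp_k * math.log(ka) / 1000.0


def ka_from_delta_g(dg_kj, temp_k=298.15):
    """Association constant Ka (M^-1) from a standard free energy in kJ/mol."""
    return math.exp(-dg_kj * 1000.0 / (R * temp_k))


if __name__ == "__main__":
    # Free energies reported in the abstract for the two sites.
    for label, dg in (("high affinity", -32.31), ("low affinity", -23.02)):
        print(label, f"Ka = {ka_from_delta_g(dg):.3g} M^-1")
```

Under the assumed temperature, the low affinity value corresponds to a Ka of order 10^4 M^-1 and the high affinity value to a Ka of order 10^5 M^-1, which is consistent with the range given in the abstract.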
Sriram, K. K.; Yeh, Jia-Wei; Lin, Yii-Lih; Chang, Yi-Ren; Chou, Chia-Fu
2014-01-01
Mapping transcription factor (TF) binding sites along a DNA backbone is crucial in understanding the regulatory circuits that control cellular processes. Here, we deployed a method adopting bioconjugation, nanofluidic confinement and fluorescence single molecule imaging for direct mapping of TF (RNA polymerase) binding sites on field-stretched single DNA molecules. Using this method, we have mapped out five of the TF binding sites of E. coli RNA polymerase to bacteriophage λ-DNA, where two promoter sites and three pseudo-promoter sites are identified with the corresponding binding frequency of 45% and 30%, respectively. Our method is quick, robust and capable of resolving protein-binding locations with high accuracy (∼ 300 bp), making our system a complementary platform to the methods currently practiced. It is advantageous in parallel analysis and less prone to false positive results over other single molecule mapping techniques such as optical tweezers, atomic force microscopy and molecular combing, and could potentially be extended to general mapping of protein–DNA interaction sites. PMID:24753422
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ford, K.A.; LaBarbera, A.R.
1988-11-01
The purpose of these studies was to determine whether changes in FSH receptors correlated with FSH-induced attenuation of FSH-responsive adenylyl cyclase in immature porcine granulosa cells. Cells were incubated with FSH (1-1000 ng/ml) for up to 24 h, treated with acidified medium (pH 3.5) to remove FSH bound to cells, and incubated with (125I)iodo-porcine FSH to quantify FSH-binding sites. FSH increased binding of FSH in a time-, temperature-, and FSH concentration-dependent manner. FSH (200 ng/ml) increased binding approximately 4-fold within 16 h. Analysis of equilibrium saturation binding data indicated that the increase in binding sites reflected a 2.3-fold increase in receptor number and a 5.4-fold increase in apparent affinity. The increase in binding did not appear to be due to 1) a decrease in receptor turnover, since the basal rate of turnover appeared to be very slow; 2) an increase in receptor synthesis, since agents that inhibit protein synthesis and glycosylation did not block the increase in binding; or 3) an increase in intracellular receptors, since agents that inhibit cytoskeletal components had no effect. Agents that increase intracellular cAMP did not affect FSH binding. The increase in binding appeared to result from unmasking of cryptic FSH-binding sites, since FSH increased binding in cell-free membrane preparations to the same extent as in cells. Unmasking of cryptic sites was hormone specific, and the sites bound FSH specifically. Unmasking of sites was reversible in a time- and temperature-dependent manner after removal of bound FSH. The similarity between the FSH dose-response relationships for unmasking of FSH-binding sites and attenuation of FSH-responsive cAMP production suggests that the two processes are functionally linked.
Warfield, Becka M.
2017-01-01
RNA aptamers are oligonucleotides that bind with high specificity and affinity to target ligands. In the absence of bound ligand, secondary structures of RNA aptamers are generally stable, but single-stranded and loop regions, including ligand binding sites, lack defined structures and exist as ensembles of conformations. For example, the well-characterized theophylline-binding aptamer forms a highly stable binding site when bound to theophylline, but the binding site is unstable and disordered when theophylline is absent. Experimental methods have not revealed at atomic resolution the conformations that the theophylline aptamer explores in its unbound state. Consequently, in the present study we applied 21 microseconds of molecular dynamics simulations to structurally characterize the ensemble of conformations that the aptamer adopts in the absence of theophylline. Moreover, we apply Markov state modeling to predict the kinetics of transitions between unbound conformational states. Our simulation results agree with experimental observations that the theophylline binding site is found in many distinct binding-incompetent states and show that these states lack a binding pocket that can accommodate theophylline. The binding-incompetent states interconvert with binding-competent states through structural rearrangement of the binding site on the nanosecond to microsecond timescale. Moreover, we have simulated the complete theophylline binding pathway. Our binding simulations supplement prior experimental observations of slow theophylline binding kinetics by showing that the binding site must undergo a large conformational rearrangement after the aptamer and theophylline form an initial complex, most notably, a major rearrangement of the C27 base from a buried to solvent-exposed orientation. Theophylline appears to bind by a combination of conformational selection and induced fit mechanisms. 
Finally, our modeling indicates that when Mg2+ ions are present the population of binding-competent aptamer states increases more than twofold. This population change, rather than direct interactions between Mg2+ and theophylline, accounts for altered theophylline binding kinetics. PMID:28437473
Case, S S; Huber, P; Lloyd, J A
1999-11-01
A large nuclear protein complex, termed gammaPE (for gamma-globin promoter and enhancer binding factor), binds to five sites located 5' and 3' of the human gamma-globin gene. Two proteins, SATB1 (special A-T-rich binding protein 1) and HOXB2, can bind to gammaPE binding sites. SATB1 binds to nuclear matrix-attachment sites, and HOXB2 is a homeodomain protein important in neural development that is also expressed during erythropoiesis. The present work showed that antisera directed against either SATB1 or HOXB2 reacted specifically with the entire gammaPE complex in electrophoretic mobility shift assays (EMSAs), suggesting that the two proteins can bind to the gammaPE binding site simultaneously. When SATB1 or HOXB2 was expressed in vitro, they could bind independently to gammaPE binding sites in EMSA. Interestingly, the proteins expressed in vitro competed effectively with each other for the gammaPE binding site, suggesting that this may occur under certain conditions in vivo. Transient cotransfections of a HOXB2 cDNA and a gamma-globin-luciferase reporter gene construct into cells expressing SATB1 suggested that SATB1 has a positive and HOXB2 a negative regulatory effect on transcription. Taking into account their potentially opposing effects and binding activities, SATB1 and HOXB2 may modulate the amount of gamma-globin mRNA expressed during development and differentiation.
Slack, Robert J; Russell, Linda J; Barton, Nick P; Weston, Cathryn; Nalesso, Giovanna; Thompson, Sally-Anne; Allen, Morven; Chen, Yu Hua; Barnes, Ashley; Hodgson, Simon T; Hall, David A
2013-01-01
Chemokine receptor antagonists appear to access two distinct binding sites on different members of this receptor family. One class of CCR4 antagonists has been suggested to bind to a site accessible from the cytoplasm while a second class did not bind to this site. In this report, we demonstrate that antagonists representing a variety of structural classes bind to two distinct allosteric sites on CCR4. The effects of pairs of low-molecular weight and/or chemokine CCR4 antagonists were evaluated on CCL17- and CCL22-induced responses of human CCR4+ T cells. This provided an initial grouping of the antagonists into sets which appeared to bind to distinct binding sites. Binding studies were then performed with radioligands from each set to confirm these groupings. Some novel receptor theory was developed to allow the interpretation of the effects of the antagonist combinations. The theory indicates that, generally, the concentration-ratio of a pair of competing allosteric modulators is maximally the sum of their individual effects while that of two modulators acting at different sites is likely to be greater than their sum. The low-molecular weight antagonists could be grouped into two sets on the basis of the functional and binding experiments. The antagonistic chemokines formed a third set whose behaviour was consistent with that of simple competitive antagonists. These studies indicate that there are two allosteric regulatory sites on CCR4. PMID:25505571
Wei, Qing; La, David; Kihara, Daisuke
2017-01-01
Prediction of protein-protein interaction sites in a protein structure provides important information for elucidating the mechanism of protein function and can also be useful in guiding a modeling or design procedures of protein complex structures. Since prediction methods essentially assess the propensity of amino acids that are likely to be part of a protein docking interface, they can help in designing protein-protein interactions. Here, we introduce BindML and BindML+ protein-protein interaction sites prediction methods. BindML predicts protein-protein interaction sites by identifying mutation patterns found in known protein-protein complexes using phylogenetic substitution models. BindML+ is an extension of BindML for distinguishing permanent and transient types of protein-protein interaction sites. We developed an interactive web-server that provides a convenient interface to assist in structural visualization of protein-protein interactions site predictions. The input data for the web-server are a tertiary structure of interest. BindML and BindML+ are available at http://kiharalab.org/bindml/ and http://kiharalab.org/bindml/plus/ .
Computational Optimization and Characterization of Molecularly Imprinted Polymers
NASA Astrophysics Data System (ADS)
Terracina, Jacob J.
Molecularly imprinted polymers (MIPs) are a class of materials containing sites capable of selectively binding to the imprinted target molecule. Computational chemistry techniques were used to study the effect of different fabrication parameters (the monomer-to-target ratios, pre-polymerization solvent, temperature, and pH) on the formation of the MIP binding sites. Imprinted binding sites were built in silico for the purposes of better characterizing the receptor - ligand interactions. Chiefly, the sites were characterized with respect to their selectivities and the heterogeneity between sites. First, a series of two-step molecular mechanics (MM) and quantum mechanics (QM) computational optimizations of monomer -- target systems was used to determine optimal monomer-to-target ratios for the MIPs. Imidazole- and xanthine-derived target molecules were studied. The investigation included both small-scale models (one-target) and larger scale models (five-targets). The optimal ratios differed between the small and larger scales. For the larger models containing multiple targets, binding-site surface area analysis was used to evaluate the heterogeneity of the sites. The more fully surrounded sites had greater binding energies. Molecular docking was then used to measure the selectivities of the QM-optimized binding sites by comparing the binding energies of the imprinted target to that of a structural analogue. Selectivity was also shown to improve as binding sites become more fully encased by the monomers. For internal sites, docking consistently showed selectivity favoring the molecules that had been imprinted via QM geometry optimizations. The computationally imprinted sites were shown to exhibit size-, shape-, and polarity-based selectivity. This represented a novel approach to investigate the selectivity and heterogeneity of imprinted polymer binding sites, by applying the rapid orientation screening of MM docking to the highly accurate QM-optimized geometries. 
Next, we sought to computationally construct and investigate binding sites for their enantioselectivity. Again, a two-step MM [special characters removed] QM optimization scheme was used to "computationally imprint" chiral molecules. Using docking techniques, the imprinted binding sites were shown to exhibit an enantioselective preference for the imprinted molecule over its enantiomer. Docking of structurally similar chiral molecules showed that the sites computationally imprinted with R- or S-tBOC-tyrosine were able to differentiate between R- and S-forms of other tyrosine derivatives. The cross-enantioselectivity did not hold for chiral molecules that did not share the tyrosine H-bonding functional group orientations. Further analysis of the individual monomer - target interactions within the binding site led us to conclude that H-bonding functional groups that are located immediately next to the target's chiral center, and therefore spatially fixed relative to the chiral center, will have a stronger contribution to the enantioselectivity of the site than those groups separated from the chiral center by two or more rotatable bonds. These models were the first computationally imprinted binding sites to exhibit this enantioselective preference for the imprinted target molecules. Finally, molecular dynamics (MD) was used to quantify H-bonding interactions between target molecules, monomers, and solvents representative of the pre-polymerization matrix. It was found that both target dimerization and solvent interference decrease the number of monomer - target H-bonds present. Systems were optimized via simulated annealing to create binding sites that were then subjected to molecular docking analysis. Docking showed that the presence of solvent had a detrimental effect on the sensitivity and selectivity of the sites, and that solvents with more H-bonding capabilities were more disruptive to the binding properties of the site. 
Dynamic simulations also showed that increasing the temperature of the solution can significantly decrease the number of H-bonds formed between the targets and monomers. It is believed that the monomer - target complexes formed within the pre-polymerization matrix are translated into the selective binding cavities formed during polymerization. Elucidating the nature of these interactions in silico improves our understanding of MIPs, ultimately allowing for more optimized sensing materials.
Hogan, Daniel J; Riordan, Daniel P; Gerber, André P; Herschlag, Daniel; Brown, Patrick O
2008-10-28
RNA-binding proteins (RBPs) have roles in the regulation of many post-transcriptional steps in gene expression, but relatively few RBPs have been systematically studied. We searched for the RNA targets of 40 proteins in the yeast Saccharomyces cerevisiae: a selective sample of the approximately 600 annotated and predicted RBPs, as well as several proteins not annotated as RBPs. At least 33 of these 40 proteins, including three of the four proteins that were not previously known or predicted to be RBPs, were reproducibly associated with specific sets of a few to several hundred RNAs. Remarkably, many of the RBPs we studied bound mRNAs whose protein products share identifiable functional or cytotopic features. We identified specific sequences or predicted structures significantly enriched in target mRNAs of 16 RBPs. These potential RNA-recognition elements were diverse in sequence, structure, and location: some were found predominantly in 3'-untranslated regions, others in 5'-untranslated regions, some in coding sequences, and many in two or more of these features. Although this study only examined a small fraction of the universe of yeast RBPs, 70% of the mRNA transcriptome had significant associations with at least one of these RBPs, and on average, each distinct yeast mRNA interacted with three of the RBPs, suggesting the potential for a rich, multidimensional network of regulation. These results strongly suggest that combinatorial binding of RBPs to specific recognition elements in mRNAs is a pervasive mechanism for multi-dimensional regulation of their post-transcriptional fate.
NASA Astrophysics Data System (ADS)
Poornima, C. S.; Dean, P. M.
1995-12-01
Water molecules are known to play an important rôle in mediating protein-ligand interactions. If water molecules are conserved at the ligand-binding sites of homologous proteins, such a finding may suggest the structural importance of water molecules in ligand binding. Structurally conserved water molecules change the conventional definition of `binding sites' by changing the shape and complementarity of these sites. Such conserved water molecules can be important for site-directed ligand/drug design. Therefore, five different sets of homologous protein/protein-ligand complexes have been examined to identify the conserved water molecules at the ligand-binding sites. Our analysis reveals that there are as many as 16 conserved water molecules at the FAD binding site of glutathione reductase between the crystal structures obtained from human and E. coli. In the remaining four sets of high-resolution crystal structures, 2-4 water molecules have been found to be conserved at the ligand-binding sites. The majority of these conserved water molecules are either bound in deep grooves at the protein-ligand interface or completely buried in cavities between the protein and the ligand. All these water molecules, conserved between the protein/protein-ligand complexes from different species, have identical or similar apolar and polar interactions in a given set. The site residues interacting with the conserved water molecules at the ligand-binding sites have been found to be highly conserved among proteins from different species; they are more conserved compared to the other site residues interacting with the ligand. These water molecules, in general, make multiple polar contacts with protein-site residues.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sultatos, L.G.; Kaushik, R.
2008-08-01
The peripheral anionic site of acetylcholinesterase, when occupied by a ligand, is known to modulate reaction rates at the active site of this important enzyme. The current report utilized the peripheral anionic site specific fluorogenic probe thioflavin t to determine if the organophosphates chlorpyrifos oxon and dichlorvos bind to the peripheral anionic site of human recombinant acetylcholinesterase, since certain organophosphates display concentration-dependent kinetics when inhibiting this enzyme. Incubation of 3 nM acetylcholinesterase active sites with 50 nM or 2000 nM inhibitor altered both the Bmax and Kd for thioflavin t binding to the peripheral anionic site. However, these changes resulted from phosphorylation of Ser203 since increasing either inhibitor from 50 nM to 2000 nM did not alter further thioflavin t binding kinetics. Moreover, the organophosphate-induced decrease in Bmax did not represent an actual reduction in binding sites, but instead likely resulted from conformational interactions between the acylation and peripheral anionic sites that led to a decrease in the rigidity of bound thioflavin t. A drop in fluorescence quantum yield, leading to an apparent decrease in Bmax, would accompany the decreased rigidity of bound thioflavin t molecules. The organophosphate-induced alterations in Kd represented changes in binding affinity of thioflavin t, with diethylphosphorylation of Ser203 increasing Kd, and dimethylphosphorylation of Ser203 decreasing Kd. These results indicate that chlorpyrifos oxon and dichlorvos do not bind directly to the peripheral anionic site of acetylcholinesterase, but can affect binding to that site through phosphorylation of Ser203.
Du, Yushen; Wu, Nicholas C; Jiang, Lin; Zhang, Tianhao; Gong, Danyang; Shu, Sara; Wu, Ting-Ting; Sun, Ren
2016-11-01
Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. To fully comprehend the diverse functions of a protein, it is essential to understand the functionality of individual residues. 
Current methods are highly dependent on evolutionary sequence conservation, which is usually limited by sampling size. Sequence conservation-based methods are further confounded by structural constraints and multifunctionality of proteins. Here we present a method that can systematically identify and annotate functional residues of a given protein. We used a high-throughput functional profiling platform to identify essential residues. Coupling it with homologous-structure comparison, we were able to annotate multiple functions of proteins. We demonstrated the method with the PB1 protein of influenza A virus and identified novel functional residues in addition to its canonical function as an RNA-dependent RNA polymerase. Not limited to virology, this method is generally applicable to other proteins that can be functionally selected and about which homologous-structure information is available. Copyright © 2016 Du et al.
The Binding of Silibinin, the Main Constituent of Silymarin, to Site I on Human Serum Albumin.
Yamasaki, Keishi; Sato, Hiroki; Minagoshi, Saori; Kyubun, Karin; Anraku, Makoto; Miyamura, Shigeyuki; Watanabe, Hiroshi; Taguchi, Kazuaki; Seo, Hakaru; Maruyama, Toru; Otagiri, Masaki
2017-01-01
Silibinin is the main constituent of silymarin, an extract from the seeds of milk thistle (Silybum marianum). Because silibinin has many pharmacological activities, extending its clinical use in the treatment of a wider variety of diseases would be desirable. In this study, we report on the binding of silibinin to plasma proteins, an issue that has not previously been extensively studied. The findings indicated that silibinin mainly binds to human serum albumin (HSA). Mutual displacement experiments using ligands that primarily bind to sites I and II clearly revealed that silibinin binds tightly and selectively to site I (subsites Ia and/or Ic) of HSA, which is located in subdomain IIA. Thermodynamic analyses suggested that hydrogen bonding and van der Waals interactions are major contributors to silibinin-HSA interactions. Furthermore, the binding of silibinin to HSA was found to be decreased with increasing ionic strength and detergent concentration of the media, suggesting that electrostatic and hydrophobic interactions are involved in the binding. Trp214 and Arg218 were identified as being involved in the binding of silibinin to site I, based on binding experiments using chemically modified- and mutant-HSAs. In conclusion, the available evidence indicates that silibinin binds to the region close to Trp214 and Arg218 in site I of HSA with assistance by multiple forces and can displace site I drugs (e.g., warfarin or iodipamide), but not site II drugs (e.g., ibuprofen).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Boyd, S.K.
1987-01-01
Because arginine vasotocin (AVT) activates male sexual behaviors in the rough-skinned newt (Taricha granulosa), quantitative autoradiography with radiolabeled arginine vasopressin ([3H]AVP) was used to localize and characterize putative AVT receptors in the brain of this amphibian. Binding of [3H]AVP to sites within the medial pallium was saturable, specific, reversible, of high affinity and low capacity. These binding sites appear to represent authentic central nervous system receptors for AVT. Furthermore, ligand specificity for the binding sites in this amphibian differs from that reported for AVP binding sites in rat brains. Dense concentrations of specific binding sites were located in the olfactory nerve as it entered the olfactory bulb within the medial pallium, dorsal pallium, and amygdala pars lateralis of the telencephalon, and in the tegmental region of the medulla. Concentrations of binding sites differed significantly among various brain regions. A comparison of male and female newts collected during the breeding season revealed no sexual dimorphism. These areas may represent site(s) of action where AVT elicits sexual behaviors in male T. granulosa.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cosman, M; Zeller, L; Lightstone, F C
2002-01-01
The clostridial neurotoxins include the closely related tetanus (TeNT) and botulinum (BoNT) toxins. Botulinum toxin is used to treat severe muscle disorders and as a cosmetic wrinkle reducer. Large quantities of botulinum toxin have also been produced by terrorists for use as a biological weapon. Because there are no known antidotes for these toxins, they thus pose a potential threat to human health whether by an accidental overdose or by a hostile deployment. Thus, the discovery of high specificity and affinity compounds that can inhibit their binding to neural cells can be used as antidotes or in the design of chemical detectors. Using the crystal structure of the C fragment of the tetanus toxin (TetC), which is the cell recognition and cell surface binding domain, and the computational program DOCK, sets of small molecules have been predicted to bind to two different sites located on the surface of this protein. While Site-1 is common to the TeNT and BoNTs, Site-2 is unique to TeNT. Pairs of these molecules from each site can then be linked together synthetically to thereby increase the specificity and affinity for this toxin. Electrospray ionization mass spectroscopy was used to experimentally screen each compound for binding. Mixtures containing binders were further screened for activity under biologically relevant conditions using nuclear magnetic resonance (NMR) methods. The screening of mixtures of compounds offers increased efficiency and throughput as compared to testing single compounds and can also evaluate how possible structural changes induced by the binding of one ligand can influence the binding of the second ligand. In addition, competitive binding experiments with mixtures containing ligands predicted to bind the same site could identify the best binder for that site. 
NMR transfer nuclear Overhauser effect (trNOE) experiments confirm that TetC binds doxorubicin but that this molecule is displaced by N-acetylneuraminic acid (sialic acid) in a mixture that also contains 3-sialyllactose (another predicted Site-1 binder) and bisbenzimide 33342 (non-binder). A series of five predicted Site-2 binders were then screened sequentially in the presence of the Site-1 binder doxorubicin. These experiments showed that the compounds lavendustin A and naphthofluorescein-di-(β-D-galactopyranoside) bind along with doxorubicin to TetC. Further experiments indicate that doxorubicin and lavendustin are potential candidates to use in preparing a bidentate inhibitor specific for TetC. The simultaneous binding of two different predicted Site-2 ligands to TetC suggests that they may bind multiple sites. Another possibility is that the conformations of the binding sites are dynamic and can bind multiple diverse ligands at a single site depending on the pre-existing conformation of the protein, especially when doxorubicin is already bound.
Ap4A and ADP-beta-S binding to P2 purinoceptors present on rat brain synaptic terminals.
Pintor, J.; Díaz-Rey, M. A.; Miras-Portugal, M. T.
1993-01-01
1. Diadenosine tetraphosphate (Ap4A), a dinucleotide stored in and released from rat brain synaptic terminals, presents two binding sites of different affinity in synaptosomes. When [3H]-Ap4A was used for binding studies, a Kd value of 0.10 +/- 0.014 nM and a Bmax value of 16.6 +/- 1.2 fmol mg-1 protein were obtained for the high affinity binding site from the Scatchard analysis. The second binding site, obtained by displacement studies, showed a Ki value of 0.57 +/- 0.09 microM. 2. Displacement of [3H]-Ap4A by non-labelled Ap4A and P2-purinoceptor ligands showed a displacement order of Ap4A > adenosine 5'-O-(2-thiodiphosphate) (ADP-beta-S) > 5'-adenylyl-imidodiphosphate (AMP-PNP) > alpha,beta-methylene adenosine 5'-triphosphate (alpha,beta-MeATP) at both sites, as revealed by the Ki values of 0.017 nM, 0.030 nM, 0.058 nM and 0.147 nM respectively for the high affinity binding site and values of 0.57 microM, 0.87 microM, 2.20 microM and 4.28 microM respectively for the second binding site. 3. Studies of the P2-purinoceptors present in synaptosomes were also performed with [35S]-ADP-beta-S. This radioligand showed two binding sites, the first with Kd and Bmax values of 0.11 +/- 0.022 nM and 3.9 +/- 2.1 fmol mg-1 of protein respectively for the high affinity binding site, obtained from the Scatchard plot. The second binding site showed a Ki of 0.018 +/- 0.0035 microM, obtained from displacement curves. 4. Competition studies with diadenosine polyphosphates of [35S]-ADP-beta-S binding showed a displacement order of Ap4A > Ap5A > Ap6A in the high affinity binding site and Ki values of 0.023 nM, 0.081 nM and 5.72 nM respectively. (ABSTRACT TRUNCATED AT 250 WORDS) PMID:8485620
Influence of sulfhydryl sites on metal binding by bacteria
NASA Astrophysics Data System (ADS)
Nell, Ryan M.; Fein, Jeremy B.
2017-02-01
The role of sulfhydryl sites within bacterial cell envelopes is still unknown, but the sites may control the fate and bioavailability of metals. Organic sulfhydryl compounds are important complexing ligands in aqueous systems and they can influence metal speciation in natural waters. Though representing only approximately 5-10% of the total available binding sites on bacterial surfaces, sulfhydryl sites exhibit high binding affinities for some metals. Due to the potential importance of bacterial sulfhydryl sites in natural systems, metal-bacterial sulfhydryl site binding constants must be determined in order to construct accurate models of the fate and distribution of metals in these systems. To date, only Cd-sulfhydryl binding has been quantified. In this study, the thermodynamic stabilities of Mn-, Co-, Ni-, Zn-, Sr- and Pb-sulfhydryl bacterial cell envelope complexes were determined for the bacterial species Shewanella oneidensis MR-1. Metal adsorption experiments were conducted as a function of both pH, ranging from 5.0 to 7.0, and metal loading, from 0.5 to 40.0 μmol/g (wet weight) bacteria, in batch experiments in order to determine if metal-sulfhydryl binding occurs. Initially, the data were used to calculate the value of the stability constants for the important metal-sulfhydryl bacterial complexes for each metal-loading condition studied, assuming a single binding reaction for the dominant metal-binding site type under the pH conditions of the experiments. For most of the metals that we studied, these calculated stability constant values increased significantly with decreasing metal loading, strongly suggesting that our initial assumption was not valid and that more than one type of binding occurs at the assumed binding site. 
We then modeled each dataset with two distinct site types with identical acidity constants: one site with a high metal-site stability constant value, which we take to represent metal-sulfhydryl binding and which dominates under low metal loading conditions, and another more abundant site that we term non-sulfhydryl sites that becomes important at high metal loadings. The resulting calculated stability constants do not vary significantly as a function of metal loading and yield reasonable fits to the observed adsorption behaviors as a function of both pH and metal loading. We use the results to calculate the speciation of metals bound by the bacterial envelope in realistic bacteria-bearing, heavy metal contaminated systems in order to demonstrate the potential importance of metal-sulfhydryl binding in the budget of bacterially-adsorbed metals under low metal-loading conditions.
Discovery of the ammonium substrate site on glutamine synthetase, a third cation binding site.
Liaw, S. H.; Kuo, I.; Eisenberg, D.
1995-01-01
Glutamine synthetase (GS) catalyzes the ATP-dependent condensation of ammonia and glutamate to yield glutamine, ADP, and inorganic phosphate in the presence of divalent cations. Bacterial GS is an enzyme of 12 identical subunits, arranged in two rings of 6, with the active site between each pair of subunits in a ring. In earlier work, we have reported the locations within the funnel-shaped active site of the substrates glutamate and ATP and of the two divalent cations, but the site for ammonia (or ammonium) has remained elusive. Here we report the discovery by X-ray crystallography of a binding site on GS for monovalent cations, Tl+ and Cs+, which is probably the binding site for the substrate ammonium ion. Fourier difference maps show the following. (1) Tl+ and Cs+ bind at essentially the same site, with ligands being Glu 212, Tyr 179, Asp 50', Ser 53' of the adjacent subunit, and the substrate glutamate. From its position adjacent to the substrate glutamate and the cofactor ADP, we propose that this monovalent cation site is the substrate ammonium ion binding site. This proposal is supported by enzyme kinetics. Our kinetic measurements show that Tl+, Cs+, and NH4+ are competitive inhibitors to NH2OH in the gamma-glutamyl transfer reaction. (2) GS is a trimetallic enzyme containing two divalent cation sites (n1, n2) and one monovalent cation site per subunit. These three closely spaced ions are all at the active site: the distance between n1 and n2 is 6 Å, between n1 and Tl+ is 4 Å, and between n2 and Tl+ is 7 Å. Glu 212 and the substrate glutamate are bridging ligands for the n1 ion and Tl+. (3) The presence of a monovalent cation in this site may enhance the structural stability of GS, because of its effect of balancing the negative charges of the substrate glutamate and its ligands and because of strengthening the "side-to-side" intersubunit interaction through the cation-protein bonding.
(4) The presence of the cofactor ADP increases the Tl+ binding to GS because ADP binding induces movement of Asp 50' toward this monovalent cation site, essentially forming the site. This observation supports a two-step mechanism with ordered substrate binding: ATP first binds to GS, then Glu binds and attacks ATP to form gamma-glutamyl phosphate and ADP, which complete the ammonium binding site. The third substrate, an ammonium ion, then binds to GS, and then loses a proton to form the more active species ammonia, which attacks the gamma-glutamyl phosphate to yield Gln. (5) Because the products (Glu or Gln) of the reactions catalyzed by GS are determined by the molecule (water or ammonium) attacking the intermediate gamma-glutamyl phosphate, this negatively charged ammonium binding pocket has been designed naturally for high affinity of ammonium to GS, permitting glutamine synthesis to proceed in aqueous solution. PMID:8563633
RNA binding protein and binding site useful for expression of recombinant molecules
Mayfield, Stephen P.
2006-10-17
The present invention relates to a gene expression system in eukaryotic and prokaryotic cells, preferably plant cells and intact plants. In particular, the invention relates to an expression system having an RB47 binding site upstream of a translation initiation site for regulation of translation mediated by binding of RB47 protein, a member of the poly(A) binding protein family. Regulation is further effected by RB60, a protein disulfide isomerase. The expression system is capable of functioning in the nuclear/cytoplasm of cells and in the chloroplast of plants. Translation regulation of a desired molecule is enhanced approximately 100-fold over that obtained without RB47 binding site activation.
RNA binding protein and binding site useful for expression of recombinant molecules
Mayfield, Stephen
2000-01-01
The present invention relates to a gene expression system in eukaryotic and prokaryotic cells, preferably plant cells and intact plants. In particular, the invention relates to an expression system having an RB47 binding site upstream of a translation initiation site for regulation of translation mediated by binding of RB47 protein, a member of the poly(A) binding protein family. Regulation is further effected by RB60, a protein disulfide isomerase. The expression system is capable of functioning in the nuclear/cytoplasm of cells and in the chloroplast of plants. Translation regulation of a desired molecule is enhanced approximately 100-fold over that obtained without RB47 binding site activation.
Acceleration of Binding Site Comparisons by Graph Partitioning.
Krotzky, Timo; Klebe, Gerhard
2015-08-01
The comparison of protein binding sites is a prominent task in computational chemistry and has been studied in many different ways. For the automatic detection and comparison of putative binding cavities, the Cavbase system has been developed, which uses a coarse-grained set of pseudocenters to represent the physicochemical properties of a binding site and employs a graph-based procedure to calculate similarities between two binding sites. However, the comparison of two graphs is computationally quite demanding, which makes large-scale studies such as the rapid screening of entire databases hardly feasible. In a recent work, we proposed the method Local Cliques (LC) for the efficient comparison of Cavbase binding sites. It employs a clique heuristic to detect the maximum common subgraph of two binding sites and an extended graph model to additionally compare the shape of individual surface patches. In this study, we present an alternative to further accelerate the LC method by partitioning the binding-site graphs into disjoint components prior to their comparisons. The pseudocenter sets are split with regard to their assigned physicochemical type, which leads to seven much smaller graphs than the original one. Applying this approach to the same test scenarios as the former, unpartitioned comparison results in a significant speed-up without sacrificing accuracy. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
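The partitioning step described in this abstract can be illustrated with a minimal sketch. It is not the Cavbase or LC implementation: the type labels, the coordinates, the `max_edge` distance cutoff, and the function names `partition_by_type` and `build_graph` are all assumptions made for illustration, and the abstract itself does not name the seven pseudocenter types.

```python
from collections import defaultdict
from itertools import combinations
from math import dist


def partition_by_type(pseudocenters):
    """Group (type_label, (x, y, z)) pseudocenters by their type label."""
    groups = defaultdict(list)
    for type_label, coords in pseudocenters:
        groups[type_label].append(coords)
    return dict(groups)


def build_graph(points, max_edge=12.0):
    """Return edges (i, j, distance) for point pairs at most max_edge apart.

    max_edge is an assumed cutoff, not a value taken from the abstract.
    """
    edges = []
    for i, j in combinations(range(len(points)), 2):
        d = dist(points[i], points[j])
        if d <= max_edge:
            edges.append((i, j, d))
    return edges


# Hypothetical binding site with three of the pseudocenter types.
site = [
    ("donor", (0.0, 0.0, 0.0)),
    ("donor", (3.0, 4.0, 0.0)),
    ("acceptor", (1.0, 1.0, 1.0)),
    ("aromatic", (2.0, 0.0, 0.0)),
    ("acceptor", (1.0, 1.0, 4.0)),
]

# One small graph per type instead of a single graph over all pseudocenters.
subgraphs = {t: build_graph(pts) for t, pts in partition_by_type(site).items()}
```

In this toy case the unpartitioned graph would have up to 10 candidate edges over five nodes, while the per-type graphs together hold only two, which hints at why comparing the partitioned graphs is cheaper. How the published method retains accuracy after dropping cross-type edges is not covered by this sketch.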
GenProBiS: web server for mapping of sequence variants to protein binding sites.
Konc, Janez; Skrlj, Blaz; Erzen, Nika; Kunej, Tanja; Janezic, Dusanka
2017-07-03
Discovery of potentially deleterious sequence variants is important and has wide implications for research and generation of new hypotheses in human and veterinary medicine, and drug discovery. The GenProBiS web server maps sequence variants to protein structures from the Protein Data Bank (PDB), and further to protein-protein, protein-nucleic acid, protein-compound, and protein-metal ion binding sites. The concept of a protein-compound binding site is understood in the broadest sense, which includes glycosylation and other post-translational modification sites. Binding sites were defined by local structural comparisons of whole protein structures using the Protein Binding Sites (ProBiS) algorithm and transposition of ligands from the similar binding sites found to the query protein using the ProBiS-ligands approach with new improvements introduced in GenProBiS. Binding site surfaces were generated as three-dimensional grids encompassing the space occupied by predicted ligands. The server allows intuitive visual exploration of comprehensively mapped variants, such as human somatic missense mutations related to cancer and non-synonymous single nucleotide polymorphisms from 21 species, within the predicted binding site regions for about 80 000 PDB protein structures using fast WebGL graphics. The GenProBiS web server is open and free to all users at http://genprobis.insilab.org. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Artemis and ACT: viewing, annotating and comparing sequences stored in a relational database.
Carver, Tim; Berriman, Matthew; Tivey, Adrian; Patel, Chinmay; Böhme, Ulrike; Barrell, Barclay G; Parkhill, Julian; Rajandream, Marie-Adèle
2008-12-01
Artemis and Artemis Comparison Tool (ACT) have become mainstream tools for viewing and annotating sequence data, particularly for microbial genomes. Since its first release, Artemis has been continuously developed and supported with additional functionality for editing and analysing sequences based on feedback from an active user community of laboratory biologists and professional annotators. Nevertheless, its utility has been somewhat restricted by its limitation to reading and writing from flat files. Therefore, a new version of Artemis has been developed, which reads from and writes to a relational database schema, and allows users to annotate more complex, often large and fragmented, genome sequences. Artemis and ACT have now been extended to read and write directly to the Generic Model Organism Database (GMOD, http://www.gmod.org) Chado relational database schema. In addition, a Gene Builder tool has been developed to provide structured forms and tables to edit coordinates of gene models and edit functional annotation, based on standard ontologies, controlled vocabularies and free text. Artemis and ACT are freely available (under a GPL licence) for download (for MacOSX, UNIX and Windows) at the Wellcome Trust Sanger Institute web sites: http://www.sanger.ac.uk/Software/Artemis/ http://www.sanger.ac.uk/Software/ACT/
Emerman, Amy B; Blower, Michael
2018-06-14
RNA-binding proteins (RBPs) are critical regulators of gene expression. Recent studies have uncovered hundreds of mRNA-binding proteins that do not contain annotated RNA-binding domains and have well-established roles in other cellular processes. Investigation of these nonconventional RBPs is critical for revealing novel RNA-binding domains and may disclose connections between RNA regulation and other aspects of cell biology. Endosomal sorting complex required for transport II (ESCRT-II) is a nonconventional RNA-binding complex that has a canonical role in multivesicular body formation. ESCRT-II previously has been identified as an RNA-binding complex in Drosophila oocytes, but whether its RNA-binding properties extend beyond Drosophila is unknown. In this study, we found that the RNA-binding properties of ESCRT-II are conserved in Xenopus eggs, where ESCRT-II interacted with hundreds of mRNAs. Using a UV-crosslinking approach, we demonstrated that ESCRT-II binds directly to RNA through its subunit Vps25. UV-crosslinking and immunoprecipitation (CLIP)-Seq revealed that Vps25 specifically recognizes a polypurine (i.e. GA-rich) motif in RNA. Using purified components, we could reconstitute the selective Vps25-mediated binding of the polypurine motif in vitro. Our results provide insight into the mechanism by which ESCRT-II selectively binds to mRNAs and also suggest an unexpected link between endosome biology and RNA regulation. Published under license by The American Society for Biochemistry and Molecular Biology, Inc.
Dadaev, Tokhir; Saunders, Edward J; Newcombe, Paul J; Anokian, Ezequiel; Leongamornlert, Daniel A; Brook, Mark N; Cieza-Borrella, Clara; Mijuskovic, Martina; Wakerell, Sarah; Olama, Ali Amin Al; Schumacher, Fredrick R; Berndt, Sonja I; Benlloch, Sara; Ahmed, Mahbubl; Goh, Chee; Sheng, Xin; Zhang, Zhuo; Muir, Kenneth; Govindasami, Koveela; Lophatananon, Artitaya; Stevens, Victoria L; Gapstur, Susan M; Carter, Brian D; Tangen, Catherine M; Goodman, Phyllis; Thompson, Ian M; Batra, Jyotsna; Chambers, Suzanne; Moya, Leire; Clements, Judith; Horvath, Lisa; Tilley, Wayne; Risbridger, Gail; Gronberg, Henrik; Aly, Markus; Nordström, Tobias; Pharoah, Paul; Pashayan, Nora; Schleutker, Johanna; Tammela, Teuvo L J; Sipeky, Csilla; Auvinen, Anssi; Albanes, Demetrius; Weinstein, Stephanie; Wolk, Alicja; Hakansson, Niclas; West, Catharine; Dunning, Alison M; Burnet, Neil; Mucci, Lorelei; Giovannucci, Edward; Andriole, Gerald; Cussenot, Olivier; Cancel-Tassin, Géraldine; Koutros, Stella; Freeman, Laura E Beane; Sorensen, Karina Dalsgaard; Orntoft, Torben Falck; Borre, Michael; Maehle, Lovise; Grindedal, Eli Marie; Neal, David E; Donovan, Jenny L; Hamdy, Freddie C; Martin, Richard M; Travis, Ruth C; Key, Tim J; Hamilton, Robert J; Fleshner, Neil E; Finelli, Antonio; Ingles, Sue Ann; Stern, Mariana C; Rosenstein, Barry; Kerns, Sarah; Ostrer, Harry; Lu, Yong-Jie; Zhang, Hong-Wei; Feng, Ninghan; Mao, Xueying; Guo, Xin; Wang, Guomin; Sun, Zan; Giles, Graham G; Southey, Melissa C; MacInnis, Robert J; FitzGerald, Liesel M; Kibel, Adam S; Drake, Bettina F; Vega, Ana; Gómez-Caamaño, Antonio; Fachal, Laura; Szulkin, Robert; Eklund, Martin; Kogevinas, Manolis; Llorca, Javier; Castaño-Vinyals, Gemma; Penney, Kathryn L; Stampfer, Meir; Park, Jong Y; Sellers, Thomas A; Lin, Hui-Yi; Stanford, Janet L; Cybulski, Cezary; Wokolorczyk, Dominika; Lubinski, Jan; Ostrander, Elaine A; Geybels, Milan S; Nordestgaard, Børge G; Nielsen, Sune F; Weisher, Maren; Bisbjerg, Rasmus; Røder, Martin Andreas; 
Iversen, Peter; Brenner, Hermann; Cuk, Katarina; Holleczek, Bernd; Maier, Christiane; Luedeke, Manuel; Schnoeller, Thomas; Kim, Jeri; Logothetis, Christopher J; John, Esther M; Teixeira, Manuel R; Paulo, Paula; Cardoso, Marta; Neuhausen, Susan L; Steele, Linda; Ding, Yuan Chun; De Ruyck, Kim; De Meerleer, Gert; Ost, Piet; Razack, Azad; Lim, Jasmine; Teo, Soo-Hwang; Lin, Daniel W; Newcomb, Lisa F; Lessel, Davor; Gamulin, Marija; Kulis, Tomislav; Kaneva, Radka; Usmani, Nawaid; Slavov, Chavdar; Mitev, Vanio; Parliament, Matthew; Singhal, Sandeep; Claessens, Frank; Joniau, Steven; Van den Broeck, Thomas; Larkin, Samantha; Townsend, Paul A; Aukim-Hastie, Claire; Gago-Dominguez, Manuela; Castelao, Jose Esteban; Martinez, Maria Elena; Roobol, Monique J; Jenster, Guido; van Schaik, Ron H N; Menegaux, Florence; Truong, Thérèse; Koudou, Yves Akoli; Xu, Jianfeng; Khaw, Kay-Tee; Cannon-Albright, Lisa; Pandha, Hardev; Michael, Agnieszka; Kierzek, Andrzej; Thibodeau, Stephen N; McDonnell, Shannon K; Schaid, Daniel J; Lindstrom, Sara; Turman, Constance; Ma, Jing; Hunter, David J; Riboli, Elio; Siddiq, Afshan; Canzian, Federico; Kolonel, Laurence N; Le Marchand, Loic; Hoover, Robert N; Machiela, Mitchell J; Kraft, Peter; Freedman, Matthew; Wiklund, Fredrik; Chanock, Stephen; Henderson, Brian E; Easton, Douglas F; Haiman, Christopher A; Eeles, Rosalind A; Conti, David V; Kote-Jarai, Zsofia
2018-06-11
Prostate cancer is a polygenic disease with a large heritable component. A number of common, low-penetrance prostate cancer risk loci have been identified through GWAS. Here we apply the Bayesian multivariate variable selection algorithm JAM to fine-map 84 prostate cancer susceptibility loci, using summary data from a large European ancestry meta-analysis. We observe evidence for multiple independent signals at 12 regions and 99 risk signals overall. Only 15 original GWAS tag SNPs remain among the catalogue of candidate variants identified; the remainder are replaced by more likely candidates. Biological annotation of our credible set of variants indicates significant enrichment within promoter and enhancer elements, and transcription factor-binding sites, including AR, ERG and FOXA1. In 40 regions at least one variant is colocalised with an eQTL in prostate cancer tissue. The refined set of candidate variants substantially increases the proportion of familial relative risk explained by these known susceptibility regions, which highlights the importance of fine-mapping studies and has implications for clinical risk profiling.
Clustering approaches to identifying gene expression patterns from DNA microarray data.
Do, Jin Hwan; Choi, Dong-Kug
2008-04-30
The analysis of microarray data is essential for large amounts of gene expression data. In this review we focus on clustering techniques. The biological rationale for this approach is the fact that many co-expressed genes are co-regulated, and identifying co-expressed genes could aid in functional annotation of novel genes, de novo identification of transcription factor binding sites and elucidation of complex biological pathways. Co-expressed genes are usually identified in microarray experiments by clustering techniques. There are many such methods, and the results obtained even for the same datasets may vary considerably depending on the algorithms and metrics for dissimilarity measures used, as well as on user-selectable parameters such as desired number of clusters and initial values. Therefore, biologists who want to interpret microarray data should be aware of the weaknesses and strengths of the clustering methods used. In this review, we survey the basic principles of clustering of DNA microarray data from crisp clustering algorithms such as hierarchical clustering, K-means and self-organizing maps, to complex clustering algorithms like fuzzy clustering.
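As a rough illustration of one of the crisp clustering methods named in this abstract, the following is a minimal K-means sketch in plain Python. It is not taken from the review: the function names, the toy expression profiles, and the choice of initial centroids are assumptions, and the initial centroids are passed in explicitly because the abstract notes that results can depend on initial values.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(profiles, centroids):
    """Label each profile with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
        for p in profiles
    ]


def kmeans(profiles, init_centroids, n_iter=20):
    """Plain K-means: alternate assignment and centroid update until stable."""
    centroids = [list(c) for c in init_centroids]
    for _ in range(n_iter):
        labels = assign(profiles, centroids)
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(profiles, labels) if lab == k]
            # Keep the old centroid if a cluster ends up empty.
            updated.append(
                [sum(col) / len(members) for col in zip(*members)] if members else old
            )
        if updated == centroids:
            break
        centroids = updated
    return assign(profiles, centroids), centroids


# Hypothetical expression profiles: four genes measured under three conditions.
profiles = [
    [1.0, 1.1, 0.9],
    [0.9, 1.0, 1.0],
    [5.0, 5.2, 4.9],
    [5.1, 4.9, 5.0],
]
labels, centroids = kmeans(profiles, init_centroids=[profiles[0], profiles[2]])
```

With these well-separated toy profiles the first two genes fall into one cluster and the last two into the other. On real data, rerunning with different initial centroids or a different number of clusters is a simple way to see the sensitivity the abstract warns about.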
Annotation and Structural Analysis of Sialylated Human Milk Oligosaccharides
Wu, Shuai; Grimm, Rudolf; German, J. Bruce; Lebrilla, Carlito B.
2011-01-01
Sialylated human milk oligosaccharides (SHMOs) are important components of human milk oligosaccharides. Sialic acids are typically found on the nonreducing end and are known binding sites for pathogens and aid in neonates’ brain development. Due to their negative charge and hydrophilic nature, they also help modulate cell-cell interactions. It has also been shown that sialic acids are involved in regulating the immune response and aid in brain development. In this study, the enriched SHMOs from a pooled milk sample were analyzed by HPLC-Chip/QTOF MS. The instrument employs a microchip-based nano-LC column packed with porous graphitized carbon (PGC) to provide excellent isomer separation for SHMOs with highly reproducible retention times. The precursor ions were further examined with collision-induced dissociation (CID). By applying the proper collision energy, isomers can be readily differentiated by diagnostic peaks and characteristic fragmentation patterns. A set of 30 SHMO structures with retention times, accurate masses and MS/MS spectra was deduced and incorporated into an HMO library. When combined with previously determined neutral components, a library with over 70 structures is obtained, allowing high-throughput oligosaccharide structure identification. PMID:21133381
Structure-Based Annotation of a Novel Sugar Isomerase from the Pathogenic E. coli O157:H7
DOE Office of Scientific and Technical Information (OSTI.GOV)
van Staalduinen, L.; Park, C; Yeom, S
2010-01-01
Prokaryotes can use a variety of sugars as carbon sources in order to provide a selective survival advantage. The gene z5688 found in the pathogenic Escherichia coli O157:H7 encodes a 'hypothetical' protein of unknown function. Sequence analysis identified the gene product as a putative member of the cupin superfamily of proteins, but no other functional information was known. We have determined the crystal structure of the Z5688 protein at 1.6 Å resolution and identified the protein as a novel E. coli sugar isomerase (EcSI) through overall fold analysis and secondary-structure matching. Extensive substrate screening revealed that EcSI is capable of acting on D-lyxose and D-mannose. The complex structure of EcSI with fructose allowed the identification of key active-site residues, and mutagenesis confirmed their importance. The structure of EcSI also suggested a novel mechanism for substrate binding and product release in a cupin sugar isomerase. Supplementation of a nonpathogenic E. coli strain with EcSI enabled cell growth on the rare pentose D-lyxose.
Wu, Wei; Park, Kyung-Tae; Holyoak, Todd; Lutkenhaus, Joe
2011-01-01
Summary The three Min proteins spatially regulate Z ring positioning in E. coli and are dynamically associated with the membrane. MinD binds to vesicles in the presence of ATP and can recruit MinC or MinE. Biochemical and genetic evidence indicates that the binding sites for these two proteins on MinD overlap. Here we solved the structure of a hydrolytic-deficient mutant of MinD truncated for the C-terminal amphipathic helix involved in binding to the membrane. The structure solved in the presence of ATP is a dimer and reveals the face of MinD abutting the membrane. Using a combination of random and extensive site-directed mutagenesis, additional residues important for MinE and MinC binding were identified. The location of these residues on the MinD structure confirms that the binding sites overlap and reveals that the binding sites are at the dimer interface and exposed to the cytosol. The location of the binding sites at the dimer interface offers a simple explanation for the ATP-dependency of MinC and MinE binding to MinD. PMID:21231967
Interaction between phloretin and the red blood cell membrane
1976-01-01
Phloretin binding to red blood cell components has been characterized at pH 6, where binding and inhibitory potency are maximal. Binding to intact red cells and to purified hemoglobin are nonsaturated processes approximately equal in magnitude, which strongly suggests that most of the red cell binding may be ascribed to hemoglobin. This conclusion is supported by the fact that hemoglobin-free red cell ghosts can bind only 10% as much phloretin as an equivalent number of red cells. The permeability of the red cell membrane to phloretin has been determined by a direct measurement of the time course of the phloretin uptake. At a 2% hematocrit, the half time for phloretin uptake is 8.7 s, corresponding to a permeability coefficient of 2 × 10⁻⁴ cm/s. The concentration dependence of the binding to ghosts reveals two saturable components. Phloretin binds with high affinity (Kdiss = 1.5 µM) to about 2.5 × 10⁶ sites per cell; it also binds with lower affinity (Kdiss = 54 µM) to a second (5.5 × 10⁷ per cell) set of sites. In sonicated total lipid extracts of red cell ghosts, phloretin binding consists of a single, saturable component. Its affinity and total number of sites are not significantly different from those of the low affinity binding process in ghosts. No high affinity binding of phloretin is exhibited by the red cell lipid extracts. Therefore, the high affinity phloretin binding sites are related to membrane proteins, and the low affinity sites result from phloretin binding to lipid. The identification of these two types of binding sites allows phloretin effects on protein-mediated transport processes to be distinguished from effects on the lipid region of the membrane. PMID:5575
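The two saturable components reported for ghosts can be explored with a small sketch of a two-site binding isotherm. The dissociation constants and site counts below are the values quoted in the abstract; the assumption that the two components are independent and simply additive, and the function names, are illustrative choices rather than the authors' fitting procedure.

```python
# Values quoted in the abstract (per cell, concentrations in micromolar).
KD_HIGH_UM = 1.5      # high affinity dissociation constant
SITES_HIGH = 2.5e6    # high affinity sites per cell
KD_LOW_UM = 54.0      # low affinity dissociation constant
SITES_LOW = 5.5e7     # low affinity sites per cell


def bound_one_site(free_um, kd_um, n_sites):
    """Occupied sites for a single saturable component at a free concentration."""
    return n_sites * free_um / (kd_um + free_um)


def bound_two_sites(free_um):
    """Sum of two independent saturable components (assumed additive)."""
    high = bound_one_site(free_um, KD_HIGH_UM, SITES_HIGH)
    low = bound_one_site(free_um, KD_LOW_UM, SITES_LOW)
    return high, low, high + low


# At a free concentration equal to the high affinity Kdiss,
# half of the high affinity sites are occupied.
high, low, total = bound_two_sites(1.5)
```

Under these assumptions, the low affinity component already contributes slightly more bound phloretin than the high affinity component at 1.5 µM, because the low affinity sites are roughly twenty times more numerous. That is one way to see why the two components have to be separated by their concentration dependence rather than by total binding at a single concentration.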
A peek into tropomyosin binding and unfolding on the actin filament.
Singh, Abhishek; Hitchcock-Degregori, Sarah E
2009-07-24
Tropomyosin is a prototypical coiled coil along its length with subtle variations in structure that allow interactions with actin and other proteins. Actin binding globally stabilizes tropomyosin. Tropomyosin-actin interaction occurs periodically along the length of tropomyosin. However, it is not well understood how tropomyosin binds actin. Tropomyosin's periodic binding sites make differential contributions to two components of actin binding, cooperativity and affinity, and can be classified as primary or secondary sites. We show through mutagenesis and analysis of recombinant striated muscle alpha-tropomyosins that primary actin binding sites have a destabilizing coiled-coil interface, typically alanine-rich, embedded within a non-interface recognition sequence. Introduction of an Ala cluster in place of the native, more stable interface in period 2 and/or period 3 sites (of seven) increased the affinity or cooperativity of actin binding, analysed by cosedimentation and differential scanning calorimetry. Replacement of period 3 with period 5 sequence, an unstable region of known importance for cooperative actin binding, increased the cooperativity of binding. Introduction of the fluorescent probe, pyrene, near the mutation sites in periods 2 and 3 reported local instability, stabilization by actin binding, and local unfolding before or coincident with dissociation from actin (measured using light scattering), and chain dissociation (analyzed using circular dichroism). This, and previous work, suggests that regions of tropomyosin involved in binding actin have non-interface residues specific for interaction with actin and an unstable interface that is locally stabilized upon binding. The destabilized interface allows residues on the coiled-coil surface to obtain an optimal conformation for interaction with actin by increasing the number of local substates that the side chains can sample. 
We suggest that local disorder is a property typical of coiled coil binding sites and proteins that have multiple binding partners, of which tropomyosin is one type.
Ahmed, Ahmed H; Oswald, Robert E
2010-03-11
Glutamate receptors are the most prevalent excitatory neurotransmitter receptors in the vertebrate central nervous system and are important potential drug targets for cognitive enhancement and the treatment of schizophrenia. Allosteric modulators of AMPA receptors promote dimerization by binding to a dimer interface and reducing desensitization and deactivation. The pyrrolidine allosteric modulators, piracetam and aniracetam, were among the first of this class of drugs to be discovered. We have determined the structure of the ligand binding domain of the AMPA receptor subtypes GluA2 and GluA3 with piracetam and a corresponding structure of GluA3 with aniracetam. Both drugs bind to GluA2 and GluA3 in a very similar manner, suggesting little subunit specificity. However, the binding sites for piracetam and aniracetam differ considerably. Aniracetam binds to a symmetrical site at the center of the dimer interface. Piracetam binds to multiple sites along the dimer interface with low occupation, one of which is a unique binding site for potential allosteric modulators. This new site may be of importance in the design of new allosteric regulators.
Ahmed, Ahmed H.; Oswald, Robert E.
2010-01-01
Glutamate receptors are the most prevalent excitatory neurotransmitter receptors in the vertebrate central nervous system and are important potential drug targets for cognitive enhancement and the treatment of schizophrenia. Allosteric modulators of AMPA receptors promote dimerization by binding to a dimer interface and reducing desensitization and deactivation. The pyrrolidine allosteric modulators, piracetam and aniracetam, were among the first of this class of drugs to be discovered. We have determined the structure of the ligand binding domain of the AMPA receptor subtypes GluA2 and GluA3 with piracetam and a corresponding structure of GluA3 with aniracetam. Both drugs bind to both GluA2 and GluA3 in a very similar manner, suggesting little subunit specificity. However, the binding sites for piracetam and aniracetam differ considerably. Aniracetam binds to a symmetrical site at the center of the dimer interface. Piracetam binds to multiple sites along the dimer interface with low occupation, one of which is a unique binding site for potential allosteric modulators. This new site may be of importance in the design of new allosteric regulators. PMID:20163115
Amyloid tracers detect multiple binding sites in Alzheimer's disease brain tissue.
Ni, Ruiqing; Gillberg, Per-Göran; Bergfors, Assar; Marutle, Amelia; Nordberg, Agneta
2013-07-01
Imaging fibrillar amyloid-β deposition in the human brain in vivo by positron emission tomography has improved our understanding of the time course of amyloid-β pathology in Alzheimer's disease. The most widely used amyloid-β imaging tracer so far is (11)C-Pittsburgh compound B, a thioflavin derivative but other (11)C- and (18)F-labelled amyloid-β tracers have been studied in patients with Alzheimer's disease and cognitively normal control subjects. However, it has not yet been established whether different amyloid tracers bind to identical sites on amyloid-β fibrils, offering the same ability to detect the regional amyloid-β burden in the brains. In this study, we characterized (3)H-Pittsburgh compound B binding in autopsied brain regions from 23 patients with Alzheimer's disease and 20 control subjects (aged 50 to 88 years). The binding properties of the amyloid tracers FDDNP, AV-45, AV-1 and BF-227 were also compared with those of (3)H-Pittsburgh compound B in the frontal cortices of patients with Alzheimer's disease. Saturation binding studies revealed the presence of high- and low-affinity (3)H-Pittsburgh compound B binding sites in the frontal cortex (K(d1): 3.5 ± 1.6 nM; K(d2): 133 ± 30 nM) and hippocampus (K(d1):5.6 ± 2.2 nM; K(d2): 181 ± 132 nM) of Alzheimer's disease brains. The relative proportion of high-affinity to low-affinity sites was 6:1 in the frontal cortex and 3:1 in the hippocampus. One control showed both high- and low-affinity (3)H-Pittsburgh compound B binding sites (K(d1): 1.6 nM; K(d2): 330 nM) in the cortex while the others only had a low-affinity site (K(d2): 191 ± 70 nM). (3)H-Pittsburgh compound B binding in Alzheimer's disease brains was higher in the frontal and parietal cortices than in the caudate nucleus and hippocampus, and negligible in the cerebellum. 
Competitive binding studies with (3)H-Pittsburgh compound B in the frontal cortices of Alzheimer's disease brains revealed high- and low-affinity binding sites for BTA-1 (Ki: 0.2 nM, 70 nM), florbetapir (1.8 nM, 53 nM) and florbetaben (1.0 nM, 65 nM). BF-227 displaced 83% of (3)H-Pittsburgh compound B binding, mainly at a low-affinity site (311 nM), whereas FDDNP only partly displaced (40%). We propose a multiple binding site model for the amyloid tracers (binding sites 1, 2 and 3), where AV-45 (florbetapir), AV-1 (florbetaben), and Pittsburgh compound B, all show nanomolar affinity for the high-affinity site (binding site 1), as visualized by positron emission tomography. BF-227 shows mainly binding to site 3 and FDDNP shows only some binding to site 2. Different amyloid tracers may provide new insight into the pathophysiological mechanisms in the progression of Alzheimer's disease.
Role of Electrostatics in Protein-RNA Binding: The Global vs the Local Energy Landscape.
Ghaemi, Zhaleh; Guzman, Irisbel; Gnutt, David; Luthey-Schulten, Zaida; Gruebele, Martin
2017-09-14
U1A protein-stem loop 2 RNA association is a basic step in the assembly of the spliceosomal U1 small nuclear ribonucleoprotein. Long-range electrostatic interactions due to the positive charge of U1A are thought to provide high binding affinity for the negatively charged RNA. Short range interactions, such as hydrogen bonds and contacts between RNA bases and protein side chains, favor a specific binding site. Here, we propose that electrostatic interactions are as important as local contacts in biasing the protein-RNA energy landscape toward a specific binding site. We show by using molecular dynamics simulations that deletion of two long-range electrostatic interactions (K22Q and K50Q) leads to mutant-specific alternative RNA bound states. One of these states preserves short-range interactions with aromatic residues in the original binding site, while the other one does not. We test the computational prediction with experimental temperature-jump kinetics using a tryptophan probe in the U1A-RNA binding site. The two mutants show the distinct predicted kinetic behaviors. Thus, the stem loop 2 RNA has multiple binding sites on a rough RNA-protein binding landscape. We speculate that the rough protein-RNA binding landscape, when biased to different local minima by electrostatics, could be one way that protein-RNA interactions evolve toward new binding sites and novel function.
Xu, Youjun; Wang, Shiwei; Hu, Qiwan; Gao, Shuaishi; Ma, Xiaomin; Zhang, Weilin; Shen, Yihang; Chen, Fangjin; Lai, Luhua; Pei, Jianfeng
2018-05-10
CavityPlus is a web server that offers protein cavity detection and various functional analyses. Using protein three-dimensional structural information as the input, CavityPlus applies CAVITY to detect potential binding sites on the surface of a given protein structure and rank them based on ligandability and druggability scores. These potential binding sites can be further analysed using three submodules, CavPharmer, CorrSite, and CovCys. CavPharmer uses a receptor-based pharmacophore modelling program, Pocket, to automatically extract pharmacophore features within cavities. CorrSite identifies potential allosteric ligand-binding sites based on motion correlation analyses between cavities. CovCys automatically detects druggable cysteine residues, which is especially useful to identify novel binding sites for designing covalent allosteric ligands. Overall, CavityPlus provides an integrated platform for analysing comprehensive properties of protein binding cavities. Such analyses are useful for many aspects of drug design and discovery, including target selection and identification, virtual screening, de novo drug design, and allosteric and covalent-binding drug design. The CavityPlus web server is freely available at http://repharma.pku.edu.cn/cavityplus or http://www.pkumdl.cn/cavityplus.
de Souza, Gustavo A.; Arntzen, Magnus Ø.; Fortuin, Suereta; Schürch, Anita C.; Målen, Hiwa; McEvoy, Christopher R. E.; van Soolingen, Dick; Thiede, Bernd; Warren, Robin M.; Wiker, Harald G.
2011-01-01
Precise annotation of genes or open reading frames is still a difficult task that results in divergence even for data generated from the same genomic sequence. This has an impact in further proteomic studies, and also compromises the characterization of clinical isolates with many specific genetic variations that may not be represented in the selected database. We recently developed software called multistrain mass spectrometry prokaryotic database builder (MSMSpdbb) that can merge protein databases from several sources and be applied on any prokaryotic organism, in a proteomic-friendly approach. We generated a database for the Mycobacterium tuberculosis complex (using three strains of Mycobacterium bovis and five of M. tuberculosis), and analyzed data collected from two laboratory strains and two clinical isolates of M. tuberculosis. We identified 2561 proteins, of which 24 were present in M. tuberculosis H37Rv samples, but not annotated in the M. tuberculosis H37Rv genome. We were also able to identify 280 nonsynonymous single amino acid polymorphisms and confirm 367 translational start sites. As a proof of concept we applied the database to whole-genome DNA sequencing data of one of the clinical isolates, which allowed the validation of 116 predicted single amino acid polymorphisms and the annotation of 131 N-terminal start sites. Moreover we identified regions not present in the original M. tuberculosis H37Rv sequence, indicating strain divergence or errors in the reference sequence. In conclusion, we demonstrated the potential of using a merged database to better characterize laboratory or clinical bacterial strains. PMID:21030493
Berillo, Olga; Régnier, Mireille; Ivashchenko, Anatoly
2014-01-01
microRNAs are small RNA molecules that inhibit the translation of target genes. microRNA binding sites are located in the untranslated regions as well as in the coding domains. We describe TmiRUSite and TmiROSite scripts developed using Python as tools for the extraction of nucleotide sequences for miRNA binding sites with their encoded amino acid residue sequences. The scripts allow for retrieving a set of additional sequences to the left and right of the binding site. The scripts present all received data in table formats that are easy to analyse further. The predicted data finds utility in molecular and evolutionary biology studies. They find use in studying miRNA binding sites in animals and plants. TmiRUSite and TmiROSite scripts are available for free from authors upon request and at https://sites.google.com/site/malaheenee/downloads for download.
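The flank-retrieval step described in this abstract can be illustrated with a minimal sketch. This is not the TmiRUSite or TmiROSite code; the function name `extract_site`, its parameters, and the example sequence are hypothetical, and the sketch assumes 0-based, end-exclusive site coordinates. It covers only the retrieval of left and right flanking sequence, not the translation to amino acid residues.

```python
def extract_site(seq, start, end, left=0, right=0):
    """Return (left_flank, site, right_flank) for one binding site.

    Assumes 0-based, end-exclusive coordinates (hypothetical convention).
    Flanks are clipped at the sequence ends instead of raising an error.
    """
    if not (0 <= start < end <= len(seq)):
        raise ValueError("site coordinates fall outside the sequence")
    left_start = max(0, start - left)        # clip at the 5' end
    right_end = min(len(seq), end + right)   # clip at the 3' end
    return seq[left_start:start], seq[start:end], seq[end:right_end]


if __name__ == "__main__":
    mrna = "AUGGCUACGUUAGCCGAUAA"  # made-up sequence for illustration only
    print(extract_site(mrna, 6, 12, left=3, right=3))
```

Returning the three pieces separately keeps the site itself distinguishable from its context, which is convenient when the results are later written out as table columns.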
LHRH-pituitary plasma membrane binding: the presence of specific binding sites in other tissues.
Marshall, J C; Shakespear, R A; Odell, W D
1976-11-01
Two specific binding sites for LHRH are present on plasma membranes prepared from rat and bovine anterior pituitary glands. One site is of high affinity (K = 2 × 10^8 l/mol) and the second is of lower affinity (8.5 × 10^5 l/mol) and much greater capacity. Studies on membrane fractions prepared from other tissues showed the presence of a single specific site for LHRH. The kinetics and specificity of this site were similar to those of the lower affinity pituitary receptor. These results indicate that only pituitary membranes possess the higher affinity binding site and suggest that the low affinity site is not of physiological importance in the regulation of gonadotrophin secretion. After dissociation from membranes of non-pituitary tissues, 125I-LHRH rebound to pituitary membrane preparations. Thus receptor binding per se does not result in degradation of LHRH and the function of these peripheral receptors remains obscure.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Strauch, Eva-Maria; Bernard, Steffen M.; La, David
Many viral surface glycoproteins and cell surface receptors are homo-oligomers, and thus can potentially be targeted by geometrically matched homo-oligomers that engage all subunits simultaneously to attain high avidity and/or lock subunits together. The adaptive immune system cannot generally employ this strategy since the individual antibody binding sites are not arranged with appropriate geometry to simultaneously engage multiple sites in a single target homo-oligomer. We describe a general strategy for the computational design of homo-oligomeric protein assemblies with binding functionality precisely matched to homo-oligomeric target sites. In the first step, a small protein is designed that binds a single site on the target. In the second step, the designed protein is assembled into a homo-oligomer such that the designed binding sites are aligned with the target sites. We use this approach to design high-avidity trimeric proteins that bind influenza A hemagglutinin (HA) at its conserved receptor binding site. The designed trimers can both capture and detect HA in a paper-based diagnostic format, neutralize influenza in cell culture, and completely protect mice when given as a single dose 24 h before or after challenge with influenza.
An alternate binding site for PPARγ ligands
Hughes, Travis S.; Giri, Pankaj Kumar; de Vera, Ian Mitchelle S.; Marciano, David P.; Kuruvilla, Dana S.; Shin, Youseung; Blayo, Anne-Laure; Kamenecka, Theodore M.; Burris, Thomas P.; Griffin, Patrick R.; Kojetin, Douglas J.
2014-01-01
PPARγ is a target for insulin sensitizing drugs such as glitazones, which improve plasma glucose maintenance in patients with diabetes. Synthetic ligands have been designed to mimic endogenous ligand binding to a canonical ligand-binding pocket to hyperactivate PPARγ. Here we reveal that synthetic PPARγ ligands also bind to an alternate site, leading to unique receptor conformational changes that impact coregulator binding, transactivation and target gene expression. Using structure-function studies we show that alternate site binding occurs at pharmacologically relevant ligand concentrations, and is neither blocked by covalently bound synthetic antagonists nor by endogenous ligands indicating non-overlapping binding with the canonical pocket. Alternate site binding likely contributes to PPARγ hyperactivation in vivo, perhaps explaining why PPARγ full and partial or weak agonists display similar adverse effects. These findings expand our understanding of PPARγ activation by ligands and suggest that allosteric modulators could be designed to fine tune PPARγ activity without competing with endogenous ligands. PMID:24705063
Concerted formation of macromolecular Suppressor–mutator transposition complexes
Raina, Ramesh; Schläppi, Michael; Karunanandaa, Balasulojini; Elhofy, Adam; Fedoroff, Nina
1998-01-01
Transposition of the maize Suppressor–mutator (Spm) transposon requires two element-encoded proteins, TnpA and TnpD. Although there are multiple TnpA binding sites near each element end, binding of TnpA to DNA is not cooperative, and the binding affinity is not markedly affected by the number of binding sites per DNA fragment. However, intermolecular complexes form cooperatively between DNA fragments with three or more TnpA binding sites. TnpD, itself not a sequence-specific DNA-binding protein, binds to TnpA and stabilizes the TnpA–DNA complex. The high redundancy of TnpA binding sites at both element ends and the protein–protein interactions between DNA-bound TnpA complexes and between these and TnpD imply a concerted transition of the element from a linear to a protein crosslinked transposition complex within a very narrow protein concentration range. PMID:9671711
Kappel, Kalli; Miao, Yinglong; McCammon, J Andrew
2015-11-01
Elucidating the detailed process of ligand binding to a receptor is pharmaceutically important for identifying druggable binding sites. With the ability to provide atomistic detail, computational methods are well poised to study these processes. Here, accelerated molecular dynamics (aMD) is proposed to simulate processes of ligand binding to a G-protein-coupled receptor (GPCR), in this case the M3 muscarinic receptor, which is a target for treating many human diseases, including cancer, diabetes and obesity. Long-timescale aMD simulations were performed to observe the binding of three chemically diverse ligand molecules: antagonist tiotropium (TTP), partial agonist arecoline (ARc) and full agonist acetylcholine (ACh). In comparison with earlier microsecond-timescale conventional MD simulations, aMD greatly accelerated the binding of ACh to the receptor orthosteric ligand-binding site and the binding of TTP to an extracellular vestibule. Further aMD simulations also captured binding of ARc to the receptor orthosteric site. Additionally, all three ligands were observed to bind in the extracellular vestibule during their binding pathways, suggesting that it is a metastable binding site. This study demonstrates the applicability of aMD to protein-ligand binding, especially the drug recognition of GPCRs.
Oligomycin frames a common drug-binding site in the ATP synthase
DOE Office of Scientific and Technical Information (OSTI.GOV)
Symersky, Jindrich; Osowski, Daniel; Walters, D. Eric
We report the high-resolution (1.9 Å) crystal structure of oligomycin bound to the subunit c10 ring of the yeast mitochondrial ATP synthase. Oligomycin binds to the surface of the c10 ring making contact with two neighboring molecules at a position that explains the inhibitory effect on ATP synthesis. The carboxyl side chain of Glu59, which is essential for proton translocation, forms an H-bond with oligomycin via a bridging water molecule but is otherwise shielded from the aqueous environment. The remaining contacts between oligomycin and subunit c are primarily hydrophobic. The amino acid residues that form the oligomycin-binding site are 100% conserved between human and yeast but are widely different from those in bacterial homologs, thus explaining the differential sensitivity to oligomycin. Prior genetics studies suggest that the oligomycin-binding site overlaps with the binding site of other antibiotics, including those effective against Mycobacterium tuberculosis, and thereby frames a common 'drug-binding site.' We anticipate that this drug-binding site will serve as an effective target for new antibiotics developed by rational design.
Nisius, Britta; Gohlke, Holger
2012-09-24
Analyzing protein binding sites provides detailed insights into the biological processes proteins are involved in, e.g., into drug-target interactions, and so is of crucial importance in drug discovery. Herein, we present novel alignment-independent binding site descriptors based on DrugScore potential fields. The potential fields are transformed to a set of information-rich descriptors using a series expansion in 3D Zernike polynomials. The resulting Zernike descriptors show a promising performance in detecting similarities among proteins with low pairwise sequence identities that bind identical ligands, as well as within subfamilies of one target class. Furthermore, the Zernike descriptors are robust against structural variations among protein binding sites. Finally, the Zernike descriptors show a high data compression power, and computing similarities between binding sites based on these descriptors is highly efficient. Consequently, the Zernike descriptors are a useful tool for computational binding site analysis, e.g., to predict the function of novel proteins, off-targets for drug candidates, or novel targets for known drugs.
Rosenberry, Terrone L; Sonoda, Leilani K; Dekat, Sarah E; Cusack, Bernadette; Johnson, Joseph L
2008-12-09
Acetylcholinesterase (AChE) contains a narrow and deep active site gorge with two sites of ligand binding, an acylation site (or A-site) at the base of the gorge and a peripheral site (or P-site) near the gorge entrance. The P-site contributes to catalytic efficiency by transiently binding substrates on their way to the acylation site, where a short-lived acylated enzyme intermediate is produced. Carbamates are very poor substrates that, like other AChE substrates, form an initial enzyme-substrate complex with free AChE (E) and proceed to an acylated enzyme intermediate (EC), which is then hydrolyzed. However, the hydrolysis of EC is slow enough to resolve the acylation and deacylation steps on the catalytic pathway. Here, we focus on the reaction of carbachol (carbamoylcholine) with AChE. The kinetics and thermodynamics of this reaction are of special interest because carbachol is an isosteric analogue of the physiological substrate acetylcholine. We show that the reaction can be monitored with thioflavin T as a fluorescent reporter group. The fluorescence of thioflavin T is strongly enhanced when it binds to the P-site of AChE, and this fluorescence is partially quenched when a second ligand binds to the A-site to form a ternary complex. Analysis of the fluorescence reaction profiles was challenging because four thermodynamic parameters and two fluorescence coefficients were fitted from the combined data both for E and for EC. Respective equilibrium dissociation constants of 6 and 26 mM were obtained for carbachol binding to the A- and P-sites in E and of 2 and 32 mM for carbachol binding to the A- and P-sites in EC. These constants for the binding of carbachol to the P-site are about an order of magnitude larger (i.e., indicating lower affinity) than previous estimates for the binding of acetylthiocholine to the P-site.
Rosenberry, Terrone L.; Sonoda, Leilani K.; Dekat, Sarah E.; Cusack, Bernadette; Johnson, Joseph L.
2009-01-01
Acetylcholinesterase (AChE) contains a narrow and deep active site gorge with two sites of ligand binding, an acylation site (or A-site) at the base of the gorge and a peripheral site (or P-site) near the gorge entrance. The P-site contributes to catalytic efficiency by transiently binding substrates on their way to the acylation site, where a short-lived acylated enzyme intermediate is produced. Carbamates are very poor substrates that, like other AChE substrates, form an initial enzyme-substrate complex with free AChE (E) and proceed to an acylated enzyme intermediate (EC) which is then hydrolyzed. However, the hydrolysis of EC is slow enough to resolve the acylation and deacylation steps on the catalytic pathway. Here we focus on the reaction of carbachol (carbamoylcholine) with AChE. The kinetics and thermodynamics of this reaction are of special interest because carbachol is an isosteric analog of the physiological substrate acetylcholine. We show that the reaction can be monitored with thioflavin T as a fluorescent reporter group. The fluorescence of thioflavin T is strongly enhanced when it binds to the P-site of AChE, and this fluorescence is partially quenched when a second ligand binds to the A-site to form a ternary complex. Analysis of the fluorescence reaction profiles was challenging, because four thermodynamic parameters and two fluorescence coefficients were fitted from the combined data both for E and for EC. Respective equilibrium dissociation constants of 6 and 26 mM were obtained for carbachol binding to the A- and P-sites in E and of 2 and 32 mM for carbachol binding to the A- and P-sites in EC. These constants for the binding of carbachol to the P-site are about an order of magnitude larger (i.e., indicating lower affinity) than previous estimates for the binding of acetylthiocholine to the P-site. PMID:19006330
Drug Promiscuity in PDB: Protein Binding Site Similarity Is Key.
Haupt, V Joachim; Daminelli, Simone; Schroeder, Michael
2013-01-01
Drug repositioning applies established drugs to new disease indications with increasing success. A pre-requisite for drug repurposing is drug promiscuity (polypharmacology) - a drug's ability to bind to several targets. There is a long standing debate on the reasons for drug promiscuity. Based on large compound screens, hydrophobicity and molecular weight have been suggested as key reasons. However, the results are sometimes contradictory and leave space for further analysis. Protein structures offer a structural dimension to explain promiscuity: Can a drug bind multiple targets because the drug is flexible or because the targets are structurally similar or even share similar binding sites? We present a systematic study of drug promiscuity based on structural data of PDB target proteins with a set of 164 promiscuous drugs. We show that there is no correlation between the degree of promiscuity and ligand properties such as hydrophobicity or molecular weight but a weak correlation to conformational flexibility. However, we do find a correlation between promiscuity and structural similarity as well as binding site similarity of protein targets. In particular, 71% of the drugs have at least two targets with similar binding sites. In order to overcome issues in detection of remotely similar binding sites, we employed a score for binding site similarity: LigandRMSD measures the similarity of the aligned ligands and uncovers remote local similarities in proteins. It can be applied to arbitrary structural binding site alignments. Three representative examples, namely the anti-cancer drug methotrexate, the natural product quercetin and the anti-diabetic drug acarbose are discussed in detail. Our findings suggest that global structural and binding site similarity play a more important role to explain the observed drug promiscuity in the PDB than physicochemical drug properties like hydrophobicity or molecular weight. 
Additionally, we find ligand flexibility to have a minor influence.
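As a rough illustration of the idea behind a ligand-based similarity score, the sketch below computes a plain root-mean-square deviation between two copies of a ligand. It is not the authors' LigandRMSD implementation. It assumes that the two ligand copies have already been brought into a common frame by a binding site alignment and that atoms are listed in the same order in both; the function name and coordinates are hypothetical.

```python
import math


def ligand_rmsd(coords_a, coords_b):
    """Plain RMSD between two equally ordered lists of (x, y, z) atoms.

    Assumption: both ligands are already superimposed via a binding
    site alignment, so no further fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    pose_1 = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]   # made-up coordinates
    pose_2 = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0)]   # same ligand, shifted 1 unit
    print(ligand_rmsd(pose_1, pose_2))
```

A small value means the ligand occupies nearly the same position in both aligned sites, which is the sense in which such a score can point to locally similar binding sites.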
Nucleotide Interdependency in Transcription Factor Binding Sites in the Drosophila Genome
Dresch, Jacqueline M.; Zellers, Rowan G.; Bork, Daniel K.; Drewell, Robert A.
2016-01-01
A long-standing objective in modern biology is to characterize the molecular components that drive the development of an organism. At the heart of eukaryotic development lies gene regulation. On the molecular level, much of the research in this field has focused on the binding of transcription factors (TFs) to regulatory regions in the genome known as cis-regulatory modules (CRMs). However, relatively little is known about the sequence-specific binding preferences of many TFs, especially with respect to the possible interdependencies between the nucleotides that make up binding sites. A particular limitation of many existing algorithms that aim to predict binding site sequences is that they do not allow for dependencies between nonadjacent nucleotides. In this study, we use a recently developed computational algorithm, MARZ, to compare binding site sequences using 32 distinct models in a systematic and unbiased approach to explore nucleotide dependencies within binding sites for 15 distinct TFs known to be critical to Drosophila development. Our results indicate that many of these proteins have varying levels of nucleotide interdependencies within their DNA recognition sequences, and that, in some cases, models that account for these dependencies greatly outperform traditional models that are used to predict binding sites. We also directly compare the ability of different models to identify the known KRUPPEL TF binding sites in CRMs and demonstrate that a more complex model that accounts for nucleotide interdependencies performs better when compared with simple models. This ability to identify TFs with critical nucleotide interdependencies in their binding sites will lead to a deeper understanding of how these molecular characteristics contribute to the architecture of CRMs and the precise regulation of transcription during organismal development. PMID:27330274
[3H]MK-801 binding sites in post-mortem human frontal cortex.
Kornhuber, J; Mack-Burkhardt, F; Kornhuber, M E; Riederer, P
1989-03-29
The binding of [3H]MK-801 ((+)-5-methyl-10,11-dihydro-5H-dibenzo[a,d]cyclohepten-5,10-imine maleate) was investigated in extensively washed homogenates of post-mortem human frontal cortex. The association of [3H]MK-801 proceeded slowly (t1/2 = 553 min) and reached equilibrium only after a prolonged incubation (greater than 24 h). The dissociation of [3H]MK-801 from the binding site was also slow (t1/2 = 244 min). Glutamate, glycine and magnesium markedly increased the rate of association (t1/2 = 14.8 min) and dissociation (t1/2 = 36.5 min). At equilibrium, the binding was not altered by these substances. Specific binding was linear with protein concentration, was saturable, reversible, stereoselective, heat-labile and was nearly absent in the white matter. Scatchard analysis of the saturation curves obtained at equilibrium indicated that there was a high-affinity (Kd1 1.39 +/- 0.21 nM, Bmax1 0.483 +/- 0.084 pmol/mg protein) and a low-affinity (Kd2 116.25 +/- 50.79 nM, Bmax2 3.251 +/- 0.991 pmol/mg protein) binding site. All competition curves obtained with (+)-MK-801, (-)-MK-801, phencyclidine and ketamine had Hill coefficients of less than unity and were best explained by a two-site model. Thus, our results demonstrate the presence of binding sites for MK-801 in post-mortem human brains and provide evidence for binding site heterogeneity. Furthermore, glutamate, glycine and magnesium accelerate the association and dissociation of [3H]MK-801 to and from its binding sites. The results add support to the hypothesis that MK-801, glutamate, glycine and magnesium all bind to different sites on the NMDA receptor-ion channel complex.
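The two-site interpretation of the saturation data can be made concrete with a short sketch of the standard two-site equilibrium binding equation, B = Bmax1·L/(Kd1 + L) + Bmax2·L/(Kd2 + L). The parameter values are the mean estimates quoted in the abstract; the function name and the chosen ligand concentrations are illustrative assumptions, and the sketch is not the authors' fitting procedure.

```python
def two_site_binding(ligand_nM, kd1, bmax1, kd2, bmax2):
    """Specific binding at equilibrium for two independent sites.

    ligand_nM, kd1 and kd2 are in nM; bmax1 and bmax2 (and the result)
    are in pmol/mg protein.
    """
    site1 = bmax1 * ligand_nM / (kd1 + ligand_nM)   # high-affinity site
    site2 = bmax2 * ligand_nM / (kd2 + ligand_nM)   # low-affinity site
    return site1 + site2


if __name__ == "__main__":
    # Mean estimates reported in the abstract above.
    KD1, BMAX1 = 1.39, 0.483
    KD2, BMAX2 = 116.25, 3.251
    for conc in (0.1, 1.0, 10.0, 100.0, 1000.0):   # illustrative concentrations, nM
        bound = two_site_binding(conc, KD1, BMAX1, KD2, BMAX2)
        print(f"{conc:8.1f} nM -> {bound:.3f} pmol/mg protein")
```

At low concentrations the high-affinity site dominates the curve, while the low-affinity site contributes most of the capacity near saturation, which is why a single-site fit describes such data poorly.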
Simon, S; Le Goff, A; Frobert, Y; Grassi, J; Massoulié, J
1999-09-24
We investigated the target sites of three inhibitory monoclonal antibodies on Electrophorus acetylcholinesterase (AChE). Previous studies showed that Elec-403 and Elec-410 are directed to overlapping but distinct epitopes in the peripheral site, at the entrance of the catalytic gorge, whereas Elec-408 binds to a different region. Using Electrophorus/rat AChE chimeras, we identified surface residues that differed between sensitive and insensitive AChEs: the replacement of a single Electrophorus residue by its rat homolog was able to abolish binding and inhibition, for each antibody. Reciprocally, binding and inhibition by Elec-403 and by Elec-410 could be conferred to rat AChE by the reverse mutation. Elec-410 appears to bind to one side of the active gorge, whereas Elec-403 covers its opening, explaining why the AChE-Elec-410 complex reacts faster than the AChE-Elec-403 or AChE-fasciculin complexes with two active site inhibitors, m-(N,N,N-trimethylammonio)trifluoroacetophenone and echothiophate. Elec-408 binds to the region of the putative "back door," distant from the peripheral site, and does not interfere with the access of inhibitors to the active site. The binding of an antibody to this novel regulatory site may inhibit the enzyme by blocking the back door or by inducing a conformational distortion within the active site.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yusim, Karina; Korber, Bette Tina Marie; Barouch, Dan
HIV Molecular Immunology is a companion volume to HIV Sequence Compendium. This publication, the 2014 edition, is the PDF version of the web-based HIV Immunology Database (http://www.hiv.lanl.gov/content/immunology/). The web interface for this relational database has many search options, as well as interactive tools to help immunologists design reagents and interpret their results. In the HIV Immunology Database, HIV-specific B-cell and T-cell responses are summarized and annotated. Immunological responses are divided into three parts, CTL, T helper, and antibody. Within these parts, defined epitopes are organized by protein and binding sites within each protein, moving from left to right through the coding regions spanning the HIV genome. We include human responses to natural HIV infections, as well as vaccine studies in a range of animal models and human trials. Responses that are not specifically defined, such as responses to whole proteins or monoclonal antibody responses to discontinuous epitopes, are summarized at the end of each protein section. Studies describing general HIV responses to the virus, but not to any specific protein, are included at the end of each part. The annotation includes information such as crossreactivity, escape mutations, antibody sequence, TCR usage, functional domains that overlap with an epitope, immune response associations with rates of progression and therapy, and how specific epitopes were experimentally defined. Basic information such as HLA specificities for T-cell epitopes, isotypes of monoclonal antibodies, and epitope sequences are included whenever possible. All studies that we can find that incorporate the use of a specific monoclonal antibody are included in the entry for that antibody. A single T-cell epitope can have multiple entries, generally one entry per study. Finally, maps of all defined linear epitopes relative to the HXB2 reference proteins are provided.
Wein, Thomas; Höfner, Georg; Rappenglück, Sebastian; Sichler, Sonja; Niessen, Karin V; Seeger, Thomas; Worek, Franz; Thiermann, Horst; Wanner, Klaus T
2018-09-01
Irreversible inhibition of acetylcholinesterase upon intoxication with organophosphorus compounds leads to an accumulation of acetylcholine in the synaptic cleft and a subsequent desensitization of nicotinic acetylcholine receptors (nAChR), which may ultimately result in respiratory failure. The bispyridinium compound MB327 has been found to restore functional activity of nAChR, thus representing a promising starting point for the development of new drugs for the treatment of organophosphate poisoning. In order to optimize the resensitizing effect of MB327 on nAChR, it would be very helpful to know the MB327-specific binding site so that structure-based molecular modeling can be applied. The binding site for MB327 at the nAChR is not known and has so far been a matter of speculation, but it has been shown that MB327 does not bind to the orthosteric acetylcholine binding site. We have used docking calculations to screen the surface of nAChR for possible binding sites of MB327. The results indicate that at least two potential binding sites for MB327 at nAChR are present inside the channel pore. In these binding sites, MB327 intercalates between the γ-α and β-δ subunits of nAChR, respectively. Both putative MB327 binding sites show an unsymmetrical distribution of surrounding hydrophilic and lipophilic amino acids. This suggests that substitution of MB327-related bispyridinium compounds on one of the two pyridinium rings with polar substituents should have a favorable effect on the pharmacological function. Copyright © 2017 Elsevier B.V. All rights reserved.
Kim, Dong Seon; Hahn, Yoonsoo
2012-11-13
Evolution of splice sites is a well-known phenomenon that results in transcript diversity during human evolution. Many novel splice sites are derived from repetitive elements and may not contribute to protein products. Here, we analyzed annotated human protein-coding exons and identified human-specific splice sites that arose after the human-chimpanzee divergence. We analyzed multiple alignments of the annotated human protein-coding exons and their respective orthologous mammalian genome sequences to identify 85 novel splice sites (50 splice acceptors and 35 donors) in the human genome. The novel protein-coding exons, which are expressed either constitutively or alternatively, produce novel protein isoforms by insertion, deletion, or frameshift. We found three cases in which the human-specific isoform conferred novel molecular function in the human cells: the human-specific IMUP protein isoform induces apoptosis of the trophoblast and is implicated in pre-eclampsia; the intronization of a part of SMOX gene exon produces inactive spermine oxidase; the human-specific NUB1 isoform shows reduced interaction with ubiquitin-like proteins, possibly affecting ubiquitin pathways. Although the generation of novel protein isoforms does not equate to adaptive evolution, we propose that these cases are useful candidates for a molecular functional study to identify proteomic changes that might bring about novel phenotypes during human evolution.
Comparison of the fibrin-binding activities in the N- and C-termini of fibronectin.
Rostagno, A A; Schwarzbauer, J E; Gold, L I
1999-03-01
Fibronectin (Fn) binds to fibrin in clots by covalent and non-covalent interactions. The N- and C-termini of Fn each contain one non-covalent fibrin-binding site, which are composed of type 1 (F1) structural repeats. We have previously localized the N-terminal site to the fourth and fifth F1 repeats (4F1.5F1). In the current studies, using proteolytic and recombinant proteins representing both the N- and C-terminal fibrin-binding regions, we localized and characterized the C-terminal fibrin-binding site, compared the relative fibrin-binding activities of both sites and determined the contribution of each site to the fibrin-binding activity of intact Fn. By fibrin-affinity chromatography, a protein composed of the 10F1 repeat through to the C-terminus of Fn (10F1-COOH), expressed in COS-1 cells, and 10F1-12F1, produced in Saccharomyces cerevisiae, displayed fibrin-binding activity. However, since 10F1 and 10F1.11F1 were not active, the presence of 12F1 is required for fibrin binding. A proteolytic fragment of 14.4 kDa, beginning 14 residues N-terminal to 10F1, was isolated from the fibrin-affinity matrix. Radio-iodinated 14.4 kDa fibrin-binding peptide/protein (FBP) demonstrated a dose-dependent and saturable binding to fibrin-coated wells that was both competitively inhibited and reversed by unlabelled 14.4 kDa FBP. Comparison of the fibrin-binding affinities of proteolytic FBPs from the N-terminus (25.9 kDa FBP), the C-terminus (14.4 kDa) and intact Fn by ELISA yielded estimated Kd values of 216, 18 and 2.1 nM, respectively. The higher fibrin-binding affinity of the N-terminus was substantiated by the ability of both a recombinant 4F1.5F1 and a monoclonal antibody (mAb) to this site to maximally inhibit biotinylated Fn binding to fibrin by 80%, and by blocking the 90% inhibitory activity of a polyclonal anti-Fn, by absorption with the 25.9 kDa FBP. 
We propose that whereas the N-terminal site appears to contribute to most of the binding activity of native Fn to fibrin, the specific binding of the C-terminal site may strengthen this interaction. PMID:10024513
Fire and fire surrogate study: annotated highlights from oak-dominated sites
Daniel A. Yaussy; Thomas A. Waldrop
2009-01-01
The National Fire and Fire Surrogate (FFS) study was implemented to investigate the ecological impacts of prescribed fire and mechanical operations to mimic fire in restoring the structure and function of forests typically maintained by frequent, low-intensity fires. Two of the 12 sites were located in oak-dominated forests, one in Ohio and another in North Carolina....
Surfing for history: an annotated bibliography of select websites/pages on the history of dentistry.
Matlak, Andrea
2007-01-01
The Internet includes many sites that provide secondary source information on the history of dentistry. These sites are maintained by diverse groups such as dental libraries, dental museums, commercial enterprises, dentists, dental offices, dental organizations and nondental related organizations and individuals. The information provided in this paper is as eclectic and diverse as the sources suggest.
Johnson, Britney; McConnell, Patrick; Kozlov, Alex G; Mekel, Marlene; Lohman, Timothy M; Gross, Michael L; Amarasinghe, Gaya K; Cooper, John A
2018-05-29
Actin assembly is important for cell motility. The ability of actin subunits to join or leave filaments via the barbed end is critical to actin dynamics. Capping protein (CP) binds to barbed ends to prevent subunit gain and loss and is regulated by proteins that include V-1 and CARMIL. V-1 inhibits CP by sterically blocking one binding site for actin. CARMILs bind at a distal site and decrease the affinity of CP for actin, suggested to be caused by conformational changes. We used hydrogen-deuterium exchange with mass spectrometry (HDX-MS) to probe changes in structural dynamics induced by V-1 and CARMIL binding to CP. V-1 and CARMIL induce changes in both proteins' binding sites on the surface of CP, along with a set of internal residues. Both also affect the conformation of CP's β subunit "tentacle," a second distal actin-binding site. Concerted regulation of actin assembly by CP occurs through allosteric couplings between CP modulator and actin binding sites. Copyright © 2018 The Author(s). Published by Elsevier Inc. All rights reserved.
Basken, Nathan E.; Mathias, Carla J.; Green, Mark A.
2008-01-01
The Cu-PTSM (pyruvaldehyde bis(N4-methylthiosemicarbazonato)copper(II)) and Cu-ATSM (diacetyl bis(N4-methylthiosemicarbazonato)copper(II)) radiopharmaceuticals exhibit strong, species-dependent binding to human serum albumin (HSA), while Cu-ETS (ethylglyoxal bis(thiosemicarbazonato)copper(II)) appears to only exhibit non-specific binding to human and animal serum albumins. This study examines the structural basis for HSA binding of Cu-PTSM and Cu-ATSM via competition with drugs having known albumin binding sites. Warfarin, furosemide, ibuprofen, phenylbutazone, benzylpenicillin, and cephmandole were added to HSA solutions at drug:HSA mole ratios from 0 to 8:1, followed by quantification of radiopharmaceutical binding to HSA by ultrafiltration. Warfarin, a site IIA drug, progressively displaced both [64Cu]Cu-PTSM and [64Cu]Cu-ATSM from HSA. At 8:1 warfarin:HSA mole ratios, free [64Cu]Cu-PTSM and [64Cu]Cu-ATSM levels increased 300–500%. This was in contrast to solutions containing ibuprofen, a site IIIA drug; no increase in free [64Cu]Cu-PTSM or [64Cu]Cu-ATSM was observed except at high ibuprofen:HSA ratios, where secondary ibuprofen binding to the IIA site may cause modest radiopharmaceutical displacement. By contrast, and consistent with earlier findings suggesting Cu-ETS exhibits only non-specific associations, [64Cu]Cu-ETS binding to HSA was unaffected by the addition of drugs that bind in either site. We conclude that the species-dependence of Cu-PTSM and Cu-ATSM albumin binding arises from interaction(s) with the IIA site of HSA. PMID:18937368
Wang, Ruijia; Nambiar, Ram; Zheng, Dinghai
2018-01-01
PolyA_DB is a database cataloging cleavage and polyadenylation sites (PASs) in several genomes. Previous versions were based mainly on expressed sequence tags (ESTs), which had a limited amount and could lead to inaccurate PAS identification due to the presence of internal A-rich sequences in transcripts. Here, we present an updated version of the database based solely on deep sequencing data. First, PASs are mapped by the 3′ region extraction and deep sequencing (3′READS) method, ensuring unequivocal PAS identification. Second, a large volume of data based on diverse biological samples increases PAS coverage by 3.5-fold over the EST-based version and provides PAS usage information. Third, strand-specific RNA-seq data are used to extend annotated 3′ ends of genes to obtain more thorough annotations of alternative polyadenylation (APA) sites. Fourth, conservation information of PAS across mammals sheds light on significance of APA sites. The database (URL: http://www.polya-db.org/v3) currently holds PASs in human, mouse, rat and chicken, and has links to the UCSC genome browser for further visualization and for integration with other genomic data. PMID:29069441
Binding and Translocation of Termination Factor Rho Studied at the Single-Molecule Level
Koslover, Daniel J.; Fazal, Furqan M.; Mooney, Rachel A.; Landick, Robert; Block, Steven M.
2012-01-01
Rho termination factor is an essential hexameric helicase responsible for terminating 20–50% of all mRNA synthesis in E. coli. We used single-molecule force spectroscopy to investigate Rho-RNA binding interactions at the Rho-utilization (rut) site of the λ tR1 terminator. Our results are consistent with Rho complexes adopting two states, one that binds 57 ± 2 nucleotides of RNA across all six of the Rho primary binding sites, and another that binds 85 ± 2 nucleotides at the six primary sites plus a single secondary site situated at the center of the hexamer. The single-molecule data serve to establish that Rho translocates 5′-to-3′ towards RNA polymerase (RNAP) by a tethered-tracking mechanism, looping out the intervening RNA between the rut site and RNAP. These findings lead to a general model for Rho binding and translocation, and establish a novel experimental approach that should facilitate additional single-molecule studies of RNA-binding proteins. PMID:22885804
Field, Jessica J; Pera, Benet; Gallego, Juan Estévez; Calvo, Enrique; Rodríguez-Salarichs, Javier; Sáez-Calvo, Gonzalo; Zuwerra, Didier; Jordi, Michel; Andreu, José M; Prota, Andrea E; Ménchon, Grégory; Miller, John H; Altmann, Karl-Heinz; Díaz, J Fernando
2018-03-23
The marine natural product zampanolide and analogues thereof constitute a new chemotype of taxoid site microtubule-stabilizing agents with a covalent mechanism of action. Zampanolide-ligated tubulin has the switch-activation loop (M-loop) in the assembly prone form and, thus, represents an assembly activated state of the protein. In this study, we have characterized the biochemical properties of the covalently modified, activated tubulin dimer, and we have determined the effect of zampanolide on tubulin association and the binding of tubulin ligands at other binding sites. Tubulin activation by zampanolide does not affect its longitudinal oligomerization but does alter its lateral association properties. The covalent binding of zampanolide to β-tubulin affects both the colchicine site, causing a change of the quantum yield of the bound ligand, and the exchangeable nucleotide binding site, reducing the affinity for the nucleotide. While these global effects do not change the binding affinity of 2-methoxy-5-(2,3,4-trimethoxyphenyl)-2,4,6-cycloheptatrien-1-one (MTC) (a reversible binder of the colchicine site), the binding affinity of a fluorescent analogue of GTP (Mant-GTP) at the nucleotide E-site is reduced from (12 ± 2) × 10⁵ M⁻¹ in the case of unmodified tubulin to (1.4 ± 0.3) × 10⁵ M⁻¹ in the case of the zampanolide tubulin adduct, indicating signal transmission between the taxane site and the colchicine and nucleotide sites of β-tubulin.
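As a reading aid for the Mant-GTP affinities quoted above, the following minimal sketch converts the two reported association constants into dissociation constants and a fold reduction. It assumes simple 1:1 binding, so that Kd = 1/Ka, and it ignores the reported uncertainties; the variable and function names are illustrative and do not come from the study.

```python
# Association constants (M^-1) for Mant-GTP at the E-site, central values
# as quoted in the abstract; the +/- uncertainties are deliberately ignored.
KA_UNMODIFIED = 12e5   # unmodified tubulin
KA_ADDUCT = 1.4e5      # zampanolide-tubulin adduct


def dissociation_constant_um(ka_per_molar: float) -> float:
    """Convert an association constant (M^-1) to a dissociation constant (uM).

    Assumes simple 1:1 binding, where Kd is the reciprocal of Ka.
    """
    return (1.0 / ka_per_molar) * 1e6


def fold_reduction(ka_before: float, ka_after: float) -> float:
    """Return how many times weaker the binding is after modification."""
    return ka_before / ka_after


kd_unmodified = dissociation_constant_um(KA_UNMODIFIED)
kd_adduct = dissociation_constant_um(KA_ADDUCT)
fold = fold_reduction(KA_UNMODIFIED, KA_ADDUCT)

print(f"Kd, unmodified tubulin: {kd_unmodified:.2f} uM")
print(f"Kd, zampanolide adduct: {kd_adduct:.2f} uM")
print(f"Affinity reduction:     {fold:.1f}-fold")
```

Under these assumptions the central values correspond to dissociation constants of roughly 0.8 µM and 7 µM, an affinity reduction of somewhat under an order of magnitude.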
Binding Pathway of Opiates to μ-Opioid Receptors Revealed by Machine Learning
NASA Astrophysics Data System (ADS)
Barati Farimani, Amir; Feinberg, Evan; Pande, Vijay
2018-02-01
Many important analgesics relieve pain by binding to the μ-Opioid Receptor (μOR), which makes the μOR among the most clinically relevant proteins of the G Protein Coupled Receptor (GPCR) family. Despite previous studies on the activation pathways of the GPCRs, the mechanism of opiate binding and the selectivity of μOR are largely unknown. We performed extensive molecular dynamics (MD) simulation and analysis to find the selective allosteric binding sites of the μOR and the path opiates take to bind to the orthosteric site. In this study, we predicted that the allosteric site is responsible for the attraction and selection of opiates. Using Markov state models and machine learning, we traced the pathway of opiates in binding to the orthosteric site, the main binding pocket. Our results have important implications in designing novel analgesics.