The Biomolecular Interaction Network Database and related tools 2005 update
Alfarano, C.; Andrade, C. E.; Anthony, K.; Bahroos, N.; Bajec, M.; Bantoft, K.; Betel, D.; Bobechko, B.; Boutilier, K.; Burgess, E.; Buzadzija, K.; Cavero, R.; D'Abreo, C.; Donaldson, I.; Dorairajoo, D.; Dumontier, M. J.; Dumontier, M. R.; Earles, V.; Farrall, R.; Feldman, H.; Garderman, E.; Gong, Y.; Gonzaga, R.; Grytsan, V.; Gryz, E.; Gu, V.; Haldorsen, E.; Halupa, A.; Haw, R.; Hrvojic, A.; Hurrell, L.; Isserlin, R.; Jack, F.; Juma, F.; Khan, A.; Kon, T.; Konopinsky, S.; Le, V.; Lee, E.; Ling, S.; Magidin, M.; Moniakis, J.; Montojo, J.; Moore, S.; Muskat, B.; Ng, I.; Paraiso, J. P.; Parker, B.; Pintilie, G.; Pirone, R.; Salama, J. J.; Sgro, S.; Shan, T.; Shu, Y.; Siew, J.; Skinner, D.; Snyder, K.; Stasiuk, R.; Strumpf, D.; Tuekam, B.; Tao, S.; Wang, Z.; White, M.; Willis, R.; Wolting, C.; Wong, S.; Wrong, A.; Xin, C.; Yao, R.; Yates, B.; Zhang, S.; Zheng, K.; Pawson, T.; Ouellette, B. F. F.; Hogue, C. W. V.
2005-01-01
The Biomolecular Interaction Network Database (BIND) (http://bind.ca) archives biomolecular interaction, reaction, complex and pathway information. Our aim is to curate the details about molecular interactions that arise from published experimental research and to provide this information, as well as tools to enable data analysis, freely to researchers worldwide. BIND data are curated into a comprehensive machine-readable archive of computable information and provide users with methods to discover interactions and molecular mechanisms. BIND has worked to develop new methods for visualization that amplify the underlying annotation of genes and proteins to facilitate the study of molecular interaction networks. BIND has maintained an open database policy since its inception in 1999. Data growth has proceeded at a tremendous rate, approaching over 100 000 records. New services provided include a new BIND Query and Submission interface, a Simple Object Access Protocol (SOAP) service and the Small Molecule Interaction Database (http://smid.blueprint.org) that allows users to determine probable small molecule binding sites of new sequences and examine conserved binding residues. PMID:15608229
Lepoivre, Cyrille; Bergon, Aurélie; Lopez, Fabrice; Perumal, Narayanan B; Nguyen, Catherine; Imbert, Jean; Puthier, Denis
2012-01-31
Deciphering gene regulatory networks by in silico approaches is a crucial step in the study of the molecular perturbations that occur in diseases. The development of regulatory maps is a tedious process requiring the comprehensive integration of various lines of evidence scattered over biological databases. Thus, the research community would greatly benefit from having a unified database storing known and predicted molecular interactions. Furthermore, given the intrinsic complexity of the data, the development of new tools offering integrated and meaningful visualizations of molecular interactions is necessary to help users draw new hypotheses without being overwhelmed by the density of the resulting graph. We extend the previously developed TranscriptomeBrowser database with a set of tables containing 1,594,978 human and mouse molecular interactions. The database includes: (i) predicted regulatory interactions (computed by scanning vertebrate alignments with a set of 1,213 position weight matrices), (ii) potential regulatory interactions inferred from systematic analysis of ChIP-seq experiments, (iii) regulatory interactions curated from the literature, (iv) predicted post-transcriptional regulation by micro-RNA, (v) protein kinase-substrate interactions and (vi) physical protein-protein interactions. In order to easily retrieve and efficiently analyze these interactions, we developed InteractomeBrowser, a graph-based knowledge browser that comes as a plug-in for TranscriptomeBrowser. The first objective of InteractomeBrowser is to provide a user-friendly tool to get new insight into any gene list by providing a context-specific display of putative regulatory and physical interactions. To achieve this, InteractomeBrowser relies on a "cell compartments-based layout" that makes use of a subset of the Gene Ontology to map gene products onto relevant cell compartments. 
This layout is particularly powerful for visual integration of heterogeneous biological information and is a productive avenue in generating new hypotheses. The second objective of InteractomeBrowser is to fill the gap between interaction databases and dynamic modeling. It is thus compatible with the network analysis software Cytoscape and with the Gene Interaction Network simulation software (GINsim). We provide examples highlighting the benefits of this visualization tool for large gene set analysis related to thymocyte differentiation. The InteractomeBrowser plugin is a powerful tool to get quick access to a knowledge database that includes both predicted and validated molecular interactions. InteractomeBrowser is available through the TranscriptomeBrowser framework and can be found at: http://tagc.univ-mrs.fr/tbrowser/. Our database is updated on a regular basis.
Hermjakob, Henning; Montecchi-Palazzi, Luisa; Bader, Gary; Wojcik, Jérôme; Salwinski, Lukasz; Ceol, Arnaud; Moore, Susan; Orchard, Sandra; Sarkans, Ugis; von Mering, Christian; Roechert, Bernd; Poux, Sylvain; Jung, Eva; Mersch, Henning; Kersey, Paul; Lappe, Michael; Li, Yixue; Zeng, Rong; Rana, Debashis; Nikolski, Macha; Husi, Holger; Brun, Christine; Shanker, K; Grant, Seth G N; Sander, Chris; Bork, Peer; Zhu, Weimin; Pandey, Akhilesh; Brazma, Alvis; Jacq, Bernard; Vidal, Marc; Sherman, David; Legrain, Pierre; Cesareni, Gianni; Xenarios, Ioannis; Eisenberg, David; Steipe, Boris; Hogue, Chris; Apweiler, Rolf
2004-02-01
A major goal of proteomics is the complete description of the protein interaction network underlying cell physiology. A large number of small scale and, more recently, large-scale experiments have contributed to expanding our understanding of the nature of the interaction network. However, the necessary data integration across experiments is currently hampered by the fragmentation of publicly available protein interaction data, which exists in different formats in databases, on authors' websites or sometimes only in print publications. Here, we propose a community standard data model for the representation and exchange of protein interaction data. This data model has been jointly developed by members of the Proteomics Standards Initiative (PSI), a work group of the Human Proteome Organization (HUPO), and is supported by major protein interaction data providers, in particular the Biomolecular Interaction Network Database (BIND), Cellzome (Heidelberg, Germany), the Database of Interacting Proteins (DIP), Dana Farber Cancer Institute (Boston, MA, USA), the Human Protein Reference Database (HPRD), Hybrigenics (Paris, France), the European Bioinformatics Institute's (EMBL-EBI, Hinxton, UK) IntAct, the Molecular Interactions (MINT, Rome, Italy) database, the Protein-Protein Interaction Database (PPID, Edinburgh, UK) and the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING, EMBL, Heidelberg, Germany).
DockScreen: A database of in silico biomolecular interactions to support computational toxicology
We have developed DockScreen, a database of in silico biomolecular interactions designed to enable rational molecular toxicological insight within a computational toxicology framework. This database is composed of chemical/target (receptor and enzyme) binding scores calculated by...
Cataloging the biomedical world of pain through semi-automated curation of molecular interactions
Jamieson, Daniel G.; Roberts, Phoebe M.; Robertson, David L.; Sidders, Ben; Nenadic, Goran
2013-01-01
The vast collection of biomedical literature and its continued expansion has presented a number of challenges to researchers who require structured findings to stay abreast of and analyze molecular mechanisms relevant to their domain of interest. By structuring literature content into topic-specific machine-readable databases, the aggregate data from multiple articles can be used to infer trends that can be compared and contrasted with similar findings from topic-independent resources. Our study presents a generalized procedure for semi-automatically creating a custom topic-specific molecular interaction database through the use of text mining to assist manual curation. We apply the procedure to capture molecular events that underlie ‘pain’, a complex phenomenon with a large societal burden and unmet medical need. We describe how existing text mining solutions are used to build a pain-specific corpus, extract molecular events from it, add context to the extracted events and assess their relevance. The pain-specific corpus contains 765 692 documents from Medline and PubMed Central, from which we extracted 356 499 unique normalized molecular events, with 261 438 single protein events and 93 271 molecular interactions supplied by BioContext. Event chains are annotated with negation, speculation, anatomy, Gene Ontology terms, mutations, pain and disease relevance, which collectively provide detailed insight into how that event chain is associated with pain. The extracted relations are visualized in a wiki platform (wiki-pain.org) that enables efficient manual curation and exploration of the molecular mechanisms that underlie pain. Curation of 1500 grouped event chains ranked by pain relevance revealed 613 accurately extracted unique molecular interactions that in the future can be used to study the underlying mechanisms involved in pain. 
Our approach demonstrates that combining existing text mining tools with domain-specific terms and wiki-based visualization can facilitate rapid curation of molecular interactions to create a custom database. Database URL: ••• PMID:23707966
HPIDB 2.0: a curated database for host–pathogen interactions
Ammari, Mais G.; Gresham, Cathy R.; McCarthy, Fiona M.; Nanduri, Bindu
2016-01-01
Identification and analysis of host–pathogen interactions (HPI) is essential to study infectious diseases. However, HPI data are sparse in existing molecular interaction databases, especially for agricultural host–pathogen systems. Therefore, resources that annotate, predict and display the HPI that underpin infectious diseases are critical for developing novel intervention strategies. HPIDB 2.0 (http://www.agbase.msstate.edu/hpi/main.html) is a resource for HPI data, and contains 45,238 manually curated entries in the current release. Since the first description of the database in 2010, multiple enhancements to HPIDB data and interface services were made that are described here. Notably, HPIDB 2.0 now provides targeted biocuration of molecular interaction data. As a member of the International Molecular Exchange consortium, annotations provided by HPIDB 2.0 curators meet community standards to provide detailed contextual experimental information and facilitate data sharing. Moreover, HPIDB 2.0 provides access to rapidly available community annotations that capture minimum molecular interaction information to address immediate researcher needs for HPI network analysis. In addition to curation, HPIDB 2.0 integrates HPI from existing external sources and contains tools to infer additional HPI where annotated data are scarce. Compared to other interaction databases, our data collection approach ensures HPIDB 2.0 users access the most comprehensive HPI data from a wide range of pathogens and their hosts (594 pathogen and 70 host species, as of February 2016). Improvements also include enhanced search capacity, addition of Gene Ontology functional information, and implementation of network visualization. The changes made to HPIDB 2.0 content and interface ensure that users, especially agricultural researchers, are able to easily access and analyse high quality, comprehensive HPI data. 
All HPIDB 2.0 data are updated regularly, are publicly available for direct download, and are disseminated to other molecular interaction resources. Database URL: http://www.agbase.msstate.edu/hpi/main.html PMID:27374121
Shoshi, Alban; Hoppe, Tobias; Kormeier, Benjamin; Ogultarhan, Venus; Hofestädt, Ralf
2015-02-28
Adverse drug reactions are one of the most common causes of death in industrialized Western countries. Empirical data from clinical studies for the approval and monitoring of drugs are now available, as are molecular databases. The integration of database information is a promising method for providing well-founded knowledge to avoid adverse drug reactions. This paper presents our web-based decision support system GraphSAW, which analyzes and evaluates drug interactions and side effects based on data from two commercial and two freely available molecular databases. The system is able to analyze single and combined drug-drug interactions, drug-molecule interactions as well as single and cumulative side effects. In addition, it allows exploring associative networks of drugs, molecules, metabolic pathways, and diseases in an intuitive way. The molecular medication analysis combines the features described above. A statistical evaluation of the integrated data and of the top 20 drugs with respect to drug interactions and side effects is performed. The results of the data analysis give an overview of all theoretically possible drug interactions and side effects. The evaluation shows a mismatch between pharmaceutical and molecular databases: concordance was about 12% for drug interactions and about 9% for drug side effects. An application case with prescription data of 11 patients is presented in order to demonstrate the functionality of the system under real conditions. For each patient, at least two interactions occurred in every medication, and about 8% of total diseases were possibly induced by drug therapy. GraphSAW (http://tunicata.techfak.uni-bielefeld.de/graphsaw/) is meant to be a web-based system for health professionals and researchers. GraphSAW provides comprehensive drug-related knowledge and an improved medication analysis which may support efforts to reduce the risk of medication errors and numerous drastic side effects.
MIMO: an efficient tool for molecular interaction maps overlap
2013-01-01
Background: Molecular pathways represent an ensemble of interactions occurring among molecules within the cell and between cells. The identification of similarities between molecular pathways across organisms and functions has a critical role in understanding complex biological processes. To infer such novel information, the comparison of molecular pathways must account for imperfect matches (flexibility) and efficiently handle complex network topologies. To date, these characteristics are only partially available in tools designed to compare molecular interaction maps. Results: Our approach, MIMO (Molecular Interaction Maps Overlap), addresses the first problem by allowing the introduction of gaps and mismatches between query and template pathways and permits, when necessary, supervised queries incorporating a priori biological information. It addresses the second issue by relying directly on the rich graph topology described in the Systems Biology Markup Language (SBML) standard, and uses multidigraphs to efficiently handle multiple queries on biological graph databases. The algorithm has been successfully used here to highlight the contact points between various human pathways in the Reactome database. Conclusions: MIMO offers a flexible and efficient graph-matching tool for comparing complex biological pathways. PMID:23672344
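The idea of tolerating gaps between a query and a template pathway can be illustrated with a small sketch. This is not the MIMO algorithm: it ignores mismatches, SBML and multidigraphs, and only checks whether an ordered query of node names can be laid onto a directed template graph when up to `max_gap` intermediate template nodes may be skipped between consecutive query nodes. The function names, the adjacency-dict representation and the toy pathway are all assumptions made for illustration.

```python
from collections import deque

def reachable_within(graph, start, max_steps):
    """Return the nodes reachable from `start` using 1..max_steps directed edges."""
    depth = {start: 0}
    queue = deque([start])
    found = set()
    while queue:
        node = queue.popleft()
        if depth[node] == max_steps:
            continue  # do not expand beyond the allowed number of edges
        for nxt in graph.get(node, ()):
            found.add(nxt)
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return found

def matches_with_gaps(graph, query, max_gap):
    """True if each consecutive query pair is linked by a template path
    that skips at most `max_gap` intermediate nodes (max_gap=0: direct edge)."""
    return all(
        b in reachable_within(graph, a, max_gap + 1)
        for a, b in zip(query, query[1:])
    )

# Hypothetical template pathway: A -> B -> C -> D
template = {"A": ["B"], "B": ["C"], "C": ["D"], "D": []}
query = ["A", "C", "D"]

strict = matches_with_gaps(template, query, max_gap=0)   # A -> C is not a direct edge
relaxed = matches_with_gaps(template, query, max_gap=1)  # B may be skipped
```

Allowing one gap lets the query skip node B, which is the kind of flexibility the abstract describes; a tool working on real pathway maps would additionally need to handle node mismatches and parallel edges.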
AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide
2015-11-19
Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. 
This database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
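As a rough illustration of how a position weight matrix summarizes binding preferences, the sketch below builds a log-odds matrix from a set of aligned binding sites and scores candidate sequences against it. The example sites, the pseudocount, the uniform background frequency and the function names are assumptions for illustration; they are not taken from the database described above.

```python
import math
from collections import Counter

BASES = "ACGT"

def position_weight_matrix(sites, pseudocount=1.0, background=0.25):
    """Build one {base: log2-odds score} dict per column of the aligned sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all binding sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for col in range(length):
        counts = Counter(site[col] for site in sites)
        pwm.append({
            base: math.log2(((counts[base] + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm

def score(pwm, sequence):
    """Sum the per-position log-odds scores of a sequence of matching length."""
    return sum(column[base] for column, base in zip(pwm, sequence))

# Hypothetical aligned binding sites (not real experimental data)
sites = ["TTGACA", "TTGACT", "TTGATA", "TAGACA"]
pwm = position_weight_matrix(sites)
consensus_score = score(pwm, "TTGACA")
unrelated_score = score(pwm, "CCCCCC")
```

Sequences close to the consensus of the input sites receive higher scores than unrelated ones, which is what makes such a matrix a compact quantitative summary of binding data.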
Shah, Eric D; Fisch, Brandon M A; Arceci, Robert J; Buckley, Jonathan D; Reaman, Gregory H; Sorensen, Poul H; Triche, Timothy J; Reynolds, C Patrick
2014-05-01
Academic laboratories are generating increasingly large amounts of data that describe the genomic landscape and gene expression patterns of various types of cancers. Such data can potentially identify novel oncology molecular targets in cancer types that may not be the primary focus of a drug sponsor's initial research for an investigational new drug. Obtaining preclinical data that point toward the potential for a given molecularly targeted agent, or a novel combination of agents, requires knowledge of drugs currently in development in both the academic and commercial sectors. We have developed the DrugPath database (http://www.drugpath.org) as a comprehensive, free-of-charge resource for academic investigators to identify agents being developed in academia or industry that may act against molecular targets of interest. DrugPath data on molecular targets overlay the Michigan Molecular Interactions (http://mimi.ncibi.org) gene-gene interaction map to facilitate identification of related agents in the same pathway. The database catalogs 2,081 drug development programs representing 751 drug sponsors and 722 molecular and genetic targets. DrugPath should assist investigators in identifying and obtaining drugs acting on specific molecular targets for biological and preclinical therapeutic studies.
Cytoscape: a software environment for integrated models of biomolecular interaction networks.
Shannon, Paul; Markiel, Andrew; Ozier, Owen; Baliga, Nitin S; Wang, Jonathan T; Ramage, Daniel; Amin, Nada; Schwikowski, Benno; Ideker, Trey
2003-11-01
Cytoscape is an open source software project for integrating biomolecular interaction networks with high-throughput expression data and other molecular states into a unified conceptual framework. Although applicable to any system of molecular components and interactions, Cytoscape is most powerful when used in conjunction with large databases of protein-protein, protein-DNA, and genetic interactions that are increasingly available for humans and model organisms. Cytoscape's software Core provides basic functionality to lay out and query the network; to visually integrate the network with expression profiles, phenotypes, and other molecular states; and to link the network to databases of functional annotations. The Core is extensible through a straightforward plug-in architecture, allowing rapid development of additional computational analyses and features. Several case studies of Cytoscape plug-ins are surveyed, including a search for interaction pathways correlating with changes in gene expression, a study of protein complexes involved in cellular recovery from DNA damage, inference of a combined physical/functional interaction network for Halobacterium, and an interface to detailed stochastic/kinetic gene regulatory models.
The IntAct molecular interaction database in 2012
Kerrien, Samuel; Aranda, Bruno; Breuza, Lionel; Bridge, Alan; Broackes-Carter, Fiona; Chen, Carol; Duesbury, Margaret; Dumousseau, Marine; Feuermann, Marc; Hinz, Ursula; Jandrasits, Christine; Jimenez, Rafael C.; Khadake, Jyoti; Mahadevan, Usha; Masson, Patrick; Pedruzzi, Ivo; Pfeiffenberger, Eric; Porras, Pablo; Raghunath, Arathi; Roechert, Bernd; Orchard, Sandra; Hermjakob, Henning
2012-01-01
IntAct is an open-source, open data molecular interaction database populated by data either curated from the literature or from direct data depositions. Two levels of curation are now available within the database, with both IMEx-level annotation and less detailed MIMIx-compatible entries currently supported. As of September 2011, IntAct contains approximately 275 000 curated binary interaction evidences from over 5000 publications. The IntAct website has been improved to enhance the search process and in particular the graphical display of the results. New data download formats are also available, which will facilitate the inclusion of IntAct's data in the Semantic Web. IntAct is an active contributor to the IMEx consortium (http://www.imexconsortium.org). IntAct source code and data are freely available at http://www.ebi.ac.uk/intact. PMID:22121220
Databases and coordinated research projects at the IAEA on atomic processes in plasmas
NASA Astrophysics Data System (ADS)
Braams, Bastiaan J.; Chung, Hyun-Kyung
2012-05-01
The Atomic and Molecular Data Unit at the IAEA works with a network of national data centres to encourage and coordinate production and dissemination of fundamental data for atomic, molecular and plasma-material interaction (A+M/PMI) processes that are relevant to the realization of fusion energy. The Unit maintains numerical and bibliographical databases and has started a Wiki-style knowledge base. The Unit also contributes to A+M database interface standards and provides a search engine that offers a common interface to multiple numerical A+M/PMI databases. Coordinated Research Projects (CRPs) bring together fusion energy researchers and atomic, molecular and surface physicists for joint work towards the development of new data and new methods. The databases and current CRPs on A+M/PMI processes are briefly described here.
An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system
DOE Office of Scientific and Technical Information (OSTI.GOV)
AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide
Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. 
Lastly, this database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
CREDO: a structural interactomics database for drug discovery
Schreyer, Adrian M.; Blundell, Tom L.
2013-01-01
CREDO is a unique relational database storing all pairwise atomic interactions of inter- as well as intra-molecular contacts between small molecules and macromolecules found in experimentally determined structures from the Protein Data Bank. These interactions are integrated with further chemical and biological data. The database implements useful data structures and algorithms such as cheminformatics routines to create a comprehensive analysis platform for drug discovery. The database can be accessed through a web-based interface, downloads of data sets and web services at http://www-cryst.bioc.cam.ac.uk/credo. Database URL: http://www-cryst.bioc.cam.ac.uk/credo PMID:23868908
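The notion of storing pairwise atomic contacts can be sketched as follows. This is only an illustration of a distance-cutoff contact search between two small sets of atoms; the 4.0 Å cutoff, the tuple layout, the atom names and the coordinates are assumptions, and the sketch does not reproduce CREDO's actual schema or contact definitions.

```python
import math
from itertools import product

def pairwise_contacts(ligand_atoms, protein_atoms, cutoff=4.0):
    """List (ligand atom, protein atom, distance) for every pair within `cutoff`.

    Each atom is given as (name, (x, y, z)); the cutoff is an assumed value.
    """
    contacts = []
    for (lig_name, lig_xyz), (prot_name, prot_xyz) in product(ligand_atoms, protein_atoms):
        distance = math.dist(lig_xyz, prot_xyz)
        if distance <= cutoff:
            contacts.append((lig_name, prot_name, round(distance, 2)))
    return contacts

# Hypothetical coordinates in angstroms, not taken from any real structure
ligand = [("O1", (0.0, 0.0, 0.0)), ("C2", (-1.4, 0.0, 0.0))]
protein = [("N", (3.0, 0.0, 0.0)), ("CA", (10.0, 0.0, 0.0))]

contacts = pairwise_contacts(ligand, protein)
```

Each resulting tuple corresponds to one row that a relational table of atom pairs could hold; a production system would use a spatial index rather than comparing all pairs.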
HBVPathDB: a database of HBV infection-related molecular interaction network.
Zhang, Yi; Bo, Xiao-Chen; Yang, Jing; Wang, Sheng-Qi
2005-03-21
HBVPathDB aims to describe the interactions between molecules and genes of hepatitis B virus (HBV) and its host, in order to understand how viral and host genes and molecules are networked into a biological system and to elucidate the mechanism of HBV infection. Knowledge of HBV infection-related reactions was organized into various kinds of pathways with carefully drawn graphs in HBVPathDB. Pathway information is stored in a relational database management system (DBMS), which is currently the most efficient way to manage large amounts of data, and queries are implemented in Structured Query Language (SQL). The search engine is written in PHP (Personal Home Page) with embedded SQL, and a web retrieval interface for searching is developed in Hypertext Markup Language (HTML). We present the first version of HBVPathDB, an HBV infection-related molecular interaction network database composed of 306 pathways involving 1 050 molecules. With carefully drawn graphs, pathway information stored in HBVPathDB can be browsed in an intuitive way. We developed an easy-to-use interface for flexible access to the details of the database, and convenient software to query and browse the pathway information. Four search page layout options (category search, gene search, description search and unitized search) are supported by the search engine of the database. The database is freely available at http://www.bio-inf.net/HBVPathDB/HBV/. HBVPathDB already contains a considerable amount of HBV infection-related pathway information, which is suitable for in-depth analysis of the molecular interaction network of virus and host. HBVPathDB integrates pathway data sets with convenient software for query, browsing and visualization, giving users more opportunity to identify key regulatory molecules as potential drug targets and to explore the possible mechanism of HBV infection based on gene expression datasets.
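A relational layout of pathways and molecules, queried with SQL, might look like the sketch below. It uses Python's built-in sqlite3 module in place of the PHP and DBMS stack named in the abstract, and the table names, column names and sample rows are hypothetical; the actual HBVPathDB schema is not described in the source.

```python
import sqlite3

# In-memory database with an assumed three-table schema
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pathway (id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE molecule (id INTEGER PRIMARY KEY, symbol TEXT);
CREATE TABLE pathway_molecule (
    pathway_id INTEGER REFERENCES pathway(id),
    molecule_id INTEGER REFERENCES molecule(id)
);
""")

# Placeholder rows, not real HBVPathDB content
conn.executemany("INSERT INTO pathway VALUES (?, ?, ?)",
                 [(1, "Example pathway A", "signaling"),
                  (2, "Example pathway B", "replication")])
conn.executemany("INSERT INTO molecule VALUES (?, ?)",
                 [(1, "GENE1"), (2, "GENE2")])
conn.executemany("INSERT INTO pathway_molecule VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 2)])

def gene_search(connection, symbol):
    """Return the names of all pathways that involve the given gene symbol."""
    rows = connection.execute(
        """SELECT p.name
           FROM pathway p
           JOIN pathway_molecule pm ON pm.pathway_id = p.id
           JOIN molecule m ON m.id = pm.molecule_id
           WHERE m.symbol = ?
           ORDER BY p.name""",
        (symbol,),
    ).fetchall()
    return [name for (name,) in rows]

pathways_for_gene2 = gene_search(conn, "GENE2")
```

The join table keeps the many-to-many relation between pathways and molecules explicit, so a category search or description search would only need a different WHERE clause on the same tables.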
Databases and coordinated research projects at the IAEA on atomic processes in plasmas
DOE Office of Scientific and Technical Information (OSTI.GOV)
Braams, Bastiaan J.; Chung, Hyun-Kyung
2012-05-25
The Atomic and Molecular Data Unit at the IAEA works with a network of national data centres to encourage and coordinate production and dissemination of fundamental data for atomic, molecular and plasma-material interaction (A+M/PMI) processes that are relevant to the realization of fusion energy. The Unit maintains numerical and bibliographical databases and has started a Wiki-style knowledge base. The Unit also contributes to A+M database interface standards and provides a search engine that offers a common interface to multiple numerical A+M/PMI databases. Coordinated Research Projects (CRPs) bring together fusion energy researchers and atomic, molecular and surface physicists for joint work towards the development of new data and new methods. The databases and current CRPs on A+M/PMI processes are briefly described here.
Library of molecular associations: curating the complex molecular basis of liver diseases.
Buchkremer, Stefan; Hendel, Jasmin; Krupp, Markus; Weinmann, Arndt; Schlamp, Kai; Maass, Thorsten; Staib, Frank; Galle, Peter R; Teufel, Andreas
2010-03-20
Systems biology approaches offer novel insights into the development of chronic liver diseases. Current genomic databases supporting systems biology analyses are mostly based on microarray data. Although these data often cover genome wide expression, the validity of single microarray experiments remains questionable. However, for systems biology approaches addressing the interactions of molecular networks comprehensive but also highly validated data are necessary. We have therefore generated the first comprehensive database for published molecular associations in human liver diseases. It is based on PubMed published abstracts and aimed to close the gap between genome wide coverage of low validity from microarray data and individual highly validated data from PubMed. After an initial text mining process, the extracted abstracts were all manually validated to confirm content and potential genetic associations and may therefore be highly trusted. All data were stored in a publicly available database, Library of Molecular Associations http://www.medicalgenomics.org/databases/loma/news, currently holding approximately 1260 confirmed molecular associations for chronic liver diseases such as HCC, CCC, liver fibrosis, NASH/fatty liver disease, AIH, PBC, and PSC. We furthermore transformed these data into a powerful resource for molecular liver research by connecting them to multiple biomedical information resources. Together, this database is the first available database providing a comprehensive view and analysis options for published molecular associations on multiple liver diseases.
Molecular Interaction Map of the Mammalian Cell Cycle Control and DNA Repair Systems
Kohn, Kurt W.
1999-01-01
Eventually to understand the integrated function of the cell cycle regulatory network, we must organize the known interactions in the form of a diagram, map, and/or database. A diagram convention was designed capable of unambiguous representation of networks containing multiprotein complexes, protein modifications, and enzymes that are substrates of other enzymes. To facilitate linkage to a database, each molecular species is symbolically represented only once in each diagram. Molecular species can be located on the map by means of indexed grid coordinates. Each interaction is referenced to an annotation list where pertinent information and references can be found. Parts of the network are grouped into functional subsystems. The map shows how multiprotein complexes could assemble and function at gene promoter sites and at sites of DNA damage. It also portrays the richness of connections between the p53-Mdm2 subsystem and other parts of the network. PMID:10436023
Protein-protein interaction networks: unraveling the wiring of molecular machines within the cell.
De Las Rivas, Javier; Fontanillo, Celia
2012-11-01
Mapping and understanding of the protein interaction networks with their key modules and hubs can provide deeper insights into the molecular machinery underlying complex phenotypes. In this article, we present the basic characteristics and definitions of protein networks, starting with a distinction of the different types of associations between proteins. We focus the review on protein-protein interactions (PPIs), a subset of associations defined as physical contacts between proteins that occur by selective molecular docking in a particular biological context. We present such definition as opposed to other types of protein associations derived from regulatory, genetic, structural or functional relations. To determine PPIs, a variety of binary and co-complex methods exist; however, not all the technologies provide the same information and data quality. A way of increasing confidence in a given protein interaction is to integrate orthogonal experimental evidences. The use of several complementary methods testing each single interaction assesses the accuracy of PPI data and tries to minimize the occurrence of false interactions. Following this approach there have been important efforts to unify primary databases of experimentally proven PPIs into integrated databases. These meta-databases provide a measure of the confidence of interactions based on the number of experimental proofs that report them. As a conclusion, we can state that integrated information allows the building of more reliable interaction networks. Identification of communities, cliques, modules and hubs by analysing the topological parameters and graph properties of the protein networks allows the discovery of central/critical nodes, which are candidates to regulate cellular flux and dynamics.
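The evidence-counting idea described above, where confidence in an interaction grows with the number of independent experimental methods that report it, can be sketched as follows. This is a minimal illustration only: the function name `count_evidence`, the record layout and the protein and method labels are assumptions made for this example and do not come from any particular meta-database.

```python
from collections import defaultdict


def count_evidence(records):
    """Count distinct experimental methods reporting each protein pair.

    `records` is an iterable of (protein_a, protein_b, method) tuples.
    Pairs are treated as unordered, so (A, B) and (B, A) are the same
    interaction, and a method reported twice is counted once.
    """
    methods_by_pair = defaultdict(set)
    for protein_a, protein_b, method in records:
        pair = tuple(sorted((protein_a, protein_b)))
        methods_by_pair[pair].add(method)
    return {pair: len(methods) for pair, methods in methods_by_pair.items()}


# Hypothetical records, invented purely for illustration.
records = [
    ("P1", "P2", "two-hybrid"),
    ("P2", "P1", "co-immunoprecipitation"),
    ("P1", "P2", "two-hybrid"),
    ("P3", "P4", "two-hybrid"),
]
scores = count_evidence(records)
```

Here the pair (P1, P2) is supported by two orthogonal methods and (P3, P4) by one, so a filter such as `score >= 2` would keep only the first. Real integrated databases may weight methods differently; a plain count is the simplest possible choice.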
Parametrization of an Orbital-Based Linear-Scaling Quantum Force Field for Noncovalent Interactions
2015-01-01
We parametrize a linear-scaling quantum mechanical force field called mDC for the accurate reproduction of nonbonded interactions. We provide a new benchmark database of accurate ab initio interactions between sulfur-containing molecules. A variety of nonbond databases are used to compare the new mDC method with other semiempirical, molecular mechanical, ab initio, and combined semiempirical quantum mechanical/molecular mechanical methods. It is shown that the molecular mechanical force field significantly and consistently reproduces the benchmark results with greater accuracy than the semiempirical models and our mDC model produces errors twice as small as the molecular mechanical force field. The comparisons between the methods are extended to the docking of drug candidates to the Cyclin-Dependent Kinase 2 protein receptor. We correlate the protein–ligand binding energies to their experimental inhibition constants and find that the mDC produces the best correlation. Condensed phase simulation of mDC water is performed and shown to produce O–O radial distribution functions similar to TIP4P-EW. PMID:24803856
Learning about Intermolecular Interactions from the Cambridge Structural Database
ERIC Educational Resources Information Center
Battle, Gary M.; Allen, Frank H.
2012-01-01
A clear understanding and appreciation of noncovalent interactions, especially hydrogen bonding, are vitally important to students of chemistry and the life sciences, including biochemistry, molecular biology, pharmacology, and medicine. The opportunities afforded by the IsoStar knowledge base of intermolecular interactions to enhance the…
Atlas - a data warehouse for integrative bioinformatics.
Shah, Sohrab P; Huang, Yong; Xu, Tao; Yuen, Macaire M S; Ling, John; Ouellette, B F Francis
2005-02-21
We present a biological data warehouse called Atlas that locally stores and integrates biological sequences, molecular interactions, homology information, functional annotations of genes, and biological ontologies. The goal of the system is to provide data, as well as a software infrastructure for bioinformatics research and development. The Atlas system is based on relational data models that we developed for each of the source data types. Data stored within these relational models are managed through Structured Query Language (SQL) calls that are implemented in a set of Application Programming Interfaces (APIs). The APIs include three languages: C++, Java, and Perl. The methods in these API libraries are used to construct a set of loader applications, which parse and load the source datasets into the Atlas database, and a set of toolbox applications which facilitate data retrieval. Atlas stores and integrates local instances of GenBank, RefSeq, UniProt, Human Protein Reference Database (HPRD), Biomolecular Interaction Network Database (BIND), Database of Interacting Proteins (DIP), Molecular Interactions Database (MINT), IntAct, NCBI Taxonomy, Gene Ontology (GO), Online Mendelian Inheritance in Man (OMIM), LocusLink, Entrez Gene and HomoloGene. The retrieval APIs and toolbox applications are critical components that offer end-users flexible, easy, integrated access to this data. We present use cases that use Atlas to integrate these sources for genome annotation, inference of molecular interactions across species, and gene-disease associations. The Atlas biological data warehouse serves as data infrastructure for bioinformatics research and development. It forms the backbone of the research activities in our laboratory and facilitates the integration of disparate, heterogeneous biological sources of data enabling new scientific inferences. Atlas achieves integration of diverse data sets at two levels. 
First, Atlas stores data of similar types using common data models, enforcing the relationships between data types. Second, integration is achieved through a combination of APIs, ontology, and tools. The Atlas software is freely available under the GNU General Public License at: http://bioinformatics.ubc.ca/atlas/
Atlas – a data warehouse for integrative bioinformatics
Shah, Sohrab P; Huang, Yong; Xu, Tao; Yuen, Macaire MS; Ling, John; Ouellette, BF Francis
2005-01-01
Background We present a biological data warehouse called Atlas that locally stores and integrates biological sequences, molecular interactions, homology information, functional annotations of genes, and biological ontologies. The goal of the system is to provide data, as well as a software infrastructure for bioinformatics research and development. Description The Atlas system is based on relational data models that we developed for each of the source data types. Data stored within these relational models are managed through Structured Query Language (SQL) calls that are implemented in a set of Application Programming Interfaces (APIs). The APIs include three languages: C++, Java, and Perl. The methods in these API libraries are used to construct a set of loader applications, which parse and load the source datasets into the Atlas database, and a set of toolbox applications which facilitate data retrieval. Atlas stores and integrates local instances of GenBank, RefSeq, UniProt, Human Protein Reference Database (HPRD), Biomolecular Interaction Network Database (BIND), Database of Interacting Proteins (DIP), Molecular Interactions Database (MINT), IntAct, NCBI Taxonomy, Gene Ontology (GO), Online Mendelian Inheritance in Man (OMIM), LocusLink, Entrez Gene and HomoloGene. The retrieval APIs and toolbox applications are critical components that offer end-users flexible, easy, integrated access to this data. We present use cases that use Atlas to integrate these sources for genome annotation, inference of molecular interactions across species, and gene-disease associations. Conclusion The Atlas biological data warehouse serves as data infrastructure for bioinformatics research and development. It forms the backbone of the research activities in our laboratory and facilitates the integration of disparate, heterogeneous biological sources of data enabling new scientific inferences. Atlas achieves integration of diverse data sets at two levels. 
First, Atlas stores data of similar types using common data models, enforcing the relationships between data types. Second, integration is achieved through a combination of APIs, ontology, and tools. The Atlas software is freely available under the GNU General Public License at: PMID:15723693
Kuang, Xingyan; Dhroso, Andi; Han, Jing Ginger; Shyu, Chi-Ren; Korkin, Dmitry
2016-01-01
Macromolecular interactions are formed between proteins, DNA and RNA molecules. Being a principal building block in macromolecular assemblies and pathways, the interactions underlie most cellular functions. Malfunctioning of macromolecular interactions is also linked to a number of diseases. Structural knowledge of the macromolecular interaction allows one to understand the interaction’s mechanism, determine its functional implications and characterize the effects of genetic variations, such as single nucleotide polymorphisms, on the interaction. Unfortunately, the interactions mediated by different types of macromolecules, e.g. protein–protein interactions or protein–DNA interactions, have until now been collected into individual and unrelated structural databases. This presents a significant obstacle in the analysis of macromolecular interactions. For instance, the homogeneous structural interaction databases prevent scientists from studying structural interactions of different types that occur in the same macromolecular complex. Here, we introduce DOMMINO 2.0, a structural Database Of Macro-Molecular INteractiOns. Compared to DOMMINO 1.0, a comprehensive database on protein-protein interactions, DOMMINO 2.0 includes the interactions between all three basic types of macromolecules extracted from PDB files. DOMMINO 2.0 is automatically updated on a weekly basis. It currently includes ∼1 040 000 interactions between two polypeptide subunits (e.g. domains, peptides, termini and interdomain linkers), ∼43 000 RNA-mediated interactions, and ∼12 000 DNA-mediated interactions. All protein structures in the database are annotated using SCOP and SUPERFAMILY family annotation. As a result, protein-mediated interactions involving protein domains, interdomain linkers, C- and N- termini, and peptides are identified. Our database provides an intuitive web interface, allowing one to investigate interactions at three different resolution levels: whole subunit network, binary interaction and interaction interface. Database URL: http://dommino.org PMID:26827237
The MIntAct project—IntAct as a common curation platform for 11 molecular interaction databases
Orchard, Sandra; Ammari, Mais; Aranda, Bruno; Breuza, Lionel; Briganti, Leonardo; Broackes-Carter, Fiona; Campbell, Nancy H.; Chavali, Gayatri; Chen, Carol; del-Toro, Noemi; Duesbury, Margaret; Dumousseau, Marine; Galeota, Eugenia; Hinz, Ursula; Iannuccelli, Marta; Jagannathan, Sruthi; Jimenez, Rafael; Khadake, Jyoti; Lagreid, Astrid; Licata, Luana; Lovering, Ruth C.; Meldal, Birgit; Melidoni, Anna N.; Milagros, Mila; Peluso, Daniele; Perfetto, Livia; Porras, Pablo; Raghunath, Arathi; Ricard-Blum, Sylvie; Roechert, Bernd; Stutz, Andre; Tognolli, Michael; van Roey, Kim; Cesareni, Gianni; Hermjakob, Henning
2014-01-01
IntAct (freely available at http://www.ebi.ac.uk/intact) is an open-source, open data molecular interaction database populated by data either curated from the literature or from direct data depositions. IntAct has developed a sophisticated web-based curation tool, capable of supporting both IMEx- and MIMIx-level curation. This tool is now utilized by multiple additional curation teams, all of whom annotate data directly into the IntAct database. Members of the IntAct team supply appropriate levels of training, perform quality control on entries and take responsibility for long-term data maintenance. Recently, the MINT and IntAct databases decided to merge their separate efforts to make optimal use of limited developer resources and maximize the curation output. All data manually curated by the MINT curators have been moved into the IntAct database at EMBL-EBI and are merged with the existing IntAct dataset. Both IntAct and MINT are active contributors to the IMEx consortium (http://www.imexconsortium.org). PMID:24234451
Update of KDBI: Kinetic Data of Bio-molecular Interaction database
Kumar, Pankaj; Han, B. C.; Shi, Z.; Jia, J.; Wang, Y. P.; Zhang, Y. T.; Liang, L.; Liu, Q. F.; Ji, Z. L.; Chen, Y. Z.
2009-01-01
Knowledge of the kinetics of biomolecular interactions is important for facilitating the study of cellular processes and underlying molecular events, and is essential for quantitative study and simulation of biological systems. Kinetic Data of Bio-molecular Interaction database (KDBI) has been developed to provide information about experimentally determined kinetic data of protein–protein, protein–nucleic acid, protein–ligand, nucleic acid–ligand binding or reaction events described in the literature. To accommodate increasing demand for studying and simulating biological systems, numerous improvements and updates have been made to KDBI, including new ways to access data by pathway and molecule names, data files in Systems Biology Markup Language format, a more efficient search engine, access to published parameter sets of simulation models of 63 pathways, and a 2.3-fold increase of data (19 263 entries of 10 532 distinctive biomolecular binding and 11 954 interaction events, involving 2635 proteins/protein complexes, 847 nucleic acids, 1603 small molecules and 45 multi-step processes). KDBI is publicly available at http://bidd.nus.edu.sg/group/kdbi/kdbi.asp. PMID:18971255
dbAMEPNI: a database of alanine mutagenic effects for protein–nucleic acid interactions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Ling; Xiong, Yi; Gao, Hongyun
Protein–nucleic acid interactions play essential roles in various biological activities such as gene regulation, transcription, DNA repair and DNA packaging. Understanding the effects of amino acid substitutions on protein–nucleic acid binding affinities can help elucidate the molecular mechanism of protein–nucleic acid recognition. Until now, no comprehensive and updated database of quantitative binding data on alanine mutagenic effects for protein–nucleic acid interactions is publicly accessible. Thus, we developed a new database of Alanine Mutagenic Effects for Protein-Nucleic Acid Interactions (dbAMEPNI). dbAMEPNI is a manually curated, literature-derived database, comprising over 577 alanine mutagenic data with experimentally determined binding affinities for protein–nucleic acid complexes. Here, it contains several important parameters, such as dissociation constant (Kd), Gibbs free energy change (ΔΔG), experimental conditions and structural parameters of mutant residues. In addition, the database provides an extended dataset of 282 single alanine mutations with only qualitative data (or descriptive effects) of thermodynamic information.
dbAMEPNI: a database of alanine mutagenic effects for protein–nucleic acid interactions
Liu, Ling; Xiong, Yi; Gao, Hongyun; ...
2018-04-02
Protein–nucleic acid interactions play essential roles in various biological activities such as gene regulation, transcription, DNA repair and DNA packaging. Understanding the effects of amino acid substitutions on protein–nucleic acid binding affinities can help elucidate the molecular mechanism of protein–nucleic acid recognition. Until now, no comprehensive and updated database of quantitative binding data on alanine mutagenic effects for protein–nucleic acid interactions is publicly accessible. Thus, we developed a new database of Alanine Mutagenic Effects for Protein-Nucleic Acid Interactions (dbAMEPNI). dbAMEPNI is a manually curated, literature-derived database, comprising over 577 alanine mutagenic data with experimentally determined binding affinities for protein–nucleic acid complexes. Here, it contains several important parameters, such as dissociation constant (Kd), Gibbs free energy change (ΔΔG), experimental conditions and structural parameters of mutant residues. In addition, the database provides an extended dataset of 282 single alanine mutations with only qualitative data (or descriptive effects) of thermodynamic information.
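The two quantities named above, Kd and ΔΔG, are linked by the standard thermodynamic relation ΔG = RT ln(Kd), so the effect of a mutation can be written as ΔΔG = RT ln(Kd,mut / Kd,wt). The sketch below applies that relation. The function name, the default temperature and the sign convention (positive values mean the alanine mutant binds more weakly) are assumptions made for this example; the abstract does not say how dbAMEPNI itself derives or signs its values.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # gas constant R in kcal/(mol*K)


def ddg_from_kd(kd_wild_type, kd_mutant, temperature_k=298.15):
    """Return the binding free energy change in kcal/mol.

    Uses ddG = R * T * ln(Kd_mutant / Kd_wild_type). Both Kd values must
    be positive and in the same units. A positive result means the
    mutant binds more weakly than the wild type (assumed convention).
    """
    if kd_wild_type <= 0 or kd_mutant <= 0:
        raise ValueError("dissociation constants must be positive")
    return GAS_CONSTANT_KCAL * temperature_k * math.log(kd_mutant / kd_wild_type)


# Illustrative numbers only: a mutation that weakens binding tenfold.
ddg_tenfold = ddg_from_kd(kd_wild_type=1e-9, kd_mutant=1e-8)
```

Because only the ratio of the two constants enters the formula, the result does not depend on whether Kd is given in M, µM or nM, as long as both values use the same unit.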
Kleinau, Gunnar; Kreuchwig, Annika; Worth, Catherine L; Krause, Gerd
2010-06-01
The collection, description and molecular analysis of naturally occurring (pathogenic) mutations are important for understanding the functional mechanisms and malfunctions of biological units such as proteins. Numerous databases collate a huge amount of functional data or descriptions of mutations, but tools to analyse the molecular effects of genetic variations are as yet poorly provided. The goal of this work was therefore to develop a translational web-application that facilitates the interactive linkage of functional and structural data and which helps improve our understanding of the molecular basis of naturally occurring gain- or loss- of function mutations. Here we focus on the human glycoprotein hormone receptors (GPHRs), for which a huge number of mutations are known to cause diseases. We describe new options for interactive data analyses within three-dimensional structures, which enable the assignment of molecular relationships between structure and function. Strikingly, as the functional data are converted into relational percentage values, the system allows the comparison and classification of data from different GPHR subtypes and different experimental approaches. Our new application has been incorporated into a freely available database and website for the GPHRs (http://www.ssfa-gphr.de), but the principle development would also be applicable to other macromolecules.
Flexible network reconstruction from relational databases with Cytoscape and CytoSQL
2010-01-01
Background Molecular interaction networks can be efficiently studied using network visualization software such as Cytoscape. The relevant nodes, edges and their attributes can be imported in Cytoscape in various file formats, or directly from external databases through specialized third party plugins. However, molecular data are often stored in relational databases with their own specific structure, for which dedicated plugins do not exist. Therefore, a more generic solution is presented. Results A new Cytoscape plugin 'CytoSQL' is developed to connect Cytoscape to any relational database. It allows users to launch SQL ('Structured Query Language') queries from within Cytoscape, with the option to inject node or edge features of an existing network as SQL arguments, and to convert the retrieved data to Cytoscape network components. Supported by a set of case studies we demonstrate the flexibility and the power of the CytoSQL plugin in converting specific data subsets into meaningful network representations. Conclusions CytoSQL offers a unified approach to let Cytoscape interact with relational databases. Thanks to the power of the SQL syntax, this tool can rapidly generate and enrich networks according to very complex criteria. The plugin is available at http://www.ptools.ua.ac.be/CytoSQL. PMID:20594316
Flexible network reconstruction from relational databases with Cytoscape and CytoSQL.
Laukens, Kris; Hollunder, Jens; Dang, Thanh Hai; De Jaeger, Geert; Kuiper, Martin; Witters, Erwin; Verschoren, Alain; Van Leemput, Koenraad
2010-07-01
Molecular interaction networks can be efficiently studied using network visualization software such as Cytoscape. The relevant nodes, edges and their attributes can be imported in Cytoscape in various file formats, or directly from external databases through specialized third party plugins. However, molecular data are often stored in relational databases with their own specific structure, for which dedicated plugins do not exist. Therefore, a more generic solution is presented. A new Cytoscape plugin 'CytoSQL' is developed to connect Cytoscape to any relational database. It allows users to launch SQL ('Structured Query Language') queries from within Cytoscape, with the option to inject node or edge features of an existing network as SQL arguments, and to convert the retrieved data to Cytoscape network components. Supported by a set of case studies we demonstrate the flexibility and the power of the CytoSQL plugin in converting specific data subsets into meaningful network representations. CytoSQL offers a unified approach to let Cytoscape interact with relational databases. Thanks to the power of the SQL syntax, this tool can rapidly generate and enrich networks according to very complex criteria. The plugin is available at http://www.ptools.ua.ac.be/CytoSQL.
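The general pattern described above, injecting the node names of an existing network as SQL arguments and turning the returned rows into edges, might look like the sketch below. It does not use CytoSQL or Cytoscape: it relies only on Python's standard `sqlite3` module, and the `interaction` table, its columns and the gene names are hypothetical, invented for this illustration.

```python
import sqlite3

# Build a small in-memory database with a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE interaction (source TEXT, target TEXT, kind TEXT)")
conn.executemany(
    "INSERT INTO interaction VALUES (?, ?, ?)",
    [
        ("geneA", "geneB", "binds"),
        ("geneA", "geneC", "activates"),
        ("geneD", "geneE", "binds"),
    ],
)


def expand_network(connection, seed_nodes):
    """Return edges whose source is one of the seed nodes.

    Node names are passed as bound parameters (never string-formatted
    into the query), and each row becomes an edge dictionary.
    """
    seed_nodes = list(seed_nodes)
    if not seed_nodes:
        return []
    placeholders = ",".join("?" for _ in seed_nodes)
    query = (
        "SELECT source, target, kind FROM interaction "
        f"WHERE source IN ({placeholders}) ORDER BY source, target"
    )
    rows = connection.execute(query, seed_nodes).fetchall()
    return [{"source": s, "target": t, "kind": k} for s, t, k in rows]


edges = expand_network(conn, ["geneA"])
```

Only the placeholder string is built dynamically; the node names themselves travel as query parameters, which keeps the query safe even when names come from user-edited network attributes.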
mpMoRFsDB: a database of molecular recognition features in membrane proteins.
Gypas, Foivos; Tsaousis, Georgios N; Hamodrakas, Stavros J
2013-10-01
Molecular recognition features (MoRFs) are small, intrinsically disordered regions in proteins that undergo a disorder-to-order transition on binding to their partners. MoRFs are involved in protein-protein interactions and may function as the initial step in molecular recognition. The aim of this work was to collect, organize and store all membrane proteins that contain MoRFs. Membrane proteins constitute ∼30% of fully sequenced proteomes and are responsible for a wide variety of cellular functions. MoRFs were classified according to their secondary structure, after interacting with their partners. We identified MoRFs in transmembrane and peripheral membrane proteins. The position of transmembrane protein MoRFs was determined in relation to a protein's topology. All information was stored in a publicly available mySQL database with a user-friendly web interface. A Jmol applet is integrated for visualization of the structures. mpMoRFsDB provides valuable information related to disorder-based protein-protein interactions in membrane proteins. http://bioinformatics.biol.uoa.gr/mpMoRFsDB
Veluraja, Kasinadar; Selvin, Jeyasigamani F A; Venkateshwari, Selvakumar; Priyadarzini, Thanu R K
2010-09-23
The inherent flexibility and lack of strong intramolecular interactions of oligosaccharides demand the use of theoretical methods for their structural elucidation. In spite of the development of theoretical methods, little glycoinformatics research has been done so far compared with bioinformatics research on proteins and nucleic acids. We have developed a three-dimensional structural database for sialic acid-containing carbohydrates (3DSDSCAR). This is an open-access database that provides 3D structural models of a given sialic acid-containing carbohydrate. At present, 3DSDSCAR contains 60 conformational models, belonging to 14 different sialic acid-containing carbohydrates, deduced through 10 ns molecular dynamics (MD) simulations. The database is available at the URL: http://www.3dsdscar.org. Copyright 2010 Elsevier Ltd. All rights reserved.
Pharmit: interactive exploration of chemical space.
Sunseri, Jocelyn; Koes, David Ryan
2016-07-08
Pharmit (http://pharmit.csb.pitt.edu) provides an online, interactive environment for the virtual screening of large compound databases using pharmacophores, molecular shape and energy minimization. Users can import, create and edit virtual screening queries in an interactive browser-based interface. Queries are specified in terms of a pharmacophore, a spatial arrangement of the essential features of an interaction, and molecular shape. Search results can be further ranked and filtered using energy minimization. In addition to a number of pre-built databases of popular compound libraries, users may submit their own compound libraries for screening. Pharmit uses state-of-the-art sub-linear algorithms to provide interactive screening of millions of compounds. Queries typically take a few seconds to a few minutes depending on their complexity. This allows users to iteratively refine their search during a single session. The easy access to large chemical datasets provided by Pharmit simplifies and accelerates structure-based drug design. Pharmit is available under a dual BSD/GPL open-source license. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Peng, Jiale; Li, Yaping; Zhou, Yeheng; Zhang, Li; Liu, Xingyong; Zuo, Zhili
2018-05-29
Gout is a common inflammatory arthritis caused by the deposition of urate crystals within joints. Its prevalence has increased over the past few decades, as shown by epidemiological surveys. Xanthine oxidase (XO) is a key enzyme that converts hypoxanthine and xanthine to uric acid, whose overproduction leads to gout. Therefore, inhibiting the activity of xanthine oxidase is an important way to reduce the production of urate. In this study, pharmacophore modeling was used to filter databases in order to identify potential natural products targeting XO. Two kinds of pharmacophore model, ligand-based and receptor-ligand-based, were constructed with Discovery Studio. GOLD was then used to refine the potential compounds with higher fitness scores. Finally, molecular docking and dynamics simulations were employed to analyze the interactions between compounds and protein. The best hypothesis was set as a 3D query to screen the database, returning 785 and 297 compounds, respectively. A merged set of these 1082 molecules was subjected to molecular docking, which returned 144 hits with high fitness scores. These molecules were clustered into four main kinds according to their backbones. Moreover, molecular docking showed that the representative compounds established key interactions with the amino acid residues in the protein, and the RMSD and RMSF from the molecular dynamics results showed that these compounds can stabilize the protein. The information presented in the study confirmed previous reports, and it may assist in the discovery and design of new backbones as potential XO inhibitors based on natural products.
Coordinated Research Projects of the IAEA Atomic and Molecular Data Unit
NASA Astrophysics Data System (ADS)
Braams, B. J.; Chung, H.-K.
2011-05-01
The IAEA Atomic and Molecular Data Unit is dedicated to the provision of databases for atomic, molecular and plasma-material interaction (AM/PMI) data that are relevant for nuclear fusion research. IAEA Coordinated Research Projects (CRPs) are the principal mechanism by which the Unit encourages data evaluation and the production of new data. Ongoing and planned CRPs on AM/PMI data are briefly described here.
DITOP: drug-induced toxicity related protein database.
Zhang, Jing-Xian; Huang, Wei-Juan; Zeng, Jing-Hua; Huang, Wen-Hui; Wang, Yi; Zhao, Rui; Han, Bu-Cong; Liu, Qing-Feng; Chen, Yu-Zong; Ji, Zhi-Liang
2007-07-01
Drug-induced toxicity related proteins (DITRPs) are proteins that mediate adverse drug reactions (ADRs) or toxicities through their binding to drugs or reactive metabolites. Collection of these proteins facilitates better understanding of the molecular mechanisms of drug-induced toxicity and rational drug discovery. The drug-induced toxicity related protein database (DITOP) is intended to provide comprehensive information on DITRPs. Currently, DITOP contains 1501 records, covering 618 distinct literature-reported DITRPs, 529 drugs/ligands and 418 distinct toxicity terms. These proteins were confirmed experimentally to interact with drugs or their reactive metabolites, thus directly or indirectly causing adverse effects or toxicities. Five major types of drug-induced toxicities or ADRs are included in DITOP: idiosyncratic adverse drug reactions, dose-dependent toxicities, drug-drug interactions, immune-mediated adverse drug effects (IMADEs) and toxicities caused by genetic susceptibility. Molecular mechanisms underlying the toxicity and cross-links to related resources are also provided where available. Moreover, a series of user-friendly interfaces were designed for flexible retrieval of DITRP-related information. DITOP can be accessed freely at http://bioinf.xmu.edu.cn/databases/ADR/index.html. Supplementary data are available at Bioinformatics online.
Broadening the horizon – level 2.5 of the HUPO-PSI format for molecular interactions
Kerrien, Samuel; Orchard, Sandra; Montecchi-Palazzi, Luisa; Aranda, Bruno; Quinn, Antony F; Vinod, Nisha; Bader, Gary D; Xenarios, Ioannis; Wojcik, Jérôme; Sherman, David; Tyers, Mike; Salama, John J; Moore, Susan; Ceol, Arnaud; Chatr-aryamontri, Andrew; Oesterheld, Matthias; Stümpflen, Volker; Salwinski, Lukasz; Nerothin, Jason; Cerami, Ethan; Cusick, Michael E; Vidal, Marc; Gilson, Michael; Armstrong, John; Woollard, Peter; Hogue, Christopher; Eisenberg, David; Cesareni, Gianni; Apweiler, Rolf; Hermjakob, Henning
2007-01-01
Background: Molecular interaction information is a key resource in modern biomedical research. Publicly available data have previously been provided in a broad array of diverse formats, making access to these data very difficult. The publication and wide implementation of the Human Proteome Organisation Proteomics Standards Initiative Molecular Interactions (HUPO PSI-MI) format in 2004 was a major step towards the establishment of a single, unified format by which molecular interactions should be presented, but it focused purely on protein-protein interactions. Results: The HUPO-PSI has further developed the PSI-MI XML schema to enable the description of interactions between a wider range of molecular types, for example nucleic acids, chemical entities, and molecular complexes. Extensive details about each supported molecular interaction can now be captured, including the biological role of each molecule within that interaction, detailed description of interacting domains, and the kinetic parameters of the interaction. The format is supported by data management and analysis tools and has been adopted by major interaction data providers. Additionally, a simpler, tab-delimited format, MITAB2.5, has been developed for the benefit of users who require only minimal information in an easy-to-access configuration. Conclusion: The PSI-MI XML2.5 and MITAB2.5 formats have been jointly developed by interaction data producers and providers from both the academic and commercial sectors, and are already widely implemented and well supported by an active development community. PSI-MI XML2.5 enables the description of highly detailed molecular interaction data and facilitates data exchange between databases and users without loss of information. MITAB2.5 is a simpler format appropriate for fast Perl parsing or loading into Microsoft Excel. PMID:17925023
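As a rough illustration of why a tab-delimited format is easy to consume, the Python sketch below reads MITAB2.5 rows into dictionaries. It assumes the usual 15-column MITAB2.5 layout; the short field names, the parse_mitab helper and the sample row (with made-up identifiers) are illustrative assumptions and are not taken from the abstract.

```python
import csv
import io

# Short names for the 15 MITAB2.5 columns, in order (the names are illustrative).
MITAB25_COLUMNS = [
    "id_a", "id_b", "alt_id_a", "alt_id_b", "alias_a", "alias_b",
    "detection_method", "first_author", "publication_id",
    "taxid_a", "taxid_b", "interaction_type", "source_db",
    "interaction_id", "confidence",
]


def split_field(value):
    """Split a multi-valued field on '|'; a lone '-' marks an empty field.

    This is a naive split: it does not handle '|' inside quoted text.
    """
    return [] if value == "-" else value.split("|")


def parse_mitab(handle):
    """Yield one dict per interaction row, skipping blank and '#' header lines."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row or row[0].startswith("#"):
            continue
        yield {
            name: split_field(value)
            for name, value in zip(MITAB25_COLUMNS, row[:15])
        }


# Made-up sample row for demonstration only; the identifiers are not real records.
sample = "\t".join([
    "uniprotkb:P12345", "uniprotkb:Q67890", "-", "-", "-", "-",
    'psi-mi:"MI:0018"(two hybrid)', "Doe et al. (2007)", "pubmed:00000000",
    "taxid:9606(human)", "taxid:9606(human)",
    'psi-mi:"MI:0915"(physical association)', 'psi-mi:"MI:0469"(IntAct)',
    "intact:EBI-0000000", "-",
])

records = list(parse_mitab(io.StringIO(sample + "\n")))
```

Each record maps a column name to a list of values, so multi-valued fields such as aliases can be iterated without further splitting.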
Wu, Chia-Chou; Chen, Bor-Sen
2016-01-01
Infected zebrafish coordinates defensive and offensive molecular mechanisms in response to Candida albicans infections, and invasive C. albicans coordinates corresponding molecular mechanisms to interact with the host. However, knowledge of the ensuing infection-activated signaling networks in both host and pathogen and their interspecific crosstalk during the innate and adaptive phases of the infection processes remains incomplete. In the present study, dynamic network modeling, protein interaction databases, and dual transcriptome data from zebrafish and C. albicans during infection were used to infer infection-activated host–pathogen dynamic interaction networks. The consideration of host–pathogen dynamic interaction systems as innate and adaptive loops and subsequent comparisons of inferred innate and adaptive networks indicated previously unrecognized crosstalk between known pathways and suggested roles of immunological memory in the coordination of host defensive and offensive molecular mechanisms to achieve specific and powerful defense against pathogens. Moreover, pathogens enhance intraspecific crosstalk and abrogate host apoptosis to accommodate enhanced host defense mechanisms during the adaptive phase. Accordingly, links between physiological phenomena and changes in the coordination of defensive and offensive molecular mechanisms highlight the importance of host–pathogen molecular interaction networks, and consequent inferences of the host–pathogen relationship could be translated into biomedical applications. PMID:26881892
Hall, Aaron Smalter; Shan, Yunfeng; Lushington, Gerald; Visvanathan, Mahesh
2016-01-01
Databases and exchange formats describing biological entities such as chemicals and proteins, along with their relationships, are a critical component of research in life sciences disciplines, including chemical biology, wherein information about small molecule properties converges with cellular and molecular biology. Databases for storing biological entities are growing not only in size, but also in type, with many similarities between them and often subtle differences. The data formats available to describe and exchange these entities are numerous as well. In general, each format is optimized for a particular purpose or database, and hence some understanding of these formats is required when choosing one for research purposes. This paper reviews a selection of different databases and data formats with the goal of summarizing their purposes, features, and limitations. Databases are reviewed under the categories of 1) protein interactions, 2) metabolic pathways, 3) chemical interactions, and 4) drug discovery. Representation formats are discussed according to those describing chemical structures, and those describing genomic/proteomic entities. PMID:22934944
Smalter Hall, Aaron; Shan, Yunfeng; Lushington, Gerald; Visvanathan, Mahesh
2013-03-01
Databases and exchange formats describing biological entities such as chemicals and proteins, along with their relationships, are a critical component of research in life sciences disciplines, including chemical biology, wherein information about small molecule properties converges with cellular and molecular biology. Databases for storing biological entities are growing not only in size, but also in type, with many similarities between them and often subtle differences. The data formats available to describe and exchange these entities are numerous as well. In general, each format is optimized for a particular purpose or database, and hence some understanding of these formats is required when choosing one for research purposes. This paper reviews a selection of different databases and data formats with the goal of summarizing their purposes, features, and limitations. Databases are reviewed under the categories of 1) protein interactions, 2) metabolic pathways, 3) chemical interactions, and 4) drug discovery. Representation formats are discussed according to those describing chemical structures, and those describing genomic/proteomic entities.
Hung, Tzu-Chieh; Lee, Wen-Yuan; Chen, Kuen-Bao; Chan, Yueh-Chiu; Chen, Calvin Yu-Chian
2014-01-01
Acquired immunodeficiency syndrome (AIDS), caused by human immunodeficiency virus (HIV), has become a serious global problem because of the rapid spread of the disease, and it cannot be treated. Recent studies indicate that VIF is an HIV protein that prevents human immunity from attacking HIV. Compounds from the traditional Chinese medicine (TCM) database were filtered through molecular docking and molecular dynamics simulations to find inhibitors of VIF that could protect against HIV. Glutamic acid, plantagoguanidinic acid, and aurantiamide acetate were selected because their docking scores were higher than those of other TCM compounds. Molecular dynamics is useful for analyzing and detecting ligand interactions. Based on the docking position, hydrophobic interactions, hydrogen bonding changes, and structural variation, the study suggests that the TCM compound aurantiamide acetate is better than the others at maintaining protein-ligand interactions and the protein composition.
Morales-Bayuelo, Alejandro
2017-06-21
Mycobacterium tuberculosis remains one of the world's most devastating pathogens. For this reason, we developed a study involving 3D pharmacophore searching, selectivity analysis and database screening for a series of anti-tuberculosis compounds, associated with the protein kinases A, B, and G. This theoretical study is expected to shed some light onto some molecular aspects that could contribute to the knowledge of the molecular mechanics behind interactions of these compounds, with anti-tuberculosis activity. Using the Molecular Quantum Similarity field and reactivity descriptors supported in the Density Functional Theory, it was possible to measure the quantification of the steric and electrostatic effects through the Overlap and Coulomb quantitative convergence (alpha and beta) scales. In addition, an analysis of reactivity indices using global and local descriptors was developed, identifying the binding sites and selectivity on these anti-tuberculosis compounds in the active sites. Finally, the reported pharmacophores to PKn A, B and G, were used to carry out database screening, using a database with anti-tuberculosis drugs from the Kelly Chibale research group (http://www.kellychibaleresearch.uct.ac.za/), to find the compounds with affinity for the specific protein targets associated with PKn A, B and G. In this regard, this hybrid methodology (Molecular Mechanic/Quantum Chemistry) shows new insights into drug design that may be useful in the tuberculosis treatment today.
HAEdb: a novel interactive, locus-specific mutation database for the C1 inhibitor gene.
Kalmár, Lajos; Hegedüs, Tamás; Farkas, Henriette; Nagy, Melinda; Tordai, Attila
2005-01-01
Hereditary angioneurotic edema (HAE) is an autosomal dominant disorder characterized by episodic local subcutaneous and submucosal edema and is caused by the deficiency of the activated C1 esterase inhibitor protein (C1-INH or C1INH; approved gene symbol SERPING1). Published C1-INH mutations are represented in large universal databases (e.g., OMIM, HGMD), but these databases update their data rather infrequently, they are not interactive, and they do not allow searches according to different criteria. The HAEdb, a C1-INH gene mutation database (http://hae.biomembrane.hu) was created to contribute to the following expectations: 1) help the comprehensive collection of information on genetic alterations of the C1-INH gene; 2) create a database in which data can be searched and compared according to several flexible criteria; and 3) provide additional help in new mutation identification. The website uses MySQL, an open-source, multithreaded, relational database management system. The user-friendly graphical interface was written in the PHP web programming language. The website consists of two main parts, the freely browsable search function, and the password-protected data deposition function. Mutations of the C1-INH gene are divided in two parts: gross mutations involving DNA fragments >1 kb, and micro mutations encompassing all non-gross mutations. Several attributes (e.g., affected exon, molecular consequence, family history) are collected for each mutation in a standardized form. This database may facilitate future comprehensive analyses of C1-INH mutations and also provide regular help for molecular diagnostic testing of HAE patients in different centers.
UbSRD: The Ubiquitin Structural Relational Database.
Harrison, Joseph S; Jacobs, Tim M; Houlihan, Kevin; Van Doorslaer, Koenraad; Kuhlman, Brian
2016-02-22
The structurally defined ubiquitin-like homology fold (UBL) can engage in several unique protein-protein interactions and many of these complexes have been characterized with high-resolution techniques. Using Rosetta's structural classification tools, we have created the Ubiquitin Structural Relational Database (UbSRD), an SQL database of features for all 509 UBL-containing structures in the PDB, allowing users to browse these structures by protein-protein interaction and providing a platform for quantitative analysis of structural features. We used UbSRD to define the recognition features of ubiquitin (UBQ) and SUMO observed in the PDB and the orientation of the UBQ tail while interacting with certain types of proteins. While some of the interaction surfaces on UBQ and SUMO overlap, each molecule has distinct features that aid in molecular discrimination. Additionally, we find that the UBQ tail is malleable and can adopt a variety of conformations upon binding. UbSRD is accessible as an online resource at rosettadesign.med.unc.edu/ubsrd. Copyright © 2015 Elsevier Ltd. All rights reserved.
S66: A Well-balanced Database of Benchmark Interaction Energies Relevant to Biomolecular Structures
2011-01-01
With numerous new quantum chemistry methods being developed in recent years and the promise of even more new methods to be developed in the near future, it is clearly critical that highly accurate, well-balanced, reference data for many different atomic and molecular properties be available for the parametrization and validation of these methods. One area of research that is of particular importance in many areas of chemistry, biology, and material science is the study of noncovalent interactions. Because these interactions are often strongly influenced by correlation effects, it is necessary to use computationally expensive high-order wave function methods to describe them accurately. Here, we present a large new database of interaction energies calculated using an accurate CCSD(T)/CBS scheme. Data are presented for 66 molecular complexes, at their reference equilibrium geometries and at 8 points systematically exploring their dissociation curves; in total, the database contains 594 points: 66 at equilibrium geometries, and 528 in dissociation curves. The data set is designed to cover the most common types of noncovalent interactions in biomolecules, while keeping a balanced representation of dispersion and electrostatic contributions. The data set is therefore well suited for testing and development of methods applicable to bioorganic systems. In addition to the benchmark CCSD(T) results, we also provide decompositions of the interaction energies by means of DFT-SAPT calculations. The data set was used to test several correlated QM methods, including those parametrized specifically for noncovalent interactions. Among these, the SCS-MI-CCSD method outperforms all other tested methods, with a root-mean-square error of 0.08 kcal/mol for the S66 data set. PMID:21836824
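The point counts and the error metric quoted above can be reproduced with a few lines of Python. This is a minimal sketch: the rmse helper is a generic root-mean-square error, and nothing here reproduces the S66 energies themselves.

```python
import math

N_COMPLEXES = 66            # molecular complexes in the data set
N_DISSOCIATION_POINTS = 8   # points sampled along each dissociation curve

equilibrium_points = N_COMPLEXES
dissociation_points = N_COMPLEXES * N_DISSOCIATION_POINTS
total_points = equilibrium_points + dissociation_points


def rmse(predicted, reference):
    """Root-mean-square error between two equal-length sequences of energies."""
    if len(predicted) != len(reference):
        raise ValueError("sequences must have the same length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))
```

A tested method would be scored by passing its interaction energies and the benchmark values, in the same units (kcal/mol in the abstract), to rmse.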
Song, Huiying; Vanderheyden, Yoachim; Adams, Erwin; Desmet, Gert; Cabooter, Deirdre
2016-07-15
Diffusion plays an important role in all aspects of band broadening in chromatography. An accurate knowledge of molecular diffusion coefficients in different mobile phases is therefore crucial in fundamental column performance studies. Correlations available in literature, such as the Wilke-Chang equation, can provide good approximations of molecular diffusion under reversed-phase conditions. However, these correlations have been demonstrated to be less accurate for mobile phases containing a large percentage of acetonitrile, as is the case in hydrophilic interaction liquid chromatography. A database of experimentally measured molecular diffusion coefficients of some 45 polar and apolar compounds that are frequently used as test molecules under hydrophilic interaction liquid chromatography and reversed-phase conditions is therefore presented. Special attention is given to diffusion coefficients of polar compounds obtained in large percentages of acetonitrile (>90%). The effect of the buffer concentration (5-10mM ammonium acetate) on the obtained diffusion coefficients is investigated and is demonstrated to mainly influence the molecular diffusion of charged molecules. Diffusion coefficients are measured using the Taylor-Aris method and hence deduced from the peak broadening of a solute when flowing through a long open tube. The validity of the set-up employed for the measurement of the diffusion coefficients is demonstrated by ruling out the occurrence of longitudinal diffusion, secondary flow interactions and extra-column effects, while it is also shown that radial equilibration in the 15m long capillary is effective. Copyright © 2016 Elsevier B.V. All rights reserved.
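The two approaches named above can be sketched in Python. The first function uses the commonly quoted limiting form of the Taylor-Aris relation, D_m = r^2 t_R / (24 sigma_t^2), which assumes longitudinal diffusion is negligible; the second is the Wilke-Chang correlation in its usual units. Function and parameter names are illustrative, and the numerical inputs in the usage lines are textbook-style assumptions, not values from the paper.

```python
def taylor_aris_diffusion(radius_m, retention_time_s, sigma_t_s):
    """Diffusion coefficient (m^2/s) from peak broadening in an open tube.

    radius_m: inner radius of the tube (m)
    retention_time_s: mean residence time of the peak (s)
    sigma_t_s: temporal standard deviation of the peak (s)
    """
    return radius_m ** 2 * retention_time_s / (24.0 * sigma_t_s ** 2)


def wilke_chang_diffusion(phi, solvent_molar_mass, temperature_k,
                          viscosity_cp, solute_molar_volume):
    """Wilke-Chang estimate of the diffusion coefficient (cm^2/s).

    phi: solvent association factor (dimensionless, 2.6 for water)
    solvent_molar_mass: g/mol
    temperature_k: K
    viscosity_cp: solvent viscosity in centipoise
    solute_molar_volume: cm^3/mol at the normal boiling point
    """
    return (7.4e-8 * (phi * solvent_molar_mass) ** 0.5 * temperature_k
            / (viscosity_cp * solute_molar_volume ** 0.6))


# Illustrative inputs only: a small organic solute (assumed molar volume
# 96 cm^3/mol) in water at 298.15 K with an assumed viscosity of 0.89 cP.
d_estimate = wilke_chang_diffusion(2.6, 18.02, 298.15, 0.89, 96.0)
```

Note the unit mismatch between the two functions (m^2/s versus cm^2/s); a comparison between measured and estimated values needs a factor of 1e4.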
RISE: a database of RNA interactome from sequencing experiments
Gong, Jing; Shao, Di; Xu, Kui
2018-01-01
We present RISE (http://rise.zhanglab.net), a database of RNA Interactome from Sequencing Experiments. RNA-RNA interactions (RRIs) are essential for RNA regulation and function. RISE provides a comprehensive collection of RRIs that mainly come from recent transcriptome-wide sequencing-based experiments like PARIS, SPLASH, LIGR-seq, and MARIO, as well as targeted studies like RIA-seq, RAP-RNA and CLASH. It also includes interactions aggregated from other primary databases and publications. The RISE database currently contains 328,811 RNA-RNA interactions, mainly in human, mouse and yeast. While most existing RNA databases mainly contain interactions of miRNA targeting, more than half of the RRIs in RISE are among mRNAs and long non-coding RNAs. We compared different RRI datasets in RISE and found limited overlaps in interactions resolved by different techniques and in different cell lines. This may reflect technology preferences as well as the dynamic nature of RRIs. We also analyzed the basic features of the human and mouse RRI networks and found that they tend to be scale-free, small-world, hierarchical and modular. The analysis may nominate important RNAs or RRIs for further investigation. Finally, RISE provides a Circos plot and several table views for integrative visualization, with extensive molecular and functional annotations to facilitate exploration of biological functions for any RRI of interest. PMID:29040625
Robinson, J M; Henderson, W A
2018-01-12
We report a method using functional-molecular databases and network modelling to identify hypothetical mRNA-miRNA interaction networks regulating intestinal epithelial barrier function. The model forms a data-analysis component of our cell culture experiments, which produce RNA expression data from Nanostring Technologies nCounter ® system. The epithelial tight-junction (TJ) and actin cytoskeleton interact as molecular components of the intestinal epithelial barrier. Upstream regulation of TJ-cytoskeleton interaction is effected by the Rac/Rock/Rho signaling pathway and other associated pathways which may be activated or suppressed by extracellular signaling from growth factors, hormones, and immune receptors. Pathway activations affect epithelial homeostasis, contributing to degradation of the epithelial barrier associated with osmotic dysregulation, inflammation, and tumor development. The complexity underlying miRNA-mRNA interaction networks represents a roadblock for prediction and validation of competing-endogenous RNA network function. We developed a network model to identify hypothetical co-regulatory motifs in a miRNA-mRNA interaction network related to epithelial function. A mRNA-miRNA interaction list was generated using KEGG and miRWalk2.0 databases. R-code was developed to quantify and visualize inherent network structures. We identified a sub-network with a high number of shared, targeting miRNAs, of genes associated with cellular proliferation and cancer, including c-MYC and Cyclin D.
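The idea of finding genes that share many targeting miRNAs can be sketched as follows. The authors describe R code; this Python version is only an illustrative assumption about one way to count shared regulators, and the interaction list uses placeholder names rather than real KEGG or miRWalk2.0 entries.

```python
from collections import defaultdict
from itertools import combinations


def shared_mirna_counts(interactions):
    """Count the miRNAs shared by each pair of target genes.

    interactions: iterable of (mirna, mrna) pairs.
    Returns {(gene_a, gene_b): number_of_shared_mirnas} for pairs that
    share at least one miRNA, with gene names sorted within each key.
    """
    targeting = defaultdict(set)
    for mirna, mrna in interactions:
        targeting[mrna].add(mirna)

    counts = {}
    for gene_a, gene_b in combinations(sorted(targeting), 2):
        shared = targeting[gene_a] & targeting[gene_b]
        if shared:
            counts[(gene_a, gene_b)] = len(shared)
    return counts


# Placeholder interaction list for demonstration only.
toy_interactions = [
    ("miR-a", "GENE1"), ("miR-a", "GENE2"),
    ("miR-b", "GENE1"), ("miR-b", "GENE2"), ("miR-b", "GENE3"),
    ("miR-c", "GENE3"),
]
toy_counts = shared_mirna_counts(toy_interactions)
```

Gene pairs with high counts are candidates for a co-regulated sub-network; a threshold on the count would be a further modelling choice.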
BIOSPIDA: A Relational Database Translator for NCBI.
Hagen, Matthew S; Lee, Eva K
2010-11-13
As the volume and availability of biological databases continue widespread growth, it has become increasingly difficult for research scientists to identify all relevant information for biological entities of interest. Details of nucleotide sequences, gene expression, molecular interactions, and three-dimensional structures are maintained across many different databases. To retrieve all necessary information requires an integrated system that can query multiple databases with minimized overhead. This paper introduces a universal parser and relational schema translator that can be utilized for all NCBI databases in Abstract Syntax Notation (ASN.1). The data models for OMIM, Entrez-Gene, Pubmed, MMDB and GenBank have been successfully converted into relational databases and all are easily linkable helping to answer complex biological questions. These tools facilitate research scientists to locally integrate databases from NCBI without significant workload or development time.
The Comparative Toxicogenomics Database: update 2017.
Davis, Allan Peter; Grondin, Cynthia J; Johnson, Robin J; Sciaky, Daniela; King, Benjamin L; McMorran, Roy; Wiegers, Jolene; Wiegers, Thomas C; Mattingly, Carolyn J
2017-01-04
The Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) provides information about interactions between chemicals and gene products, and their relationships to diseases. Core CTD content (chemical-gene, chemical-disease and gene-disease interactions manually curated from the literature) are integrated with each other as well as with select external datasets to generate expanded networks and predict novel associations. Today, core CTD includes more than 30.5 million toxicogenomic connections relating chemicals/drugs, genes/proteins, diseases, taxa, Gene Ontology (GO) annotations, pathways, and gene interaction modules. In this update, we report a 33% increase in our core data content since 2015, describe our new exposure module (that harmonizes exposure science information with core toxicogenomic data) and introduce a novel dataset of GO-disease inferences (that identify common molecular underpinnings for seemingly unrelated pathologies). These advancements centralize and contextualize real-world chemical exposures with molecular pathways to help scientists generate testable hypotheses in an effort to understand the etiology and mechanisms underlying environmentally influenced diseases. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
The 24th annual Nucleic Acids Research database issue: a look back and upcoming changes
Rigden, Daniel J
2017-01-01
This year's Database Issue of Nucleic Acids Research contains 152 papers that include descriptions of 54 new databases and update papers on 98 databases, of which 16 have not been previously featured in NAR. As always, these databases cover a broad range of molecular biology subjects, including genome structure, gene expression and its regulation, proteins, protein domains, and protein–protein interactions. Following the recent trend, an increasing number of new and established databases deal with the issues of human health, from cancer-causing mutations to drugs and drug targets. In accordance with this trend, three recently compiled databases that have been selected by NAR reviewers and editors as ‘breakthrough’ contributions, denovo-db, the Monarch Initiative, and Open Targets, cover human de novo gene variants, disease-related phenotypes in model organisms, and a bioinformatics platform for therapeutic target identification and validation, respectively. We expect these databases to attract the attention of numerous researchers working in various areas of genetics and genomics. Looking back at the past 12 years, we present here the ‘golden set’ of databases that have consistently served as authoritative, comprehensive, and convenient data resources widely used by the entire community and offer some lessons on what makes a successful database. The Database Issue is freely available online at the https://academic.oup.com/nar web site. An updated version of the NAR Molecular Biology Database Collection is available at http://www.oxfordjournals.org/nar/database/a/. PMID:28053160
Wang, Yongcui; Chen, Shilong; Deng, Naiyang; Wang, Yong
2013-01-01
Computational inference of novel therapeutic values for existing drugs, i.e., drug repositioning, offers great prospects for faster and lower-risk drug development. Previous research has indicated that chemical structures, target proteins, and side-effects can provide rich information for assessing drug similarity and, further, disease similarity. However, each single data source is important in its own way, and data integration holds great promise for repositioning drugs more accurately. Here, we propose a new method for drug repositioning, PreDR (Predict Drug Repositioning), to integrate molecular structure, molecular activity, and phenotype data. Specifically, we characterize drugs by profiling them in chemical structure, target protein, and side-effect space, and define a kernel function to correlate drugs with diseases. Then we train a support vector machine (SVM) to computationally predict novel drug-disease interactions. PreDR is validated on a well-established drug-disease network with 1,933 interactions among 593 drugs and 313 diseases. By cross-validation, we find that chemical structure, drug target, and side-effect information are all predictive of drug-disease relationships. More experimentally observed drug-disease interactions can be revealed by integrating these three data sources. Comparison with existing methods demonstrates that PreDR is competitive in both accuracy and coverage. Follow-up database searches and pathway analysis indicate that our new predictions are worthy of further experimental validation. In particular, several novel predictions are supported by clinical trials databases, which shows the significant prospects of PreDR in future drug treatment. In conclusion, our new method, PreDR, can serve as a useful tool in drug discovery to efficiently identify novel drug-disease interactions. In addition, our heterogeneous data integration framework can be applied to other problems. PMID:24244318
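The abstract does not give the form of the PreDR kernel, so the Python sketch below only illustrates the general idea of integrating several drug profiles into one similarity score. It assumes each profile is a set of features, uses Jaccard similarity per data source, and averages the three; all of these choices, and the toy drug profiles, are assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two feature collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def integrated_similarity(drug_x, drug_y,
                          sources=("structure", "targets", "side_effects")):
    """Average the per-source Jaccard similarities of two drug profiles.

    drug_x, drug_y: dicts mapping each source name to a set of features.
    An unweighted mean is an assumption; weights could be tuned instead.
    """
    return sum(jaccard(drug_x[s], drug_y[s]) for s in sources) / len(sources)


# Toy profiles with made-up feature labels.
drug_a = {
    "structure": {"f1", "f2", "f3"},
    "targets": {"T1"},
    "side_effects": {"nausea", "headache"},
}
drug_b = {
    "structure": {"f2", "f3", "f4"},
    "targets": {"T1", "T2"},
    "side_effects": {"nausea"},
}
similarity = integrated_similarity(drug_a, drug_b)
```

A matrix of such pairwise scores could be supplied to an SVM implementation that accepts precomputed kernels, since an average of valid kernels is itself a valid kernel.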
BIOSPIDA: A Relational Database Translator for NCBI
Hagen, Matthew S.; Lee, Eva K.
2010-01-01
As the volume and availability of biological databases continue widespread growth, it has become increasingly difficult for research scientists to identify all relevant information for biological entities of interest. Details of nucleotide sequences, gene expression, molecular interactions, and three-dimensional structures are maintained across many different databases. To retrieve all necessary information requires an integrated system that can query multiple databases with minimized overhead. This paper introduces a universal parser and relational schema translator that can be utilized for all NCBI databases in Abstract Syntax Notation (ASN.1). The data models for OMIM, Entrez-Gene, Pubmed, MMDB and GenBank have been successfully converted into relational databases and all are easily linkable helping to answer complex biological questions. These tools facilitate research scientists to locally integrate databases from NCBI without significant workload or development time. PMID:21347013
BioPAX – A community standard for pathway data sharing
Demir, Emek; Cary, Michael P.; Paley, Suzanne; Fukuda, Ken; Lemer, Christian; Vastrik, Imre; Wu, Guanming; D’Eustachio, Peter; Schaefer, Carl; Luciano, Joanne; Schacherer, Frank; Martinez-Flores, Irma; Hu, Zhenjun; Jimenez-Jacinto, Veronica; Joshi-Tope, Geeta; Kandasamy, Kumaran; Lopez-Fuentes, Alejandra C.; Mi, Huaiyu; Pichler, Elgar; Rodchenkov, Igor; Splendiani, Andrea; Tkachev, Sasha; Zucker, Jeremy; Gopinath, Gopal; Rajasimha, Harsha; Ramakrishnan, Ranjani; Shah, Imran; Syed, Mustafa; Anwar, Nadia; Babur, Ozgun; Blinov, Michael; Brauner, Erik; Corwin, Dan; Donaldson, Sylva; Gibbons, Frank; Goldberg, Robert; Hornbeck, Peter; Luna, Augustin; Murray-Rust, Peter; Neumann, Eric; Reubenacker, Oliver; Samwald, Matthias; van Iersel, Martijn; Wimalaratne, Sarala; Allen, Keith; Braun, Burk; Whirl-Carrillo, Michelle; Dahlquist, Kam; Finney, Andrew; Gillespie, Marc; Glass, Elizabeth; Gong, Li; Haw, Robin; Honig, Michael; Hubaut, Olivier; Kane, David; Krupa, Shiva; Kutmon, Martina; Leonard, Julie; Marks, Debbie; Merberg, David; Petri, Victoria; Pico, Alex; Ravenscroft, Dean; Ren, Liya; Shah, Nigam; Sunshine, Margot; Tang, Rebecca; Whaley, Ryan; Letovksy, Stan; Buetow, Kenneth H.; Rzhetsky, Andrey; Schachter, Vincent; Sobral, Bruno S.; Dogrusoz, Ugur; McWeeney, Shannon; Aladjem, Mirit; Birney, Ewan; Collado-Vides, Julio; Goto, Susumu; Hucka, Michael; Le Novère, Nicolas; Maltsev, Natalia; Pandey, Akhilesh; Thomas, Paul; Wingender, Edgar; Karp, Peter D.; Sander, Chris; Bader, Gary D.
2010-01-01
BioPAX (Biological Pathway Exchange) is a standard language to represent biological pathways at the molecular and cellular level. Its major use is to facilitate the exchange of pathway data (http://www.biopax.org). Pathway data captures our understanding of biological processes, but its rapid growth necessitates development of databases and computational tools to aid interpretation. However, the current fragmentation of pathway information across many databases with incompatible formats presents barriers to its effective use. BioPAX solves this problem by making pathway data substantially easier to collect, index, interpret and share. BioPAX can represent metabolic and signaling pathways, molecular and genetic interactions and gene regulation networks. BioPAX was created through a community process. Through BioPAX, millions of interactions organized into thousands of pathways across many organisms, from a growing number of sources, are available. Thus, large amounts of pathway data are available in a computable form to support visualization, analysis and biological discovery. PMID:20829833
Wan, Minghui; Liao, Dongjiang; Peng, Guilin; Xu, Xin; Yin, Weiqiang; Guo, Guixin; Jiang, Funeng; Zhong, Weide
2017-01-01
Chloride intracellular channel 1 (CLIC1) is involved in the development of most aggressive human tumors, including gastric, colon, lung, liver, and glioblastoma cancers. It has become an attractive new therapeutic target for several types of cancer. In this work, we aim to identify natural products as potent CLIC1 inhibitors from Traditional Chinese Medicine (TCM) database using structure-based virtual screening and molecular dynamics (MD) simulation. First, structure-based docking was employed to screen the refined TCM database and the top 500 TCM compounds were obtained and reranked by X-Score. Then, 30 potent hits were achieved from the top 500 TCM compounds using cluster and ligand-protein interaction analysis. Finally, MD simulation was employed to validate the stability of interactions between each hit and CLIC1 protein from docking simulation, and Molecular Mechanics/Generalized Born Surface Area (MM-GBSA) analysis was used to refine the virtual hits. Six TCM compounds with top MM-GBSA scores and ideal-binding models were confirmed as the final hits. Our study provides information about the interaction between TCM compounds and CLIC1 protein, which may be helpful for further experimental investigations. In addition, the top 6 natural products structural scaffolds could serve as building blocks in designing drug-like molecules for CLIC1 inhibition. PMID:29147652
NASA Astrophysics Data System (ADS)
Sakkiah, Sugunadevi; Thangapandian, Sundarapandian; John, Shalini; Lee, Keun Woo
2011-01-01
This study was performed to identify the selective chemical features of Aurora kinase-B inhibitors using methods such as Hip-Hop, virtual screening, homology modeling, molecular dynamics and docking. The best hypothesis, Hypo1, was validated against a wide-ranging test set containing selective inhibitors of Aurora kinase-B. Homology modeling and molecular dynamics studies were carried out in preparation for the molecular docking studies. Hypo1 was used as a 3D query to screen the chemical databases. The screened molecules were sorted based on ADME and drug-like properties. The selective hit compounds were docked, and their hydrogen bond interactions with the critical amino acids of Aurora kinase-B were compared with the chemical features present in Hypo1. Finally, we suggest that the chemical features present in Hypo1 are vital for a molecule to inhibit Aurora kinase-B activity.
SpectraPlot.com: Integrated spectroscopic modeling of atomic and molecular gases
NASA Astrophysics Data System (ADS)
Goldenstein, Christopher S.; Miller, Victor A.; Mitchell Spearrin, R.; Strand, Christopher L.
2017-10-01
SpectraPlot is a web-based application for simulating spectra of atomic and molecular gases. At the time this manuscript was written, SpectraPlot consisted of four primary tools for calculating: (1) atomic and molecular absorption spectra, (2) atomic and molecular emission spectra, (3) transition linestrengths, and (4) blackbody emission spectra. These tools currently employ the NIST ASD, HITRAN2012, and HITEMP2010 databases to perform line-by-line simulations of spectra. SpectraPlot employs a modular, integrated architecture, enabling multiple simulations across multiple databases and/or thermodynamic conditions to be visualized in an interactive plot window. The primary objective of this paper is to describe the architecture and spectroscopic models employed by SpectraPlot in order to provide its users with the knowledge required to understand the capabilities and limitations of simulations performed using SpectraPlot. Further, this manuscript discusses the accuracy of several underlying approximations used to decrease computational time, in particular, the use of far-wing cutoff criteria.
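The blackbody emission tool mentioned in the abstract above rests on Planck's law. The following is a minimal Python sketch of that formula only; it is not SpectraPlot's implementation, and the function names, the grid-search helper and its default wavelength range are assumptions made for illustration.

```python
import math

H = 6.62607015e-34   # Planck constant, J s (exact SI value)
C = 2.99792458e8     # speed of light in vacuum, m/s (exact SI value)
KB = 1.380649e-23    # Boltzmann constant, J/K (exact SI value)


def planck_radiance(wavelength_m, temperature_k):
    """Blackbody spectral radiance B_lambda(T) in W sr^-1 m^-3."""
    x = H * C / (wavelength_m * KB * temperature_k)
    # expm1(x) = exp(x) - 1, which stays accurate when x is small.
    return (2.0 * H * C ** 2 / wavelength_m ** 5) / math.expm1(x)


def peak_wavelength(temperature_k, lo=1e-7, hi=1e-4, n=20000):
    """Hypothetical helper: locate the emission peak on a log-spaced grid.

    The default range (0.1 to 100 micrometres) is an assumption that suits
    temperatures of roughly a few hundred to a few thousand kelvin.
    """
    step = (math.log(hi) - math.log(lo)) / (n - 1)
    grid = [math.exp(math.log(lo) + i * step) for i in range(n)]
    return max(grid, key=lambda w: planck_radiance(w, temperature_k))


if __name__ == "__main__":
    for t in (1000.0, 3000.0):
        print(t, peak_wavelength(t))
```

A quick sanity check on such a sketch is Wien's displacement law: the peak wavelength multiplied by the temperature should stay close to 2.898e-3 m K.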
The systematic annotation of the three main GPCR families in Reactome.
Jassal, Bijay; Jupe, Steven; Caudy, Michael; Birney, Ewan; Stein, Lincoln; Hermjakob, Henning; D'Eustachio, Peter
2010-07-29
Reactome is an open-source, freely available database of human biological pathways and processes. A major goal of our work is to provide an integrated view of cellular signalling processes that spans from ligand-receptor interactions to molecular readouts at the level of metabolic and transcriptional events. To this end, we have built the first catalogue of all human G protein-coupled receptors (GPCRs) known to bind endogenous or natural ligands. The UniProt database has records for 797 proteins classified as GPCRs and sorted into families A/1, B/2 and C/3 on the basis of amino acid sequence. To these records we have added details from the IUPHAR database and our own manual curation of relevant literature to create reactions in which 563 GPCRs bind ligands and also interact with specific G-proteins to initiate signalling cascades. We believe the remaining 234 GPCRs are true orphans. The Reactome GPCR pathway can be viewed as a detailed interactive diagram and can be exported in many forms. It provides a template for the orthology-based inference of GPCR reactions for diverse model organism species, and can be overlaid with protein-protein interaction and gene expression datasets to facilitate overrepresentation studies and other forms of pathway analysis. Database URL: http://www.reactome.org.
EcoCyc: a comprehensive database resource for Escherichia coli
Keseler, Ingrid M.; Collado-Vides, Julio; Gama-Castro, Socorro; Ingraham, John; Paley, Suzanne; Paulsen, Ian T.; Peralta-Gil, Martín; Karp, Peter D.
2005-01-01
The EcoCyc database (http://EcoCyc.org/) is a comprehensive source of information on the biology of the prototypical model organism Escherichia coli K12. The mission for EcoCyc is to contain both computable descriptions of, and detailed comments describing, all genes, proteins, pathways and molecular interactions in E. coli. Through ongoing manual curation, extensive information such as summary comments, regulatory information, literature citations and evidence types has been extracted from 8862 publications and added to Version 8.5 of the EcoCyc database. The EcoCyc database can be accessed through a World Wide Web interface, while the downloadable Pathway Tools software and data files enable computational exploration of the data and provide enhanced querying capabilities that web interfaces cannot support. For example, EcoCyc contains carefully curated information that can be used as training sets for bioinformatics prediction of entities such as promoters, operons, genetic networks, transcription factor binding sites, metabolic pathways, functionally related genes, protein complexes and protein–ligand interactions. PMID:15608210
WormQTLHD—a web database for linking human disease to natural variation data in C. elegans
van der Velde, K. Joeri; de Haan, Mark; Zych, Konrad; Arends, Danny; Snoek, L. Basten; Kammenga, Jan E.; Jansen, Ritsert C.; Swertz, Morris A.; Li, Yang
2014-01-01
Interactions between proteins are highly conserved across species. As a result, the molecular basis of multiple diseases affecting humans can be studied in model organisms that offer many alternative experimental opportunities. One such organism—Caenorhabditis elegans—has been used to produce much molecular quantitative genetics and systems biology data over the past decade. We present WormQTLHD (Human Disease), a database that quantitatively and systematically links expression Quantitative Trait Loci (eQTL) findings in C. elegans to gene–disease associations in man. WormQTLHD, available online at http://www.wormqtl-hd.org, is a user-friendly set of tools to reveal functionally coherent, evolutionary conserved gene networks. These can be used to predict novel gene-to-gene associations and the functions of genes underlying the disease of interest. We created a new database that links C. elegans eQTL data sets to human diseases (34 337 gene–disease associations from OMIM, DGA, GWAS Central and NHGRI GWAS Catalogue) based on overlapping sets of orthologous genes associated to phenotypes in these two species. We utilized QTL results, high-throughput molecular phenotypes, classical phenotypes and genotype data covering different developmental stages and environments from WormQTL database. All software is available as open source, built on MOLGENIS and xQTL workbench. PMID:24217915
The Comparative Toxicogenomics Database (CTD): A Resource for Comparative Toxicological Studies
Mattingly, CJ; Rosenstein, MC; Colby, GT; Forrest, JN; Boyer, JL
2006-01-01
The etiology of most chronic diseases involves interactions between environmental factors and genes that modulate important biological processes (Olden and Wilson, 2000). We are developing the publicly available Comparative Toxicogenomics Database (CTD) to promote understanding about the effects of environmental chemicals on human health. CTD identifies interactions between chemicals and genes and facilitates cross-species comparative studies of these genes. The use of diverse animal models and cross-species comparative sequence studies has been critical for understanding basic physiological mechanisms and gene and protein functions. Similarly, these approaches will be valuable for exploring the molecular mechanisms of action of environmental chemicals and the genetic basis of differential susceptibility. PMID:16902965
ERIC Educational Resources Information Center
Moore, John W., Ed.
1988-01-01
Describes five computer software packages; four for MS-DOS Systems and one for Apple II. Included are SPEC20, an interactive simulation of a Bausch and Lomb Spectronic-20; a database for laboratory chemicals and programs for visualizing Boltzmann-like distributions, orbital plot for the hydrogen atom and molecular orbital theory. (CW)
Integrated web visualizations for protein-protein interaction databases.
Jeanquartier, Fleur; Jean-Quartier, Claire; Holzinger, Andreas
2015-06-16
Understanding living systems is crucial for curing diseases. To achieve this task we have to understand biological networks based on protein-protein interactions. Bioinformatics has come up with a great number of databases and tools that support analysts in exploring protein-protein interactions on an integrated level for knowledge discovery. They provide predictions and correlations, indicate possibilities for future experimental research and fill the gaps to complete the picture of biochemical processes. There are numerous and huge databases of protein-protein interactions used to gain insights into answering some of the many questions of systems biology. Many computational resources integrate interaction data with additional information on molecular background. However, the vast number of diverse bioinformatics resources poses an obstacle to the goal of understanding. We present a survey of databases that enable the visual analysis of protein networks. We selected M=10 out of N=53 resources supporting visualization, and we tested them against the following set of criteria: interoperability, data integration, quantity of possible interactions, data visualization quality and data coverage. The study reveals differences in usability, visualization features and quality as well as the quantity of interactions. StringDB is the recommended first choice. CPDB presents a comprehensive dataset and IntAct lets the user change the network layout. A comprehensive comparison table is available via the web. The supplementary table can be accessed at http://tinyurl.com/PPI-DB-Comparison-2015. Only some web resources featuring graph visualization can be successfully applied to interactive visual analysis of protein-protein interactions. Study results underline the necessity for further enhancements of visualization integration in biochemical analysis tools. The identified challenges are data comprehensiveness, confidence, interactive features and visualization maturity.
A review of drug-induced liver injury databases.
Luo, Guangwen; Shen, Yiting; Yang, Lizhu; Lu, Aiping; Xiang, Zheng
2017-09-01
Drug-induced liver injuries have been a major focus of current research in drug development, and are also one of the major reasons for the failure and withdrawal of drugs in development. Drug-induced liver injuries have been systematically recorded in many public databases, which have become valuable resources in this field. In this study, we provide an overview of these databases, including the liver injury-specific databases LiverTox, LTKB, Open TG-GATEs, LTMap and Hepatox, and the general databases, T3DB, DrugBank, DITOP, DART, CTD and HSDB. The features and limitations of these databases are summarized and discussed in detail. Apart from their powerful functions, we believe that these databases can be improved in several ways: by providing the data about the molecular targets involved in liver toxicity, by incorporating information regarding liver injuries caused by drug interactions, and by regularly updating the data.
Beaver, John E; Bourne, Philip E; Ponomarenko, Julia V
2007-02-21
Structural information about epitopes, particularly the three-dimensional (3D) structures of antigens in complex with immune receptors, presents a valuable source of data for immunology. This information is available in the Protein Data Bank (PDB) and provided in curated form by the Immune Epitope Database and Analysis Resource (IEDB). With continued growth in these data and the importance of understanding molecular-level interactions of immunological interest, there is a need for new specialized molecular visualization and analysis tools. The EpitopeViewer is a platform-independent Java application for the visualization of the three-dimensional structure and sequence of epitopes and analyses of their interactions with antigen-specific receptors of the immune system (antibodies, T cell receptors and MHC molecules). The viewer renders both 3D views and two-dimensional plots of intermolecular interactions between the antigen and receptor(s) by reading curated data from the IEDB and/or calculating them on-the-fly from atom coordinates from the PDB. The 3D views and associated interactions can be saved for future use and publication. The EpitopeViewer can be accessed from the IEDB Web site http://www.immuneepitope.org through the quick link 'Browse Records by 3D Structure.' The EpitopeViewer is designed and has been tested for use by immunologists with little or no training in molecular graphics. The EpitopeViewer can be launched from most popular Web browsers without user intervention. A Java Runtime Environment (JRE) 1.4.2 or higher is required.
Capturing cooperative interactions with the PSI-MI format
Van Roey, Kim; Orchard, Sandra; Kerrien, Samuel; Dumousseau, Marine; Ricard-Blum, Sylvie; Hermjakob, Henning; Gibson, Toby J.
2013-01-01
The complex biological processes that control cellular function are mediated by intricate networks of molecular interactions. Accumulating evidence indicates that these interactions are often interdependent, thus acting cooperatively. Cooperative interactions are prevalent in and indispensable for reliable and robust control of cell regulation, as they underlie the conditional decision-making capability of large regulatory complexes. Despite an increased focus on experimental elucidation of the molecular details of cooperative binding events, as evidenced by their growing occurrence in the literature, they are currently lacking from the main bioinformatics resources. One of the contributing factors to this deficiency is the lack of a computer-readable standard representation and exchange format for cooperative interaction data. To tackle this shortcoming, we added functionality to the widely used PSI-MI interchange format for molecular interaction data by defining new controlled vocabulary terms that allow annotation of different aspects of cooperativity without making structural changes to the underlying XML schema. As a result, we are able to capture cooperative interaction data in a structured format that is backward compatible with PSI-MI–based data and applications. This will facilitate the storage, exchange and analysis of cooperative interaction data, which in turn will advance experimental research on this fundamental principle in biology. Database URL: http://psi-mi-cooperativeinteractions.embl.de/ PMID:24067240
Kim, Mara; Cooper, Brian A.; Venkat, Rohit; Phillips, Julie B.; Eidem, Haley R.; Hirbo, Jibril; Nutakki, Sashank; Williams, Scott M.; Muglia, Louis J.; Capra, J. Anthony; Petren, Kenneth; Abbot, Patrick; Rokas, Antonis; McGary, Kriston L.
2016-01-01
Mammalian gestation and pregnancy are fast evolving processes that involve the interaction of the fetal, maternal and paternal genomes. Version 1.0 of the GEneSTATION database (http://genestation.org) integrates diverse types of omics data across mammals to advance understanding of the genetic basis of gestation and pregnancy-associated phenotypes and to accelerate the translation of discoveries from model organisms to humans. GEneSTATION is built using tools from the Generic Model Organism Database project, including the biology-aware database CHADO, new tools for rapid data integration, and algorithms that streamline synthesis and user access. GEneSTATION contains curated life history information on pregnancy and reproduction from 23 high-quality mammalian genomes. For every human gene, GEneSTATION contains diverse evolutionary (e.g. gene age, population genetic and molecular evolutionary statistics), organismal (e.g. tissue-specific gene and protein expression, differential gene expression, disease phenotype), and molecular data types (e.g. Gene Ontology Annotation, protein interactions), as well as links to many general (e.g. Entrez, PubMed) and pregnancy disease-specific (e.g. PTBgene, dbPTB) databases. By facilitating the synthesis of diverse functional and evolutionary data in pregnancy-associated tissues and phenotypes and enabling their quick, intuitive, accurate and customized meta-analysis, GEneSTATION provides a novel platform for comprehensive investigation of the function and evolution of mammalian pregnancy. PMID:26567549
Analysis of molecular pathways in pancreatic ductal adenocarcinomas with a bioinformatics approach.
Wang, Yan; Li, Yan
2015-01-01
Pancreatic ductal adenocarcinoma (PDAC) is a leading cause of cancer death worldwide. Our study aimed to reveal molecular mechanisms. Microarray data of GSE15471 (including 39 matching pairs of pancreatic tumor tissues and patient-matched normal tissues) was downloaded from Gene Expression Omnibus (GEO) database. We identified differentially expressed genes (DEGs) in PDAC tissues compared with normal tissues by limma package in R language. Then GO and KEGG pathway enrichment analyses were conducted with online DAVID. In addition, principal component analysis was performed and a protein-protein interaction network was constructed to study relationships between the DEGs through database STRING. A total of 532 DEGs were identified in the 38 PDAC tissues compared with 33 normal tissues. The results of principal component analysis of the top 20 DEGs could differentiate the PDAC tissues from normal tissues directly. In the PPI network, 8 of the 20 DEGs were all key genes of the collagen family. Additionally, FN1 (fibronectin 1) was also a hub node in the network. The genes of the collagen family as well as FN1 were significantly enriched in complement and coagulation cascades, ECM-receptor interaction and focal adhesion pathways. Our results suggest that genes of collagen family and FN1 may play an important role in PDAC progression. Meanwhile, these DEGs and enriched pathways, such as complement and coagulation cascades, ECM-receptor interaction and focal adhesion may be important molecular mechanisms involved in the development and progression of PDAC.
Scientific Use Cases for the Virtual Atomic and Molecular Data Center
NASA Astrophysics Data System (ADS)
Dubernet, M. L.; Aboudarham, J.; Ba, Y. A.; Boiziot, M.; Bottinelli, S.; Caux, E.; Endres, C.; Glorian, J. M.; Henry, F.; Lamy, L.; Le Sidaner, P.; Møller, T.; Moreau, N.; Rénié, C.; Roueff, E.; Schilke, P.; Vastel, C.; Zwoelf, C. M.
2014-12-01
VAMDC Consortium is a worldwide consortium which federates interoperable Atomic and Molecular databases through an e-science infrastructure. The contained data are of the highest scientific quality and are crucial for many applications: astrophysics, atmospheric physics, fusion, plasma and lighting technologies, health, etc. In this paper we present astrophysical scientific use cases in relation to the use of the VAMDC e-infrastructure. Those will cover very different applications such as: (i) modeling the spectra of interstellar objects using the myXCLASS software tool implemented in the Common Astronomy Software Applications package (CASA) or using the CASSIS software tool, in its stand-alone version or implemented in the Herschel Interactive Processing Environment (HIPE); (ii) the use of Virtual Observatory tools accessing VAMDC databases; (iii) the access of VAMDC from the Paris solar BASS2000 portal; (iv) the combination of tools and database from the APIS service (Auroral Planetary Imaging and Spectroscopy); (v) combination of heterogeneous data for the application to the interstellar medium from the SPECTCOL tool.
A prototype molecular interactive collaborative environment (MICE).
Bourne, P; Gribskov, M; Johnson, G; Moreland, J; Wavra, S; Weissig, H
1998-01-01
Illustrations of macromolecular structure in the scientific literature contain a high level of semantic content through which the authors convey, among other features, the biological function of that macromolecule. We refer to these illustrations as molecular scenes. Such scenes, if available electronically, are not readily accessible for further interactive interrogation. The basic PDB format does not retain features of the scene; formats like PostScript retain the scene but are not interactive; and the many formats used by individual graphics programs, while capable of reproducing the scene, are neither interchangeable nor can they be stored in a database and queried for features of the scene. MICE defines a Molecular Scene Description Language (MSDL) which allows scenes to be stored in a relational database (a molecular scene gallery) and queried. Scenes retrieved from the gallery are rendered in Virtual Reality Modeling Language (VRML) and currently displayed in WebView, a VRML browser modified to support the Virtual Reality Behavior System (VRBS) protocol. VRBS provides communication between multiple client browsers, each capable of manipulating the scene. This level of collaboration works well over standard Internet connections and holds promise for collaborative research at a distance and distance learning. Further, via VRBS, the VRML world can be used as a visual cue to trigger an application such as a remote MEME search. MICE is very much work in progress. Current work seeks to replace WebView with Netscape, Cosmoplayer, a standard VRML plug-in, and a Java-based console. The console consists of a generic kernel suitable for multiple collaborative applications and additional application-specific controls. Further details of the MICE project are available at http://mice.sdsc.edu.
Investigating Evolutionary Questions Using Online Molecular Databases.
ERIC Educational Resources Information Center
Puterbaugh, Mary N.; Burleigh, J. Gordon
2001-01-01
Recommends using online molecular databases as teaching tools to illustrate evolutionary questions and concepts while introducing students to public molecular databases. Provides activities in which students make molecular comparisons between species. (YDS)
Villoutreix, Bruno O; Kuenemann, Melaine A; Poyet, Jean-Luc; Bruzzoni-Giovanelli, Heriberto; Labbé, Céline; Lagorce, David; Sperandio, Olivier; Miteva, Maria A
2014-01-01
Fundamental processes in living cells are largely controlled by macromolecular interactions. Among them, protein–protein interactions (PPIs) have a critical role, and their dysregulation can contribute to the pathogenesis of numerous diseases. Although PPIs were already considered attractive pharmaceutical targets some years ago, they have thus far been largely unexploited for therapeutic interventions with low molecular weight compounds. Several limiting factors, from technological hurdles to conceptual barriers, are known, which, taken together, explain why research in this area has been relatively slow. However, over the last decade, the scientific community has challenged the dogma and become more enthusiastic about the modulation of PPIs with small drug-like molecules. In fact, several success stories have been reported at both the preclinical and clinical stages. In this review article, written for the 2014 International Summer School in Chemoinformatics (Strasbourg, France), we discuss in silico tools (essentially post 2012) and databases that can assist the design of low molecular weight PPI modulators (these tools can be found at www.vls3d.com). We first introduce the field of protein–protein interaction research, discuss key challenges and comment on recently reported in silico packages, protocols and databases dedicated to PPIs. Then, we illustrate how in silico methods can be used and combined with experimental work to identify PPI modulators. PMID:25254076
Turbulent Mixing Chemistry in Disks
NASA Astrophysics Data System (ADS)
Semenov, D.; Wiebe, D.
2006-11-01
A gas-grain chemical model with surface reactions and 1D/2D turbulent mixing is available for protoplanetary disks and molecular clouds. The current version is based on the updated UMIST'95 database with gas-grain interactions (accretion, desorption, photoevaporation, etc.) and a modified rate equation approach to surface chemistry (see also the abstract for the static chemistry code).
β-secretase inhibitors for Alzheimer's disease: identification using pharmacoinformatics.
Islam, Md Ataul; Pillay, Tahir S
2018-02-01
In this study we searched for potential β-site amyloid precursor protein cleaving enzyme 1 (BACE1) inhibitors using pharmacoinformatics. A large dataset containing 7155 known BACE1 inhibitors was evaluated for pharmacophore model generation. The final model (R = 0.950, RMSD = 1.094, Q2 = 0.901, se = 0.332, [Formula: see text] = 0.901, [Formula: see text] = 0.756, sp = 0.468, [Formula: see text] = 0.667) revealed the importance of the spatial arrangement of hydrogen bond acceptor and donor, hydrophobicity and aromatic ring features. The validated model was then used to search the NCI and InterBioscreen databases for promising BACE1 inhibitors. The initial hits from both databases were sorted using a number of criteria, and finally three molecules from each database were considered for further validation using molecular docking and molecular dynamics studies. Different protonation states of the Asp32 and Asp228 dyad were analysed, and the best protonated form was used for the molecular docking study. The number of binding interactions observed in the molecular docking study supported the potential of these molecules as promising inhibitors. The RMSD, RMSF and Rg values from the molecular dynamics study, together with the binding energies, showed that the final screened molecules formed stable complexes inside the receptor cavity of BACE1. Hence, it can be concluded that the final six screened compounds may be potential therapeutic agents for Alzheimer's disease.
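The abstract above judges complex stability from RMSD and Rg values in a molecular dynamics trajectory. As a minimal sketch of what those two quantities measure, the Python functions below compute them from plain coordinate lists. This is not the authors' workflow: the function names are hypothetical, and the RMSD here assumes the two frames have already been superimposed, a step that real trajectory analysis tools usually perform first.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two frames of (x, y, z) tuples.

    Assumption: both frames list the same atoms in the same order and are
    already aligned; no superposition is carried out here.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("frames must be non-empty and of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration (Rg) of one frame.

    With masses=None every atom is given unit mass.
    """
    if not coords:
        raise ValueError("frame must be non-empty")
    if masses is None:
        masses = [1.0] * len(coords)
    total_mass = sum(masses)
    # Centre of mass, one component per axis.
    center = [
        sum(m * c[axis] for m, c in zip(masses, coords)) / total_mass
        for axis in range(3)
    ]
    weighted_sq = sum(
        m * sum((c[axis] - center[axis]) ** 2 for axis in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total_mass)


if __name__ == "__main__":
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    shifted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(rmsd(reference, shifted))
    print(radius_of_gyration([(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]))
```

A low, flat RMSD over time and a steady Rg are the usual signs of a stable, compact complex, which is the kind of evidence the abstract refers to.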
Xu, Huilei; Baroukh, Caroline; Dannenfelser, Ruth; Chen, Edward Y; Tan, Christopher M; Kou, Yan; Kim, Yujin E; Lemischka, Ihor R; Ma'ayan, Avi
2013-01-01
High content studies that profile mouse and human embryonic stem cells (m/hESCs) using various genome-wide technologies such as transcriptomics and proteomics are constantly being published. However, efforts to integrate such data to obtain a global view of the molecular circuitry in m/hESCs are lagging behind. Here, we present an m/hESC-centered database called Embryonic Stem Cell Atlas from Pluripotency Evidence integrating data from many recent diverse high-throughput studies including chromatin immunoprecipitation followed by deep sequencing, genome-wide inhibitory RNA screens, gene expression microarrays or RNA-seq after knockdown (KD) or overexpression of critical factors, immunoprecipitation followed by mass spectrometry proteomics and phosphoproteomics. The database provides web-based interactive search and visualization tools that can be used to build subnetworks and to identify known and novel regulatory interactions across various regulatory layers. The web-interface also includes tools to predict the effects of combinatorial KDs by additive effects controlled by sliders, or through simulation software implemented in MATLAB. Overall, the Embryonic Stem Cell Atlas from Pluripotency Evidence database is a comprehensive resource for the stem cell systems biology community. Database URL: http://www.maayanlab.net/ESCAPE
DASMI: exchanging, annotating and assessing molecular interaction data.
Blankenburg, Hagen; Finn, Robert D; Prlić, Andreas; Jenkinson, Andrew M; Ramírez, Fidel; Emig, Dorothea; Schelhorn, Sven-Eric; Büch, Joachim; Lengauer, Thomas; Albrecht, Mario
2009-05-15
Ever increasing amounts of biological interaction data are being accumulated worldwide, but they are currently not readily accessible to the biologist at a single site. New techniques are required for retrieving, sharing and presenting data spread over the Internet. We introduce the DASMI system for the dynamic exchange, annotation and assessment of molecular interaction data. DASMI is based on the widely used Distributed Annotation System (DAS) and consists of a data exchange specification, web servers for providing the interaction data and clients for data integration and visualization. The decentralized architecture of DASMI affords the online retrieval of the most recent data from distributed sources and databases. DASMI can also be extended easily by adding new data sources and clients. We describe all DASMI components and demonstrate their use for protein and domain interactions. The DASMI tools are available at http://www.dasmi.de/ and http://ipfam.sanger.ac.uk/graph. The DAS registry and the DAS 1.53E specification are found at http://www.dasregistry.org/.
Radiotherapy and "new" drugs-new side effects?
2011-01-01
Background and purpose: Targeted drugs have augmented the cancer treatment armamentarium. Based on their molecular specificity, it was initially believed that these drugs had significantly fewer side effects. However, it is now accepted that all of these agents have their specific side effects. Given the multimodal approach, special emphasis has to be placed on putative interactions of conventional cytostatic drugs, targeted agents and other modalities. The interaction of targeted drugs with radiation harbours special risks, since awareness of interactions and even synergistic toxicities is lacking. At present, only limited data are available regarding combinations of targeted drugs and radiotherapy. This review gives an overview of the current knowledge on such combined treatments. Materials and methods: The PubMed database was searched using the following MeSH headings and combinations of these terms: Radiotherapy AND cetuximab/trastuzumab/panitumumab/nimotuzumab, bevacizumab, sunitinib/sorafenib/lapatinib/gefitinib/erlotinib/sirolimus, thalidomide/lenalidomide as well as erythropoietin. For citation crosscheck, the ISI Web of Science database was used with the same search terms. Results: Several classes of targeted substances may be distinguished: small molecules including kinase inhibitors and specific inhibitors, antibodies, and anti-angiogenic agents. Combination of these agents with radiotherapy may lead to specific toxicities or negatively influence the efficacy of RT. Though there is only little information on the interaction of molecular targeted radiation and radiotherapy in clinical settings, several critical incidents are reported. Conclusions: The addition of molecular targeted drugs to conventional radiotherapy outside of approved regimens or clinical trials warrants careful consideration, especially when used in conjunction with hypo-fractionated regimens. Clinical trials are urgently needed in order to address the open questions regarding efficacy and early and late toxicity. PMID:22188921
A gene network bioinformatics analysis for pemphigoid autoimmune blistering diseases.
Barone, Antonio; Toti, Paolo; Giuca, Maria Rita; Derchi, Giacomo; Covani, Ugo
2015-07-01
In this theoretical study, a text mining search and clustering analysis of data related to genes potentially involved in human pemphigoid autoimmune blistering diseases (PAIBD) was performed using web tools to create a gene/protein interaction network. The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database was employed to identify a final set of PAIBD-involved genes and to calculate the overall significant interactions among genes: for each gene, the weighted number of links, or WNL, was registered and a clustering procedure was performed using the WNL analysis. Genes were ranked in class (leader, B, C, D and so on, up to orphans). An ontological analysis was performed for the set of 'leader' genes. Using the above-mentioned data network, 115 genes represented the final set; leader genes numbered 7 (intercellular adhesion molecule 1 (ICAM-1), interferon gamma (IFNG), interleukin (IL)-2, IL-4, IL-6, IL-8 and tumour necrosis factor (TNF)), class B genes were 13, whereas the orphans were 24. The ontological analysis attested that the molecular action was focused on extracellular space and cell surface, whereas the activation and regulation of the immunity system was widely involved. Despite the limited knowledge of the present pathologic phenomenon, attested by the presence of 24 genes revealing no protein-protein direct or indirect interactions, the network showed significant pathways gathered in several subgroups: cellular components, molecular functions, biological processes and the pathologic phenomenon obtained from the Kyoto Encyclopaedia of Genes and Genomes (KEGG) database. The molecular basis for PAIBD was summarised and expanded, which will perhaps give researchers promising directions for the identification of new therapeutic targets.
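The ranking step described in this abstract can be illustrated with a minimal Python sketch. It assumes that a gene's WNL is the sum of the confidence scores of its links, which is one reading of "weighted number of links" and is not stated in the abstract. The gene names and scores below are made up for illustration and are not STRING data; the class assignment (leader, B, C, ...) is not reproduced.

```python
def weighted_number_of_links(genes, scored_links):
    """Sum the link scores touching each gene (assumed definition of WNL)."""
    wnl = {gene: 0 for gene in genes}
    for gene_a, gene_b, score in scored_links:
        wnl[gene_a] += score
        wnl[gene_b] += score
    return wnl


# Hypothetical genes and integer link scores (illustrative values only).
genes = ["g1", "g2", "g3", "g4"]
links = [("g1", "g2", 900), ("g1", "g3", 800), ("g2", "g3", 700)]

wnl = weighted_number_of_links(genes, links)

# Rank genes from most to least connected; genes with no links are "orphans".
ranked = sorted(genes, key=lambda gene: wnl[gene], reverse=True)
orphans = [gene for gene in genes if wnl[gene] == 0]

print(ranked)   # ['g1', 'g2', 'g3', 'g4']
print(orphans)  # ['g4']
```

A clustering step over the WNL values would then split the ranked genes into classes; the abstract does not say which clustering procedure was used.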
Zhang, Zhi-Guo; Song, Chang-Heng; Zhang, Fang-Zhen; Chen, Yan-Jing; Xiang, Li-Hua; Xiao, Gary Guishan; Ju, Da-Hong
2016-06-01
Rhizoma Dioscoreae extract (RDE) exhibits a protective effect on alveolar bone loss in ovariectomized (OVX) rats. The aim of this study was to predict the pathways or targets that are regulated by RDE, by re‑assessing our previously reported data and conducting a protein‑protein interaction (PPI) network analysis. In total, 383 differentially expressed genes (≥3‑fold) between alveolar bone samples from the RDE and OVX group rats were identified, and a PPI network was constructed based on these genes. Furthermore, four molecular clusters (A‑D) in the PPI network with the smallest P‑values were detected by the molecular complex detection (MCODE) algorithm. Using Database for Annotation, Visualization and Integrated Discovery (DAVID) and Ingenuity Pathway Analysis (IPA) tools, two molecular clusters (A and B) were enriched for biological process in Gene Ontology (GO). Only cluster A was associated with biological pathways in the IPA database. GO and pathway analysis results showed that cluster A, associated with cell cycle regulation, was the most important molecular cluster in the PPI network. In addition, cyclin‑dependent kinase 1 (CDK1) may be a key molecule achieving the cell‑cycle‑regulatory function of cluster A. From the PPI network analysis, it was predicted that delayed cell cycle progression in excessive alveolar bone remodeling via downregulation of CDK1 may be another mechanism underlying the anti‑osteopenic effect of RDE on alveolar bone.
PROFESS: a PROtein Function, Evolution, Structure and Sequence database
Triplet, Thomas; Shortridge, Matthew D.; Griep, Mark A.; Stark, Jaime L.; Powers, Robert; Revesz, Peter
2010-01-01
The proliferation of biological databases and the easy access enabled by the Internet is having a beneficial impact on biological sciences and transforming the way research is conducted. There are ∼1100 molecular biology databases dispersed throughout the Internet. To assist in the functional, structural and evolutionary analysis of the abundant number of novel proteins continually identified from whole-genome sequencing, we introduce the PROFESS (PROtein Function, Evolution, Structure and Sequence) database. Our database is designed to be versatile and expandable and will not confine analysis to a pre-existing set of data relationships. A fundamental component of this approach is the development of an intuitive query system that incorporates a variety of similarity functions capable of generating data relationships not conceived during the creation of the database. The utility of PROFESS is demonstrated by the analysis of the structural drift of homologous proteins and the identification of potential pancreatic cancer therapeutic targets based on the observation of protein–protein interaction networks. Database URL: http://cse.unl.edu/∼profess/ PMID:20624718
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhu, Yitan; Xu, Yanxun; Helseth, Donald L.
Background: Genetic interactions play a critical role in cancer development. Existing knowledge about cancer genetic interactions is incomplete, especially lacking evidence derived from large-scale cancer genomics data. The Cancer Genome Atlas (TCGA) produces multimodal measurements across genomics and features of thousands of tumors, which provide an unprecedented opportunity to investigate the interplays of genes in cancer. Methods: We introduce Zodiac, a computational tool and resource to integrate existing knowledge about cancer genetic interactions with new information contained in TCGA data. It is an evolution of existing knowledge by treating it as a prior graph, integrating it with a likelihood model derived by Bayesian graphical model based on TCGA data, and producing a posterior graph as updated and data-enhanced knowledge. In short, Zodiac realizes “Prior interaction map + TCGA data → Posterior interaction map.” Results: Zodiac provides molecular interactions for about 200 million pairs of genes. All the results are generated from a big-data analysis and organized into a comprehensive database allowing customized search. In addition, Zodiac provides data processing and analysis tools that allow users to customize the prior networks and update the genetic pathways of their interest. Zodiac is publicly available at www.compgenome.org/ZODIAC. Conclusions: Zodiac recapitulates and extends existing knowledge of molecular interactions in cancer. It can be used to explore novel gene-gene interactions, transcriptional regulation, and other types of molecular interplays in cancer.
Dorel, Mathurin; Viara, Eric; Barillot, Emmanuel; Zinovyev, Andrei; Kuperstein, Inna
2017-01-01
Human diseases such as cancer are routinely characterized by high-throughput molecular technologies, and multi-level omics data are accumulated in public databases at increasing rate. Retrieval and visualization of these data in the context of molecular network maps can provide insights into the pattern of regulation of molecular functions reflected by an omics profile. In order to make this task easy, we developed NaviCom, a Python package and web platform for visualization of multi-level omics data on top of biological network maps. NaviCom is bridging the gap between cBioPortal, the most used resource of large-scale cancer omics data and NaviCell, a data visualization web service that contains several molecular network map collections. NaviCom proposes several standardized modes of data display on top of molecular network maps, allowing addressing specific biological questions. We illustrate how users can easily create interactive network-based cancer molecular portraits via NaviCom web interface using the maps of Atlas of Cancer Signalling Network (ACSN) and other maps. Analysis of these molecular portraits can help in formulating a scientific hypothesis on the molecular mechanisms deregulated in the studied disease. NaviCom is available at https://navicom.curie.fr. © The Author(s) 2017. Published by Oxford University Press.
Davis, Allan Peter; Wiegers, Thomas C.; Murphy, Cynthia G.; Mattingly, Carolyn J.
2011-01-01
The Comparative Toxicogenomics Database (CTD) is a public resource that promotes understanding about the effects of environmental chemicals on human health. CTD biocurators read the scientific literature and convert free-text information into a structured format using official nomenclature, integrating third party controlled vocabularies for chemicals, genes, diseases and organisms, and a novel controlled vocabulary for molecular interactions. Manual curation produces a robust, richly annotated dataset of highly accurate and detailed information. Currently, CTD describes over 349 000 molecular interactions between 6800 chemicals, 20 900 genes (for 330 organisms) and 4300 diseases that have been manually curated from over 25 400 peer-reviewed articles. This manually curated data are further integrated with other third party data (e.g. Gene Ontology, KEGG and Reactome annotations) to generate a wealth of toxicogenomic relationships. Here, we describe our approach to manual curation that uses a powerful and efficient paradigm involving mnemonic codes. This strategy allows biocurators to quickly capture detailed information from articles by generating simple statements using codes to represent the relationships between data types. The paradigm is versatile, expandable, and able to accommodate new data challenges that arise. We have incorporated this strategy into a web-based curation tool to further increase efficiency and productivity, implement quality control in real-time and accommodate biocurators working remotely. Database URL: http://ctd.mdibl.org PMID:21933848
NGL Viewer: a web application for molecular visualization
Rose, Alexander S.; Hildebrand, Peter W.
2015-01-01
The NGL Viewer (http://proteinformatics.charite.de/ngl) is a web application for the visualization of macromolecular structures. By fully adopting capabilities of modern web browsers, such as WebGL, for molecular graphics, the viewer can interactively display large molecular complexes and is also unaffected by the retirement of third-party plug-ins like Flash and Java Applets. Generally, the web application offers comprehensive molecular visualization through a graphical user interface so that life scientists can easily access and profit from available structural data. It supports common structural file-formats (e.g. PDB, mmCIF) and a variety of molecular representations (e.g. ‘cartoon, spacefill, licorice’). Moreover, the viewer can be embedded in other web sites to provide specialized visualizations of entries in structural databases or results of structure-related calculations. PMID:25925569
Similarity-based modeling in large-scale prediction of drug-drug interactions.
Vilar, Santiago; Uriarte, Eugenio; Santana, Lourdes; Lorberbaum, Tal; Hripcsak, George; Friedman, Carol; Tatonetti, Nicholas P
2014-09-01
Drug-drug interactions (DDIs) are a major cause of adverse drug effects and a public health concern, as they increase hospital care expenses and reduce patients' quality of life. DDI detection is, therefore, an important objective in patient safety, one whose pursuit affects drug development and pharmacovigilance. In this article, we describe a protocol applicable on a large scale to predict novel DDIs based on similarity of drug interaction candidates to drugs involved in established DDIs. The method integrates a reference standard database of known DDIs with drug similarity information extracted from different sources, such as 2D and 3D molecular structure, interaction profile, target and side-effect similarities. The method is interpretable in that it generates drug interaction candidates that are traceable to pharmacological or clinical effects. We describe a protocol with applications in patient safety and preclinical toxicity screening. The time frame to implement this protocol is 5-7 h, with additional time potentially necessary, depending on the complexity of the reference standard DDI database and the similarity measures implemented.
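The core idea of this protocol, scoring a candidate pair by its similarity to an established DDI, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' protocol: Tanimoto similarity over feature sets stands in for whichever similarity measure is used, and the drug names, feature sets and the function name `predict_ddis` are hypothetical.

```python
def tanimoto(features_a, features_b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = features_a | features_b
    return len(features_a & features_b) / len(union) if union else 0.0


def predict_ddis(fingerprints, known_ddis):
    """Score new pairs by similarity to a drug in an established DDI.

    If `partner` is known to interact with `other`, a drug similar to
    `partner` becomes a candidate to interact with `other`. Each candidate
    keeps its best score and the known DDI it was derived from, so every
    prediction stays traceable.
    """
    known = {frozenset(pair) for pair in known_ddis}
    scores = {}
    for drug_x, drug_y in known_ddis:
        for partner, other in ((drug_x, drug_y), (drug_y, drug_x)):
            for candidate, features in fingerprints.items():
                if candidate in (partner, other):
                    continue
                pair = frozenset((candidate, other))
                if pair in known:
                    continue
                similarity = tanimoto(features, fingerprints[partner])
                if similarity > scores.get(pair, (0.0, None))[0]:
                    scores[pair] = (similarity, (partner, other))
    return scores


# Hypothetical drugs described by sets of structural feature identifiers.
fingerprints = {
    "drugA": {1, 2, 3, 4},
    "drugB": {1, 2, 3, 5},
    "drugC": {7, 8},
    "drugD": {9},
}
known_ddis = [("drugA", "drugC")]

scores = predict_ddis(fingerprints, known_ddis)
for pair, (score, source) in scores.items():
    print(sorted(pair), score, "derived from", source)
```

Here drugB resembles drugA, so the pair drugB/drugC is proposed with a score of 0.6 and a pointer back to the known drugA/drugC interaction. Other similarity sources named in the abstract (interaction profile, target, side effects) would slot in by replacing `tanimoto`.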
Senachak, Jittisak; Cheevadhanarak, Supapon; Hongsthong, Apiradee
2015-07-29
Spirulina (Arthrospira) platensis is the only cyanobacterium that in addition to being studied at the molecular level and subjected to gene manipulation, can also be mass cultivated in outdoor ponds for commercial use as a food supplement. Thus, encountering environmental changes, including temperature stresses, is common during the mass production of Spirulina. The use of cyanobacteria as an experimental platform, especially for photosynthetic gene manipulation in plants and bacteria, is becoming increasingly important. Understanding the mechanisms and protein-protein interaction networks that underlie low- and high-temperature responses is relevant to Spirulina mass production. To accomplish this goal, high-throughput techniques such as OMICs analyses are used. Thus, large datasets must be collected, managed and subjected to information extraction. Therefore, databases including (i) proteomic analysis and protein-protein interaction (PPI) data and (ii) domain/motif visualization tools are required for potential use in temperature response models for plant chloroplasts and photosynthetic bacteria. A web-based repository was developed including an embedded database, SpirPro, and tools for network visualization. Proteome data were analyzed integrated with protein-protein interactions and/or metabolic pathways from KEGG. The repository provides various information, ranging from raw data (2D-gel images) to associated results, such as data from interaction and/or pathway analyses. This integration allows in silico analyses of protein-protein interactions affected at the metabolic level and, particularly, analyses of interactions between and within the affected metabolic pathways under temperature stresses for comparative proteomic analysis. The developed tool, which is coded in HTML with CSS/JavaScript and depicted in Scalable Vector Graphics (SVG), is designed for interactive analysis and exploration of the constructed network. 
SpirPro is publicly available on the web at http://spirpro.sbi.kmutt.ac.th . SpirPro is an analysis platform containing an integrated proteome and PPI database that provides the most comprehensive data on this cyanobacterium at the systematic level. As an integrated database, SpirPro can be applied in various analyses, such as temperature stress response networking analysis in cyanobacterial models and interacting domain-domain analysis between proteins of interest.
Biopython: freely available Python tools for computational molecular biology and bioinformatics.
Cock, Peter J A; Antao, Tiago; Chang, Jeffrey T; Chapman, Brad A; Cox, Cymon J; Dalke, Andrew; Friedberg, Iddo; Hamelryck, Thomas; Kauff, Frank; Wilczynski, Bartek; de Hoon, Michiel J L
2009-06-01
The Biopython project is a mature open source international collaboration of volunteer developers, providing Python libraries for a wide range of bioinformatics problems. Biopython includes modules for reading and writing different sequence file formats and multiple sequence alignments, dealing with 3D macromolecular structures, interacting with common tools such as BLAST, ClustalW and EMBOSS, accessing key online databases, as well as providing numerical methods for statistical learning. Biopython is freely available, with documentation and source code at www.biopython.org under the Biopython license.
Inferring Higher Functional Information for RIKEN Mouse Full-Length cDNA Clones With FACTS
Nagashima, Takeshi; Silva, Diego G.; Petrovsky, Nikolai; Socha, Luis A.; Suzuki, Harukazu; Saito, Rintaro; Kasukawa, Takeya; Kurochkin, Igor V.; Konagaya, Akihiko; Schönbach, Christian
2003-01-01
FACTS (Functional Association/Annotation of cDNA Clones from Text/Sequence Sources) is a semiautomated knowledge discovery and annotation system that integrates molecular function information derived from sequence analysis results (sequence inferred) with functional information extracted from text. Text-inferred information was extracted from keyword-based retrievals of MEDLINE abstracts and by matching of gene or protein names to OMIM, BIND, and DIP database entries. Using FACTS, we found that 47.5% of the 60,770 RIKEN mouse cDNA FANTOM2 clone annotations were informative for text searches. MEDLINE queries yielded molecular interaction-containing sentences for 23.1% of the clones. When disease MeSH and GO terms were matched with retrieved abstracts, 22.7% of clones were associated with potential diseases, and 32.5% with GO identifiers. A significant number (23.5%) of disease MeSH-associated clones were also found to have a hereditary disease association (OMIM Morbidmap). Inferred neoplastic and nervous system disease represented 49.6% and 36.0% of disease MeSH-associated clones, respectively. A comparison of sequence-based GO assignments with informative text-based GO assignments revealed that for 78.2% of clones, identical GO assignments were provided for that clone by either method, whereas for 21.8% of clones, the assignments differed. In contrast, for OMIM assignments, only 28.5% of clones had identical sequence-based and text-based OMIM assignments. Sequence, sentence, and term-based functional associations are included in the FACTS database (http://facts.gsc.riken.go.jp/), which permits results to be annotated and explored through web-accessible keyword and sequence search interfaces. The FACTS database will be a critical tool for investigating the functional complexity of the mouse transcriptome, cDNA-inferred interactome (molecular interactions), and pathome (pathologies). PMID:12819151
An automated method for finding molecular complexes in large protein interaction networks
Bader, Gary D; Hogue, Christopher WV
2003-01-01
Background Recent advances in proteomics technologies such as two-hybrid, phage display and mass spectrometry have enabled us to create a detailed map of biomolecular interaction networks. Initial mapping efforts have already produced a wealth of data. As the size of the interaction set increases, databases and computational methods will be required to store, visualize and analyze the information in order to effectively aid in knowledge discovery. Results This paper describes a novel graph theoretic clustering algorithm, "Molecular Complex Detection" (MCODE), that detects densely connected regions in large protein-protein interaction networks that may represent molecular complexes. The method is based on vertex weighting by local neighborhood density and outward traversal from a locally dense seed protein to isolate the dense regions according to given parameters. The algorithm has the advantage over other graph clustering methods of having a directed mode that allows fine-tuning of clusters of interest without considering the rest of the network and allows examination of cluster interconnectivity, which is relevant for protein networks. Protein interaction and complex information from the yeast Saccharomyces cerevisiae was used for evaluation. Conclusion Dense regions of protein interaction networks can be found, based solely on connectivity data, many of which correspond to known protein complexes. The algorithm is not affected by a known high rate of false positives in data from high-throughput interaction techniques. The program is available from . PMID:12525261
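The two stages named in this abstract, vertex weighting by local neighborhood density and outward traversal from a dense seed, can be sketched in Python as follows. This is a simplified illustration, not the published program: the weight used here (highest k-core level of a vertex's neighborhood times the density of that core) is an assumed reading of "local neighborhood density", and the `cutoff` parameter and the toy network are hypothetical.

```python
def induced(adj, nodes):
    """Adjacency map of the subgraph induced by `nodes`."""
    nodes = set(nodes)
    return {v: adj[v] & nodes for v in nodes}


def highest_k_core(adj):
    """Return (k, core) for the highest non-empty k-core of a graph."""
    best_k, best = 0, adj
    current = {v: set(ns) for v, ns in adj.items()}
    k = 1
    while current:
        changed = True
        while changed:  # peel vertices whose degree is below k
            low = [v for v, ns in current.items() if len(ns) < k]
            changed = bool(low)
            for v in low:
                for n in current.pop(v):
                    if n in current:
                        current[n].discard(v)
        if current:
            best_k, best = k, {v: set(ns) for v, ns in current.items()}
        k += 1
    return best_k, best


def density(adj):
    """Fraction of possible edges that are present."""
    n = len(adj)
    if n < 2:
        return 0.0
    edges = sum(len(ns) for ns in adj.values()) / 2
    return edges / (n * (n - 1) / 2)


def vertex_weights(adj):
    """Weight each vertex by the dense core of its closed neighborhood."""
    weights = {}
    for v in adj:
        k, core = highest_k_core(induced(adj, adj[v] | {v}))
        weights[v] = k * density(core)
    return weights


def grow_cluster(adj, weights, seed, cutoff=0.2):
    """Traverse outward from `seed`, keeping vertices of similar weight."""
    threshold = weights[seed] * (1 - cutoff)
    cluster, frontier = {seed}, [seed]
    while frontier:
        v = frontier.pop()
        for n in adj[v]:
            if n not in cluster and weights[n] >= threshold:
                cluster.add(n)
                frontier.append(n)
    return cluster


# Toy network: a 4-clique (A-D) with a sparse tail D-E-F.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"),
         ("B", "D"), ("C", "D"), ("D", "E"), ("E", "F")]
network = {}
for a, b in edges:
    network.setdefault(a, set()).add(b)
    network.setdefault(b, set()).add(a)

weights = vertex_weights(network)
cluster = grow_cluster(network, weights, seed="A")
print(sorted(cluster))  # ['A', 'B', 'C', 'D']
```

The clique members all receive the top weight, so the traversal from A collects the clique and stops at the tail, which is the behaviour the abstract describes for densely connected regions.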
FragFit: a web-application for interactive modeling of protein segments into cryo-EM density maps.
Tiemann, Johanna K S; Rose, Alexander S; Ismer, Jochen; Darvish, Mitra D; Hilal, Tarek; Spahn, Christian M T; Hildebrand, Peter W
2018-05-21
Cryo-electron microscopy (cryo-EM) is a standard method to determine the three-dimensional structures of molecular complexes. However, easy to use tools for modeling of protein segments into cryo-EM maps are sparse. Here, we present the FragFit web-application, a web server for interactive modeling of segments of up to 35 amino acids length into cryo-EM density maps. The fragments are provided by a regularly updated database containing at the moment about 1 billion entries extracted from PDB structures and can be readily integrated into a protein structure. Fragments are selected based on geometric criteria, sequence similarity and fit into a given cryo-EM density map. Web-based molecular visualization with the NGL Viewer allows interactive selection of fragments. The FragFit web-application, accessible at http://proteinformatics.de/FragFit, is free and open to all users, without any login requirements.
Friedrich, Anne; Garnier, Nicolas; Gagnière, Nicolas; Nguyen, Hoan; Albou, Laurent-Philippe; Biancalana, Valérie; Bettler, Emmanuel; Deléage, Gilbert; Lecompte, Odile; Muller, Jean; Moras, Dino; Mandel, Jean-Louis; Toursel, Thierry; Moulinier, Luc; Poch, Olivier
2010-02-01
Understanding how genetic alterations affect gene products at the molecular level represents a first step in the elucidation of the complex relationships between genotypic and phenotypic variations, and is thus a major challenge in the postgenomic era. Here, we present SM2PH-db (http://decrypthon.igbmc.fr/sm2ph), a new database designed to investigate structural and functional impacts of missense mutations and their phenotypic effects in the context of human genetic diseases. A wealth of up-to-date interconnected information is provided for each of the 2,249 disease-related entry proteins (August 2009), including data retrieved from biological databases and data generated from a Sequence-Structure-Evolution Inference in Systems-based approach, such as multiple alignments, three-dimensional structural models, and multidimensional (physicochemical, functional, structural, and evolutionary) characterizations of mutations. SM2PH-db provides a robust infrastructure associated with interactive analysis tools supporting in-depth study and interpretation of the molecular consequences of mutations, with the longer-term goal of elucidating the chain of events leading from a molecular defect to its pathology. The entire content of SM2PH-db is regularly and automatically updated thanks to the computational grid data federation facilities provided in the context of the Decrypthon program. (c) 2009 Wiley-Liss, Inc.
Interactome of the hepatitis C virus: Literature mining with ANDSystem.
Saik, Olga V; Ivanisenko, Timofey V; Demenkov, Pavel S; Ivanisenko, Vladimir A
2016-06-15
A study of the molecular genetics mechanisms of host-pathogen interactions is of paramount importance in developing drugs against viral diseases. Currently, the literature contains a huge amount of information that describes interactions between HCV and human proteins. In addition, there are many factual databases that contain experimentally verified data on HCV-host interactions. The sources of such data are the original data along with the data manually extracted from the literature. However, the manual analysis of scientific publications is time consuming and, because of this, databases created with such an approach often do not have complete information. One of the most promising methods to provide actualisation and completeness of information is text mining. Here, with the use of a previously developed method by the authors using ANDSystem, an automated extraction of information on the interactions between HCV and human proteins was conducted. As a data source for the text mining approach, PubMed abstracts and full text articles were used. Additionally, external factual databases were analyzed. On the basis of this analysis, a special version of ANDSystem, extended with the HCV interactome, was created. The HCV interactome contains information about the interactions between 969 human and 11 HCV proteins. Among the 969 proteins, 153 'new' proteins were found not previously referred to in any external databases of protein-protein interactions for HCV-host interactions. Thus, the extended ANDSystem possesses a more comprehensive detailing of HCV-host interactions versus other existing databases. It was interesting that HCV proteins more preferably interact with human proteins that were already involved in a large number of protein-protein interactions as well as those associated with many diseases. Among human proteins of the HCV interactome, there were a large number of proteins regulated by microRNAs. 
It turned out that the results obtained for protein-protein interactions and microRNA-regulation did not depend on how well the proteins were studied, while protein-disease interactions appeared to be dependent on the level of study. In particular, the mean number of diseases linked to well-studied proteins (proteins were considered well-studied if they were mentioned in 50 or more PubMed publications) from the HCV interactome was 20.8, significantly exceeding the mean number of associations with diseases (10.1) for the total set of well-studied human proteins present in ANDSystem. For poorly-studied proteins from the HCV interactome (each referred to in fewer than 50 publications), the distribution of the number of associated diseases showed no statistically significant difference from the distribution for poorly-studied proteins in the total set of human proteins stored in ANDSystem. Here, the average numbers of associations with diseases for the HCV interactome and the total set of human proteins were 0.3 and 0.2, respectively. Thus, ANDSystem, extended with the HCV interactome, can be helpful in a wide range of issues related to analyzing HCV-host interactions in the search for anti-HCV drug targets. The demo version of the extended ANDSystem covered here, containing only interactions between human proteins, genes, metabolites, diseases, miRNAs and molecular-genetic pathways, as well as interactions between human proteins/genes and HCV proteins, is freely available at the following web address: http://www-bionet.sscc.ru/psd/andhcv/. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.
Bonati, Laura; Corrada, Dario; Tagliabue, Sara Giani; Motta, Stefano
2017-02-01
Molecular modeling has given important contributions to elucidation of the main stages in the AhR signal transduction pathway. Despite the lack of experimentally determined structures of the AhR functional domains, information derived from homologous systems has been exploited for modeling their structure and interactions. Homology models of the AhR PASB domain have provided information on the binding cavity and contributed to elucidate species-specific differences in ligand binding. Molecular Docking simulations of the ligand binding process have given insights into differences in binding of diverse agonists, antagonists, and selective AhR modulators, and their application to virtual screening of large databases of compounds have allowed identification of novel AhR ligands. Recently available structural information on protein-protein and protein-DNA complexes of other bHLH-PAS systems has opened the way for modeling the AhR:ARNT dimer structure and investigating the mechanisms of AhR transformation and DNA binding. Future research directions should include simulation of the protein dynamics to obtain a more reliable description of intermolecular interactions involved in signal transmission.
NABIC marker database: A molecular markers information network of agricultural crops.
Kim, Chang-Kug; Seol, Young-Joo; Lee, Dong-Jun; Jeong, In-Seon; Yoon, Ung-Han; Lee, Gang-Seob; Hahn, Jang-Ho; Park, Dong-Suk
2013-01-01
In 2013, the National Agricultural Biotechnology Information Center (NABIC) reconstructed a molecular marker database for useful genetic resources. The web-based marker database consists of three major functional categories: map viewer, RSN marker and gene annotation. It provides 7250 marker locations, 3301 RSN marker properties and 3280 molecular marker annotations for agricultural plants. Each molecular marker record provides information such as marker name, expressed sequence tag number, gene definition and general marker information. This updated marker-based database provides useful information through a user-friendly web interface that assists in tracing any new structures of the chromosomes and gene positional functions using specific molecular markers. The database is available for free at http://nabic.rda.go.kr/gere/rice/molecularMarkers/
The European Bioinformatics Institute's data resources: towards systems biology
Brooksbank, Catherine; Cameron, Graham; Thornton, Janet
2005-01-01
Genomic and post-genomic biological research has provided fine-grain insights into the molecular processes of life, but also threatens to drown biomedical researchers in data. Moreover, as new high-throughput technologies are developed, the types of data that are gathered en masse are diversifying. The need to collect, store and curate all this information in ways that allow its efficient retrieval and exploitation is greater than ever. The European Bioinformatics Institute's (EBI's) databases and tools have evolved to meet the changing needs of molecular biologists: since we last wrote about our services in the 2003 issue of Nucleic Acids Research, we have launched new databases covering protein–protein interactions (IntAct), pathways (Reactome) and small molecules (ChEBI). Our existing core databases have continued to evolve to meet the changing needs of biomedical researchers, and we have developed new data-access tools that help biologists to move intuitively through the different data types, thereby helping them to put the parts together to understand biology at the systems level. The EBI's data resources are all available on our website at http://www.ebi.ac.uk. PMID:15608238
An infrastructure to mine molecular descriptors for ligand selection on virtual screening.
Seus, Vinicius Rosa; Perazzo, Giovanni Xavier; Winck, Ana T; Werhli, Adriano V; Machado, Karina S
2014-01-01
The receptor-ligand interaction evaluation is one important step in rational drug design. The databases that provide the structures of the ligands are growing on a daily basis. This makes it impossible to test all the ligands for a target receptor. Hence, a ligand selection before testing the ligands is needed. One possible approach is to evaluate a set of molecular descriptors. With the aim of describing the characteristics of promising compounds for a specific receptor we introduce a data warehouse-based infrastructure to mine molecular descriptors for virtual screening (VS). We performed experiments that consider as target the receptor HIV-1 protease and different compounds for this protein. A set of 9 molecular descriptors are taken as the predictive attributes and the free energy of binding is taken as a target attribute. By applying the J48 algorithm over the data we obtain decision tree models that achieved up to 84% of accuracy. The models indicate which molecular descriptors and their respective values are relevant to influence good FEB results. Using their rules we performed ligand selection on ZINC database. Our results show important reduction in ligands selection to be applied in VS experiments; for instance, the best selection model picked only 0.21% of the total amount of drug-like ligands.
NASA Astrophysics Data System (ADS)
De, Biplab; Adhikari, Indrani; Nandy, Ashis; Saha, Achintya; Goswami, Binoy Behari
2017-06-01
Design and development of antioxidant supplements constitute an essential aspect of research in order to derive molecules that would help to combat free radical invasion of the human body and curb oxidative stress related diseases. The present work deals with the development of in silico models for a series of thiazolidine derivatives having antioxidant potential. The objective of the work is to obtain models that would help to design new thiazolidine derivatives based on substituent modification and thereby predict their activity profile. The QSAR model thus developed helps to quantify the contribution of the various molecular fragments to the activity of the molecules, while the 3D pharmacophore model provides a brief idea of the essential molecular features that help the molecules to interact with the neighbouring free radicals. Both models have been extensively validated, which ensures their predictive ability as well as their potential to search molecular databases for thiazolidine derivatives with potent antioxidant activity. The models can thus be utilised effectively for database searching with the aim of isolating active antioxidants belonging to the thiazolidine group.
Bioinformatics and molecular modeling in glycobiology
Schloissnig, Siegfried
2010-01-01
The field of glycobiology is concerned with the study of the structure, properties, and biological functions of the family of biomolecules called carbohydrates. Bioinformatics for glycobiology is a particularly challenging field, because carbohydrates exhibit a high structural diversity and their chains are often branched. Significant improvements in experimental analytical methods over recent years have led to a tremendous increase in the amount of carbohydrate structure data generated. Consequently, the availability of databases and tools to store, retrieve and analyze these data in an efficient way is of fundamental importance to progress in glycobiology. In this review, the various graphical representations and sequence formats of carbohydrates are introduced, and an overview of newly developed databases, the latest developments in sequence alignment and data mining, and tools to support experimental glycan analysis are presented. Finally, the field of structural glycoinformatics and molecular modeling of carbohydrates, glycoproteins, and protein–carbohydrate interaction are reviewed. PMID:20364395
MetNetAPI: A flexible method to access and manipulate biological network data from MetNet
2010-01-01
Background Convenient programmatic access to different biological databases allows automated integration of scientific knowledge. Many databases support a function to download files or data snapshots, or a webservice that offers "live" data. However, the functionality that a database offers cannot be represented in a static data download file, and webservices may consume considerable computational resources from the host server. Results MetNetAPI is a versatile Application Programming Interface (API) to the MetNetDB database. It abstracts, captures and retains operations away from a biological network repository and website. A range of database functions, previously only available online, can be immediately (and independently from the website) applied to a dataset of interest. Data is available in four layers: molecular entities, localized entities (linked to a specific organelle), interactions, and pathways. Navigation between these layers is intuitive (e.g. one can request the molecular entities in a pathway, as well as request in what pathways a specific entity participates). Data retrieval can be customized: Network objects allow the construction of new and integration of existing pathways and interactions, which can be uploaded back to our server. In contrast to webservices, the computational demand on the host server is limited to processing data-related queries only. Conclusions An API provides several advantages to a systems biology software platform. MetNetAPI illustrates an interface with a central repository of data that represents the complex interrelationships of a metabolic and regulatory network. As an alternative to data-dumps and webservices, it allows access to a current and "live" database and exposes analytical functions to application developers. Yet it only requires limited resources on the server-side (thin server/fat client setup). 
The API is available for Java, Microsoft.NET and R programming environments and offers flexible query and broad data-retrieval methods. Data retrieval can be customized to client needs and the API offers a framework to construct and manipulate user-defined networks. The design principles can be used as a template to build programmable interfaces for other biological databases. The API software and tutorials are available at http://www.metnetonline.org/api. PMID:21083943
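The two-way navigation between layers described in this abstract (entities in a pathway, pathways containing an entity) can be illustrated with a small in-memory index. This is only a sketch in Python, not MetNetAPI itself: the class `NetworkIndex`, its method names and the example pathway and entity names are all hypothetical and are not taken from the MetNetAPI documentation.

```python
from collections import defaultdict


class NetworkIndex:
    """Hypothetical two-way index between pathways and molecular entities."""

    def __init__(self):
        # Both directions are stored so either lookup is a single dict access.
        self._entities_by_pathway = defaultdict(set)
        self._pathways_by_entity = defaultdict(set)

    def add_membership(self, pathway, entity):
        """Record that `entity` participates in `pathway`."""
        self._entities_by_pathway[pathway].add(entity)
        self._pathways_by_entity[entity].add(pathway)

    def entities_in(self, pathway):
        """Return the entities of a pathway, sorted for stable output."""
        return sorted(self._entities_by_pathway.get(pathway, ()))

    def pathways_of(self, entity):
        """Return the pathways an entity participates in, sorted."""
        return sorted(self._pathways_by_entity.get(entity, ()))


# Illustrative data only; these names are not from MetNetDB.
index = NetworkIndex()
index.add_membership("glycolysis", "glucose")
index.add_membership("glycolysis", "pyruvate")
index.add_membership("tca_cycle", "pyruvate")

print(index.entities_in("glycolysis"))
print(index.pathways_of("pyruvate"))
```

Keeping both mappings on the client side reflects the thin server/fat client idea in the abstract: once the memberships are fetched, navigation needs no further server work.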
CerebralWeb: a Cytoscape.js plug-in to visualize networks stratified by subcellular localization.
Frias, Silvia; Bryan, Kenneth; Brinkman, Fiona S L; Lynn, David J
2015-01-01
CerebralWeb is a light-weight JavaScript plug-in that extends Cytoscape.js to enable fast and interactive visualization of molecular interaction networks stratified based on subcellular localization or other user-supplied annotation. The application is designed to be easily integrated into any website and is configurable to support customized network visualization. CerebralWeb also supports the automatic retrieval of Cerebral-compatible localizations for human, mouse and bovine genes via a web service and enables the automated parsing of Cytoscape compatible XGMML network files. CerebralWeb currently supports embedded network visualization on the InnateDB (www.innatedb.com) and Allergy and Asthma Portal (allergen.innatedb.com) database and analysis resources. Database tool URL: http://www.innatedb.com/CerebralWeb © The Author(s) 2015. Published by Oxford University Press.
ZikaBase: An integrated ZIKV- Human Interactome Map database.
Gurumayum, Sanathoi; Brahma, Rahul; Naorem, Leimarembi Devi; Muthaiyan, Mathavan; Gopal, Jeyakodi; Venkatesan, Amouda
2018-01-15
Re-emergence of ZIKV has caused infections in more than 1.5 million people. The molecular mechanism and pathogenesis of ZIKV are not well explored, owing to the lack of an adequate model and of publicly accessible resources providing a ZIKV-Human protein interactome map. This study attempted to curate ZIKV-Human interacting proteins from the published literature and RNA-Seq data. Eleven direct interactions and 12 associated genes were retrieved from the literature, and 3742 Differentially Expressed Genes (DEGs) were obtained from RNA-Seq analysis. The genes were analyzed to construct the ZIKV-Human Interactome Map. Enrichment analysis illustrates the importance of the study: the direct-interaction and associated genes are enriched in viral entry into the host cell. Also, ZIKV infection modulates 32% signal and 27% immune system pathways. The integrated database, ZikaBase, has been developed to help the virology research community and is accessible at https://test5.bicpu.edu.in. Copyright © 2017 Elsevier Inc. All rights reserved.
Patel, Chirag N; Georrge, John J; Modi, Krunal M; Narechania, Moksha B; Patel, Daxesh P; Gonzalez, Frank J; Pandya, Himanshu A
2017-12-27
Alzheimer's disease (AD) is one of the most significant neurodegenerative disorders and its symptoms mostly appear in aged people. Catechol-o-methyltransferase (COMT) is one of the known target enzymes responsible for AD. Using 23 known inhibitors of COMT, a pharmacophore query was generated and validated by screening against a database of 1500 decoys to obtain the GH score and enrichment value. The crucial features of the known inhibitors were evaluated with the online ZINC Pharmer to identify new leads from the ZINC database. Five hundred hits were retrieved from ZINC Pharmer; after ADMET (absorption, distribution, metabolism, excretion, and toxicity) filtering with FAF-Drug-3, 36 molecules were considered for molecular docking. From the COMT inhibitors, opicapone, fenoldopam, and quercetin were selected, while ZINC63625100_413, ZINC39411941_412, ZINC63234426_254, ZINC63637968_451, and ZINC64019452_303 were chosen for molecular dynamics simulation analysis on the basis of their high binding affinity and structural recognition. This study identified potential COMT inhibitors through pharmacophore-based inhibitor screening, leading to a more complete understanding of molecular-level interactions.
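For readers unfamiliar with the two validation metrics named in this abstract, the sketch below computes the enrichment factor and the Güner-Henry (GH) score in their commonly cited forms. The database size of 1523 combines the 23 inhibitors and 1500 decoys mentioned in the abstract; the hit counts are hypothetical because the abstract does not report them, and the function names are illustrative.

```python
def enrichment_factor(total, actives, hits, active_hits):
    """Ratio of the active fraction among hits to the active fraction overall."""
    return (active_hits / hits) / (actives / total)


def gh_score(total, actives, hits, active_hits):
    """Guner-Henry score: weighted yield/recall term times a false-positive penalty."""
    yield_and_recall = active_hits * (3 * actives + hits) / (4 * hits * actives)
    false_positive_penalty = 1 - (hits - active_hits) / (total - actives)
    return yield_and_recall * false_positive_penalty


# 23 actives + 1500 decoys come from the abstract; 25 hits with 20 actives
# among them are assumed values chosen only to exercise the formulas.
TOTAL, ACTIVES = 1523, 23
HITS, ACTIVE_HITS = 25, 20

print(round(enrichment_factor(TOTAL, ACTIVES, HITS, ACTIVE_HITS), 2))
print(round(gh_score(TOTAL, ACTIVES, HITS, ACTIVE_HITS), 3))
```

A GH score of 1 corresponds to a query that retrieves every active and nothing else, which is why the score is often used alongside the enrichment factor rather than instead of it.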
Ligand solvation in molecular docking.
Shoichet, B K; Leach, A R; Kuntz, I D
1999-01-01
Solvation plays an important role in ligand-protein association and has a strong impact on comparisons of binding energies for dissimilar molecules. When databases of such molecules are screened for complementarity to receptors of known structure, as often occurs in structure-based inhibitor discovery, failure to consider ligand solvation often leads to putative ligands that are too highly charged or too large. To correct for the different charge states and sizes of the ligands, we calculated electrostatic and non-polar solvation free energies for molecules in a widely used molecular database, the Available Chemicals Directory (ACD). A modified Born equation treatment was used to calculate the electrostatic component of ligand solvation. The non-polar component of ligand solvation was calculated based on the surface area of the ligand and parameters derived from the hydration energies of apolar ligands. These solvation energies were subtracted from the ligand-receptor interaction energies. We tested the usefulness of these corrections by screening the ACD for molecules that complemented three proteins of known structure, using a molecular docking program. Correcting for ligand solvation improved the rankings of known ligands and discriminated against molecules with inappropriate charge states and sizes.
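The correction described in this abstract, subtracting the ligand solvation energy from the ligand-receptor interaction energy, can be sketched as follows. The code uses the plain Born equation for a single spherical ion as a stand-in for the authors' modified Born treatment, which the abstract does not specify; the non-polar term is passed in as a number rather than computed. The radii, charges and raw scores are hypothetical.

```python
COULOMB_KCAL = 332.06  # Coulomb constant in kcal*Angstrom/(mol*e^2)


def born_solvation(charge, radius, dielectric=78.5):
    """Plain Born electrostatic solvation energy (kcal/mol) of a spherical ion.

    charge is in units of e, radius in Angstrom. This is NOT the modified
    treatment used in the paper, only the textbook form.
    """
    return -0.5 * COULOMB_KCAL * charge ** 2 / radius * (1 - 1 / dielectric)


def corrected_score(interaction, charge, radius, nonpolar=0.0):
    """Interaction energy minus the total ligand solvation energy."""
    solvation = born_solvation(charge, radius) + nonpolar
    return interaction - solvation


# Hypothetical ligands: (name, raw docking energy, net charge, Born radius).
ligands = [
    ("highly_charged", -40.0, -2.0, 3.0),
    ("neutral", -30.0, 0.0, 3.0),
]

ranked = sorted(ligands, key=lambda l: corrected_score(l[1], l[2], l[3]))
print([name for name, *_ in ranked])
```

With these assumed numbers the doubly charged ligand scores better before correction but falls behind the neutral one afterwards, which is the qualitative effect the abstract reports.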
Tandon, Chanderdeep
2013-01-01
Background The increasing number of patients suffering from urolithiasis represents one of the major challenges that nephrologists face worldwide today. To enhance therapeutic outcomes of this disease, an understanding of the pathogenic basis of renal stone formation is urgently needed. Proteins are a major component of the human renal stone matrix and are considered to have a potential role in crystal–membrane interaction, crystal growth and stone formation, but their role in urolithiasis still remains obscure. Methods Proteins were isolated from the matrix of human calcium oxalate (CaOx) containing kidney stones. Proteins having MW>3 kDa were subjected to anion exchange chromatography followed by molecular-sieve chromatography. The activity of these purified proteins was tested against CaOx nucleation and growth and on oxalate-injured Madin–Darby Canine Kidney (MDCK) renal epithelial cells. Proteins were identified by matrix-assisted laser desorption/ionization-time of flight mass spectrometry (MALDI-TOF MS) followed by a database search with the MASCOT server. In silico molecular interactions with CaOx crystals were also investigated. Results Five proteins with the competence to control the stone formation process were identified from the matrix of calcium oxalate kidney stones by MALDI-TOF MS followed by a database search with the MASCOT server. Of these, two proteins were promoters, two were inhibitors and one had a dual activity of both inhibition and promotion towards CaOx nucleation and growth. Further molecular modelling calculations revealed the mode of interaction of these proteins with CaOx at the molecular level. Conclusions We identified and characterized Ethanolamine-phosphate cytidylyltransferase, Ras GTPase-activating-like protein, UDP-glucose:glycoprotein glucosyltransferase 2, RIMS-binding protein 3A and Macrophage-capping protein as novel proteins from the matrix of human calcium oxalate stones which play a critical role in kidney stone formation.
Thus, these proteins, which have the potential to modulate calcium oxalate crystallization, will shed light on understanding and controlling urolithiasis in humans. PMID:23894559
PCPPI: a comprehensive database for the prediction of Penicillium-crop protein-protein interactions.
Yue, Junyang; Zhang, Danfeng; Ban, Rongjun; Ma, Xiaojing; Chen, Danyang; Li, Guangwei; Liu, Jia; Wisniewski, Michael; Droby, Samir; Liu, Yongsheng
2017-01-01
Penicillium expansum, the causal agent of blue mold, is one of the most prevalent post-harvest pathogens, infecting a wide range of crops after harvest. In response, crops have evolved various defense systems to protect themselves against this and other pathogens. Penicillium-crop interaction is a multifaceted process mediated by pathogen- and host-derived proteins. Identification and characterization of the inter-species protein-protein interactions (PPIs) are fundamental to elucidating the molecular mechanisms underlying infection processes between P. expansum and plant crops. Here, we have developed PCPPI, the Penicillium-Crop Protein-Protein Interactions database, which is constructed based on the experimentally determined orthologous interactions in pathogen-plant systems and available domain-domain interactions (DDIs) in each PPI. Thus far, it stores information on 9911 proteins, 439 904 interactions and seven host species, including apple, kiwifruit, maize, pear, rice, strawberry and tomato. Further analysis through gene ontology (GO) annotation indicated that proteins with more interacting partners tend to carry out essential functions. Significantly, semantic statistics of the GO terms also provided strong support for the accuracy of our predicted interactions in PCPPI. We believe that the PCPPI datasets will facilitate the study of pathogen-crop interactions; all of them are freely available to the research community. Database URL: http://bdg.hfut.edu.cn/pcppi/index.html. © The Author(s) 2017. Published by Oxford University Press.
Stacking and T-shape competition in aromatic-aromatic amino acid interactions.
Chelli, Riccardo; Gervasio, Francesco Luigi; Procacci, Piero; Schettino, Vincenzo
2002-05-29
The potential of mean force of interacting aromatic amino acids is calculated using molecular dynamics simulations. The free energy surface is determined in order to study stacking and T-shape competition for phenylalanine-phenylalanine (Phe-Phe), phenylalanine-tyrosine (Phe-Tyr), and tyrosine-tyrosine (Tyr-Tyr) complexes in vacuo, water, carbon tetrachloride, and methanol. Stacked structures are favored in all solvents with the exception of the Tyr-Tyr complex in carbon tetrachloride, where T-shaped structures are also important. The effect of anchoring the two alpha-carbons (C(alpha)) at selected distances is investigated. We find that short and large C(alpha)-C(alpha) distances favor stacked and T-shaped structures, respectively. We analyze a set of 2396 protein structures resolved experimentally. Comparison of theoretical free energies for the complexes to the experimental analogue shows that Tyr-Tyr interaction occurs mainly at the protein surface, while Phe-Tyr and Phe-Phe interactions are more frequent in the hydrophobic protein core. This is confirmed by the Voronoi polyhedron analysis on the database protein structures. As found from the free energy calculation, analysis of the protein database has shown that proximal and distal interacting aromatic residues are predominantly stacked and T-shaped, respectively.
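The link between a free energy surface and how often a geometry is observed, which this abstract relies on when it compares simulated free energies with the protein database, can be sketched with the standard Boltzmann inversion W = -kT ln(P / P_max). This is a generic illustration, not the authors' procedure, which the abstract does not detail; the bin counts and the function name are hypothetical.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def pmf_from_counts(counts, temperature=300.0):
    """Free energy (kcal/mol) of each bin relative to the most populated bin.

    Empty bins are assigned infinite free energy.
    """
    total = sum(counts)
    probabilities = [c / total for c in counts]
    p_max = max(probabilities)
    kt = K_B * temperature
    return [
        -kt * math.log(p / p_max) if p > 0 else math.inf
        for p in probabilities
    ]


# Hypothetical occurrence counts for three geometry bins.
profile = pmf_from_counts([10, 100, 0])
print([round(w, 3) if math.isfinite(w) else w for w in profile])
```

The most populated bin sits at zero by construction, so only free energy differences between geometries carry meaning in this sketch.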
NASA Astrophysics Data System (ADS)
Fu, Ying; Sun, Yi-Na; Yi, Ke-Han; Li, Ming-Qiang; Cao, Hai-Feng; Li, Jia-Zhong; Ye, Fei
2018-02-01
4-Hydroxyphenylpyruvate dioxygenase (EC 1.13.11.27, HPPD) is a potent new bleaching herbicide target. Therefore, in silico structure-based virtual screening was performed in order to speed up the identification of promising HPPD inhibitors. In this study, an integrated virtual screening protocol combining a 3D-pharmacophore model, molecular docking and molecular dynamics (MD) simulation was established to find novel HPPD inhibitors from four commercial databases. The 3D-pharmacophore Hypo1 model was applied to efficiently narrow down potential hits. The hit compounds were subsequently submitted to molecular docking studies, which identified four compounds as potent inhibitors acting through Fe(II) coordination and interaction with Phe360, Phe403 and Phe398. The MD results demonstrated that the nonpolar term of compound 3881 made a large contribution to its binding affinity. This compound showed an IC50 of 2.49 µM against AtHPPD in vitro. The results provide useful information for developing novel HPPD inhibitors, leading to further understanding of the interaction mechanism of HPPD inhibitors.
John, Shalini; Thangapandian, Sundarapandian; Lee, Keun Woo
2012-01-01
Human pancreatic cholesterol esterase (hCEase) is one of the lipases found to be involved in the digestion of a large and broad spectrum of substrates including triglycerides, phospholipids, cholesteryl esters, etc. The presence of bile salts is found to be very important for the activation of hCEase. Molecular dynamics simulations were performed for the apo form and the bile salt complexed form of hCEase using the coordinates of two bile salts from bovine CEase. The stability of the systems throughout the simulation time was checked and two representative structures from the highly populated regions were selected using cluster analysis. These two representative structures were used in pharmacophore model generation. The generated pharmacophore models were validated and used in database screening. The screened hits were refined for their drug-like properties based on Lipinski's rule of five and ADMET properties. The drug-like compounds were further refined by molecular docking simulation with the GOLD program, based on the GOLD fitness score, mode of binding, and molecular interactions with the active site amino acids. Finally, three hits with novel scaffolds were selected as potential leads for the design of novel and potent hCEase inhibitors. The stability of the binding modes and molecular interactions of these final hits was confirmed by molecular dynamics simulations.
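The drug-likeness refinement mentioned in this abstract can be illustrated with Lipinski's rule of five in its usual form: molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors and at most 10 hydrogen-bond acceptors. The sketch below is a minimal filter over precomputed properties; the compounds, their property values and the choice of tolerating one violation are assumptions for illustration, not details from the study.

```python
def lipinski_violations(mol_weight, logp, h_donors, h_acceptors):
    """Count how many of the four rule-of-five limits a compound exceeds."""
    checks = [
        mol_weight > 500,   # molecular weight in Da
        logp > 5,           # octanol-water partition coefficient
        h_donors > 5,       # hydrogen-bond donors
        h_acceptors > 10,   # hydrogen-bond acceptors
    ]
    return sum(checks)


def is_drug_like(compound, max_violations=1):
    """Keep a compound if it breaks no more than `max_violations` limits."""
    return lipinski_violations(**compound["properties"]) <= max_violations


# Hypothetical screening hits with made-up property values.
hits = [
    {"name": "hit_a", "properties": dict(mol_weight=180.2, logp=1.2, h_donors=1, h_acceptors=4)},
    {"name": "hit_b", "properties": dict(mol_weight=720.0, logp=6.3, h_donors=2, h_acceptors=12)},
]

drug_like = [h["name"] for h in hits if is_drug_like(h)]
print(drug_like)
```

In a real workflow the property values would come from a cheminformatics toolkit; they are hard-coded here so the example stays self-contained.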
Database resources of the National Center for Biotechnology Information
2015-01-01
The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (Bookshelf, PubMed Central (PMC) and PubReader); medical genetics (ClinVar, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen); genes and genomics (BioProject, BioSample, dbSNP, dbVar, Epigenomics, Gene, Gene Expression Omnibus (GEO), Genome, HomoloGene, the Map Viewer, Nucleotide, PopSet, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser, Trace Archive and UniGene); and proteins and chemicals (Biosystems, COBALT, the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB), Protein Clusters, Protein and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for many of these databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov. PMID:25398906
Database resources of the National Center for Biotechnology Information
2016-01-01
The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (PubMed Central (PMC), Bookshelf and PubReader), health (ClinVar, dbGaP, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen), genomes (BioProject, Assembly, Genome, BioSample, dbSNP, dbVar, Epigenomics, the Map Viewer, Nucleotide, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser and the Trace Archive), genes (Gene, Gene Expression Omnibus (GEO), HomoloGene, PopSet and UniGene), proteins (Protein, the Conserved Domain Database (CDD), COBALT, Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB) and Protein Clusters) and chemicals (Biosystems and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for most of these databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. PMID:26615191
NGL Viewer: a web application for molecular visualization.
Rose, Alexander S; Hildebrand, Peter W
2015-07-01
The NGL Viewer (http://proteinformatics.charite.de/ngl) is a web application for the visualization of macromolecular structures. By fully adopting capabilities of modern web browsers, such as WebGL, for molecular graphics, the viewer can interactively display large molecular complexes and is also unaffected by the retirement of third-party plug-ins like Flash and Java Applets. Generally, the web application offers comprehensive molecular visualization through a graphical user interface so that life scientists can easily access and profit from available structural data. It supports common structural file-formats (e.g. PDB, mmCIF) and a variety of molecular representations (e.g. 'cartoon, spacefill, licorice'). Moreover, the viewer can be embedded in other web sites to provide specialized visualizations of entries in structural databases or results of structure-related calculations. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Database resources of the National Center for Biotechnology Information
Wheeler, David L.; Barrett, Tanya; Benson, Dennis A.; Bryant, Stephen H.; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M.; DiCuccio, Michael; Edgar, Ron; Federhen, Scott; Geer, Lewis Y.; Helmberg, Wolfgang; Kapustin, Yuri; Kenton, David L.; Khovayko, Oleg; Lipman, David J.; Madden, Thomas L.; Maglott, Donna R.; Ostell, James; Pruitt, Kim D.; Schuler, Gregory D.; Schriml, Lynn M.; Sequeira, Edwin; Sherry, Stephen T.; Sirotkin, Karl; Souvorov, Alexandre; Starchenko, Grigory; Suzek, Tugba O.; Tatusov, Roman; Tatusova, Tatiana A.; Wagner, Lukas; Yaschenko, Eugene
2006-01-01
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI's Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups, Retroviral Genotyping Tools, HIV-1, Human Protein Interaction Database, SAGEmap, Gene Expression Omnibus, Entrez Probe, GENSAT, Online Mendelian Inheritance in Man, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of the resources can be accessed through the NCBI home page. PMID:16381840
Database resources of the National Center for Biotechnology Information.
Sayers, Eric W; Barrett, Tanya; Benson, Dennis A; Bolton, Evan; Bryant, Stephen H; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M; Dicuccio, Michael; Federhen, Scott; Feolo, Michael; Fingerman, Ian M; Geer, Lewis Y; Helmberg, Wolfgang; Kapustin, Yuri; Krasnov, Sergey; Landsman, David; Lipman, David J; Lu, Zhiyong; Madden, Thomas L; Madej, Tom; Maglott, Donna R; Marchler-Bauer, Aron; Miller, Vadim; Karsch-Mizrachi, Ilene; Ostell, James; Panchenko, Anna; Phan, Lon; Pruitt, Kim D; Schuler, Gregory D; Sequeira, Edwin; Sherry, Stephen T; Shumway, Martin; Sirotkin, Karl; Slotta, Douglas; Souvorov, Alexandre; Starchenko, Grigory; Tatusova, Tatiana A; Wagner, Lukas; Wang, Yanli; Wilbur, W John; Yaschenko, Eugene; Ye, Jian
2012-01-01
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Website. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Probe, Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
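As a small illustration of the Entrez Programming Utilities listed in this abstract, the sketch below builds an ESearch request URL using only the standard library. The base address and the `db`, `term`, `retmax` and `retmode` parameters follow the public E-utilities interface as commonly documented; the helper name `esearch_url` and the search term are illustrative, and the code only constructs the URL without contacting NCBI.

```python
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=20):
    """Return an ESearch URL for `term` in the Entrez database `db`."""
    query = urlencode({
        "db": db,           # Entrez database name, e.g. "pubmed"
        "term": term,       # search expression
        "retmax": retmax,   # maximum number of identifiers to return
        "retmode": "json",  # ask for a JSON response
    })
    return f"{EUTILS_BASE}/esearch.fcgi?{query}"


print(esearch_url("pubmed", "BIND database"))
```

Fetching the URL, for example with `urllib.request.urlopen`, would return the matching record identifiers; that step is left out so the example runs offline.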
Database resources of the National Center for Biotechnology Information
Acland, Abigail; Agarwala, Richa; Barrett, Tanya; Beck, Jeff; Benson, Dennis A.; Bollin, Colleen; Bolton, Evan; Bryant, Stephen H.; Canese, Kathi; Church, Deanna M.; Clark, Karen; DiCuccio, Michael; Dondoshansky, Ilya; Federhen, Scott; Feolo, Michael; Geer, Lewis Y.; Gorelenkov, Viatcheslav; Hoeppner, Marilu; Johnson, Mark; Kelly, Christopher; Khotomlianski, Viatcheslav; Kimchi, Avi; Kimelman, Michael; Kitts, Paul; Krasnov, Sergey; Kuznetsov, Anatoliy; Landsman, David; Lipman, David J.; Lu, Zhiyong; Madden, Thomas L.; Madej, Tom; Maglott, Donna R.; Marchler-Bauer, Aron; Karsch-Mizrachi, Ilene; Murphy, Terence; Ostell, James; O'Sullivan, Christopher; Panchenko, Anna; Phan, Lon; Preuss, Don; Pruitt, Kim D.; Rubinstein, Wendy; Sayers, Eric W.; Schneider, Valerie; Schuler, Gregory D.; Sequeira, Edwin; Sherry, Stephen T.; Shumway, Martin; Sirotkin, Karl; Siyan, Karanjit; Slotta, Douglas; Soboleva, Alexandra; Soussov, Vladimir; Starchenko, Grigory; Tatusova, Tatiana A.; Trawick, Bart W.; Vakatov, Denis; Wang, Yanli; Ward, Minghong; Wilbur, W. John; Yaschenko, Eugene; Zbicz, Kerry
2014-01-01
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, PubReader, Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link, Primer-BLAST, COBALT, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, the Genetic Testing Registry, Genome and related tools, the Map Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, ClinVar, MedGen, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Probe, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool, Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All these resources can be accessed through the NCBI home page. PMID:24259429
Liu, Jianling; Liu, Mengmeng; Yao, Yao; Wang, Jinan; Li, Yan; Li, Guohui; Wang, Yonghua
2012-01-01
Chitinolytic β-N-acetyl-d-hexosaminidases, a class of chitin hydrolysis enzymes in insects, are a potential species-specific target for developing environmentally friendly pesticides. Until now, pesticides targeting chitinolytic β-N-acetyl-d-hexosaminidase have not been developed. This study demonstrates a combination of different theoretical methods for investigating the key structural features of this enzyme responsible for pesticide inhibition, thus allowing for the discovery of novel small molecule inhibitors. Firstly, based on the currently reported crystal structure of this protein (OfHex1.pdb), we conducted a pre-screening of a drug-like compound database with 8 × 10^6 compounds by using the expanded pesticide-likeness criteria, followed by docking-based screening, obtaining 5 top-ranked compounds with favorable docking conformations in OfHex1. Secondly, molecular docking and molecular dynamics simulations were performed for the five complexes. They demonstrate that one main hydrophobic pocket, formed by residues Trp424, Trp448 and Trp524, is significant for stabilization of the ligand–receptor complex, and that key residues Asp477 and Trp490 are responsible for forming hydrogen-bonding and π–π stacking interactions with the ligands, respectively. Finally, the molecular mechanics Poisson–Boltzmann surface area (MM-PBSA) analysis indicates that van der Waals interactions are the main driving force for inhibitor binding, which agrees with the fact that the binding pocket of OfHex1 is mainly composed of hydrophobic residues. These results suggest that screening the ZINC database can maximize the identification of potential OfHex1 inhibitors and that the computational protocol will be valuable for screening potential inhibitors of the binding mode, which is useful for the future rational design of novel, potent OfHex1-specific pesticides. PMID:22605995
McKay, Dennis B; Chang, Cheng; González-Cestari, Tatiana F; McKay, Susan B; El-Hajj, Raed A; Bryant, Darrell L; Zhu, Michael X; Swaan, Peter W; Arason, Kristjan M; Pulipaka, Aravinda B; Orac, Crina M; Bergmeier, Stephen C
2007-05-01
As a novel approach to drug discovery involving neuronal nicotinic acetylcholine receptors (nAChRs), our laboratory targeted nonagonist binding sites (i.e., noncompetitive binding sites, negative allosteric binding sites) located on nAChRs. Cultured bovine adrenal cells were used as neuronal models to investigate interactions of 67 analogs of methyllycaconitine (MLA) on native alpha3beta4* nAChRs. The availability of large numbers of structurally related molecules presents a unique opportunity for the development of pharmacophore models for noncompetitive binding sites. Our MLA analogs inhibited nicotine-mediated functional activation of both native and recombinant alpha3beta4* nAChRs with a wide range of IC(50) values (0.9-115 microM). These analogs had little or no inhibitory effects on agonist binding to native or recombinant nAChRs, supporting noncompetitive inhibitory activity. Based on these data, two highly predictive 3D quantitative structure-activity relationship (comparative molecular field analysis and comparative molecular similarity index analysis) models were generated. These computational models were successfully validated and provided insights into the molecular interactions of MLA analogs with nAChRs. In addition, a pharmacophore model was constructed to analyze and visualize the binding requirements to the analog binding site. The pharmacophore model was subsequently applied to search structurally diverse molecular databases to prospectively identify novel inhibitors. The rapid identification of eight molecules from database mining and our successful demonstration of in vitro inhibitory activity support the utility of these computational models as novel tools for the efficient retrieval of inhibitors. These results demonstrate the effectiveness of computational modeling and pharmacophore development, which may lead to the identification of new therapeutic drugs that target novel sites on nAChRs.
Krishnaraj, R Navanietha; Chandran, Saravanan; Pal, Parimal; Berchmans, Sheela
2014-01-01
Carbon nanotubes (CNTs) are an interesting class of materials with a wide range of applications and excellent physical, chemical and electrical properties. Numerous reports describe the antiviral activities of carbon nanotubes, but the mechanism of antiviral action remains poorly understood. Herein we report our recent findings on the molecular interactions of carbon nanotubes with three key target proteins of HIV, obtained using a computational chemistry approach. Armchair, chiral and zigzag CNTs were modeled and used as ligands for the interaction studies. The structures of the key proteins involved in HIV-mediated infection, namely the HIV Vpr, Nef and Gag proteins, were collected from the PDB database. Docking studies were performed to quantify the interaction of the CNTs with the three disease targets. The results showed that the carbon nanotubes had high binding affinity for these proteins, which confirms the antagonistic molecular interaction of carbon nanotubes with the disease targets. The modeled armchair carbon nanotubes had binding affinities of -12.4 kcal/mol, -20 kcal/mol and -11.7 kcal/mol with the Vpr, Nef and Gag proteins of HIV, respectively. Chiral CNTs also had their maximum affinity, -16.4 kcal/mol, for Nef; their binding affinities for Vpr and Gag were -10.9 kcal/mol and -10.3 kcal/mol, respectively. The zigzag CNTs had binding affinities of -11.1 kcal/mol with Vpr, -18.3 kcal/mol with Nef and -10.9 kcal/mol with Gag. The strong molecular interactions suggest the efficacy of CNTs for targeting HIV-mediated retroviral infections.
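The docking scores quoted in this abstract can be tabulated to compare targets. The following is a minimal Python sketch that uses only the values given above; the dictionary layout and the function name are illustrative and do not come from the paper, and the unit of the zigzag/Gag value is assumed to be the same as for the other entries.

```python
# Binding affinities as quoted in the abstract (kcal/mol).
# More negative values indicate stronger predicted binding.
# Assumption: the zigzag/Gag value (-10.9) is also in kcal/mol.
AFFINITIES = {
    "armchair": {"Vpr": -12.4, "Nef": -20.0, "Gag": -11.7},
    "chiral": {"Vpr": -10.9, "Nef": -16.4, "Gag": -10.3},
    "zigzag": {"Vpr": -11.1, "Nef": -18.3, "Gag": -10.9},
}


def strongest_target(cnt_type):
    """Return the (target, affinity) pair with the most negative score."""
    scores = AFFINITIES[cnt_type]
    target = min(scores, key=scores.get)
    return target, scores[target]


for cnt_type in AFFINITIES:
    target, score = strongest_target(cnt_type)
    print(f"{cnt_type}: strongest predicted binding to {target} ({score} kcal/mol)")
```

For all three nanotube geometries the most negative score is the one reported for Nef, which matches the ordering stated in the abstract.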
Adequacy of damped dynamics to represent the electron-phonon interaction in solids
Caro, A.; Correa, A. A.; Tamm, A.; ...
2015-10-16
Time-dependent density functional theory and Ehrenfest dynamics are used to calculate the electronic excitations produced by a moving Ni ion in a Ni crystal in the case of energetic MeV range (electronic stopping power regime), as well as thermal energy meV range (electron-phonon interaction regime). Results at high energy compare well to experimental databases of stopping power, and at low energy the electron-phonon interaction strength determined in this way is very similar to the linear response calculation and experimental measurements. This approach to electron-phonon interaction as an electronic stopping process provides the basis for a unified framework to perform classical molecular dynamics of ion-solid interaction with ab initio type nonadiabatic terms in a wide range of energies.
Hung, Tzu-Chieh; Lee, Wen-Yuan; Chen, Kuen-Bao; Chan, Yueh-Chiu; Lee, Cheng-Chun
2014-01-01
Human histone deacetylase 2 (HDAC2) has been identified as being associated with Alzheimer's disease (AD), a neuropathic degenerative disease. In this study, we screen the world's largest Traditional Chinese Medicine (TCM) database for natural compounds that may be useful as lead compounds in the search for inhibitors of HDAC2 function. Molecular docking was employed to select the top ten TCM candidates. We used three prediction models, multiple linear regression (MLR), support vector machine (SVM), and the Bayes network toolbox (BNT), to predict the bioactivity of the TCM candidates. Molecular dynamics simulation was used to characterize the protein-ligand interactions of the compounds. The predicted pIC50 values suggest that the TCM candidates (−)-Bontl ferulate, monomethylcurcumin, and ningposides C have a greater effect on HDAC2 inhibition. The structure variation caused by the hydrogen bonds and hydrophobic interactions between protein and ligand indicates that these compounds have an inhibitory effect on the protein. PMID:25045700
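For readers unfamiliar with the pIC50 scale used in the bioactivity predictions, the standard definition is pIC50 = -log10(IC50 in mol/L), so a larger pIC50 corresponds to a more potent inhibitor. The sketch below only illustrates that conversion; the function name and the example concentrations are hypothetical and are not values reported in the study.

```python
import math


def pic50_from_micromolar(ic50_um):
    """Convert an IC50 given in micromolar to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_um <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_um * 1e-6)


# Hypothetical concentrations, chosen only to show the direction of the scale.
for ic50 in (0.1, 1.0, 10.0):
    print(f"IC50 = {ic50} uM -> pIC50 = {pic50_from_micromolar(ic50):.2f}")
```

A tenfold drop in IC50 raises pIC50 by one unit, which is why regression models are usually fitted on pIC50 rather than on raw concentrations.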
van der Lee, A; Rolland, M; Marat, X; Virieux, D; Volle, J N; Pirat, J L
2008-04-01
The structures of six cyclic oxazaphospholidines and three cyclic oxazaphosphinanes have been determined and their supramolecular structures have been compared. The molecules differ with respect to the functional groups attached to the central five- or six-membered rings, but have one phosphoryl group in common. The predominant feature in the supramolecular structures is the existence of relatively weak intermolecular phosphoryl XH...O=P (X = C, N) hydrogen bonds, creating in nearly all cases linear zigzag or double molecular chains. The molecular chains are in general linked to each other via very weak CH...pi or usual hydrogen-bond interactions. A survey of the Cambridge Structural Database on similar XH...O=P interactions shows a very large flexibility of the XH...O angle, which is in agreement with the DFT calculation reported elsewhere. The strength of the XH...O=P interaction can therefore be considered as relatively weak to moderately strong, and is expected to play at least a role in the formation of secondary substructures.
PoMaMo—a comprehensive database for potato genome data
Meyer, Svenja; Nagel, Axel; Gebhardt, Christiane
2005-01-01
A database for potato genome data (PoMaMo, Potato Maps and More) was established. The database contains molecular maps of all twelve potato chromosomes with about 1000 mapped elements, sequence data, putative gene functions, results from BLAST analysis, SNP and InDel information from different diploid and tetraploid potato genotypes, publication references, links to other public databases like GenBank (http://www.ncbi.nlm.nih.gov/) or SGN (Solanaceae Genomics Network, http://www.sgn.cornell.edu/), etc. Flexible search and data visualization interfaces enable easy access to the data via internet (https://gabi.rzpd.de/PoMaMo.html). The Java servlet tool YAMB (Yet Another Map Browser) was designed to interactively display chromosomal maps. Maps can be zoomed in and out, and detailed information about mapped elements can be obtained by clicking on an element of interest. The GreenCards interface allows a text-based data search by marker-, sequence- or genotype name, by sequence accession number, gene function, BLAST Hit or publication reference. The PoMaMo database is a comprehensive database for different potato genome data, and to date the only database containing SNP and InDel data from diploid and tetraploid potato genotypes. PMID:15608284
Database resources of the National Center for Biotechnology Information
Wheeler, David L.; Barrett, Tanya; Benson, Dennis A.; Bryant, Stephen H.; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M.; DiCuccio, Michael; Edgar, Ron; Federhen, Scott; Feolo, Michael; Geer, Lewis Y.; Helmberg, Wolfgang; Kapustin, Yuri; Khovayko, Oleg; Landsman, David; Lipman, David J.; Madden, Thomas L.; Maglott, Donna R.; Miller, Vadim; Ostell, James; Pruitt, Kim D.; Schuler, Gregory D.; Shumway, Martin; Sequeira, Edwin; Sherry, Steven T.; Sirotkin, Karl; Souvorov, Alexandre; Starchenko, Grigory; Tatusov, Roman L.; Tatusova, Tatiana A.; Wagner, Lukas; Yaschenko, Eugene
2008-01-01
In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data available through NCBI's web site. NCBI resources include Entrez, the Entrez Programming Utilities, My NCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link, Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genome, Genome Project and related tools, the Trace, Assembly, and Short Read Archives, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups, Influenza Viral Resources, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Entrez Probe, GENSAT, Database of Genotype and Phenotype, Online Mendelian Inheritance in Man, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool and the PubChem suite of small molecule databases. Augmenting the web applications are custom implementations of the BLAST program optimized to search specialized data sets. These resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. PMID:18045790
5SRNAdb: an information resource for 5S ribosomal RNAs.
Szymanski, Maciej; Zielezinski, Andrzej; Barciszewski, Jan; Erdmann, Volker A; Karlowski, Wojciech M
2016-01-04
Ribosomal 5S RNA (5S rRNA) is the ubiquitous RNA component found in the large subunit of ribosomes in all known organisms. Owing to its small size, abundance and evolutionary conservation, 5S rRNA has been used for many years as a model molecule in studies on RNA structure, RNA-protein interactions and molecular phylogeny. 5SRNAdb (http://combio.pl/5srnadb/) is the first database that provides a high-quality reference set of ribosomal 5S RNAs across the three domains of life. Here, we give an overview of new developments in the database and associated web tools since 2002, including updates to database content, curation processes and user web interfaces. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Tang, Cheng; Lan, Daoliang; Zhang, Huanrong; Ma, Jing; Yue, Hua
2013-01-01
Duck is an economically important poultry and animal model for human viral hepatitis B. However, the molecular mechanisms underlying host-virus interaction remain unclear because of limited information on the duck genome. This study aims to characterize the duck normal liver transcriptome and to identify the differentially expressed transcripts at 24 h after duck hepatitis A virus genotype C (DHAV-C) infection using Illumina-Solexa sequencing. After removal of low-quality sequences and assembly, a total of 52,757 unigenes was obtained from the normal liver group. Further blast analysis showed that 18,918 unigenes successfully matched the known genes in the database. GO analysis revealed that 25,116 unigenes took part in 61 categories of biological processes, cellular components, and molecular functions. Among the 25 clusters of orthologous group categories (COG), the cluster for "General function prediction only" represented the largest group, followed by "Transcription" and "Replication, recombination, and repair." KEGG analysis showed that 17,628 unigenes were involved in 301 pathways. Through comparison of normal and infected transcriptome data, we identified 20 significantly differentially expressed unigenes, which were further confirmed by real-time polymerase chain reaction. Of the 20 unigenes, nine matched the known genes in the database, including three up-regulated genes (virus replicase polyprotein, LRRC3B, and PCK1) and six down-regulated genes (CRP, AICL-like 2, L1CAM, CYB26A1, CHAC1, and ADAM32). The remaining 11 novel unigenes that did not match any known genes in the database may provide a basis for the discovery of new transcripts associated with infection. This study provided a gene expression pattern for normal duck liver and for the previously unrecognized changes in gene transcription that are altered during DHAV-C infection. 
Our data revealed useful information for future studies on the duck genome and provided new insights into the molecular mechanism of host-DHAV-C interaction.
2012-01-01
Background In the scientific biodiversity community, the need to build a bridge between molecular and traditional biodiversity studies is increasingly perceived. We believe that information technology could have a preeminent role in integrating the information generated by these studies with the large amount of molecular data found in public bioinformatics databases. This work is primarily aimed at building a bioinformatic infrastructure for the integration of public and private biodiversity data through the development of GIDL, an Intelligent Data Loader coupled with the Molecular Biodiversity Database. The system presented here organizes ontologically, and stores locally, the sequence and annotation data contained in the GenBank primary database. Methods The GIDL architecture consists of a relational database and intelligent data loader software. The relational database schema is designed to manage biodiversity information (Molecular Biodiversity Database) and is organized in four areas: MolecularData, Experiment, Collection and Taxonomy. The MolecularData area is inspired by an established standard in Generic Model Organism Databases, the Chado relational schema. The peculiarity of Chado, and also its strength, is the adoption of an ontological schema which makes use of the Sequence Ontology. The Intelligent Data Loader (IDL) component of GIDL is Extract, Transform and Load software able to parse data, to discover hidden information in the GenBank entries and to populate the Molecular Biodiversity Database. The IDL is composed of three main modules: the Parser, which parses GenBank flat files; the Reasoner, which automatically builds CLIPS facts mapping the biological knowledge expressed by the Sequence Ontology; and the DBFiller, which translates the CLIPS facts into ordered SQL statements used to populate the database. 
In GIDL Semantic Web technologies have been adopted due to their advantages in data representation, integration and processing. Results and conclusions Entries coming from Virus (814,122), Plant (1,365,360) and Invertebrate (959,065) divisions of GenBank rel.180 have been loaded in the Molecular Biodiversity Database by GIDL. Our system, combining the Sequence Ontology and the Chado schema, allows a more powerful query expressiveness compared with the most commonly used sequence retrieval systems like Entrez or SRS. PMID:22536971
Relax with CouchDB - Into the non-relational DBMS era of Bioinformatics
Manyam, Ganiraju; Payton, Michelle A.; Roth, Jack A.; Abruzzo, Lynne V.; Coombes, Kevin R.
2012-01-01
With the proliferation of high-throughput technologies, genome-level data analysis has become common in molecular biology. Bioinformaticians are developing extensive resources to annotate and mine biological features from high-throughput data. The underlying database management systems for most bioinformatics software are based on a relational model. Modern non-relational databases offer an alternative that has flexibility, scalability, and a non-rigid design schema. Moreover, with an accelerated development pace, non-relational databases like CouchDB can be ideal tools to construct bioinformatics utilities. We describe CouchDB by presenting three new bioinformatics resources: (a) geneSmash, which collates data from bioinformatics resources and provides automated gene-centric annotations, (b) drugBase, a database of drug-target interactions with a web interface powered by geneSmash, and (c) HapMap-CN, which provides a web interface to query copy number variations from three SNP-chip HapMap datasets. In addition to the web sites, all three systems can be accessed programmatically via web services. PMID:22609849
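The "non-rigid design schema" mentioned above comes from CouchDB storing each record as a self-contained JSON document that is addressed over HTTP at `/{database}/{document id}`. The following is a minimal Python sketch of that idea. The database name, the field names and the two example records are assumptions made for illustration; they are not the actual geneSmash or drugBase schema, and no request is sent.

```python
import json
from urllib.parse import quote


def doc_url(server, database, doc_id):
    """Build the URL at which CouchDB addresses one document: /{db}/{docid}."""
    return f"{server.rstrip('/')}/{quote(database, safe='')}/{quote(doc_id, safe='')}"


# Hypothetical gene-centric documents. The two records carry different
# fields, which a document store accepts without any schema migration.
docs = [
    {"_id": "TP53", "aliases": ["p53"], "chromosome": "17"},
    {"_id": "MYC", "chromosome": "8", "notes": "free-text annotation"},
]

for doc in docs:
    # 5984 is CouchDB's default port; "genes" is an assumed database name.
    print(doc_url("http://localhost:5984", "genes", doc["_id"]))
    print(json.dumps(doc, sort_keys=True))
```

In a relational design the second record would need either a new column or a separate table for its extra field, which is the rigidity the abstract contrasts against.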
Chemical Space: Big Data Challenge for Molecular Diversity.
Awale, Mahendra; Visini, Ricardo; Probst, Daniel; Arús-Pous, Josep; Reymond, Jean-Louis
2017-10-25
Chemical space describes all possible molecules as well as multi-dimensional conceptual spaces representing the structural diversity of these molecules. Part of this chemical space is available in public databases ranging from thousands to billions of compounds. Exploiting these databases for drug discovery represents a typical big data problem limited by computational power, data storage and data access capacity. Here we review recent developments of our laboratory, including progress in the chemical universe databases (GDB) and the fragment subset FDB-17, tools for ligand-based virtual screening by nearest neighbor searches, such as our multi-fingerprint browser for the ZINC database to select purchasable screening compounds, and their application to discover potent and selective inhibitors for calcium channel TRPV6 and Aurora A kinase, the polypharmacology browser (PPB) for predicting off-target effects, and finally interactive 3D-chemical space visualization using our online tools WebDrugCS and WebMolCS. All resources described in this paper are available for public use at www.gdb.unibe.ch.
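Ligand-based virtual screening by nearest-neighbor search can be illustrated with a small Python sketch. It ranks library molecules by the similarity of their fingerprints to a query. The abstract does not state which similarity measure the tools use, so the Tanimoto coefficient here is an assumption, and the fingerprints are toy sets of on-bit indices rather than real molecular descriptors.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity of two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def nearest_neighbors(query, library, k=2):
    """Return the k library entries most similar to the query, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy library: hypothetical molecule names and on-bit indices.
library = {
    "mol_a": {1, 2, 3, 4},
    "mol_b": {1, 2, 7},
    "mol_c": {8, 9},
}
query = {1, 2, 3}

print(nearest_neighbors(query, library))
```

A linear scan like this is only practical for small libraries; searching databases of millions to billions of compounds, as described in the abstract, requires indexing or dimensionality reduction on top of the same similarity idea.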
Search extension transforms Wiki into a relational system: a case for flavonoid metabolite database.
Arita, Masanori; Suwa, Kazuhiro
2008-09-17
In computer science, database systems are based on the relational model founded by Edgar Codd in 1970. On the other hand, in the area of biology the word 'database' often refers to loosely formatted, very large text files. Although such bio-databases may describe conflicts or ambiguities (e.g. a protein pair do and do not interact, or unknown parameters) in a positive sense, the flexibility of the data format sacrifices a systematic query mechanism equivalent to the widely used SQL. To overcome this disadvantage, we propose embeddable string-search commands on a Wiki-based system and designed a half-formatted database. As proof of principle, a database of flavonoid with 6902 molecular structures from over 1687 plant species was implemented on MediaWiki, the background system of Wikipedia. Registered users can describe any information in an arbitrary format. Structured part is subject to text-string searches to realize relational operations. The system was written in PHP language as the extension of MediaWiki. All modifications are open-source and publicly available. This scheme benefits from both the free-formatted Wiki style and the concise and structured relational-database style. MediaWiki supports multi-user environments for document management, and the cost for database maintenance is alleviated.
Search extension transforms Wiki into a relational system: A case for flavonoid metabolite database
Arita, Masanori; Suwa, Kazuhiro
2008-01-01
Background In computer science, database systems are based on the relational model founded by Edgar Codd in 1970. On the other hand, in the area of biology the word 'database' often refers to loosely formatted, very large text files. Although such bio-databases may describe conflicts or ambiguities (e.g. a protein pair do and do not interact, or unknown parameters) in a positive sense, the flexibility of the data format sacrifices a systematic query mechanism equivalent to the widely used SQL. Results To overcome this disadvantage, we propose embeddable string-search commands on a Wiki-based system and designed a half-formatted database. As proof of principle, a database of flavonoid with 6902 molecular structures from over 1687 plant species was implemented on MediaWiki, the background system of Wikipedia. Registered users can describe any information in an arbitrary format. Structured part is subject to text-string searches to realize relational operations. The system was written in PHP language as the extension of MediaWiki. All modifications are open-source and publicly available. Conclusion This scheme benefits from both the free-formatted Wiki style and the concise and structured relational-database style. MediaWiki supports multi-user environments for document management, and the cost for database maintenance is alleviated. PMID:18822113
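The idea of realizing relational operations through text-string searches on half-formatted pages can be sketched as follows. This Python example is only an analogy: the actual system is a PHP extension of MediaWiki, and the record layout, field names and helper functions below are hypothetical. Each record has a structured `key=value` prefix followed by free text, and selection and join are implemented purely by string handling.

```python
# Hypothetical half-formatted records: structured prefix, then free text after "|".
compounds = [
    "id=F1; name=quercetin; class=flavonol | notes in any arbitrary format",
    "id=F2; name=naringenin; class=flavanone | another free-form description",
]
occurrences = [
    "compound=F1; species=Allium cepa",
    "compound=F2; species=Citrus paradisi",
    "compound=F1; species=Camellia sinensis",
]


def fields(line):
    """Parse the structured part of a record into a dict, ignoring free text."""
    structured = line.split("|", 1)[0]
    pairs = (part.strip().split("=", 1) for part in structured.split(";") if "=" in part)
    return {key.strip(): value.strip() for key, value in pairs}


def select(lines, key, value):
    """Relational selection: keep records whose field equals the given value."""
    return [rec for rec in map(fields, lines) if rec.get(key) == value]


def join(left, right, left_key, right_key):
    """Relational join: pair records whose key fields match."""
    return [
        {**l, **r}
        for l in map(fields, left)
        for r in map(fields, right)
        if l.get(left_key) == r.get(right_key)
    ]


print(select(compounds, "class", "flavonol"))
for row in join(compounds, occurrences, "id", "compound"):
    print(row["name"], "-", row["species"])
```

The free-text portion after the separator is never parsed, which mirrors the design goal of keeping arbitrary descriptions alongside a searchable structured part.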
sc-PDB: a 3D-database of ligandable binding sites—10 years on
Desaphy, Jérémy; Bret, Guillaume; Rognan, Didier; Kellenberger, Esther
2015-01-01
The sc-PDB database (available at http://bioinfo-pharma.u-strasbg.fr/scPDB/) is a comprehensive and up-to-date selection of ligandable binding sites of the Protein Data Bank. Sites are defined from complexes between a protein and a pharmacological ligand. The database provides the all-atom description of the protein, its ligand, their binding site and their binding mode. Currently, the sc-PDB archive registers 9283 binding sites from 3678 unique proteins and 5608 unique ligands. The sc-PDB database was publicly launched in 2004 with the aim of providing structure files suitable for computational approaches to drug design, such as docking. During the last 10 years we have improved and standardized the processes for (i) identifying binding sites, (ii) correcting structures, (iii) annotating protein function and ligand properties and (iv) characterizing their binding mode. This paper presents the latest enhancements in the database, specifically pertaining to the representation of molecular interaction and to the similarity between ligand/protein binding patterns. The new website puts emphasis in pictorial analysis of data. PMID:25300483
iDrug: a web-accessible and interactive drug discovery and design platform
2014-01-01
Background The progress in computer-aided drug design (CADD) approaches over the past decades accelerated the early-stage pharmaceutical research. Many powerful standalone tools for CADD have been developed in academia. As programs are developed by various research groups, a consistent user-friendly online graphical working environment, combining computational techniques such as pharmacophore mapping, similarity calculation, scoring, and target identification is needed. Results We presented a versatile, user-friendly, and efficient online tool for computer-aided drug design based on pharmacophore and 3D molecular similarity searching. The web interface enables binding sites detection, virtual screening hits identification, and drug targets prediction in an interactive manner through a seamless interface to all adapted packages (e.g., Cavity, PocketV.2, PharmMapper, SHAFTS). Several commercially available compound databases for hit identification and a well-annotated pharmacophore database for drug targets prediction were integrated in iDrug as well. The web interface provides tools for real-time molecular building/editing, converting, displaying, and analyzing. All the customized configurations of the functional modules can be accessed through featured session files provided, which can be saved to the local disk and uploaded to resume or update the history work. Conclusions iDrug is easy to use, and provides a novel, fast and reliable tool for conducting drug design experiments. By using iDrug, various molecular design processing tasks can be submitted and visualized simply in one browser without installing locally any standalone modeling softwares. iDrug is accessible free of charge at http://lilab.ecust.edu.cn/idrug. PMID:24955134
2014-05-14
[Table fragment: empirical formula and molecular weight for transfluthrin, permethrin, DDT and DEET. Transfluthrin: C15H12Cl2F4O2 (74), 371.2 (74). Permethrin: C21H20Cl2O3 (80), 391.3 (53... DDT: C14H9Cl5 (72).] ...oil lamp to vaporize transfluthrin. Medical and Veterinary Entomology 16:277-84. 65. Pesticide Target Interaction Database. 2012. Transfluthrin
Hu, Wei Qi; Wang, Wei; Fang, Di Long; Yin, Xue Feng
2018-05-24
BACKGROUND We screened the potential molecular targets and investigated the molecular mechanisms of hepatocellular carcinoma (HCC). MATERIAL AND METHODS Microarray data of GSE47786, comprising samples of the HepG2 human hepatoma cell line treated with 40 μM berberine and control cells treated with 0.08% DMSO, were downloaded from the GEO database. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed; the protein-protein interaction (PPI) networks were constructed using the STRING database and Cytoscape; the genetic alterations, neighboring gene networks, and survival analysis of hub genes were explored with the cBio portal; and the mRNA expression levels of hub genes were obtained from the Oncomine databases. RESULTS A total of 56 upregulated and 8 downregulated differentially expressed genes (DEGs) were identified. In the GO analysis, the DEGs were significantly enriched in cell-cycle arrest, regulation of transcription, DNA-dependent, protein amino acid phosphorylation, cell cycle, and apoptosis. The KEGG pathway analysis showed that DEGs were enriched in the MAPK signaling pathway, ErbB signaling pathway, and p53 signaling pathway. JUN, EGR1, MYC, and CDKN1A were identified as hub genes in the PPI networks. The genetic alterations of the hub genes were mainly amplifications. TP53, NDRG1, and MAPK15 were found in the neighboring gene networks. Cases with altered genes had worse overall survival and disease-free survival than cases with unaltered genes. The expression of EGR1, MYC, and CDKN1A was significantly increased, but expression of JUN was not, in the Roessler Liver datasets. CONCLUSIONS We found that JUN, EGR1, MYC, and CDKN1A might be used as diagnostic and therapeutic molecular biomarkers, and these findings broaden our understanding of the molecular mechanisms of HCC.
Chen, Lei; Zhang, Yu-Hang; Zheng, Mingyue; Huang, Tao; Cai, Yu-Dong
2016-12-01
Compound-protein interactions play important roles in every cell via the recognition and regulation of specific functional proteins. The correct identification of compound-protein interactions can lead to a good comprehension of this complicated system and provide useful input for the investigation of various attributes of compounds and proteins. In this study, we attempted to understand this system by extracting properties from both proteins and compounds, in which proteins were represented by gene ontology and KEGG pathway enrichment scores and compounds were represented by molecular fragments. Advanced feature selection methods, including minimum redundancy maximum relevance, incremental feature selection, and the basic machine learning algorithm random forest, were used to analyze these properties and extract core factors for the determination of actual compound-protein interactions. Compound-protein interactions reported in The Binding Databases were used as positive samples. To improve the reliability of the results, the analytic procedure was executed five times using different negative samples. Simultaneously, five optimal prediction methods based on a random forest and yielding maximum MCCs of approximately 77.55 % were constructed and may be useful tools for the prediction of compound-protein interactions. This work provides new clues to understanding the system of compound-protein interactions by analyzing extracted core features. Our results indicate that compound-protein interactions are related to biological processes involving immune, developmental and hormone-associated pathways.
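The performance figure quoted above is the Matthews correlation coefficient (MCC), which summarizes a binary confusion matrix in a single value between -1 and +1; the abstract reports it as a percentage, so 77.55 % corresponds to an MCC of about 0.7755. The following Python sketch shows the standard formula. The confusion-matrix counts in the example are hypothetical and are not taken from the study.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    Returns 0.0 when the denominator is zero (a common convention).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts for a classifier evaluated on 200 samples.
print(f"MCC = {mcc(tp=90, tn=85, fp=15, fn=10):.4f}")
```

Unlike plain accuracy, MCC stays informative when positive and negative samples are imbalanced, which matters here because the negative compound-protein pairs were sampled rather than observed.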
Toseland, Christopher P; Clayton, Debra J; McSparron, Helen; Hemsley, Shelley L; Blythe, Martin J; Paine, Kelly; Doytchinova, Irini A; Guan, Pingping; Hattotuwagama, Channa K; Flower, Darren R
2005-01-01
AntiJen is a database system focused on the integration of kinetic, thermodynamic, functional, and cellular data within the context of immunology and vaccinology. Compared to its progenitor JenPep, the interface has been completely rewritten and redesigned and now offers a wider variety of search methods, including a nucleotide and a peptide BLAST search. In terms of data archived, AntiJen has a richer and more complete breadth, depth, and scope, and this has seen the database increase to over 31,000 entries. AntiJen provides the most complete and up-to-date dataset of its kind. While AntiJen v2.0 retains a focus on both T cell and B cell epitopes, its greatest novelty is the archiving of continuous quantitative data on a variety of immunological molecular interactions. This includes thermodynamic and kinetic measures of peptide binding to TAP and the Major Histocompatibility Complex (MHC), peptide-MHC complexes binding to T cell receptors, antibodies binding to protein antigens and general immunological protein-protein interactions. The database also contains quantitative specificity data from position-specific peptide libraries and biophysical data, in the form of diffusion co-efficients and cell surface copy numbers, on MHCs and other immunological molecules. The uses of AntiJen include the design of vaccines and diagnostics, such as tetramers, and other laboratory reagents, as well as helping parameterize the bioinformatic or mathematical in silico modeling of the immune system. The database is accessible from the URL: . PMID:16305757
NASA Astrophysics Data System (ADS)
Scharberg, Maureen A.; Cox, Oran E.; Barelli, Carl A.
1997-07-01
"The Molecule of the Day" consumer chemical database has been created to allow introductory chemistry students to explore molecular structures of chemicals in household products, and to provide opportunities in molecular modeling for undergraduate chemistry students. Before class begins, an overhead transparency is displayed which shows a three-dimensional molecular structure of a household chemical, and lists relevant features and uses of this chemical. Within answers to questionnaires, students have commented that this molecular graphics database has helped them to visually connect the microscopic structure of a molecule with its physical and chemical properties, as well as its uses in consumer products. It is anticipated that this database will be incorporated into a navigational software package such as Netscape.
Improved Infrastructure for CDMS and JPL Molecular Spectroscopy Catalogues
NASA Astrophysics Data System (ADS)
Endres, Christian; Schlemmer, Stephan; Drouin, Brian; Pearson, John; Müller, Holger S. P.; Schilke, P.; Stutzki, Jürgen
2014-06-01
Over the past years a new infrastructure for atomic and molecular databases has been developed within the framework of the Virtual Atomic and Molecular Data Centre (VAMDC). Standards for the representation of atomic and molecular data, as well as a set of protocols, have been established; these now make it possible to retrieve data from various databases through one portal and to combine the data easily. Apart from spectroscopic databases such as the Cologne Database for Molecular Spectroscopy (CDMS), the Jet Propulsion Laboratory microwave, millimeter and submillimeter spectral line catalogue (JPL) and the HITRAN database, various databases on molecular collisions (BASECOL, KIDA) and reactions (UMIST) are connected. Together with other groups within the VAMDC consortium we are working on common user tools to simplify access for new customers and to tailor data requests for users with specific needs. This comprises in particular tools to support the analysis of complex observational data obtained with the ALMA telescope. In this presentation, requests to CDMS and JPL will be used to explain the basic concepts and the tools provided by VAMDC. In addition, a new portal to CDMS will be presented which has a number of new features, in particular meaningful quantum numbers, references linked to data points, access to state energies and improved documentation. Fit files are accessible for download and queries to other databases are possible.
RNA Bricks—a database of RNA 3D motifs and their interactions
Chojnowski, Grzegorz; Waleń, Tomasz; Bujnicki, Janusz M.
2014-01-01
The RNA Bricks database (http://iimcb.genesilico.pl/rnabricks), stores information about recurrent RNA 3D motifs and their interactions, found in experimentally determined RNA structures and in RNA–protein complexes. In contrast to other similar tools (RNA 3D Motif Atlas, RNA Frabase, Rloom) RNA motifs, i.e. ‘RNA bricks’ are presented in the molecular environment, in which they were determined, including RNA, protein, metal ions, water molecules and ligands. All nucleotide residues in RNA bricks are annotated with structural quality scores that describe real-space correlation coefficients with the electron density data (if available), backbone geometry and possible steric conflicts, which can be used to identify poorly modeled residues. The database is also equipped with an algorithm for 3D motif search and comparison. The algorithm compares spatial positions of backbone atoms of the user-provided query structure and of stored RNA motifs, without relying on sequence or secondary structure information. This enables the identification of local structural similarities among evolutionarily related and unrelated RNA molecules. Besides, the search utility enables searching ‘RNA bricks’ according to sequence similarity, and makes it possible to identify motifs with modified ribonucleotide residues at specific positions. PMID:24220091
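The abstract above describes a sequence-independent comparison of backbone atom positions but does not name the algorithm. As a rough illustration only, the sketch below uses a swapped-in, superposition-free measure (distance RMSD over all pairwise backbone-atom distances); the function names and toy coordinates are hypothetical and are not taken from RNA Bricks.

```python
import math
from itertools import combinations

def pairwise_distances(coords):
    """Return all pairwise Euclidean distances between 3D points, in a fixed order."""
    return [math.dist(p, q) for p, q in combinations(coords, 2)]

def drmsd(coords_a, coords_b):
    """Distance RMSD between two equally sized, equally ordered sets of atoms.

    Assumption: atoms are already paired by index; no sequence information is used.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("both structures need the same number of atoms")
    if len(coords_a) < 2:
        raise ValueError("at least two atoms are required")
    da = pairwise_distances(coords_a)
    db = pairwise_distances(coords_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(da, db)) / len(da))

# Hypothetical backbone atom positions (arbitrary units).
query = [(0, 0, 0), (3, 0, 0), (3, 4, 0), (0, 4, 5)]
# The same motif rotated 90 degrees about z and translated: internal distances are unchanged.
moved = [(-y + 10, x + 10, z) for x, y, z in query]
# A stretched copy: internal distances differ.
stretched = [(2 * x, y, z) for x, y, z in query]

same_score = drmsd(query, moved)
diff_score = drmsd(query, stretched)
```

Because only internal distances are compared, the score ignores rigid-body placement, which is convenient for a sketch; it cannot tell mirror images apart, so a production search would likely need an explicit superposition step.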
Franke, Lude; Bakel, Harm van; Fokkens, Like; de Jong, Edwin D.; Egmont-Petersen, Michael; Wijmenga, Cisca
2006-01-01
Most common genetic disorders have a complex inheritance and may result from variants in many genes, each contributing only weak effects to the disease. Pinpointing these disease genes within the myriad of susceptibility loci identified in linkage studies is difficult because these loci may contain hundreds of genes. However, in any disorder, most of the disease genes will be involved in only a few different molecular pathways. If we know something about the relationships between the genes, we can assess whether some genes (which may reside in different loci) functionally interact with each other, indicating a joint basis for the disease etiology. There are various repositories of information on pathway relationships. To consolidate this information, we developed a functional human gene network that integrates information on genes and the functional relationships between genes, based on data from the Kyoto Encyclopedia of Genes and Genomes, the Biomolecular Interaction Network Database, Reactome, the Human Protein Reference Database, the Gene Ontology database, predicted protein-protein interactions, human yeast two-hybrid interactions, and microarray coexpressions. We applied this network to interrelate positional candidate genes from different disease loci and then tested 96 heritable disorders for which the Online Mendelian Inheritance in Man database reported at least three disease genes. Artificial susceptibility loci, each containing 100 genes, were constructed around each disease gene, and we used the network to rank these genes on the basis of their functional interactions. By following up the top five genes per artificial locus, we were able to detect at least one known disease gene in 54% of the loci studied, representing a 2.8-fold increase over random selection. 
This suggests that our method can significantly reduce the cost and effort of pinpointing true disease genes in analyses of disorders for which numerous loci have been reported but for which most of the genes are unknown. PMID:16685651
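The ranking idea in this abstract (prioritising positional candidates that functionally interact with genes in other loci) can be sketched as follows. This is a minimal illustration under stated assumptions: the scoring rule (count of direct neighbours in other loci), the function name and the toy data are all hypothetical, and the published method may weight or propagate interactions differently.

```python
from collections import defaultdict

def rank_candidates(loci, edges):
    """Rank genes within each locus by how many network neighbours lie in other loci.

    loci:  dict mapping locus name -> list of gene names (each gene in one locus only)
    edges: iterable of (gene_a, gene_b) functional interactions
    """
    locus_of = {gene: name for name, genes in loci.items() for gene in genes}
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    scores = {}
    for gene, locus in locus_of.items():
        # Count only neighbours that are candidates in a *different* locus.
        scores[gene] = sum(
            1 for n in neighbours[gene] if n in locus_of and locus_of[n] != locus
        )

    # Highest score first; ties broken alphabetically for a stable order.
    return {
        name: sorted(genes, key=lambda g: (-scores[g], g))
        for name, genes in loci.items()
    }

# Toy example with made-up gene names.
loci = {"locus1": ["A", "B", "C"], "locus2": ["D", "E"], "locus3": ["F", "G"]}
edges = [("A", "D"), ("A", "F"), ("B", "C"), ("D", "F"), ("E", "B")]
ranking = rank_candidates(loci, edges)
```

In this toy network, genes A, D and F interact with each other across three loci and therefore rise to the top of their respective lists, while the within-locus link between B and C contributes nothing.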
Xia, Kai; Dong, Dong; Han, Jing-Dong J
2006-01-01
Background Although protein-protein interaction (PPI) networks have been explored by various experimental methods, the maps so built are still limited in coverage and accuracy. To further expand the PPI network and to extract more accurate information from existing maps, studies have been carried out to integrate various types of functional relationship data. A frequently updated database of computationally analyzed potential PPIs to provide biological researchers with rapid and easy access to analyze original data as a biological network is still lacking. Results By applying a probabilistic model, we integrated 27 heterogeneous genomic, proteomic and functional annotation datasets to predict PPI networks in human. In addition to previously studied data types, we show that phenotypic distances and genetic interactions can also be integrated to predict PPIs. We further built an easy-to-use, updatable integrated PPI database, the Integrated Network Database (IntNetDB) online, to provide automatic prediction and visualization of PPI network among genes of interest. The networks can be visualized in SVG (Scalable Vector Graphics) format for zooming in or out. IntNetDB also provides a tool to extract topologically highly connected network neighborhoods from a specific network for further exploration and research. Using the MCODE (Molecular Complex Detections) algorithm, 190 such neighborhoods were detected among all the predicted interactions. The predicted PPIs can also be mapped to worm, fly and mouse interologs. Conclusion IntNetDB includes 180,010 predicted protein-protein interactions among 9,901 human proteins and represents a useful resource for the research community. Our study has increased prediction coverage by five-fold. IntNetDB also provides easy-to-use network visualization and analysis tools that allow biological researchers unfamiliar with computational biology to access and analyze data over the internet. 
The web interface of IntNetDB is freely accessible at . Visualization requires Mozilla version 1.8 (or higher) or Internet Explorer with installation of SVGviewer. PMID:17112386
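The abstract states only that a probabilistic model was used to integrate heterogeneous datasets. One common way to do this, shown here purely as an assumed illustration, is a naive Bayes style combination in which each evidence source contributes a likelihood ratio that multiplies the prior odds of interaction. The prior and the likelihood ratios below are invented numbers, not values from IntNetDB.

```python
import math

def posterior_odds(prior_odds, likelihood_ratios):
    """Combine independent evidence sources by multiplying likelihood ratios.

    Works in log space so that many small or large ratios stay numerically stable.
    """
    log_odds = math.log(prior_odds) + sum(math.log(lr) for lr in likelihood_ratios)
    return math.exp(log_odds)

def odds_to_probability(odds):
    """Convert odds to a probability in [0, 1)."""
    return odds / (1.0 + odds)

# Hypothetical inputs: prior odds that a random protein pair interacts,
# and likelihood ratios from three evidence types (e.g. coexpression,
# shared annotation, interolog support).
prior = 1 / 600
evidence = [20.0, 15.0, 4.0]

odds = posterior_odds(prior, evidence)
probability = odds_to_probability(odds)
```

With these made-up numbers the combined odds are 2 to 1, which corresponds to a probability of about 0.67. The independence assumption behind multiplying the ratios is a simplification that correlated datasets would violate.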
Database resources of the National Center for Biotechnology Information
Sayers, Eric W.; Barrett, Tanya; Benson, Dennis A.; Bolton, Evan; Bryant, Stephen H.; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M.; DiCuccio, Michael; Federhen, Scott; Feolo, Michael; Fingerman, Ian M.; Geer, Lewis Y.; Helmberg, Wolfgang; Kapustin, Yuri; Krasnov, Sergey; Landsman, David; Lipman, David J.; Lu, Zhiyong; Madden, Thomas L.; Madej, Tom; Maglott, Donna R.; Marchler-Bauer, Aron; Miller, Vadim; Karsch-Mizrachi, Ilene; Ostell, James; Panchenko, Anna; Phan, Lon; Pruitt, Kim D.; Schuler, Gregory D.; Sequeira, Edwin; Sherry, Stephen T.; Shumway, Martin; Sirotkin, Karl; Slotta, Douglas; Souvorov, Alexandre; Starchenko, Grigory; Tatusova, Tatiana A.; Wagner, Lukas; Wang, Yanli; Wilbur, W. John; Yaschenko, Eugene; Ye, Jian
2012-01-01
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Website. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Probe, Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. PMID:22140104
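The Entrez Programming Utilities mentioned above are reached through plain HTTP requests. The sketch below only builds an ESearch request URL and makes no network call; the helper name `build_esearch_url` and the search term are illustrative assumptions, and usage policies (rate limits, API keys) should be checked against the current NCBI documentation.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Base address of the E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def build_esearch_url(db, term, retmax=20):
    """Return an ESearch URL for the given Entrez database and query term."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"

# Hypothetical query: PubMed records mentioning an interaction database.
url = build_esearch_url("pubmed", "biomolecular interaction network database", retmax=5)
query = parse_qs(urlsplit(url).query)
```

The resulting URL can be fetched with any HTTP client; the identifiers it returns are typically passed on to other utilities such as EFetch or ESummary.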
Database resources of the National Center for Biotechnology Information
2013-01-01
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, the Genetic Testing Registry, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Probe, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool, Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page. PMID:23193264
Database resources of the National Center for Biotechnology Information.
Wheeler, David L; Barrett, Tanya; Benson, Dennis A; Bryant, Stephen H; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M; DiCuccio, Michael; Edgar, Ron; Federhen, Scott; Geer, Lewis Y; Kapustin, Yuri; Khovayko, Oleg; Landsman, David; Lipman, David J; Madden, Thomas L; Maglott, Donna R; Ostell, James; Miller, Vadim; Pruitt, Kim D; Schuler, Gregory D; Sequeira, Edwin; Sherry, Steven T; Sirotkin, Karl; Souvorov, Alexandre; Starchenko, Grigory; Tatusov, Roman L; Tatusova, Tatiana A; Wagner, Lukas; Yaschenko, Eugene
2007-01-01
In addition to maintaining the GenBank nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI's Web site. NCBI resources include Entrez, the Entrez Programming Utilities, My NCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link(BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genome, Genome Project and related tools, the Trace and Assembly Archives, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs), Viral Genotyping Tools, Influenza Viral Resources, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART) and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. These resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
Database resources of the National Center for Biotechnology Information.
Sayers, Eric W; Barrett, Tanya; Benson, Dennis A; Bryant, Stephen H; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M; DiCuccio, Michael; Edgar, Ron; Federhen, Scott; Feolo, Michael; Geer, Lewis Y; Helmberg, Wolfgang; Kapustin, Yuri; Landsman, David; Lipman, David J; Madden, Thomas L; Maglott, Donna R; Miller, Vadim; Mizrachi, Ilene; Ostell, James; Pruitt, Kim D; Schuler, Gregory D; Sequeira, Edwin; Sherry, Stephen T; Shumway, Martin; Sirotkin, Karl; Souvorov, Alexandre; Starchenko, Grigory; Tatusova, Tatiana A; Wagner, Lukas; Yaschenko, Eugene; Ye, Jian
2009-01-01
In addition to maintaining the GenBank nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs), Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART) and the PubChem suite of small molecule databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
PlaMoM: a comprehensive database compiles plant mobile macromolecules
Guan, Daogang; Yan, Bin; Thieme, Christoph; Hua, Jingmin; Zhu, Hailong; Boheler, Kenneth R.; Zhao, Zhongying; Kragler, Friedrich; Xia, Yiji; Zhang, Shoudong
2017-01-01
In plants, various phloem-mobile macromolecules including noncoding RNAs, mRNAs and proteins are suggested to act as important long-distance signals in regulating crucial physiological and morphological transition processes such as flowering, plant growth and stress responses. Given recent advances in high-throughput sequencing technologies, numerous mobile macromolecules have been identified in diverse plant species from different plant families. However, most of the identified mobile macromolecules are not annotated in current versions of species-specific databases and are only available as non-searchable datasheets. To facilitate study of the mobile signaling macromolecules, we compiled the PlaMoM (Plant Mobile Macromolecules) database, a resource that provides convenient and interactive search tools allowing users to retrieve, to analyze and also to predict mobile RNAs/proteins. Each entry in the PlaMoM contains detailed information such as nucleotide/amino acid sequences, ortholog partners, related experiments, gene functions and literature. For the model plant Arabidopsis thaliana, protein–protein interactions of mobile transcripts are presented as interactive molecular networks. Furthermore, PlaMoM provides a built-in tool to identify potential RNA mobility signals such as tRNA-like structures. The current version of PlaMoM compiles a total of 17 991 mobile macromolecules from 14 plant species/ecotypes from published data and literature. PlaMoM is available at http://www.systembioinfo.org/plamom/. PMID:27924044
A Brief Review of RNA–Protein Interaction Database Resources
Yi, Ying; Zhao, Yue; Huang, Yan; Wang, Dong
2017-01-01
RNA–Protein interactions play critical roles in various biological processes. By collecting and analyzing RNA–Protein interactions and binding sites from experiments and predictions, RNA–Protein interaction databases have become an essential resource for the exploration of the transcriptional and post-transcriptional regulatory network. Here, we briefly review several widely used RNA–Protein interaction database resources developed in recent years to provide a guide to these databases. The content and major functions of each database are presented. These brief descriptions help users quickly choose the database containing the information they are interested in. In short, these RNA–Protein interaction database resources are continually updated, and their current state reflects ongoing efforts to identify and analyze the large number of RNA–Protein interactions. PMID:29657278
Chen, Long; Jiang, Yifeng; Du, Zhen
2018-04-01
Although previous studies have demonstrated that dental pulp stem cells (DPSCs) from mature and immature teeth exhibit potential for multi-directional differentiation, the molecular and biological difference between the DPSCs from mature and immature permanent teeth has not been fully investigated. In the present study, 500 differentially expressed genes from dental pulp cells (DPCs) in mature and immature permanent teeth were obtained from the Gene Expression Omnibus online database. Based on bioinformatics analysis using the Database for Annotation, Visualization and Integrated Discovery, these genes were divided into a number of subgroups associated with immunity, inflammation and cell signaling. The results of the present study suggest that immune features, response to infection and cell signaling may be different in DPCs from mature and immature permanent teeth; furthermore, DPCs from immature permanent teeth may be more suitable for use in tissue engineering or stem cell therapy. The Online Mendelian Inheritance in Man database stated that Sonic Hedgehog (SHH), a differentially expressed gene in DPCs from mature and immature permanent teeth, serves a crucial role in the development of craniofacial tissues, including teeth, which further confirmed that SHH may cause DPCs from mature and immature permanent teeth to exhibit different biological characteristics. The Search Tool for the Retrieval of Interacting Genes/Proteins database revealed that SHH has functional protein associations with a number of other proteins, including Glioma-associated oncogene (GLI)1, GLI2, growth arrest-specific protein 1, bone morphogenetic protein (BMP)2 and BMP4, in mice and humans. It was also demonstrated that SHH may interact with other genes to regulate the biological characteristics of DPCs. 
The results of the present study may provide a useful reference basis for selecting suitable DPSCs and molecules for the treatment of these cells to optimize features for tissue engineering or stem cell therapy. Quantitative polymerase chain reaction should be performed to confirm the differential expression of these genes prior to the beginning of a functional study.
An Integrated Molecular Database on Indian Insects.
Pratheepa, Maria; Venkatesan, Thiruvengadam; Gracy, Gandhi; Jalali, Sushil Kumar; Rangheswaran, Rajagopal; Antony, Jomin Cruz; Rai, Anil
2018-01-01
MOlecular Database on Indian Insects (MODII) is an online database linking several databases such as Insect Pest Info, Insect Barcode Information System (IBIn), Insect Whole Genome sequence, Other Genomic Resources of National Bureau of Agricultural Insect Resources (NBAIR), Whole Genome sequencing of Honey bee viruses, Insecticide resistance gene database and Genomic tools. This database was developed with a holistic approach to collecting phenomic and genomic information on agriculturally important insects. This insect resource database is available online for free at http://cib.res.in/.
Krassowski, Michal; Paczkowska, Marta; Cullion, Kim; Huang, Tina; Dzneladze, Irakli; Ouellette, B F Francis; Yamada, Joseph T; Fradet-Turcotte, Amelie
2018-01-01
Abstract Interpretation of genetic variation is needed for deciphering genotype-phenotype associations, mechanisms of inherited disease, and cancer driver mutations. Millions of single nucleotide variants (SNVs) in human genomes are known and thousands are associated with disease. An estimated 21% of disease-associated amino acid substitutions corresponding to missense SNVs are located in protein sites of post-translational modifications (PTMs), chemical modifications of amino acids that extend protein function. ActiveDriverDB is a comprehensive human proteo-genomics database that annotates disease mutations and population variants through the lens of PTMs. We integrated >385,000 published PTM sites with ∼3.6 million substitutions from The Cancer Genome Atlas (TCGA), the ClinVar database of disease genes, and human genome sequencing projects. The database includes site-specific interaction networks of proteins, upstream enzymes such as kinases, and drugs targeting these enzymes. We also predicted network-rewiring impact of mutations by analyzing gains and losses of kinase-bound sequence motifs. ActiveDriverDB provides detailed visualization, filtering, browsing and searching options for studying PTM-associated mutations. Users can upload mutation datasets interactively and use our application programming interface in pipelines. Integrative analysis of mutations and PTMs may help decipher molecular mechanisms of phenotypes and disease, as exemplified by case studies of TP53, BRCA2 and VHL. The open-source database is available at https://www.ActiveDriverDB.org. PMID:29126202
Tcof1-Related Molecular Networks in Treacher Collins Syndrome.
Dai, Jiewen; Si, Jiawen; Wang, Minjiao; Huang, Li; Fang, Bing; Shi, Jun; Wang, Xudong; Shen, Guofang
2016-09-01
Treacher Collins syndrome (TCS) is a rare, autosomal-dominant disorder characterized by craniofacial deformities, and is primarily caused by mutations in the Tcof1 gene. This article aimed to perform a comprehensive literature review and systematic bioinformatic analysis of Tcof1-related molecular networks in TCS. First, the up- and down-regulated genes in Tcof1 heterozygous haploinsufficient mutant mice embryos and Tcof1 knockdown and Tcof1 over-expressed neuroblastoma N1E-115 cells were obtained from the Gene Expression Omnibus database. The GeneDecks database was used to calculate the 500 genes most closely related to Tcof1. Then, the relationships between 4 gene sets (a predicted set and sets comparing the wildtype with the 3 Gene Expression Omnibus datasets) were analyzed using the DAVID, GeneMANIA and STRING databases. The analysis results showed that the Tcof1-related genes were enriched in various biological processes, including cell proliferation, apoptosis, cell cycle, differentiation, and migration. They were also enriched in several signaling pathways, such as the ribosome, p53, cell cycle, and WNT signaling pathways. Additionally, these genes clearly had direct or indirect interactions with Tcof1 and with each other. The findings of the literature review and bioinformatic analysis imply that special attention should be given to these pathways, as they may offer target points for TCS therapies.
Shin, Jae-Min; Cho, Doo-Ho
2005-01-01
PDB-Ligand (http://www.idrtech.com/PDB-Ligand/) is a three-dimensional structure database of small molecular ligands that are bound to larger biomolecules deposited in the Protein Data Bank (PDB). It is also a database tool that allows one to browse, classify, superimpose and visualize these structures. As of May 2004, there are about 4870 types of small molecular ligands, experimentally determined as a complex with protein or DNA in the PDB. The proteins that a given ligand binds are often homologous and present the same binding structure to the ligand. However, there are also many instances wherein a given ligand binds to two or more unrelated proteins, or to the same or homologous protein in different binding environments. PDB-Ligand serves as an interactive structural analysis and clustering tool for all the ligand-binding structures in the PDB. PDB-Ligand also provides an easier way to obtain a number of different structure alignments of many related ligand-binding structures based on a simple and flexible ligand clustering method. PDB-Ligand will be a good resource for both a better interpretation of ligand-binding structures and the development of better scoring functions to be used in many drug discovery applications.
Atomic and Molecular Databases, VAMDC (Virtual Atomic and Molecular Data Centre)
NASA Astrophysics Data System (ADS)
Dubernet, Marie-Lise; Zwölf, Carlo Maria; Moreau, Nicolas; Awa Ba, Yaya; VAMDC Consortium
2015-08-01
The "Virtual Atomic and Molecular Data Centre Consortium" (VAMDC Consortium, http://www.vamdc.eu) is a Consortium bound by a Memorandum of Understanding aimed at ensuring the sustainability of the VAMDC e-infrastructure. The current VAMDC e-infrastructure interconnects about 30 atomic and molecular databases, with the number of connected databases increasing every year: some are well-known databases such as CDMS, JPL, HITRAN, VALD, ...; other databases have been created since the start of VAMDC. About 90% of our databases are used for astrophysical applications. The data can be queried, retrieved and visualized in a single format from a general portal (http://portal.vamdc.eu), and VAMDC is also developing standalone tools to retrieve and handle the data. VAMDC provides software and support for including databases within the VAMDC e-infrastructure. One current feature of VAMDC is the constrained environment for the description of data, which ensures a higher quality of data distribution; a future feature is the link of VAMDC with evaluation/validation groups. The talk will present the VAMDC Consortium and the VAMDC e-infrastructure with its underlying technology, its services, its science use cases and its extension towards communities other than the academic research community.
Planform: an application and database of graph-encoded planarian regenerative experiments.
Lobo, Daniel; Malone, Taylor J; Levin, Michael
2013-04-15
Understanding the mechanisms governing the regeneration capabilities of many organisms is a fundamental interest in biology and medicine. An ever-increasing number of manipulation and molecular experiments are attempting to discover a comprehensive model for regeneration, with the planarian flatworm being one of the most important model species. Despite much effort, no comprehensive, constructive, mechanistic models exist yet, and it is now clear that computational tools are needed to mine this huge dataset. However, until now, there is no database of regenerative experiments, and the current genotype-phenotype ontologies and databases are based on textual descriptions, which are not understandable by computers. To overcome these difficulties, we present here Planform (Planarian formalization), a manually curated database and software tool for planarian regenerative experiments, based on a mathematical graph formalism. The database contains more than a thousand experiments from the main publications in the planarian literature. The software tool provides the user with a graphical interface to easily interact with and mine the database. The presented system is a valuable resource for the regeneration community and, more importantly, will pave the way for the application of novel artificial intelligence tools to extract knowledge from this dataset. The database and software tool are freely available at http://planform.daniel-lobo.com.
VaProS: a database-integration approach for protein/genome information retrieval.
Gojobori, Takashi; Ikeo, Kazuho; Katayama, Yukie; Kawabata, Takeshi; Kinjo, Akira R; Kinoshita, Kengo; Kwon, Yeondae; Migita, Ohsuke; Mizutani, Hisashi; Muraoka, Masafumi; Nagata, Koji; Omori, Satoshi; Sugawara, Hideaki; Yamada, Daichi; Yura, Kei
2016-12-01
Life science research now heavily relies on all sorts of databases for genome sequences, transcription, protein three-dimensional (3D) structures, protein-protein interactions, phenotypes and so forth. The knowledge accumulated by all the omics research is so vast that a computer-aided search of data is now a prerequisite for starting a new study. In addition, a combinatory search throughout these databases has a chance to extract new ideas and new hypotheses that can be examined by wet-lab experiments. By virtually integrating the related databases on the Internet, we have built a new web application that helps life science researchers retrieve expert knowledge stored in the databases and build new hypotheses about the research target. This web application, named VaProS, puts stress on the interconnection between the functional information of genome sequences and protein 3D structures, such as the structural effect of a gene mutation. In this manuscript, we present the notion of VaProS, the databases and tools that can be accessed without any knowledge of database locations and data formats, and the power of search exemplified in the quest for the molecular mechanisms of lysosomal storage disease. VaProS can be freely accessed at http://p4d-info.nig.ac.jp/vapros/ .
Exploring of the molecular mechanism of rhinitis via bioinformatics methods
Song, Yufen; Yan, Zhaohui
2018-01-01
The aim of this study was to analyze gene expression profiles for exploring the function and regulatory network of differentially expressed genes (DEGs) in pathogenesis of rhinitis by a bioinformatics method. The gene expression profile of GSE43523 was downloaded from the Gene Expression Omnibus database. The dataset contained 7 seasonal allergic rhinitis samples and 5 non-allergic normal samples. DEGs between rhinitis samples and normal samples were identified via the limma package of R. The WebGestalt database was used to identify enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways of the DEGs. The differentially co-expressed pairs of the DEGs were identified via the DCGL package in R, and the differential co-expression network was constructed based on these pairs. A protein-protein interaction (PPI) network of the DEGs was constructed based on the Search Tool for the Retrieval of Interacting Genes database. A total of 263 DEGs were identified in rhinitis samples compared with normal samples, including 125 downregulated ones and 138 upregulated ones. The DEGs were enriched in 7 KEGG pathways. 308 differential co-expression gene pairs were obtained. A differential co-expression network was constructed, containing 212 nodes. In total, 148 PPI pairs of the DEGs were identified, and a PPI network was constructed based on these pairs. Bioinformatics methods could help us identify significant genes and pathways related to the pathogenesis of rhinitis. Steroid biosynthesis pathway and metabolic pathways might play important roles in the development of allergic rhinitis (AR). Genes such as CDC42 effector protein 5, solute carrier family 39 member A11 and PR/SET domain 10 might be also associated with the pathogenesis of AR, which provided references for the molecular mechanisms of AR. PMID:29257233
Biopython: freely available Python tools for computational molecular biology and bioinformatics
Cock, Peter J. A.; Antao, Tiago; Chang, Jeffrey T.; Chapman, Brad A.; Cox, Cymon J.; Dalke, Andrew; Friedberg, Iddo; Hamelryck, Thomas; Kauff, Frank; Wilczynski, Bartek; de Hoon, Michiel J. L.
2009-01-01
Summary: The Biopython project is a mature open source international collaboration of volunteer developers, providing Python libraries for a wide range of bioinformatics problems. Biopython includes modules for reading and writing different sequence file formats and multiple sequence alignments, dealing with 3D macromolecular structures, interacting with common tools such as BLAST, ClustalW and EMBOSS, accessing key online databases, as well as providing numerical methods for statistical learning. Availability: Biopython is freely available, with documentation and source code at www.biopython.org under the Biopython license. Contact: All queries should be directed to the Biopython mailing lists, see www.biopython.org/wiki/_Mailing_lists; peter.cock@scri.ac.uk. PMID:19304878
Kumar, Rakesh; Jade, Dhananjay; Gupta, Dinesh
2018-03-05
5-HydroxyTryptamine 2A antagonists are potential targets for treatment of various cerebrovascular and cardiovascular disorders. In this study, we have developed and performed a unique screening pipeline for filtering ZINC database compounds on the basis of similarities to known antagonists to determine novel small molecule antagonists of 5-HydroxyTryptamine 2A. The screening pipeline is based on 2D similarity, 3D dissimilarity and a combination of 2D/3D similarity. The shortlisted compounds were docked to a 5-HydroxyTryptamine 2A homology-based model, and complexes with low binding energies (287 complexes) were selected for molecular dynamics (MD) simulations in a lipid bilayer. The MD simulations of the shortlisted compounds in complex with 5-HydroxyTryptamine 2A confirmed the stability of the complexes and revealed novel interaction insights. The receptor residues S239, N343, S242, S159, Y370 and D155 predominantly participate in hydrogen bonding. π-π stacking is observed in F339, F340, F234, W151 and W336, whereas hydrophobic interactions are observed amongst V156, F339, F234, V362, V366, F340, V235, I152 and W151. The known and potential antagonists shortlisted by us have similar overlapping molecular interaction patterns. The 287 potential 5-HydroxyTryptamine 2A antagonists may be experimentally verified.
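The abstract refers to 2D similarity filtering without naming the metric. A widely used choice for 2D fingerprints is the Tanimoto coefficient, shown below as an assumed stand-in; the fingerprints (sets of "on" bit positions), compound labels and the 0.6 threshold are all hypothetical.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # two empty fingerprints: treat as no similarity
    return len(a & b) / len(union)

def shortlist(library, reference, threshold):
    """Return (name, similarity) pairs at or above the threshold, most similar first."""
    scored = ((name, tanimoto(bits, reference)) for name, bits in library.items())
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda item: (-item[1], item[0]))

# Hypothetical fingerprints: one known antagonist and three library compounds.
reference = {1, 4, 7, 9, 12}
library = {
    "cand_1": {1, 4, 7, 9, 13},   # shares 4 of 6 distinct bits
    "cand_2": {2, 3, 5},          # shares nothing
    "cand_3": {1, 4, 7, 9, 12},   # identical fingerprint
}

hits = shortlist(library, reference, threshold=0.6)
```

Here `cand_3` and `cand_1` pass the cut-off and `cand_2` is discarded. Real pipelines would compute fingerprints with a cheminformatics toolkit rather than hand-written sets.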
Hayashi, Takanori; Matsuzaki, Yuri; Yanagisawa, Keisuke; Ohue, Masahito; Akiyama, Yutaka
2018-05-08
Protein-protein interactions (PPIs) play several roles in living cells, and computational PPI prediction is a major focus of many researchers. The three-dimensional (3D) structure and binding surface are important for the design of PPI inhibitors. Therefore, rigid body protein-protein docking calculations for two protein structures are expected to allow elucidation of PPIs different from known complexes in terms of 3D structures because known PPI information is not explicitly required. We have developed rapid PPI prediction software based on protein-protein docking, called MEGADOCK. In order to fully utilize the benefits of computational PPI predictions, it is necessary to construct a comprehensive database to gather prediction results and their predicted 3D complex structures and to make them easily accessible. Although several databases exist that provide predicted PPIs, the previous databases do not contain a sufficient number of entries for the purpose of discovering novel PPIs. In this study, we constructed an integrated database of MEGADOCK PPI predictions, named MEGADOCK-Web. MEGADOCK-Web provides more than 10 times as many PPI predictions as previous databases and enables users to find PPI predictions that are not available in conventional PPI prediction databases. In MEGADOCK-Web, there are 7528 protein chains and 28,331,628 predicted PPIs from all possible combinations of those proteins. Each protein structure is annotated with PDB ID, chain ID, UniProt AC, related KEGG pathway IDs, and known PPI pairs. Additionally, MEGADOCK-Web provides four powerful functions: 1) searching precalculated PPI predictions, 2) providing annotations for each predicted protein pair with an experimentally known PPI, 3) visualizing candidates that may interact with the query protein on biochemical pathways, and 4) visualizing predicted complex structures through a 3D molecular viewer. 
MEGADOCK-Web provides a huge amount of comprehensive PPI predictions based on docking calculations with biochemical pathways and enables users to easily and quickly assess PPI feasibilities by archiving PPI predictions. MEGADOCK-Web also promotes the discovery of new PPIs and protein functions and is freely available for use at http://www.bi.cs.titech.ac.jp/megadock-web/ .
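The figure of 28,331,628 predicted PPIs quoted for MEGADOCK-Web is consistent with counting every unordered pair of distinct chains among the 7528 protein chains. A quick check, assuming that self-pairs are not counted (the abstract does not say so explicitly):

```python
import math

n_chains = 7528

# Number of unordered pairs of distinct chains: n * (n - 1) / 2.
n_pairs = math.comb(n_chains, 2)
print(n_pairs)
```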
Paula, Débora P.; Linard, Benjamin; Crampton-Platt, Alex; Srivathsan, Amrita; Timmermans, Martijn J. T. N.; Sujii, Edison R.; Pires, Carmen S. S.; Souza, Lucas M.; Andow, David A.; Vogler, Alfried P.
2016-01-01
Characterizing trophic networks is fundamental to many questions in ecology, but this typically requires painstaking efforts, especially to identify the diet of small generalist predators. Several attempts have been devoted to develop suitable molecular tools to determine predatory trophic interactions through gut content analysis, and the challenge has been to achieve simultaneously high taxonomic breadth and resolution. General and practical methods are still needed, preferably independent of PCR amplification of barcodes, to recover a broader range of interactions. Here we applied shotgun-sequencing of the DNA from arthropod predator gut contents, extracted from four common coccinellid and dermapteran predators co-occurring in an agroecosystem in Brazil. By matching unassembled reads against six DNA reference databases obtained from public databases and newly assembled mitogenomes, and filtering for high overlap length and identity, we identified prey and other foreign DNA in the predator guts. Good taxonomic breadth and resolution was achieved (93% of prey identified to species or genus), but with low recovery of matching reads. Two to nine trophic interactions were found for these predators, some of which were only inferred by the presence of parasitoids and components of the microbiome known to be associated with aphid prey. Intraguild predation was also found, including among closely related ladybird species. Uncertainty arises from the lack of comprehensive reference databases and reliance on low numbers of matching reads accentuating the risk of false positives. We discuss caveats and some future prospects that could improve the use of direct DNA shotgun-sequencing to characterize arthropod trophic networks. PMID:27622637
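The read-filtering step mentioned above (keeping only matches with high overlap length and identity) can be sketched as follows. The thresholds, field names and example hits are hypothetical; the abstract does not give the cut-offs or the alignment tool that the authors used.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    read_id: str
    taxon: str
    overlap_len: int   # aligned length in bases
    identity: float    # percent identity over the aligned region


def filter_hits(hits, min_overlap=100, min_identity=98.0):
    """Keep hits that pass both the overlap-length and identity cut-offs (assumed values)."""
    return [h for h in hits
            if h.overlap_len >= min_overlap and h.identity >= min_identity]


def reads_per_taxon(hits):
    """Count surviving reads per putative prey taxon."""
    counts = {}
    for h in hits:
        counts[h.taxon] = counts.get(h.taxon, 0) + 1
    return counts


# Made-up matches of gut-content reads against a reference database.
raw_hits = [
    Hit("read1", "Aphis gossypii", 140, 99.3),
    Hit("read2", "Aphis gossypii", 60, 100.0),      # overlap too short
    Hit("read3", "Harmonia axyridis", 150, 92.0),   # identity too low
    Hit("read4", "Harmonia axyridis", 120, 98.5),
]

kept = filter_hits(raw_hits)
summary = reads_per_taxon(kept)
print(summary)
```

Because so few reads survive such filters, a per-taxon count like this makes the risk of false positives from one or two matching reads easy to see.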
ERIC Educational Resources Information Center
Sillince, J. A. A.; Sillince, M.
1993-01-01
Discusses molecular databases and the role that government and private companies play in their administration and development. Highlights include copyright and patent issues relating to public databases and the information contained in them; data quality; data structures and technological questions; the international organization of molecular…
POLYVIEW-MM: web-based platform for animation and analysis of molecular simulations
Porollo, Aleksey; Meller, Jaroslaw
2010-01-01
Molecular simulations offer important mechanistic and functional clues in studies of proteins and other macromolecules. However, interpreting the results of such simulations increasingly requires tools that can combine information from multiple structural databases and other web resources, and provide highly integrated and versatile analysis tools. Here, we present a new web server that integrates high-quality animation of molecular motion (MM) with structural and functional analysis of macromolecules. The new tool, dubbed POLYVIEW-MM, enables animation of trajectories generated by molecular dynamics and related simulation techniques, as well as visualization of alternative conformers, e.g. obtained as a result of protein structure prediction methods or small molecule docking. To facilitate structural analysis, POLYVIEW-MM combines interactive view and analysis of conformational changes using Jmol and its tailored extensions, publication quality animation using PyMol, and customizable 2D summary plots that provide an overview of MM, e.g. in terms of changes in secondary structure states and relative solvent accessibility of individual residues in proteins. Furthermore, POLYVIEW-MM integrates visualization with various structural annotations, including automated mapping of known interaction sites from structural homologs, mapping of cavities and ligand binding sites, transmembrane regions and protein domains. URL: http://polyview.cchmc.org/conform.html. PMID:20504857
NASA Astrophysics Data System (ADS)
Satpati, Suresh; Manohar, Kodavati; Acharya, Narottam; Dixit, Anshuman
2017-01-01
Genomic instability in Candida albicans is believed to play a crucial role in fungal pathogenesis. DNA polymerases contribute significantly to the stability of any genome. Although the Candida Genome Database predicts the presence of S. cerevisiae DNA polymerase orthologs, functional and structural characterizations of Candida DNA polymerases are still unexplored. DNA polymerase eta (Polη) is unique as it promotes efficient bypass of cyclobutane pyrimidine dimers. Interestingly, C. albicans is heterozygous in carrying two Polη genes and the nucleotide substitutions were found only in the ORFs. As allelic differences often result in functional differences of the encoded proteins, comparative analyses of structural models and molecular dynamic simulations were performed to characterize these orthologs of DNA Polη. Overall structures of both the ORFs remain conserved except for subtle differences in the palm and PAD domains. The complementation analysis showed that both the ORFs equally suppressed UV sensitivity of a yeast rad30 deletion strain. Our study has predicted two novel molecular interactions, a highly conserved molecular tetrad of salt bridges and a series of π-π interactions spanning from thumb to PAD. This study suggests these ORFs as the homologues of yeast Polη, and due to its heterogeneity in C. albicans they may play a significant role in pathogenicity.
Database resources of the National Center for Biotechnology Information.
2016-01-04
The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank(®) nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (PubMed Central (PMC), Bookshelf and PubReader), health (ClinVar, dbGaP, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen), genomes (BioProject, Assembly, Genome, BioSample, dbSNP, dbVar, Epigenomics, the Map Viewer, Nucleotide, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser and the Trace Archive), genes (Gene, Gene Expression Omnibus (GEO), HomoloGene, PopSet and UniGene), proteins (Protein, the Conserved Domain Database (CDD), COBALT, Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB) and Protein Clusters) and chemicals (Biosystems and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for most of these databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. Published by Oxford University Press on behalf of Nucleic Acids Research 2015. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Database resources of the National Center for Biotechnology Information.
2015-01-01
The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank(®) nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (Bookshelf, PubMed Central (PMC) and PubReader); medical genetics (ClinVar, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen); genes and genomics (BioProject, BioSample, dbSNP, dbVar, Epigenomics, Gene, Gene Expression Omnibus (GEO), Genome, HomoloGene, the Map Viewer, Nucleotide, PopSet, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser, Trace Archive and UniGene); and proteins and chemicals (Biosystems, COBALT, the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB), Protein Clusters, Protein and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for many of these databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov. Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Suganya, P Rathi; Kalva, Sukesh; Saleena, Lilly M
2016-01-01
ADAMTS4 (Aggrecanase-1) is an important enzyme that belongs to the ADAMTS family. Aggrecanase-1 is involved in aggrecan degradation of articular cartilage in osteoarthritis and rheumatoid arthritis. Overall variability of the S1' domain of ADAMTS4 has been the main selectivity determinant for designing unique inhibitors. Thirty-four inhibitors from the Binding database and the literature were used to develop the pharmacophore model. The five-featured pharmacophore model AHHRR had the best survival score of 3.493 and a post-hoc score of 2.545, indicating that the model is highly reliable. The 3D-QSAR model had an excellent r² value of 0.99 and a GH score of 0.839. The validated pharmacophore model was used for in silico screening of the Asinex and ZINC databases to find potential lead compounds. ZINC00987406 and ASN04459656, which had high Glide scores (>7 kcal/mol) and formed H-bond and hydrophobic interactions with the S1' loop residues of ADAMTS4, were subjected to molecular dynamics simulation studies. The simulation results indicate that the RMSD and RMSF of backbone atoms for these complexes were within the limit of 2.0 Å. These compounds are potential candidates for treating osteoarthritis by inhibiting ADAMTS4.
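The RMSD criterion quoted above can be illustrated with a small calculation. This sketch assumes the trajectory frame has already been superimposed on the reference structure (no fitting is done here), and the coordinates are invented; it is not the simulation package's own analysis routine.

```python
import math


def rmsd(reference, frame):
    """Root-mean-square deviation (in the input units) between two pre-aligned coordinate sets."""
    if len(reference) != len(frame) or not reference:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum((a - b) ** 2
                  for p, q in zip(reference, frame)
                  for a, b in zip(p, q))
    return math.sqrt(squared / len(reference))


# Hypothetical backbone atom positions in angstroms.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
frame = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]

value = rmsd(reference, frame)
within_limit = value <= 2.0   # the 2.0 angstrom limit mentioned in the abstract
print(value, within_limit)
```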
Molecular nutrition research: the modern way of performing nutritional science.
Norheim, Frode; Gjelstad, Ingrid Merethe Fange; Hjorth, Marit; Vinknes, Kathrine J; Langleite, Torgrim M; Holen, Torgeir; Jensen, Jørgen; Dalen, Knut Tomas; Karlsen, Anette S; Kielland, Anders; Rustan, Arild C; Drevon, Christian A
2012-12-03
In spite of amazing progress in food supply and nutritional science, and a striking increase in life expectancy of approximately 2.5 months per year in many countries during the previous 150 years, modern nutritional research has a great potential of still contributing to improved health for future generations, granted that the revolutions in molecular and systems technologies are applied to nutritional questions. Descriptive and mechanistic studies using state of the art epidemiology, food intake registration, genomics with single nucleotide polymorphisms (SNPs) and epigenomics, transcriptomics, proteomics, metabolomics, advanced biostatistics, imaging, calorimetry, cell biology, challenge tests (meals, exercise, etc.), and integration of all data by systems biology, will provide insight on a much higher level than today in a field we may name molecular nutrition research. To take advantage of all the new technologies scientists should develop international collaboration and gather data in large open access databases like the suggested Nutritional Phenotype database (dbNP). This collaboration will promote standardization of procedures (SOP), and provide a possibility to use collected data in future research projects. The ultimate goals of future nutritional research are to understand the detailed mechanisms of action for how nutrients/foods interact with the body and thereby enhance health and treat diet-related diseases.
Molecular Nutrition Research—The Modern Way Of Performing Nutritional Science
Norheim, Frode; Gjelstad, Ingrid M. F.; Hjorth, Marit; Vinknes, Kathrine J.; Langleite, Torgrim M.; Holen, Torgeir; Jensen, Jørgen; Dalen, Knut Tomas; Karlsen, Anette S.; Kielland, Anders; Rustan, Arild C.; Drevon, Christian A.
2012-01-01
In spite of amazing progress in food supply and nutritional science, and a striking increase in life expectancy of approximately 2.5 months per year in many countries during the previous 150 years, modern nutritional research has a great potential of still contributing to improved health for future generations, granted that the revolutions in molecular and systems technologies are applied to nutritional questions. Descriptive and mechanistic studies using state of the art epidemiology, food intake registration, genomics with single nucleotide polymorphisms (SNPs) and epigenomics, transcriptomics, proteomics, metabolomics, advanced biostatistics, imaging, calorimetry, cell biology, challenge tests (meals, exercise, etc.), and integration of all data by systems biology, will provide insight on a much higher level than today in a field we may name molecular nutrition research. To take advantage of all the new technologies scientists should develop international collaboration and gather data in large open access databases like the suggested Nutritional Phenotype database (dbNP). This collaboration will promote standardization of procedures (SOP), and provide a possibility to use collected data in future research projects. The ultimate goals of future nutritional research are to understand the detailed mechanisms of action for how nutrients/foods interact with the body and thereby enhance health and treat diet-related diseases. PMID:23208524
Urban, Martin; Cuzick, Alayne; Rutherford, Kim; Irvine, Alistair; Pedro, Helder; Pant, Rashmi; Sadanadan, Vidyendra; Khamari, Lokanath; Billal, Santoshkumar; Mohanty, Sagar; Hammond-Kosack, Kim E.
2017-01-01
The pathogen–host interactions database (PHI-base) is available at www.phi-base.org. PHI-base contains expertly curated molecular and biological information on genes proven to affect the outcome of pathogen–host interactions reported in peer reviewed research articles. In addition, literature that indicates specific gene alterations that did not affect the disease interaction phenotype are curated to provide complete datasets for comparative purposes. Viruses are not included. Here we describe a revised PHI-base Version 4 data platform with improved search, filtering and extended data display functions. A PHIB-BLAST search function is provided and a link to PHI-Canto, a tool for authors to directly curate their own published data into PHI-base. The new release of PHI-base Version 4.2 (October 2016) has an increased data content containing information from 2219 manually curated references. The data provide information on 4460 genes from 264 pathogens tested on 176 hosts in 8046 interactions. Prokaryotic and eukaryotic pathogens are represented in almost equal numbers. Host species belong ∼70% to plants and 30% to other species of medical and/or environmental importance. Additional data types included into PHI-base 4 are the direct targets of pathogen effector proteins in experimental and natural host organisms. The curation problems encountered and the future directions of the PHI-base project are briefly discussed. PMID:27915230
Bioinformatic prediction of leader genes in human periodontitis.
Covani, Ugo; Marconcini, Simone; Giacomelli, Luca; Sivozhelevov, Victor; Barone, Antonio; Nicolini, Claudio
2008-10-01
Genes involved in different biologic processes form complex interaction networks. However, only a few have a high number of interactions with the other genes in the network. In previous bioinformatics and experimental studies concerning the T lymphocyte cell cycle, these genes were identified and termed "leader genes." In this work, genes involved in human periodontitis were tentatively identified and ranked according to their number of interactions to obtain a preliminary, broader view of molecular mechanisms of periodontitis and plan targeted experimentation. Genes were identified with interrelated queries of several databases. The interactions among these genes were mapped and given a significance score. The weighted number of links (weighted sum of scores for every interaction in which the given gene is involved) was calculated for each gene. Genes were clustered according to this parameter. The genes in the highest cluster were termed leader genes. Sixty-one genes involved or potentially involved in periodontitis were identified. Only five were identified as leader genes, whereas 12 others were ranked in an immediately lower cluster. For 10 of 17 genes there is evidence of involvement in periodontitis; seven new genes that are potentially involved in this disease were identified. The involvement in periodontitis has been completely established for only two leader genes. We applied a validated bioinformatics algorithm to increase our knowledge of molecular mechanisms of periodontitis. Even with the limitations of this ab initio analysis, this theoretical study can suggest ad hoc experimentation targeted on significant genes and, therefore, simpler than mass-scale molecular genomics. Moreover, the identification of leader genes might suggest new potential risk factors and therapeutic targets.
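The weighted number of links described above is the sum of the interaction scores over every interaction in which a gene takes part. A minimal sketch with invented gene names and scores follows; the clustering step is not reproduced, since the abstract does not name the clustering method, and the genes are simply ranked.

```python
from collections import defaultdict


def weighted_number_of_links(interactions):
    """Sum interaction scores per gene; each interaction counts for both partners."""
    totals = defaultdict(float)
    for gene_a, gene_b, score in interactions:
        totals[gene_a] += score
        totals[gene_b] += score
    return dict(totals)


# Hypothetical scored interactions: (gene, gene, significance score).
interactions = [
    ("G1", "G2", 0.9),
    ("G1", "G3", 0.8),
    ("G2", "G3", 0.4),
    ("G1", "G4", 0.7),
]

wnl = weighted_number_of_links(interactions)
# Rank genes from most to least connected; the top of this list would be
# the candidates for the highest ("leader") cluster.
ranked = sorted(wnl, key=wnl.get, reverse=True)
print(ranked)
```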
Screening the molecular targets of ovarian cancer based on bioinformatics analysis.
Du, Lei; Qian, Xiaolei; Dai, Chenyang; Wang, Lihua; Huang, Ding; Wang, Shuying; Shen, Xiaowei
2015-01-01
Ovarian cancer (OC) is the most lethal gynecologic malignancy. This study aims to explore the molecular mechanisms of OC and identify potential molecular targets for OC treatment. Microarray gene expression data (GSE14407) including 12 normal ovarian surface epithelia samples and 12 OC epithelia samples were downloaded from the Gene Expression Omnibus database. Differentially expressed genes (DEGs) between the 2 kinds of ovarian tissue were identified by using the limma package in R (|log2 fold change| > 1 and false discovery rate [FDR] < 0.05). Protein-protein interactions (PPIs) and known OC-related genes were screened from the COXPRESdb and GenBank databases, respectively. Furthermore, a PPI network of the top 10 upregulated DEGs and top 10 downregulated DEGs was constructed and visualized through Cytoscape software. Finally, for the genes involved in the PPI network, functional enrichment analysis was performed by using DAVID (FDR < 0.05). In total, 1136 DEGs were identified, including 544 downregulated and 592 upregulated DEGs. Then, the PPI network was constructed, and DEGs CDKN2A, MUC1, OGN, ZIC1, SOX17, and TFAP2A interacted with known OC-related genes CDK4, EGFR/JUN, SRC, CLI1, CTNNB1, and TP53, respectively. Moreover, functions about oxygen transport and embryonic development were enriched by the genes involved in the network of downregulated DEGs. We propose that 4 DEGs (OGN, ZIC1, SOX17, and TFAP2A) and 2 functions (oxygen transport and embryonic development) might play a role in the development of OC. These 4 DEGs and known OC-related genes might serve as therapeutic targets for OC. Further studies are required to validate these predictions.
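The DEG selection rule in this abstract (absolute log2 fold change above 1 and FDR below 0.05) can be sketched in plain Python. The study used the limma package in R; the version below is only an illustration that applies a Benjamini-Hochberg adjustment to made-up p-values and does not reproduce limma's moderated statistics.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical results: (gene, log2 fold change, raw p-value).
results = [
    ("GENE_A", 2.3, 0.001),
    ("GENE_B", -1.5, 0.004),
    ("GENE_C", 0.4, 0.03),
    ("GENE_D", -3.0, 0.5),
]

fdr = benjamini_hochberg([p for _, _, p in results])
up = [g for (g, lfc, _), q in zip(results, fdr) if lfc > 1 and q < 0.05]
down = [g for (g, lfc, _), q in zip(results, fdr) if lfc < -1 and q < 0.05]
print(up, down)
```

Here GENE_C passes the FDR cut-off but not the fold-change cut-off, and GENE_D shows the opposite, so neither is reported.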
Ruiz, Patricia; Perlina, Ally; Mumtaz, Moiz; Fowler, Bruce A
2016-07-01
A number of epidemiological studies have identified statistical associations between persistent organic pollutants (POPs) and metabolic diseases, but testable hypotheses regarding underlying molecular mechanisms to explain these linkages have not been published. We assessed the underlying mechanisms of POPs that have been associated with metabolic diseases; three well-known POPs [2,3,7,8-tetrachlorodibenzodioxin (TCDD), 2,2´,4,4´,5,5´-hexachlorobiphenyl (PCB 153), and 4,4´-dichlorodiphenyldichloroethylene (p,p´-DDE)] were studied. We used advanced database search tools to delineate testable hypotheses and to guide laboratory-based research studies into underlying mechanisms by which this POP mixture could produce or exacerbate metabolic diseases. For our searches, we used proprietary systems biology software (MetaCore™/MetaDrug™) to conduct advanced search queries for the underlying interactions database, followed by directional network construction to identify common mechanisms for these POPs within two or fewer interaction steps downstream of their primary targets. These common downstream pathways belong to various cytokine and chemokine families with experimentally well-documented causal associations with type 2 diabetes. Our systems biology approach allowed identification of converging pathways leading to activation of common downstream targets. To our knowledge, this is the first study to propose an integrated global set of step-by-step molecular mechanisms for a combination of three common POPs using a systems biology approach, which may link POP exposure to diseases. Experimental evaluation of the proposed pathways may lead to development of predictive biomarkers of the effects of POPs, which could translate into disease prevention and effective clinical treatment strategies. Ruiz P, Perlina A, Mumtaz M, Fowler BA. 2016. A systems biology approach reveals converging molecular mechanisms that link different POPs to common metabolic diseases. 
Environ Health Perspect 124:1034-1041; http://dx.doi.org/10.1289/ehp.1510308.
DrugBank 5.0: a major update to the DrugBank database for 2018.
Wishart, David S; Feunang, Yannick D; Guo, An C; Lo, Elvis J; Marcu, Ana; Grant, Jason R; Sajed, Tanvir; Johnson, Daniel; Li, Carin; Sayeeda, Zinat; Assempour, Nazanin; Iynkkaran, Ithayavani; Liu, Yifeng; Maciejewski, Adam; Gale, Nicola; Wilson, Alex; Chin, Lucy; Cummings, Ryan; Le, Diana; Pon, Allison; Knox, Craig; Wilson, Michael
2018-01-04
DrugBank (www.drugbank.ca) is a web-enabled database containing comprehensive molecular information about drugs, their mechanisms, their interactions and their targets. First described in 2006, DrugBank has continued to evolve over the past 12 years in response to marked improvements to web standards and changing needs for drug research and development. This year's update, DrugBank 5.0, represents the most significant upgrade to the database in more than 10 years. In many cases, existing data content has grown by 100% or more over the last update. For instance, the total number of investigational drugs in the database has grown by almost 300%, the number of drug-drug interactions has grown by nearly 600% and the number of SNP-associated drug effects has grown more than 3000%. Significant improvements have been made to the quantity, quality and consistency of drug indications, drug binding data as well as drug-drug and drug-food interactions. A great deal of brand new data have also been added to DrugBank 5.0. This includes information on the influence of hundreds of drugs on metabolite levels (pharmacometabolomics), gene expression levels (pharmacotranscriptomics) and protein expression levels (pharmacoproteomics). New data have also been added on the status of hundreds of new drug clinical trials and existing drug repurposing trials. Many other important improvements in the content, interface and performance of the DrugBank website have been made and these should greatly enhance its ease of use, utility and potential applications in many areas of pharmacological research, pharmaceutical science and drug education. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Sig2BioPAX: Java tool for converting flat files to BioPAX Level 3 format.
Webb, Ryan L; Ma'ayan, Avi
2011-03-21
The World Wide Web plays a critical role in enabling molecular, cell, systems and computational biologists to exchange, search, visualize, integrate, and analyze experimental data. Such efforts can be further enhanced through the development of semantic web concepts. The semantic web idea is to enable machines to understand data through the development of protocol free data exchange formats such as Resource Description Framework (RDF) and the Web Ontology Language (OWL). These standards provide formal descriptors of objects, object properties and their relationships within a specific knowledge domain. However, the overhead of converting datasets typically stored in data tables such as Excel, text or PDF into RDF or OWL formats is not trivial for non-specialists and as such produces a barrier to seamless data exchange between researchers, databases and analysis tools. This problem is particularly of importance in the field of network systems biology where biochemical interactions between genes and their protein products are abstracted to networks. For the purpose of converting biochemical interactions into the BioPAX format, which is the leading standard developed by the computational systems biology community, we developed an open-source command line tool that takes as input tabular data describing different types of molecular biochemical interactions. The tool converts such interactions into the BioPAX level 3 OWL format. We used the tool to convert several existing and new mammalian networks of protein interactions, signalling pathways, and transcriptional regulatory networks into BioPAX. Some of these networks were deposited into PathwayCommons, a repository for consolidating and organizing biochemical networks. The software tool Sig2BioPAX is a resource that enables experimental and computational systems biologists to contribute their identified networks and pathways of molecular interactions for integration and reuse with the rest of the research community.
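To make the idea of converting tabular interactions into BioPAX Level 3 concrete, here is a minimal Python sketch. It is not Sig2BioPAX (which is a Java tool) and does not reproduce its input format or output; the two-column input and the `rows_to_biopax` helper are hypothetical. It emits a small RDF/XML fragment using the BioPAX Level 3 classes `Protein` and `MolecularInteraction` with the `displayName` and `participant` properties, and omits details such as datatype annotations that a complete file would carry.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BP = "http://www.biopax.org/release/biopax-level3.owl#"
ET.register_namespace("rdf", RDF)
ET.register_namespace("bp", BP)


def rows_to_biopax(rows):
    """Turn (protein, protein) rows into a minimal BioPAX-style RDF/XML string."""
    root = ET.Element(f"{{{RDF}}}RDF")
    seen = set()
    for n, (a, b) in enumerate(rows, start=1):
        # Declare each protein once, the first time it appears.
        for name in (a, b):
            if name not in seen:
                seen.add(name)
                protein = ET.SubElement(root, f"{{{BP}}}Protein",
                                        {f"{{{RDF}}}ID": name})
                label = ET.SubElement(protein, f"{{{BP}}}displayName")
                label.text = name
        # One interaction per row, pointing at both participants.
        interaction = ET.SubElement(root, f"{{{BP}}}MolecularInteraction",
                                    {f"{{{RDF}}}ID": f"interaction_{n}"})
        for name in (a, b):
            ET.SubElement(interaction, f"{{{BP}}}participant",
                          {f"{{{RDF}}}resource": f"#{name}"})
    return ET.tostring(root, encoding="unicode")


# Hypothetical tabular input: one interaction per row.
table = [("EGFR", "GRB2"), ("GRB2", "SOS1")]
owl_text = rows_to_biopax(table)
print(owl_text)
```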
Reddy, Karnati Konda; Singh, Poonam; Singh, Sanjeev Kumar
2014-03-04
HIV-1 integrase (IN) mediates integration of viral cDNA into the host cell genome, an essential step in the retroviral life cycle. The human lens epithelium-derived growth factor (LEDGF/p75) is a co-factor of HIV-1 IN that plays a crucial role in viral integration. Because of its crucial role in early steps of HIV replication, the IN-LEDGF/p75 interaction represents an attractive target for anti-HIV drug discovery. In this study, the IN-LEDGF/p75 interaction was studied by in silico mutational studies and molecular dynamics simulations. The results showed that all of the key residues in the LEDGF/p75 binding pocket of IN protein are important for stabilization of the complex. Structure-based virtual screening against HIV-1 IN using the ChemBridge database was performed through three different protocols of docking simulations with varying precisions and computational intensities. Six compounds based on the docking score, binding affinity and pharmacokinetic parameters were selected and an analysis of the interactions with key amino acid residues of IN was carried out. Subsequently, molecular dynamics simulations of these compounds in the LEDGF/p75 binding site of IN were carried out in order to study the stability of complexes and their hydrogen bonding interactions. IN residues Glu170, His171, and Thr174 in chain A as well as Gln95 and Thr125 in chain B were discovered to play important roles in the binding of compounds. These findings could be helpful for blocking IN-LEDGF/p75 interaction, and provide a method for avoiding viral resistance and cross-resistance.
Li, Min; Dong, Xiang-yu; Liang, Hao; Leng, Li; Zhang, Hui; Wang, Shou-zhi; Li, Hui; Du, Zhi-Qiang
2017-05-20
Effective management and analysis of precisely recorded phenotypic traits are important components of the selection and breeding of superior livestock. Over two decades, we divergently selected chicken lines for abdominal fat content at Northeast Agricultural University (Northeast Agricultural University High and Low Fat, NEAUHLF), and collected a large volume of phenotypic data related to the investigation of the molecular genetic basis of adipose tissue deposition in broilers. To effectively and systematically store, manage and analyze phenotypic data, we built the NEAUHLF Phenome Database (NEAUHLFPD). NEAUHLFPD included the following phenotypic records: pedigree (generations 1-19) and 29 phenotypes, such as body sizes and weights, carcass traits and their corresponding rates. The design and construction strategy of NEAUHLFPD were executed as follows: (1) Framework design. We used Apache as our web server, MySQL and Navicat as database management tools, and PHP as the HTML-embedded language to create a dynamic interactive website. (2) Structural components. The main interface gives a detailed introduction to the composition, function, and index buttons of the basic structure of the database. The functional modules of NEAUHLFPD had two main components: the first module referred to the physical storage space for phenotypic data, in which functional manipulation of data can be realized, such as data indexing, filtering, range-setting, searching, etc.; the second module related to the calculation of basic descriptive statistics, where data filtered from the database can be used for the computation of basic statistical parameters and simultaneous conditional sorting. 
NEAUHLFPD could be used to effectively store and manage not only phenotypic, but also genotypic and genomics data, which can facilitate further investigation on the molecular genetic basis of chicken adipose tissue growth and development, and expedite the selection and breeding of broilers with low fat content.
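The two functional modules described for NEAUHLFPD (filtering stored records, then computing descriptive statistics on the filtered subset) can be sketched in a few lines. The actual system is built on PHP and MySQL; the Python below only illustrates the workflow, and the field names and values are invented.

```python
import statistics

# Hypothetical phenotype records; real column names and values are not given in the abstract.
records = [
    {"generation": 18, "line": "lean", "abdominal_fat_g": 20.0},
    {"generation": 18, "line": "lean", "abdominal_fat_g": 24.0},
    {"generation": 18, "line": "fat", "abdominal_fat_g": 95.0},
    {"generation": 19, "line": "lean", "abdominal_fat_g": 19.0},
]


def select(rows, **conditions):
    """Module 1: keep rows whose fields equal every given condition."""
    return [r for r in rows
            if all(r.get(k) == v for k, v in conditions.items())]


def describe(rows, trait):
    """Module 2: basic descriptive statistics for one trait."""
    values = [r[trait] for r in rows]
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
    }


subset = select(records, generation=18, line="lean")
summary_stats = describe(subset, "abdominal_fat_g")
print(summary_stats)
```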
sc-PDB: a 3D-database of ligandable binding sites--10 years on.
Desaphy, Jérémy; Bret, Guillaume; Rognan, Didier; Kellenberger, Esther
2015-01-01
The sc-PDB database (available at http://bioinfo-pharma.u-strasbg.fr/scPDB/) is a comprehensive and up-to-date selection of ligandable binding sites of the Protein Data Bank. Sites are defined from complexes between a protein and a pharmacological ligand. The database provides the all-atom description of the protein, its ligand, their binding site and their binding mode. Currently, the sc-PDB archive registers 9283 binding sites from 3678 unique proteins and 5608 unique ligands. The sc-PDB database was publicly launched in 2004 with the aim of providing structure files suitable for computational approaches to drug design, such as docking. During the last 10 years we have improved and standardized the processes for (i) identifying binding sites, (ii) correcting structures, (iii) annotating protein function and ligand properties and (iv) characterizing their binding mode. This paper presents the latest enhancements in the database, specifically pertaining to the representation of molecular interaction and to the similarity between ligand/protein binding patterns. The new website puts emphasis in pictorial analysis of data. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
MGDB: a comprehensive database of genes involved in melanoma.
Zhang, Di; Zhu, Rongrong; Zhang, Hanqian; Zheng, Chun-Hou; Xia, Junfeng
2015-01-01
The Melanoma Gene Database (MGDB) is a manually curated catalog of molecular genetic data relating to genes involved in melanoma. The main purpose of this database is to establish a network of melanoma related genes and to facilitate the mechanistic study of melanoma tumorigenesis. The entries describing the relationships between melanoma and genes in the current release were manually extracted from PubMed abstracts; to date, the release contains 527 human melanoma genes (422 protein-coding and 105 non-coding genes). Each melanoma gene was annotated in seven different aspects (General Information, Expression, Methylation, Mutation, Interaction, Pathway and Drug). In addition, manually curated literature references are provided to support the inclusion of each gene in MGDB and establish its association with melanoma. MGDB has a user-friendly web interface with multiple browse and search functions. We hope MGDB will enrich our knowledge about melanoma genetics and serve as a useful complement to the existing public resources. Database URL: http://bioinfo.ahu.edu.cn:8080/Melanoma/index.jsp. © The Author(s) 2015. Published by Oxford University Press.
Relax with CouchDB--into the non-relational DBMS era of bioinformatics.
Manyam, Ganiraju; Payton, Michelle A; Roth, Jack A; Abruzzo, Lynne V; Coombes, Kevin R
2012-07-01
With the proliferation of high-throughput technologies, genome-level data analysis has become common in molecular biology. Bioinformaticians are developing extensive resources to annotate and mine biological features from high-throughput data. The underlying database management systems for most bioinformatics software are based on a relational model. Modern non-relational databases offer an alternative that has flexibility, scalability, and a non-rigid design schema. Moreover, with an accelerated development pace, non-relational databases like CouchDB can be ideal tools to construct bioinformatics utilities. We describe CouchDB by presenting three new bioinformatics resources: (a) geneSmash, which collates data from bioinformatics resources and provides automated gene-centric annotations, (b) drugBase, a database of drug-target interactions with a web interface powered by geneSmash, and (c) HapMap-CN, which provides a web interface to query copy number variations from three SNP-chip HapMap datasets. In addition to the web sites, all three systems can be accessed programmatically via web services. Copyright © 2012 Elsevier Inc. All rights reserved.
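The "non-rigid design schema" and programmatic access mentioned above can be sketched briefly. CouchDB stores JSON documents and exposes each one over HTTP at a path of the form `/{database}/{document id}`. The host, the database name `genes` and the document fields below are assumptions for illustration; they are not the actual geneSmash, drugBase or HapMap-CN endpoints, which the abstract does not give.

```python
import json
from urllib.parse import quote

def document_url(base_url, database, doc_id):
    """Build the URL of a single CouchDB document: {base}/{db}/{doc_id}."""
    return (
        f"{base_url.rstrip('/')}/"
        f"{quote(database, safe='')}/{quote(doc_id, safe='')}"
    )

# Two documents that could live in the same database even though their
# fields differ; no table schema has to be altered to add "pathways".
# Field names are hypothetical.
gene_a = {"_id": "TP53", "type": "gene", "chromosome": "17", "aliases": ["P53"]}
gene_b = {"_id": "BRCA1", "type": "gene", "pathways": ["DNA repair"]}

# Hypothetical local server on CouchDB's default port 5984.
url = document_url("http://localhost:5984/", "genes", gene_a["_id"])

# The JSON body a client would send when storing gene_b.
payload = json.dumps(gene_b, sort_keys=True)
```

No request is sent here; a client would issue an HTTP GET to `url` to read a document, or a PUT with `payload` to store one.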
PharmDB-K: Integrated Bio-Pharmacological Network Database for Traditional Korean Medicine
Lee, Ji-Hyun; Park, Kyoung Mii; Han, Dong-Jin; Bang, Nam Young; Kim, Do-Hee; Na, Hyeongjin; Lim, Semi; Kim, Tae Bum; Kim, Dae Gyu; Kim, Hyun-Jung; Chung, Yeonseok; Sung, Sang Hyun; Surh, Young-Joon; Kim, Sunghoon; Han, Byung Woo
2015-01-01
Despite the growing attention given to Traditional Medicine (TM) worldwide, there is no well-known, publicly available, integrated bio-pharmacological Traditional Korean Medicine (TKM) database for researchers in drug discovery. In this study, we have constructed PharmDB-K, which offers comprehensive information relating to TKM-associated drugs (compound), disease indication, and protein relationships. To explore the underlying molecular interaction of TKM, we integrated fourteen different databases, six Pharmacopoeias, and literature, and established a massive bio-pharmacological network for TKM and experimentally validated some cases predicted from the PharmDB-K analyses. Currently, PharmDB-K contains information about 262 TKMs, 7,815 drugs, 3,721 diseases, 32,373 proteins, and 1,887 side effects. One of the unique sets of information in PharmDB-K includes 400 indicator compounds used for standardization of herbal medicine. Furthermore, we are operating PharmDB-K via phExplorer (a network visualization software) and BioMart (a data federation framework) for convenient search and analysis of the TKM network. Database URL: http://pharmdb-k.org, http://biomart.i-pharm.org. PMID:26555441
PlaMoM: a comprehensive database compiles plant mobile macromolecules.
Guan, Daogang; Yan, Bin; Thieme, Christoph; Hua, Jingmin; Zhu, Hailong; Boheler, Kenneth R; Zhao, Zhongying; Kragler, Friedrich; Xia, Yiji; Zhang, Shoudong
2017-01-04
In plants, various phloem-mobile macromolecules including noncoding RNAs, mRNAs and proteins are suggested to act as important long-distance signals in regulating crucial physiological and morphological transition processes such as flowering, plant growth and stress responses. Given recent advances in high-throughput sequencing technologies, numerous mobile macromolecules have been identified in diverse plant species from different plant families. However, most of the identified mobile macromolecules are not annotated in current versions of species-specific databases and are only available as non-searchable datasheets. To facilitate study of the mobile signaling macromolecules, we compiled the PlaMoM (Plant Mobile Macromolecules) database, a resource that provides convenient and interactive search tools allowing users to retrieve, to analyze and also to predict mobile RNAs/proteins. Each entry in the PlaMoM contains detailed information such as nucleotide/amino acid sequences, ortholog partners, related experiments, gene functions and literature. For the model plant Arabidopsis thaliana, protein-protein interactions of mobile transcripts are presented as interactive molecular networks. Furthermore, PlaMoM provides a built-in tool to identify potential RNA mobility signals such as tRNA-like structures. The current version of PlaMoM compiles a total of 17 991 mobile macromolecules from 14 plant species/ecotypes from published data and literature. PlaMoM is available at http://www.systembioinfo.org/plamom/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Database resources of the National Center for Biotechnology Information.
Sayers, Eric W; Barrett, Tanya; Benson, Dennis A; Bolton, Evan; Bryant, Stephen H; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M; DiCuccio, Michael; Federhen, Scott; Feolo, Michael; Fingerman, Ian M; Geer, Lewis Y; Helmberg, Wolfgang; Kapustin, Yuri; Landsman, David; Lipman, David J; Lu, Zhiyong; Madden, Thomas L; Madej, Tom; Maglott, Donna R; Marchler-Bauer, Aron; Miller, Vadim; Mizrachi, Ilene; Ostell, James; Panchenko, Anna; Phan, Lon; Pruitt, Kim D; Schuler, Gregory D; Sequeira, Edwin; Sherry, Stephen T; Shumway, Martin; Sirotkin, Karl; Slotta, Douglas; Souvorov, Alexandre; Starchenko, Grigory; Tatusova, Tatiana A; Wagner, Lukas; Wang, Yanli; Wilbur, W John; Yaschenko, Eugene; Ye, Jian
2011-01-01
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Electronic PCR, OrfFinder, Splign, ProSplign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), IBIS, Biosystems, Peptidome, OMSSA, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
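The Entrez Programming Utilities listed above are reached through plain HTTP requests. As a minimal sketch, the function below builds an ESearch query URL using the `db`, `term`, `retmax` and `retmode` parameters. The base URL and JSON return mode reflect the service as currently documented rather than anything stated in this 2011 abstract, and the search term is an arbitrary example.

```python
from urllib.parse import urlencode

# Base URL of the E-utilities as currently documented by NCBI
# (an assumption relative to the abstract, which gives only the home page).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def esearch_url(db, term, retmax=20):
    """Build an ESearch URL that returns matching record IDs as JSON."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"

# Example: up to five PubMed IDs for a free-text query.
url = esearch_url("pubmed", "biomolecular interaction database", retmax=5)
```

The sketch stops at URL construction so that it runs without network access; fetching `url` with any HTTP client would return the ID list.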
EDCs DataBank: 3D-Structure database of endocrine disrupting chemicals.
Montes-Grajales, Diana; Olivero-Verbel, Jesus
2015-01-02
Endocrine disrupting chemicals (EDCs) are a group of compounds that affect the endocrine system, frequently found in everyday products and epidemiologically associated with several diseases. The purpose of this work was to develop EDCs DataBank, the only database of EDCs with three-dimensional structures. This database was built on MySQL using the EU list of potential endocrine disruptors and TEDX list. It contains the three-dimensional structures available on PubChem, as well as a wide variety of information from different databases and text mining tools, useful for almost any kind of research regarding EDCs. The web platform was developed employing HTML, CSS and PHP languages, with dynamic contents in a graphic environment, facilitating information analysis. Currently EDCs DataBank has 615 molecules, including pesticides, natural and industrial products, cosmetics, drugs and food additives, among other low molecular weight xenobiotics. Therefore, this database can be used to study the toxicological effects of these molecules, or to develop pharmaceuticals targeting hormone receptors, through docking studies, high-throughput virtual screening and ligand-protein interaction analysis. EDCs DataBank is totally user-friendly and the 3D-structures of the molecules can be downloaded in several formats. This database is freely available at http://edcs.unicartagena.edu.co. Copyright © 2014. Published by Elsevier Ireland Ltd.
3D visualization of molecular structures in the MOGADOC database
NASA Astrophysics Data System (ADS)
Vogt, Natalja; Popov, Evgeny; Rudert, Rainer; Kramer, Rüdiger; Vogt, Jürgen
2010-08-01
The MOGADOC database (Molecular Gas-Phase Documentation) is a powerful tool to retrieve information about compounds which have been studied in the gas-phase by electron diffraction, microwave spectroscopy and molecular radio astronomy. Presently the database contains over 34,500 bibliographic references (from the beginning of each method) for about 10,000 inorganic, organic and organometallic compounds and structural data (bond lengths, bond angles, dihedral angles, etc.) for about 7800 compounds. Most of the implemented molecular structures are given in a three-dimensional (3D) presentation. To create or edit and visualize the 3D images of molecules, new tools (special editor and Java-based 3D applet) were developed. Molecular structures in internal coordinates were converted to those in Cartesian coordinates.
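The conversion from internal to Cartesian coordinates mentioned in the last sentence can be sketched as follows. The abstract does not say how MOGADOC performs it; this is a generic placement routine that positions a fourth atom from a bond length, a bond angle and a dihedral angle relative to three already-placed atoms. The function names and the example geometry are illustrative, and the sign convention for the dihedral varies between programs.

```python
import math

def _sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def _cross(u, v):
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )

def _unit(u):
    norm = math.sqrt(sum(x * x for x in u))
    return tuple(x / norm for x in u)

def place_atom(a, b, c, bond, angle_deg, dihedral_deg):
    """Return Cartesian coordinates of atom d.

    bond is the c-d distance, angle_deg the b-c-d angle and
    dihedral_deg the a-b-c-d torsion. a, b, c must not be collinear.
    """
    theta = math.radians(angle_deg)
    phi = math.radians(dihedral_deg)
    bc = _unit(_sub(c, b))                 # axis along the b->c bond
    n = _unit(_cross(_sub(b, a), bc))      # normal to the a-b-c plane
    m = _cross(n, bc)                      # completes the local frame
    local = (
        -bond * math.cos(theta),
        bond * math.sin(theta) * math.cos(phi),
        bond * math.sin(theta) * math.sin(phi),
    )
    return tuple(
        c[i] + local[0] * bc[i] + local[1] * m[i] + local[2] * n[i]
        for i in range(3)
    )

def bond_angle(p, q, r):
    """Angle p-q-r in degrees, with q at the vertex."""
    v1, v2 = _sub(p, q), _sub(r, q)
    cos_t = sum(x * y for x, y in zip(v1, v2)) / (
        math.dist(p, q) * math.dist(r, q)
    )
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

# Illustrative geometry: three fixed atoms and a fourth placed from
# internal coordinates (values are arbitrary, not taken from MOGADOC).
a, b, c = (0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
d = place_atom(a, b, c, bond=1.5, angle_deg=109.5, dihedral_deg=60.0)
```

Applying `place_atom` atom by atom along a Z-matrix yields the full Cartesian structure; recomputing the bond length and angle from the result is a convenient consistency check.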
Brozovic, Matija; Dantec, Christelle; Dardaillon, Justine; Dauga, Delphine; Faure, Emmanuel; Gineste, Mathieu; Louis, Alexandra; Naville, Magali; Nitta, Kazuhiro R; Piette, Jacques; Reeves, Wendy; Scornavacca, Céline; Simion, Paul; Vincentelli, Renaud; Bellec, Maelle; Aicha, Sameh Ben; Fagotto, Marie; Guéroult-Bellone, Marion; Haeussler, Maximilian; Jacox, Edwin; Lowe, Elijah K; Mendez, Mickael; Roberge, Alexis; Stolfi, Alberto; Yokomori, Rui; Cambillau, Christian; Christiaen, Lionel; Delsuc, Frédéric; Douzery, Emmanuel; Dumollard, Rémi; Kusakabe, Takehiro; Nakai, Kenta; Nishida, Hiroki; Satou, Yutaka; Swalla, Billie; Veeman, Michael; Volff, Jean-Nicolas
2018-01-01
ANISEED (www.aniseed.cnrs.fr) is the main model organism database for tunicates, the sister-group of vertebrates. This release gives access to annotated genomes, gene expression patterns, and anatomical descriptions for nine ascidian species. It provides increased integration with external molecular and taxonomy databases, better support for epigenomics datasets, in particular RNA-seq, ChIP-seq and SELEX-seq, and features novel interactive interfaces for existing and novel datatypes. In particular, the cross-species navigation and comparison is enhanced through a novel taxonomy section describing each represented species and through the implementation of interactive phylogenetic gene trees for 60% of tunicate genes. The gene expression section displays the results of RNA-seq experiments for the three major model species of solitary ascidians. Gene expression is controlled by the binding of transcription factors to cis-regulatory sequences. A high-resolution description of the DNA-binding specificity for 131 Ciona robusta (formerly C. intestinalis type A) transcription factors by SELEX-seq is provided and used to map candidate binding sites across the Ciona robusta and Phallusia mammillata genomes. Finally, use of a WashU Epigenome browser enhances genome navigation, while a Genomicus server was set up to explore microsynteny relationships within tunicates and with vertebrates, Amphioxus, echinoderms and hemichordates. PMID:29149270
JAIL: a structure-based interface library for macromolecules.
Günther, Stefan; von Eichborn, Joachim; May, Patrick; Preissner, Robert
2009-01-01
The increasing number of solved macromolecules provides a solid number of 3D interfaces, if all types of molecular contacts are considered. JAIL annotates three different kinds of macromolecular interfaces: those between interacting protein domains, those between different protein chains and those between proteins and nucleic acids. This results in a total of about 184,000 database entries. All the interfaces can easily be identified by a detailed search form or by a hierarchical tree that describes the protein domain architectures classified by the SCOP database. Visual inspection of the interfaces is possible via an interactive protein viewer. Furthermore, large-scale analyses are supported by both sequence-based and structural clustering. Similar interfaces as well as non-redundant interfaces can be easily picked out. Additionally, the sequence conservation of binding sites is included in the database and is retrievable via Jmol. A comprehensive download section allows the composition of representative data sets with user-defined parameters. The huge data set in combination with various search options allows a comprehensive view of all interfaces between macromolecules included in the Protein Data Bank (PDB). The download of the data sets supports numerous further investigations in macromolecular recognition. JAIL is publicly available at http://bioinformatics.charite.de/jail.
The 2015 Nucleic Acids Research Database Issue and molecular biology database collection.
Galperin, Michael Y; Rigden, Daniel J; Fernández-Suárez, Xosé M
2015-01-01
The 2015 Nucleic Acids Research Database Issue contains 172 papers that include descriptions of 56 new molecular biology databases, and updates on 115 databases whose descriptions have been previously published in NAR or other journals. Following the classification that was introduced last year in order to simplify navigation of the entire issue, these articles are divided into eight subject categories. This year's highlights include RNAcentral, an international community portal to various databases on noncoding RNA; ValidatorDB, a validation database for protein structures and their ligands; SASBDB, a primary repository for small-angle scattering data of various macromolecular complexes; MoonProt, a database of 'moonlighting' proteins, and two new databases of protein-protein and other macromolecular complexes, ComPPI and the Complex Portal. This issue also includes an unusually high number of cancer-related databases and other databases dedicated to genomic basics of disease and potential drugs and drug targets. The size of NAR online Molecular Biology Database Collection, http://www.oxfordjournals.org/nar/database/a/, remained approximately the same, following the addition of 74 new resources and removal of 77 obsolete web sites. The entire Database Issue is freely available online on the Nucleic Acids Research web site (http://nar.oxfordjournals.org/). Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Discovery of novel human acrosin inhibitors by virtual screening
NASA Astrophysics Data System (ADS)
Liu, Xuefei; Dong, Guoqiang; Zhang, Jue; Qi, Jingjing; Zheng, Canhui; Zhou, Youjun; Zhu, Ju; Sheng, Chunquan; Lü, Jiaguo
2011-10-01
Human acrosin is an attractive target for the discovery of male contraceptive drugs. For the first time, structure-based drug design was applied to discover structurally diverse human acrosin inhibitors. A parallel virtual screening strategy in combination with pharmacophore-based and docking-based techniques was used to screen the SPECS database. From 16 compounds selected by virtual screening, a total of 10 compounds were found to be human acrosin inhibitors. Compound 2 was found to be the most potent hit (IC50 = 14 μM) and its binding mode was investigated by molecular dynamics simulations. The hit interacted with human acrosin mainly through hydrophobic and hydrogen-bonding interactions, which provided a good starting structure for further optimization studies.
NASA Technical Reports Server (NTRS)
Timofeeva, Tatiana V.; Nesterov, Vladimir N.; Antipin, Mikhail Yu.; Clark, Ronald D.; Sanghadasa, Mohan; Cardelino, Beatriz H.; Moore, Craig E.; Frazier, Donald O.
1999-01-01
A search for potential nonlinear optical compounds was performed using the Cambridge Structure Database and molecular modeling. We investigated a series of monosubstituted derivatives of dicyanovinylbenzene, since the nonlinear optical (NLO) properties of such derivatives (o-methoxy-dicyanovinylbenzene, DIVA) were studied earlier. The molecular geometry of these compounds was investigated with x-ray analysis and discussed along with the results of molecular mechanics and ab initio quantum chemical calculations. The influence of crystal packing on the planarity of the molecules of this series has been revealed. Two new compounds from the series studied, ortho-F and para-Cl-dicyanovinylbenzene, according to powder measurements, were found to be NLO compounds in the crystal state about 10 times more active than urea. The peculiarities of crystal structure formation in the framework of balance between van der Waals and electrostatic interactions have been discussed. The crystal shape of DIVA and two new NLO compounds have been calculated on the basis of the known crystal structure.
NASA Astrophysics Data System (ADS)
Kerdcharoen, Teerakiat; Morokuma, Keiji
2003-05-01
An extension of the ONIOM (Own N-layered Integrated molecular Orbital and molecular Mechanics) method [M. Svensson, S. Humbel, R. D. J. Froese, T. Mutsubara, S. Sieber, and K. Morokuma, J. Phys. Chem. 100, 19357 (1996)] for simulation in the condensed phase, called ONIOM-XS (XS=eXtension to Solvation) [T. Kerdcharoen and K. Morokuma, Chem. Phys. Lett. 355, 257 (2002)], was applied to investigate the coordination of Ca2+ in liquid ammonia. A coordination number of 6 is found. Previous simulations based on pair potential or pair potential plus three-body correction gave values of 9 and 8.2, respectively. The new value is the same as the coordination number most frequently listed in the Cambridge Structural Database (CSD) and Protein Data Bank (PDB). The N-Ca-N angular distribution reveals a near-octahedral coordination structure. Inclusion of many-body interactions (which amount to 25% of the pair interactions) into the potential energy surface is essential for obtaining a reasonable coordination number. Analyses of the metal coordination in water, water-ammonia mixture, and in proteins reveal that cation/ammonia solution can be used to approximate the coordination environment in proteins.
Analysis of A Drug Target-based Classification System using Molecular Descriptors.
Lu, Jing; Zhang, Pin; Bi, Yi; Luo, Xiaomin
2016-01-01
Drug-target interaction is an important topic in drug discovery and drug repositioning. The KEGG database offers drug annotation and classification using a target-based classification system. In this study, we investigated five target-based classes: (I) G protein-coupled receptors; (II) Nuclear receptors; (III) Ion channels; (IV) Enzymes; (V) Pathogens, using molecular descriptors to represent each drug compound. Two popular feature selection methods, maximum relevance minimum redundancy and incremental feature selection, were adopted to extract the important descriptors. An optimal prediction model based on the nearest neighbor algorithm was then constructed, which gave the best result in identifying drug target-based classes. Finally, some key descriptors were discussed to uncover their important roles in the identification of drug-target classes.
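The nearest neighbor step named above can be illustrated with a minimal sketch: a query compound is assigned the class of the training compound whose descriptor vector is closest. Euclidean distance, the two-value descriptor vectors and the training examples are all assumptions made for illustration; the abstract does not state the distance measure or list the selected descriptors.

```python
import math

def nearest_neighbor(train, query):
    """Return the label of the training vector closest to query.

    train is a list of (descriptor_vector, label) pairs; distance is
    Euclidean, which is an assumption rather than the paper's choice.
    """
    best_label, best_dist = None, math.inf
    for features, label in train:
        d = math.dist(features, query)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

# Hypothetical, already-normalized descriptor vectors for labeled drugs.
training_set = [
    ((0.10, 0.90), "GPCR"),
    ((0.20, 0.80), "GPCR"),
    ((0.85, 0.15), "Enzyme"),
    ((0.90, 0.30), "Enzyme"),
    ((0.50, 0.50), "Ion channel"),
]

predicted = nearest_neighbor(training_set, (0.80, 0.20))
```

In practice the descriptors would first be ranked and pruned by the feature selection methods mentioned in the abstract, and the classifier would be run only on the retained columns.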
MoCha: Molecular Characterization of Unknown Pathways.
Lobo, Daniel; Hammelman, Jennifer; Levin, Michael
2016-04-01
Automated methods for the reverse-engineering of complex regulatory networks are paving the way for the inference of mechanistic comprehensive models directly from experimental data. These novel methods can infer not only the relations and parameters of the known molecules defined in their input datasets, but also unknown components and pathways identified as necessary by the automated algorithms. Identifying the molecular nature of these unknown components is a crucial step for making testable predictions and experimentally validating the models, yet no specific and efficient tools exist to aid in this process. To this end, we present here MoCha (Molecular Characterization), a tool optimized for the search of unknown proteins and their pathways from a given set of known interacting proteins. MoCha uses the comprehensive dataset of protein-protein interactions provided by the STRING database, which currently includes more than a billion interactions from over 2,000 organisms. MoCha is highly optimized, performing typical searches within seconds. We demonstrate the use of MoCha with the characterization of unknown components from reverse-engineered models from the literature. MoCha is useful for working on network models by hand or as a downstream step of a model inference engine workflow and represents a valuable and efficient tool for the characterization of unknown pathways using known data from thousands of organisms. MoCha and its source code are freely available online under the GPLv3 license.
Hemalatha, R G; Pradeep, T
2013-08-07
Differences in the size, shape, and chemical cues of leaves and flowers reflect their underlying genetic makeup and their interactions with the environment. The need to understand the molecular signatures of these fragile plant surfaces is illustrated with a model plant, Madagascar periwinkle (Catharanthus roseus (L.) G. Don). Flat, thin layer chromatographic imprints of leaves/petals were imaged using desorption electrospray ionization mass spectrometry (DESI MS), and the results were compared with electrospray ionization mass spectrometry (ESI MS) of their extracts. Tandem mass spectrometry with DESI and ESI, in conjunction with database records, confirmed the molecular species. This protocol has been extended to other plants. Implications of this study in identifying varietal differences, toxic metabolite production, changes in metabolites during growth, pest/pathogen attack, and natural stresses are shown with illustrations. The possibility of imaging subtle features like eye color of petals, leaf vacuole, leaf margin, and veins is demonstrated.
[Validation of interaction databases in psychopharmacotherapy].
Hahn, M; Roll, S C
2018-03-01
Drug-drug interaction databases are an important tool to increase drug safety in polypharmacy. Several drug interaction databases are available, but it is unclear which one shows the best results and therefore increases safety for the users of the databases and for patients. So far, there has been no validation of German drug interaction databases. The objective was to validate German drug interaction databases with regard to the number of hits, mechanisms of drug interaction, references, clinical advice, and severity of the interaction. A total of 36 drug interactions which were published in the last 3-5 years were checked in 5 different databases. Besides the number of hits, it was also documented whether the mechanism was correct, clinical advice was given, primary literature was cited, and the severity level of the drug-drug interaction was given. All databases showed weaknesses regarding the hit rate of the tested drug interactions, with a maximum of 67.7% hits. The highest score in this validation was achieved by MediQ with 104 out of 180 points. PsiacOnline achieved 83 points, arznei-telegramm® 58, ifap index® 54 and the ABDA-database 49 points. Based on this validation, MediQ seems to be the most suitable database for the field of psychopharmacotherapy. The best results in this comparison were achieved by MediQ, but this database also needs improvement with respect to the hit rate so that users can rely on the results and thereby increase drug therapy safety.
Zhu, Chen; Ai, Lin; Wang, Li; Yin, Pingping; Liu, Chenglan; Li, Shanshan; Zeng, Huiming
2016-01-01
Zoysia japonica brown spot is caused by invasion of the necrotrophic fungus Rhizoctonia solani, which leads to severe financial loss in city lawn and golf ground maintenance. However, little is known about the molecular mechanism of R. solani pathogenicity in Z. japonica. In this study we examined the early stage interaction between R. solani AG1 IA strain and Z. japonica cultivar "Zenith" root by cell ultra-structure analysis, pathogenesis-related protein assay and transcriptome analysis to explore molecular clues for AG1 IA strain pathogenicity in Z. japonica. No obvious cell structure damage was found in infected roots, and most pathogenesis-related protein activities showed a downward trend, especially at 36 h post inoculation, which reflects the stealthy invasion characteristic of the AG1 IA strain. According to Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) database classification, most DEGs in infected "Zenith" roots changed dynamically, especially in three aspects: signal transduction, gene translation, and protein synthesis. A total of 3422 unigenes of "Zenith" root were predicted to belong to 14 classes of resistance (R) genes. Potential fungal resistance related unigenes of "Zenith" root were involved in lignin biosynthesis, phytoalexin synthesis, oxidative burst, and wax biosynthesis, while two down-regulated unigenes encoding leucine-rich repeat receptor protein kinase and subtilisin-like protease might be important for host-derived signal perception of AG1 IA strain invasion. According to Pathogen Host Interaction (PHI) database annotation, 1508 unigenes of AG1 IA strain were predicted and classified into 37 known pathogen species; in addition, unigenes encoding virulence, signaling, host stress tolerance, and potential effectors were also predicted. This research uncovered transcriptional profiling during the early phase interaction between R. solani AG1 IA strain and Z. japonica, and will greatly help identify key pathogenicity of the AG1 IA strain.
Columba: an integrated database of proteins, structures, and annotations.
Trissl, Silke; Rother, Kristian; Müller, Heiko; Steinke, Thomas; Koch, Ina; Preissner, Robert; Frömmel, Cornelius; Leser, Ulf
2005-03-31
Structural and functional research often requires the computation of sets of protein structures based on certain properties of the proteins, such as sequence features, fold classification, or functional annotation. Compiling such sets using current web resources is tedious because the necessary data are spread over many different databases. To facilitate this task, we have created COLUMBA, an integrated database of annotations of protein structures. COLUMBA currently integrates twelve different databases, including PDB, KEGG, Swiss-Prot, CATH, SCOP, the Gene Ontology, and ENZYME. The database can be searched using either keyword search or data source-specific web forms. Users can thus quickly select and download PDB entries that, for instance, participate in a particular pathway, are classified as containing a certain CATH architecture, are annotated as having a certain molecular function in the Gene Ontology, and whose structures have a resolution under a defined threshold. The results of queries are provided in both machine-readable extensible markup language and human-readable format. The structures themselves can be viewed interactively on the web. The COLUMBA database facilitates the creation of protein structure data sets for many structure-based studies. It allows queries to be combined across a number of structure-related databases not covered by other projects at present. Thus, information on both many and few protein structures can be used efficiently. The web interface for COLUMBA is available at http://www.columba-db.de.
Spectroscopic data for an astronomy database
NASA Technical Reports Server (NTRS)
Parkinson, W. H.; Smith, Peter L.
1995-01-01
Very few of the atomic and molecular data used in analyses of astronomical spectra are currently available in World Wide Web (WWW) databases that are searchable with hypertext browsers. We have begun to rectify this situation by making extensive atomic data files available with simple search procedures. We have also established links to other on-line atomic and molecular databases. All can be accessed from our database homepage with URL: http://cfa-www.harvard.edu/amp/data/amdata.html.
NASA Technical Reports Server (NTRS)
Timofeeva, Tatyana V.; Nesterov, Vladimir N.; Antipin, Mikhael Y.; Clark, R. D.; Sanghadasa, M.; Cardelino, B. H.; Moore, C. E.; Frazier, Donald O.
2000-01-01
A search for potential nonlinear optical (NLO) compounds has been performed using the Cambridge Structural Database and molecular modeling. We have studied a series of mono-substituted derivatives of dicyanovinylbenzene as the NLO properties of one of its derivatives (o-methoxy-dicyanovinylbenzene, DIVA) were described earlier. The molecular geometry in the series of the compounds studied was investigated with an X-ray analysis and discussed along with results of molecular mechanics and ab initio quantum chemical calculations. The influence of crystal packing on the molecular planarity has been revealed. Two new compounds from the series studied were found to be active for second harmonic generation (SHG) in the powder. The measurements of SHG efficiency have shown that the o-F- and p-Cl-derivatives of dicyanovinylbenzene are about 10 and 20 times more active than urea, respectively. The peculiarities of crystal structure formation in the framework of balance between the van der Waals and electrostatic interactions have been discussed. The crystal morphology of DIVA and two new SHG-active compounds has been calculated on the basis of their known crystal structures.
The immune response against Candida spp. and Sporothrix schenckii.
Martínez-Álvarez, José A; Pérez-García, Luis A; Flores-Carreón, Arturo; Mora-Montes, Héctor M
2014-01-01
Candida albicans is the main causative agent of systemic candidiasis, a condition with high mortality rates. The interaction between C. albicans and immune system components has been thoroughly studied, and nowadays there is a model for the anti-C. albicans immune response; however, little is known about the sensing of other pathogenic species of the Candida genus. Sporothrix schenckii is the causative agent of sporotrichosis, a subcutaneous mycosis, and thus far there is limited information about its interaction with the immune system. In this paper, we review the most recent information about the immune sensing of species from genus Candida and S. schenckii. Thorough searches of scientific journal databases were performed, looking for papers addressing either Candida- or Sporothrix-immune system interactions. There has been significant progress in knowledge of the immune sensing of non-C. albicans species of Candida and of Sporothrix; however, there are still relevant points to address, such as the specific contribution of pathogen-associated molecular patterns (PAMPs) to sensing by different immune cells and the immune receptors involved in such interactions. This manuscript is part of the series of works presented at the "V International Workshop: Molecular genetic approaches to the study of human pathogenic fungi" (Oaxaca, Mexico, 2012). Copyright © 2013 Revista Iberoamericana de Micología. Published by Elsevier Espana. All rights reserved.
Pan, Weiran; Li, Gang; Yang, Xiaoxiao; Miao, Jinming
2015-04-01
This study aims to explore the potential mechanism of glioma through bioinformatic approaches. The gene expression profile (GSE4290) of glioma tumor and non-tumor samples was downloaded from the Gene Expression Omnibus database. A total of 180 samples were available, including 23 non-tumor and 157 tumor samples. The raw data were preprocessed using robust multiarray analysis, and 8,890 differentially expressed genes (DEGs) were identified using a t-test (false discovery rate < 0.0005). Furthermore, 16 known glioma-related genes were extracted from the Genetic Association Database. After mapping the 8,890 DEGs and 16 known glioma-related genes to the Human Protein Reference Database, a glioma-associated protein-protein interaction network (GAPN) was constructed. In addition, 51 sub-networks in GAPN were screened out through Molecular Complex Detection (score ≥ 1), and sub-network 1 was found to have the closest interaction (score = 3). Moreover, for the top 10 sub-networks, Gene Ontology (GO) enrichment analysis (p value < 0.05) was performed, and DEGs involved in sub-networks 1 and 2, such as BRMS1L and CCNA1, were predicted to regulate cell growth, cell cycle, and DNA replication by interacting with known glioma-related genes. Finally, the overlaps of DEGs with human essential, housekeeping, and tissue-specific genes were calculated (p value = 1.0, 1.0, and 0.00014, respectively) and visualized with the Venn Diagram package in R. About 61% of human tissue-specific genes were also DEGs. This research sheds new light on the pathogenesis of glioma based on DEGs and GAPN, and our findings might provide potential targets for clinical glioma treatment.
Chen, Xi; Lu, Fang; Jiang, Lu-di; Cai, Yi-Lian; Li, Gong-Yu; Zhang, Yan-Ling
2016-07-01
Inhibition of cytochrome P450 (CYP450) enzymes is the most common reason for drug interactions, so early prediction of CYP inhibitors can help to decrease the incidence of adverse reactions caused by drug interactions. CYP450 2E1 (CYP2E1), which plays a key role in drug metabolism, has a broad spectrum of substrates. In this study, 32 CYP2E1 inhibitors were collected for the construction of a support vector regression (SVR) model. The test set data were used to verify the CYP2E1 quantitative models and obtain the optimal prediction model for CYP2E1 inhibitors. Meanwhile, a molecular docking program, CDOCKER, was used to analyze the interaction pattern between positive compounds and the active pocket in order to establish the optimal screening model for CYP2E1 inhibitors. The SVR model and the molecular docking prediction model were combined to screen the traditional Chinese medicine database (TCMD), which could improve the calculation efficiency and prediction accuracy. A total of 6 376 traditional Chinese medicine (TCM) compounds predicted by the SVR model were obtained, and after further verification using the molecular docking model, 247 TCM compounds with potential inhibitory activities against CYP2E1 were finally retained. Some of them have been verified by experiments. The results demonstrated that this study could provide guidance for the virtual screening of CYP450 inhibitors and the prediction of CYP-mediated drug-drug interactions (DDIs), and also provide references for clinical rational drug use. Copyright© by the Chinese Pharmaceutical Association.
MSDB: A Comprehensive Database of Simple Sequence Repeats.
Avvaru, Akshay Kumar; Saxena, Saketh; Sowpati, Divya Tej; Mishra, Rakesh Kumar
2017-06-01
Microsatellites, also known as Simple Sequence Repeats (SSRs), are short tandem repeats of 1-6 nt motifs present in all genomes, particularly eukaryotes. Besides their usefulness as genome markers, SSRs have been shown to perform important regulatory functions, and variations in their length at coding regions are linked to several disorders in humans. Microsatellites show a taxon-specific enrichment in eukaryotic genomes, and some may be functional. MSDB (Microsatellite Database) is a collection of >650 million SSRs from 6,893 species including Bacteria, Archaea, Fungi, Plants, and Animals. This database is by far the most exhaustive resource to access and analyze SSR data of multiple species. In addition to exploring data in a customizable tabular format, users can view and compare the data of multiple species simultaneously using our interactive plotting system. MSDB is developed using the Django framework and MySQL. It is freely available at http://tdb.ccmb.res.in/msdb. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
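The definition of an SSR given above (a tandem repeat of a 1-6 nt motif) can be illustrated with a short Python sketch. This is not the pipeline used to build MSDB; the function name `find_ssrs`, the 12 nt minimum length, and the restriction to perfect repeats are assumptions made only for illustration.

```python
import re

def find_ssrs(seq, min_len=12, max_motif=6):
    """Return (start, end, motif, copies) for perfect tandem repeats.

    Assumptions for this sketch: only perfect repeats are reported, and
    a repeat must span at least min_len nucleotides to count as an SSR.
    """
    seq = seq.upper()
    # Lazy motif group, so the shortest repeating unit (1..max_motif nt)
    # is tried first; the backreference then requires at least one more copy.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1+" % max_motif)
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        length = match.end() - match.start()
        if length >= min_len:
            hits.append((match.start(), match.end(), motif, length // len(motif)))
    return hits

if __name__ == "__main__":
    print(find_ssrs("TT" + "AC" * 6 + "GG"))
```

Because matches are found left to right without overlap, a short repeat can consume the first bases of a longer one that starts inside it; a production tool would need to handle that case as well as imperfect repeats.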
Dubovenko, Alexey; Nikolsky, Yuri; Rakhmatulin, Eugene; Nikolskaya, Tatiana
2017-01-01
Analysis of NGS and other sequencing data, gene variants, gene expression, proteomics, and other high-throughput (OMICs) data is challenging because of its biological complexity and high level of technical and biological noise. One way to deal with both problems is to perform analysis with a high fidelity annotated knowledgebase of protein interactions, pathways, and functional ontologies. This knowledgebase has to be structured in a computer-readable format and must include software tools for managing experimental data, analysis, and reporting. Here, we present MetaCore™ and Key Pathway Advisor (KPA), an integrated platform for functional data analysis. On the content side, MetaCore and KPA encompass a comprehensive database of molecular interactions of different types, pathways, network models, and ten functional ontologies covering human, mouse, and rat genes. The analytical toolkit includes tools for gene/protein list enrichment analysis, statistical "interactome" tool for the identification of over- and under-connected proteins in the dataset, and a biological network analysis module made up of network generation algorithms and filters. The suite also features Advanced Search, an application for combinatorial search of the database content, as well as a Java-based tool called Pathway Map Creator for drawing and editing custom pathway maps. Applications of MetaCore and KPA include molecular mode of action of disease research, identification of potential biomarkers and drug targets, pathway hypothesis generation, analysis of biological effects for novel small molecule compounds and clinical applications (analysis of large cohorts of patients, and translational and personalized medicine).
NASA Astrophysics Data System (ADS)
Endres, Christian P.; Schlemmer, Stephan; Schilke, Peter; Stutzki, Jürgen; Müller, Holger S. P.
2016-09-01
The Cologne Database for Molecular Spectroscopy, CDMS, was founded in 1998 to provide in its catalog section line lists of mostly molecular species which are or may be observed in various astronomical sources (usually) by radio astronomical means. The line lists contain transition frequencies with qualified accuracies, intensities, quantum numbers, as well as further auxiliary information. They have been generated from critically evaluated experimental line lists, mostly from laboratory experiments, employing established Hamiltonian models. Separate entries exist for different isotopic species and usually also for different vibrational states. As of December 2015, the number of entries is 792. They are available online as ASCII tables with additional files documenting information on the entries. The Virtual Atomic and Molecular Data Centre, VAMDC, was founded more than 5 years ago as a common platform for atomic and molecular data. This platform facilitates exchange not only between spectroscopic databases related to astrophysics or astrochemistry, but also with collisional and kinetic databases. A dedicated infrastructure was developed to provide a common data format in the various databases, enabling queries to a large variety of databases on atomic and molecular data at once. For CDMS, the incorporation in VAMDC was combined with several modifications to the generation of CDMS catalog entries. Here we introduce related changes to the data structure and the data content in the CDMS. The new data scheme allows us to incorporate all previous data entries and, in addition, to include entries based on new theoretical descriptions. Moreover, the CDMS entries have been transferred into a MySQL database format.
These developments within the VAMDC framework have in part been driven by the needs of the astronomical community to be able to deal efficiently with large data sets obtained with the Herschel Space Telescope or, more recently, with the Atacama Large Millimeter Array.
Burns, Gully A P C; Dasigi, Pradeep; de Waard, Anita; Hovy, Eduard H
2016-01-01
Automated machine-reading biocuration systems typically use sentence-by-sentence information extraction to construct meaning representations for use by curators. This does not directly reflect the typical discourse structure used by scientists to construct an argument from the experimental data available within an article, and is therefore less likely to correspond to representations typically used in biomedical informatics systems (let alone to the mental models that scientists have). In this study, we develop Natural Language Processing methods to locate, extract, and classify the individual passages of text from articles' Results sections that refer to experimental data. In our domain of interest (molecular biology studies of cancer signal transduction pathways), individual articles may contain as many as 30 small-scale individual experiments describing a variety of findings, upon which authors base their overall research conclusions. Our system automatically classifies discourse segments in these texts into seven categories (fact, hypothesis, problem, goal, method, result, implication) with an F-score of 0.68. These segments describe the essential building blocks of scientific discourse to (i) provide context for each experiment, (ii) report experimental details and (iii) explain the data's meaning in context. We evaluate our system on text passages from articles that were curated in molecular biology databases (the Pathway Logic Datum repository, the Molecular Interaction MINT and INTACT databases) linking individual experiments in articles to the type of assay used (coprecipitation, phosphorylation, translocation etc.). We use supervised machine learning techniques on text passages containing unambiguous references to experiments to obtain baseline F1 scores of 0.59 for MINT, 0.71 for INTACT and 0.63 for Pathway Logic.
Although preliminary, these results support the notion that targeting information extraction methods to experimental results could provide accurate, automated methods for biocuration. We also suggest the need for finer-grained curation of experimental methods used when constructing molecular biology databases. © The Author(s) 2016. Published by Oxford University Press.
Virtual Atomic and Molecular Data Center (VAMDC) and Stark-B Database
NASA Astrophysics Data System (ADS)
Dimitrijevic, M. S.; Sahal-Brechot, S.; Kovacevic, A.; Jevremovic, D.; Popovic, L. C.; VAMDC Consortium; Dubernet, Marie-Lise
2012-01-01
The Virtual Atomic and Molecular Data Center (VAMDC) is a European FP7 project that aims to build a flexible and interoperable e-science environment-based interface to existing atomic and molecular data. VAMDC will be built upon the expertise of existing atomic and molecular databases, data producers and service providers, with the specific aim of creating an infrastructure that is easily tuned to the requirements of a wide variety of users in academic, governmental, industrial or public communities. VAMDC will also include the STARK-B database, which contains Stark broadening parameters for a large number of lines, obtained by the semiclassical perturbation method during more than 30 years of collaboration between two authors of this work (MSD and SSB) and their co-workers. In this contribution we review the VAMDC project and the STARK-B database and discuss the benefits of both for the corresponding data users.
Wichapong, K; Nueangaudom, A; Pianwanit, S; Sippl, W; Kokpol, S
2013-09-01
Dengue virus (DV) infections are a serious public health problem and there is currently no vaccine or drug treatment. NS2B/NS3 protease, an essential enzyme for viral replication, is one of the promising targets in the search for drugs against DV. In this research work, virtual screening (VS) was carried out on four multi-conformational databases using several criteria. Firstly, molecular dynamics simulations of the NS2B/NS3 protease and four known inhibitors, which reveal the importance of both electrostatic and van der Waals interactions in stabilizing the ligand-enzyme interaction, were used to generate three different pharmacophore models (a structure-based, a static and a dynamic model). Subsequently, these three models were employed for the pharmacophore search in the VS. Secondly, the set of compounds passing the first criterion was further reduced using Lipinski's rule of five to keep only compounds with drug-like properties. Thirdly, molecular docking calculations were performed to remove compounds with unsuitable ligand-enzyme interactions. Finally, the binding free energy of each compound was calculated. Compounds having better energy than the known inhibitors were selected, and thus 20 potential hits were obtained.
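The second screening criterion, Lipinski's rule of five, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the descriptor values are taken as precomputed inputs, the function names are hypothetical, and the tolerance of at most one violation follows the common formulation of the rule rather than anything stated in the abstract.

```python
def lipinski_violations(mol_weight, logp, h_donors, h_acceptors):
    """Count how many of the four rule-of-five limits a compound exceeds."""
    checks = [
        mol_weight > 500,   # molecular weight above 500 Da
        logp > 5,           # octanol-water partition coefficient above 5
        h_donors > 5,       # more than 5 hydrogen-bond donors
        h_acceptors > 10,   # more than 10 hydrogen-bond acceptors
    ]
    return sum(checks)

def is_drug_like(mol_weight, logp, h_donors, h_acceptors, max_violations=1):
    """Assumed tolerance: a compound passes with at most max_violations."""
    return lipinski_violations(mol_weight, logp, h_donors, h_acceptors) <= max_violations

if __name__ == "__main__":
    # Hypothetical descriptor values, not taken from the study.
    print(is_drug_like(350.4, 2.1, 2, 5))
    print(is_drug_like(720.9, 6.3, 6, 12))
```

Setting `max_violations=0` gives the stricter variant in which every limit must be met.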
Lauria, Antonino; Ippolito, Mario; Almerico, Anna Maria
2009-10-01
Inhibiting a protein that regulates multiple signal transduction pathways in cancer cells is an attractive goal for cancer therapy. Heat shock protein 90 (Hsp90) is one of the most promising molecular targets for such an approach. In fact, Hsp90 is a ubiquitous molecular chaperone protein that is involved in the folding, activation and assembly of many key mediators of signal transduction, cellular growth, differentiation, stress-response and apoptotic pathways. With the aim of analyzing which molecular descriptors are most important in the binding interactions of these classes, we first performed molecular docking experiments on the 187 Hsp90 inhibitors included in BindingDB, a public database of measured binding affinities. Further, for each frozen conformation obtained from the docking, a set of 250 molecular descriptors was calculated, and the resulting Structure/Descriptors matrix was submitted to Principal Component Analysis. The factor scores revealed good clustering of similar compounds in terms of both structural class and activity spectrum, while examination of the loadings of the first two factors also made it possible to study the classes of descriptors that mainly contribute to each one.
Saxena, Shalini; Abdullah, Maaged; Sriram, Dharmarajan; Guruprasad, Lalitha
2017-10-17
MurG (Rv2153c) is a key player in the biosynthesis of the peptidoglycan layer in Mycobacterium tuberculosis (Mtb). This work is an attempt to highlight the structural and functional relationship of Mtb MurG. The three-dimensional (3D) structure of the protein was constructed by homology modelling using Discovery Studio 3.5 software. The quality and consistency of the generated model were assessed by PROCHECK, ProSA and ERRAT. Later, the model was optimized by molecular dynamics (MD) simulations, and the optimized model in complex with the substrate Uridine-diphosphate-N-acetylglucosamine (UD1) allowed us to employ a structure-based virtual screening approach to obtain new hits from the Asinex database using energy-optimized pharmacophore modelling (e-pharmacophore). The pharmacophore model was validated using enrichment calculations, and finally, the validated model was employed for high-throughput virtual screening and molecular docking to identify novel Mtb MurG inhibitors. This study led to the identification of 10 potential compounds with good fitness and docking scores, which make important interactions with the protein active site. The 25 ns MD simulations of three potential lead compounds with the protein confirmed that the structure was stable and made several non-bonding interactions with amino acids such as Leu290, Met310 and Asn167. Hence, we concluded that the identified compounds may act as new leads for the design of Mtb MurG inhibitors.
Urban, Martin; Cuzick, Alayne; Rutherford, Kim; Irvine, Alistair; Pedro, Helder; Pant, Rashmi; Sadanadan, Vidyendra; Khamari, Lokanath; Billal, Santoshkumar; Mohanty, Sagar; Hammond-Kosack, Kim E
2017-01-04
The pathogen-host interactions database (PHI-base) is available at www.phi-base.org. PHI-base contains expertly curated molecular and biological information on genes proven to affect the outcome of pathogen-host interactions reported in peer reviewed research articles. In addition, literature that indicates specific gene alterations that did not affect the disease interaction phenotype is curated to provide complete datasets for comparative purposes. Viruses are not included. Here we describe a revised PHI-base Version 4 data platform with improved search, filtering and extended data display functions. A PHIB-BLAST search function is provided, along with a link to PHI-Canto, a tool for authors to directly curate their own published data into PHI-base. The new release of PHI-base Version 4.2 (October 2016) has an increased data content containing information from 2219 manually curated references. The data provide information on 4460 genes from 264 pathogens tested on 176 hosts in 8046 interactions. Prokaryotic and eukaryotic pathogens are represented in almost equal numbers. About 70% of host species are plants and 30% are other species of medical and/or environmental importance. Additional data types included in PHI-base 4 are the direct targets of pathogen effector proteins in experimental and natural host organisms. The curation problems encountered and the future directions of the PHI-base project are briefly discussed. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Mardi, Mohsen; Karimi Farsad, Laleh; Gharechahi, Javad; Salekdeh, Ghasem Hosseini
2015-01-01
Witches' broom disease of acid lime greatly affects the production of Mexican lime in Iran. It is caused by a phytoplasma (Candidatus Phytoplasma aurantifolia). However, the molecular mechanisms that underlie phytoplasma pathogenicity and the mode of interactions with host plants are largely unknown. Here, high-throughput transcriptome sequencing was conducted to explore gene expression signatures associated with phytoplasma infection in Mexican lime trees. We assembled 78,185 unique transcript sequences (unigenes) with an average length of 530 nt. Of these, 41,805 (53.4%) were annotated against the NCBI non-redundant (nr) protein database using a BLASTx search (e-value ≤ 1e-5). When the abundances of unigenes in healthy and infected plants were compared, 2,805 transcripts showed significant differences (false discovery rate ≤ 0.001 and log2 ratio ≥ 1.5). These differentially expressed genes (DEGs) were significantly enriched in 43 KEGG metabolic and regulatory pathways. The up-regulated DEGs were mainly categorized into pathways with possible implication in plant-pathogen interaction, including cell wall biogenesis and degradation, sucrose metabolism, secondary metabolism, hormone biosynthesis and signalling, amino acid and lipid metabolism, while down-regulated DEGs were predominantly enriched in ubiquitin proteolysis and oxidative phosphorylation pathways. Our analysis provides novel insight into the molecular pathways that are deregulated during the host-pathogen interaction in Mexican lime trees infected by phytoplasma. The findings can be valuable for unravelling the molecular mechanisms of plant-phytoplasma interactions and can pave the way for engineering lime trees with resistance to witches' broom disease.
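The two thresholds quoted above (false discovery rate ≤ 0.001 and log2 ratio ≥ 1.5) can be combined as in the following Python sketch. It assumes the log2 ratio threshold is applied to the absolute value so that both up- and down-regulated transcripts are kept; the function name `select_degs` and the example records are hypothetical and do not come from the study.

```python
def select_degs(records, max_fdr=0.001, min_abs_log2=1.5):
    """Split transcripts into up- and down-regulated DEGs.

    records: iterable of (name, log2_ratio, fdr) tuples, where log2_ratio
    is log2(infected / healthy). Applying the threshold to the absolute
    log2 ratio is an assumption of this sketch.
    """
    up, down = [], []
    for name, log2_ratio, fdr in records:
        if fdr <= max_fdr and abs(log2_ratio) >= min_abs_log2:
            (up if log2_ratio > 0 else down).append(name)
    return up, down

if __name__ == "__main__":
    # Hypothetical unigene records for illustration only.
    example = [
        ("unigene_a", 2.4, 0.0002),    # kept, up-regulated
        ("unigene_b", -1.9, 0.0009),   # kept, down-regulated
        ("unigene_c", 3.0, 0.02),      # dropped: FDR too high
        ("unigene_d", 0.8, 0.0001),    # dropped: fold change too small
    ]
    print(select_degs(example))
```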
Yang, Zhihao; Lin, Yuan; Wu, Jiajin; Tang, Nan; Lin, Hongfei; Li, Yanpeng
2011-10-01
Knowledge about protein-protein interactions (PPIs) unveils the molecular mechanisms of biological processes. However, the volume and content of published biomedical literature on protein interactions is expanding rapidly, making it increasingly difficult for interaction database curators to detect and curate protein interaction information manually. We present a multiple kernel learning-based approach for automatic PPI extraction from biomedical literature. The approach combines feature-based, tree, and graph kernels and merges their output with a Ranking support vector machine (SVM). Experimental evaluations show that the features in individual kernels are complementary and that the kernel combined with Ranking SVM achieves better performance than the individual kernels, the equal weight combination and the optimal weight combination. Our approach can achieve state-of-the-art performance with respect to comparable evaluations, with 64.88% F-score and 88.02% AUC on the AImed corpus. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
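For readers unfamiliar with the F-score reported above, the following Python sketch shows how it is computed from extraction counts, assuming the usual definition as the weighted harmonic mean of precision and recall. The function name and the example counts are hypothetical and unrelated to the AImed results.

```python
def f_score(true_pos, false_pos, false_neg, beta=1.0):
    """F-beta score from counts of extracted interactions.

    precision = TP / (TP + FP), recall = TP / (TP + FN);
    beta = 1 gives the balanced F1 score.
    """
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

if __name__ == "__main__":
    # Hypothetical counts: 6 correct extractions, 2 spurious, 4 missed.
    print(round(f_score(6, 2, 4), 4))
```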
Evidence for a strong sulfur-aromatic interaction derived from crystallographic data.
Zauhar, R J; Colbert, C L; Morgan, R S; Welsh, W J
2000-03-01
We have uncovered new evidence for a significant interaction between divalent sulfur atoms and aromatic rings. Our study involves a statistical analysis of interatomic distances and other geometric descriptors derived from entries in the Cambridge Crystallographic Database (F. H. Allen and O. Kennard, Chem. Design Auto. News, 1993, Vol. 8, pp. 1 and 31-37). A set of descriptors was defined sufficient in number and type so as to elucidate completely the preferred geometry of interaction between six-membered aromatic carbon rings and divalent sulfurs for all crystal structures of nonmetal-bearing organic compounds present in the database. In order to test statistical significance, analogous probability distributions for the interaction of the moiety X-CH(2)-X with aromatic rings were computed, and taken a priori to correspond to the null hypothesis of no significant interaction. Tests of significance were carried out pairwise between probability distributions of sulfur-aromatic interaction descriptors and their CH(2)-aromatic analogues using the Smirnov-Kolmogorov nonparametric test (W. W. Daniel, Applied Nonparametric Statistics, Houghton-Mifflin: Boston, New York, 1978, pp. 276-286), and in all cases significance at the 99% confidence level or better was observed. Local maxima of the probability distributions were used to define a preferred geometry of interaction between the divalent sulfur moiety and the aromatic ring. Molecular mechanics studies were performed in an effort to better understand the physical basis of the interaction. This study confirms observations based on statistics of interaction of amino acids in protein crystal structures (R. S. Morgan, C. E. Tatsch, R. H. Gushard, J. M. McAdon, and P. K. Warme, International Journal of Peptide Protein Research, 1978, Vol. 11, pp. 209-217; R. S. Morgan and J. M. McAdon, International Journal of Peptide Protein Research, 1980, Vol. 15, pp. 177-180; K. S. C. Reid, P. F. Lindley, and J. M. Thornton, FEBS Letters, 1985, Vol. 190, pp. 209-213), as well as studies involving molecular mechanics (G. Nemethy and H. A. Scheraga, Biochemistry and Biophysics Research Communications, 1981, Vol. 98, pp. 482-487) and quantum chemical calculations (B. V. Cheney, M. W. Schulz, and J. Cheney, Biochimica Biophysica Acta, 1989, Vol. 996, pp. 116-124; J. Pranata, Bioorganic Chemistry, 1997, Vol. 25, pp. 213-219), all of which point to the possible importance of the sulfur-aromatic interaction. However, the preferred geometry of the interaction, as determined from our analysis of the small-molecule crystal data, differs significantly from that found by other approaches. Copyright 2000 John Wiley & Sons, Inc.
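The pairwise comparison of descriptor distributions described above relies on the two-sample Kolmogorov-Smirnov statistic, the largest gap between two empirical cumulative distribution functions. A minimal Python sketch of that statistic follows; it is not the authors' code, the function name is hypothetical, and it returns only the statistic D, leaving the conversion to a significance level to standard tables or a statistics library.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D.

    D is the maximum absolute difference between the empirical CDFs of
    the two samples, evaluated at every observed value.
    """
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both pointers past every observation <= x so that ties
        # are counted in both empirical CDFs before they are compared.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

if __name__ == "__main__":
    # Hypothetical distance samples (in angstroms), for illustration only.
    print(ks_statistic([3.4, 3.6, 3.7, 3.9], [3.7, 3.9, 4.2, 4.5]))
```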
Fernández-Suárez, Xosé M; Rigden, Daniel J; Galperin, Michael Y
2014-01-01
The 2014 Nucleic Acids Research Database Issue includes descriptions of 58 new molecular biology databases and recent updates to 123 databases previously featured in NAR or other journals. For convenience, the issue is now divided into eight sections that reflect major subject categories. Among the highlights of this issue are six databases of the transcription factor binding sites in various organisms and updates on such popular databases as CAZy, Database of Genomic Variants (DGV), dbGaP, DrugBank, KEGG, miRBase, Pfam, Reactome, SEED, TCDB and UniProt. There is a strong block of structural databases, which includes, among others, the new RNA Bricks database, updates on PDBe, PDBsum, ArchDB, Gene3D, ModBase, Nucleic Acid Database and the recently revived iPfam database. An update on the NCBI's MMDB describes VAST+, an improved tool for protein structure comparison. Two articles highlight the development of the Structural Classification of Proteins (SCOP) database: one describes SCOPe, which automates assignment of new structures to the existing SCOP hierarchy; the other one describes the first version of SCOP2, with its more flexible approach to classifying protein structures. This issue also includes a collection of articles on bacterial taxonomy and metagenomics, which includes updates on the List of Prokaryotic Names with Standing in Nomenclature (LPSN), Ribosomal Database Project (RDP), the Silva/LTP project and several new metagenomics resources. The NAR online Molecular Biology Database Collection, http://www.oxfordjournals.org/nar/database/c/, has been expanded to 1552 databases. The entire Database Issue is freely available online on the Nucleic Acids Research website (http://nar.oxfordjournals.org/).
[Construction of chemical information database based on optical structure recognition technique].
Lv, C Y; Li, M N; Zhang, L R; Liu, Z M
2018-04-18
To create a protocol that could be used to construct a chemical information database from scientific literature quickly and automatically. Scientific literature, patents and technical reports from different chemical disciplines were collected and stored in PDF format as fundamental datasets. Chemical structures were transformed from published documents and images to machine-readable data by using name conversion technology and the optical structure recognition tool CLiDE. In the process of molecular structure information extraction, Markush structures were enumerated into well-defined monomer molecules by means of QueryTools in the molecule editor ChemDraw. The document management software EndNote X8 was applied to acquire bibliographical references including title, author, journal and year of publication. The text mining toolkit ChemDataExtractor was adopted to retrieve information that could be used to populate a structured chemical database from figures, tables, and textual paragraphs. After this step, detailed manual revision and annotation were conducted in order to ensure the accuracy and completeness of the data. In addition to the literature data, the computing simulation platform Pipeline Pilot 7.5 was utilized to calculate the physical and chemical properties and predict molecular attributes. Furthermore, the open database ChEMBL was linked to fetch known bioactivities, such as indications and targets. After information extraction and data expansion, five separate metadata files were generated, including a molecular structure data file, molecular information, bibliographical references, predictable attributes and known bioactivities. With the canonical simplified molecular input line entry specification (SMILES) as the primary key, the metadata files were associated through common key nodes including molecular number and PDF number to construct an integrated chemical information database. A reasonable construction protocol for a chemical information database was created successfully.
A total of 174 research articles and 25 reviews published in Marine Drugs from January 2015 to June 2016 were collected as the essential data source, and an elementary marine natural product database named PKU-MNPD was built in accordance with this protocol, which contained 3 262 molecules and 19 821 records. This data aggregation protocol is of great help for constructing chemical information databases from original documents with accuracy, comprehensiveness and efficiency. The structured chemical information database can facilitate access to medical intelligence and accelerate the transformation of scientific research achievements.
The 2018 Nucleic Acids Research database issue and the online molecular biology database collection.
Rigden, Daniel J; Fernández, Xosé M
2018-01-04
The 2018 Nucleic Acids Research Database Issue contains 181 papers spanning molecular biology. Among them, 82 are new and 84 are updates describing resources that appeared in the Issue previously. The remaining 15 cover databases most recently published elsewhere. Databases in the area of nucleic acids include 3DIV for visualisation of data on genome 3D structure and RNArchitecture, a hierarchical classification of RNA families. Protein databases include the established SMART, ELM and MEROPS while GPCRdb and the newcomer STCRDab cover families of biomedical interest. In the area of metabolism, HMDB and Reactome both report new features while PULDB appears in NAR for the first time. This issue also contains reports on genomics resources including Ensembl, the UCSC Genome Browser and ENCODE. Update papers from the IUPHAR/BPS Guide to Pharmacology and DrugBank are highlights of the drug and drug target section while a number of proteomics databases including proteomicsDB are also covered. The entire Database Issue is freely available online on the Nucleic Acids Research website (https://academic.oup.com/nar). The NAR online Molecular Biology Database Collection has been updated, reviewing 138 entries, adding 88 new resources and eliminating 47 discontinued URLs, bringing the current total to 1737 databases. It is available at http://www.oxfordjournals.org/nar/database/c/. © The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research.
A comparative cellular and molecular biology of longevity database.
Stuart, Jeffrey A; Liang, Ping; Luo, Xuemei; Page, Melissa M; Gallagher, Emily J; Christoff, Casey A; Robb, Ellen L
2013-10-01
Discovering key cellular and molecular traits that promote longevity is a major goal of aging and longevity research. One experimental strategy is to determine which traits have been selected during the evolution of longevity in naturally long-lived animal species. This comparative approach has been applied to lifespan research for nearly four decades, yielding hundreds of datasets describing aspects of cell and molecular biology hypothesized to relate to animal longevity. Here, we introduce a Comparative Cellular and Molecular Biology of Longevity Database, available at http://genomics.brocku.ca/ccmbl/, as a compendium of comparative cell and molecular data presented in the context of longevity. This open access database will facilitate the meta-analysis of amalgamated datasets using standardized maximum lifespan (MLSP) data (from AnAge). The first edition contains over 800 data records describing experimental measurements of cellular stress resistance, reactive oxygen species metabolism, membrane composition, protein homeostasis, and genome homeostasis as they relate to vertebrate species MLSP. The purpose of this review is to introduce the database and briefly demonstrate its use in the meta-analysis of combined datasets.
Sun, Mao-Feng; Chen, Hsin-Yi; Tsai, Fuu-Jen; Lui, Shu-Hui; Chen, Chih-Yi; Chen, Calvin Yu-Chian
2011-10-01
Two nuclear plant disasters occurring within a span of 25 years, at Chernobyl and Fukushima, threaten health and genome integrity. The search for remedies capable of enhancing DNA repair efficiency and radiation resistance in humans is now an urgent problem. XRCC4 is an important enhancer in promoting the repair pathway triggered by DNA double-strand breaks (DSBs). In the context of radiation therapy, active XRCC4 could reduce the DSB-mediated apoptotic effect on cancer cells. Hence, developing XRCC4 inhibitors could possibly enhance radiotherapy outcomes. In this study, we screened a traditional Chinese medicine (TCM) database, TCM Database@Taiwan, and identified three potent inhibitor agents against XRCC4. Through molecular dynamics simulation, we have determined that the protein-ligand interactions were focused at Lys188 on chain A and Lys187 on chain B. Intriguingly, the hydrogen bonds for all three ligands fluctuated frequently but were held at close approximation. The pi-cation interactions and ionic interactions mediated by o-hydroxyphenyl and carboxyl functional groups, respectively, have been demonstrated to play critical roles in stabilizing binding conformations. Based on these results, we report the identification of potential radiotherapy enhancers from TCM. We further characterized the key binding elements for inhibiting XRCC4 activity.
Zeidán-Chuliá, Fares; Rybarczyk-Filho, José L; Gursoy, Mervi; Könönen, Eija; Uitto, Veli-Jukka; Gursoy, Orhan V; Cakmakci, Lutfu; Moreira, José C F; Gursoy, Ulvi K
2012-06-01
Essential oils carry diverse antimicrobial and anti-enzymatic properties. Matrix metalloproteinase (MMP) inhibition characteristics of Salvia fruticosa Miller (Labiatae), Myrtus communis Linnaeus (Myrtaceae), Juniperus communis Linnaeus (Cupressaceae), and Lavandula stoechas Linnaeus (Labiatae) essential oils were evaluated. Chemical compositions of the essential oils were analyzed by gas chromatography-mass spectrometry (GC-MS). Bioinformatical database analysis was performed by STRING 9.0 and STITCH 2.0 databases, and ViaComplex software. Antibacterial activity of essential oils against periodontopathogens was tested by the disc diffusion assay and the agar dilution method. Cellular proliferation and cytotoxicity were determined by commercial kits. MMP-2 and MMP-9 activities were measured by zymography. Bioinformatical database analyses, under a score of 0.4 (medium) and a prior correction of 0.0, gave rise to a model of a protein (MMPs and tissue inhibitors of metalloproteinases) vs. chemical (essential oil components) interaction network, in which MMPs and essential oil components were interconnected through interaction with hydroxyl radicals, molecular oxygen, and hydrogen peroxide. Components from L. stoechas potentially displayed a higher grade of interaction with MMP-2 and -9. Although antibacterial and growth inhibitory effects of essential oils on the tested periodontopathogens were limited, all of them inhibited MMP-2 in vitro at concentrations of 1 and 5 µL/mL. Moreover, the same concentrations of M. communis and L. stoechas also inhibited MMP-9. MMP-inhibiting concentrations of essential oils were not cytotoxic against keratinocytes. We propose that essential oils may be useful therapeutic agents as MMP inhibitors through a mechanism possibly based on their antioxidant potential.
Searching molecular structure databases with tandem mass spectra using CSI:FingerID
Dührkop, Kai; Shen, Huibin; Meusel, Marvin; Rousu, Juho; Böcker, Sebastian
2015-01-01
Metabolites provide a direct functional signature of cellular state. Untargeted metabolomics experiments usually rely on tandem MS to identify the thousands of compounds in a biological sample. Today, the vast majority of metabolites remain unknown. We present a method for searching molecular structure databases using tandem MS data of small molecules. Our method computes a fragmentation tree that best explains the fragmentation spectrum of an unknown molecule. We use the fragmentation tree to predict the molecular structure fingerprint of the unknown compound using machine learning. This fingerprint is then used to search a molecular structure database such as PubChem. Our method is shown to improve on the competing methods for computational metabolite identification by a considerable margin. PMID:26392543
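The final step described above, searching a structure database with a predicted fingerprint, can be illustrated with a minimal sketch. It assumes fingerprints are represented as sets of "on" bit positions and ranks candidates by Tanimoto similarity; this is a common generic choice and is not claimed to be the scoring used by CSI:FingerID. The candidate names and bit positions are made up for illustration.

```python
# Illustrative sketch only: toy fingerprints and a generic Tanimoto ranking,
# not the scoring function of CSI:FingerID itself.

def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_candidates(predicted: frozenset, database: dict) -> list:
    """Return (name, score) pairs sorted from best to worst match."""
    scored = [(name, tanimoto(predicted, fp)) for name, fp in database.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical candidate structures with made-up bit positions.
database = {
    "candidate_A": frozenset({1, 4, 7, 9}),
    "candidate_B": frozenset({1, 4, 8}),
    "candidate_C": frozenset({2, 3, 5}),
}

# Hypothetical fingerprint predicted from a fragmentation tree.
predicted = frozenset({1, 4, 7})

ranking = rank_candidates(predicted, database)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In practice a predicted fingerprint carries per-bit probabilities rather than hard bits, so a real scoring scheme would likely weight bits by prediction confidence.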
Non-transcriptional interactions of Hox proteins: inventory, facts, and future directions.
Rezsohazy, René
2014-01-01
Hox proteins are conserved homeodomain transcription factors involved in the control of embryo patterning, organ development, and cell differentiation during animal development and adult life. Although recognizably active in gene regulation, accumulating reports support that Hox proteins are also active in controlling other molecular processes like mRNA translation, DNA repair, initiation of DNA replication, and possibly modulation of signal transduction. Here we review experimental evidence as well as database entries indicative of non-transcriptional activities of Hox proteins. Copyright © 2013 Wiley Periodicals, Inc.
Educational websites--Bioinformatics Tools II.
Lomberk, Gwen
2009-01-01
In this issue, the highlighted websites are a continuation of a series of educational websites; this one in particular from a couple of years ago, Bioinformatics Tools [Pancreatology 2005;5:314-315]. These include sites that are valuable resources for many research needs in genomics and proteomics. Bioinformatics has become a laboratory tool to map sequences to databases, develop models of molecular interactions, evaluate structural compatibilities, describe differences between normal and disease-associated DNA, identify conserved motifs within proteins, and chart extensive signaling networks, all in silico. Copyright 2008 S. Karger AG, Basel and IAP.
The 2015 edition of the GEISA spectroscopic database
NASA Astrophysics Data System (ADS)
Jacquinet-Husson, N.; Armante, R.; Scott, N. A.; Chédin, A.; Crépeau, L.; Boutammine, C.; Bouhdaoui, A.; Crevoisier, C.; Capelle, V.; Boonne, C.; Poulet-Crovisier, N.; Barbe, A.; Chris Benner, D.; Boudon, V.; Brown, L. R.; Buldyreva, J.; Campargue, A.; Coudert, L. H.; Devi, V. M.; Down, M. J.; Drouin, B. J.; Fayt, A.; Fittschen, C.; Flaud, J.-M.; Gamache, R. R.; Harrison, J. J.; Hill, C.; Hodnebrog, Ø.; Hu, S.-M.; Jacquemart, D.; Jolly, A.; Jiménez, E.; Lavrentieva, N. N.; Liu, A.-W.; Lodi, L.; Lyulin, O. M.; Massie, S. T.; Mikhailenko, S.; Müller, H. S. P.; Naumenko, O. V.; Nikitin, A.; Nielsen, C. J.; Orphal, J.; Perevalov, V. I.; Perrin, A.; Polovtseva, E.; Predoi-Cross, A.; Rotger, M.; Ruth, A. A.; Yu, S. S.; Sung, K.; Tashkun, S. A.; Tennyson, J.; Tyuterev, Vl. G.; Vander Auwera, J.; Voronin, B. A.; Makie, A.
2016-09-01
The GEISA database (Gestion et Etude des Informations Spectroscopiques Atmosphériques: Management and Study of Atmospheric Spectroscopic Information) has been developed and maintained by the http://ara.abct.lmd.polytechnique.fr. The "line parameters database" contains 52 molecular species (118 isotopologues) and transitions in the spectral range from 10^-6 to 35,877.031 cm^-1, representing 5,067,351 entries, against 3,794,297 in GEISA-2011. Among the previously existing molecules, 20 molecular species have been updated. A new molecule (SO3) has been added. HDO, isotopologue of H2O, is now identified as an independent molecular species. Seven new isotopologues have been added to the GEISA-2015 database. The "cross section sub-database" has been enriched by the addition of 43 new molecular species in its infrared part; 4 molecules (ethane, propane, acetone, acetonitrile) are also updated and represent 3% of the update. A new section is added, in the near-infrared spectral region, involving 7 molecular species: CH3CN, CH3I, CH3O2, H2CO, HO2, HONO, NH3. The "microphysical and optical properties of atmospheric aerosols sub-database" has been updated for the first time since 2003. It contains more than 40 species originating from NCAR and 20 from the http://eodg.atm.ox.ac.uk/ARIA/introduction_nocol.html. As for the previous versions, this new release of GEISA and associated management software facilities are implemented and freely accessible on the http://cds-espri.ipsl.fr/etherTypo/?id=950.
Enhancing AFLOW Visualization using Jmol
NASA Astrophysics Data System (ADS)
Lanasa, Jacob; New, Elizabeth; Stefek, Patrik; Honaker, Brigette; Hanson, Robert; Aflow Collaboration
The AFLOW library is a database of theoretical solid-state structures and calculated properties created using high-throughput ab initio calculations. Jmol is a Java-based program capable of visualizing and analyzing complex molecular structures and energy landscapes. In collaboration with the AFLOW consortium, our goal is the enhancement of the AFLOWLIB database through the extension of Jmol's capabilities in the area of materials science. Modifications made to Jmol include the ability to read and visualize AFLOW binary alloy data files, the ability to extract information from these files using Jmol scripting macros that can be utilized in the creation of interactive web-based convex hull graphs, the capability to identify and classify local atomic environments by symmetry, and the ability to search one or more related crystal structures for atomic environments using a novel extension of inorganic polyhedron-based SMILES strings.
USDA-ARS's Scientific Manuscript database
This article documents the addition of 220 microsatellite marker loci to the Molecular Ecology Resources Database. Loci were developed for the following species: Allanblackia floribunda, Amblyraja radiata, Bactrocera cucurbitae, Brachycaudus helichrysi, Calopogonium mucunoides, Dissodactylus primiti...
MMDB: Entrez’s 3D-structure database
Wang, Yanli; Anderson, John B.; Chen, Jie; Geer, Lewis Y.; He, Siqian; Hurwitz, David I.; Liebert, Cynthia A.; Madej, Thomas; Marchler, Gabriele H.; Marchler-Bauer, Aron; Panchenko, Anna R.; Shoemaker, Benjamin A.; Song, James S.; Thiessen, Paul A.; Yamashita, Roxanne A.; Bryant, Stephen H.
2002-01-01
Three-dimensional structures are now known within many protein families and it is quite likely, in searching a sequence database, that one will encounter a homolog with known structure. The goal of Entrez’s 3D-structure database is to make this information, and the functional annotation it can provide, easily accessible to molecular biologists. To this end Entrez’s search engine provides three powerful features. (i) Sequence and structure neighbors; one may select all sequences similar to one of interest, for example, and link to any known 3D structures. (ii) Links between databases; one may search by term matching in MEDLINE, for example, and link to 3D structures reported in these articles. (iii) Sequence and structure visualization; identifying a homolog with known structure, one may view molecular-graphic and alignment displays, to infer approximate 3D structure. In this article we focus on two features of Entrez’s Molecular Modeling Database (MMDB) not described previously: links from individual biopolymer chains within 3D structures to a systematic taxonomy of organisms represented in molecular databases, and links from individual chains (and compact 3D domains within them) to structure neighbors, other chains (and 3D domains) with similar 3D structure. MMDB may be accessed at http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Structure. PMID:11752307
NASA Astrophysics Data System (ADS)
Wu, Zhengyu
Part I of this dissertation studies the bonding in chemical reactions, while Part II studies the bonding related to inter- and intra-molecular interactions. Part III studies the application of IT technology in chemistry education. Part I of this dissertation (chapter 1 and chapter 2) focuses on the theoretical studies on the mechanism of the hydrolysis reactions of benzenediazonium ion and guaninediazonium ion. The major conclusion is that in hydrolysis reactions the "unimolecular mechanism" actually has to involve the reacting solvent molecule. Therefore, the unimolecular pathway can only serve as a conceptual model but will not happen in reality. Chapter 1 concludes that the hydrolysis reaction of benzenediazonium ion takes the direct SN2Ar mechanism via a transition state but without going through a pre-coordination complex. Chapter 2 concludes that the formation of xanthine from the dediazoniation reaction of guaninediazonium ion in water takes the SN2Ar pathway without a transition state. Oxanine might come from an intermediate formed by the bimolecular deprotonation of the H atom on N3 of guaninediazonium ion synchronized with the pyrimidine ring opening reaction. Part II of this dissertation includes chapters 3, 4, and 5. Chapter 3 studies the quadrupole moment of benzene and quadrupole-quadrupole interactions. We concluded that the quadrupole-quadrupole interaction is important in the arene-arene interactions. Our study shows the most stable structure of benzene dimer is the point-to-face T-shaped structure. Chapter 4 studies the intermolecular interactions that result in the disorder of the crystal of 4-Chloroacetophenone-(4-methoxyphenylethylidene). We analyzed all the nearest neighbor interactions within that crystal and found that the crystal structure is determined by its thermodynamic properties. Our calculation perfectly reproduced the percentage of parallel-alignment of the crystal.
Part III of this dissertation is focused on the application of database management systems and computer technology to chemistry education. A database-supported webtool was developed to support the creation of news portfolios and peer reviews online. The responses to an in-class survey show that students embrace the use of this webtool for its conceptually clear design and its ease of use.
Rice proteome analysis: a step toward functional analysis of the rice genome.
Komatsu, Setsuko; Tanaka, Naoki
2005-03-01
The technique of proteome analysis using 2-DE has the power to monitor global changes that occur in the protein complement of tissues and subcellular compartments. In this review, we describe construction of the rice proteome database, the cataloging of rice proteins, and the functional characterization of some of the proteins identified. Initially, proteins extracted from various tissues and organelles were separated by 2-DE and an image analyzer was used to construct a display or reference map of the proteins. The rice proteome database currently contains 23 reference maps based on 2-DE of proteins from different rice tissues and subcellular compartments. These reference maps comprise 13 129 rice proteins, and the amino acid sequences of 5092 of these proteins are entered in the database. Major proteins involved in growth or stress responses have been identified by using a proteomics approach and some of these proteins have unique functions. Furthermore, initial work has also begun on analyzing the phosphoproteome and protein-protein interactions in rice. The information obtained from the rice proteome database will aid in the molecular cloning of rice genes and in predicting the function of unknown proteins.
Radiation damage of biomolecules (RADAM) database development: current status
NASA Astrophysics Data System (ADS)
Denifl, S.; Garcia, G.; Huber, B. A.; Marinković, B. P.; Mason, N.; Postler, J.; Rabus, H.; Rixon, G.; Solov'yov, A. V.; Suraud, E.; Yakubovich, A. V.
2013-06-01
Ion beam therapy offers the possibility of excellent dose localization for treatment of malignant tumours, minimizing radiation damage in normal tissue, while maximizing cell killing within the tumour. However, as the underlying dependent physical, chemical and biological processes are too complex to treat them on a purely analytical level, most of our current and future understanding will rely on computer simulations, based on mathematical equations, algorithms and last, but not least, on the available atomic and molecular data. The viability of the simulated output and the success of any computer simulation will be determined by these data, which are treated as the input variables in each computer simulation performed. The radiation research community lacks a complete database for the cross sections of all the different processes involved in ion beam induced damage: ionization and excitation cross sections for ions with liquid water and biological molecules, all the possible electron - medium interactions, dielectric response data, electron attachment to biomolecules etc. In this paper we discuss current progress in the creation of such a database, outline the roadmap of the project and review plans for the exploitation of such a database in future simulations.
3DNALandscapes: a database for exploring the conformational features of DNA.
Zheng, Guohui; Colasanti, Andrew V; Lu, Xiang-Jun; Olson, Wilma K
2010-01-01
3DNALandscapes, located at: http://3DNAscapes.rutgers.edu, is a new database for exploring the conformational features of DNA. In contrast to most structural databases, which archive the Cartesian coordinates and/or derived parameters and images for individual structures, 3DNALandscapes enables searches of conformational information across multiple structures. The database contains a wide variety of structural parameters and molecular images, computed with the 3DNA software package and known to be useful for characterizing and understanding the sequence-dependent spatial arrangements of the DNA sugar-phosphate backbone, sugar-base side groups, base pairs, base-pair steps, groove structure, etc. The data comprise all DNA-containing structures--both free and bound to proteins, drugs and other ligands--currently available in the Protein Data Bank. The web interface allows the user to link, report, plot and analyze this information from numerous perspectives and thereby gain insight into DNA conformation, deformability and interactions in different sequence and structural contexts. The data accumulated from known, well-resolved DNA structures can serve as useful benchmarks for the analysis and simulation of new structures. The collective data can also help to understand how DNA deforms in response to proteins and other molecules and undergoes conformational rearrangements.
2013-01-01
Background Contemporary coral reef research has firmly established that a genomic approach is urgently needed to better understand the effects of anthropogenic environmental stress and global climate change on coral holobiont interactions. Here we present KEGG orthology-based annotation of the complete genome sequence of the scleractinian coral Acropora digitifera and provide the first comprehensive view of the genome of a reef-building coral by applying advanced bioinformatics. Description Sequences from the KEGG database of protein function were used to construct hidden Markov models. These models were used to search the predicted proteome of A. digitifera to establish complete genomic annotation. The annotated dataset is published in ZoophyteBase, an open access format with different options for searching the data. A particularly useful feature is the ability to use a Google-like search engine that links query words to protein attributes. We present features of the annotation that underpin the molecular structure of key processes of coral physiology that include (1) regulatory proteins of symbiosis, (2) planula and early developmental proteins, (3) neural messengers, receptors and sensory proteins, (4) calcification and Ca2+-signalling proteins, (5) plant-derived proteins, (6) proteins of nitrogen metabolism, (7) DNA repair proteins, (8) stress response proteins, (9) antioxidant and redox-protective proteins, (10) proteins of cellular apoptosis, (11) microbial symbioses and pathogenicity proteins, (12) proteins of viral pathogenicity, (13) toxins and venom, (14) proteins of the chemical defensome and (15) coral epigenetics. 
Conclusions We advocate that providing annotation in an open-access searchable database available to the public domain will give an unprecedented foundation to interrogate the fundamental molecular structure and interactions of coral symbiosis and allow critical questions to be addressed at the genomic level based on combined aspects of evolutionary, developmental, metabolic, and environmental perspectives. PMID:23889801
Xie, Zhihui; Li, Jing; Baker, Jonathan; Eagleson, Kathie L.; Coba, Marcelo P.; Levitt, Pat
2016-01-01
Background Atypical synapse development and plasticity are implicated in many neurodevelopmental disorders (NDDs). NDD-associated, high confidence risk genes have been identified, yet little is known about functional relationships at the level of protein-protein interactions, which are the dominant molecular bases responsible for mediating circuit development. Methods Proteomics in three independent developing neocortical synaptosomal preparations identified putative interacting proteins of the ligand-activated MET receptor tyrosine kinase, an autism risk gene that mediates synapse development. The candidates were translated into interactome networks and analyzed bioinformatically. Additionally, three independent quantitative proximity ligation assays (PLA) in cultured neurons and four independent immunoprecipitation analyses of synaptosomes validated protein interactions. Results Approximately 11% (8/72) of MET-interacting proteins, including SHANK3, SYNGAP1 and GRIN2B, are associated with NDDs. Proteins in the MET interactome were translated into a novel MET interactome network based on human protein-protein interaction databases. High confidence genes from different NDD datasets that encode synaptosomal proteins were analyzed for being enriched in MET interactome proteins. This was found for autism, but not schizophrenia, bipolar disorder, major depressive disorder or attentional deficit hyperactivity disorder. There is correlated gene expression between MET and its interactive partners in developing human temporal and visual neocortices, but not with highly expressed genes that are not in the interactome. PLA and biochemical analyses demonstrate that MET-protein partner interactions are dynamically regulated by receptor activation. Conclusions The results provide a novel molecular framework for deciphering the functional relations of key regulators of synaptogenesis that contribute to both typical cortical development and to NDDs. PMID:27086544
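The enrichment question raised above (whether disorder-associated genes are over-represented among interactome proteins) is commonly framed as a hypergeometric tail probability. The sketch below is a generic illustration, not the authors' statistical procedure. Only the counts 8 and 72 come from the abstract; the background size and the number of risk genes in the background are hypothetical placeholders.

```python
from math import comb


def hypergeom_sf(k: int, N: int, K: int, n: int) -> float:
    """P(X >= k) when n items are drawn without replacement from a
    population of N items, K of which are 'successes'."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# From the abstract: 8 of 72 interactome proteins are NDD-associated.
observed_hits = 8
interactome_size = 72

# Hypothetical placeholders (NOT from the source): background proteome size
# and number of NDD-associated proteins within that background.
background_size = 2000
background_ndd = 100

p_value = hypergeom_sf(observed_hits, background_size, background_ndd, interactome_size)
print(f"P(X >= {observed_hits}) = {p_value:.4f}")
```

The result depends entirely on the chosen background, which is why the choice of reference set (here, synaptosomal proteins in the study) matters for any enrichment claim.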
Brozovic, Matija; Dantec, Christelle; Dardaillon, Justine; Dauga, Delphine; Faure, Emmanuel; Gineste, Mathieu; Louis, Alexandra; Naville, Magali; Nitta, Kazuhiro R; Piette, Jacques; Reeves, Wendy; Scornavacca, Céline; Simion, Paul; Vincentelli, Renaud; Bellec, Maelle; Aicha, Sameh Ben; Fagotto, Marie; Guéroult-Bellone, Marion; Haeussler, Maximilian; Jacox, Edwin; Lowe, Elijah K; Mendez, Mickael; Roberge, Alexis; Stolfi, Alberto; Yokomori, Rui; Brown, C Titus; Cambillau, Christian; Christiaen, Lionel; Delsuc, Frédéric; Douzery, Emmanuel; Dumollard, Rémi; Kusakabe, Takehiro; Nakai, Kenta; Nishida, Hiroki; Satou, Yutaka; Swalla, Billie; Veeman, Michael; Volff, Jean-Nicolas; Lemaire, Patrick
2018-01-04
ANISEED (www.aniseed.cnrs.fr) is the main model organism database for tunicates, the sister-group of vertebrates. This release gives access to annotated genomes, gene expression patterns, and anatomical descriptions for nine ascidian species. It provides increased integration with external molecular and taxonomy databases, better support for epigenomics datasets, in particular RNA-seq, ChIP-seq and SELEX-seq, and features novel interactive interfaces for existing and novel datatypes. In particular, the cross-species navigation and comparison is enhanced through a novel taxonomy section describing each represented species and through the implementation of interactive phylogenetic gene trees for 60% of tunicate genes. The gene expression section displays the results of RNA-seq experiments for the three major model species of solitary ascidians. Gene expression is controlled by the binding of transcription factors to cis-regulatory sequences. A high-resolution description of the DNA-binding specificity for 131 Ciona robusta (formerly C. intestinalis type A) transcription factors by SELEX-seq is provided and used to map candidate binding sites across the Ciona robusta and Phallusia mammillata genomes. Finally, use of a WashU Epigenome browser enhances genome navigation, while a Genomicus server was set up to explore microsynteny relationships within tunicates and with vertebrates, Amphioxus, echinoderms and hemichordates. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Luo, Jie; Shi, Ke; Yin, Shu-Ya; Tang, Rui-Xue; Chen, Wen-Jie; Huang, Lin-Zhen; Gan, Ting-Qing; Cai, Zheng-Wen; Chen, Gang
2018-04-10
MiR-182-5p, a member of the miRNA family, can be detected in lung cancer and plays an important role in lung cancer. This study aimed to explore the clinical value of miR-182-5p in lung squamous cell carcinoma (LUSC) and to unveil the molecular mechanism of LUSC. The clinical value of miR-182-5p in LUSC was investigated by collecting and calculating data from The Cancer Genome Atlas (TCGA) database, the Gene Expression Omnibus (GEO) database, and real-time quantitative polymerase chain reaction (RT-qPCR). Twelve prediction platforms were used to predict the target genes of miR-182-5p. Protein-protein interaction (PPI) networks and gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were used to explore the molecular mechanism of LUSC. The expression of miR-182-5p was significantly higher in LUSC than in non-cancerous tissues, as evidenced by various approaches, including the TCGA database, GEO microarrays, RT-qPCR, and a comprehensive meta-analysis of 501 LUSC cases and 148 non-cancerous cases. Furthermore, a total of 81 potential target genes were chosen from the union of predicted genes and the TCGA database. GO and KEGG analyses demonstrated that the target genes are involved in pathways related to biological processes. PPIs revealed the relationships between these genes, with EPAS1, PRKCE, NR3C1, and RHOB being located in the center of the PPI network. MiR-182-5p upregulation greatly contributes to LUSC and may serve as a biomarker in LUSC.
Draper, John; Enot, David P; Parker, David; Beckmann, Manfred; Snowdon, Stuart; Lin, Wanchang; Zubair, Hassan
2009-01-01
Background Metabolomics experiments using Mass Spectrometry (MS) technology measure the mass to charge ratio (m/z) and intensity of ionised molecules in crude extracts of complex biological samples to generate high dimensional metabolite 'fingerprint' or metabolite 'profile' data. High resolution MS instruments perform routinely with a mass accuracy of < 5 ppm (parts per million) thus providing potentially a direct method for signal putative annotation using databases containing metabolite mass information. Most database interfaces support only simple queries with the default assumption that molecules either gain or lose a single proton when ionised. In reality the annotation process is confounded by the fact that many ionisation products will be not only molecular isotopes but also salt/solvent adducts and neutral loss fragments of original metabolites. This report describes an annotation strategy that will allow searching based on all potential ionisation products predicted to form during electrospray ionisation (ESI). Results Metabolite 'structures' harvested from publicly accessible databases were converted into a common format to generate a comprehensive archive in MZedDB. 'Rules' were derived from chemical information that allowed MZedDB to generate a list of adducts and neutral loss fragments putatively able to form for each structure and calculate, on the fly, the exact molecular weight of every potential ionisation product to provide targets for annotation searches based on accurate mass. We demonstrate that data matrices representing populations of ionisation products generated from different biological matrices contain a large proportion (sometimes > 50%) of molecular isotopes, salt adducts and neutral loss fragments. Correlation analysis of ESI-MS data features confirmed the predicted relationships of m/z signals. An integrated isotope enumerator in MZedDB allowed verification of exact isotopic pattern distributions to corroborate experimental data. 
Conclusion We conclude that although ultra-high accurate mass instruments provide major insight into the chemical diversity of biological extracts, the facile annotation of a large proportion of signals is not possible by simple, automated query of current databases using computed molecular formulae. Parameterising MZedDB to take into account predicted ionisation behaviour and the biological source of any sample improves greatly both the frequency and accuracy of potential annotation 'hits' in ESI-MS data. PMID:19622150
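The annotation idea described above, predicting the m/z of several ionisation products per metabolite and matching within a ppm tolerance, can be sketched as follows. This is an illustrative sketch only: the adduct list is a small assumed subset, the function names are hypothetical, and none of it reflects MZedDB's actual rule set or interface.

```python
# Illustrative sketch: a handful of common singly charged ESI ions,
# not the adduct rules implemented in MZedDB.

PROTON = 1.007276  # mass of a proton in Da

# Mass shift (Da) added to the neutral monoisotopic mass for each ion type.
ADDUCT_SHIFTS = {
    "[M+H]+": PROTON,
    "[M+Na]+": 22.989218,
    "[M+K]+": 38.963158,
    "[M+NH4]+": 18.033823,
    "[M-H]-": -PROTON,
    "[M+H-H2O]+": PROTON - 18.010565,  # protonation plus neutral water loss
}


def ionisation_products(mono_mass: float) -> dict:
    """Predicted m/z for each singly charged ion type in ADDUCT_SHIFTS."""
    return {name: mono_mass + shift for name, shift in ADDUCT_SHIFTS.items()}


def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def annotate(observed_mz: float, mono_mass: float, tol_ppm: float = 5.0) -> list:
    """Names of ion types whose predicted m/z lies within tol_ppm of the signal."""
    return [
        name
        for name, mz in ionisation_products(mono_mass).items()
        if abs(ppm_error(observed_mz, mz)) <= tol_ppm
    ]


GLUCOSE = 180.063388  # monoisotopic mass of C6H12O6 in Da

# A signal near 203.0526 is consistent with the sodium adduct of glucose,
# which a query assuming only [M+H]+ would miss.
hits = annotate(203.0526, GLUCOSE)
print(hits)
```

The 5 ppm default mirrors the mass accuracy quoted in the abstract; a tighter or looser tolerance changes how many candidate ion types survive.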
Mallik, Mrinmay Kumar
2018-02-07
Biological networks can be analyzed using "Centrality Analysis" to identify the more influential nodes and interactions in the network. This study was undertaken to create and visualize a biological network comprising protein-protein interactions (PPIs) amongst proteins which are preferentially over-expressed in the glioma cancer stem cell component (GCSC) of glioblastomas as compared to the glioma non-stem cancer cell (GNSC) component and then to analyze this network through centrality analyses (CA) in order to identify the essential proteins in this network and their interactions. In addition, this study proposes a new centrality analysis method pertaining exclusively to transcription factors (TFs) and interactions amongst them. Moreover, the relevant molecular functions, biological processes and biochemical pathways amongst these proteins were sought through enrichment analysis. A protein interaction network was created using a list of proteins which have been shown to be preferentially expressed or over-expressed in GCSCs isolated from glioblastomas as compared to the GNSCs. This list comprising 38 proteins, created using manual literature mining, was submitted to the Reactome FIViz tool, a web-based application integrated into Cytoscape, an open source software platform for visualizing and analyzing molecular interaction networks and biological pathways, to produce the network. This network was subjected to centrality analyses utilizing ranked lists of six centrality measures using the FIViz application and (for the first time) a dedicated centrality analysis plug-in, CytoNCA. The interactions exclusively amongst the transcription factors were analyzed through a newly proposed centrality analysis method called "Gene Expression Associated Degree Centrality Analysis (GEADCA)". Enrichment analysis was performed using the "network function analysis" tool on Reactome.
The CA was able to identify a small set of proteins with consistently high centrality ranks that is indicative of their strong influence in the protein-protein interaction network. Similarly, the newly proposed GEADCA helped identify the transcription factors with high centrality values indicative of their key roles in transcriptional regulation. The enrichment studies provided a list of molecular functions, biological processes and biochemical pathways associated with the constructed network. The study shows how pathway-based databases may be used to create and analyze a relevant protein interaction network in glioma cancer stem cells and identify the essential elements within it to gather insights into the molecular interactions that regulate the properties of glioma stem cells. How these insights may be utilized to help the development of future research towards formulation of new management strategies is discussed from a theoretical standpoint. Copyright © 2017 Elsevier Ltd. All rights reserved.
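As a minimal illustration of the centrality ranking described above, the sketch below computes normalised degree centrality, one of the simplest centrality measures, on a toy undirected network. The node names P1 to P5 and their edges are hypothetical and are not taken from the study. The sketch does not reproduce CytoNCA, FIViz, or the proposed GEADCA method.

```python
from collections import defaultdict


def degree_centrality(edges: list) -> dict:
    """Normalised degree centrality: number of neighbours divided by (n - 1),
    where n is the number of nodes in the undirected network."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


# Hypothetical toy interaction network (placeholder protein names).
edges = [
    ("P1", "P2"),
    ("P1", "P3"),
    ("P1", "P4"),
    ("P2", "P3"),
    ("P4", "P5"),
]

centrality = degree_centrality(edges)

# Rank nodes from most to least connected, breaking ties alphabetically.
ranked = sorted(centrality, key=lambda node: (-centrality[node], node))
for node in ranked:
    print(f"{node}: {centrality[node]:.2f}")
```

Combining ranks from several measures (degree, betweenness, closeness and so on), as the abstract describes, helps avoid over-interpreting a node that scores highly on only one of them.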
Sialyldisaccharide conformations: a molecular dynamics perspective
NASA Astrophysics Data System (ADS)
Selvin, Jeyasigamani F. A.; Priyadarzini, Thanu R. K.; Veluraja, Kasinadar
2012-04-01
Sialyldisaccharides are significant terminal components of glycoconjugates and their negative charge and conformation are extensively utilized in molecular recognition processes. The conformation and flexibility of four biologically important sialyldisaccharides [Neu5Acα(2-3)Gal, Neu5Acα(2-6)Gal, Neu5Acα(2-8)Neu5Ac and Neu5Acα(2-9)Neu5Ac] are studied using Molecular Dynamics simulations of 20 ns duration to deduce the conformational preferences of the sialyldisaccharides and the interactions which stabilize the conformations. This study clearly describes the possible conformational models of sialyldisaccharides deduced from 20 ns Molecular Dynamics simulations and our results confirm the role of water in the structural stabilization of sialyldisaccharides. An extensive analysis on the sialyldisaccharide structures available in PDB also confirms the conformational regions found by experiments are detected in MD simulations of 20 ns duration. The three dimensional structural coordinates for all the MD derived sialyldisaccharide conformations are deposited in the 3DSDSCAR database and these conformational models will be useful for glycobiologists and biotechnologists to understand the biological functions of sialic acid containing glycoconjugates.
Metaproteomics as a Complementary Approach to Gut Microbiota in Health and Disease
NASA Astrophysics Data System (ADS)
Petriz, Bernardo A.; Franco, Octávio L.
2017-01-01
Classic studies on phylotype profiling are limited to the identification of microbial constituents, where information is lacking about the molecular interaction of these bacterial communities with the host genome and the possible outcomes in host biology. A range of OMICs approaches have provided great progress linking the microbiota to health and disease. However, the investigation of this context through proteomic mass spectrometry-based tools is still being improved. Therefore, metaproteomics or community proteogenomics has emerged as a complementary approach to metagenomic data, as a field in proteomics aiming to perform large-scale characterization of proteins from environmental microbiota such as the human gut. The advances in molecular separation methods coupled with mass spectrometry (e.g. LC-MS/MS) and proteome bioinformatics have been fundamental in these novel large-scale metaproteomic studies, which have further been performed in a wide range of samples including soil, plant and human environments. Metaproteomic studies will make major progress if a comprehensive database covering the genes and expressed proteins from all gut microbial species is developed. To this end, we here present some of the main limitations of metaproteomic studies in complex microbiota environments such as the gut, also addressing the up-to-date pipelines in sample preparation prior to fractionation/separation and mass spectrometry analysis. In addition, a novel approach to the limitations of metagenomic databases is also discussed. Finally, prospects are addressed regarding the application of metaproteomic analysis using a unified host-microbiome gene database and other meta-OMICs platforms.
A Molecular Framework for Understanding DCIS
2016-10-01
well. Pathologic and Clinical Annotation Database A clinical annotation database titled the Breast Oncology Database has been established to...complement the procured SPORE sample characteristics and annotated pathology data. This Breast Oncology Database is an offsite clinical annotation...database adheres to CSMC Enterprise Information Services (EIS) research database security standards. The Breast Oncology Database consists of: 9 Baseline
PLI: a web-based tool for the comparison of protein-ligand interactions observed on PDB structures.
Gallina, Anna Maria; Bisignano, Paola; Bergamino, Maurizio; Bordo, Domenico
2013-02-01
A large fraction of the entries contained in the Protein Data Bank describe proteins in complex with low molecular weight molecules such as physiological compounds or synthetic drugs. In many cases, the same molecule is found in distinct protein-ligand complexes. There is an increasing interest in Medicinal Chemistry in comparing protein binding sites to get insight on interactions that modulate the binding specificity, as this structural information can be correlated with other experimental data of biochemical or physiological nature and may help in rational drug design. The web service protein-ligand interaction presented here provides a tool to analyse and compare the binding pockets of homologous proteins in complex with a selected ligand. The information is deduced from protein-ligand complexes present in the Protein Data Bank and stored in the underlying database. Freely accessible at http://bioinformatics.istge.it/pli/.
Enhancing UCSF Chimera through web services
Huang, Conrad C.; Meng, Elaine C.; Morris, John H.; Pettersen, Eric F.; Ferrin, Thomas E.
2014-01-01
Integrating access to web services with desktop applications allows for an expanded set of application features, including performing computationally intensive tasks and convenient searches of databases. We describe how we have enhanced UCSF Chimera (http://www.rbvi.ucsf.edu/chimera/), a program for the interactive visualization and analysis of molecular structures and related data, through the addition of several web services (http://www.rbvi.ucsf.edu/chimera/docs/webservices.html). By streamlining access to web services, including the entire job submission, monitoring and retrieval process, Chimera makes it simpler for users to focus on their science projects rather than data manipulation. Chimera uses Opal, a toolkit for wrapping scientific applications as web services, to provide scalable and transparent access to several popular software packages. We illustrate Chimera's use of web services with an example workflow that interleaves use of these services with interactive manipulation of molecular sequences and structures, and we provide an example Python program to demonstrate how easily Opal-based web services can be accessed from within an application. Web server availability: http://webservices.rbvi.ucsf.edu/opal2/dashboard?command=serviceList. PMID:24861624
PyPathway: Python Package for Biological Network Analysis and Visualization.
Xu, Yang; Luo, Xiao-Chun
2018-05-01
Life science studies represent one of the biggest generators of large data sets, mainly because of rapid sequencing technological advances. Biological networks including interactive networks and human curated pathways are essential to understand these high-throughput data sets. Biological network analysis offers a method to explore systematically not only the molecular complexity of a particular disease but also the molecular relationships among apparently distinct phenotypes. Currently, several packages for Python community have been developed, such as BioPython and Goatools. However, tools to perform comprehensive network analysis and visualization are still needed. Here, we have developed PyPathway, an extensible free and open source Python package for functional enrichment analysis, network modeling, and network visualization. The network process module supports various interaction network and pathway databases such as Reactome, WikiPathway, STRING, and BioGRID. The network analysis module implements overrepresentation analysis, gene set enrichment analysis, network-based enrichment, and de novo network modeling. Finally, the visualization and data publishing modules enable users to share their analysis by using an easy web application. For package availability, see the first Reference.
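The overrepresentation analysis mentioned in this abstract is commonly computed as a one-sided hypergeometric test. The following is a minimal sketch of that general technique using only the Python standard library; it is not PyPathway's API, and the function name `ora_p_value` and its parameters are hypothetical names chosen for illustration.

```python
from math import comb


def ora_p_value(population: int, pathway: int, selected: int, overlap: int) -> float:
    """Upper-tail hypergeometric p-value for overrepresentation analysis.

    population: number of genes in the background set
    pathway:    number of background genes annotated to the pathway
    selected:   number of genes in the study list (e.g. differentially expressed)
    overlap:    number of study genes that fall in the pathway

    Returns P(X >= overlap) when `selected` genes are drawn without
    replacement from the background.
    """
    upper = min(pathway, selected)
    total = comb(population, selected)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(pathway, k) * comb(population - pathway, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 10 background genes, 5 in the pathway,
    # 5 selected, all 5 of which hit the pathway.
    print(ora_p_value(10, 5, 5, 5))
```

In practice the p-values for many pathways would also be corrected for multiple testing; that step is left out of this sketch.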
Identification of core pathways based on attractor and crosstalk in ischemic stroke.
Diao, Xiufang; Liu, Aijuan
2018-02-01
Ischemic stroke is a leading cause of mortality and disability around the world. Identifying dysregulated pathways is an important way to extract molecular and functional insights from high-throughput experimental data. The gene expression profile E-GEOD-16561 was collected. Pathways were obtained from the Kyoto Encyclopedia of Genes and Genomes database, and Retrieval of Interacting Genes was used to download protein-protein interaction sets. Attractor and crosstalk approaches were applied to screen dysregulated pathways. A total of 20 differentially expressed genes were identified in ischemic stroke. Thirty-nine significant differential pathways were identified according to P<0.01, 28 pathways were identified with RP<0.01, and 17 pathways were identified with impact factor >250. On the basis of the three criteria, 11 significant dysfunctional pathways were identified. Among them, Epstein-Barr virus infection was the most significant differential pathway. In conclusion, significantly dysfunctional pathways were identified with the method based on attractor and crosstalk. These pathways are expected to shed light on the molecular mechanism of ischemic stroke and to represent novel potential therapeutic targets for ischemic stroke treatment.
The structure of Ca2+-loaded S100A2 at 1.3-Å resolution.
Koch, Michael; Fritz, Günter
2012-05-01
S100A2 is an EF-hand calcium ion (Ca(2+))-binding protein that activates the tumour suppressor p53. In order to understand the molecular mechanisms underlying the Ca(2+)-induced activation of S100A2, the structure of Ca(2+)-bound S100A2 was determined at 1.3 Å resolution by X-ray crystallography. The structure was compared with Ca(2+)-free S100A2 and with other S100 proteins. Binding of Ca(2+) to S100A2 induces small structural changes in the N-terminal EF-hand, but a large conformational change in the C-terminal EF-hand, reorienting helix III by approximately 90°. This movement is accompanied by the exposure of a hydrophobic cavity between helix III and helix IV that represents the target protein interaction site. This molecular reorganization is associated with the breaking and new formation of intramolecular hydrophobic contacts. The target binding site exhibits unique features; in particular, the hydrophobic cavity is larger than in other Ca(2+)-loaded S100 proteins. The structural data underline that the shape and size of the hydrophobic cavity are major determinants for target specificity of S100 proteins and suggest that the binding mode for S100A2 is different from that of other p53-interacting S100 proteins. Database: Structural data are available in the Protein Data Bank database under the accession number 4DUQ. © 2012 The Authors Journal compilation © 2012 FEBS.
The ADAMS interactive interpreter
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rietscha, E.R.
1990-12-17
The ADAMS (Advanced DAta Management System) project is exploring next generation database technology. Database management does not follow the usual programming paradigm. Instead, the database dictionary provides an additional name space environment that should be interactively created and tested before writing application code. This document describes the implementation and operation of the ADAMS Interpreter, an interactive interface to the ADAMS data dictionary and runtime system. The Interpreter executes individual statements of the ADAMS Interface Language, providing a fast, interactive mechanism to define and access persistent databases. 5 refs.
NASA Astrophysics Data System (ADS)
Kamstra, Rhiannon L.; Dadgar, Saedeh; Wigg, John; Chowdhury, Morshed A.; Phenix, Christopher P.; Floriano, Wely B.
2014-11-01
Our group has recently demonstrated that virtual screening is a useful technique for the identification of target-specific molecular probes. In this paper, we discuss some of our proof-of-concept results involving two biologically relevant target proteins, and report the development of a computational script to generate large databases of fluorescence-labelled compounds for computer-assisted molecular design. The virtual screening of a small library of 1,153 fluorescently-labelled compounds against two targets, and the experimental testing of selected hits reveal that this approach is efficient at identifying molecular probes, and that the screening of a labelled library is preferred over the screening of base compounds followed by conjugation of confirmed hits. The automated script for library generation explores the known reactivity of commercially available dyes, such as NHS-esters, to create large virtual databases of fluorescence-tagged small molecules that can be easily synthesized in a laboratory. A database of 14,862 compounds, each tagged with the ATTO680 fluorophore was generated with the automated script reported here. This library is available for downloading and it is suitable for virtual ligand screening aiming at the identification of target-specific fluorescent molecular probes.
Davis, Allan Peter; Wiegers, Thomas C.; King, Benjamin L.; Wiegers, Jolene; Grondin, Cynthia J.; Sciaky, Daniela; Johnson, Robin J.; Mattingly, Carolyn J.
2016-01-01
Strategies for discovering common molecular events among disparate diseases hold promise for improving understanding of disease etiology and expanding treatment options. One technique is to leverage curated datasets found in the public domain. The Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) manually curates chemical-gene, chemical-disease, and gene-disease interactions from the scientific literature. The use of official gene symbols in CTD interactions enables this information to be combined with the Gene Ontology (GO) file from NCBI Gene. By integrating these GO-gene annotations with CTD’s gene-disease dataset, we produce 753,000 inferences between 15,700 GO terms and 4,200 diseases, providing opportunities to explore presumptive molecular underpinnings of diseases and identify biological similarities. Through a variety of applications, we demonstrate the utility of this novel resource. As a proof-of-concept, we first analyze known repositioned drugs (e.g., raloxifene and sildenafil) and see that their target diseases have a greater degree of similarity when comparing GO terms vs. genes. Next, a computational analysis predicts seemingly non-intuitive diseases (e.g., stomach ulcers and atherosclerosis) as being similar to bipolar disorder, and these are validated in the literature as reported co-diseases. Additionally, we leverage other CTD content to develop testable hypotheses about thalidomide-gene networks to treat seemingly disparate diseases. Finally, we illustrate how CTD tools can rank a series of drugs as potential candidates for repositioning against B-cell chronic lymphocytic leukemia and predict cisplatin and the small molecule inhibitor JQ1 as lead compounds. The CTD dataset is freely available for users to navigate pathologies within the context of extensive biological processes, molecular functions, and cellular components conferred by GO. 
This inference set should aid researchers, bioinformaticists, and pharmaceutical drug makers in finding commonalities in disease mechanisms, which in turn could help identify new therapeutics, new indications for existing pharmaceuticals, potential disease comorbidities, and alerts for side effects. PMID:27171405
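The integration step described in this abstract, linking GO terms to diseases through the genes they share, can be sketched as a simple join over two annotation maps. This is an illustrative sketch only, not CTD's actual pipeline; the function name `infer_go_disease` and the toy identifiers are hypothetical.

```python
from collections import defaultdict


def infer_go_disease(go_genes, gene_diseases):
    """Infer GO-disease links through shared genes.

    go_genes:      mapping of GO term -> set of annotated gene symbols
    gene_diseases: mapping of gene symbol -> set of associated diseases

    Returns a dict keyed by (go_term, disease) whose value is the set of
    genes supporting that inference.
    """
    inferences = defaultdict(set)
    for go_term, genes in go_genes.items():
        for gene in genes:
            # A gene without curated disease links contributes nothing.
            for disease in gene_diseases.get(gene, ()):
                inferences[(go_term, disease)].add(gene)
    return dict(inferences)


if __name__ == "__main__":
    # Hypothetical toy annotations, not real CTD or GO content.
    go_genes = {"GO:term_a": {"gene1", "gene2"}, "GO:term_b": {"gene3"}}
    gene_diseases = {"gene1": {"disease_x"}, "gene2": {"disease_x", "disease_y"}}
    for key, genes in sorted(infer_go_disease(go_genes, gene_diseases).items()):
        print(key, sorted(genes))
```

Keeping the supporting genes for each pair, rather than only a count, makes it possible to rank inferences by the amount of evidence behind them.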
KEGG Bioinformatics Resource for Plant Genomics and Metabolomics.
Kanehisa, Minoru
2016-01-01
In the era of high-throughput biology it is necessary to develop not only elaborate computational methods but also well-curated databases that can be used as reference for data interpretation. KEGG ( http://www.kegg.jp/ ) is such a reference knowledge base with two specific aims. One is to compile knowledge on high-level functions of the cell and the organism in terms of the molecular interaction and reaction networks, which is implemented in KEGG pathway maps, BRITE functional hierarchies, and KEGG modules. The other is to expand knowledge on genes and proteins involved in the molecular networks from experimentally observed organisms to other organisms using the concept of orthologs, which is implemented in the KEGG Orthology (KO) system. Thus, KEGG is a generic resource applicable to all organisms and enables interpretation of high-level functions from genomic and molecular data. Here we first present a brief overview of the entire KEGG resource, and then give an introduction of how to use KEGG in plant genomics and metabolomics research.
Using computer-aided drug design and medicinal chemistry strategies in the fight against diabetes.
Semighini, Evandro P; Resende, Jonathan A; de Andrade, Peterson; Morais, Pedro A B; Carvalho, Ivone; Taft, Carlton A; Silva, Carlos H T P
2011-04-01
The aim of this work is to present a simple, practical and efficient protocol for drug design, illustrated here for diabetes. The protocol includes selection of the illness, choice of a suitable target and of a bioactive ligand, and then the use of various computer-aided drug design and medicinal chemistry tools to design novel potential drug candidates, and it can be applied to different diseases. We have selected the validated target dipeptidyl peptidase IV (DPP-IV), whose inhibition contributes to reducing glucose levels in type 2 diabetes patients. The most active inhibitor with a reported X-ray structure of the complex was initially extracted from the BindingDB database. By using molecular modification strategies widely used in medicinal chemistry, together with current state-of-the-art tools in drug design (including flexible docking, virtual screening, molecular interaction fields, molecular dynamics, ADME and toxicity predictions), we have proposed 4 novel potential DPP-IV inhibitors with drug properties for diabetes control, which have been supported and validated by all the computational tools used herewith.
Mortazavi, Majid; Brandenburg, Jan Gerit; Maurer, Reinhard J; Tkatchenko, Alexandre
2018-01-18
Accurate prediction of structure and stability of molecular crystals is crucial in materials science and requires reliable modeling of long-range dispersion interactions. Semiempirical electronic structure methods are computationally more efficient than their ab initio counterparts, allowing structure sampling with significant speedups. We combine the Tkatchenko-Scheffler van der Waals method (TS) and the many-body dispersion method (MBD) with third-order density functional tight-binding (DFTB3) via a charge population-based method. We find an overall good performance for the X23 benchmark database of molecular crystals, despite an underestimation of crystal volume that can be traced to the DFTB parametrization. We achieve accurate lattice energy predictions with DFT+MBD energetics on top of vdW-inclusive DFTB3 structures, resulting in a speedup of up to 3000 times compared with a full DFT treatment. This suggests that vdW-inclusive DFTB3 can serve as a viable structural prescreening tool in crystal structure prediction.
Virtual Interactomics of Proteins from Biochemical Standpoint
Kubrycht, Jaroslav; Sigler, Karel; Souček, Pavel
2012-01-01
Virtual interactomics represents a rapidly developing scientific area on the boundary line of bioinformatics and interactomics. Protein-related virtual interactomics then comprises instrumental tools for prediction, simulation, and networking of the majority of interactions important for structural and individual reproduction, differentiation, recognition, signaling, regulation, and metabolic pathways of cells and organisms. Here, we describe the main areas of virtual protein interactomics, that is, structurally based comparative analysis and prediction of functionally important interacting sites, mimotope-assisted and combined epitope prediction, molecular (protein) docking studies, and investigation of protein interaction networks. Detailed information about some interesting methodological approaches and online accessible programs or databases is displayed in our tables. Considerable part of the text deals with the searches for common conserved or functionally convergent protein regions and subgraphs of conserved interaction networks, new outstanding trends and clinically interesting results. In agreement with the presented data and relationships, virtual interactomic tools improve our scientific knowledge, help us to formulate working hypotheses, and they frequently also mediate variously important in silico simulations. PMID:22928109
Comprehensive, comprehensible, distributed and intelligent databases: current status.
Frishman, D; Heumann, K; Lesk, A; Mewes, H W
1998-01-01
It is only a matter of time until a user will see not many but one integrated database of information for molecular biology. Is this true? Is it a good thing? Why will it happen? Where are we now? What developments are fostering and what developments are impeding progress towards this end? A list of WWW resources devoted to database issues in molecular biology is available at http://www.mips.biochem.mpg.de frishman@mips.biochem.mpg.de
Alcohol-Induced Molecular Dysregulation in Human Embryonic Stem Cell-Derived Neural Precursor Cells
Kim, Yi Young; Roubal, Ivan; Lee, Youn Soo; Kim, Jin Seok; Hoang, Michael; Mathiyakom, Nathan; Kim, Yong
2016-01-01
The adverse effect of alcohol on neural function has been well documented. In particular, the teratogenic effect of alcohol on neurodevelopment during embryogenesis has been demonstrated in various models, which could be a pathologic basis for fetal alcohol spectrum disorders (FASDs). While the developmental defects from alcohol abuse during gestation have been described, the specific mechanisms by which alcohol mediates these injuries have yet to be determined. Recent studies have shown that alcohol has a significant effect on molecular and cellular regulatory mechanisms in embryonic stem cell (ESC) differentiation, including genes involved in neural development. To test our hypothesis that alcohol induces molecular alterations during neural differentiation, we have derived neural precursor cells from pluripotent human ESCs in the presence or absence of ethanol treatment. Genome-wide transcriptomic profiling identified molecular alterations induced by ethanol exposure during neural differentiation of hESCs into neural rosettes and neural precursor cell populations. The Database for Annotation, Visualization and Integrated Discovery (DAVID) functional analysis on significantly altered genes showed potential effects of ethanol on the JAK-STAT signaling pathway, neuroactive ligand-receptor interaction, the Toll-like receptor (TLR) signaling pathway, cytokine-cytokine receptor interaction and regulation of autophagy. We have further quantitatively verified ethanol-induced alterations of selected candidate genes. Among verified genes we further examined the expression of P2RX3, which is associated with nociception, a peripheral pain response. We found ethanol significantly reduced the level of P2RX3 in undifferentiated hESCs, but induced the level of P2RX3 mRNA and protein in hESC-derived NPCs. 
Our result suggests ethanol-induced dysregulation of P2RX3 along with alterations in molecules involved in neural activity such as neuroactive ligand-receptor interaction may be a molecular event associated with alcohol-related peripheral neuropathy of an enhanced nociceptive response. PMID:27682028
2011-01-01
Background To understand biological processes and diseases, it is crucial to unravel the concerted interplay of transcription factors (TFs), microRNAs (miRNAs) and their targets within regulatory networks and fundamental sub-networks. An integrative computational resource generating a comprehensive view of these regulatory molecular interactions at a genome-wide scale would be of great interest to biologists, but is not available to date. Results To identify and analyze molecular interaction networks, we developed MIR@NT@N, an integrative approach based on a meta-regulation network model and a large-scale database. MIR@NT@N uses a graph-based approach to predict novel molecular actors across multiple regulatory processes (i.e. TFs acting on protein-coding or miRNA genes, or miRNAs acting on messenger RNAs). Exploiting these predictions, the user can generate networks and further analyze them to identify sub-networks, including motifs such as feedback and feedforward loops (FBL and FFL). In addition, networks can be built from lists of molecular actors with an a priori role in a given biological process to predict novel and unanticipated interactions. Analyses can be contextualized and filtered by integrating additional information such as microarray expression data. All results, including generated graphs, can be visualized, saved and exported into various formats. MIR@NT@N performances have been evaluated using published data and then applied to the regulatory program underlying epithelium to mesenchyme transition (EMT), an evolutionary-conserved process which is implicated in embryonic development and disease. Conclusions MIR@NT@N is an effective computational approach to identify novel molecular regulations and to predict gene regulatory networks and sub-networks including conserved motifs within a given biological context. 
Taking advantage of the M@IA environment, MIR@NT@N is a user-friendly web resource freely available at http://mironton.uni.lu which will be updated on a regular basis. PMID:21375730
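The motif search mentioned in this abstract can be illustrated with a small feedforward loop (FFL) finder over a directed regulatory network, where an FFL is a triple A, B, C with edges A to B, B to C and A to C. This is a generic sketch, not MIR@NT@N's implementation; the function name `find_feedforward_loops` and the node labels are hypothetical.

```python
def find_feedforward_loops(edges):
    """Return all feedforward loops (a, b, c) in a directed graph.

    edges: iterable of (regulator, target) pairs with string node names.
    A triple (a, b, c) is reported when a -> b, b -> c and a -> c all
    exist and the three nodes are distinct.
    """
    successors = {}
    for source, target in edges:
        successors.setdefault(source, set()).add(target)

    loops = []
    for a in sorted(successors):
        for b in sorted(successors[a]):
            if b == a:
                continue  # ignore self-regulation
            for c in sorted(successors.get(b, ())):
                # c must be a third node that a also regulates directly.
                if c != a and c != b and c in successors[a]:
                    loops.append((a, b, c))
    return loops


if __name__ == "__main__":
    # Hypothetical network: a TF regulates a miRNA and a gene,
    # and the miRNA also targets that gene.
    network = [("TF", "miR"), ("miR", "gene"), ("TF", "gene")]
    print(find_feedforward_loops(network))
```

The triple-nested loop is adequate for small networks; a genome-scale resource would more likely rely on indexed database queries or a graph library.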
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yusim, Karina; Korber, Bette Tina; Brander, Christian
The scope and purpose of the HIV molecular immunology database: HIV Molecular Immunology is a companion volume to HIV Sequence Compendium. This publication, the 2015 edition, is the PDF version of the web-based HIV Immunology Database (http://www.hiv.lanl.gov/content/immunology/). The web interface for this relational database has many search options, as well as interactive tools to help immunologists design reagents and interpret their results. In the HIV Immunology Database, HIV-specific B-cell and T-cell responses are summarized and annotated. Immunological responses are divided into three parts, CTL, T helper, and antibody. Within these parts, defined epitopes are organized by protein and binding sites within each protein, moving from left to right through the coding regions spanning the HIV genome. We include human responses to natural HIV infections, as well as vaccine studies in a range of animal models and human trials. Responses that are not specifically defined, such as responses to whole proteins or monoclonal antibody responses to discontinuous epitopes, are summarized at the end of each protein section. Studies describing general HIV responses to the virus, but not to any specific protein, are included at the end of each part. The annotation includes information such as cross-reactivity, escape mutations, antibody sequence, TCR usage, functional domains that overlap with an epitope, immune response associations with rates of progression and therapy, and how specific epitopes were experimentally defined. Basic information such as HLA specificities for T-cell epitopes, isotypes of monoclonal antibodies, and epitope sequences are included whenever possible. All studies that we can find that incorporate the use of a specific monoclonal antibody are included in the entry for that antibody. A single T-cell epitope can have multiple entries, generally one entry per study. 
Finally, maps of all defined linear epitopes relative to the HXB2 reference proteins are provided. Alignments of CTL, helper T-cell, and antibody epitopes are available through the search interface on our web site at http://www.hiv.lanl.gov/content/immunology.
NASA Astrophysics Data System (ADS)
Liu, J.; Lu, W. Q.
2010-03-01
This paper presents detailed MD simulations of the properties, including the thermal conductivities and viscosities, of the quantum fluid helium at different state points. The molecular interactions are represented by Lennard-Jones (LJ) pair potentials supplemented by quantum corrections following the Feynman-Hibbs approach (QFH), and the properties are calculated using the Green-Kubo equations. A comparison among the numerical results obtained with the LJ and QFH potentials and the existing database shows that the LJ model is not quantitatively correct for supercritical liquid helium; the quantum effect must therefore be taken into account when the quantum fluid helium is studied. The thermal conductivity is also compared as a function of temperature and pressure, and the results show that the quantum-effect correction is an efficient tool for obtaining the thermal conductivities.
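A quantum-corrected Lennard-Jones potential of the kind described here is often written in the quadratic Feynman-Hibbs form, U_FH(r) = U_LJ(r) + (ħ²/(24 μ k_B T)) (U''_LJ(r) + 2 U'_LJ(r)/r), with μ = m/2 the reduced mass of a pair of identical atoms. The sketch below assumes this quadratic form, which may differ from the exact variant used in the paper, and works in arbitrary consistent units; the function names and default constants are hypothetical.

```python
def lennard_jones(r, eps, sigma):
    """Classical 12-6 Lennard-Jones pair potential."""
    s6 = (sigma / r) ** 6
    return 4.0 * eps * (s6 * s6 - s6)


def feynman_hibbs_lj(r, eps, sigma, temperature, mass, hbar=1.0, k_b=1.0):
    """Quadratic Feynman-Hibbs effective potential built on Lennard-Jones.

    The correction term is (hbar^2 / (24 * mu * k_B * T)) * (U'' + 2 U'/r),
    where mu = mass / 2 is the reduced mass of two identical atoms.
    For the 12-6 potential, U'' + 2 U'/r simplifies to
    4 * eps * (132 * (sigma/r)^12 - 30 * (sigma/r)^6) / r^2.
    """
    mu = mass / 2.0
    prefactor = hbar ** 2 / (24.0 * mu * k_b * temperature)
    s6 = (sigma / r) ** 6
    radial_laplacian = 4.0 * eps * (132.0 * s6 * s6 - 30.0 * s6) / r ** 2
    return lennard_jones(r, eps, sigma) + prefactor * radial_laplacian


if __name__ == "__main__":
    # Reduced units (eps = sigma = hbar = k_B = 1), hypothetical state point.
    for r in (0.95, 1.0, 1.12, 1.5):
        print(r, lennard_jones(r, 1.0, 1.0), feynman_hibbs_lj(r, 1.0, 1.0, 1.0, 2.0))
```

Because the prefactor scales as 1/T, the correction vanishes at high temperature and the classical LJ potential is recovered, which is consistent with quantum effects mattering most for light atoms such as helium at low temperature.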
NO3- anions can act as Lewis acid in the solid state
NASA Astrophysics Data System (ADS)
Bauzá, Antonio; Frontera, Antonio; Mooibroek, Tiddo J.
2017-02-01
Identifying electron donating and accepting moieties is crucial to understanding molecular aggregation, which is of pivotal significance to biology. Anions such as NO3- are typical electron donors. However, computations predict that the charge distribution of NO3- is anisotropic and minimal on nitrogen. Here we show that when the nitrate's charge is sufficiently dampened by resonating over a larger area, a Lewis acidic site emerges on nitrogen that can interact favourably with electron rich partners. Surveys of the Cambridge Structural Database and Protein Data Bank reveal geometric preferences of some oxygen and sulfur containing entities around a nitrate anion that are consistent with this `π-hole bonding' geometry. Computations reveal donor-acceptor orbital interactions that confirm the counterintuitive Lewis π-acidity of nitrate.
Detection of functionally important regions in "hypothetical proteins" of known structure.
Nimrod, Guy; Schushan, Maya; Steinberg, David M; Ben-Tal, Nir
2008-12-10
Structural genomics initiatives provide ample structures of "hypothetical proteins" (i.e., proteins of unknown function) at an ever increasing rate. However, without function annotation, this structural goldmine is of little use to biologists who are interested in particular molecular systems. To this end, we used (an improved version of) the PatchFinder algorithm for the detection of functional regions on the protein surface, which could mediate its interactions with, e.g., substrates, ligands, and other proteins. Examination, using a data set of annotated proteins, showed that PatchFinder outperforms similar methods. We collected 757 structures of hypothetical proteins and their predicted functional regions in the N-Func database. Inspection of several of these regions demonstrated that they are useful for function prediction. For example, we suggested an interprotein interface and a putative nucleotide-binding site. A web-server implementation of PatchFinder and the N-Func database are available at http://patchfinder.tau.ac.il/.
A public HTLV-1 molecular epidemiology database for sequence management and data mining.
Araujo, Thessika Hialla Almeida; Souza-Brito, Leandro Inacio; Libin, Pieter; Deforche, Koen; Edwards, Dustin; de Albuquerque-Junior, Antonio Eduardo; Vandamme, Anne-Mieke; Galvao-Castro, Bernardo; Alcantara, Luiz Carlos Junior
2012-01-01
It is estimated that 15 to 20 million people are infected with the human T-cell lymphotropic virus type 1 (HTLV-1). At present, there are more than 2,000 unique HTLV-1 isolate sequences published. A central database to aggregate sequence information from a range of epidemiological aspects including HTLV-1 infections, pathogenesis, origins, and evolutionary dynamics would be useful to scientists and physicians worldwide. Here we describe a database that collects and annotates sequence data and can be accessed through a user-friendly search interface. The HTLV-1 Molecular Epidemiology Database website is available at http://htlv1db.bahia.fiocruz.br/. All data was obtained from publications available at GenBank or through contact with the authors. The database was developed using Apache Webserver 2.1.6 and the MySQL database management system (DBMS). The web page interfaces were developed in HTML, with server-side scripting written in PHP. The HTLV-1 Molecular Epidemiology Database is hosted on the Gonçalo Moniz/FIOCRUZ Research Center server. There are currently 2,457 registered sequences with 2,024 (82.37%) of those sequences representing unique isolates. Of these sequences, 803 (39.67%) contain information about clinical status (TSP/HAM, 17.19%; ATL, 7.41%; asymptomatic, 12.89%; other diseases, 2.17%; and no information, 60.32%). Further, 7.26% of sequences contain information on patient gender while 5.23% of sequences provide the age of the patient. The HTLV-1 Molecular Epidemiology Database retrieves and stores annotated HTLV-1 proviral sequences from clinical, epidemiological, and geographical studies. The collected sequences and related information are now accessible on a publicly available and user-friendly website. This open-access database will support clinical research and vaccine development related to viral genotype.
Jefferson, Emily R.; Walsh, Thomas P.; Roberts, Timothy J.; Barton, Geoffrey J.
2007-01-01
SNAPPI-DB, a high performance database of Structures, iNterfaces and Alignments of Protein–Protein Interactions, and its associated Java Application Programming Interface (API) is described. SNAPPI-DB contains structural data, down to the level of atom co-ordinates, for each structure in the Protein Data Bank (PDB) together with associated data including SCOP, CATH, Pfam, SWISSPROT, InterPro, GO terms, Protein Quaternary Structures (PQS) and secondary structure information. Domain–domain interactions are stored for multiple domain definitions and are classified by their Superfamily/Family pair and interaction interface. Each set of classified domain–domain interactions has an associated multiple structure alignment for each partner. The API facilitates data access via PDB entries, domains and domain–domain interactions. Rapid development, fast database access and the ability to perform advanced queries without the requirement for complex SQL statements are provided via an object oriented database and the Java Data Objects (JDO) API. SNAPPI-DB contains many features which are not available in other databases of structural protein–protein interactions. It has been applied in three studies on the properties of protein–protein interactions and is currently being employed to train a protein–protein interaction predictor and a functional residue predictor. The database, API and manual are available for download at: . PMID:17202171
MeDReaders: a database for transcription factors that bind to methylated DNA.
Wang, Guohua; Luo, Ximei; Wang, Jianan; Wan, Jun; Xia, Shuli; Zhu, Heng; Qian, Jiang; Wang, Yadong
2018-01-04
Understanding the molecular principles governing interactions between transcription factors (TFs) and DNA targets is one of the main subjects in the study of transcriptional regulation. Recently, emerging evidence demonstrated that some TFs could bind to DNA motifs containing highly methylated CpGs both in vitro and in vivo. Identification of such TFs and elucidation of their physiological roles has now become an important stepping-stone toward understanding the mechanisms underlying the methylation-mediated biological processes, which have crucial implications for human disease and disease development. Hence, we constructed a database, named MeDReaders, to collect information about methylated DNA binding activities. A total of 731 TFs, which could bind to methylated DNA sequences, were manually curated in human and mouse studies reported in the literature. In silico approaches were applied to predict methylated and unmethylated motifs of 292 TFs by integrating whole genome bisulfite sequencing (WGBS) and ChIP-Seq datasets in six human cell lines and one mouse cell line extracted from the ENCODE and GEO databases. The MeDReaders database will provide a comprehensive resource for further studies and aid related experiment designs. The database provides unified access to most TFs involved in such methylation-associated binding activities. The website is available at http://medreader.org/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
IsoPlot: a database for comparison of mRNA isoforms in fruit fly and mosquitoes
Ng, I-Man; Tsai, Shang-Chi
2017-01-01
Alternative splicing (AS), a mechanism by which different forms of mature messenger RNAs (mRNAs) are generated from the same gene, widely occurs in the metazoan genomes. Knowledge about isoform variants and abundance is crucial for understanding the functional context in the molecular diversity of the species. With increasing transcriptome data of model and non-model species, a database for visualization and comparison of AS events with up-to-date information is needed for further research. IsoPlot is a publicly available database with visualization tools for exploration of AS events, including three major species of mosquitoes, Aedes aegypti, Anopheles gambiae, and Culex quinquefasciatus, and fruit fly Drosophila melanogaster, the model insect species. IsoPlot includes not only 88,663 annotated transcripts but also 17,037 newly predicted transcripts from massive transcriptome data at different developmental stages of mosquitoes. The web interface enables users to explore the patterns and abundance of isoforms in different experimental conditions as well as cross-species sequence comparison of orthologous transcripts. IsoPlot provides a platform for researchers to access comprehensive information about AS events in mosquitoes and fruit fly. Our database is available on the web via an interactive user interface with an intuitive graphical design, which is applicable for the comparison of complex isoforms within or between species. Database URL: http://isoplot.iis.sinica.edu.tw/ PMID:29220459
WWW Entrez: A Hypertext Retrieval Tool for Molecular Biology.
ERIC Educational Resources Information Center
Epstein, Jonathan A.; Kans, Jonathan A.; Schuler, Gregory D.
This article describes the World Wide Web (WWW) Entrez server which is based upon the National Center for Biotechnology Information's (NCBI) Entrez retrieval database and software. Entrez is a molecular sequence retrieval system that contains an integrated view of portions of Medline and all publicly available nucleotide and protein databases,…
USDA-ARS's Scientific Manuscript database
A database of Louisiana sugarcane molecular identity has been constructed and is being updated annually using FAM or HEX or NED fluorescence- and capillary electrophoresis (CE)-based microsatellite (SSR) fingerprinting information. The fingerprints are PCR-amplified from leaf DNA samples of current ...
Data warehousing in molecular biology.
Schönbach, C; Kowalski-Saunders, P; Brusic, V
2000-05-01
In the business and healthcare sectors data warehousing has provided effective solutions for information usage and knowledge discovery from databases. However, data warehousing applications in the biological research and development (R&D) sector are lagging far behind. The fuzziness and complexity of biological data represent a major challenge in data warehousing for molecular biology. By combining experiences in other domains with our findings from building a model database, we have defined the requirements for data warehousing in molecular biology.
Introducing meta-services for biomedical information extraction
Leitner, Florian; Krallinger, Martin; Rodriguez-Penagos, Carlos; Hakenberg, Jörg; Plake, Conrad; Kuo, Cheng-Ju; Hsu, Chun-Nan; Tsai, Richard Tzong-Han; Hung, Hsi-Chuan; Lau, William W; Johnson, Calvin A; Sætre, Rune; Yoshida, Kazuhiro; Chen, Yan Hua; Kim, Sun; Shin, Soo-Yong; Zhang, Byoung-Tak; Baumgartner, William A; Hunter, Lawrence; Haddow, Barry; Matthews, Michael; Wang, Xinglong; Ruch, Patrick; Ehrler, Frédéric; Özgür, Arzucan; Erkan, Güneş; Radev, Dragomir R; Krauthammer, Michael; Luong, ThaiBinh; Hoffmann, Robert; Sander, Chris; Valencia, Alfonso
2008-01-01
We introduce the first meta-service for information extraction in molecular biology, the BioCreative MetaServer (BCMS; ). This prototype platform is a joint effort of 13 research groups and provides automatically generated annotations for PubMed/Medline abstracts. Annotation types cover gene names, gene IDs, species, and protein-protein interactions. The annotations are distributed by the meta-server in both human and machine readable formats (HTML/XML). This service is intended to be used by biomedical researchers and database annotators, and in biomedical language processing. The platform allows direct comparison, unified access, and result aggregation of the annotations. PMID:18834497
An NMR database for simulations of membrane dynamics.
Leftin, Avigdor; Brown, Michael F
2011-03-01
Computational methods are powerful in capturing the results of experimental studies in terms of force fields that both explain and predict biological structures. Validation of molecular simulations requires comparison with experimental data to test and confirm computational predictions. Here we report a comprehensive database of NMR results for membrane phospholipids with interpretations intended to be accessible by non-NMR specialists. Experimental ¹³C-¹H and ²H NMR segmental order parameters (S(CH) or S(CD)) and spin-lattice (Zeeman) relaxation times (T(1Z)) are summarized in convenient tabular form for various saturated, unsaturated, and biological membrane phospholipids. Segmental order parameters give direct information about bilayer structural properties, including the area per lipid and volumetric hydrocarbon thickness. In addition, relaxation rates provide complementary information about molecular dynamics. Particular attention is paid to the magnetic field dependence (frequency dispersion) of the NMR relaxation rates in terms of various simplified power laws. Model-free reduction of the T(1Z) studies in terms of a power-law formalism shows that the relaxation rates for saturated phosphatidylcholines follow a single frequency-dispersive trend within the MHz regime. We show how analytical models can guide the continued development of atomistic and coarse-grained force fields. Our interpretation suggests that lipid diffusion and collective order fluctuations are implicitly governed by the viscoelastic nature of the liquid-crystalline ensemble. Collective bilayer excitations are emergent over mesoscopic length scales that fall between the molecular and bilayer dimensions, and are important for lipid organization and lipid-protein interactions. Future conceptual advances and theoretical reductions will foster understanding of biomembrane structural dynamics through a synergy of NMR measurements and molecular simulations. Copyright © 2010 Elsevier B.V. 
All rights reserved.
Bai, Qifeng; Shao, Yonghua; Pan, Dabo; Zhang, Yang; Liu, Huanxiang; Yao, Xiaojun
2014-01-01
We designed a program called MolGridCal that can be used to screen small-molecule databases by grid computing, based on the JPPF grid environment. Based on the MolGridCal program, we propose an integrated strategy for virtual screening and binding mode investigation by combining molecular docking, molecular dynamics (MD) simulations and free energy calculations. To test the effectiveness of MolGridCal, we screened potential ligands for β2 adrenergic receptor (β2AR) from a database containing 50,000 small molecules. MolGridCal can not only send tasks to the grid server automatically but can also distribute tasks using the screensaver function. As for the results of virtual screening, the known agonist BI-167107 of β2AR is ranked among the top 2% of the screened candidates, indicating that the MolGridCal program can give reasonable results. To further study the binding mode and refine the results of MolGridCal, more accurate docking and scoring methods are used to estimate the binding affinity for the top three molecules (agonist BI-167107, neutral antagonist alprenolol and inverse agonist ICI 118,551). The results indicate that agonist BI-167107 has the best binding affinity. MD simulation and free energy calculation are employed to investigate the dynamic interaction mechanism between the ligands and β2AR. The results show that the agonist BI-167107 also has the lowest binding free energy. This study can provide a new way to perform virtual screening effectively through integrating molecular docking based on grid computing, MD simulations and free energy calculations. The source codes of MolGridCal are freely available at http://molgridcal.codeplex.com. PMID:25229694
Xie, Zhihui; Li, Jing; Baker, Jonathan; Eagleson, Kathie L; Coba, Marcelo P; Levitt, Pat
2016-12-15
Atypical synapse development and plasticity are implicated in many neurodevelopmental disorders (NDDs). NDD-associated, high-confidence risk genes have been identified, yet little is known about functional relationships at the level of protein-protein interactions, which are the dominant molecular bases responsible for mediating circuit development. Proteomics in three independent developing neocortical synaptosomal preparations identified putative interacting proteins of the ligand-activated MET receptor tyrosine kinase, an autism risk gene that mediates synapse development. The candidates were translated into interactome networks and analyzed bioinformatically. Additionally, three independent quantitative proximity ligation assays in cultured neurons and four independent immunoprecipitation analyses of synaptosomes validated protein interactions. Approximately 11% (8/72) of MET-interacting proteins, including SHANK3, SYNGAP1, and GRIN2B, are associated with NDDs. Proteins in the MET interactome were translated into a novel MET interactome network based on human protein-protein interaction databases. High-confidence genes from different NDD datasets that encode synaptosomal proteins were analyzed for being enriched in MET interactome proteins. This was found for autism but not schizophrenia, bipolar disorder, major depressive disorder, or attention-deficit/hyperactivity disorder. There is correlated gene expression between MET and its interactive partners in developing human temporal and visual neocortices but not with highly expressed genes that are not in the interactome. Proximity ligation assays and biochemical analyses demonstrate that MET-protein partner interactions are dynamically regulated by receptor activation. The results provide a novel molecular framework for deciphering the functional relations of key regulators of synaptogenesis that contribute to both typical cortical development and to NDDs. Copyright © 2016 Society of Biological Psychiatry. 
Published by Elsevier Inc. All rights reserved.
Toofanny, Rudesh D; Simms, Andrew M; Beck, David A C; Daggett, Valerie
2011-08-10
Molecular dynamics (MD) simulations offer the ability to observe the dynamics and interactions of both whole macromolecules and individual atoms as a function of time. Taken in context with experimental data, atomic interactions from simulation provide insight into the mechanics of protein folding, dynamics, and function. The calculation of atomic interactions or contacts from an MD trajectory is computationally demanding and the work required grows exponentially with the size of the simulation system. We describe the implementation of a spatial indexing algorithm in our multi-terabyte MD simulation database that significantly reduces the run-time required for discovery of contacts. The approach is applied to the Dynameomics project data. Spatial indexing, also known as spatial hashing, is a method that divides the simulation space into regular-sized bins and attributes an index to each bin. Since the calculation of contacts is widely employed in the simulation field, we also use this as the basis for testing compression of data tables. We investigate the effects of compression of the trajectory coordinate tables with different options of data and index compression within MS SQL SERVER 2008. Our implementation of spatial indexing speeds up the calculation of contacts over a 1 nanosecond (ns) simulation window by between 14% and 90% (i.e., 1.2 and 10.3 times faster). For a 'full' simulation trajectory (51 ns) spatial indexing reduces the calculation run-time between 31 and 81% (between 1.4 and 5.3 times faster). Compression resulted in reduced table sizes but resulted in no significant difference in the total execution time for neighbour discovery. The greatest compression (~36%) was achieved using page level compression on both the data and indexes. The spatial indexing scheme significantly decreases the time taken to calculate atomic contacts and could be applied to other multidimensional neighbor discovery problems. 
The speed up enables on-the-fly calculation and visualization of contacts and rapid cross simulation analysis for knowledge discovery. Using page compression for the atomic coordinate tables and indexes saves ~36% of disk space without any significant decrease in calculation time and should be considered for other non-transactional databases in MS SQL SERVER 2008.
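The binning idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, which runs inside MS SQL SERVER 2008; the function name `find_contacts`, the choice of cubic bins with an edge equal to the contact cutoff, and the sample coordinates are all assumptions made for illustration.

```python
import math
from collections import defaultdict
from itertools import product


def find_contacts(coords, cutoff):
    """Return sorted index pairs (i, j), i < j, with distance <= cutoff.

    Assumption: cubic bins whose edge equals the cutoff, so every contact
    partner of an atom lies in its own bin or one of the 26 adjacent bins.
    """
    # Assign every atom to a bin keyed by integer cell coordinates.
    cells = defaultdict(list)
    for idx, (x, y, z) in enumerate(coords):
        key = (math.floor(x / cutoff),
               math.floor(y / cutoff),
               math.floor(z / cutoff))
        cells[key].append(idx)

    offsets = list(product((-1, 0, 1), repeat=3))
    contacts = set()
    for (cx, cy, cz), members in cells.items():
        for dx, dy, dz in offsets:
            # .get() does not create missing bins in the defaultdict.
            neighbours = cells.get((cx + dx, cy + dy, cz + dz))
            if not neighbours:
                continue
            for i in members:
                for j in neighbours:
                    if i < j and math.dist(coords[i], coords[j]) <= cutoff:
                        contacts.add((i, j))
    return sorted(contacts)


# Hypothetical coordinates (arbitrary units) with a 0.5 cutoff.
atoms = [(0.0, 0.0, 0.0), (0.3, 0.0, 0.0), (5.0, 5.0, 5.0), (5.2, 5.1, 5.0)]
print(find_contacts(atoms, 0.5))  # [(0, 1), (2, 3)]
```

Only atoms in the same or adjacent bins are compared, which is what removes most of the pairwise distance checks that a brute-force scan would perform.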
2011-01-01
Background Molecular dynamics (MD) simulations offer the ability to observe the dynamics and interactions of both whole macromolecules and individual atoms as a function of time. Taken in context with experimental data, atomic interactions from simulation provide insight into the mechanics of protein folding, dynamics, and function. The calculation of atomic interactions or contacts from an MD trajectory is computationally demanding and the work required grows exponentially with the size of the simulation system. We describe the implementation of a spatial indexing algorithm in our multi-terabyte MD simulation database that significantly reduces the run-time required for discovery of contacts. The approach is applied to the Dynameomics project data. Spatial indexing, also known as spatial hashing, is a method that divides the simulation space into regular-sized bins and attributes an index to each bin. Since the calculation of contacts is widely employed in the simulation field, we also use this as the basis for testing compression of data tables. We investigate the effects of compression of the trajectory coordinate tables with different options of data and index compression within MS SQL SERVER 2008. Results Our implementation of spatial indexing speeds up the calculation of contacts over a 1 nanosecond (ns) simulation window by between 14% and 90% (i.e., 1.2 and 10.3 times faster). For a 'full' simulation trajectory (51 ns) spatial indexing reduces the calculation run-time between 31 and 81% (between 1.4 and 5.3 times faster). Compression resulted in reduced table sizes but resulted in no significant difference in the total execution time for neighbour discovery. The greatest compression (~36%) was achieved using page level compression on both the data and indexes. Conclusions The spatial indexing scheme significantly decreases the time taken to calculate atomic contacts and could be applied to other multidimensional neighbor discovery problems. 
The speed up enables on-the-fly calculation and visualization of contacts and rapid cross simulation analysis for knowledge discovery. Using page compression for the atomic coordinate tables and indexes saves ~36% of disk space without any significant decrease in calculation time and should be considered for other non-transactional databases in MS SQL SERVER 2008. PMID:21831299
Abuqarn, Mehtap; Allmeling, Christina; Amshoff, Inga; Menger, Bjoern; Nasser, Inas; Vogt, Peter M; Reimers, Kerstin
2011-07-01
Urodele amphibians are exceptional in their ability to regenerate complex body structures such as limbs. Limb regeneration depends on a process called dedifferentiation. Under an inductive wound epidermis, terminally differentiated cells transform to pluripotent progenitor cells that coordinately proliferate and eventually redifferentiate to form the new appendage. Recent studies have developed molecular models integrating a set of genes that might have important functions in the control of regenerative cellular plasticity. Among them is Msx1, which induced dedifferentiation in mammalian myotubes in vitro. Herein, we screened for interaction partners of axolotl Msx1 using a yeast two-hybrid system. A two-hybrid cDNA library of 5-day-old wound epidermis and underlying tissue containing more than 2×10⁶ cDNAs was constructed and used in the screen. 34 resulting cDNA clones were isolated and sequenced. We then compared sequences of the isolated clones to annotated EST contigs of the Salamander EST database (BLASTn) to identify presumptive orthologs. We subsequently searched all no-hit clone sequences against non-redundant NCBI sequence databases using BLASTx. This is the first time that the yeast two-hybrid system has been adapted to the axolotl animal model and successfully used in a screen for proteins interacting with Msx1 in the context of amphibian limb regeneration. 2011 Elsevier B.V. All rights reserved.
Chen, Chao-Jin; Liu, De-Zhao; Yao, Wei-Feng; Gu, Yu; Huang, Fei; Hei, Zi-Qing; Li, Xiang
2017-01-01
Neuropathic pain is a complex chronic condition occurring post-nervous system damage. The transcriptional reprogramming of injured dorsal root ganglia (DRGs) drives neuropathic pain. However, few comparative analyses using high-throughput platforms have investigated uninjured DRG in neuropathic pain, and potential interactions among differentially expressed genes (DEGs) and pathways were not taken into consideration. The aim of this study was to identify changes in genes and pathways associated with neuropathic pain in uninjured L4 DRG after L5 spinal nerve ligation (SNL) by using bioinformatic analysis. The microarray profile GSE24982 was downloaded from the Gene Expression Omnibus database to identify DEGs between DRGs in SNL and sham rats. The prioritization for these DEGs was performed using the Toppgene database followed by gene ontology and pathway enrichment analyses. The relationships among DEGs from the protein interactive perspective were analyzed using protein-protein interaction (PPI) network and module analysis. Real-time polymerase chain reaction (PCR) and Western blotting were used to confirm the expression of DEGs in the rodent neuropathic pain model. A total of 206 DEGs that might play a role in neuropathic pain were identified in L4 DRG, of which 75 were upregulated and 131 were downregulated. The upregulated DEGs were enriched in biological processes related to transcription regulation and molecular functions such as DNA binding, cell cycle, and the FoxO signaling pathway. Ctnnb1 protein had the highest connectivity degrees in the PPI network. The in vivo studies also validated that mRNA and protein levels of Ctnnb1 were upregulated in both L4 and L5 DRGs. This study provides insight into the functional gene sets and pathways associated with neuropathic pain in L4 uninjured DRG after L5 SNL, which might promote our understanding of the molecular mechanisms underlying the development of neuropathic pain.
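The "connectivity degree" used in this abstract to single out a hub protein is simply the number of distinct interaction partners a node has in the PPI network. A minimal sketch follows, assuming an undirected edge list; the node labels P1 to P4 and the helper names `node_degrees` and `top_hubs` are hypothetical and do not come from the study's data or tooling.

```python
from collections import defaultdict


def node_degrees(edges):
    """Count distinct interaction partners per node in an undirected network."""
    partners = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        partners[a].add(b)
        partners[b].add(a)
    return {node: len(p) for node, p in partners.items()}


def top_hubs(edges, n=1):
    """Return the n highest-degree nodes as (node, degree), ties broken by name."""
    degrees = node_degrees(edges)
    return sorted(degrees.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Hypothetical interaction list; ("P3", "P1") repeats ("P1", "P3") and is counted once.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P3", "P1")]
print(top_hubs(edges, n=2))  # [('P1', 3), ('P2', 2)]
```

Storing partners in sets keeps duplicated or reversed edges from inflating the degree, which matters when interactions are merged from several databases.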
PodNet, a protein-protein interaction network of the podocyte.
Warsow, Gregor; Endlich, Nicole; Schordan, Eric; Schordan, Sandra; Chilukoti, Ravi K; Homuth, Georg; Moeller, Marcus J; Fuellen, Georg; Endlich, Karlhans
2013-07-01
Interactions between proteins crucially determine cellular structure and function. Differential analysis of the interactome may help elucidate molecular mechanisms during disease development; however, this analysis necessitates mapping of expression data on protein-protein interaction networks. These networks do not exist for the podocyte; therefore, we built PodNet, a literature-based mouse podocyte network in Cytoscape format. Using database protein-protein interactions, we expanded PodNet to XPodNet with enhanced connectivity. In order to test the performance of XPodNet in differential interactome analysis, we examined podocyte developmental differentiation and the effect of cell culture. Transcriptomes of podocytes in 10 different states were mapped on XPodNet and analyzed with the Cytoscape plugin ExprEssence, based on the law of mass action. Interactions between slit diaphragm proteins are most significantly upregulated during podocyte development and most significantly downregulated in culture. On the other hand, our analysis revealed that interactions lost during podocyte differentiation are not regained in culture, suggesting a loss rather than a reversal of differentiation for podocytes in culture. Thus, we have developed PodNet as a valuable tool for differential interactome analysis in podocytes, and we have identified established and unexplored regulated interactions in developing and cultured podocytes.
Kazusa Marker DataBase: a database for genomics, genetics, and molecular breeding in plants.
Shirasawa, Kenta; Isobe, Sachiko; Tabata, Satoshi; Hirakawa, Hideki
2014-09-01
In order to provide useful genomic information for agronomical plants, we have established a database, the Kazusa Marker DataBase (http://marker.kazusa.or.jp). This database includes information on DNA markers, e.g., SSR and SNP markers, genetic linkage maps, and physical maps, that were developed at the Kazusa DNA Research Institute. Keyword searches for the markers, sequence data used for marker development, and experimental conditions are also available through this database. Currently, 10 plant species have been targeted: tomato (Solanum lycopersicum), pepper (Capsicum annuum), strawberry (Fragaria × ananassa), radish (Raphanus sativus), Lotus japonicus, soybean (Glycine max), peanut (Arachis hypogaea), red clover (Trifolium pratense), white clover (Trifolium repens), and eucalyptus (Eucalyptus camaldulensis). In addition, the number of plant species registered in this database will be increased as our research progresses. The Kazusa Marker DataBase will be a useful tool for both basic and applied sciences, such as genomics, genetics, and molecular breeding in crops.
Investigation of candidate genes for osteoarthritis based on gene expression profiles.
Dong, Shuanghai; Xia, Tian; Wang, Lei; Zhao, Qinghua; Tian, Jiwei
2016-12-01
To explore the mechanism of osteoarthritis (OA) and provide valid biological information for further investigation. Gene expression profile of GSE46750 was downloaded from Gene Expression Omnibus database. The Linear Models for Microarray Data (limma) package (Bioconductor project, http://www.bioconductor.org/packages/release/bioc/html/limma.html) was used to identify differentially expressed genes (DEGs) in inflamed OA samples. Gene Ontology function enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways enrichment analysis of DEGs were performed based on Database for Annotation, Visualization and Integrated Discovery data, and protein-protein interaction (PPI) network was constructed based on the Search Tool for the Retrieval of Interacting Genes/Proteins database. Regulatory network was screened based on Encyclopedia of DNA Elements. Molecular Complex Detection was used for sub-network screening. Two sub-networks with highest node degree were integrated with transcriptional regulatory network and KEGG functional enrichment analysis was processed for 2 modules. In total, 401 up- and 196 down-regulated DEGs were obtained. Up-regulated DEGs were involved in inflammatory response, while down-regulated DEGs were involved in cell cycle. PPI network with 2392 protein interactions was constructed. Moreover, 10 genes including Interleukin 6 (IL6) and Aurora B kinase (AURKB) were found to be outstanding in PPI network. There are 214 up- and 8 down-regulated transcription factor (TF)-target pairs in the TF regulatory network. Module 1 had TFs including SPI1, PRDM1, and FOS, while module 2 contained FOSL1. The nodes in module 1 were enriched in chemokine signaling pathway, while the nodes in module 2 were mainly enriched in cell cycle. 
The screened DEGs including IL6, AGT, and AURKB might be potential biomarkers for gene therapy for OA by being regulated by TFs such as FOS and SPI1, and participating in the cell cycle and cytokine-cytokine receptor interaction pathway. Copyright © 2016 Turkish Association of Orthopaedics and Traumatology. Production and hosting by Elsevier B.V. All rights reserved.
Kim, Sun; Chatr-aryamontri, Andrew; Chang, Christie S.; Oughtred, Rose; Rust, Jennifer; Wilbur, W. John; Comeau, Donald C.; Dolinski, Kara; Tyers, Mike
2017-01-01
A great deal of information on the molecular genetics and biochemistry of model organisms has been reported in the scientific literature. However, this data is typically described in free text form and is not readily amenable to computational analyses. To this end, the BioGRID database systematically curates the biomedical literature for genetic and protein interaction data. This data is provided in a standardized computationally tractable format and includes structured annotation of experimental evidence. BioGRID curation necessarily involves substantial human effort by expert curators who must read each publication to extract the relevant information. Computational text-mining methods offer the potential to augment and accelerate manual curation. To facilitate the development of practical text-mining strategies, a new challenge was organized in BioCreative V for the BioC task, the collaborative Biocurator Assistant Task. This was a non-competitive, cooperative task in which the participants worked together to build BioC-compatible modules into an integrated pipeline to assist BioGRID curators. As an integral part of this task, a test collection of full text articles was developed that contained both biological entity annotations (gene/protein and organism/species) and molecular interaction annotations (protein–protein and genetic interactions (PPIs and GIs)). This collection, which we call the BioC-BioGRID corpus, was annotated by four BioGRID curators over three rounds of annotation and contains 120 full text articles curated in a dataset representing two major model organisms, namely budding yeast and human. The BioC-BioGRID corpus contains annotations for 6409 mentions of genes and their Entrez Gene IDs, 186 mentions of organism names and their NCBI Taxonomy IDs, 1867 mentions of PPIs and 701 annotations of PPI experimental evidence statements, 856 mentions of GIs and 399 annotations of GI evidence statements. 
The purpose, characteristics and possible future uses of the BioC-BioGRID corpus are detailed in this report. Database URL: http://bioc.sourceforge.net/BioC-BioGRID.html PMID:28077563
Wang, Yin-Yin; Li, Jie; Wu, Zeng-Rui; Zhang, Bo; Yang, Hong-Bin; Wang, Qin; Cai, Ying-Chun; Liu, Gui-Xia; Li, Wei-Hua; Tang, Yun
2017-05-01
An increasing number of cases of herb-induced liver injury (HILI) have been reported, presenting new clinical challenges. In this study, taking Polygonum multiflorum Thunb (PmT) as an example, we proposed a computational systems toxicology approach to explore the molecular mechanisms of HILI. First, the chemical components of PmT were extracted from 3 main TCM databases as well as the literature related to natural products. Then, the known targets were collected through data integration, and the potential compound-target interactions (CTIs) were predicted using our substructure-drug-target network-based inference (SDTNBI) method. After screening for hepatotoxicity-related genes by assessing the symptoms of HILI, a compound-target interaction network was constructed. A scoring function, namely, Ascore, was developed to estimate the toxicity of chemicals in the liver. We conducted network analysis to determine the possible mechanisms of the biphasic effects using the analysis tools, including BiNGO, pathway enrichment, organ distribution analysis and predictions of interactions with CYP450 enzymes. Among the chemical components of PmT, 54 components with good intestinal absorption were used for analysis, and 2939 CTIs were obtained. After analyzing the mRNA expression data in the BioGPS database, 1599 CTIs and 125 targets related to liver diseases were identified. In the top 15 compounds, seven with Ascore values >3000 (emodin, quercetin, apigenin, resveratrol, gallic acid, kaempferol and luteolin) were obviously associated with hepatotoxicity. The results from the pathway enrichment analysis suggest that multiple interactions between apoptosis and metabolism may underlie PmT-induced liver injury. Many of the pathways have been verified in specific compounds, such as glutathione metabolism, cytochrome P450 metabolism, and the p53 pathway, among others. 
Hepatitis symptoms, the perturbation of nine bile acids and yellow or tawny urine also had corresponding pathways, justifying our method. In conclusion, this computational systems toxicology method reveals possible toxic components and could be very helpful for understanding the mechanisms of HILI. In this way, the method might also facilitate the identification of novel hepatotoxic herbs.
NASA Astrophysics Data System (ADS)
Kubas, Adam; Hoffmann, Felix; Heck, Alexander; Oberhofer, Harald; Elstner, Marcus; Blumberger, Jochen
2014-03-01
We introduce a database (HAB11) of electronic coupling matrix elements (Hab) for electron transfer in 11 π-conjugated organic homo-dimer cations. High-level ab initio calculations at the multireference configuration interaction MRCI+Q level of theory, n-electron valence state perturbation theory NEVPT2, and (spin-component scaled) approximate coupled cluster model (SCS)-CC2 are reported for this database to assess the performance of three DFT methods of decreasing computational cost, including constrained density functional theory (CDFT), fragment-orbital DFT (FODFT), and self-consistent charge density functional tight-binding (FODFTB). We find that the CDFT approach in combination with a modified PBE functional containing 50% Hartree-Fock exchange gives best results for absolute Hab values (mean relative unsigned error = 5.3%) and exponential distance decay constants β (4.3%). CDFT in combination with pure PBE overestimates couplings by 38.7% due to a too diffuse excess charge distribution, whereas the economic FODFT and highly cost-effective FODFTB methods underestimate couplings by 37.6% and 42.4%, respectively, due to neglect of interaction between donor and acceptor. The errors are systematic, however, and can be significantly reduced by applying a uniform scaling factor for each method. Applications to dimers outside the database, specifically rotated thiophene dimers and larger acenes up to pentacene, suggest that the same scaling procedure significantly improves the FODFT and FODFTB results for larger π-conjugated systems relevant to organic semiconductors and DNA.
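The error measure and the uniform scaling mentioned in this abstract can be illustrated with a minimal sketch. The coupling values below are hypothetical (not taken from HAB11), and fitting the scaling factor by least squares is an assumption; the abstract does not say how the factors were obtained.

```python
def mean_relative_unsigned_error(approx, reference):
    """Mean of |approx - ref| / |ref| over all pairs, as a percentage."""
    total = sum(abs(a - r) / abs(r) for a, r in zip(approx, reference))
    return 100.0 * total / len(reference)


def uniform_scaling_factor(approx, reference):
    """Single factor s minimising sum((s * a - r) ** 2) (assumed fitting rule)."""
    numerator = sum(a * r for a, r in zip(approx, reference))
    denominator = sum(a * a for a in approx)
    return numerator / denominator


# Hypothetical reference couplings in meV, for illustration only.
reference = [400.0, 200.0, 100.0, 50.0]
# A hypothetical cheap method that systematically underestimates by 40%.
approx = [0.6 * r for r in reference]

error_before = mean_relative_unsigned_error(approx, reference)
s = uniform_scaling_factor(approx, reference)
scaled = [s * a for a in approx]
error_after = mean_relative_unsigned_error(scaled, reference)
```

Because the error in this toy data is purely systematic, a single factor removes it entirely; real data would leave a residual scatter.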
A Toolkit for ARB to Integrate Custom Databases and Externally Built Phylogenies
Essinger, Steven D.; Reichenberger, Erin; Morrison, Calvin; ...
2015-01-21
Researchers are perpetually amassing biological sequence data. The computational approaches employed by ecologists for organizing this data (e.g. alignment, phylogeny, etc.) typically scale nonlinearly in execution time with the size of the dataset. This often serves as a bottleneck for processing experimental data since many molecular studies are characterized by massive datasets. To keep up with experimental data demands, ecologists are forced to choose between continually upgrading expensive in-house computer hardware or outsourcing the most demanding computations to the cloud. Outsourcing is attractive since it is the least expensive option, but does not necessarily allow direct user interaction with the data for exploratory analysis. Desktop analytical tools such as ARB are indispensable for this purpose, but they do not necessarily offer a convenient solution for the coordination and integration of datasets between local and outsourced destinations. Therefore, researchers are currently left with an undesirable tradeoff between computational throughput and analytical capability. To mitigate this tradeoff we introduce a software package to leverage the utility of the interactive exploratory tools offered by ARB with the computational throughput of cloud-based resources. Our pipeline serves as middleware between the desktop and the cloud allowing researchers to form local custom databases containing sequences and metadata from multiple resources and a method for linking data outsourced for computation back to the local database. Furthermore, a tutorial implementation of the toolkit is provided in the supporting information, S1 Tutorial.
A Toolkit for ARB to Integrate Custom Databases and Externally Built Phylogenies
Essinger, Steven D.; Reichenberger, Erin; Morrison, Calvin; Blackwood, Christopher B.; Rosen, Gail L.
2015-01-01
Researchers are perpetually amassing biological sequence data. The computational approaches employed by ecologists for organizing this data (e.g. alignment, phylogeny, etc.) typically scale nonlinearly in execution time with the size of the dataset. This often serves as a bottleneck for processing experimental data since many molecular studies are characterized by massive datasets. To keep up with experimental data demands, ecologists are forced to choose between continually upgrading expensive in-house computer hardware or outsourcing the most demanding computations to the cloud. Outsourcing is attractive since it is the least expensive option, but does not necessarily allow direct user interaction with the data for exploratory analysis. Desktop analytical tools such as ARB are indispensable for this purpose, but they do not necessarily offer a convenient solution for the coordination and integration of datasets between local and outsourced destinations. Therefore, researchers are currently left with an undesirable tradeoff between computational throughput and analytical capability. To mitigate this tradeoff we introduce a software package to leverage the utility of the interactive exploratory tools offered by ARB with the computational throughput of cloud-based resources. Our pipeline serves as middleware between the desktop and the cloud allowing researchers to form local custom databases containing sequences and metadata from multiple resources and a method for linking data outsourced for computation back to the local database. A tutorial implementation of the toolkit is provided in the supporting information, S1 Tutorial. Availability: http://www.ece.drexel.edu/gailr/EESI/tutorial.php. PMID:25607539
Forging the Basis for Developing Protein-Ligand Interaction Scoring Functions.
Liu, Zhihai; Su, Minyi; Han, Li; Liu, Jie; Yang, Qifan; Li, Yan; Wang, Renxiao
2017-02-21
In structure-based drug design, scoring functions are widely used for fast evaluation of protein-ligand interactions. They are often applied in combination with molecular docking and de novo design methods. Since the early 1990s, a whole spectrum of protein-ligand interaction scoring functions have been developed. Regardless of their technical difference, scoring functions all need data sets combining protein-ligand complex structures and binding affinity data for parametrization and validation. However, data sets of this kind used to be rather limited in terms of size and quality. On the other hand, standard metrics for evaluating scoring function used to be ambiguous. Scoring functions are often tested in molecular docking or even virtual screening trials, which do not directly reflect the genuine quality of scoring functions. Collectively, these underlying obstacles have impeded the invention of more advanced scoring functions. In this Account, we describe our long-lasting efforts to overcome these obstacles, which involve two related projects. On the first project, we have created the PDBbind database. It is the first database that systematically annotates the protein-ligand complexes in the Protein Data Bank (PDB) with experimental binding data. This database has been updated annually since its first public release in 2004. The latest release (version 2016) provides binding data for 16 179 biomolecular complexes in PDB. Data sets provided by PDBbind have been applied to many computational and statistical studies on protein-ligand interaction and various subjects. In particular, it has become a major data resource for scoring function development. On the second project, we have established the Comparative Assessment of Scoring Functions (CASF) benchmark for scoring function evaluation. Our key idea is to decouple the "scoring" process from the "sampling" process, so scoring functions can be tested in a relatively pure context to reflect their quality. 
In our latest work on this track, i.e. CASF-2013, the performance of a scoring function was quantified in four aspects, including "scoring power", "ranking power", "docking power", and "screening power". All four performance tests were conducted on a test set containing 195 high-quality protein-ligand complexes selected from PDBbind. A panel of 20 standard scoring functions were tested as demonstration. Importantly, CASF is designed to be an open-access benchmark, with which scoring functions developed by different researchers can be compared on the same grounds. Indeed, it has become a popular choice for scoring function validation in recent years. Despite the considerable progress that has been made so far, the performance of today's scoring functions still does not meet people's expectations in many aspects. There is a constant demand for more advanced scoring functions. Our efforts have helped to overcome some obstacles underlying scoring function development so that the researchers in this field can move forward faster. We will continue to improve the PDBbind database and the CASF benchmark in the future to keep them as useful community resources.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yu, Haoyu S.; Zhang, Wenjing; Verma, Pragya
2015-01-01
The goal of this work is to develop a gradient approximation to the exchange–correlation functional of Kohn–Sham density functional theory for treating molecular problems with a special emphasis on the prediction of quantities important for homogeneous catalysis and other molecular energetics. Our training and validation of exchange–correlation functionals is organized in terms of databases and subdatabases. The key properties required for homogeneous catalysis are main group bond energies (database MGBE137), transition metal bond energies (database TMBE32), reaction barrier heights (database BH76), and molecular structures (database MS10). We also consider 26 other databases, most of which are subdatabases of a newly extended broad database called Database 2015, which is presented in the present article and in its ESI. Based on the mathematical form of a nonseparable gradient approximation (NGA), as first employed in the N12 functional, we design a new functional by using Database 2015 and by adding smoothness constraints to the optimization of the functional. The resulting functional is called the gradient approximation for molecules, or GAM. The GAM functional gives better results for MGBE137, TMBE32, and BH76 than any available generalized gradient approximation (GGA) or than N12. The GAM functional also gives reasonable results for MS10 with an MUE of 0.018 Å. The GAM functional provides good results both within the training sets and outside the training sets. The convergence tests and the smooth curves of exchange–correlation enhancement factor as a function of the reduced density gradient show that the GAM functional is a smooth functional that should not lead to extra expense or instability in optimizations. NGAs, like GGAs, have the advantage over meta-GGAs and hybrid GGAs of respectively smaller grid-size requirements for integrations and lower costs for extended systems. 
These computational advantages combined with the relatively high accuracy for all the key properties needed for molecular catalysis make the GAM functional very promising for future applications.
Perspective: Interactive material property databases through aggregation of literature data
NASA Astrophysics Data System (ADS)
Seshadri, Ram; Sparks, Taylor D.
2016-05-01
Searchable, interactive databases of material properties, particularly those relating to functional materials (magnetics, thermoelectrics, photovoltaics, etc.), are curiously missing from discussions of machine learning and other data-driven methods for advancing new materials discovery. Here we discuss the manual aggregation of experimental data from the published literature for the creation of interactive databases that allow the original experimental data as well as additional metadata to be visualized in an interactive manner. The databases described involve materials for thermoelectric energy conversion, and for the electrodes of Li-ion batteries. The data can be subjected to machine learning, accelerating the discovery of new materials.
Huang, Lin; Lv, Qi; Liu, Fenfen; Shi, Tieliu; Wen, Chengping
2015-11-12
Sheng-ma-bie-jia-tang (SMBJT) is a Traditional Chinese Medicine (TCM) formula that is widely used for the treatment of Systemic Lupus Erythematosus (SLE) in China. However, the molecular mechanism behind this formula remains unknown. Here, we systematically analyzed targets of the ingredients in SMBJT to evaluate its potential molecular mechanism. First, we collected 1,267 targets from our previously published database, the Traditional Chinese Medicine Integrated Database (TCMID). Next, we conducted gene ontology and pathway enrichment analyses for these targets and determined that they were enriched in metabolism (amino acids, fatty acids, etc.) and signaling pathways (chemokines, Toll-like receptors, adipocytokines, etc.). Ninety-six targets, which are known SLE disease proteins, were identified as essential targets, and the remaining 1,171 targets were defined as common targets of this formula. The essential targets directly interacted with SLE disease proteins. In addition, some common targets also had essential connections to both key targets and SLE disease proteins in enriched signaling pathways, e.g. the Toll-like receptor signaling pathway. We also found distinct functions of essential and common targets in immune system processes. This multi-level approach to deciphering the underlying mechanism of SMBJT treatment of SLE details a new perspective that will further our understanding of TCM formulas.
Small molecule inhibitors of mesotrypsin from a structure-based docking screen
Kayode, Olumide; Huang, Zunnan; Soares, Alexei S.; ...
2017-05-02
PRSS3/mesotrypsin is an atypical isoform of trypsin, the upregulation of which has been implicated in promoting tumor progression. To date there are no mesotrypsin-selective pharmacological inhibitors which could serve as tools for deciphering the pathological role of this enzyme, and could potentially form the basis for novel therapeutic strategies targeting mesotrypsin. A virtual screen of the Natural Product Database (NPD) and Food and Drug Administration (FDA) approved Drug Database was conducted by high-throughput molecular docking utilizing crystal structures of mesotrypsin. Twelve high-scoring compounds were selected for testing based on lowest free energy docking scores, interaction with key mesotrypsin active site residues, and commercial availability. Diminazene (CID22956468), along with two similar compounds presenting the bis-benzamidine substructure, was validated as a competitive inhibitor of mesotrypsin and other human trypsin isoforms. Diminazene is the most potent small molecule inhibitor of mesotrypsin reported to date with an inhibitory constant (Ki) of 3.6±0.3 μM. Diminazene was subsequently co-crystallized with mesotrypsin and the crystal structure was solved and refined to 1.25 Å resolution. This high resolution crystal structure can now offer a foundation for structure-guided efforts to develop novel and potentially more selective mesotrypsin inhibitors based on similar molecular substructures.
Small molecule inhibitors of mesotrypsin from a structure-based docking screen
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kayode, Olumide; Huang, Zunnan; Soares, Alexei S.
PRSS3/mesotrypsin is an atypical isoform of trypsin, the upregulation of which has been implicated in promoting tumor progression. To date there are no mesotrypsin-selective pharmacological inhibitors which could serve as tools for deciphering the pathological role of this enzyme, and could potentially form the basis for novel therapeutic strategies targeting mesotrypsin. A virtual screen of the Natural Product Database (NPD) and Food and Drug Administration (FDA) approved Drug Database was conducted by high-throughput molecular docking utilizing crystal structures of mesotrypsin. Twelve high-scoring compounds were selected for testing based on lowest free energy docking scores, interaction with key mesotrypsin active site residues, and commercial availability. Diminazene (CID22956468), along with two similar compounds presenting the bis-benzamidine substructure, was validated as a competitive inhibitor of mesotrypsin and other human trypsin isoforms. Diminazene is the most potent small molecule inhibitor of mesotrypsin reported to date with an inhibitory constant (Ki) of 3.6±0.3 μM. Diminazene was subsequently co-crystallized with mesotrypsin and the crystal structure was solved and refined to 1.25 Å resolution. This high resolution crystal structure can now offer a foundation for structure-guided efforts to develop novel and potentially more selective mesotrypsin inhibitors based on similar molecular substructures.
Using the Cambridge Structural Database to Teach Molecular Geometry Concepts in Organic Chemistry
ERIC Educational Resources Information Center
Wackerly, Jay Wm.; Janowicz, Philip A.; Ritchey, Joshua A.; Caruso, Mary M.; Elliott, Erin L.; Moore, Jeffrey S.
2009-01-01
This article reports a set of two homework assignments that can be used in a second-year undergraduate organic chemistry class. These assignments were designed to help reinforce concepts of molecular geometry and to give students the opportunity to use a technological database and data mining to analyze experimentally determined chemical…
CH/π interactions in metal-porphyrin complexes with pyrrole and chelate rings as hydrogen acceptors.
Medaković, Vesna B; Bogdanović, Goran A; Milčić, Miloš K; Janjić, Goran V; Zarić, Snežana D
2012-12-01
CH/π interactions in metal porphyrinato complexes were studied by analyzing data in crystal structures from the Cambridge Structural Database (CSD) and by quantum chemical calculations. The analysis of the data in the CSD shows that both five-membered pyrrole and six-membered chelate rings form CH/π interactions. The interactions occur more frequently with five-membered rings. The analysis of distances in crystal structures and calculated energies show stronger interactions with six-membered chelate rings, indicating that a larger number of interactions with five-membered rings are not the consequence of stronger interactions, but better accessibility of five-membered pyrrole rings. The calculated energies of the interactions with positions in six-membered rings are -2.09 to -2.83 kcal/mol, while the energies with five-membered rings are -2.05 to -2.26 kcal/mol. The results reveal that stronger interactions of six-membered rings are the consequence of stronger electrostatic interactions. Substituents on the porphyrin ring significantly strengthen the interactions. Substituents on the six-membered ring strengthen the interaction energy by about 20%. The results show that CH/π interactions play an important role in molecular recognition of metalloporphyrins. The significant influence of the substituents on interaction energies can be very important for the design of model systems in bioinorganic chemistry. Copyright © 2012 Elsevier Inc. All rights reserved.
Natural products used as a chemical library for protein-protein interaction targeted drug discovery.
Jin, Xuemei; Lee, Kyungro; Kim, Nam Hee; Kim, Hyun Sil; Yook, Jong In; Choi, Jiwon; No, Kyoung Tai
2018-01-01
Protein-protein interactions (PPIs), which are essential for cellular processes, have been recognized as attractive therapeutic targets. Therefore, the construction of a PPI-focused chemical library is an inevitable necessity for future drug discovery. Natural products have been used as traditional medicines to treat human diseases for millennia; in addition, their molecular scaffolds have been used in diverse approved drugs and drug candidates. The recent discovery of the ability of natural products to inhibit PPIs led us to use natural products as a chemical library for PPI-targeted drug discovery. In this study, we collected natural products (NPDB) from non-commercial and in-house databases to analyze their similarities to small-molecule PPI inhibitors (iPPIs) and FDA-approved drugs by using eight molecular descriptors. Then, we evaluated the distribution of NPDB and iPPIs in the chemical space, represented by the molecular fingerprint and molecular scaffolds, to identify the promising scaffolds, which could interfere with PPIs. To investigate the ability of natural products to inhibit PPI targets, molecular docking was used. Then, we predicted a set of high-potency natural products by using the iPPI-likeness score based on a docking score-weighted model. These selected natural products showed high binding affinities to the PPI target, namely XIAP, which were validated in an in vitro experiment. In addition, the natural products with novel scaffolds might provide a promising starting point for further medicinal chemistry developments. Overall, our study shows the potency of natural products in targeting PPIs, which might help in the design of a PPI-focused chemical library for future drug discovery. Copyright © 2017 Elsevier Inc. All rights reserved.
DIMA 3.0: Domain Interaction Map.
Luo, Qibin; Pagel, Philipp; Vilne, Baiba; Frishman, Dmitrij
2011-01-01
Domain Interaction MAp (DIMA, available at http://webclu.bio.wzw.tum.de/dima) is a database of predicted and known interactions between protein domains. It integrates 5807 structurally known interactions imported from the iPfam and 3did databases and 46,900 domain interactions predicted by four computational methods: domain phylogenetic profiling, domain pair exclusion algorithm, correlated mutations and domain interaction prediction in a discriminative way. Additionally, predictions are filtered to exclude those domain pairs that are reported as non-interacting by the Negatome database. The DIMA Web site allows users to calculate domain interaction networks either for a domain of interest or for entire organisms, and to explore them interactively using the Flash-based Cytoscape Web software.
Molecular Imaging and Contrast Agent Database (MICAD): evolution and progress.
Chopra, Arvind; Shan, Liang; Eckelman, W C; Leung, Kam; Latterner, Martin; Bryant, Stephen H; Menkens, Anne
2012-02-01
The purpose of writing this review is to showcase the Molecular Imaging and Contrast Agent Database (MICAD; www.micad.nlm.nih.gov ) to students, researchers, and clinical investigators interested in the different aspects of molecular imaging. This database provides freely accessible, current, online scientific information regarding molecular imaging (MI) probes and contrast agents (CA) used for positron emission tomography, single-photon emission computed tomography, magnetic resonance imaging, X-ray/computed tomography, optical imaging and ultrasound imaging. Detailed information on >1,000 agents in MICAD is provided in a chapter format and can be accessed through PubMed. Lists containing >4,250 unique MI probes and CAs published in peer-reviewed journals and agents approved by the United States Food and Drug Administration as well as a comma separated values file summarizing all chapters in the database can be downloaded from the MICAD homepage. Users can search for agents in MICAD on the basis of imaging modality, source of signal/contrast, agent or target category, pre-clinical or clinical studies, and text words. Chapters in MICAD describe the chemical characteristics (structures linked to PubChem), the in vitro and in vivo activities, and other relevant information regarding an imaging agent. All references in the chapters have links to PubMed. A Supplemental Information Section in each chapter is available to share unpublished information regarding an agent. A Guest Author Program is available to facilitate rapid expansion of the database. Members of the imaging community registered with MICAD periodically receive an e-mail announcement (eAnnouncement) that lists new chapters uploaded to the database. Users of MICAD are encouraged to provide feedback, comments, or suggestions for further improvement of the database by writing to the editors at micad@nlm.nih.gov.
Molecular Imaging and Contrast Agent Database (MICAD): Evolution and Progress
Chopra, Arvind; Shan, Liang; Eckelman, W. C.; Leung, Kam; Latterner, Martin; Bryant, Stephen H.; Menkens, Anne
2011-01-01
The purpose of writing this review is to showcase the Molecular Imaging and Contrast Agent Database (MICAD; www.micad.nlm.nih.gov) to students, researchers and clinical investigators interested in the different aspects of molecular imaging. This database provides freely accessible, current, online scientific information regarding molecular imaging (MI) probes and contrast agents (CA) used for positron emission tomography, single-photon emission computed tomography, magnetic resonance imaging, x-ray/computed tomography, optical imaging and ultrasound imaging. Detailed information on >1000 agents in MICAD is provided in a chapter format and can be accessed through PubMed. Lists containing >4250 unique MI probes and CAs published in peer-reviewed journals and agents approved by the United States Food and Drug Administration (FDA) as well as a CSV file summarizing all chapters in the database can be downloaded from the MICAD homepage. Users can search for agents in MICAD on the basis of imaging modality, source of signal/contrast, agent or target category, preclinical or clinical studies, and text words. Chapters in MICAD describe the chemical characteristics (structures linked to PubChem), the in vitro and in vivo activities and other relevant information regarding an imaging agent. All references in the chapters have links to PubMed. A Supplemental Information Section in each chapter is available to share unpublished information regarding an agent. A Guest Author Program is available to facilitate rapid expansion of the database. Members of the imaging community registered with MICAD periodically receive an e-mail announcement (eAnnouncement) that lists new chapters uploaded to the database. Users of MICAD are encouraged to provide feedback, comments or suggestions for further improvement of the database by writing to the editors at: micad@nlm.nih.gov PMID:21989943
Mitchell, Joshua M.; Fan, Teresa W.-M.; Lane, Andrew N.; Moseley, Hunter N. B.
2014-01-01
Large-scale identification of metabolites is key to elucidating and modeling metabolism at the systems level. Advances in metabolomics technologies, particularly ultra-high resolution mass spectrometry (MS), enable comprehensive and rapid analysis of metabolites. However, a significant barrier to meaningful data interpretation is the identification of a wide range of metabolites including unknowns and the determination of their role(s) in various metabolic networks. Chemoselective (CS) probes to tag metabolite functional groups combined with high mass accuracy provide additional structural constraints for metabolite identification and quantification. We have developed a novel algorithm, Chemically Aware Substructure Search (CASS), that efficiently detects functional groups within existing metabolite databases, allowing for combined molecular formula and functional group (from CS tagging) queries to aid in metabolite identification without a priori knowledge. Analysis of the isomeric compounds in both Human Metabolome Database (HMDB) and KEGG Ligand demonstrated a high percentage of isomeric molecular formulae (43 and 28%, respectively), indicating the necessity for techniques such as CS-tagging. Furthermore, these two databases have only moderate overlap in molecular formulae. Thus, it is prudent to use multiple databases in metabolite assignment, since each major metabolite database represents different portions of metabolism within the biosphere. In silico analysis of various CS-tagging strategies under different conditions for adduct formation demonstrates that combined FT-MS derived molecular formulae and CS-tagging can uniquely identify up to 71% of KEGG and 37% of the combined KEGG/HMDB database vs. 41 and 17%, respectively, without adduct formation. This difference between database isomer disambiguation highlights the strength of CS-tagging for non-lipid metabolite identification. 
However, unique identification of complex lipids still needs additional information. PMID:25120557
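The isomer statistic quoted in this abstract can be sketched as follows, assuming it means the fraction of distinct molecular formulae that map to more than one compound (the abstract does not spell out the definition). The compound list is a small illustrative sample, not data from HMDB or KEGG, and the function name is hypothetical.

```python
from collections import defaultdict


def isomeric_formula_fraction(compounds):
    """Fraction of distinct molecular formulae shared by more than one compound."""
    names_by_formula = defaultdict(set)
    for name, formula in compounds:
        names_by_formula[formula].add(name)
    shared = sum(1 for names in names_by_formula.values() if len(names) > 1)
    return shared / len(names_by_formula)


# Illustrative (name, molecular formula) pairs; not drawn from any database.
compounds = [
    ("glucose", "C6H12O6"),
    ("fructose", "C6H12O6"),
    ("alanine", "C3H7NO2"),
    ("sarcosine", "C3H7NO2"),
    ("glycine", "C2H5NO2"),
    ("citric acid", "C6H8O7"),
]

fraction = isomeric_formula_fraction(compounds)
```

In this sample two of the four formulae are ambiguous, which is the situation in which an extra constraint such as a tagged functional group becomes useful.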
A Community Standard Format for the Representation of Protein Affinity Reagents*
Gloriam, David E.; Orchard, Sandra; Bertinetti, Daniela; Björling, Erik; Bongcam-Rudloff, Erik; Borrebaeck, Carl A. K.; Bourbeillon, Julie; Bradbury, Andrew R. M.; de Daruvar, Antoine; Dübel, Stefan; Frank, Ronald; Gibson, Toby J.; Gold, Larry; Haslam, Niall; Herberg, Friedrich W.; Hiltke, Tara; Hoheisel, Jörg D.; Kerrien, Samuel; Koegl, Manfred; Konthur, Zoltán; Korn, Bernhard; Landegren, Ulf; Montecchi-Palazzi, Luisa; Palcy, Sandrine; Rodriguez, Henry; Schweinsberg, Sonja; Sievert, Volker; Stoevesandt, Oda; Taussig, Michael J.; Ueffing, Marius; Uhlén, Mathias; van der Maarel, Silvère; Wingren, Christer; Woollard, Peter; Sherman, David J.; Hermjakob, Henning
2010-01-01
Protein affinity reagents (PARs), most commonly antibodies, are essential reagents for protein characterization in basic research, biotechnology, and diagnostics as well as the fastest growing class of therapeutics. Large numbers of PARs are available commercially; however, their quality is often uncertain. In addition, currently available PARs cover only a fraction of the human proteome, and their cost is prohibitive for proteome scale applications. This situation has triggered several initiatives involving large scale generation and validation of antibodies, for example the Swedish Human Protein Atlas and the German Antibody Factory. Antibodies targeting specific subproteomes are being pursued by members of Human Proteome Organisation (plasma and liver proteome projects) and the United States National Cancer Institute (cancer-associated antigens). ProteomeBinders, a European consortium, aims to set up a resource of consistently quality-controlled protein-binding reagents for the whole human proteome. An ultimate PAR database resource would allow consumers to visit one on-line warehouse and find all available affinity reagents from different providers together with documentation that facilitates easy comparison of their cost and quality. However, in contrast to, for example, nucleotide databases among which data are synchronized between the major data providers, current PAR producers, quality control centers, and commercial companies all use incompatible formats, hindering data exchange. Here we propose Proteomics Standards Initiative (PSI)-PAR as a global community standard format for the representation and exchange of protein affinity reagent data. The PSI-PAR format is maintained by the Human Proteome Organisation PSI and was developed within the context of ProteomeBinders by building on a mature proteomics standard format, PSI-molecular interaction, which is a widely accepted and established community standard for molecular interaction data. 
Further information and documentation are available on the PSI-PAR web site. PMID:19674966
iRefWeb: interactive analysis of consolidated protein interaction data and their supporting evidence
Turner, Brian; Razick, Sabry; Turinsky, Andrei L.; Vlasblom, James; Crowdy, Edgard K.; Cho, Emerson; Morrison, Kyle; Wodak, Shoshana J.
2010-01-01
We present iRefWeb, a web interface to protein interaction data consolidated from 10 public databases: BIND, BioGRID, CORUM, DIP, IntAct, HPRD, MINT, MPact, MPPI and OPHID. iRefWeb enables users to examine aggregated interactions for a protein of interest, and presents various statistical summaries of the data across databases, such as the number of organism-specific interactions, proteins and cited publications. Through links to source databases and supporting evidence, researchers may gauge the reliability of an interaction using simple criteria, such as the detection methods, the scale of the study (high- or low-throughput) or the number of cited publications. Furthermore, iRefWeb compares the information extracted from the same publication by different databases, and offers means to follow up possible inconsistencies. We provide an overview of the consolidated protein–protein interaction landscape and show how it can be automatically cropped to aid the generation of meaningful organism-specific interactomes. iRefWeb can be accessed at: http://wodaklab.org/iRefWeb. Database URL: http://wodaklab.org/iRefWeb/ PMID:20940177
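The idea of consolidating interaction records and counting their supporting evidence can be sketched in a few lines. This is a toy illustration under stated assumptions, not iRefWeb's actual procedure: the record layout, the protein identifiers and the PubMed IDs are all hypothetical, and a pair is matched simply by treating it as unordered.

```python
from collections import defaultdict


def consolidate(records):
    """Merge (source, protein_a, protein_b, pmid) records by unordered protein pair."""
    merged = defaultdict(lambda: {"sources": set(), "pmids": set()})
    for source, protein_a, protein_b, pmid in records:
        pair = frozenset((protein_a, protein_b))  # A-B and B-A are the same pair
        merged[pair]["sources"].add(source)
        merged[pair]["pmids"].add(pmid)
    return dict(merged)


# Hypothetical records; identifiers and PubMed IDs are placeholders.
records = [
    ("BioGRID", "P1", "P2", 111),
    ("IntAct", "P2", "P1", 111),
    ("MINT", "P1", "P2", 222),
    ("DIP", "P3", "P4", 333),
]

merged = consolidate(records)
evidence = merged[frozenset(("P1", "P2"))]
```

The number of distinct sources and publications attached to each pair then gives the kind of simple reliability criterion the abstract describes.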
Kim, Woo-Yeon; Kang, Sungsoo; Kim, Byoung-Chul; Oh, Jeehyun; Cho, Seongwoong; Bhak, Jong; Choi, Jong-Soon
2008-01-01
Cyanobacteria are model organisms for studying photosynthesis, carbon and nitrogen assimilation, evolution of plant plastids, and adaptability to environmental stresses. Despite many studies on cyanobacteria, there is no web-based database of their regulatory and signaling protein-protein interaction networks to date. We report a database and website SynechoNET that provides predicted protein-protein interactions. SynechoNET shows cyanobacterial domain-domain interactions as well as their protein-level interactions using the model cyanobacterium, Synechocystis sp. PCC 6803. It predicts the protein-protein interactions using public interaction databases that contain mutually complementary and redundant data. Furthermore, SynechoNET provides information on transmembrane topology, signal peptide, and domain structure in order to support the analysis of regulatory membrane proteins. Such biological information can be queried and visualized in user-friendly web interfaces that include the interactive network viewer and search pages by keyword and functional category. SynechoNET is an integrated protein-protein interaction database designed to analyze regulatory membrane proteins in cyanobacteria. It provides a platform for biologists to extend the genomic data of cyanobacteria by predicting interaction partners, membrane association, and membrane topology of Synechocystis proteins. SynechoNET is freely available at http://synechocystis.org/ or directly at http://bioportal.kobic.kr/SynechoNET/.
Blázovics, Anna
2018-05-01
The terminology of traditional Chinese medicine (TCM) is hardly interpretable in the context of the human genome; therefore, the human genome program drew attention in China towards the Western practice of medicine. In the last two decades, several important steps have been taken in China to bring traditional Chinese and Western medicine closer together. The Chinese government supports the creation of information databases for research that aims to clarify, at the molecular biology level, the associations between gene expression, signal transduction pathways and protein-protein interactions, as well as the effects and effectiveness of the bioactive components of Chinese drugs. The values of TCM are becoming more and more important for Western medicine as well, because molecular biological therapies have not lived up to expectations, e.g., in tumor therapy. Orv Hetil. 2018; 159(18): 696-702.
Kihara, Daisuke; Sael, Lee; Chikhi, Rayan; Esquivel-Rodriguez, Juan
2011-09-01
The tertiary structures of proteins have been solved at an increasing pace in recent years. To capitalize on the enormous effort invested in accumulating these structure data, efficient and effective computational methods need to be developed for comparing, searching, and investigating interactions of protein structures. We introduce the 3D Zernike descriptor (3DZD), an emerging technique to describe molecular surfaces. The 3DZD is a series expansion of a mathematical three-dimensional function, and thus a tertiary structure is represented compactly by a vector of coefficients of terms in the series. A strong advantage of the 3DZD is that it is invariant to rotation of the target object to be represented. These two characteristics of the 3DZD, compactness and rotation invariance, allow rapid comparison of surface shapes, which is fast enough for real-time structure database screening. In this article, we review various applications of the 3DZD that have recently been proposed.
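The comparison step this abstract describes can be sketched in a few lines. This is a minimal illustration, assuming each structure has already been reduced to a fixed-length vector of rotation-invariant coefficients. The function names, the toy vectors and the use of Euclidean distance are assumptions made for the example, not details taken from the article.

```python
import math

def descriptor_distance(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_by_similarity(query, database):
    """Return (name, distance) pairs sorted from most to least similar."""
    scored = [(name, descriptor_distance(query, vec))
              for name, vec in database.items()]
    return sorted(scored, key=lambda item: item[1])

# Hypothetical toy descriptors; real descriptor vectors are much longer.
database = {
    "structure_A": [0.9, 0.1, 0.3],
    "structure_B": [0.2, 0.8, 0.5],
    "structure_C": [0.85, 0.15, 0.35],
}
query = [0.9, 0.1, 0.3]
ranking = rank_by_similarity(query, database)
```

Because the descriptor is rotation invariant, no structural alignment is needed before the distance is computed, so screening a database reduces to one vector comparison per entry.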
A theoretical-electron-density databank using a model of real and virtual spherical atoms.
Nassour, Ayoub; Domagala, Slawomir; Guillot, Benoit; Leduc, Theo; Lecomte, Claude; Jelsch, Christian
2017-08-01
A database describing the electron density of common chemical groups using combinations of real and virtual spherical atoms is proposed, as an alternative to the multipolar atom modelling of the molecular charge density. Theoretical structure factors were computed from periodic density functional theory calculations on 38 crystal structures of small molecules and the charge density was subsequently refined using a density model based on real spherical atoms and additional dummy charges on the covalent bonds and on electron lone-pair sites. The electron-density parameters of real and dummy atoms present in a similar chemical environment were averaged on all the molecules studied to build a database of transferable spherical atoms. Compared with the now-popular databases of transferable multipolar parameters, the spherical charge modelling needs fewer parameters to describe the molecular electron density and can be more easily incorporated in molecular modelling software for the computation of electrostatic properties. The construction method of the database is described. In order to analyse to what extent this modelling method can be used to derive meaningful molecular properties, it has been applied to the urea molecule and to biotin/streptavidin, a protein/ligand complex.
NALDB: nucleic acid ligand database for small molecules targeting nucleic acid
Kumar Mishra, Subodh; Kumar, Amit
2016-01-01
Nucleic acid ligand database (NALDB) is a unique database that provides detailed information about the experimental data of small molecules that were reported to target several types of nucleic acid structures. NALDB is the first ligand database that contains ligand information for all types of nucleic acids. NALDB contains more than 3500 ligand entries with detailed pharmacokinetic and pharmacodynamic information such as target name, target sequence, ligand 2D/3D structure, SMILES, molecular formula, molecular weight, net formal charge, AlogP, number of rings, number of hydrogen bond donors and acceptors, and potential energy, along with their Ki, Kd and IC50 values. Having all these details on a single platform would be helpful for the development and improvement of novel ligands targeting nucleic acids that could serve as potential targets in different diseases, including cancers and neurological disorders. With a maximum of 255 conformers for each ligand entry, our database is a multi-conformer database and can facilitate the virtual screening process. NALDB provides powerful web-based search tools that make database searching efficient and simple, with options for text as well as structure queries. NALDB also provides a multi-dimensional advanced search tool that can screen the database molecules on the basis of ligand molecular properties provided by database users. A 3D structure visualization tool has also been included for 3D structure representation of ligands. NALDB offers inclusive pharmacological information and a structurally flexible set of small molecules with their three-dimensional conformers that can accelerate virtual screening and other modeling processes and eventually complement nucleic acid-based drug discovery research. NALDB can be routinely updated and is freely available at bsbe.iiti.ac.in/bsbe/naldb/HOME.php. Database URL: http://bsbe.iiti.ac.in/bsbe/naldb/HOME.php PMID:26896846
The BioGRID interaction database: 2017 update
Chatr-aryamontri, Andrew; Oughtred, Rose; Boucher, Lorrie; Rust, Jennifer; Chang, Christie; Kolas, Nadine K.; O'Donnell, Lara; Oster, Sara; Theesfeld, Chandra; Sellam, Adnane; Stark, Chris; Breitkreutz, Bobby-Joe; Dolinski, Kara; Tyers, Mike
2017-01-01
The Biological General Repository for Interaction Datasets (BioGRID: https://thebiogrid.org) is an open access database dedicated to the annotation and archival of protein, genetic and chemical interactions for all major model organism species and humans. As of September 2016 (build 3.4.140), the BioGRID contains 1 072 173 genetic and protein interactions, and 38 559 post-translational modifications, as manually annotated from 48 114 publications. This dataset represents interaction records for 66 model organisms and represents a 30% increase compared to the previous 2015 BioGRID update. BioGRID curates the biomedical literature for major model organism species, including humans, with a recent emphasis on central biological processes and specific human diseases. To facilitate network-based approaches to drug discovery, BioGRID now incorporates 27 501 chemical–protein interactions for human drug targets, as drawn from the DrugBank database. A new dynamic interaction network viewer allows the easy navigation and filtering of all genetic and protein interaction data, as well as for bioactive compounds and their established targets. BioGRID data are directly downloadable without restriction in a variety of standardized formats and are freely distributed through partner model organism databases and meta-databases. PMID:27980099
Enhancing UCSF Chimera through web services.
Huang, Conrad C; Meng, Elaine C; Morris, John H; Pettersen, Eric F; Ferrin, Thomas E
2014-07-01
Integrating access to web services with desktop applications allows for an expanded set of application features, including performing computationally intensive tasks and convenient searches of databases. We describe how we have enhanced UCSF Chimera (http://www.rbvi.ucsf.edu/chimera/), a program for the interactive visualization and analysis of molecular structures and related data, through the addition of several web services (http://www.rbvi.ucsf.edu/chimera/docs/webservices.html). By streamlining access to web services, including the entire job submission, monitoring and retrieval process, Chimera makes it simpler for users to focus on their science projects rather than data manipulation. Chimera uses Opal, a toolkit for wrapping scientific applications as web services, to provide scalable and transparent access to several popular software packages. We illustrate Chimera's use of web services with an example workflow that interleaves use of these services with interactive manipulation of molecular sequences and structures, and we provide an example Python program to demonstrate how easily Opal-based web services can be accessed from within an application. Web server availability: http://webservices.rbvi.ucsf.edu/opal2/dashboard?command=serviceList. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Identification and the molecular mechanism of a novel myosin-derived ACE inhibitory peptide.
Yu, Zhipeng; Wu, Sijia; Zhao, Wenzhu; Ding, Long; Shiuan, David; Chen, Feng; Li, Jianrong; Liu, Jingbo
2018-01-24
The objective of this work was to identify a novel ACE inhibitory peptide from myosin using a number of in silico methods. Myosin was evaluated as a substrate for use in the generation of ACE inhibitory peptides using BIOPEP and ExPASy PeptideCutter. Then the ACE inhibitory activity prediction of peptides in silico was evaluated using the program peptide ranker, following the database search of known and unknown peptides using the program BIOPEP. In addition, the interaction mechanisms of the peptide and ACE were evaluated by DS. All of the tripeptides were predicted to be nontoxic. Results suggested that the tripeptide NCW exerted potent ACE inhibitory activity with an IC50 value of 35.5 μM. Furthermore, the results suggested that the peptide NCW comes into contact with Zn 701, Tyr 523, His 383, Glu 384, Glu 411, and His 387. The potential molecular mechanism of the NCW/ACE interaction was investigated. Results confirmed that the higher inhibitory potency of NCW might be attributed to the formation of more hydrogen bonds with the active site of ACE. Therefore, the in silico method is effective for predicting and identifying novel ACE inhibitory peptides from protein hydrolysates.
Lee, Patricia N; McFall-Ngai, Margaret J; Callaerts, Patrick; de Couet, H Gert
2009-11-01
The Hawaiian bobtail squid, Euprymna scolopes, is a cephalopod whose small size, short lifespan, rapid growth, and year-round availability make it suitable as a model organism. E. scolopes is studied in three principal contexts: (1) as a model of cephalopod development; (2) as a model of animal-bacterial symbioses; and (3) as a system for studying adaptations of tissues that interact with light. E. scolopes embryos can be obtained continually and can be reared in the laboratory over an entire generation. The embryos and protective chorions are optically clear, facilitating in situ developmental observations, and can be manipulated experimentally. Many molecular protocols have been developed for studying E. scolopes development. This species is best known, however, for its symbiosis with the luminous marine bacterium Vibrio fischeri and has been used to study determinants of symbiont specificity, the influence of symbiosis on development of the squid light organ, and the mechanisms by which a stable association is achieved. Both partners can be grown independently under laboratory conditions, a feature that offers the unusual opportunity to manipulate the symbiosis experimentally. Molecular and genetic tools have been developed for V. fischeri, and a large expressed sequence tag (EST) database is available for the host symbiotic tissues. Additionally, comparisons between light organ form and function to those of the eye can be made. Both types of tissue interact with light, but have divergent embryonic development. As such, they offer an opportunity to study the molecular basis for the evolution of morphological novelties.
NASA Astrophysics Data System (ADS)
Rajamanikandan, Sundaraj; Srinivasan, Pappu
2017-03-01
Bacteria communicate with one another using extracellular signaling molecules called auto-inducers (AHLs), a process termed quorum sensing. The quorum sensing process allows bacteria to regulate various physiological activities. In this regard, the quorum sensing master regulator LuxR from Vibrio harveyi represents an attractive therapeutic target for the development of novel anti-quorum sensing agents. Even though the binding of the AHL complex with LuxR is evidenced in earlier reports, its mode of binding is not clearly determined. Therefore, in the present work, molecular docking, in silico mutational studies, molecular dynamics simulations and free energy calculations were performed to understand the selectivity of AHL for the binding site of LuxR. The results revealed that the Asn133 and Gln137 residues play a crucial role in recognizing AHL effectively in the binding site of LuxR with good binding free energy. In addition, the carbonyl group present in the lactone ring and amide group of AHL plays a vital role in the formation of hydrogen bond interactions with the protein. Further, structure-based virtual screening was performed using the ChemBridge database to screen potent lead molecules against LuxR. 4-benzyl-2-pyrrolidinone and N-[2(1-cyclohexen-1-yl) enthyl]-N'(2-ethoxyphenyl) were selected based on dock score, binding affinity and mode of interactions with the receptor. Furthermore, binding free energy, density functional theory and ADME prediction were performed to rank the lead molecules. Thus, the identified lead molecules can be used for the development of anti-quorum sensing drugs.
NASA Astrophysics Data System (ADS)
Lalit, Manisha; Gangwal, Rahul P.; Dhoke, Gaurao V.; Damre, Mangesh V.; Khandelwal, Kanchan; Sangamwar, Abhay T.
2013-10-01
A combined pharmacophore modelling, 3D-QSAR and molecular docking approach was employed to reveal structural and chemical features essential for the development of small molecules as LRH-1 agonists. The best HypoGen pharmacophore hypothesis (Hypo1) consists of one hydrogen-bond donor (HBD), two general hydrophobic (H), one hydrophobic aromatic (HYAr) and one hydrophobic aliphatic (HYA) feature. It exhibited a high correlation coefficient of 0.927, a cost difference of 85.178 bits and a low RMS value of 1.411. This pharmacophore hypothesis was cross-validated using a test set, a decoy set and the Cat-Scramble methodology. Subsequently, the validated pharmacophore hypothesis was used in the screening of small chemical databases. Further, 3D-QSAR models were developed based on the alignment obtained using substructure alignment. The best CoMFA and CoMSIA models exhibited excellent r²_ncv values of 0.991 and 0.987, and r²_cv values of 0.767 and 0.703, respectively. The CoMFA-predicted r²_pred of 0.87 and the CoMSIA-predicted r²_pred of 0.78 showed that the predicted values were in good agreement with the experimental values. Molecular docking analysis reveals that a π-π interaction with His390 and hydrogen bond interactions with His390/Arg393 are essential for LRH-1 agonistic activity. The results from pharmacophore modelling, 3D-QSAR and molecular docking are complementary to each other and could serve as a powerful tool for the discovery of potent small molecules as LRH-1 agonists.
Molecular scaffold analysis of natural products databases in the public domain.
Yongye, Austin B; Waddell, Jacob; Medina-Franco, José L
2012-11-01
Natural products represent important sources of bioactive compounds in drug discovery efforts. In this work, we compiled five natural products databases available in the public domain and performed a comprehensive chemoinformatic analysis focused on the content and diversity of the scaffolds with an overview of the diversity based on molecular fingerprints. The natural products databases were compared with each other and with a set of molecules obtained from in-house combinatorial libraries, and with a general screening commercial library. It was found that publicly available natural products databases have different scaffold diversity. In contrast to the common concept that larger libraries have the largest scaffold diversity, the largest natural products collection analyzed in this work was not the most diverse. The general screening library showed, overall, the highest scaffold diversity. However, considering the most frequent scaffolds, the general reference library was the least diverse. In general, natural products databases in the public domain showed low molecule overlap. In addition to benzene and acyclic compounds, flavones, coumarins, and flavanones were identified as the most frequent molecular scaffolds across the different natural products collections. The results of this work have direct implications in the computational and experimental screening of natural product databases for drug discovery. © 2012 John Wiley & Sons A/S.
SCRIPDB: a portal for easy access to syntheses, chemicals and reactions in patents
Heifets, Abraham; Jurisica, Igor
2012-01-01
The patent literature is a rich catalog of biologically relevant chemicals; many public and commercial molecular databases contain the structures disclosed in patent claims. However, patents are an equally rich source of metadata about bioactive molecules, including mechanism of action, disease class, homologous experimental series, structural alternatives, or the synthetic pathways used to produce molecules of interest. Unfortunately, this metadata is discarded when chemical structures are deposited separately in databases. SCRIPDB is a chemical structure database designed to make this metadata accessible. SCRIPDB provides the full original patent text, reactions and relationships described within any individual patent, in addition to the molecular files common to structural databases. We discuss how such information is valuable in medical text mining, chemical image analysis, reaction extraction and in silico pharmaceutical lead optimization. SCRIPDB may be searched by exact chemical structure, substructure or molecular similarity and the results may be restricted to patents describing synthetic routes. SCRIPDB is available at http://dcv.uhnres.utoronto.ca/SCRIPDB. PMID:22067445
The HITRAN molecular data base - Editions of 1991 and 1992
NASA Technical Reports Server (NTRS)
Rothman, Laurence S.; Gamache, R. R.; Tipping, R. H.; Rinsland, C. P.; Smith, M. A. H.; Benner, D. C.; Devi, V. M.; Flaud, J.-M.; Camy-Peyret, C.; Perrin, A.
1992-01-01
We describe in this paper the modifications, improvements, and enhancements to the HITRAN molecular absorption database that have occurred in the two editions of 1991 and 1992. The current database includes line parameters for 31 species and their isotopomers that are significant for terrestrial atmospheric studies. This line-by-line portion of HITRAN presently contains about 709,000 transitions between 0 and 23,000/cm and contains three molecules not present in earlier versions: COF2, SF6, and H2S. The HITRAN compilation has substantially more information on chlorofluorocarbons and other molecular species that exhibit dense spectra which are not amenable to line-by-line representation. The user access of the database has been advanced, and new media forms are now available for use on personal computers.
dbMDEGA: a database for meta-analysis of differentially expressed genes in autism spectrum disorder.
Zhang, Shuyun; Deng, Libin; Jia, Qiyue; Huang, Shaoting; Gu, Junwang; Zhou, Fankun; Gao, Meng; Sun, Xinyi; Feng, Chang; Fan, Guangqin
2017-11-16
Autism spectrum disorders (ASD) are hereditary, heterogeneous and biologically complex neurodevelopmental disorders. Individual studies on gene expression in ASD cannot provide clear consensus conclusions. Therefore, a systematic review to synthesize the current findings from brain tissues and a search tool to share the meta-analysis results are urgently needed. Here, we conducted a meta-analysis of brain gene expression profiles in the current reported human ASD expression datasets (with 84 frozen male cortex samples, 17 female cortex samples, 32 cerebellum samples and 4 formalin fixed samples) and knock-out mouse ASD model expression datasets (with 80 collective brain samples). Then, we applied R language software and developed an interactive shared and updated database (dbMDEGA) displaying the results of meta-analysis of data from ASD studies regarding differentially expressed genes (DEGs) in the brain. This database, dbMDEGA ( https://dbmdega.shinyapps.io/dbMDEGA/ ), is a publicly available web-portal for manual annotation and visualization of DEGs in the brain from data from ASD studies. This database uniquely presents meta-analysis values and homologous forest plots of DEGs in brain tissues. Gene entries are annotated with meta-values, statistical values and forest plots of DEGs in brain samples. This database aims to provide searchable meta-analysis results based on the current reported brain gene expression datasets of ASD to help detect candidate genes underlying this disorder. This new analytical tool may provide valuable assistance in the discovery of DEGs and the elucidation of the molecular pathogenicity of ASD. This database model may be replicated to study other disorders.
Coarse-Grained Models for Automated Fragmentation and Parametrization of Molecular Databases.
Fraaije, Johannes G E M; van Male, Jan; Becherer, Paul; Serral Gracià, Rubèn
2016-12-27
We calibrate coarse-grained interaction potentials suitable for screening large data sets in top-down fashion. Three new algorithms are introduced: (i) automated decomposition of molecules into coarse-grained units (fragmentation); (ii) Coarse-Grained Reference Interaction Site Model-Hypernetted Chain (CG RISM-HNC) as an intermediate proxy for dissipative particle dynamics (DPD); and (iii) a simple top-down coarse-grained interaction potential/model based on activity coefficient theories from engineering (using COSMO-RS). We find that the fragment distribution follows Zipf and Heaps scaling laws. The accuracy in Gibbs energy of mixing calculations is a few tenths of a kilocalorie per mole. As a final proof of principle, we use full coarse-grained sampling through DPD thermodynamic integration to calculate log P_OW for 4627 compounds with an average error of 0.84 log unit. The computational speeds per calculation are a few seconds for CG RISM-HNC and a few minutes for DPD thermodynamic integration.
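As a rough illustration of what checking these two scaling laws involves: Zipf's law concerns how fragment frequency falls off with frequency rank, and Heaps' law concerns how the number of distinct fragments grows as more fragments are read. The sketch below computes both raw curves for a list of fragment labels. The labels and function names are hypothetical, and the fragmentation algorithm itself is not reproduced here.

```python
from collections import Counter

def rank_frequency(tokens):
    """Zipf view: (rank, count) pairs, most frequent fragment first."""
    counts = sorted(Counter(tokens).values(), reverse=True)
    return list(enumerate(counts, start=1))

def vocabulary_growth(tokens):
    """Heaps view: number of distinct fragments seen after each token."""
    seen = set()
    growth = []
    for tok in tokens:
        seen.add(tok)
        growth.append(len(seen))
    return growth

# Hypothetical fragment labels standing in for the output of a fragmentation step.
fragments = ["CH3", "CH2", "CH2", "OH", "CH3", "CH2", "C6H5", "CH2"]
zipf_curve = rank_frequency(fragments)
heaps_curve = vocabulary_growth(fragments)
```

On a real data set one would plot both curves on log-log axes and look for approximately straight lines, which is the usual signature of power-law behaviour.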
Pafilis, Evangelos; Buttigieg, Pier Luigi; Ferrell, Barbra; Pereira, Emiliano; Schnetzer, Julia; Arvanitidis, Christos; Jensen, Lars Juhl
2016-01-01
The microbial and molecular ecology research communities have made substantial progress on developing standards for annotating samples with environment metadata. However, manual annotation of samples is a highly labor-intensive process and requires familiarity with the terminologies used. We have therefore developed an interactive annotation tool, EXTRACT, which helps curators identify and extract standard-compliant terms for annotation of metagenomic records and other samples. Behind its web-based user interface, the system combines published methods for named entity recognition of environment, organism, tissue and disease terms. The evaluators in the BioCreative V Interactive Annotation Task found the system to be intuitive, useful, well documented and sufficiently accurate to be helpful in spotting relevant text passages and extracting organism and environment terms. Comparison of fully manual and text-mining-assisted curation revealed that EXTRACT speeds up annotation by 15-25% and helps curators to detect terms that would otherwise have been missed. Database URL: https://extract.hcmr.gr/. © The Author(s) 2016. Published by Oxford University Press.
Faulon, Jean-Loup; Misra, Milind; Martin, Shawn; ...
2007-11-23
Motivation: Identifying protein enzymatic or pharmacological activities are important areas of research in biology and chemistry. Biological and chemical databases are increasingly being populated with linkages between protein sequences and chemical structures. Additionally, there is now sufficient information to apply machine-learning techniques to predict interactions between chemicals and proteins at a genome scale. Current machine-learning techniques use as input either protein sequences and structures or chemical information. We propose here a method to infer protein–chemical interactions using heterogeneous input consisting of both protein sequence and chemical information. Results: Our method relies on expressing proteins and chemicals with a common cheminformatics representation. We demonstrate our approach by predicting whether proteins can catalyze reactions not present in training sets. We also predict whether a given drug can bind a target, in the absence of prior binding information for that drug and target. Lastly, such predictions cannot be made with current machine-learning techniques requiring binding information for individual reactions or individual targets.
The HITRAN 2008 Molecular Spectroscopic Database
NASA Technical Reports Server (NTRS)
Rothman, Laurence S.; Gordon, Iouli E.; Barbe, Alain; Benner, D. Chris; Bernath, Peter F.; Birk, Manfred; Boudon, V.; Brown, Linda R.; Campargue, Alain; Champion, J.-P.;
2009-01-01
This paper describes the status of the 2008 edition of the HITRAN molecular spectroscopic database. The new edition is the first official public release since the 2004 edition, although a number of crucial updates had been made available online since 2004. The HITRAN compilation consists of several components that serve as input for radiative-transfer calculation codes: individual line parameters for the microwave through visible spectra of molecules in the gas phase; absorption cross-sections for molecules having dense spectral features, i.e., spectra in which the individual lines are not resolved; individual line parameters and absorption cross sections for bands in the ultra-violet; refractive indices of aerosols, tables and files of general properties associated with the database; and database management software. The line-by-line portion of the database contains spectroscopic parameters for forty-two molecules including many of their isotopologues.
SBION: A Program for Analyses of Salt-Bridges from Multiple Structure Files.
Gupta, Parth Sarthi Sen; Mondal, Sudipta; Mondal, Buddhadev; Islam, Rifat Nawaz Ul; Banerjee, Shyamashree; Bandyopadhyay, Amal K
2014-01-01
Salt-bridges and network salt-bridges are specific electrostatic interactions that contribute to the overall stability of proteins. In the hierarchical protein folding model, these interactions play a crucial role in the nucleation process. The advent and growth of protein structure databases and their availability in the public domain have created an urgent need for context-dependent, rapid analysis of salt-bridges. While such analysis of a single protein is cumbersome and time-consuming, batch analyses need efficient software for a rapid topological scan of a large number of proteins to extract details on (i) the fraction of salt-bridge residues (acidic and basic), (ii) chain-specific intra-molecular salt-bridges, (iii) inter-molecular salt-bridges (protein-protein interactions) in all possible binary combinations, (iv) network salt-bridges and (v) the secondary structure distribution of salt-bridge residues. To the best of our knowledge, such efficient software is not available in the public domain. We have therefore developed a program, SBION, which can perform all of the above-mentioned computations for any number of proteins with any number of chains at any given ion-pair distance. It is highly efficient, fast, error-free and user friendly. SBION thus has potential for applications in structural and comparative bioinformatics studies. SBION is freely available for non-commercial/academic institutions on formal request to the corresponding author (akbanerjee@biotech.buruniv.ac.in).
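The core test behind items (ii) and (iii) above, whether an acidic and a basic side-chain atom lie within a chosen ion-pair distance, can be sketched as follows. This is not SBION's code. The atom-record layout, the function name and the 4.0 Å default cutoff are assumptions made for illustration; the atom names are the standard PDB names for charged side-chain atoms.

```python
import math
from itertools import product

# Side-chain atoms that can carry the charge (standard PDB atom names).
ACIDIC = {("ASP", "OD1"), ("ASP", "OD2"), ("GLU", "OE1"), ("GLU", "OE2")}
BASIC = {("LYS", "NZ"), ("ARG", "NE"), ("ARG", "NH1"), ("ARG", "NH2"),
         ("HIS", "ND1"), ("HIS", "NE2")}

def find_salt_bridges(atoms, cutoff=4.0):
    """Map each (acidic residue, basic residue) pair to (distance, kind).

    atoms: list of dicts with keys chain, resname, resid, atom, xyz.
    cutoff: assumed ion-pair distance in angstroms (user-adjustable).
    """
    acids = [a for a in atoms if (a["resname"], a["atom"]) in ACIDIC]
    bases = [a for a in atoms if (a["resname"], a["atom"]) in BASIC]
    bridges = {}
    for acid, base in product(acids, bases):
        d = math.dist(acid["xyz"], base["xyz"])
        if d > cutoff:
            continue
        key = ((acid["chain"], acid["resname"], acid["resid"]),
               (base["chain"], base["resname"], base["resid"]))
        # Keep only the shortest atom-atom distance for each residue pair.
        if key not in bridges or d < bridges[key][0]:
            kind = "intra" if acid["chain"] == base["chain"] else "inter"
            bridges[key] = (d, kind)
    return bridges

# Hypothetical coordinates, not taken from any real structure file.
atoms = [
    {"chain": "A", "resname": "ASP", "resid": 10, "atom": "OD1", "xyz": (0.0, 0.0, 0.0)},
    {"chain": "A", "resname": "LYS", "resid": 14, "atom": "NZ", "xyz": (3.0, 0.0, 0.0)},
    {"chain": "B", "resname": "ARG", "resid": 5, "atom": "NH1", "xyz": (0.0, 3.5, 0.0)},
    {"chain": "B", "resname": "GLU", "resid": 20, "atom": "OE1", "xyz": (20.0, 0.0, 0.0)},
]
bridges = find_salt_bridges(atoms)
```

Network salt-bridges could then be found by treating residues as graph nodes and bridges as edges, and reporting connected components with more than two residues.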
Fish Karyome version 2.1: a chromosome database of fishes and other aquatic organisms
Nagpure, Naresh Sahebrao; Pathak, Ajey Kumar; Pati, Rameshwar; Rashid, Iliyas; Sharma, Jyoti; Singh, Shri Prakash; Singh, Mahender; Sarkar, Uttam Kumar; Kushwaha, Basdeo; Kumar, Ravindra; Murali, S.
2016-01-01
Voluminous information is available on karyological studies of fishes; however, limited efforts have been made to compile and curate the available karyological data in digital form. The ‘Fish Karyome’ database was a preliminary attempt to compile and digitize the available karyological information on finfishes of the Indian subcontinent. However, the database had limitations, since it covered data only on Indian finfishes and offered limited search options. In response to user feedback and in view of its utility in fish cytogenetic studies, the Fish Karyome database was upgraded using Linux, Apache, MySQL and PHP (PHP: Hypertext Preprocessor) (LAMP) technologies. In the present version, the scope of the system was increased by compiling and curating the available chromosomal information from around the globe on fishes and other aquatic organisms, such as echinoderms, molluscs and arthropods, especially those of aquaculture importance. Thus, Fish Karyome version 2.1 presently covers 866 chromosomal records for 726 species supported by 253 published articles, and the information is being updated regularly. The database provides information on chromosome number and morphology, sex chromosomes, chromosome banding, molecular cytogenetic markers, etc., supported by fish and karyotype images through interactive tools. It also enables users to browse and view chromosomal information based on habitat, family, conservation status and chromosome number. The system also displays chromosome numbers in model organisms, protocols for chromosome preparation and allied techniques, and a glossary of cytogenetic terms. A data submission facility is also provided through a data submission panel. The database can serve as a unique and useful resource for cytogenetic characterization, sex determination, chromosomal mapping, cytotaxonomy, karyo-evolution and systematics of fishes. Database URL: http://mail.nbfgr.res.in/Fish_Karyome PMID:26980518
Fish Karyome version 2.1: a chromosome database of fishes and other aquatic organisms.
Nagpure, Naresh Sahebrao; Pathak, Ajey Kumar; Pati, Rameshwar; Rashid, Iliyas; Sharma, Jyoti; Singh, Shri Prakash; Singh, Mahender; Sarkar, Uttam Kumar; Kushwaha, Basdeo; Kumar, Ravindra; Murali, S
2016-01-01
Voluminous information is available on karyological studies of fishes; however, limited efforts have been made to compile and curate the available karyological data in digital form. The 'Fish Karyome' database was a preliminary attempt to compile and digitize the available karyological information on finfishes belonging to the Indian subcontinent. However, the database had limitations, since it covered data only on Indian finfishes and offered limited search options. In response to user feedback and in view of its utility in fish cytogenetic studies, the Fish Karyome database was upgraded using Linux, Apache, MySQL and PHP (Hypertext Preprocessor) (LAMP) technologies. In the present version, the scope of the system was increased by compiling and curating the chromosomal information available worldwide on fishes and other aquatic organisms, such as echinoderms, molluscs and arthropods, especially those of aquaculture importance. Fish Karyome version 2.1 presently covers 866 chromosomal records for 726 species, supported by 253 published articles, and the information is updated regularly. The database provides information on chromosome number and morphology, sex chromosomes, chromosome banding, molecular cytogenetic markers, etc., supported by fish and karyotype images through interactive tools. It also enables users to browse and view chromosomal information based on habitat, family, conservation status and chromosome number. The system also displays chromosome numbers in model organisms, protocols for chromosome preparation and allied techniques, and a glossary of cytogenetic terms. A data submission facility is also provided through a data submission panel. The database can serve as a unique and useful resource for cytogenetic characterization, sex determination, chromosomal mapping, cytotaxonomy, karyo-evolution and systematics of fishes. Database URL: http://mail.nbfgr.res.in/Fish_Karyome. © The Author(s) 2016. Published by Oxford University Press.
Role for protein–protein interaction databases in human genetics
Pattin, Kristine A; Moore, Jason H
2010-01-01
Proteomics and the study of protein–protein interactions are becoming increasingly important in our effort to understand human diseases on a system-wide level. Thanks to the development and curation of protein-interaction databases, up-to-date information on these interaction networks is accessible and publicly available to the scientific community. As our knowledge of protein–protein interactions increases, it is important to give thought to the different ways that these resources can impact biomedical research. In this article, we highlight the importance of protein–protein interactions in human genetics and genetic epidemiology. Since protein–protein interactions demonstrate one of the strongest functional relationships between genes, combining genomic data with available proteomic data may provide us with a more in-depth understanding of common human diseases. In this review, we will discuss some of the fundamentals of protein interactions, the databases that are publicly available and how information from these databases can be used to facilitate genome-wide genetic studies. PMID:19929610
The Knowledge-Integrated Network Biomarkers Discovery for Major Adverse Cardiac Events
Jin, Guangxu; Zhou, Xiaobo; Wang, Honghui; Zhao, Hong; Cui, Kemi; Zhang, Xiang-Sun; Chen, Luonan; Hazen, Stanley L.; Li, King; Wong, Stephen T. C.
2010-01-01
Mass spectrometry (MS) technology in clinical proteomics is very promising for the discovery of new biomarkers for disease management. To overcome the obstacle of data noise in MS analysis, we proposed a new approach of knowledge-integrated biomarker discovery using data from Major Adverse Cardiac Events (MACE) patients. We first built a cardiovascular-related network based on protein information from protein annotations in UniProt, protein–protein interaction (PPI) data and a signal transduction database. In contrast to previous machine learning methods for MS data processing, we then used statistical methods to discover biomarkers in the cardiovascular-related network. Through the tradeoff between known protein information and noise in the mass spectrometry data, we could finally identify high-confidence biomarkers. Most importantly, aided by the protein–protein interaction network, that is, the cardiovascular-related network, we proposed a new type of biomarker, the network biomarker, composed of a set of proteins and the interactions among them. The candidate network biomarkers can classify the two groups of patients more accurately than current single biomarkers, which do not consider biological molecular interactions. PMID:18665624
Rezza, Amélie; Wang, Zichen; Sennett, Rachel; Qiao, Wenlian; Wang, Dongmei; Heitman, Nicholas; Mok, Ka Wai; Clavel, Carlos; Yi, Rui; Zandstra, Peter; Ma'ayan, Avi; Rendl, Michael
2016-03-29
The hair follicle (HF) is a complex miniorgan that serves as an ideal model system to study stem cell (SC) interactions with the niche during growth and regeneration. Dermal papilla (DP) cells are required for SC activation during the adult hair cycle, but signal exchange between niche and SC precursors/transit-amplifying cell (TAC) progenitors that regulates HF morphogenetic growth is largely unknown. Here we use six transgenic reporters to isolate 14 major skin and HF cell populations. With next-generation RNA sequencing, we characterize their transcriptomes and define unique molecular signatures. SC precursors, TACs, and the DP niche express a plethora of ligands and receptors. Signaling interaction network analysis reveals a bird's-eye view of pathways implicated in epithelial-mesenchymal interactions. Using a systematic tissue-wide approach, this work provides a comprehensive platform, linked to an interactive online database, to identify and further explore the SC/TAC/niche crosstalk regulating HF growth. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Predicting drug-target interactions using restricted Boltzmann machines.
Wang, Yuhao; Zeng, Jianyang
2013-07-01
In silico prediction of drug-target interactions plays an important role toward identifying and developing new uses of existing or abandoned drugs. Network-based approaches have recently become a popular tool for discovering new drug-target interactions (DTIs). Unfortunately, most of these network-based approaches can only predict binary interactions between drugs and targets, and information about different types of interactions has not been well exploited for DTI prediction in previous studies. On the other hand, incorporating additional information about drug-target relationships or drug modes of action can improve prediction of DTIs. Furthermore, the predicted types of DTIs can broaden our understanding about the molecular basis of drug action. We propose a first machine learning approach to integrate multiple types of DTIs and predict unknown drug-target relationships or drug modes of action. We cast the new DTI prediction problem into a two-layer graphical model, called restricted Boltzmann machine, and apply a practical learning algorithm to train our model and make predictions. Tests on two public databases show that our restricted Boltzmann machine model can effectively capture the latent features of a DTI network and achieve excellent performance on predicting different types of DTIs, with the area under precision-recall curve up to 89.6. In addition, we demonstrate that integrating multiple types of DTIs can significantly outperform other predictions either by simply mixing multiple types of interactions without distinction or using only a single interaction type. Further tests show that our approach can infer a high fraction of novel DTIs that has been validated by known experiments in the literature or other databases. These results indicate that our approach can have highly practical relevance to DTI prediction and drug repositioning, and hence advance the drug discovery process. Software and datasets are available on request. 
Supplementary data are available at Bioinformatics online.
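The two-layer model named in the abstract above can be illustrated with a very small restricted Boltzmann machine. This is only a sketch under stated assumptions: the abstract does not name the learning algorithm, so one step of contrastive divergence (CD-1), a common choice for RBMs, is used here; the function names and the idea of feeding a binary vector of known interactions as the visible layer are hypothetical and not taken from the paper.

```python
import math
import random


def sigmoid(x):
    """Logistic function used for both conditional distributions."""
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W, c):
    """P(h_j = 1 | v) for every hidden unit j; W[i][j] links visible i to hidden j."""
    return [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]


def visible_probs(h, W, b):
    """P(v_i = 1 | h) for every visible unit i."""
    return [sigmoid(b[i] + sum(h[j] * W[i][j] for j in range(len(h))))
            for i in range(len(b))]


def cd1_step(v0, W, b, c, lr=0.1, rng=random):
    """One CD-1 update in place; returns the reconstructed visible probabilities.

    v0 is a binary visible vector (assumed here to encode known
    drug-target interactions; the encoding is an illustration only).
    """
    ph0 = hidden_probs(v0, W, c)                       # positive phase
    h0 = [1.0 if rng.random() < p else 0.0 for p in ph0]  # sample hidden states
    pv1 = visible_probs(h0, W, b)                      # reconstruction
    ph1 = hidden_probs(pv1, W, c)                      # negative phase
    for i in range(len(b)):
        for j in range(len(c)):
            W[i][j] += lr * (v0[i] * ph0[j] - pv1[i] * ph1[j])
        b[i] += lr * (v0[i] - pv1[i])
    for j in range(len(c)):
        c[j] += lr * (ph0[j] - ph1[j])
    return pv1
```

After training, the reconstructed probabilities for visible units that were 0 in the input could be read as scores for unobserved interactions; how the published model encodes interaction types is not specified in the abstract.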
Simple system--substantial share: the use of Dictyostelium in cell biology and molecular medicine.
Müller-Taubenberger, Annette; Kortholt, Arjan; Eichinger, Ludwig
2013-02-01
Dictyostelium discoideum offers unique advantages for studying fundamental cellular processes, host-pathogen interactions as well as the molecular causes of human diseases. The organism can be easily grown in large amounts and is amenable to diverse biochemical, cell biological and genetic approaches. Throughout their life cycle Dictyostelium cells are motile, and thus are perfectly suited to study random and directed cell motility with the underlying changes in signal transduction and the actin cytoskeleton. Dictyostelium is also increasingly used for the investigation of human disease genes and the crosstalk between host and pathogen. As a professional phagocyte it can be infected with several human bacterial pathogens and used to study the infection process. The availability of a large number of knock-out mutants renders Dictyostelium particularly useful for the elucidation and investigation of host cell factors. A powerful armory of molecular genetic techniques that have been continuously expanded over the years and a well curated genome sequence, which is accessible via the online database dictyBase, considerably strengthened Dictyostelium's experimental attractiveness and its value as model organism. Copyright © 2012 Elsevier GmbH. All rights reserved.
Xie, Duoli; Shi, Tieliu; Wen, Chengping
2017-01-01
Traditional Chinese Medicine (TCM) has been widely used as a complementary medicine in Acute Myeloid Leukemia (AML) treatment. In this study, we proposed a new classification of Chinese Medicines (CMs) by integrating the latest discoveries in disease molecular mechanisms and traditional medicine theory. We screened a set of chemical compounds on the basis of AML differentially expressed genes and chemical-protein interactions and then mapped them to the Traditional Chinese Medicine Integrated Database. A total of 415 CMs contain those compounds, and they were categorized into 8 groups according to Traditional Chinese Pharmacology. Pathway analysis and synthetic lethality gene pairs were applied to analyze the dissimilarity, generality and intergroup relations of the different groups. We defined hub CM pairs and alternative CM groups based on the analysis results and finally proposed a formula for an effective anti-AML prescription that combines the hub CM pairs with alternative CMs according to patients’ molecular features. Our method of formulating CMs based on patient stratification provides novel insights into new uses of conventional CMs and will promote TCM modernization. PMID:28454110
Huang, Lin; Li, Haichang; Xie, Duoli; Shi, Tieliu; Wen, Chengping
2017-06-27
Traditional Chinese Medicine (TCM) has been widely used as a complementary medicine in Acute Myeloid Leukemia (AML) treatment. In this study, we proposed a new classification of Chinese Medicines (CMs) by integrating the latest discoveries in disease molecular mechanisms and traditional medicine theory. We screened a set of chemical compounds on the basis of AML differentially expressed genes and chemical-protein interactions and then mapped them to the Traditional Chinese Medicine Integrated Database. A total of 415 CMs contain those compounds, and they were categorized into 8 groups according to Traditional Chinese Pharmacology. Pathway analysis and synthetic lethality gene pairs were applied to analyze the dissimilarity, generality and intergroup relations of the different groups. We defined hub CM pairs and alternative CM groups based on the analysis results and finally proposed a formula for an effective anti-AML prescription that combines the hub CM pairs with alternative CMs according to patients' molecular features. Our method of formulating CMs based on patient stratification provides novel insights into new uses of conventional CMs and will promote TCM modernization.
Fu, Ying; Sun, Yi-Na; Yi, Ke-Han; Li, Ming-Qiang; Cao, Hai-Feng; Li, Jia-Zhong; Ye, Fei
2017-06-09
p-Hydroxyphenylpyruvate dioxygenase (HPPD) is not only a useful molecular target in treating life-threatening tyrosinemia type I, but also an important target for chemical herbicides. A combined in silico structure-based pharmacophore and molecular docking-based virtual screening was performed to identify novel potential HPPD inhibitors. The complex-based pharmacophore model (CBP), with a ROC of 0.721, used for screening compounds showed remarkable ability to retrieve known active ligands from among decoy molecules. The ChemDiv database was screened using CBP-Hypo2 as a 3D query, and the best-fit hits were subjected to molecular docking with two methods, LibDock and CDOCKER, in Accelrys Discovery Studio 2.5 (DS 2.5) to discern interactions with key residues at the active site of HPPD. Four compounds with top rankings in the HipHop model and well-known binding model were finally chosen as lead compounds with potential inhibitory effects on the active site of the target. The results provide insight into the development of novel HPPD-inhibitor herbicides using computational techniques.
PhyloExplorer: a web server to validate, explore and query phylogenetic trees
Ranwez, Vincent; Clairon, Nicolas; Delsuc, Frédéric; Pourali, Saeed; Auberval, Nicolas; Diser, Sorel; Berry, Vincent
2009-01-01
Background Many important problems in evolutionary biology require molecular phylogenies to be reconstructed. Phylogenetic trees must then be manipulated for subsequent inclusion in publications or analyses such as supertree inference and tree comparisons. However, no tool is currently available to facilitate the management of tree collections providing, for instance: standardisation of taxon names among trees with respect to a reference taxonomy; selection of relevant subsets of trees or sub-trees according to a taxonomic query; or simply computation of descriptive statistics on the collection. Moreover, although several databases of phylogenetic trees exist, there is currently no easy way to find trees that are both relevant and complementary to a given collection of trees. Results We propose a tool to facilitate assessment and management of phylogenetic tree collections. Given an input collection of rooted trees, PhyloExplorer provides facilities for obtaining statistics describing the collection, correcting invalid taxon names, extracting taxonomically relevant parts of the collection using a dedicated query language, and identifying related trees in the TreeBASE database. Conclusion PhyloExplorer is a simple and interactive website implemented through underlying Python libraries and MySQL databases. It is available at: and the source code can be downloaded from: . PMID:19450253
Bošnjaković-Pavlović, Nada; Bajuk-Bogdanović, Danica; Zakrzewska, Joanna; Yan, Zeyin; Holclajtner-Antunović, Ivanka; Gillet, Jean-Michel; Spasojević-de Biré, Anne
2017-11-01
The influence of 12-tungstophosphoric acid (WPA) on the conversion of adenosine triphosphate (ATP) to adenosine diphosphate (ADP) in the presence of Na+/K+-ATPase was monitored by 31P NMR spectroscopy. It was shown that WPA exhibits an inhibitory effect on Na+/K+-ATPase activity. In order to study WPA reactivity and intermolecular interactions between WPA oxygen atoms and different proton donor types (D=O, N, C), we have considered data for WPA-based compounds from the Cambridge Structural Database (CSD), the Crystallographic Open Database (COD) and the Inorganic Crystal Structure Database (ICSD). Binding properties of Keggin's anion in biological systems are illustrated using the Protein Data Bank (PDB). This work constitutes the first determination of theoretical Bader charges on a polyoxotungstate compound via the Atoms in Molecules theory. An analysis of electrostatic potential maps at the molecular surface and charge of WPA, resulting from DFT calculations, suggests that the preferred protonation site corresponds to the WPA bridging oxygen. These results shed light on WPA chemical reactivity and its potential biological applications such as the inhibition of ATPase activity. Copyright © 2017 Elsevier Inc. All rights reserved.
SMMRNA: a database of small molecule modulators of RNA
Mehta, Ankita; Sonam, Surabhi; Gouri, Isha; Loharch, Saurabh; Sharma, Deepak K.; Parkesh, Raman
2014-01-01
We have developed SMMRNA, an interactive database, available at http://www.smmrna.org, with special focus on small molecule ligands targeting RNA. Currently, SMMRNA consists of ∼770 unique ligands along with structural images of RNA molecules. Each ligand entry in SMMRNA contains information such as Kd, Ki, IC50, ΔTm, molecular weight (MW), hydrogen donor and acceptor count, XlogP, number of rotatable bonds, number of aromatic rings and 2D and 3D structures. These parameters can be explored using text search, advanced search, substructure and similarity-based analysis tools that are embedded in SMMRNA. A structure editor is provided for 3D visualization of ligands. Advanced analysis can be performed using substructure and OpenBabel-based chemical similarity fingerprints. An upload facility for both RNA and ligands is also provided. The physicochemical properties of the ligands were further examined using OpenBabel descriptors, hierarchical clustering, binning partition and multidimensional scaling. We have also generated a 3D conformation database of ligands to support structure- and ligand-based screening. SMMRNA provides a comprehensive resource for further design, development and refinement of small molecule modulators for selective targeting of RNA molecules. PMID:24163098
Amadoz, Alicia; González-Candelas, Fernando
2007-04-20
Most research scientists working in the fields of molecular epidemiology, population and evolutionary genetics are confronted with the management of large volumes of data. Moreover, the data used in studies of infectious diseases are complex and usually derive from different institutions such as hospitals or laboratories. Since no public database schema incorporating clinical and epidemiological information about patients and molecular information about pathogens is currently available, we have developed an information system, composed of a main database and a web-based interface, which integrates both types of data and satisfies requirements of good organization, simple accessibility, data security and multi-user support. From the moment a patient arrives at a hospital or health centre until the processing and analysis of molecular sequences obtained from infectious pathogens in the laboratory, a large amount of information is collected from different sources. We have divided the most relevant data into 12 conceptual modules around which we have organized the database schema. Our schema covers many aspects of sample sources, samples, laboratory processes, molecular sequences, phylogenetics results, clinical tests and results, clinical information, treatments, pathogens, transmissions, outbreaks and bibliographic information. Communication between end-users and the selected Relational Database Management System (RDBMS) is carried out by default through a command-line window or through a user-friendly, web-based interface which provides access and management tools for the data. epiPATH is an information system for managing clinical and molecular information from infectious diseases. It facilitates daily work related to infectious pathogens and sequences obtained from them. This software is intended for local installation in order to safeguard private data and gives advanced SQL users the flexibility to adapt it to their needs.
The database schema, tool scripts and web-based interface are free software but data stored in our database server are not publicly available. epiPATH is distributed under the terms of GNU General Public License. More details about epiPATH can be found at http://genevo.uv.es/epipath.
Analysing and Rationalising Molecular and Materials Databases Using Machine-Learning
NASA Astrophysics Data System (ADS)
De, Sandip; Ceriotti, Michele
Computational materials design promises to greatly accelerate the process of discovering new or more performant materials. Several collaborative efforts are contributing to this goal by building databases of structures, containing between thousands and millions of distinct hypothetical compounds, whose properties are computed by high-throughput electronic-structure calculations. The complexity and sheer amount of information has made manual exploration, interpretation and maintenance of these databases a formidable challenge, making it necessary to resort to automatic analysis tools. Here we will demonstrate how, starting from a measure of (dis)similarity between database items built from a combination of local environment descriptors, it is possible to apply hierarchical clustering algorithms, as well as dimensionality reduction methods such as sketchmap, to analyse, classify and interpret trends in molecular and materials databases, as well as to detect inconsistencies and errors. Thanks to the agnostic and flexible nature of the underlying metric, we will show how our framework can be applied transparently to different kinds of systems ranging from organic molecules and oligopeptides to inorganic crystal structures as well as molecular crystals. Funded by National Center for Computational Design and Discovery of Novel Materials (MARVEL) and Swiss National Science Foundation.
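The clustering step described in the abstract above starts from a pairwise (dis)similarity between database items. As a minimal sketch, assuming the dissimilarity matrix has already been computed, agglomerative clustering can be written in a few lines. Single linkage is swapped in here for brevity; the abstract does not say which linkage the authors use, and the function name and list-of-lists matrix format are illustrative assumptions.

```python
def single_linkage(dist, n_clusters):
    """Agglomerative clustering on a symmetric dissimilarity matrix.

    dist is a list of lists with dist[i][j] the dissimilarity between
    items i and j. Clusters are merged greedily (closest pair first)
    until n_clusters remain. Returns sorted lists of item indices.
    """
    n_clusters = max(1, n_clusters)
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None  # (distance, index_a, index_b) of the closest pair
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return sorted(sorted(c) for c in clusters)
```

Items that end up alone in a cluster while the rest merge early are candidates for the kind of inconsistency or error detection the abstract mentions, although that use is only suggested here, not demonstrated.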
Li, Minghui; Goncearenco, Alexander; Panchenko, Anna R
2017-01-01
In this review we describe a protocol to annotate the effects of missense mutations on proteins, their functions, stability, and binding. For this purpose we present a collection of the most comprehensive databases that store different types of sequencing data on missense mutations, and we discuss their relationships, possible intersections, and unique features. Next, we suggest an annotation workflow using state-of-the-art methods and highlight their usability, advantages, and limitations for different cases. Finally, we address the particularly difficult problem of deciphering the molecular mechanisms of mutations in proteins and protein complexes to understand the origins and mechanisms of diseases.
News from Online: What's New with Chime?
NASA Astrophysics Data System (ADS)
Dorland, Liz
2002-07-01
The Chime plugin (pronounced like the bells) provides a simple route to presenting interactive molecular structures to students via the Internet or in classroom presentations. Small inorganic molecules, ionic structures, organic molecules and giant macromolecules can all be viewed in several formats including ball and stick and spacefilling. Extensive Chime resources on the Internet allow chemistry and biochemistry instructors to create their own Web pages or to use some of the many tutorials for students already online. This article describes about twenty Chime-based Web sites in three categories: Chime Resources, Materials for Student and Classroom Use, and Structure Databases. A list of links is provided.
Inside a VAMDC data node—putting standards into practical software
NASA Astrophysics Data System (ADS)
Regandell, Samuel; Marquart, Thomas; Piskunov, Nikolai
2018-03-01
Access to molecular and atomic data is critical for many forms of remote sensing analysis across different fields. Many atomic and molecular databases are, however, highly specialised for their intended application, which complicates querying and combining data between sources. The Virtual Atomic and Molecular Data Centre, VAMDC, is an electronic infrastructure that allows each database to register as a ‘node’. Through services such as VAMDC’s portal website, users can then access and query all nodes in a homogenised way. Today all major atomic and molecular databases are attached to VAMDC. This article describes the software tools we developed to help data providers create and manage a VAMDC node. It gives an overview of the VAMDC infrastructure and of the various standards it uses. The article then discusses the development choices made and how the standards are implemented in practice. It concludes with a full example of implementing a VAMDC node using a real-life case as well as future plans for the node software.
NPIDB: Nucleic acid-Protein Interaction DataBase.
Kirsanov, Dmitry D; Zanegina, Olga N; Aksianov, Evgeniy A; Spirin, Sergei A; Karyagina, Anna S; Alexeevski, Andrei V
2013-01-01
The Nucleic acid-Protein Interaction DataBase (http://npidb.belozersky.msu.ru/) contains information derived from structures of DNA-protein and RNA-protein complexes extracted from the Protein Data Bank (3846 complexes in October 2012). It provides a web interface and a set of tools for extracting biologically meaningful characteristics of nucleoprotein complexes. The content of the database is updated weekly. The current version of the Nucleic acid-Protein Interaction DataBase is an upgrade of the version published in 2007. The improvements include a new web interface, new tools for calculation of intermolecular interactions, a classification of SCOP families that contain DNA-binding protein domains and data on conserved water molecules on the DNA-protein interface.
Molecular signatures database (MSigDB) 3.0.
Liberzon, Arthur; Subramanian, Aravind; Pinchback, Reid; Thorvaldsdóttir, Helga; Tamayo, Pablo; Mesirov, Jill P
2011-06-15
Well-annotated gene sets representing the universe of the biological processes are critical for meaningful and insightful interpretation of large-scale genomic data. The Molecular Signatures Database (MSigDB) is one of the most widely used repositories of such sets. We report the availability of a new version of the database, MSigDB 3.0, with over 6700 gene sets, a complete revision of the collection of canonical pathways and experimental signatures from publications, enhanced annotations and upgrades to the web site. MSigDB is freely available for non-commercial use at http://www.broadinstitute.org/msigdb.
neXtA5: accelerating annotation of articles via automated approaches in neXtProt.
Mottin, Luc; Gobeill, Julien; Pasche, Emilie; Michel, Pierre-André; Cusin, Isabelle; Gaudet, Pascale; Ruch, Patrick
2016-01-01
The rapid increase in the number of published articles poses a challenge for curated databases to remain up-to-date. To help the scientific community and database curators deal with this issue, we have developed an application, neXtA5, which prioritizes the literature for specific curation requirements. Our system, neXtA5, is a curation service composed of three main elements. The first component is a named-entity recognition module, which annotates MEDLINE over some predefined axes. This report focuses on three axes: Diseases, the Molecular Function and Biological Process sub-ontologies of the Gene Ontology (GO). The automatic annotations are then stored in a local database, BioMed, for each annotation axis. Additional entities such as species and chemical compounds are also identified. The second component is an existing search engine, which retrieves the most relevant MEDLINE records for any given query. The third component uses the content of BioMed to generate an axis-specific ranking, which takes into account the density of named-entities as stored in the Biomed database. The two ranked lists are ultimately merged using a linear combination, which has been specifically tuned to support the annotation of each axis. The fine-tuning of the coefficients is formally reported for each axis-driven search. Compared with PubMed, which is the system used by most curators, the improvement is the following: +231% for Diseases, +236% for Molecular Functions and +3153% for Biological Process when measuring the precision of the top-returned PMID (P0 or mean reciprocal rank). The current search methods significantly improve the search effectiveness of curators for three important curation axes. Further experiments are being performed to extend the curation types, in particular protein-protein interactions, which require specific relationship extraction capabilities. 
In parallel, user-friendly interfaces powered with a set of JSON web services are currently being implemented into the neXtProt annotation pipeline. Available on: http://babar.unige.ch:8082/neXtA5 Database URL: http://babar.unige.ch:8082/neXtA5/fetcher.jsp. © The Author(s) 2016. Published by Oxford University Press.
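The merging step described in the neXtA5 abstract above, where a search-engine ranking and an axis-specific ranking are combined linearly, can be sketched as follows. This is an illustration under assumptions, not the neXtA5 implementation: the min-max normalisation, the single weight alpha, the tie-breaking rule and all names are hypothetical, and the abstract only states that the coefficients were tuned per axis.

```python
def merge_rankings(search_scores, axis_scores, alpha=0.5):
    """Linearly combine two score tables keyed by PMID.

    search_scores: relevance scores from the search engine.
    axis_scores:   axis-specific scores (e.g. named-entity density).
    alpha:         weight of the search score; (1 - alpha) goes to the axis score.
    Returns PMIDs ordered from best to worst combined score.
    """
    def normalise(scores):
        # Min-max scaling so that the two score ranges are comparable.
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = hi - lo
        return {k: (v - lo) / span if span else 1.0 for k, v in scores.items()}

    s = normalise(search_scores)
    a = normalise(axis_scores)
    pmids = set(s) | set(a)
    # A PMID missing from one list contributes 0 for that component.
    combined = {p: alpha * s.get(p, 0.0) + (1 - alpha) * a.get(p, 0.0)
                for p in pmids}
    # Ties are broken by PMID so the output is deterministic.
    return sorted(pmids, key=lambda p: (-combined[p], p))
```

For example, calling merge_rankings(search, axis, alpha=0.8) favours the search-engine order, while a small alpha favours the axis-specific order; in practice a separate alpha would be tuned for each annotation axis.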
Meta-All: a system for managing metabolic pathway information.
Weise, Stephan; Grosse, Ivo; Klukas, Christian; Koschützki, Dirk; Scholz, Uwe; Schreiber, Falk; Junker, Björn H
2006-10-23
Many attempts are being made to understand biological subjects at a systems level. A major resource for these approaches are biological databases, storing manifold information about DNA, RNA and protein sequences including their functional and structural motifs, molecular markers, mRNA expression levels, metabolite concentrations, protein-protein interactions, phenotypic traits or taxonomic relationships. The use of these databases is often hampered by the fact that they are designed for special application areas and thus lack universality. Databases on metabolic pathways, which provide an increasingly important foundation for many analyses of biochemical processes at a systems level, are no exception from the rule. Data stored in central databases such as KEGG, BRENDA or SABIO-RK is often limited to read-only access. If experimentalists want to store their own data, possibly still under investigation, there are two possibilities. They can either develop their own information system for managing that own data, which is very time-consuming and costly, or they can try to store their data in existing systems, which is often restricted. Hence, an out-of-the-box information system for managing metabolic pathway data is needed. We have designed META-ALL, an information system that allows the management of metabolic pathways, including reaction kinetics, detailed locations, environmental factors and taxonomic information. Data can be stored together with quality tags and in different parallel versions. META-ALL uses Oracle DBMS and Oracle Application Express. We provide the META-ALL information system for download and use. In this paper, we describe the database structure and give information about the tools for submitting and accessing the data. As a first application of META-ALL, we show how the information contained in a detailed kinetic model can be stored and accessed. META-ALL is a system for managing information about metabolic pathways. 
It facilitates the handling of pathway-related data and is designed to help biochemists and molecular biologists in their daily research. It is available on the Web at http://bic-gh.de/meta-all and can be downloaded free of charge and installed locally.
neXtA5: accelerating annotation of articles via automated approaches in neXtProt
Mottin, Luc; Gobeill, Julien; Pasche, Emilie; Michel, Pierre-André; Cusin, Isabelle; Gaudet, Pascale; Ruch, Patrick
2016-01-01
The rapid increase in the number of published articles poses a challenge for curated databases to remain up-to-date. To help the scientific community and database curators deal with this issue, we have developed an application, neXtA5, which prioritizes the literature for specific curation requirements. Our system, neXtA5, is a curation service composed of three main elements. The first component is a named-entity recognition module, which annotates MEDLINE over some predefined axes. This report focuses on three axes: Diseases, the Molecular Function and Biological Process sub-ontologies of the Gene Ontology (GO). The automatic annotations are then stored in a local database, BioMed, for each annotation axis. Additional entities such as species and chemical compounds are also identified. The second component is an existing search engine, which retrieves the most relevant MEDLINE records for any given query. The third component uses the content of BioMed to generate an axis-specific ranking, which takes into account the density of named-entities as stored in the Biomed database. The two ranked lists are ultimately merged using a linear combination, which has been specifically tuned to support the annotation of each axis. The fine-tuning of the coefficients is formally reported for each axis-driven search. Compared with PubMed, which is the system used by most curators, the improvement is the following: +231% for Diseases, +236% for Molecular Functions and +3153% for Biological Process when measuring the precision of the top-returned PMID (P0 or mean reciprocal rank). The current search methods significantly improve the search effectiveness of curators for three important curation axes. Further experiments are being performed to extend the curation types, in particular protein–protein interactions, which require specific relationship extraction capabilities. 
In parallel, user-friendly interfaces powered with a set of JSON web services are currently being implemented into the neXtProt annotation pipeline. Available on: http://babar.unige.ch:8082/neXtA5 Database URL: http://babar.unige.ch:8082/neXtA5/fetcher.jsp PMID:27374119
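The merging step and the evaluation metric described above can be illustrated with a minimal sketch. The abstract does not give the actual coefficients, score scales, or code, so the function names, the single weight `alpha`, and the toy scores below are all assumptions for illustration only.

```python
def reciprocal_rank(ranked_ids, relevant_ids):
    """Return 1/rank of the first relevant item, or 0.0 if none is found."""
    for rank, pmid in enumerate(ranked_ids, start=1):
        if pmid in relevant_ids:
            return 1.0 / rank
    return 0.0


def merge_rankings(engine_scores, density_scores, alpha):
    """Linearly combine two score maps: alpha * engine + (1 - alpha) * density.

    `alpha` is a hypothetical tuning weight; the paper tunes its own
    coefficients per annotation axis.
    """
    pmids = set(engine_scores) | set(density_scores)
    combined = {
        pmid: alpha * engine_scores.get(pmid, 0.0)
        + (1.0 - alpha) * density_scores.get(pmid, 0.0)
        for pmid in pmids
    }
    # Sort by descending combined score; break ties by identifier.
    return sorted(combined, key=lambda p: (-combined[p], p))


# Toy, made-up scores (not data from the paper).
engine = {"pmid_a": 0.9, "pmid_b": 0.6, "pmid_c": 0.2}
density = {"pmid_a": 0.1, "pmid_b": 0.3, "pmid_c": 0.9}
merged = merge_rankings(engine, density, alpha=0.3)
```

Averaging `reciprocal_rank` over a set of queries gives the mean reciprocal rank mentioned in the abstract.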
Meta-All: a system for managing metabolic pathway information
Weise, Stephan; Grosse, Ivo; Klukas, Christian; Koschützki, Dirk; Scholz, Uwe; Schreiber, Falk; Junker, Björn H
2006-01-01
Background Many attempts are being made to understand biological subjects at a systems level. Biological databases are a major resource for these approaches, storing manifold information about DNA, RNA and protein sequences including their functional and structural motifs, molecular markers, mRNA expression levels, metabolite concentrations, protein-protein interactions, phenotypic traits or taxonomic relationships. The use of these databases is often hampered by the fact that they are designed for special application areas and thus lack universality. Databases on metabolic pathways, which provide an increasingly important foundation for many analyses of biochemical processes at a systems level, are no exception to the rule. Data stored in central databases such as KEGG, BRENDA or SABIO-RK are often limited to read-only access. If experimentalists want to store their own data, possibly still under investigation, there are two possibilities. They can either develop their own information system for managing their data, which is very time-consuming and costly, or they can try to store their data in existing systems, which is often restricted. Hence, an out-of-the-box information system for managing metabolic pathway data is needed. Results We have designed META-ALL, an information system that allows the management of metabolic pathways, including reaction kinetics, detailed locations, environmental factors and taxonomic information. Data can be stored together with quality tags and in different parallel versions. META-ALL uses Oracle DBMS and Oracle Application Express. We provide the META-ALL information system for download and use. In this paper, we describe the database structure and give information about the tools for submitting and accessing the data. As a first application of META-ALL, we show how the information contained in a detailed kinetic model can be stored and accessed. 
Conclusion META-ALL is a system for managing information about metabolic pathways. It facilitates the handling of pathway-related data and is designed to help biochemists and molecular biologists in their daily research. It is available on the Web at and can be downloaded free of charge and installed locally. PMID:17059592
Evaluating Land-Atmosphere Interactions with the North American Soil Moisture Database
NASA Astrophysics Data System (ADS)
Giles, S. M.; Quiring, S. M.; Ford, T.; Chavez, N.; Galvan, J.
2015-12-01
The North American Soil Moisture Database (NASMD) is a high-quality observational soil moisture database that was developed to study land-atmosphere interactions. It includes over 1,800 monitoring stations in the United States, Canada and Mexico. Soil moisture data are collected from multiple sources, quality controlled and integrated into an online database (soilmoisture.tamu.edu). The period of record varies substantially and only a few of these stations have an observation record extending back into the 1990s. Daily soil moisture observations have been quality controlled using the North American Soil Moisture Database QAQC algorithm. The database is designed to facilitate observationally-driven investigations of land-atmosphere interactions, validation of the accuracy of soil moisture simulations in global land surface models, satellite calibration/validation for SMOS and SMAP, and an improved understanding of how soil moisture influences climate on seasonal to interannual timescales. This paper provides some examples of how the NASMD has been utilized to enhance understanding of land-atmosphere interactions in the U.S. Great Plains.
Molecular association of pathogenetic contributors to pre-eclampsia (pre-eclampsia associome)
2015-01-01
Background Pre-eclampsia is the most common complication occurring during pregnancy. In the majority of cases, it is concurrent with other pathologies in a comorbid manner (frequent co-occurrences in patients), such as diabetes mellitus, gestational diabetes and obesity. Providing bronchial asthma, pulmonary tuberculosis, certain neurodegenerative diseases and cancers as examples, we have shown previously that pairs of inversely comorbid pathologies (rare co-occurrences in patients) are more closely related to each other at the molecular genetic level compared with randomly generated pairs of diseases. Data in the literature concerning the causes of pre-eclampsia are abundant. However, the key mechanisms triggering this disease that are initiated by other pathological processes are thus far unknown. The aim of this work was to analyse the characteristic features of genetic networks that describe interactions between comorbid diseases, using pre-eclampsia as a case in point. Results The use of ANDSystem, Pathway Studio and STRING computer tools based on text-mining and database-mining approaches allowed us to reconstruct associative networks, representing molecular genetic interactions between genes, associated concurrently with comorbid disease pairs, including pre-eclampsia, diabetes mellitus, gestational diabetes and obesity. It was found that these associative networks statistically differed in the number of genes and interactions between them from those built for randomly chosen pairs of diseases. The associative network connecting all four diseases was composed of 16 genes (PLAT, ADIPOQ, ADRB3, LEPR, HP, TGFB1, TNFA, INS, CRP, CSRP1, IGFBP1, MBL2, ACE, ESR1, SHBG, ADA). Such an analysis allowed us to reveal differential gene risk factors for these diseases, and to propose certain, most probable, theoretical mechanisms of pre-eclampsia development in pregnant women. 
The mechanisms may include the following pathways: [TGFB1 or TNFA]-[IL1B]-[pre-eclampsia]; [TNFA or INS]-[NOS3]-[pre-eclampsia]; [INS]-[HSPA4 or CLU]-[pre-eclampsia]; [ACE]-[MTHFR]-[pre-eclampsia]. Conclusions For pre-eclampsia, diabetes mellitus, gestational diabetes and obesity, we showed that the size and connectivity of the associative molecular genetic networks, which describe interactions between comorbid diseases, statistically exceeded the size and connectivity of those built for randomly chosen pairs of diseases. Recently, we have shown a similar result for inversely comorbid diseases. This suggests that comorbid and inversely comorbid diseases have common features concerning structural organization of associative molecular genetic networks. PMID:25879409
Ynalvez, Marcus Antonius; Ynalvez, Ruby A; Ramírez, Enrique
2017-03-04
We explored the social shaping of science at the micro-level reality of face-to-face interaction in one of the traditional places for scientific activities-the scientific lab. We specifically examined how doctoral students' perception of their: (i) interaction with doctoral mentors (MMI) and (ii) lab social environment (LSE) influenced productivity. Construed as the production of peer-reviewed articles, we measured productivity using total number of articles (TOTAL), number of articles with impact factor greater than or equal to 4.00 (IFGE4), and number of first-authored articles (NFA). Via face-to-face interviews, we obtained data from n = 210 molecular biology Ph.D. students in selected universities in Japan, Singapore, and Taiwan. Additional productivity data (NFA) were obtained from online bibliometric databases. To summarize the original 13 MMI and 13 LSE semantic-differential items which we used to measure students' perceptions, principal component (PC) analyses were performed. The results were smaller sets of 4 MMI PCs and 4 LSE PCs. To identify which PCs influenced publication counts, we performed Poisson regression analyses. Although perceived MMI was not linked to productivity, perceived LSE was linked: Students who perceived their LSE as intellectually stimulating reported high levels of productivity in both TOTAL and IFGE4, but not in NFA. Our findings not only highlight how students' perception of their training environment factors in the production of scientific output, our findings also carry important implications for improving mentoring programs in science. © 2016 by The International Union of Biochemistry and Molecular Biology, 45(2):130-144, 2017. © 2016 The International Union of Biochemistry and Molecular Biology.
Huang, Jingwei; Liu, Tingqi; Li, Ke; Song, Xiaokai; Yan, Ruofeng; Xu, Lixin; Li, Xiangrui
2018-04-04
Eimeria maxima initiates infection by invading the jejunal epithelial cells of chicken. However, the proteins involved in invasion remain unknown. Research into the molecules that participate in the interactions between E. maxima sporozoites and host target cells will fill a gap in our understanding of the invasion system of this parasitic pathogen. In the present study, chicken jejunal epithelial cells were isolated and cultured in vitro. Western blot was employed to analyze the soluble proteins of E. maxima sporozoites that bound to chicken jejunal epithelial cells. Co-immunoprecipitation (co-IP) assay was used to separate the E. maxima proteins that bound to chicken jejunal epithelial cells. Shotgun LC-MS/MS technique was used for proteomics identification and Gene Ontology was employed for the bioinformatics analysis. The results of Western blot analysis showed that four protein bands from jejunal epithelial cells co-cultured with soluble proteins of E. maxima sporozoites were recognized by the positive sera, with molecular weights of 70, 90, 95 and 130 kDa. The co-IP dilutions were analyzed by shotgun LC-MS/MS. A total of 204 proteins were identified in the E. maxima protein database using the MASCOT search engine. Thirty-five proteins including microneme protein 3 and 7 had more than two unique peptide counts and were annotated using Gene Ontology for molecular function, biological process and cellular localization. The results revealed that of the 35 annotated peptides, 22 (62.86%) were associated with binding activity and 15 (42.86%) were involved in catalytic activity. Our findings provide an insight into the interaction between E. maxima and the corresponding host cells, which is important for understanding the molecular mechanisms underlying E. maxima invasion.
Trevarton, Alexander J.; Mann, Michael B.; Knapp, Christoph; Araki, Hiromitsu; Wren, Jonathan D.; Stones-Havas, Steven; Black, Michael A.; Print, Cristin G.
2013-01-01
Despite ongoing research, metastatic melanoma survival rates remain low and treatment options are limited. Researchers can now access a rapidly growing amount of molecular and clinical information about melanoma. This information is becoming difficult to assemble and interpret due to its dispersed nature, yet as it grows it becomes increasingly valuable for understanding melanoma. Integration of this information into a comprehensive resource to aid rational experimental design and patient stratification is needed. As an initial step in this direction, we have assembled a web-accessible melanoma database, MelanomaDB, which incorporates clinical and molecular data from publicly available sources and will be regularly updated as new information becomes available. This database allows complex links to be drawn between many different aspects of melanoma biology: genetic changes (e.g., mutations) in individual melanomas revealed by DNA sequencing, associations between gene expression and patient survival, data concerning drug targets, biomarkers, druggability, and clinical trials, as well as our own statistical analysis of relationships between molecular pathways and clinical parameters that have been produced using these data sets. The database is freely available at http://genesetdb.auckland.ac.nz/melanomadb/about.html. A subset of the information in the database can also be accessed through a freely available web application in the Illumina genomic cloud computing platform BaseSpace at http://www.biomatters.com/apps/melanoma-profiler-for-research. The MelanomaDB database illustrates dysregulation of specific signaling pathways across 310 exome-sequenced melanomas and in individual tumors and identifies the distribution of somatic variants in melanoma. We suggest that MelanomaDB can provide a context in which to interpret the tumor molecular profiles of individual melanoma patients relative to biological information and available drug therapies. PMID:23875173
Computerized techniques pave the way for drug-drug interaction prediction and interpretation
Safdari, Reza; Ferdousi, Reza; Aziziheris, Kamal; Niakan-Kalhori, Sharareh R.; Omidi, Yadollah
2016-01-01
Introduction: The health care industry and patients alike are penalized by medical errors, which are inevitable but highly preventable. The vast majority of medical errors are related to adverse drug reactions (ADRs), while drug-drug interactions (DDIs) are the main cause of ADRs. DDIs and ADRs have mainly been reported by haphazard case studies. Experimental in vivo and in vitro research also reveals DDI pairs. Laboratory and experimental research is valuable but also expensive, and in some cases researchers may suffer from limitations. Methods: In the current investigation, the latest published works were studied to analyze the trend and pattern of DDI modelling and the impact of machine learning methods. Applications of computerized techniques were also investigated for the prediction and interpretation of DDIs. Results: Computerized data-mining in pharmaceutical sciences and related databases provides new key transformative paradigms that can revolutionize the treatment of diseases and hence medical care. Given that various aspects of drug discovery and pharmacotherapy are closely related to clinical and molecular/biological information, scientifically sound databases (e.g., DDIs, ADRs) can be of importance for the success of pharmacotherapy modalities. Conclusion: A better understanding of DDIs not only provides a robust means for designing more effective medicines but also guarantees patient safety. PMID:27525223
sc-PDB-Frag: a database of protein-ligand interaction patterns for Bioisosteric replacements.
Desaphy, Jérémy; Rognan, Didier
2014-07-28
Bioisosteric replacement plays an important role in medicinal chemistry by keeping the biological activity of a molecule while changing either its core scaffold or substituents, thereby facilitating lead optimization and patenting. Bioisosteres are classically chosen in order to keep the main pharmacophoric moieties of the substructure to replace. However, notably when changing a scaffold, no attention is usually paid as to whether all atoms of the reference scaffold are equally important for binding to the desired target. We herewith propose a novel database for bioisosteric replacement (scPDBFrag), capitalizing on our recently published structure-based approach to scaffold hopping, focusing on interaction pattern graphs. Protein-bound ligands are first fragmented and the interaction of the corresponding fragments with their protein environment computed on the fly. Using an in-house developed graph alignment tool, interaction pattern graphs can be compared, aligned, and sorted by decreasing similarity to any reference. In the herein presented sc-PDB-Frag database ( http://bioinfo-pharma.u-strasbg.fr/scPDBFrag ), fragments, interaction patterns, alignments, and pairwise similarity scores have been extracted from the sc-PDB database of 8077 druggable protein-ligand complexes and further stored in a relational database. We herewith present the database, its Web implementation, and procedures for identifying true bioisosteric replacements based on conserved interaction patterns.
Alonso-López, Diego; Gutiérrez, Miguel A.; Lopes, Katia P.; Prieto, Carlos; Santamaría, Rodrigo; De Las Rivas, Javier
2016-01-01
APID (Agile Protein Interactomes DataServer) is an interactive web server that provides unified generation and delivery of protein interactomes mapped to their respective proteomes. This resource is a new, fully redesigned server that includes a comprehensive collection of protein interactomes for more than 400 organisms (25 of which include more than 500 interactions) produced by the integration of only experimentally validated protein–protein physical interactions. For each protein–protein interaction (PPI) the server includes currently reported information about its experimental validation to allow selection and filtering at different quality levels. As a whole, it provides easy access to the interactomes from specific species and includes a global uniform compendium of 90,379 distinct proteins and 678,441 singular interactions. APID integrates and unifies PPIs from major primary databases of molecular interactions, from other specific repositories and also from experimentally resolved 3D structures of protein complexes where more than two proteins were identified. For this purpose, a collection of 8,388 structures were analyzed to identify specific PPIs. APID also includes a new graph tool (based on Cytoscape.js) for visualization and interactive analyses of PPI networks. The server does not require registration and it is freely available for use at http://apid.dep.usal.es. PMID:27131791
Kedrov, Alexej; Janovjak, Harald; Sapra, K Tanuj; Müller, Daniel J
2007-01-01
Molecular interactions are the basic language of biological processes. They establish the forces interacting between the building blocks of proteins and other macromolecules, thus determining their functional roles. Because molecular interactions trigger virtually every biological process, approaches to decipher their language are needed. Single-molecule force spectroscopy (SMFS) has been used to detect and characterize different types of molecular interactions that occur between and within native membrane proteins. The first experiments detected and localized molecular interactions that stabilized membrane proteins, including how these interactions were established during folding of alpha-helical secondary structure elements into the native protein and how they changed with oligomerization, temperature, and mutations. SMFS also enables investigators to detect and locate molecular interactions established during ligand and inhibitor binding. These exciting applications provide opportunities for studying the molecular forces of life. Further developments will elucidate the origins of molecular interactions encoded in their lifetimes, interaction ranges, interplay, and dynamics characteristic of biological systems.
Li, Jiazhong; Gramatica, Paola
2010-11-01
Quantitative structure-activity relationship (QSAR) methodology aims to explore the relationship between molecular structures and experimental endpoints, producing a model for the prediction of new data; the predictive performance of the model must be checked by external validation. Clearly, the quality of chemical structure information and experimental endpoints, as well as the statistical parameters used to verify the external predictivity, have a strong influence on QSAR model reliability. Here, we emphasize the importance of these three aspects by analyzing our models on estrogen receptor binders (Endocrine disruptor knowledge base (EDKB) database). Endocrine disrupting chemicals, which mimic or antagonize endogenous hormones such as estrogens, are a hot topic in environmental and toxicological sciences. QSAR is of great value in predicting estrogenic activity and exploring the interactions between the estrogen receptor and ligands. We have verified our previously published model by additional external validation on new EDKB chemicals. Having found some errors in the 3D molecular conformations used, we redeveloped the model using the same data set with corrected structures, the same method (ordinary least-squares regression, OLS) and DRAGON descriptors. The new model, based on some different descriptors, is more predictive on external prediction sets. Three different formulas to calculate the correlation coefficient for the external prediction set (Q2 EXT) were compared, and the results indicated that the new proposal of Consonni et al. gave more reasonable results, consistent with the conclusions from the regression line, Williams plot and root mean square error (RMSE) values. 
Finally, the importance of reliable endpoints values has been highlighted by comparing the classification assignments of EDKB with those of another estrogen receptor binders database (METI): we found that 16.1% assignments of the common compounds were opposite (20 among 124 common compounds). In order to verify the real assignments for these inconsistent compounds, we predicted these samples, as a blind external set, by our regression models and compared the results with the two databases. The results indicated that most of the predictions were consistent with METI. Furthermore, we built a kNN classification model using the 104 consistent compounds to predict those inconsistent ones, and most of the predictions were also in agreement with METI database.
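The abstract does not state which three Q2 EXT formulas were compared. As a hedged illustration, the sketch below implements the three forms commonly labelled Q2 F1, Q2 F2 and Q2 F3 in the QSAR literature, with F3 being the form usually attributed to Consonni et al.; treating these as the three formulas in question is an assumption, and the toy numbers are invented.

```python
def _mean(values):
    return sum(values) / len(values)


def _press(y_ext, y_pred):
    """Predictive residual sum of squares over the external set."""
    return sum((y - p) ** 2 for y, p in zip(y_ext, y_pred))


def q2_f1(y_train, y_ext, y_pred):
    """External deviations are referenced to the training-set mean."""
    mean_tr = _mean(y_train)
    return 1.0 - _press(y_ext, y_pred) / sum((y - mean_tr) ** 2 for y in y_ext)


def q2_f2(y_ext, y_pred):
    """External deviations are referenced to the external-set mean."""
    mean_ext = _mean(y_ext)
    return 1.0 - _press(y_ext, y_pred) / sum((y - mean_ext) ** 2 for y in y_ext)


def q2_f3(y_train, y_ext, y_pred):
    """Per-object external error over per-object training-set variance."""
    mean_tr = _mean(y_train)
    numerator = _press(y_ext, y_pred) / len(y_ext)
    denominator = sum((y - mean_tr) ** 2 for y in y_train) / len(y_train)
    return 1.0 - numerator / denominator


# Invented toy values, not data from the study.
y_train = [1.0, 2.0, 3.0, 4.0]
y_ext = [2.0, 4.0]
y_pred = [2.5, 3.5]
```

Because F3 normalises by the training-set variance, its value does not depend on how the external compounds happen to be distributed, which is one reason the three statistics can disagree on the same predictions.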
Virtual Exploration of the Ring Systems Chemical Universe.
Visini, Ricardo; Arús-Pous, Josep; Awale, Mahendra; Reymond, Jean-Louis
2017-11-27
Here, we explore the chemical space of all virtually possible organic molecules focusing on ring systems, which represent the cyclic cores of organic molecules obtained by removing all acyclic bonds and converting all remaining atoms to carbon. This approach circumvents the combinatorial explosion encountered when enumerating the molecules themselves. We report the chemical universe database GDB4c containing 916 130 ring systems up to four saturated or aromatic rings and maximum ring size of 14 atoms and GDB4c3D containing the corresponding 6 555 929 stereoisomers. Almost all (98.6%) of these ring systems are unknown and represent chiral 3D-shaped macrocycles containing small rings and quaternary centers reminiscent of polycyclic natural products. We envision that GDB4c can serve to select new ring systems from which to design analogs of such natural products. The database is available for download at www.gdb.unibe.ch together with interactive visualization and search tools as a resource for molecular design.
Ghorab, Hamida; Lammi, Carmen; Arnoldi, Anna; Kabouche, Zahia; Aiello, Gilda
2018-01-15
An investigation on the proteome of the sweet kernel of apricot, based on equalisation with combinatorial peptide ligand libraries (CPLLs), SDS-PAGE, nLC-ESI-MS/MS, and database search, permitted identifying 175 proteins. Gene ontology analysis indicated that their main molecular functions are in nucleotide binding (20.9%), hydrolase activities (10.6%), kinase activities (7%), and catalytic activity (5.6%). A protein-protein association network analysis using STRING software permitted to build an interactomic map of all detected proteins, characterised by 34 interactions. In order to forecast the potential health benefits deriving from the consumption of these proteins, the two most abundant, i.e. Prunin 1 and 2, were enzymatically digested in silico predicting 10 and 14 peptides, respectively. Searching their sequences in the database BIOPEP, it was possible to suggest a variety of bioactivities, including dipeptidyl peptidase-IV (DPP-IV) and angiotensin converting enzyme I (ACE) inhibition, glucose uptake stimulation and antioxidant properties. Copyright © 2017 Elsevier Ltd. All rights reserved.
Lwin, Wint Wah; Park, Ken; Wauson, Matthew; Gao, Qin; Finn, Patricia W; Perkins, David; Khanna, Ajai
2012-07-01
Systems biology is gaining importance in studying complex systems such as the functional interconnections of human genes [1]. To investigate the molecular interactions involved in T cell immune responses, we used databases of physical gene-gene interactions to construct molecular interaction networks (interconnections) with R language algorithms. This helped to identify the highly interconnected "hub" genes ATP5C1, IL6ST, PRKCZ, MYC, FOS, JUN, and MAPK1. We hypothesized that suppression of these hub genes in the gene network would result in significant phenotypic effects on T cells and examined this in vitro. The molecular interaction networks were then analyzed and visualized with Cytoscape. Jurkat and HeLa cells were transfected with siRNA for the selected hub genes. Cell proliferation was measured using ATP luminescence and BrdU labeling, which were measured 36, 72, and 96 h after activation. Following T cell stimulation, we found a significant decrease in ATP production (P < 0.05) when the hub genes ATP5C1 and PRKCZ were knocked down using siRNA transfection, whereas no difference in ATP production was observed in siRNA transfected HeLa cells. However, HeLa cells showed a significant (P < 0.05) decrease in cell proliferation when the genes MAPK1, IL6ST, ATP5C1, JUN, and FOS were knocked down. In both Jurkat and HeLa cells, targeted gene knockdown using siRNA decreased cell proliferation and ATP production. However, Jurkat T cells and HeLa cells use different hub genes to regulate activation responses. This experiment provides proof of principle of applying siRNA knockdown of T cell hub genes to evaluate their proliferative capacity and ATP production. This novel concept outlines a systems biology approach to identify hub genes for targeted therapeutics. Published by Elsevier Inc.
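One simple way to pick out highly interconnected "hub" genes from an interaction list is to rank genes by degree (number of distinct interaction partners). The sketch below assumes that criterion; the abstract does not say which measure the authors' R code used, and the edge list here is invented for illustration (it is not interaction data from the study, and `GENE_X` is a placeholder).

```python
from collections import defaultdict


def degree_by_gene(edges):
    """Count distinct interaction partners per gene in an undirected edge list."""
    neighbours = defaultdict(set)
    for gene_a, gene_b in edges:
        if gene_a == gene_b:
            continue  # ignore self-interactions
        neighbours[gene_a].add(gene_b)
        neighbours[gene_b].add(gene_a)
    return {gene: len(partners) for gene, partners in neighbours.items()}


def top_hubs(edges, k):
    """Return the k genes with the highest degree (ties broken alphabetically)."""
    degree = degree_by_gene(edges)
    return sorted(degree, key=lambda gene: (-degree[gene], gene))[:k]


# Invented toy network, for illustration only.
edges = [
    ("MYC", "FOS"), ("MYC", "JUN"), ("MYC", "MAPK1"),
    ("FOS", "JUN"), ("JUN", "MAPK1"), ("GENE_X", "MYC"),
]
hubs = top_hubs(edges, k=2)
```

Other centrality measures (for example betweenness) could be substituted for degree without changing the overall workflow.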
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yusim, Karina; Korber, Bette Tina Marie; Barouch, Dan
HIV Molecular Immunology is a companion volume to HIV Sequence Compendium. This publication, the 2014 edition, is the PDF version of the web-based HIV Immunology Database (http://www.hiv.lanl.gov/content/immunology/). The web interface for this relational database has many search options, as well as interactive tools to help immunologists design reagents and interpret their results. In the HIV Immunology Database, HIV-specific B-cell and T-cell responses are summarized and annotated. Immunological responses are divided into three parts, CTL, T helper, and antibody. Within these parts, defined epitopes are organized by protein and binding sites within each protein, moving from left to right through the coding regions spanning the HIV genome. We include human responses to natural HIV infections, as well as vaccine studies in a range of animal models and human trials. Responses that are not specifically defined, such as responses to whole proteins or monoclonal antibody responses to discontinuous epitopes, are summarized at the end of each protein section. Studies describing general HIV responses to the virus, but not to any specific protein, are included at the end of each part. The annotation includes information such as crossreactivity, escape mutations, antibody sequence, TCR usage, functional domains that overlap with an epitope, immune response associations with rates of progression and therapy, and how specific epitopes were experimentally defined. Basic information such as HLA specificities for T-cell epitopes, isotypes of monoclonal antibodies, and epitope sequences are included whenever possible. All studies that we can find that incorporate the use of a specific monoclonal antibody are included in the entry for that antibody. A single T-cell epitope can have multiple entries, generally one entry per study. Finally, maps of all defined linear epitopes relative to the HXB2 reference proteins are provided.
Publications - DDS 8 | Alaska Division of Geological & Geophysical Surveys
DGGS DDS 8 Publication Details Title: Alaska Volcano Observatory geochemical database Authors: Cameron ., Snedigar, S.F., and Nye, C.J., 2014, Alaska Volcano Observatory geochemical database: Alaska Division of ://doi.org/10.14509/29120 Publication Products Interactive Interactive Database Alaska Volcano Observatory
GIMDA: Graphlet interaction-based MiRNA-disease association prediction.
Chen, Xing; Guan, Na-Na; Li, Jian-Qiang; Yan, Gui-Ying
2018-03-01
MicroRNAs (miRNAs) have been confirmed to be closely related to various human complex diseases by many experimental studies. It is necessary and valuable to develop powerful and effective computational models to predict potential associations between miRNAs and diseases. In this work, we presented a prediction model of Graphlet Interaction for MiRNA-Disease Association prediction (GIMDA) by integrating the disease semantic similarity, miRNA functional similarity, Gaussian interaction profile kernel similarity and the experimentally confirmed miRNA-disease associations. The related score of a miRNA to a disease was calculated by measuring the graphlet interactions between two miRNAs or two diseases. The novelty of GIMDA lies in the use of graphlet interaction to analyse the complex relationships between two nodes in a graph. The AUCs of GIMDA in global and local leave-one-out cross-validation (LOOCV) turned out to be 0.9006 and 0.8455, respectively. The average result of five-fold cross-validation reached 0.8927 ± 0.0012. In the case study for colon neoplasms, kidney neoplasms and prostate neoplasms based on the database of HMDD V2.0, 45, 45 and 41 of the top 50 potential miRNAs predicted by GIMDA were validated by dbDEMC and miR2Disease. Additionally, in the case study of new diseases without any known associated miRNAs and the case study of predicting potential miRNA-disease associations using HMDD V1.0, high percentages of the top 50 miRNAs were also verified by the experimental literature. © 2017 The Authors. Journal of Cellular and Molecular Medicine published by John Wiley & Sons Ltd and Foundation for Cellular and Molecular Medicine.
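The abstract names Gaussian interaction profile kernel similarity without giving its formula. A minimal sketch of the form commonly used in the miRNA-disease prediction literature follows: each entity is described by a binary row of known associations, and similarity is exp(-gamma * squared distance), with gamma normalised by the mean squared norm of the profiles. That this matches GIMDA's exact definition is an assumption, and `gamma_prime` and the toy profiles are illustrative.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel over rows of a binary association matrix.

    profiles[i] is the association profile of entity i (e.g. one miRNA's
    0/1 associations with each disease). gamma_prime is a bandwidth
    parameter; 1.0 is an assumed default.
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


# Invented toy association profiles (3 entities x 3 partners).
profiles = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
K = gip_kernel(profiles)
```

Entities with identical known associations get similarity 1.0, and similarity decays as their profiles diverge.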
Besalú, Emili
2016-01-01
The Superposing Significant Interaction Rules (SSIR) method is described. It is a general combinatorial and symbolic procedure able to rank compounds belonging to combinatorial analogue series. The procedure generates structure-activity relationship (SAR) models and also serves as an inverse SAR tool. The method is fast and can deal with large databases. SSIR operates from statistical significances calculated from the available library of compounds and according to the previously attached molecular labels of interest or non-interest. The required symbolic codification allows dealing with almost any combinatorial data set, even in a confidential manner, if desired. The application example categorizes molecules as binding or non-binding, and consensus ranking SAR models are generated from training and two distinct cross-validation methods: leave-one-out and balanced leave-two-out (BL2O), the latter being suited for the treatment of binary properties. PMID:27240346
A Computational Approach to Finding Novel Targets for Existing Drugs
Li, Yvonne Y.; An, Jianghong; Jones, Steven J. M.
2011-01-01
Repositioning existing drugs for new therapeutic uses is an efficient approach to drug discovery. We have developed a computational drug repositioning pipeline to perform large-scale molecular docking of small molecule drugs against protein drug targets, in order to map the drug-target interaction space and find novel interactions. Our method emphasizes removing false positive interaction predictions using criteria from known interaction docking, consensus scoring, and specificity. In all, our database contains 252 human protein drug targets that we classify as reliable-for-docking as well as 4621 approved and experimental small molecule drugs from DrugBank. These were cross-docked, then filtered through stringent scoring criteria to select top drug-target interactions. In particular, we used MAPK14 and the kinase inhibitor BIM-8 as examples where our stringent thresholds enriched the predicted drug-target interactions with known interactions up to 20 times compared to standard score thresholds. We validated nilotinib as a potent MAPK14 inhibitor in vitro (IC50 40 nM), suggesting a potential use for this drug in treating inflammatory diseases. The published literature indicated experimental evidence for 31 of the top predicted interactions, highlighting the promising nature of our approach. Novel interactions discovered may lead to the drug being repositioned as a therapeutic treatment for its off-target's associated disease, added insight into the drug's mechanism of action, and added insight into the drug's side effects. PMID:21909252
PathNER: a tool for systematic identification of biological pathway mentions in the literature
2013-01-01
Background Biological pathways are central to many biomedical studies and are frequently discussed in the literature. Several curated databases have been established to collate the knowledge of molecular processes constituting pathways. Yet, there has been little focus on enabling systematic detection of pathway mentions in the literature. Results We developed a tool, named PathNER (Pathway Named Entity Recognition), for the systematic identification of pathway mentions in the literature. PathNER is based on soft dictionary matching and rules, with the dictionary generated from public pathway databases. The rules utilise general pathway-specific keywords, syntactic information and gene/protein mentions. Detection results from both components are merged. On a gold-standard corpus, PathNER achieved an F1-score of 84%. To illustrate its potential, we applied PathNER on a collection of articles related to Alzheimer's disease to identify associated pathways, highlighting cases that can complement an existing manually curated knowledgebase. Conclusions In contrast to existing text-mining efforts that target the automatic reconstruction of pathway details from molecular interactions mentioned in the literature, PathNER focuses on identifying specific named pathway mentions. These mentions can be used to support large-scale curation and pathway-related systems biology applications, as demonstrated in the example of Alzheimer's disease. PathNER is implemented in Java and made freely available online at http://sourceforge.net/projects/pathner/. PMID:24555844
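For readers unfamiliar with the F1-score quoted above, a minimal sketch of how it is derived from counts of true positives, false positives and false negatives follows. The counts are hypothetical and chosen only so that the arithmetic is easy to follow; they are not PathNER's evaluation data.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from mention-level counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 84 correct mentions, 16 spurious, 16 missed.
p, r, f1 = precision_recall_f1(tp=84, fp=16, fn=16)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```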
Chen, Yuting; Cassone, Bryan J; Bai, Xiaodong; Redinbaugh, Margaret G; Michel, Andrew P
2012-01-01
Leafhoppers (Hemiptera: Cicadellidae) are plant-phloem feeders that are known for their ability to vector plant pathogens. The black-faced leafhopper (Graminella nigrifrons) has been identified as the only known vector for the Maize fine streak virus (MFSV), an emerging plant pathogen in the Rhabdoviridae. Within G. nigrifrons populations, individuals can be experimentally separated into three classes based on their capacity for viral transmission: transmitters, acquirers and non-acquirers. Understanding the molecular interactions between vector and virus can reveal important insights into virus immune defense and vector transmission. RNA sequencing (RNA-Seq) was performed to characterize the transcriptome of G. nigrifrons. A total of 38,240 ESTs of a minimum 100 bp were generated from two separate cDNA libraries consisting of virus transmitters and acquirers. More than 60% of known D. melanogaster, A. gambiae and T. castaneum immune response genes mapped to our G. nigrifrons EST database. Real time quantitative PCR (RT-qPCR) showed significant down-regulation of three genes for peptidoglycan recognition proteins (PGRP - SB1, SD, and LC) in G. nigrifrons transmitters versus control leafhoppers. Our study is the first to characterize the transcriptome of a leafhopper vector species. Significant sequence similarity in immune defense genes existed between G. nigrifrons and other well characterized insects. The down-regulation of PGRPs in MFSV transmitters suggested a possible role in rhabdovirus transmission. The results provide a framework for future studies aimed at elucidating the molecular mechanisms of plant virus vector competence.
HoPaCI-DB: host-Pseudomonas and Coxiella interaction database
Bleves, Sophie; Dunger, Irmtraud; Walter, Mathias C.; Frangoulidis, Dimitrios; Kastenmüller, Gabi; Voulhoux, Romé; Ruepp, Andreas
2014-01-01
Bacterial infectious diseases are the result of multifactorial processes affected by the interplay between virulence factors and host targets. The host-Pseudomonas and Coxiella interaction database (HoPaCI-DB) is a publicly available manually curated integrative database (http://mips.helmholtz-muenchen.de/HoPaCI/) of host–pathogen interaction data from Pseudomonas aeruginosa and Coxiella burnetii. The resource provides structured information on 3585 experimentally validated interactions between molecules, bioprocesses and cellular structures extracted from the scientific literature. Systematic annotation and interactive graphical representation of disease networks make HoPaCI-DB a versatile knowledge base for biologists and network biology approaches. PMID:24137008
Information resources at the National Center for Biotechnology Information.
Woodsmall, R M; Benson, D A
1993-01-01
The National Center for Biotechnology Information (NCBI), part of the National Library of Medicine, was established in 1988 to perform basic research in the field of computational molecular biology as well as to build and distribute molecular biology databases. The basic research has led to new algorithms and analysis tools for interpreting genomic data and has been instrumental in the discovery of human disease genes for neurofibromatosis and Kallmann syndrome. The principal database responsibility is the National Institutes of Health (NIH) genetic sequence database, GenBank. NCBI, in collaboration with international partners, builds, distributes, and provides online and CD-ROM access to over 112,000 DNA sequences. Another major program is the integration of multiple sequence databases and related bibliographic information and the development of network-based retrieval systems for Internet access. PMID:8374583
FreeSolv: A database of experimental and calculated hydration free energies, with input files
Mobley, David L.; Guthrie, J. Peter
2014-01-01
This work provides a curated database of experimental and calculated hydration free energies for small neutral molecules in water, along with molecular structures, input files, references, and annotations. We call this the Free Solvation Database, or FreeSolv. Experimental values were taken from prior literature and will continue to be curated, with updated experimental references and data added as they become available. Calculated values are based on alchemical free energy calculations using molecular dynamics simulations. These used the GAFF small molecule force field in TIP3P water with AM1-BCC charges. Values were calculated with the GROMACS simulation package, with full details given in references cited within the database itself. This database builds in part on a previous, 504-molecule database containing similar information. However, additional curation of both experimental data and calculated values has been done here, and the total number of molecules is now up to 643. Additional information is now included in the database, such as SMILES strings, PubChem compound IDs, accurate reference DOIs, and others. One version of the database is provided in the Supporting Information of this article, but as ongoing updates are envisioned, the database is now versioned and hosted online. In addition to providing the database, this work describes its construction process. The database is available free-of-charge via http://www.escholarship.org/uc/item/6sd403pz. PMID:24928188
Crosara, Karla Tonelli Bicalho; Moffa, Eduardo Buozi; Xiao, Yizhi; Siqueira, Walter Luiz
2018-01-16
Protein-protein interaction is a common physiological mechanism for protection and actions of proteins in an organism. The identification and characterization of protein-protein interactions in different organisms is necessary to better understand their physiology and to determine their efficacy. In a previous in vitro study using mass spectrometry, we identified 43 proteins that interact with histatin 1. Six previously documented interactors were confirmed and 37 novel partners were identified. In this tutorial, we aimed to demonstrate the usefulness of the STRING database for studying protein-protein interactions. We used an in-silico approach along with the STRING database (http://string-db.org/) and successfully performed a fast simulation of a novel constructed histatin 1 protein-protein network, including both the previously known and the predicted interactors, along with our newly identified interactors. Our study highlights the advantages and importance of applying bioinformatics tools to merge in-silico tactics with experimental in vitro findings for rapid advancement of our knowledge about protein-protein interactions. Our findings also indicate that bioinformatics tools such as the STRING protein network database can help predict potential interactions between proteins and thus serve as a guide for future steps in our exploration of the Human Interactome. Our study highlights the usefulness of the STRING protein database for studying protein-protein interactions. The STRING database can collect and integrate data about known and predicted protein-protein associations from many organisms, including both direct (physical) and indirect (functional) interactions, in an easy-to-use interface. Copyright © 2017 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Azizah, A.; Suselo, Y. H.; Muthmainah, M.; Indarto, D.
2018-05-01
Gestational hypertension is one of the three main causes of maternal mortality in Indonesia. Nifedipine, which blocks the Cav1.2 calcium channel, has frequently been used to treat gestational hypertension. However, the efficacy of nifedipine has not yet been established, and the prevalence of gestational hypertension is still high (27.1%). Indonesian herbal plants have the potential to be developed as natural drugs. Molecular docking, a computational method, is very often used to depict the interaction between molecules and a target receptor. This study therefore aimed to identify Indonesian herbal plants that could inhibit the calcium channel in silico. This was a bioinformatics study with a molecular docking approach. The three-dimensional structure of the human calcium channel Cav1.2 was determined by modelling with the rabbit calcium channel (ID: 5GJW) as a template, using the SWISS-MODEL software. Nifedipine was used as the standard ligand and was obtained from the ZINC database with the access code ZINC19594578. Active compounds of Indonesian herbal plants were registered in the HerbalDB database, and their molecular structures were obtained from PubChem. The binding affinity of human Cav1.2 model-ligand complexes was assessed using AutoDock Vina 1.1.2, and molecular conformations were visualized with Chimera 1.10 and PyMOL 1.3. Lipinski's rule of five was used to determine which active compounds fulfilled drug criteria. The human Cav1.2 model had 72.35% sequence identity with rabbit Cav1.1. Nifedipine bound to the human Cav1.2 model with a binding affinity of -2.1 kcal/mol and had binding sites at the Gln1060, Phe1129, Ser1132, and Ile1173 residues. A lower binding affinity was observed in 8 phytochemicals, but only obtusifolin 2-glucoside (-2.2 kcal/mol) had binding sites similar to those of nifedipine. In addition, obtusifolin 2-glucoside met the Lipinski criteria, and its molecular conformation was similar to that of nifedipine.
According to the HerbalDB database, obtusifolin 2-glucoside is found in Tectona grandis. Computationally, obtusifolin 2-glucoside is a potential candidate calcium channel blocker. In vitro assays should be performed to evaluate the antagonist effect of obtusifolin 2-glucoside on the calcium channel Cav1.2.
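The drug-likeness filter mentioned above can be sketched in a few lines. This is a minimal illustration of Lipinski's rule of five using its commonly cited thresholds (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen bond donors, at most 10 hydrogen bond acceptors). The tolerance of one violation is an assumption, since the abstract does not state the cutoff the authors applied, and the descriptor values in the usage lines are hypothetical rather than measured values for any compound in the study.

```python
def lipinski_violations(mol_weight, log_p, h_bond_donors, h_bond_acceptors):
    """Count how many of the four rule-of-five criteria are violated."""
    checks = [
        mol_weight > 500,       # molecular weight above 500 Da
        log_p > 5,              # octanol-water partition coefficient above 5
        h_bond_donors > 5,      # more than 5 hydrogen bond donors
        h_bond_acceptors > 10,  # more than 10 hydrogen bond acceptors
    ]
    return sum(checks)


def passes_rule_of_five(mol_weight, log_p, h_bond_donors, h_bond_acceptors,
                        max_violations=1):
    """Assumed convention: a compound passes with at most `max_violations` violations."""
    n = lipinski_violations(mol_weight, log_p, h_bond_donors, h_bond_acceptors)
    return n <= max_violations


# Hypothetical descriptor values for illustration only.
print(passes_rule_of_five(mol_weight=350.0, log_p=2.0,
                          h_bond_donors=1, h_bond_acceptors=7))
```

In practice the descriptors would come from a cheminformatics toolkit or from database records such as PubChem; they are passed in directly here to keep the sketch dependency-free.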
Pragmatic precision oncology: the secondary uses of clinical tumor molecular profiling
Thota, Ramya; Staggs, David B; Johnson, Douglas B; Warner, Jeremy L
2016-01-01
Background: Precision oncology increasingly utilizes molecular profiling of tumors to determine treatment decisions with targeted therapeutics. The molecular profiling data is valuable in the treatment of individual patients as well as for multiple secondary uses. Objective: To automatically parse, categorize, and aggregate clinical molecular profile data generated during cancer care as well as use this data to address multiple secondary use cases. Methods: A system to parse, categorize and aggregate molecular profile data was created. A naïve Bayesian classifier categorized results according to clinical groups. The accuracy of these systems was validated against a published expertly-curated subset of molecular profiling data. Results: Following one year of operation, 819 samples have been accurately parsed and categorized to generate a data repository of 10,620 genetic variants. The database has been used for operational, clinical trial, and discovery science research. Conclusions: A real-time database of molecular profiling data is a pragmatic solution to several knowledge management problems in the practice and science of precision oncology. PMID:27026612
Bioinformatics analysis on molecular mechanism of rheum officinale in treatment of jaundice
NASA Astrophysics Data System (ADS)
Shan, Si; Tu, Jun; Nie, Peng; Yan, Xiaojun
2017-01-01
Objective: To study the molecular mechanism of Rheum officinale in the treatment of jaundice by building molecular networks and comparing canonical pathways. Methods: Target proteins of Rheum officinale and jaundice-related genes were retrieved online from the PubChem and Gene databases, respectively. Molecular network and canonical pathway comparison analyses were performed with Ingenuity Pathway Analysis (IPA). Results: The molecular networks of Rheum officinale and jaundice were complex and multifunctional. In total, 40 target proteins of Rheum officinale and 33 Homo sapiens genes related to jaundice were found in the databases. Nineteen pathways were common to both networks. Rheum officinale could regulate endothelial differentiation, Interleukin-1B (IL-1B) and Tumor Necrosis Factor (TNF) in these pathways. Conclusions: Rheum officinale treats jaundice by regulating many effective nodes of the apoptotic pathway and of pathways related to cellular immunity.
NALDB: nucleic acid ligand database for small molecules targeting nucleic acid.
Kumar Mishra, Subodh; Kumar, Amit
2016-01-01
Nucleic acid ligand database (NALDB) is a unique database that provides detailed information about the experimental data of small molecules that were reported to target several types of nucleic acid structures. NALDB is the first ligand database that contains ligand information for all types of nucleic acid. NALDB contains more than 3500 ligand entries with detailed pharmacokinetic and pharmacodynamic information such as target name, target sequence, ligand 2D/3D structure, SMILES, molecular formula, molecular weight, net formal charge, AlogP, number of rings, number of hydrogen bond donors and acceptors, and potential energy, along with their Ki, Kd and IC50 values. Having all these details on a single platform would be helpful for the development and improvement of novel ligands targeting nucleic acids, which could serve as potential targets in different diseases including cancers and neurological disorders. With a maximum of 255 conformers for each ligand entry, our database is a multi-conformer database and can facilitate the virtual screening process. NALDB provides powerful web-based search tools that make database searching efficient and simple, with options for text as well as structure queries. NALDB also provides a multi-dimensional advanced search tool that can screen the database molecules on the basis of ligand molecular properties provided by database users. A 3D structure visualization tool has also been included for 3D structure representation of ligands. NALDB offers inclusive pharmacological information and a structurally flexible set of small molecules with their three-dimensional conformers that can accelerate virtual screening and other modeling processes and eventually complement nucleic acid-based drug discovery research. NALDB can be routinely updated and is freely available at bsbe.iiti.ac.in/bsbe/naldb/HOME.php. Database URL: http://bsbe.iiti.ac.in/bsbe/naldb/HOME.php. © The Author(s) 2016. Published by Oxford University Press.
The EDGE-CALIFA Survey: Interferometric Observations of 126 Galaxies with CARMA
NASA Astrophysics Data System (ADS)
Bolatto, Alberto D.; Wong, Tony; Utomo, Dyas; Blitz, Leo; Vogel, Stuart N.; Sánchez, Sebastián F.; Barrera-Ballesteros, Jorge; Cao, Yixian; Colombo, Dario; Dannerbauer, Helmut; García-Benito, Rubén; Herrera-Camus, Rodrigo; Husemann, Bernd; Kalinova, Veselina; Leroy, Adam K.; Leung, Gigi; Levy, Rebecca C.; Mast, Damián; Ostriker, Eve; Rosolowsky, Erik; Sandstrom, Karin M.; Teuben, Peter; van de Ven, Glenn; Walter, Fabian
2017-09-01
We present interferometric CO observations, made with the Combined Array for Millimeter-wave Astronomy (CARMA) interferometer, of galaxies from the Extragalactic Database for Galaxy Evolution survey (EDGE). These galaxies are selected from the Calar Alto Legacy Integral Field Area (CALIFA) sample, mapped with optical integral field spectroscopy. EDGE provides good-quality CO data (3σ sensitivity $\Sigma_{\mathrm{mol}} \sim 11\,M_\odot\,\mathrm{pc}^{-2}$ before inclination correction, resolution ~1.4 kpc) for 126 galaxies, constituting the largest interferometric CO survey of galaxies in the nearby universe. We describe the survey and data characteristics and products, then present initial science results. We find that the exponential scale lengths of the molecular, stellar, and star-forming disks are approximately equal, and galaxies that are more compact in molecular gas than in stars tend to show signs of interaction. We characterize the molecular-to-stellar ratio as a function of Hubble type and stellar mass and present preliminary results on the resolved relations between the molecular gas, stars, and star-formation rate. We then discuss the dependence of the resolved molecular depletion time on stellar surface density, nebular extinction, and gas metallicity. EDGE provides a key data set to address outstanding topics regarding gas and its role in star formation and galaxy evolution, which will be publicly available on completion of the quality assessment.
Kumar, Anurag; Saha, Bhaskar; Singh, Shailza
2017-12-01
Leishmaniasis, the second largest parasitic killer disease, is caused by the protozoan parasite Leishmania and transmitted by the bite of sand flies. It is endemic in eastern India, with a population of 165.4 million at risk under the current drug regimen. Three forms of leishmaniasis exist, of which the cutaneous form, caused by Leishmania major, is the most common. Trypanothione Reductase (TryR), a flavoprotein oxidoreductase unique to the thiol redox system, is considered a potential target for chemotherapy of trypanosomatid infections. It is involved in the NADPH-dependent reduction of Trypanothione disulphide to Trypanothione. Similarly, Tryparedoxin Peroxidase (Txnpx) detoxifies peroxides, an event pivotal for the survival of Leishmania in two disparate biological environments. Fe-S plays a major role in regulating redox balance. To check the closeness of the human homologs of these proteins, we carried out a molecular clock analysis followed by molecular modeling of the 3D structure of this protein, enabling us to design and test novel drug-like molecules. The molecular clock analysis suggests that the human homologs of TryR (i.e., Glutathione Reductase) and of Txnpx are highly diverged in the phylogenetic tree; thus, these proteins serve as good candidates for chemotherapy of leishmaniasis. Furthermore, we performed homology modeling of TryR using the same protein from Leishmania infantum (PDB ID: 2JK6) as a template. This was done using Modeller 9.18, and the resultant models were validated. To inhibit this target, molecular docking was done with various screened inhibitors, among which we found that Taxifolin acts as a common inhibitor of both TryR and Txnpx. We constructed the protein-protein interaction network for the proteins involved in redox metabolism from various interaction databases, and the network was statistically analysed.
DGIdb 3.0: a redesign and expansion of the drug-gene interaction database.
Cotto, Kelsy C; Wagner, Alex H; Feng, Yang-Yang; Kiwala, Susanna; Coffman, Adam C; Spies, Gregory; Wollam, Alex; Spies, Nicholas C; Griffith, Obi L; Griffith, Malachi
2018-01-04
The drug-gene interaction database (DGIdb, www.dgidb.org) consolidates, organizes and presents drug-gene interactions and gene druggability information from papers, databases and web resources. DGIdb normalizes content from 30 disparate sources and allows for user-friendly advanced browsing, searching and filtering for ease of access through an intuitive web user interface, application programming interface (API) and public cloud-based server image. DGIdb v3.0 represents a major update of the database. Nine of the previously included 24 sources were updated. Six new resources were added, bringing the total number of sources to 30. These updates and additions of sources have cumulatively resulted in 56 309 interaction claims. This has also substantially expanded the comprehensive catalogue of druggable genes and anti-neoplastic drug-gene interactions included in the DGIdb. Along with these content updates, v3.0 has received a major overhaul of its codebase, including an updated user interface, preset interaction search filters, consolidation of interaction information into interaction groups, greatly improved search response times and upgrading the underlying web application framework. In addition, the expanded API features new endpoints which allow users to extract more detailed information about queried drugs, genes and drug-gene interactions, including listings of PubMed IDs, interaction type and other interaction metadata.
Updates to the Virtual Atomic and Molecular Data Centre
NASA Astrophysics Data System (ADS)
Hill, Christian; Tennyson, Jonathan; Gordon, Iouli E.; Rothman, Laurence S.; Dubernet, Marie-Lise
2014-06-01
The Virtual Atomic and Molecular Data Centre (VAMDC) has established a set of standards for the storage and transmission of atomic and molecular data and an SQL-based query language (VSS2) for searching online databases, known as nodes. The project has also created an online service, the VAMDC Portal, through which all of these databases may be searched and their results compared and aggregated. Since its inception four years ago, the VAMDC e-infrastructure has grown to encompass over 40 databases, including HITRAN, in more than 20 countries and engages actively with scientists in six continents. Associated with the portal is a growing suite of software tools for the transformation of data from its native, XML-based, XSAMS format, to a range of more convenient human-readable (such as HTML) and machine-readable (such as CSV) formats. The relational database for HITRAN, created as part of the VAMDC project, is a flexible and extensible data model which is able to represent a wider range of parameters than the current fixed-format text-based one. Over the next year, a new online interface to this database will be tested, released and fully documented - this web application, HITRANonline, will fully replace the ageing and incomplete JavaHAWKS software suite.
Cayenne, Andrea P.; Gabert, Beverly; Stillman, Jonathon H.
2011-01-01
Biochemical adaptation of enzymes involves conservation of activity, stability and affinity across a wide range of intracellular and environmental conditions. Enzyme adaptation by alteration of primary structure is well known, but the roles of protein-protein interactions in enzyme adaptation are less well understood. Interspecific differences in thermal stability of lactate dehydrogenase (LDH) in porcelain crabs (genus Petrolisthes) are related to intrinsic differences among LDH molecules and to interactions with other stabilizing proteins. Here, we identified proteins that interact with LDH in porcelain crab claw muscle tissue using co-immunoprecipitation, and showed that LDH exists in high molecular weight complexes using size exclusion chromatography and Western blot analyses. Co-immunoprecipitated proteins were separated using 2D SDS PAGE and analyzed by LC/ESI using peptide MS/MS. Peptide MS/MS ions were compared to an EST database for Petrolisthes cinctipes to identify proteins. Identified proteins included cytoskeletal elements, glycolytic enzymes, a phosphagen kinase, and the respiratory protein hemocyanin. Our results support the hypothesis that LDH interacts with glycolytic enzymes in a metabolon structured by cytoskeletal elements that may also include the enzyme for transfer of the adenylate charge in glycolytically produced ATP. Those interactions may play specific roles in biochemical adaptation of glycolytic enzymes. PMID:21968246
Yugandhar, K; Gromiha, M Michael
2014-09-01
Protein-protein interactions are intrinsic to virtually every cellular process. Predicting the binding affinity of protein-protein complexes is one of the challenging problems in computational and molecular biology. In this work, we related sequence features of protein-protein complexes with their binding affinities using machine learning approaches. We set up a database of 185 protein-protein complexes for which the interacting pairs are heterodimers and their experimental binding affinities are available. On the other hand, we have developed a set of 610 features from the sequences of protein complexes and utilized Ranker search method, which is the combination of Attribute evaluator and Ranker method for selecting specific features. We have analyzed several machine learning algorithms to discriminate protein-protein complexes into high and low affinity groups based on their Kd values. Our results showed a 10-fold cross-validation accuracy of 76.1% with the combination of nine features using support vector machines. Further, we observed accuracy of 83.3% on an independent test set of 30 complexes. We suggest that our method would serve as an effective tool for identifying the interacting partners in protein-protein interaction networks and human-pathogen interactions based on the strength of interactions. © 2014 Wiley Periodicals, Inc.
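The 10-fold cross-validation mentioned above partitions the data set so that every complex is used for testing exactly once. A minimal, dependency-free sketch of that partitioning for a set of 185 items follows. The function name, the shuffling seed and the round-robin fold assignment are illustrative assumptions, not the authors' procedure, and the classifier itself is left out.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded shuffle so folds are reproducible
    # Round-robin assignment keeps fold sizes within one item of each other.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


for fold_number, (train, test) in enumerate(k_fold_indices(185, k=10), start=1):
    print(fold_number, len(train), len(test))
```

The reported accuracy would then be the average over the ten held-out folds, with a model fitted on each training split in turn.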
ISMB Conference Funding to Support Attendance of Early Researchers and Students
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gaasterland, Terry
ISMB Conference Funding for Students and Young Scientists. Historical Description: The Intelligent Systems for Molecular Biology (ISMB) conference has provided a general forum for disseminating the latest developments in bioinformatics on an annual basis for the past 22 years. ISMB is a multidisciplinary conference that brings together scientists from computer science, molecular biology, mathematics and statistics. The goal of the ISMB meeting is to bring together biologists and computational scientists in a focus on actual biological problems, i.e., not simply theoretical calculations. The combined focus on “intelligent systems” and actual biological data makes ISMB a unique and highly important meeting. 21 years of experience in holding the conference has resulted in a consistently well-organized, well attended, and highly respected annual conference. "Intelligent systems" include any software which goes beyond straightforward, closed-form algorithms or standard database technologies, and encompasses those that view data in a symbolic fashion, learn from examples, consolidate multiple levels of abstraction, or synthesize results to be cognitively tractable to a human, including the development and application of advanced computational methods for biological problems. Relevant computational techniques include, but are not limited to: machine learning, pattern recognition, knowledge representation, databases, combinatorics, stochastic modeling, string and graph algorithms, linguistic methods, robotics, constraint satisfaction, and parallel computation. Biological areas of interest include molecular structure, genomics, molecular sequence analysis, evolution and phylogenetics, molecular interactions, metabolic pathways, regulatory networks, developmental control, and molecular biology generally.
Emphasis is placed on the validation of methods using real data sets, on practical applications in the biological sciences, and on development of novel computational techniques. The ISMB conferences are distinguished from many other conferences in computational biology or artificial intelligence by an insistence that the researchers work with real molecular biology data, not theoretical or toy examples; and from many other biological conferences by providing a forum for technical advances as they occur, which otherwise may be shunned until a firm experimental result is published. The resulting intellectual richness and cross-disciplinary diversity provides an important opportunity for both students and senior researchers. ISMB has become the premier conference series in this field with refereed, published proceedings, establishing an infrastructure to promote the growing body of research.
Xu, Wei-Ming; Yang, Kuo; Jiang, Li-Jie; Hu, Jing-Qing; Zhou, Xue-Zhong
2018-01-01
Background: Ischemic heart disease (IHD) has been the leading cause of death globally for several decades, and IHD patients usually present with the symptoms of phlegm-stasis cementation syndrome (PSCS) as significant complications. However, the underlying molecular mechanisms of PSCS complicated with IHD have not yet been fully elucidated. Materials and Methods: Network medicine methods were utilized to elucidate the underlying molecular mechanisms of IHD phenotypes. Firstly, high-quality IHD-associated genes from both a human curated disease-gene association database and the biomedical literature were integrated. Secondly, the IHD disease modules were obtained by dissecting the protein-protein interaction (PPI) topological modules in the String V9.1 database and mapping IHD-associated genes to the PPI topological modules. After that, molecular functional analyses (e.g., Gene Ontology and pathway enrichment analyses) for these IHD disease modules were conducted. Finally, the PSCS syndrome modules were identified by mapping the PSCS related symptom-genes to the IHD disease modules, which were further validated by both pharmacological and physiological evidence derived from the published literature. Results: A total of 1,056 high-quality IHD-associated genes were integrated and evaluated. In addition, eight IHD disease modules (the PPI sub-networks significantly relevant to IHD) were identified, of which two were relevant to PSCS syndrome (i.e., two PSCS syndrome modules). These two modules were enriched for the Toll-like receptor signaling pathway (hsa04620) and the Renin-angiotensin system (hsa04614), with the molecular functions of angiotensin maturation (GO:0002003) and response to bacterium (GO:0009617), which were validated by classical Chinese herbal formula-related targets, IHD-related drug targets, and the phenotype features derived from the Human Phenotype Ontology (HPO) and the published biomedical literature.
Conclusion: A network medicine-based approach was proposed to identify the underlying molecular modules of PSCS complicated with IHD, which could be used for interpreting the pharmacological mechanisms of well-established Chinese herbal formulas (e.g., Tao Hong Si Wu Tang, Dan Shen Yin, Huang Lian Wen Dan Tang and Gua Lou Xie Bai Ban Xia Tang). In addition, these results delivered a novel understanding of the molecular network mechanisms of IHD phenotype subtypes with PSCS complications, which would be insightful for both IHD precision medicine and the integration of disease and TCM syndrome diagnoses. PMID:29403392
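One common way to decide whether a PPI module is "significantly relevant" to a disease gene set is a hypergeometric over-representation test. The abstract does not say which statistic the authors used, so the following is only a minimal sketch under that assumption; the function name and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe, disease_genes, module_size, overlap):
    """Return P(X >= overlap) for a hypergeometric draw.

    universe      -- total number of genes in the network (assumed background)
    disease_genes -- number of disease-associated genes in that background
    module_size   -- number of genes in the module being tested
    overlap       -- disease-associated genes observed inside the module
    """
    upper = min(module_size, disease_genes)
    favourable = sum(
        comb(disease_genes, i) * comb(universe - disease_genes, module_size - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(universe, module_size)


# Toy numbers only: 10 genes in total, 4 disease genes, a module of 3 genes
# that contains 2 of the disease genes.
p_value = hypergeom_enrichment_p(universe=10, disease_genes=4, module_size=3, overlap=2)
print(round(p_value, 4))
```

In practice the p-values for all modules would also need a multiple-testing correction, which is left out of this sketch.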
National Center for Biotechnology Information
... Splign Vector Alignment Search Tool (VAST) All Data & Software Resources... Domains & Structures BioSystems Cn3D Conserved Domain Database (CDD) Conserved Domain Search Service (CD Search) Structure (Molecular Modeling Database) Vector Alignment ...
NASA Astrophysics Data System (ADS)
Rutkowski, Lucile; Masłowski, Piotr; Johansson, Alexandra C.; Khodabakhsh, Amir; Foltynowicz, Aleksandra
2018-01-01
Broadband precision spectroscopy is indispensable for providing high fidelity molecular parameters for spectroscopic databases. We have recently shown that mechanical Fourier transform spectrometers based on optical frequency combs can measure broadband high-resolution molecular spectra undistorted by the instrumental line shape (ILS) and with a highly precise frequency scale provided by the comb. The accurate measurement of the power of the comb modes interacting with the molecular sample was achieved by acquiring single-burst interferograms with nominal resolution matched to the comb mode spacing. Here we describe in detail the experimental and numerical steps needed to achieve sub-nominal resolution and retrieve ILS-free molecular spectra, i.e. with ILS-induced distortion below the noise level. We investigate the accuracy of the transition line centers retrieved by fitting to the absorption lines measured using this method. We verify the performance by measuring an ILS-free cavity-enhanced low-pressure spectrum of the 3ν1 + ν3 band of CO2 around 1575 nm with line widths narrower than the nominal resolution. We observe and quantify collisional narrowing of absorption line shape, for the first time with a comb-based spectroscopic technique. Thus retrieval of line shape parameters with accuracy not limited by the Voigt profile is now possible for entire absorption bands acquired simultaneously.
NONATObase: a database for Polychaeta (Annelida) from the Southwestern Atlantic Ocean.
Pagliosa, Paulo R; Doria, João G; Misturini, Dairana; Otegui, Mariana B P; Oortman, Mariana S; Weis, Wilson A; Faroni-Perez, Larisse; Alves, Alexandre P; Camargo, Maurício G; Amaral, A Cecília Z; Marques, Antonio C; Lana, Paulo C
2014-01-01
Networks can greatly advance data sharing attitudes by providing organized and useful data sets on marine biodiversity in a friendly and shared scientific environment. NONATObase, the interactive database on polychaetes presented herein, will provide new macroecological and taxonomic insights into the Southwestern Atlantic region. The database was developed by the NONATO network, a team of South American researchers, who integrated available information on polychaetes from between 5°N and 80°S in the Atlantic Ocean and near the Antarctic. The guiding principle of the database is to keep free and open access to data based on partnerships. Its architecture consists of a relational database integrated in the MySQL and PHP framework. Its web application allows access to the data from three different directions: species (qualitative data), abundance (quantitative data) and data set (reference data). The database has built-in functionality, such as filtering of data by user-defined taxonomic level, site characteristics, sample, sampler, and mesh size used. Considering that there are still many taxonomic issues related to poorly known regional fauna, a scientific committee was created to work out consistent solutions to current misidentifications and the equivocal taxonomic status of some species. Expertise from this committee will be incorporated into NONATObase continually. The use of quantitative data was made possible by standardization of a sample unit. All data, distribution maps and references from a data set or a specified query can be visualized and exported to data formats commonly used in statistical analysis or reference manager software. With NONATObase, the NONATO network has launched a valuable resource for marine ecologists and taxonomists. The database is expected to grow in functionality as it proves useful, particularly regarding the challenges of dealing with molecular genetic data and tools to assess the effects of global environmental change.
Database URL: http://nonatobase.ufsc.br/ PMID:24573879
Interleukins and their signaling pathways in the Reactome biological pathway database.
Jupe, Steve; Ray, Keith; Roca, Corina Duenas; Varusai, Thawfeek; Shamovsky, Veronica; Stein, Lincoln; D'Eustachio, Peter; Hermjakob, Henning
2018-04-01
There is a wealth of biological pathway information available in the scientific literature, but it is spread across many thousands of publications. Alongside publications that contain definitive experimental discoveries are many others that have been dismissed as spurious, found to be irreproducible, or are contradicted by later results and consequently now considered controversial. Many descriptions and images of pathways are incomplete stylized representations that assume the reader is an expert and familiar with the established details of the process, which are consequently not fully explained. Pathway representations in publications frequently do not represent a complete, detailed, and unambiguous description of the molecules involved; their precise posttranslational state; or a full account of the molecular events they undergo while participating in a process. Although this might be sufficient to be interpreted by an expert reader, the lack of detail makes such pathways less useful and difficult to understand for anyone unfamiliar with the area and of limited use as the basis for computational models. Reactome was established as a freely accessible knowledge base of human biological pathways. It is manually populated with interconnected molecular events that fully detail the molecular participants linked to published experimental data and background material by using a formal and open data structure that facilitates computational reuse. These data are accessible on a Web site in the form of pathway diagrams that have descriptive summaries and annotations and as downloadable data sets in several formats that can be reused with other computational tools. The entire database and all supporting software can be downloaded and reused under a Creative Commons license. Pathways are authored by expert biologists who work with Reactome curators and editorial staff to represent the consensus in the field. 
Pathways are represented as interactive diagrams that include as much molecular detail as possible and are linked to literature citations that contain supporting experimental details. All newly created events undergo a peer-review process before they are added to the database and made available on the associated Web site. New content is added quarterly. The 63rd release of Reactome in December 2017 contains 10,996 human proteins participating in 11,426 events in 2,179 pathways. In addition, analytic tools allow data set submission for the identification and visualization of pathway enrichment and representation of expression profiles as an overlay on Reactome pathways. Protein-protein and compound-protein interactions from several sources, including custom user data sets, can be added to extend pathways. Pathway diagrams and analytic result displays can be downloaded as editable images, human-readable reports, and files in several standard formats that are suitable for computational reuse. Reactome content is available programmatically through a REpresentational State Transfer (REST)-based content service and as a Neo4J graph database. Signaling pathways for IL-1 to IL-38 are hierarchically classified within the pathway "signaling by interleukins." The classification used is largely derived from Akdis et al. The addition to Reactome of a complete set of the known human interleukins, their receptors, and established signaling pathways linked to annotations of relevant aspects of immune function provides a significant computationally accessible resource of information about this important family. This information can be extended easily as new discoveries become accepted as the consensus in the field. A key aim for the future is to increase coverage of gene expression changes induced by interleukin signaling. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
TMDB: a literature-curated database for small molecular compounds found from tea.
Yue, Yi; Chu, Gang-Xiu; Liu, Xue-Shi; Tang, Xing; Wang, Wei; Liu, Guang-Jin; Yang, Tao; Ling, Tie-Jun; Wang, Xiao-Gang; Zhang, Zheng-Zhu; Xia, Tao; Wan, Xiao-Chun; Bao, Guan-Hu
2014-09-16
Tea is one of the most consumed beverages worldwide. The health effects of tea are attributed to a wealth of different chemical components. Thousands of studies on the chemical constituents of tea have been reported. However, data from these individual reports have not been collected into a single database. The lack of a curated database of related information limits research in this field, and thus a cohesive database system needs to be constructed for data deposit and further application. The Tea Metabolome database (TMDB), a manually curated and web-accessible database, was developed to provide detailed, searchable descriptions of small molecular compounds found in Camellia spp., especially in the plant Camellia sinensis, and compounds in its manufactured products (different kinds of tea infusion). TMDB is currently the most complete and comprehensive curated collection of tea compound data in the world. It contains records for more than 1393 constituents found in tea, with information gathered from 364 published books, journal articles, and electronic databases. It also contains experimental 1H NMR and 13C NMR data collected from the purified reference compounds or from other database resources such as HMDB. The TMDB interface allows users to retrieve tea compound entries by keyword search using compound name, formula, occurrence, and CAS registry number. Each entry in the TMDB contains an average of 24 separate data fields, including its original plant species, compound structure, formula, molecular weight, name, CAS registry number, compound types, compound uses including health benefits, literature references, NMR and MS data, and the corresponding ID from databases such as HMDB and PubMed. Users can also contribute novel regulatory entries by using a web-based submission page. The TMDB database is freely accessible at http://pcsb.ahau.edu.cn:8080/TCDB/index.jsp.
The TMDB is designed to address the broad needs of tea biochemists, natural products chemists, nutritionists, and members of the tea-related research community. The TMDB database provides a solid platform for collection, standardization, and searching of information on compounds found in tea. As such, this database will be a comprehensive repository for the tea biochemistry and tea health research community.
Caetano, Fabiana A; Dirk, Brennan S; Tam, Joshua H K; Cavanagh, P Craig; Goiko, Maria; Ferguson, Stephen S G; Pasternak, Stephen H; Dikeakos, Jimmy D; de Bruyn, John R; Heit, Bryan
2015-12-01
Our current understanding of the molecular mechanisms which regulate cellular processes such as vesicular trafficking has been enabled by conventional biochemical and microscopy techniques. However, these methods often obscure the heterogeneity of the cellular environment, thus precluding a quantitative assessment of the molecular interactions regulating these processes. Herein, we present Molecular Interactions in Super Resolution (MIiSR) software which provides quantitative analysis tools for use with super-resolution images. MIiSR combines multiple tools for analyzing intermolecular interactions, molecular clustering and image segmentation. These tools enable quantification, in the native environment of the cell, of molecular interactions and the formation of higher-order molecular complexes. The capabilities and limitations of these analytical tools are demonstrated using both modeled data and examples derived from the vesicular trafficking system, thereby providing an established and validated experimental workflow capable of quantitatively assessing molecular interactions and molecular complex formation within the heterogeneous environment of the cell.
Recent progress and future directions in protein-protein docking.
Ritchie, David W
2008-02-01
This article gives an overview of recent progress in protein-protein docking and it identifies several directions for future research. Recent results from the CAPRI blind docking experiments show that docking algorithms are steadily improving in both reliability and accuracy. Current docking algorithms employ a range of efficient search and scoring strategies, including e.g. fast Fourier transform correlations, geometric hashing, and Monte Carlo techniques. These approaches can often produce a relatively small list of up to a few thousand orientations, amongst which a near-native binding mode is often observed. However, despite the use of improved scoring functions which typically include models of desolvation, hydrophobicity, and electrostatics, current algorithms still have difficulty in identifying the correct solution from the list of false positives, or decoys. Nonetheless, significant progress is being made through better use of bioinformatics, biochemical, and biophysical information such as e.g. sequence conservation analysis, protein interaction databases, alanine scanning, and NMR residual dipolar coupling restraints to help identify key binding residues. Promising new approaches to incorporate models of protein flexibility during docking are being developed, including the use of molecular dynamics snapshots, rotameric and off-rotamer searches, internal coordinate mechanics, and principal component analysis based techniques. Some investigators now use explicit solvent models in their docking protocols. Many of these approaches can be computationally intensive, although new silicon chip technologies such as programmable graphics processor units are beginning to offer competitive alternatives to conventional high performance computer systems. As cryo-EM techniques improve apace, docking NMR and X-ray protein structures into low resolution EM density maps is helping to bridge the resolution gap between these complementary techniques. 
The use of symmetry and fragment assembly constraints are also helping to make possible docking-based predictions of large multimeric protein complexes. In the near future, the closer integration of docking algorithms with protein interface prediction software, structural databases, and sequence analysis techniques should help produce better predictions of protein interaction networks and more accurate structural models of the fundamental molecular interactions within the cell.
NASA Astrophysics Data System (ADS)
Patil, Sachin P.; Pacitti, Michael F.; Gilroy, Kevin S.; Ruggiero, John C.; Griffin, Jonathan D.; Butera, Joseph J.; Notarfrancesco, Joseph M.; Tran, Shawn; Stoddart, John W.
2015-02-01
The inhibition of tumor suppressor p53 protein due to its direct interaction with oncogenic murine double minute 2 (MDM2) protein, plays a central role in almost 50 % of all human tumor cells. Therefore, pharmacological inhibition of the p53-binding pocket on MDM2, leading to p53 activation, presents an important therapeutic target against these cancers expressing wild-type p53. In this context, the present study utilized an integrated virtual and experimental screening approach to screen a database of approved drugs for potential p53-MDM2 interaction inhibitors. Specifically, using an ensemble rigid-receptor docking approach with four MDM2 protein crystal structures, six drug molecules were identified as possible p53-MDM2 inhibitors. These drug molecules were then subjected to further molecular modeling investigation through flexible-receptor docking followed by Prime/MM-GBSA binding energy analysis. These studies identified fluspirilene, an approved antipsychotic drug, as a top hit with MDM2 binding mode and energy similar to that of a native MDM2 crystal ligand. The molecular dynamics simulations suggested stable binding of fluspirilene to the p53-binding pocket on MDM2 protein. The experimental testing of fluspirilene showed significant growth inhibition of human colon tumor cells in a p53-dependent manner. Fluspirilene also inhibited growth of several other human tumor cell lines in the NCI60 cell line panel. Taken together, these computational and experimental data suggest a potentially novel role of fluspirilene in inhibiting the p53-MDM2 interaction. It is noteworthy here that fluspirilene has a long history of safe human use, thus presenting immediate clinical potential as a cancer therapeutic. Furthermore, fluspirilene could also serve as a structurally-novel lead molecule for the development of more potent, small-molecule p53-MDM2 inhibitors against several types of cancer. 
Importantly, the combined computational and experimental screening protocol presented in this study may also prove useful for screening other commercially-available compound databases for identification of novel, small molecule p53-MDM2 inhibitors.
Wang, Xiaojie; Tang, Chunlei; Zhang, Gang; Li, Yingchun; Wang, Chenfang; Liu, Bo; Qu, Zhipeng; Zhao, Jie; Han, Qingmei; Huang, Lili; Chen, Xianming; Kang, Zhensheng
2009-01-01
Background Puccinia striiformis f. sp. tritici is a fungal pathogen causing stripe rust, one of the most important wheat diseases worldwide. The fungus is strictly biotrophic and thus completely dependent on living host cells for its reproduction, which makes it difficult to study genes of the pathogen. In spite of its economic importance, little is known about the molecular basis of the compatible interaction between the pathogen and its wheat host. In this study, we identified wheat and P. striiformis genes associated with the infection process by conducting a large-scale transcriptomic analysis using cDNA-AFLP. Results Of the total 54,912 transcript derived fragments (TDFs) obtained using cDNA-AFLP with 64 primer pairs, 2,306 (4.2%) displayed altered expression patterns after inoculation, of which 966 were up-regulated and 1,340 down-regulated. Of 208 TDFs selected for sequencing, 186 produced reliable sequences, of which 74 (40%) had known functions according to BLAST searches of the GenBank database. The majority of the latter group had predicted gene products involved in energy (13%), signal transduction (5.4%), disease/defence (5.9%) and metabolism (5% of the sequenced TDFs). BLAST searching of the wheat stem rust fungus genome database identified 18 TDFs possibly from the stripe rust pathogen, of which 9 were validated as being of pathogen origin using PCR-based assays followed by sequencing confirmation. Of the 186 reliable TDFs, 29 homologous to genes known to play a role in disease/defense, signal transduction or uncharacterized genes were further selected for validation of cDNA-AFLP expression patterns using qRT-PCR analyses. Results confirmed the altered expression patterns of 28 (96.5%) genes revealed by the cDNA-AFLP technique. Conclusion The results show that cDNA-AFLP is a reliable technique for studying expression patterns of genes involved in the wheat-stripe rust interactions.
Genes involved in compatible interactions between wheat and the stripe rust pathogen were identified and their expression patterns were determined. The present study should be helpful in elucidating the molecular basis of the infection process, and identifying genes that can be targeted for inhibiting the growth and reproduction of the pathogen. Moreover, this study can also be used to elucidate the defence responses of the genes that were of plant origin. PMID:19566949
DCMS: A data analytics and management system for molecular simulation.
Kumar, Anand; Grupcev, Vladimir; Berrada, Meryem; Fogarty, Joseph C; Tu, Yi-Cheng; Zhu, Xingquan; Pandit, Sagar A; Xia, Yuni
Molecular Simulation (MS) is a powerful tool for studying physical/chemical features of large systems and has seen applications in many scientific and engineering domains. During the simulation process, the experiments generate data for a very large number of atoms, whose spatial and temporal relationships are observed for scientific analysis. The sheer data volumes and their intensive interactions impose significant challenges for data access, management, and analysis. To date, existing MS software systems fall short on storage and handling of MS data, mainly because they lack a platform to support applications that involve intensive data access and analytical processing. In this paper, we present the database-centric molecular simulation (DCMS) system our team developed in the past few years. The main idea behind DCMS is to store MS data in a relational database management system (DBMS) to take advantage of the declarative query interface (i.e., SQL), data access methods, query processing, and optimization mechanisms of modern DBMSs. A unique challenge is to handle the analytical queries that are often compute-intensive. For that, we developed novel indexing and query processing strategies (including algorithms running on modern co-processors) as integrated components of the DBMS. As a result, researchers can upload and analyze their data using efficient functions implemented inside the DBMS. Index structures are generated to store analysis results that may be interesting to other users, so that the results are readily available without duplicating the analysis. We have developed a prototype of DCMS based on the PostgreSQL system, and experiments using real MS data and workloads show that DCMS significantly outperforms existing MS software systems. We also used it as a platform to test other data management issues such as security and compression.
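The core idea of keeping simulation frames in a relational table and expressing an analysis as a declarative query can be illustrated with a small sketch. It uses Python's built-in SQLite module rather than PostgreSQL, and the table name, column names and coordinates are all hypothetical; it does not reproduce the DCMS schema, indexes or co-processor algorithms.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical per-frame atom table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE atom_position (
               frame   INTEGER,
               atom_id INTEGER,
               element TEXT,
               x REAL, y REAL, z REAL,
               PRIMARY KEY (frame, atom_id))"""
    )
    # Made-up coordinates for three atoms over two frames.
    rows = [
        (0, 1, "O", 0.0, 0.0, 0.0),
        (0, 2, "H", 0.96, 0.0, 0.0),
        (0, 3, "H", -0.24, 0.93, 0.0),
        (1, 1, "O", 0.1, 0.0, 0.0),
        (1, 2, "H", 1.06, 0.0, 0.0),
        (1, 3, "H", -0.14, 0.93, 0.0),
    ]
    conn.executemany("INSERT INTO atom_position VALUES (?, ?, ?, ?, ?, ?)", rows)
    return conn


def pairs_within(conn, frame, cutoff):
    """Return atom-id pairs in one frame whose distance is at most `cutoff`."""
    sql = """
        SELECT a.atom_id, b.atom_id
        FROM atom_position AS a
        JOIN atom_position AS b
          ON a.frame = b.frame AND a.atom_id < b.atom_id
        WHERE a.frame = ?
          AND (a.x - b.x) * (a.x - b.x)
            + (a.y - b.y) * (a.y - b.y)
            + (a.z - b.z) * (a.z - b.z) <= ? * ?
        ORDER BY a.atom_id, b.atom_id
    """
    return conn.execute(sql, (frame, cutoff, cutoff)).fetchall()


conn = build_demo_db()
print(pairs_within(conn, frame=0, cutoff=1.0))
```

The self-join scales quadratically with the number of atoms per frame, which is the kind of compute-intensive analytical query that motivates dedicated indexing inside the DBMS.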
López, Yosvany; Nakai, Kenta; Patil, Ashwini
2015-01-01
HitPredict is a consolidated resource of experimentally identified, physical protein-protein interactions with confidence scores to indicate their reliability. The study of genes and their inter-relationships using methods such as network and pathway analysis requires high quality protein-protein interaction information. Extracting reliable interactions from most of the existing databases is challenging because they either contain only a subset of the available interactions, or a mixture of physical, genetic and predicted interactions. Automated integration of interactions is further complicated by varying levels of accuracy of database content and lack of adherence to standard formats. To address these issues, the latest version of HitPredict provides a manually curated dataset of 398 696 physical associations between 70 808 proteins from 105 species. Manual confirmation was used to resolve all issues encountered during data integration. For improved reliability assessment, this version combines a new score derived from the experimental information of the interactions with the original score based on the features of the interacting proteins. The combined interaction score performs better than either of the individual scores in HitPredict as well as the reliability score of another similar database. HitPredict provides a web interface to search proteins and visualize their interactions, and the data can be downloaded for offline analysis. Data usability has been enhanced by mapping protein identifiers across multiple reference databases. Thus, the latest version of HitPredict provides a significantly larger, more reliable and usable dataset of protein-protein interactions from several species for the study of gene groups. Database URL: http://hintdb.hgc.jp/htp. © The Author(s) 2015. Published by Oxford University Press.
ORENZA: a web resource for studying ORphan ENZyme activities
Lespinet, Olivier; Labedan, Bernard
2006-01-01
Background Despite the current availability of several hundreds of thousands of amino acid sequences, more than 36% of the enzyme activities (EC numbers) defined by the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB) are not associated with any amino acid sequence in major public databases. This wide gap separating knowledge of biochemical function and sequence information is found for nearly all classes of enzymes. Thus, there is an urgent need to explore these sequence-less EC numbers, in order to progressively close this gap. Description We designed ORENZA, a PostgreSQL database of ORphan ENZyme Activities, to collate information about the EC numbers defined by the NC-IUBMB with specific emphasis on orphan enzyme activities. Complete lists of all EC numbers and of orphan EC numbers are available and will be periodically updated. ORENZA allows one to browse the complete list of EC numbers or the subset associated with orphan enzymes or to query a specific EC number, an enzyme name or a species name for those interested in particular organisms. It is possible to search ORENZA for the different biochemical properties of the defined enzymes, the metabolic pathways in which they participate, the taxonomic data of the organisms whose genomes encode them, and many other features. The association of an enzyme activity with an amino acid sequence is clearly underlined, making it easy to identify at once the orphan enzyme activities. Interactive publishing of suggestions by the community would provide expert evidence for re-annotation of orphan EC numbers in public databases. Conclusion ORENZA is a Web resource designed to progressively bridge the unwanted gap between function (enzyme activities) and sequence (dataset present in public databases). ORENZA should increase interactions between communities of biochemists and of genomicists. 
This is expected to reduce the number of orphan enzyme activities by allocating gene sequences to the relevant enzymes. PMID:17026747
Kind, Tobias; Fiehn, Oliver
2007-01-01
Background Structure elucidation of unknown small molecules by mass spectrometry is a challenge despite advances in instrumentation. The first crucial step is to obtain correct elemental compositions. In order to automatically constrain the thousands of possible candidate structures, rules need to be developed to select the most likely and chemically correct molecular formulas. Results An algorithm for filtering molecular formulas is derived from seven heuristic rules: (1) restrictions for the number of elements, (2) LEWIS and SENIOR chemical rules, (3) isotopic patterns, (4) hydrogen/carbon ratios, (5) element ratio of nitrogen, oxygen, phosphorus, and sulphur versus carbon, (6) element ratio probabilities and (7) presence of trimethylsilylated compounds. Formulas are ranked according to their isotopic patterns and subsequently constrained by presence in public chemical databases. The seven rules were developed on 68,237 existing molecular formulas and were validated in four experiments. First, 432,968 formulas covering five million PubChem database entries were checked for consistency. Only 0.6% of these compounds did not pass all rules. Next, the rules were shown to effectively reduce the full set of eight billion theoretically possible C, H, N, S, O, P-formulas up to 2000 Da to only 623 million most probable elemental compositions. Thirdly, 6,000 pharmaceutical, toxic and natural compounds were selected from the DrugBank, TSCA and DNP databases. The correct formulas were retrieved as top hit at 80–99% probability when assuming data acquisition with complete resolution of unique compounds, 5% absolute isotope ratio deviation and 3 ppm mass accuracy. Last, some exemplary compounds were analyzed by Fourier transform ion cyclotron resonance mass spectrometry and by gas chromatography-time of flight mass spectrometry. In each case, the correct formula was ranked as top hit when combining the seven rules with database queries.
Conclusion The seven rules enable an automatic exclusion of molecular formulas which are either wrong or which contain an unlikely high or low number of elements. The correct molecular formula is assigned with a probability of 98% if the formula exists in a compound database. For truly novel compounds that are not present in databases, the correct formula is found in the first three hits with a probability of 65–81%. Corresponding software and supplemental data are available for download from the authors' website. PMID:17389044
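Rules (4) and (5) amount to checking element-to-carbon ratios of a candidate formula against allowed ranges. The abstract gives no numeric thresholds, so the limits in this minimal sketch are illustrative assumptions, and the helper names are hypothetical; it is not the authors' software and it covers only the ratio checks, not the other five rules.

```python
import re

# Assumed (illustrative) lower and upper bounds for element/carbon ratios.
RATIO_LIMITS = {
    "H": (0.2, 3.1),
    "N": (0.0, 1.3),
    "O": (0.0, 1.2),
    "P": (0.0, 0.3),
    "S": (0.0, 0.8),
}


def parse_formula(formula):
    """Parse a simple formula such as 'C6H12O6' into {'C': 6, 'H': 12, 'O': 6}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def passes_ratio_rules(formula, limits=RATIO_LIMITS):
    """Return True if every element/carbon ratio lies inside its assumed range."""
    counts = parse_formula(formula)
    carbon = counts.get("C", 0)
    if carbon == 0:
        return False  # ratios versus carbon are undefined without carbon
    for element, (low, high) in limits.items():
        ratio = counts.get(element, 0) / carbon
        if not (low <= ratio <= high):
            return False
    return True


for candidate in ["C6H12O6", "C2H6O", "CH40", "C2O9H4"]:
    print(candidate, passes_ratio_rules(candidate))
```

The parser handles only flat formulas without brackets or charges, which is enough to show how a ratio filter prunes implausible candidates before ranking by isotopic pattern.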
CADDIS Volume 5. Causal Databases: Interactive Conceptual Diagrams (ICDs)
The Interactive Conceptual Diagram (ICD) section of CADDIS allows users to create conceptual model diagrams, search a literature-based evidence database, and then attach that evidence to their diagrams.
The InterAction Database includes demographic and prescription information for more than 500,000 patients in the northern and middle Netherlands and has been integrated with other systems to enhance data collection and analysis.
Biological network extraction from scientific literature: state of the art and challenges.
Li, Chen; Liakata, Maria; Rebholz-Schuhmann, Dietrich
2014-09-01
Networks of molecular interactions explain complex biological processes, and all known information on molecular events is contained in a number of public repositories including the scientific literature. Metabolic and signalling pathways are often viewed separately, even though both types are composed of interactions involving proteins and other chemical entities. It is necessary to be able to combine data from all available resources to judge the functionality, complexity and completeness of any given network overall, but especially the full integration of relevant information from the scientific literature is still an ongoing and complex task. Currently, the text-mining research community is steadily moving towards processing the full body of the scientific literature by making use of rich linguistic features such as full text parsing, to extract biological interactions. The next step will be to combine these with information from scientific databases to support hypothesis generation for the discovery of new knowledge and the extension of biological networks. The generation of comprehensive networks requires technologies such as entity grounding, coordination resolution and co-reference resolution, which are not fully solved and are required to further improve the quality of results. Here, we analyse the state of the art for the extraction of network information from the scientific literature and the evaluation of extraction methods against reference corpora, discuss challenges involved and identify directions for future research. © The Author 2013. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
Bi, Dongbin; Ning, Hao; Liu, Shuai; Que, Xinxiang; Ding, Kejia
2015-06-01
To explore molecular mechanisms of bladder cancer (BC), network strategy was used to find biomarkers for early detection and diagnosis. The differentially expressed genes (DEGs) between bladder carcinoma patients and normal subjects were screened using empirical Bayes method of the linear models for microarray data package. Co-expression networks were constructed by differentially co-expressed genes and links. Regulatory impact factors (RIF) metric was used to identify critical transcription factors (TFs). The protein-protein interaction (PPI) networks were constructed by the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) and clusters were obtained through molecular complex detection (MCODE) algorithm. Centralities analyses for complex networks were performed based on degree, stress and betweenness. Enrichment analyses were performed based on Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Co-expression networks and TFs (based on expression data of global DEGs and DEGs in different stages and grades) were identified. Hub genes of complex networks, such as UBE2C, ACTA2, FABP4, CKS2, FN1 and TOP2A, were also obtained according to analysis of degree. In gene enrichment analyses of global DEGs, cell adhesion, proteinaceous extracellular matrix and extracellular matrix structural constituent were top three GO terms. ECM-receptor interaction, focal adhesion, and cell cycle were significant pathways. Our results provide some potential underlying biomarkers of BC. However, further validation is required and deep studies are needed to elucidate the pathogenesis of BC. Copyright © 2015 Elsevier Ltd. All rights reserved.
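The hub-gene step in this abstract rests on degree centrality, which is simple to compute from an edge list. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the edge list is toy data with placeholder node labels (not genes or interactions from the study), and degree is normalised by the number of other nodes in the network.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Normalised degree for every node of an undirected network."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(nb) / (n - 1) for node, nb in neighbours.items()}


def top_hubs(edges, k=3):
    """Names of the k highest-degree nodes (ties broken alphabetically)."""
    scores = degree_centrality(edges)
    return sorted(scores, key=lambda g: (-scores[g], g))[:k]


# Toy network with placeholder node labels.
toy_edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"),
             ("G2", "G3"), ("G4", "G5")]
hubs = top_hubs(toy_edges, k=2)
```

Stress and betweenness, the other two centralities named in the abstract, require shortest-path counting and are not covered by this sketch.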
Dynamic, mechanistic, molecular-level modelling of cyanobacteria: Anabaena and nitrogen interaction.
Hellweger, Ferdi L; Fredrick, Neil D; McCarthy, Mark J; Gardner, Wayne S; Wilhelm, Steven W; Paerl, Hans W
2016-09-01
Phytoplankton (eutrophication, biogeochemical) models are important tools for ecosystem research and management, but they generally have not been updated to include modern biology. Here, we present a dynamic, mechanistic, molecular-level (i.e. gene, transcript, protein, metabolite) model of Anabaena - nitrogen interaction. The model was developed using the pattern-oriented approach to model definition and parameterization of complex agent-based models. It simulates individual filaments, each with individual cells, each with genes that are expressed to yield transcripts and proteins. Cells metabolize various forms of N, grow and divide, and differentiate heterocysts when fixed N is depleted. The model is informed by observations from 269 laboratory experiments from 55 papers published from 1942 to 2014. Within this database, we identified 331 emerging patterns, and, excluding inconsistencies in observations, the model reproduces 94% of them. To explore a practical application, we used the model to simulate nutrient reduction scenarios for a hypothetical lake. For a 50% N only loading reduction, the model predicts that N fixation increases, but this fixed N does not compensate for the loading reduction, and the chlorophyll a concentration decreases substantially (by 33%). When N is reduced along with P, the model predicts an additional 8% reduction (compared to P only). © 2016 Society for Applied Microbiology and John Wiley & Sons Ltd.
Long non-coding RNA expression profile in Cdk5-knockdown mouse skin.
Ji, Kaiyuan; Fan, Ruiwen; Zhang, Junzhen; Yang, Shanshan; Dong, Changsheng
2018-06-08
To elucidate the Cdk5 regulatory molecular mechanism in skin, we generated Cdk5-knockdown mice and subjected their skins to lncRNA sequencing. The results showed that there were 4533 novel lncRNAs from 142 lncRNA families. In total, 693 lncRNAs were significantly differentially expressed. Alignment analysis of the lncRNAs in miRBase identified 45 pre-mRNAs. By KEGG PATHWAY Database analysis, we found that lncRNAs (lnc-NONMMUT064276.2, lnc-NONMMUT075728.1, and lnc-NONMMUT039653.2) may regulate pigmentation by regulating target genes. To reveal potential antisense lncRNA-mRNA interactions, we searched all lncRNA-mRNA duplexes using RNAplex, and found that 97 lncRNAs interacted with mRNAs. Based on the bioinformatics analysis, a luciferase assay with co-transfection of pVAX1-TCONS_00049140 and pGL0-Krt80 expression plasmids in 293T cells confirmed that TCONS_00049140 bound to Krt80. Overexpression of TCONS_00049140 in mouse melanocytes down-regulated Krt80 and resulted in the phenotype of increased cell proliferation and increased melanin production. The results suggested that TCONS_00049140 contributed to skin thickening through Krt80. Our findings provide a direction for research of the molecular mechanism of Cdk5 function. Copyright © 2017. Published by Elsevier B.V.
Collisional excitation of molecules in dense interstellar clouds
NASA Technical Reports Server (NTRS)
Green, S.
1985-01-01
State transitions which permit the identification of the molecular species in dense interstellar clouds are reviewed, along with the techniques used to calculate the transition energies, the database on known molecular transitions and the accuracy of the values. The transition energies cannot be measured directly and therefore must be modeled analytically. Scattering theory is used to determine the intermolecular forces on the basis of quantum mechanics. The nuclear motions can also be modeled with classical mechanics. Sample rate constants are provided for molecular systems known to inhabit dense interstellar clouds. The values serve as a database for interpreting microwave and RF astrophysical data on the transitions undergone by interstellar molecules.
Kinase Pathway Database: An Integrated Protein-Kinase and NLP-Based Protein-Interaction Resource
Koike, Asako; Kobayashi, Yoshiyuki; Takagi, Toshihisa
2003-01-01
Protein kinases play a crucial role in the regulation of cellular functions. Various kinds of information about these molecules are important for understanding signaling pathways and organism characteristics. We have developed the Kinase Pathway Database, an integrated database involving major completely sequenced eukaryotes. It contains the classification of protein kinases and their functional conservation, ortholog tables among species, protein–protein, protein–gene, and protein–compound interaction data, domain information, and structural information. It also provides an automatic pathway graphic image interface. The protein, gene, and compound interactions are automatically extracted from abstracts for all genes and proteins by natural-language processing (NLP). The method of automatic extraction uses phrase patterns and the GENA protein, gene, and compound name dictionary, which was developed by our group. With this database, pathways are easily compared among species using data with more than 47,000 protein interactions and protein kinase ortholog tables. The database is available for querying and browsing at http://kinasedb.ontology.ims.u-tokyo.ac.jp/. PMID:12799355
Stock, Philipp; Utzig, Thomas; Valtiner, Markus
2017-02-08
In all realms of soft matter research a fundamental understanding of the structure/property relationships based on molecular interactions is crucial for developing a framework for the targeted design of soft materials. However, a molecular picture is often difficult to ascertain and yet essential for understanding the many different competing interactions at play, including entropies and cooperativities, hydration effects, and the enormous design space of soft matter. Here, we characterized for the first time the interaction between single hydrophobic molecules quantitatively using atomic force microscopy, and demonstrated that single molecular hydrophobic interaction free energies are dominated by the area of the smallest interacting hydrophobe. The interaction free energy amounts to 3-4 kT per hydrophobic unit. Also, we find that the transition state of the hydrophobic interactions is located at 3 Å with respect to the ground state, based on Bell-Evans theory. Our results provide a new path for understanding the nature of hydrophobic interactions at the single molecular scale. Our approach enables us to systematically vary hydrophobic and any other interaction type by utilizing peptide chemistry providing a strategic advancement to unravel molecular surface and soft matter interactions at the single molecular scale.
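For readers more used to molar units, the 3-4 kT figure quoted above can be converted with RT, since kT per molecule corresponds to RT per mole. The snippet is a small worked conversion; the temperature of 298.15 K is an assumption made for the example (the abstract does not state the experimental temperature), and the function name is hypothetical.

```python
# Molar gas constant in J/(mol K) (exact in the 2019 SI).
R = 8.314462618


def kt_to_kj_per_mol(n_kt, temperature_k=298.15):
    """Convert an energy given in units of kT to kJ/mol.

    The default temperature is an ASSUMED room-temperature value.
    """
    return n_kt * R * temperature_k / 1000.0


low = kt_to_kj_per_mol(3)   # lower bound of the reported range
high = kt_to_kj_per_mol(4)  # upper bound of the reported range
```

At the assumed temperature, 1 kT is about 2.48 kJ/mol, so the reported range corresponds to roughly 7.4 to 9.9 kJ/mol per hydrophobic unit.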
The Reactome Pathway Knowledgebase
Jupe, Steven; Matthews, Lisa; Sidiropoulos, Konstantinos; Gillespie, Marc; Garapati, Phani; Haw, Robin; Jassal, Bijay; Korninger, Florian; May, Bruce; Milacic, Marija; Roca, Corina Duenas; Rothfels, Karen; Sevilla, Cristoffer; Shamovsky, Veronica; Shorser, Solomon; Varusai, Thawfeek; Viteri, Guilherme; Weiser, Joel
2018-01-01
Abstract The Reactome Knowledgebase (https://reactome.org) provides molecular details of signal transduction, transport, DNA replication, metabolism, and other cellular processes as an ordered network of molecular transformations—an extended version of a classic metabolic map, in a single consistent data model. Reactome functions both as an archive of biological processes and as a tool for discovering unexpected functional relationships in data such as gene expression profiles or somatic mutation catalogues from tumor cells. To support the continued brisk growth in the size and complexity of Reactome, we have implemented a graph database, improved performance of data analysis tools, and designed new data structures and strategies to boost diagram viewer performance. To make our website more accessible to human users, we have improved pathway display and navigation by implementing interactive Enhanced High Level Diagrams (EHLDs) with an associated icon library, and subpathway highlighting and zooming, in a simplified and reorganized web site with adaptive design. To encourage re-use of our content, we have enabled export of pathway diagrams as ‘PowerPoint’ files. PMID:29145629
Bioactive focus in conformational ensembles: a pluralistic approach
NASA Astrophysics Data System (ADS)
Habgood, Matthew
2017-12-01
Computational generation of conformational ensembles is key to contemporary drug design. Selecting the members of the ensemble that will approximate the conformation most likely to bind to a desired target (the bioactive conformation) is difficult, given that the potential energy usually used to generate and rank the ensemble is a notoriously poor discriminator between bioactive and non-bioactive conformations. In this study an approach to generating a focused ensemble is proposed in which each conformation is assigned multiple rankings based not just on potential energy but also on solvation energy, hydrophobic or hydrophilic interaction energy, radius of gyration, and on a statistical potential derived from Cambridge Structural Database data. The best ranked structures derived from each system are then assembled into a new ensemble that is shown to be better focused on bioactive conformations. This pluralistic approach is tested on ensembles generated by the Molecular Operating Environment's Low Mode Molecular Dynamics module, and by the Cambridge Crystallographic Data Centre's conformation generator software.
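The selection idea described here, ranking each conformation under several criteria and pooling the best of each ranking, can be sketched as follows. This is an illustrative reading of the abstract, not the published implementation: the function name, the dictionary keys and the numbers are hypothetical, and the sketch assumes that a lower value is better for every criterion.

```python
def focused_ensemble(conformers, criteria, top_n=2):
    """Pool the top_n conformers from each per-criterion ranking.

    Assumes lower values are better for every criterion. Returns
    conformer ids in order of first selection, without duplicates.
    """
    selected = []
    seen = set()
    for name in criteria:
        ranked = sorted(conformers, key=lambda c: c[name])
        for conf in ranked[:top_n]:
            if conf["id"] not in seen:
                seen.add(conf["id"])
                selected.append(conf["id"])
    return selected


# Toy ensemble; all values are made up for illustration.
toy = [
    {"id": "c1", "potential": 1.0, "solvation": 9.0, "rgyr": 5.0},
    {"id": "c2", "potential": 2.0, "solvation": 8.0, "rgyr": 4.0},
    {"id": "c3", "potential": 3.0, "solvation": 1.0, "rgyr": 6.0},
    {"id": "c4", "potential": 4.0, "solvation": 2.0, "rgyr": 3.0},
]
picked = focused_ensemble(toy, ["potential", "solvation", "rgyr"], top_n=1)
```

A ranking on potential energy alone would keep only `c1`; pooling across criteria also retains `c3` and `c4`, which is the sense in which the approach is pluralistic.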
Li, Rui-Juan; Wang, Ya-Li; Wang, Qing-He; Wang, Jian; Cheng, Mao-Sheng
2015-01-01
Inosine 5′-monophosphate dehydrogenase (IMPDH) is one of the crucial enzymes in the de novo biosynthesis of guanosine nucleotides. It has served as an attractive target in immunosuppressive, anticancer, antiviral, and antiparasitic therapeutic strategies. In this study, pharmacophore mapping and molecular docking approaches were employed to discover novel Homo sapiens IMPDH (hIMPDH) inhibitors. The Güner-Henry (GH) scoring method was used to evaluate the quality of generated pharmacophore hypotheses. One of the generated pharmacophore hypotheses was found to possess a GH score of 0.67. Ten potential compounds were selected from the ZINC database using a pharmacophore mapping approach and docked into the IMPDH active site. We find two hits (i.e., ZINC02090792 and ZINC00048033) that match well the optimal pharmacophore features used in this investigation, and it is found that they form interactions with key residues of IMPDH. We propose that these two hits are lead compounds for the development of novel hIMPDH inhibitors. PMID:25784957
Computational Study on New Natural Compound Inhibitors of Pyruvate Dehydrogenase Kinases
Zhou, Xiaoli; Yu, Shanshan; Su, Jing; Sun, Liankun
2016-01-01
Pyruvate dehydrogenase kinases (PDKs) are key enzymes in glucose metabolism, negatively regulating pyruvate dehydrogenase complex (PDC) activity through phosphorylation. Inhibiting PDKs could upregulate PDC activity and drive cells into more aerobic metabolism. Therefore, PDKs are potential targets for metabolism related diseases, such as cancers and diabetes. In this study, a series of computer-aided virtual screening techniques were utilized to discover potential inhibitors of PDKs. Structure-based screening using Libdock was carried out, followed by ADME (absorption, distribution, metabolism, excretion) and toxicity prediction. Molecular docking was used to analyze the binding mechanism between these compounds and PDKs. Molecular dynamic simulation was utilized to confirm the stability of potential compound binding. From the computational results, two novel natural coumarin compounds (ZINC12296427 and ZINC12389251) from the ZINC database were found to bind to PDKs with favorable interaction energy and were predicted to be non-toxic. Our study provides valuable information on PDK-coumarin binding mechanisms for PDK inhibitor-based drug discovery. PMID:26959013
Emmenegger, E.J.; Kentop, E.; Thompson, T.M.; Pittam, S.; Ryan, A.; Keon, D.; Carlino, J.A.; Ranson, J.; Life, R.B.; Troyer, R.M.; Garver, K.A.; Kurath, G.
2011-01-01
The AquaPathogen X database is a template for recording information on individual isolates of aquatic pathogens and is freely available for download (http://wfrc.usgs.gov). This database can accommodate the nucleotide sequence data generated in molecular epidemiological studies along with the myriad of abiotic and biotic traits associated with isolates of various pathogens (e.g. viruses, parasites and bacteria) from multiple aquatic animal host species (e.g. fish, shellfish and shrimp). The cataloguing of isolates from different aquatic pathogens simultaneously is a unique feature of the AquaPathogen X database, which can be used in surveillance of emerging aquatic animal diseases and elucidation of key risk factors associated with pathogen incursions into new water systems. An application of the template database that stores the epidemiological profiles of fish virus isolates, called Fish ViroTrak, was also developed. Exported records for two aquatic rhabdovirus species emerging in North America were used in the implementation of two separate web-accessible databases: the Molecular Epidemiology of Aquatic Pathogens infectious haematopoietic necrosis virus (MEAP-IHNV) database (http://gis.nacse.org/ihnv/) released in 2006 and the MEAP-viral haemorrhagic septicaemia virus (http://gis.nacse.org/vhsv/) database released in 2010.
Mukhopadhyay, Anirban; Maulik, Ujjwal; Bandyopadhyay, Sanghamitra
2012-01-01
Identification of potential viral-host protein interactions is a vital and useful approach towards development of new drugs targeting those interactions. In recent days, computational tools are being utilized for predicting viral-host interactions. Recently a database containing records of experimentally validated interactions between a set of HIV-1 proteins and a set of human proteins has been published. The problem of predicting new interactions based on this database is usually posed as a classification problem. However, posing the problem as a classification one suffers from the lack of biologically validated negative interactions. Therefore it will be beneficial to use the existing database for predicting new viral-host interactions without the need of negative samples. Motivated by this, in this article, the HIV-1–human protein interaction database has been analyzed using association rule mining. The main objective is to identify a set of association rules both among the HIV-1 proteins and among the human proteins, and use these rules for predicting new interactions. In this regard, a novel association rule mining technique based on biclustering has been proposed for discovering frequent closed itemsets followed by the association rules from the adjacency matrix of the HIV-1–human interaction network. Novel HIV-1–human interactions have been predicted based on the discovered association rules and tested for biological significance. For validation of the predicted new interactions, gene ontology-based and pathway-based studies have been performed. These studies show that the human proteins which are predicted to interact with a particular viral protein share many common biological activities. Moreover, literature survey has been used for validation purpose to identify some predicted interactions that are already validated experimentally but not present in the database. Comparison with other prediction methods is also discussed. PMID:22539940
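The support and confidence measures behind association rules can be illustrated on a toy interaction table. Note that this sketch deliberately swaps in a brute-force enumeration of item pairs; it is not the biclustering-based closed-itemset method proposed in the article. Each transaction stands for one human protein and holds the set of viral proteins it interacts with; the labels `v1`..`v3`, the thresholds and the function name are all hypothetical.

```python
from itertools import combinations


def pair_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return rules (antecedent, consequent, support, confidence)
    for single-item antecedents and consequents."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    count = {i: sum(1 for t in transactions if i in t) for i in items}
    rules = []
    for a, b in combinations(items, 2):
        both = sum(1 for t in transactions if a in t and b in t)
        support = both / n
        if support < min_support:
            continue
        # Test both directions of the rule.
        for x, y in ((a, b), (b, a)):
            confidence = both / count[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return rules


# One set per (hypothetical) human protein: its viral interaction partners.
toy_rows = [{"v1", "v2"}, {"v1", "v2"}, {"v1"}, {"v2", "v3"}]
rules = pair_rules(toy_rows)
```

A rule such as `v1 -> v2` suggests, for a human protein known to interact with `v1` but not yet with `v2`, a candidate interaction to test; no negative examples are needed to derive it, which is the motivation given in the abstract.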
Protein-protein interaction network of gene expression in the hydrocortisone-treated keloid.
Chen, Rui; Zhang, Zhiliang; Xue, Zhujia; Wang, Lin; Fu, Mingang; Lu, Yi; Bai, Ling; Zhang, Ping; Fan, Zhihong
2015-01-01
In order to explore the molecular mechanism of hydrocortisone in keloid tissue, the gene expression profiles of keloid samples treated with hydrocortisone were subjected to bioinformatics analysis. Firstly, the gene expression profiles (GSE7890) of five samples of keloid treated with hydrocortisone and five untreated keloid samples were downloaded from the Gene Expression Omnibus (GEO) database. Secondly, data were preprocessed using packages in R language and differentially expressed genes (DEGs) were screened using a significance analysis of microarrays (SAM) protocol. Thirdly, the DEGs were subjected to gene ontology (GO) function and KEGG pathway enrichment analysis. Finally, the interactions of DEGs in samples of keloid treated with hydrocortisone were explored in a human protein-protein interaction (PPI) network, and sub-modules of the DEGs interaction network were analyzed using Cytoscape software. Based on the analysis, 572 DEGs in the hydrocortisone-treated samples were screened; most of these were involved in the signal transduction and cell cycle. Furthermore, three critical genes in the module, including COL1A1, NID1, and PRELP, were screened in the PPI network analysis. These findings enhance understanding of the pathogenesis of the keloid and provide references for keloid therapy. © 2015 The International Society of Dermatology.
A review on computational systems biology of pathogen–host interactions
Durmuş, Saliha; Çakır, Tunahan; Özgür, Arzucan; Guthke, Reinhard
2015-01-01
Pathogens manipulate the cellular mechanisms of host organisms via pathogen–host interactions (PHIs) in order to take advantage of the capabilities of host cells, leading to infections. The crucial role of these interspecies molecular interactions in initiating and sustaining infections necessitates a thorough understanding of the corresponding mechanisms. Unlike the traditional approach of considering the host or pathogen separately, a systems-level approach, considering the PHI system as a whole is indispensable to elucidate the mechanisms of infection. Following the technological advances in the post-genomic era, PHI data have been produced in large-scale within the last decade. Systems biology-based methods for the inference and analysis of PHI regulatory, metabolic, and protein–protein networks to shed light on infection mechanisms are gaining increasing demand thanks to the availability of omics data. The knowledge derived from the PHIs may largely contribute to the identification of new and more efficient therapeutics to prevent or cure infections. There are recent efforts for the detailed documentation of these experimentally verified PHI data through Web-based databases. Despite these advances in data archiving, there are still large amounts of PHI data in the biomedical literature yet to be discovered, and novel text mining methods are in development to unearth such hidden data. Here, we review a collection of recent studies on computational systems biology of PHIs with a special focus on the methods for the inference and analysis of PHI networks, covering also the Web-based databases and text-mining efforts to unravel the data hidden in the literature. PMID:25914674
Virtual screening using the ligand ZINC database for novel lipoxygenase-3 inhibitors.
Monika; Kour, Janmeet; Singh, Kulwinder
2013-01-01
The leukotrienes constitute a group of arachidonic acid-derived compounds with biologic activities suggesting important roles in inflammation and immediate hypersensitivity. Epidermis-type lipoxygenase-3 (ALOXE3), a distinct subclass within the multigene family of mammalian lipoxygenases, is a novel isoenzyme involved in the metabolism of leukotrienes and plays a very important role in skin barrier functions. Lipoxygenase selective inhibitors such as azelastine and zileuton are currently used to reduce inflammatory response. Nausea, pharyngolaryngeal pain, headache, nasal burning and somnolence are the most frequently reported adverse effects of these drugs. Therefore, there is still a need to develop more potent lipoxygenase inhibitors. In this paper, we report the screening of various compounds from the ZINC database (which contains over 21 million compounds) using the Molegro Virtual Docker software against the ALOXE3 protein. Screening was performed using the molecular constraints tool to filter compounds with physico-chemical properties similar to the 1N8Q bound ligand protocatechuic acid. The analysis resulted in 4319 Lipinski-compliant hits, which were docked and scored to identify structurally novel ligands that make similar interactions to those of known ligands or may have different interactions with other parts of the binding site. Our screening approach identified four molecules, ZINC84299674, ZINC76643455, ZINC84299122 and ZINC75626957, with MolDock scores of -128.901, -120.22, -116.873 and -102.116 kcal/mol, respectively. Their energy scores were better than that of the 1N8Q bound co-crystallized ligand protocatechuic acid (MolDock score of -77.225 kcal/mol). All the ligands were docked within the binding pocket, forming interactions with amino acid residues.
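The Lipinski filter mentioned in this abstract can be written as a short check over four precomputed descriptors. The sketch below uses the standard rule-of-five limits (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors and 10 acceptors). The compound names and descriptor values are made up, the function names are hypothetical, and whether the study tolerated any violation is not stated in the abstract, so `max_violations` is left as a parameter.

```python
def lipinski_violations(mw, logp, hbd, hba):
    """Count rule-of-five violations for one compound."""
    return sum([mw > 500, logp > 5, hbd > 5, hba > 10])


def lipinski_filter(compounds, max_violations=0):
    """Keep the names of compounds with at most max_violations violations.

    compounds maps a name to a dict with keys mw, logp, hbd, hba.
    """
    return [name for name, p in compounds.items()
            if lipinski_violations(p["mw"], p["logp"], p["hbd"], p["hba"])
            <= max_violations]


# Hypothetical library entries; values are illustrative only.
library = {
    "cmpd_a": {"mw": 350.0, "logp": 2.1, "hbd": 2, "hba": 5},
    "cmpd_b": {"mw": 650.0, "logp": 6.0, "hbd": 2, "hba": 5},
    "cmpd_c": {"mw": 480.0, "logp": 5.4, "hbd": 1, "hba": 8},
}
strict = lipinski_filter(library)
lenient = lipinski_filter(library, max_violations=1)
```

In practice the descriptors would come from a cheminformatics toolkit; the compounds that pass would then go on to docking and scoring.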
Role of Chemical Reactivity and Transition State Modeling for Virtual Screening.
Karthikeyan, Muthukumarasamy; Vyas, Renu; Tambe, Sanjeev S; Radhamohan, Deepthi; Kulkarni, Bhaskar D
2015-01-01
Every drug discovery research program involves synthesis of a novel and potential drug molecule utilizing atom efficient, economical and environment friendly synthetic strategies. The current work focuses on the role of the reactivity based fingerprints of compounds as filters for virtual screening using the tool ChemScore. A reactant-like (RLS) and a product-like (PLS) score can be predicted for a given compound using the binary fingerprints derived from the numerous known organic reactions which capture the molecule-molecule interactions in the form of addition, substitution, rearrangement, elimination and isomerization reactions. The reaction fingerprints were applied to large databases in biology and chemistry, namely ChEMBL, KEGG, HMDB, DSSTox, and the Drug Bank database. A large network of 1113 synthetic reactions was constructed to visualize and ascertain the reactant product mappings in the chemical reaction space. The cumulative reaction fingerprints were computed for 4000 molecules belonging to 29 therapeutic classes of compounds, and these were found capable of discriminating between the cognition disorder related and anti-allergy compounds with reasonable accuracy of 75% and AUC 0.8. In this study, the transition state based fingerprints were also developed and used effectively for virtual screening in drug related databases. The methodology presented here provides an efficient handle for the rapid scoring of molecular libraries for virtual screening.
Yuan T. Lee and Molecular Beam Studies
The Influence of Yuan T. Lee, Journal of Chemical Physics, Volume 125, Issue 13, pp. 132302-132302-19
Anusuya, Shanmugam; Gromiha, M Michael
2017-10-01
Dengue is an important public health problem in tropical and subtropical regions of the world. Neither a vaccine nor an antiviral medication is available to treat dengue, which underscores the need for drug discovery against the disease. In order to find a potent lead molecule, RNA-dependent RNA polymerase, which is essential for dengue viral replication, was chosen as the drug target. As quercetin has shown antiviral activity against several viruses, quercetin derivatives developed by combinatorial library synthesis and mined from PubChem databases were screened for a potent anti-dengue viral agent. Our study predicted Quercetin 3-(6″-(E)-p-coumaroylsophoroside)-7-rhamnoside as a dengue polymerase inhibitor. The results were validated by molecular dynamics simulation studies, which reveal water bridges and hydrogen bonds as major contributors to the stability of the polymerase-lead complex. Interactions formed by this compound with residues Trp795, Arg792 and Glu351 are found to be essential for the stability of the polymerase-lead complex. Our study identifies Quercetin 3-(6″-(E)-p-coumaroylsophoroside)-7-rhamnoside as a potent non-nucleoside inhibitor of dengue polymerase.
Reinharz, Vladimir; Soulé, Antoine; Westhof, Eric; Waldispühl, Jérôme; Denise, Alain
2018-05-04
The wealth of the combinatorics of nucleotide base pairs enables RNA molecules to assemble into sophisticated interaction networks, which are used to create complex 3D substructures. These interaction networks are essential to shape the 3D architecture of the molecule, and also to provide the key elements to carry molecular functions such as protein or ligand binding. They are made of organised sets of long-range tertiary interactions which connect distinct secondary structure elements in 3D structures. Here, we present a de novo data-driven approach to extract automatically from large data sets of full RNA 3D structures the recurrent interaction networks (RINs). Our methodology enables us for the first time to detect the interaction networks connecting distinct components of the RNA structure, highlighting their diversity and conservation through non-related functional RNAs. We use a graphical model to perform pairwise comparisons of all RNA structures available and to extract RINs and modules. Our analysis yields a complete catalog of RNA 3D structures available in the Protein Data Bank and reveals the intricate hierarchical organization of the RNA interaction networks and modules. We assembled our results in an online database (http://carnaval.lri.fr) which will be regularly updated. Within the site, a tool allows users with a novel RNA structure to detect automatically whether the novel structure contains previously observed RINs.
Zhao, Y; Gran, B; Pinilla, C; Markovic-Plese, S; Hemmer, B; Tzou, A; Whitney, L W; Biddison, W E; Martin, R; Simon, R
2001-08-15
The interaction of TCRs with MHC peptide ligands can be highly flexible, so that many different peptides are recognized by the same TCR in the context of a single restriction element. We provide a quantitative description of such interactions, which allows the identification of T cell epitopes and molecular mimics. The response of T cell clones to positional scanning synthetic combinatorial libraries is analyzed with a mathematical approach that is based on a model of independent contribution of individual amino acids to peptide Ag recognition. This biometric analysis compares the information derived from these libraries composed of trillions of decapeptides with all the millions of decapeptides contained in a protein database to rank and predict the most stimulatory peptides for a given T cell clone. We demonstrate the predictive power of the novel strategy and show that, together with gene expression profiling by cDNA microarrays, it leads to the identification of novel candidate autoantigens in the inflammatory autoimmune disease, multiple sclerosis.
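The independent-contribution model described above lends itself to a position-specific scoring matrix: each residue contributes a value that depends only on its position, and a peptide's score combines those values. The sketch below is an assumption-laden illustration, not the authors' biometric analysis: it combines contributions by simple addition, uses 3-residue windows instead of decapeptides to keep the example readable, and all matrix values, sequences and function names are invented.

```python
def score_peptide(peptide, matrix):
    """Sum the per-position contributions of each residue.

    matrix is a list with one {amino_acid: score} dict per position;
    residues missing from a position contribute 0.0.
    """
    return sum(matrix[pos].get(aa, 0.0) for pos, aa in enumerate(peptide))


def rank_peptides(sequence, matrix):
    """Score every window of len(matrix) residues in a protein sequence
    and return the distinct windows, best first (ties alphabetical)."""
    k = len(matrix)
    windows = {sequence[i:i + k] for i in range(len(sequence) - k + 1)}
    return sorted(windows, key=lambda p: (-score_peptide(p, matrix), p))


# Toy 3-position matrix, as if derived from library screening data.
toy_matrix = [
    {"A": 1.0, "G": 0.2},
    {"L": 2.0},
    {"K": 0.5, "A": 0.1},
]
ranking = rank_peptides("GALKAL", toy_matrix)
```

Applied to every protein in a sequence database, the same scan yields a ranked list of candidate stimulatory peptides for one T cell clone, which is the use the abstract describes.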
Krishna Raja, M; Ghosh, Asit Ranjan; Vino, S; Sajitha Lulu, S
2015-01-01
Features of heat-labile enterotoxins of Escherichia coli which make them fit to use as novel receptors for antidiarrheals are not completely explored. A data set of 14 different serovars of enterotoxigenic Escherichia coli producing heat-labile toxins was taken from the NCBI GenBank database and used in the study. Sequence analysis showed mutations in different subunits and also at their interface residues. As these toxins lack crystallographic structures, homology modeling using Modeller 9.11 was used to obtain structural approximations of the heat-labile toxins produced by E. coli. Interaction of the modeled toxin subunits with proanthocyanidin, an antidiarrheal, showed several strong hydrogen bonding interactions at the cost of minimized energy. The hits were subsequently characterized by molecular dynamics simulation studies to monitor their binding stabilities. This study looks into a novel space where the ligand can choose the receptor preference not as a whole but as an individual subunit. Mutation at interface residues and interaction among subunits, along with the binding of ligand to individual subunits, would help to design a non-toxic labile toxin and also to improve the therapeutics.
Lee, A Yeong; Park, Won; Kang, Tae-Wook; Cha, Min Ho; Chun, Jin Mi
2018-07-15
Yijin-Tang (YJT) is a traditional prescription for the treatment of hyperlipidaemia, atherosclerosis and other ailments related to dampness phlegm, a typical pathological symptom of abnormal body fluid metabolism in Traditional Korean Medicine. However, a holistic network pharmacology approach to understanding the therapeutic mechanisms underlying hyperlipidaemia and atherosclerosis has not been pursued. To examine the network pharmacological potential effects of YJT on hyperlipidaemia and atherosclerosis, we analysed components, performed target prediction and network analysis, and investigated interacting pathways using a network pharmacology approach. Information on compounds in herbal medicines was obtained from public databases, and oral bioavailability and drug-likeness was screened using absorption, distribution, metabolism, and excretion (ADME) criteria. Correlations between compounds and genes were linked using the STITCH database, and genes related to hyperlipidaemia and atherosclerosis were gathered using the GeneCards database. Human genes were identified and subjected to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. Network analysis identified 447 compounds in five herbal medicines that were subjected to ADME screening, and 21 compounds and 57 genes formed the main pathways linked to hyperlipidaemia and atherosclerosis. Among them, 10 compounds (naringenin, nobiletin, hesperidin, galangin, glycyrrhizin, homogentisic acid, stigmasterol, 6-gingerol, quercetin and glabridin) were linked to more than four genes, and are bioactive compounds and key chemicals. Core genes in this network were CASP3, CYP1A1, CYP1A2, MMP2 and MMP9. The compound-target gene network revealed close interactions between multiple components and multiple targets, and facilitates a better understanding of the potential therapeutic effects of YJT. 
Pharmacological network analysis can help to explain the potential effects of YJT for treating dampness phlegm-related diseases such as hyperlipidaemia and atherosclerosis. Copyright © 2018 Elsevier B.V. All rights reserved.
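The selection of key compounds described above (compounds linked to more than four genes) amounts to counting distinct neighbours in a bipartite compound-gene network. The sketch below illustrates that counting step only. The edge list uses placeholder names rather than real STITCH links, and the function name is hypothetical.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def compounds_with_min_targets(
    edges: Iterable[Tuple[str, str]], min_genes: int = 5
) -> List[str]:
    """Return compounds linked to at least `min_genes` distinct genes.

    "More than four genes" corresponds to min_genes=5.
    """
    targets = defaultdict(set)
    for compound, gene in edges:
        targets[compound].add(gene)  # a set ignores duplicate edges
    return sorted(c for c, genes in targets.items() if len(genes) >= min_genes)


# Placeholder edges for illustration; not real compound-gene associations.
toy_edges = [
    ("compound_A", "gene_1"), ("compound_A", "gene_2"), ("compound_A", "gene_3"),
    ("compound_A", "gene_4"), ("compound_A", "gene_5"),
    ("compound_B", "gene_1"), ("compound_B", "gene_2"), ("compound_B", "gene_2"),
]
key_compounds = compounds_with_min_targets(toy_edges)
```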
2011-01-01
Background Renewed interest in plant × environment interactions has risen in the post-genomic era. In this context, high-throughput phenotyping platforms have been developed to create reproducible environmental scenarios in which the phenotypic responses of multiple genotypes can be analysed in a reproducible way. These platforms benefit hugely from the development of suitable databases for storage, sharing and analysis of the large amount of data collected. In the model plant Arabidopsis thaliana, most databases available to the scientific community contain data related to genetic and molecular biology and are characterised by an inadequacy in the description of plant developmental stages and experimental metadata such as environmental conditions. Our goal was to develop a comprehensive information system for sharing of the data collected in PHENOPSIS, an automated platform for Arabidopsis thaliana phenotyping, with the scientific community. Description PHENOPSIS DB is a publicly available (URL: http://bioweb.supagro.inra.fr/phenopsis/) information system developed for storage, browsing and sharing of online data generated by the PHENOPSIS platform and offline data collected by experimenters and experimental metadata. It provides modules coupled to a Web interface for (i) the visualisation of environmental data of an experiment, (ii) the visualisation and statistical analysis of phenotypic data, and (iii) the analysis of Arabidopsis thaliana plant images. Conclusions Firstly, data stored in the PHENOPSIS DB are of interest to the Arabidopsis thaliana community, particularly in allowing phenotypic meta-analyses directly linked to environmental conditions on which publications are still scarce. Secondly, data or image analysis modules can be downloaded from the Web interface for direct usage or as the basis for modifications according to new requirements. 
Finally, the structure of PHENOPSIS DB provides a useful template for the development of other similar databases related to genotype × environment interactions. PMID:21554668
Newt-omics: a comprehensive repository for omics data from the newt Notophthalmus viridescens
Bruckskotten, Marc; Looso, Mario; Reinhardt, Richard; Braun, Thomas; Borchardt, Thilo
2012-01-01
Notophthalmus viridescens, a member of the salamander family, is an excellent model organism to study regenerative processes due to its unique ability to replace lost appendages and to repair internal organs. Molecular insights into regenerative events have been severely hampered by the lack of genomic, transcriptomic and proteomic data, as well as an appropriate database to store such novel information. Here, we describe ‘Newt-omics’ (http://newt-omics.mpi-bn.mpg.de), a database that enables researchers to locate, retrieve and store data sets dedicated to the molecular characterization of newts. Newt-omics is a transcript-centred database, based on an Expressed Sequence Tag (EST) data set from the newt, covering ∼50 000 Sanger sequenced transcripts and a set of high-density microarray data, generated from regenerating hearts. Newt-omics also contains a large set of peptides identified by mass spectrometry, which was used to validate 13 810 ESTs as true protein coding. Newt-omics is open to implement additional high-throughput data sets without changing the database structure. Via a user-friendly interface, Newt-omics allows access to a huge set of molecular data without the need for prior bioinformatical expertise. PMID:22039101
Sana, Theodore R; Roark, Joseph C; Li, Xiangdong; Waddell, Keith; Fischer, Steven M
2008-09-01
In an effort to simplify and streamline compound identification from metabolomics data generated by liquid chromatography time-of-flight mass spectrometry, we have created software for constructing Personalized Metabolite Databases with content from over 15,000 compounds pulled from the public METLIN database (http://metlin.scripps.edu/). Moreover, we have added extra functionalities to the database that (a) permit the addition of user-defined retention times as an orthogonal searchable parameter to complement accurate mass data; and (b) allow interfacing to separate software, a Molecular Formula Generator (MFG), that facilitates reliable interpretation of any database matches from the accurate mass spectral data. To test the utility of this identification strategy, we added retention times to a subset of masses in this database, representing a mixture of 78 synthetic urine standards. The synthetic mixture was analyzed and screened against this METLIN urine database, resulting in 46 accurate mass and retention time matches. Human urine samples were subsequently analyzed under the same analytical conditions and screened against this database. A total of 1387 ions were detected in human urine; 16 of these ions matched both accurate mass and retention time parameters for the 78 urine standards in the database. Another 374 had only an accurate mass match to the database, with 163 of those masses also having the highest MFG score. Furthermore, MFG calculated a formula for a further 849 ions that had no match to the database. Taken together, these results suggest that the METLIN Personal Metabolite database and MFG software offer a robust strategy for confirming the formula of database matches. In the event of no database match, it also suggests possible formulas that may be helpful in interpreting the experimental results.
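The two-tier screening described above (accurate mass alone, or accurate mass plus retention time) can be sketched as a tolerance search. This is an illustrative sketch, not the software from the study. The ppm and retention-time tolerances, the tuple layout and all entries are assumptions chosen for the example.

```python
from typing import List, Optional, Tuple

# (name, monoisotopic mass, optional user-defined retention time in minutes)
DbEntry = Tuple[str, float, Optional[float]]


def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match_ion(
    ion_mass: float,
    ion_rt: float,
    database: List[DbEntry],
    ppm_tol: float = 10.0,   # assumed tolerance
    rt_tol: float = 0.2,     # assumed tolerance, in minutes
) -> List[Tuple[str, str]]:
    """Return (name, match_type) for entries within the mass tolerance.

    match_type is "mass+rt" when the entry has a retention time that also
    agrees, otherwise "mass".
    """
    hits = []
    for name, mass, rt in database:
        if abs(ppm_error(ion_mass, mass)) > ppm_tol:
            continue
        rt_ok = rt is not None and abs(ion_rt - rt) <= rt_tol
        hits.append((name, "mass+rt" if rt_ok else "mass"))
    return hits


# Placeholder entries for illustration only.
toy_db: List[DbEntry] = [
    ("standard_1", 180.0634, 1.20),
    ("entry_2", 180.0634, None),
    ("entry_3", 200.0000, 1.20),
]
hits = match_ion(180.0640, 1.25, toy_db)
```

Retention time acts here as the orthogonal parameter: two entries with the same mass are separated only by whether a stored retention time agrees with the observed one.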
Nascimento, Leandro Costa; Salazar, Marcela Mendes; Lepikson-Neto, Jorge; Camargo, Eduardo Leal Oliveira; Parreiras, Lucas Salera; Carazzolle, Marcelo Falsarella
2017-01-01
Abstract Tree species of the genus Eucalyptus are the most valuable and widely planted hardwoods in the world. Given the economic importance of Eucalyptus trees, much effort has been made towards the generation of specimens with superior forestry properties that can deliver high-quality feedstocks, customized to the industry's needs for both cellulosic (paper) and lignocellulosic biomass production. In line with these efforts, large sets of molecular data have been generated by several scientific groups, providing invaluable information that can be applied in the development of improved specimens. In order to fully explore the potential of available datasets, the development of a public database that provides integrated access to genomic and transcriptomic data from Eucalyptus is needed. EUCANEXT is a database that analyses and integrates publicly available Eucalyptus molecular data, such as the E. grandis genome assembly and predicted genes, ESTs from several species and digital gene expression from 26 RNA-Seq libraries. The database has been implemented on a Fedora Linux machine running MySQL and Apache, while Perl CGI was used for the web interfaces. EUCANEXT provides a user-friendly web interface for easy access and analysis of publicly available molecular data from Eucalyptus species. This integrated database allows for complex searches by gene name, keyword or sequence similarity and is publicly accessible at http://www.lge.ibi.unicamp.br/eucalyptusdb. Through EUCANEXT, users can perform complex analyses to identify genes related to traits of interest using RNA-Seq libraries and tools for differential expression analysis. Moreover, the entire bioinformatics pipeline described here, including the database schema and Perl scripts, is readily available and can be applied to any genomic and transcriptomic project, regardless of the organism. Database URL: http://www.lge.ibi.unicamp.br/eucalyptusdb PMID:29220468
Validated MicroRNA Target Databases: An Evaluation.
Lee, Yun Ji Diana; Kim, Veronica; Muth, Dillon C; Witwer, Kenneth W
2015-11-01
Preclinical Research Positive findings from preclinical and clinical studies involving depletion or supplementation of microRNA (miRNA) engender optimism about miRNA-based therapeutics. However, off-target effects must be considered. Predicting these effects is complicated. Each miRNA may target many gene transcripts, and the rules governing imperfectly complementary miRNA:target interactions are incompletely understood. Several databases provide lists of the relatively small number of experimentally confirmed miRNA:target pairs. Although incomplete, this information might allow assessment of at least some of the off-target effects. We evaluated the performance of four databases of experimentally validated miRNA:target interactions (miRWalk 2.0, miRTarBase, miRecords, and TarBase 7.0) using a list of 50 alphabetically consecutive genes. We examined the provided citations to determine the degree to which each interaction was experimentally supported. To assess stability, we tested at the beginning and end of a five-month period. Results varied widely by database. Two of the databases changed significantly over the course of 5 months. Most reported evidence for miRNA:target interactions was indirect or otherwise weak, and relatively few interactions were supported by more than one publication. Some returned results appear to arise from simplistic text searches that offer no insight into the relationship of the search terms, may not even include the reported gene or miRNA, and may thus be invalid. We conclude that validation databases provide important information, but not all information in all extant databases is up-to-date or accurate. Nevertheless, the more comprehensive validation databases may provide useful starting points for investigation of off-target effects of proposed small RNA therapies. © 2015 Wiley Periodicals, Inc.
RAIN: RNA–protein Association and Interaction Networks
Junge, Alexander; Refsgaard, Jan C.; Garde, Christian; Pan, Xiaoyong; Santos, Alberto; Alkan, Ferhat; Anthon, Christian; von Mering, Christian; Workman, Christopher T.; Jensen, Lars Juhl; Gorodkin, Jan
2017-01-01
Protein association networks can be inferred from a range of resources including experimental data, literature mining and computational predictions. These types of evidence are emerging for non-coding RNAs (ncRNAs) as well. However, integration of ncRNAs into protein association networks is challenging due to data heterogeneity. Here, we present a database of ncRNA–RNA and ncRNA–protein interactions and its integration with the STRING database of protein–protein interactions. These ncRNA associations cover four organisms and have been established from curated examples, experimental data, interaction predictions and automatic literature mining. RAIN uses an integrative scoring scheme to assign a confidence score to each interaction. We demonstrate that RAIN outperforms the underlying microRNA-target predictions in inferring ncRNA interactions. RAIN can be operated through an easily accessible web interface and all interaction data can be downloaded. Database URL: http://rth.dk/resources/rain PMID:28077569
Potential Energy Surface Database of Group II Dimer
National Institute of Standards and Technology Data Gateway
SRD 143 NIST Potential Energy Surface Database of Group II Dimer (Web, free access) This database provides critical atomic and molecular data needed in order to evaluate the feasibility of using laser cooled and trapped Group II atomic species (Mg, Ca, Sr, and Ba) for ultra-precise optical clocks or quantum information processing devices.
Sharma, Amit K; Gohel, Sangeeta; Singh, Satya P
2012-01-01
Actinobase is a relational database of molecular diversity, phylogeny and biocatalytic potential of haloalkaliphilic actinomycetes. The main objective of this database is to provide easy access to a range of information, data storage, comparison and analysis, while reducing data redundancy and the costs of data entry, storage and retrieval, and improving data security. Information related to habitat, cell morphology, Gram reaction, biochemical characterization and molecular features would help researchers understand the identification and stress adaptation of existing and new candidates belonging to salt tolerant alkaliphilic actinomycetes. The PHP front end helps to add nucleotide and protein sequences of reported entries, which directly helps researchers to obtain the required details. Analysis of the genus-wise status of the salt tolerant alkaliphilic actinomycetes indicated 6 different genera among the 40 classified entries. The results represented widespread occurrence of salt tolerant alkaliphilic actinomycetes belonging to diverse taxonomic positions. Entries and information related to actinomycetes in the database are publicly accessible at http://www.actinobase.in. On ClustalW/X multiple sequence alignment of the alkaline protease gene sequences, different clusters emerged among the groups. The narrow search and limit options of the constructed database provided comparable information. The user-friendly PHP front end facilitates the addition of sequences of reported entries. The database is available for free at http://www.actinobase.in.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fox, P.B.; Yatabe, M.
1987-01-01
In this report the Nuclear Criticality Safety Analytical Methods Resource Center describes a new interactive version of CESAR, a critical experiments storage and retrieval program available on the Nuclear Criticality Information System (NCIS) database at Lawrence Livermore National Laboratory. The original version of CESAR did not include interactive search capabilities. The CESAR database was developed to provide a convenient, readily accessible means of storing and retrieving code input data for the SCALE Criticality Safety Analytical Sequences and the codes comprising those sequences. The database includes data for both cross section preparation and criticality safety calculations. 3 refs., 1 tab.
Morphinome Database - The database of proteins altered by morphine administration - An update.
Bodzon-Kulakowska, Anna; Padrtova, Tereza; Drabik, Anna; Ner-Kluza, Joanna; Antolak, Anna; Kulakowski, Konrad; Suder, Piotr
2018-04-13
Morphine is considered a gold standard in pain treatment. Nevertheless, its use could be associated with severe side effects, including drug addiction. Thus, it is very important to understand the molecular mechanism of morphine action in order to develop new methods of pain therapy, or at least to attenuate the side effects of opioids usage. Proteomics allows for the indication of proteins involved in certain biological processes, but the number of items identified in a single study is usually overwhelming. Thus, researchers face the difficult problem of choosing the proteins which are really important for the investigated processes and worth further studies. Therefore, based on the 29 published articles, we created a database of proteins regulated by morphine administration - The Morphinome Database (addiction-proteomics.org). This web tool allows for indicating proteins that were identified during different proteomics studies. Moreover, the collection and organization of such a vast amount of data allows us to find the same proteins that were identified in various studies and to create their ranking, based on the frequency of their identification. STRING and KEGG databases indicated metabolic pathways which those molecules are involved in. This means that those molecular pathways seem to be strongly affected by morphine administration and could be important targets for further investigations. The data about proteins identified by different proteomics studies of molecular changes caused by morphine administration (29 published articles) were gathered in the Morphinome Database. Unification of those data allowed for the identification of proteins that were indicated several times by distinct proteomics studies, which means that they seem to be very well verified and important for the entire process. Those proteins might be now considered promising aims for more detailed studies of their role in the molecular mechanism of morphine action. Copyright © 2018. 
Published by Elsevier B.V.
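The ranking described above, in which proteins are ordered by how often they were identified across studies, can be sketched as a simple frequency count. The snippet is illustrative only. The study and protein identifiers are placeholders, and the function name is hypothetical rather than part of the Morphinome Database.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def rank_by_study_frequency(
    studies: Dict[str, Iterable[str]]
) -> List[Tuple[str, int]]:
    """Rank proteins by the number of studies that reported them.

    Each study counts a protein at most once; ties are ordered by name.
    """
    counts = Counter()
    for proteins in studies.values():
        counts.update(set(proteins))
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Placeholder identifiers for illustration only.
toy_studies = {
    "study_1": ["protein_X", "protein_Y", "protein_X"],
    "study_2": ["protein_X", "protein_Z"],
    "study_3": ["protein_Y", "protein_X"],
}
protein_ranking = rank_by_study_frequency(toy_studies)
```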
A Viral-Human Interactome Based on Structural Motif-Domain Interactions Captures the Human Infectome
Guo, Xianwu; Rodríguez-Pérez, Mario A.
2013-01-01
Protein interactions between a pathogen and its host are fundamental in the establishment of the pathogen and underlie the infection mechanism. In the present work, we developed a single predictive model for building a host-viral interactome based on the identification of structural descriptors from motif-domain interactions of protein complexes deposited in the Protein Data Bank (PDB). The structural descriptors were used to search a database of protein sequences of human and five clinically important viruses; viral and human proteins sharing a descriptor were then predicted as interacting proteins. The analysis of the host-viral interactome allowed us to identify a set of new interactions that further explain the molecular mechanisms associated with viral infections, and showed that the interactome was able to capture human proteins already associated with viral infections (human infectome) and non-infectious diseases (human diseasome). The analysis of human proteins targeted by viral proteins in the context of a human interactome showed that their neighbors are enriched in proteins reported with differential expression under infection and disease conditions. It is expected that the findings of this work will contribute to the development of systems biology for infectious diseases, and help guide the rational identification and prioritization of novel drug targets. PMID:23951184
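The prediction rule stated above, that a viral and a human protein sharing a structural descriptor are predicted to interact, can be sketched as a set-overlap test. This is a simplified illustration. How descriptors are derived from PDB complexes is not covered, and every name in the example is a placeholder.

```python
from itertools import product
from typing import Dict, List, Set, Tuple


def predict_pairs(
    viral: Dict[str, Set[str]], human: Dict[str, Set[str]]
) -> List[Tuple[str, str]]:
    """Predict (viral, human) protein pairs that share at least one descriptor.

    Each input maps a protein name to the set of descriptors found in it.
    """
    pairs = set()
    for (v_name, v_desc), (h_name, h_desc) in product(viral.items(), human.items()):
        if v_desc & h_desc:  # non-empty intersection means a shared descriptor
            pairs.add((v_name, h_name))
    return sorted(pairs)


# Placeholder proteins and descriptors for illustration only.
toy_viral = {"viral_1": {"d1", "d2"}, "viral_2": {"d9"}}
toy_human = {"human_1": {"d2"}, "human_2": {"d3"}, "human_3": {"d1", "d9"}}
predicted = predict_pairs(toy_viral, toy_human)
```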
Oral cancer databases: A comprehensive review.
Sarode, Gargi S; Sarode, Sachin C; Maniyar, Nikunj; Anand, Rahul; Patil, Shankargouda
2017-11-29
A cancer database is a systematic collection and analysis of information on various human cancers at genomic and molecular level that can be utilized to understand various steps in carcinogenesis and for therapeutic advancement in cancer field. Oral cancer is one of the leading causes of morbidity and mortality all over the world. The current research efforts in this field are aimed at cancer etiology and therapy. Advanced genomic technologies including microarrays, proteomics, transcriptomics, and gene sequencing development have culminated in generation of extensive data and subjection of several genes and microRNAs that are distinctively expressed and this information is stored in the form of various databases. Extensive data from various resources have brought the need for collaboration and data sharing to make effective use of this new knowledge. The current review provides comprehensive information of various publicly accessible databases that contain information pertinent to oral squamous cell carcinoma (OSCC) and databases designed exclusively for OSCC. The databases discussed in this paper are Protein-Coding Gene Databases and microRNA Databases. This paper also describes gene overlap in various databases, which will help researchers to reduce redundancy and focus on only those genes that are common to more than one database. We hope such introduction will promote awareness and facilitate the usage of these resources in the cancer research community, and researchers can explore the molecular mechanisms involved in the development of cancer, which can help in subsequent crafting of therapeutic strategies. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
MetaBar - a tool for consistent contextual data acquisition and standards compliant submission.
Hankeln, Wolfgang; Buttigieg, Pier Luigi; Fink, Dennis; Kottmann, Renzo; Yilmaz, Pelin; Glöckner, Frank Oliver
2010-06-30
Environmental sequence datasets are increasing at an exponential rate; however, the vast majority of them lack appropriate descriptors like sampling location, time and depth/altitude: generally referred to as metadata or contextual data. The consistent capture and structured submission of these data is crucial for integrated data analysis and ecosystems modeling. The application MetaBar has been developed, to support consistent contextual data acquisition. MetaBar is a spreadsheet and web-based software tool designed to assist users in the consistent acquisition, electronic storage, and submission of contextual data associated to their samples. A preconfigured Microsoft Excel spreadsheet is used to initiate structured contextual data storage in the field or laboratory. Each sample is given a unique identifier and at any stage the sheets can be uploaded to the MetaBar database server. To label samples, identifiers can be printed as barcodes. An intuitive web interface provides quick access to the contextual data in the MetaBar database as well as user and project management capabilities. Export functions facilitate contextual and sequence data submission to the International Nucleotide Sequence Database Collaboration (INSDC), comprising of the DNA DataBase of Japan (DDBJ), the European Molecular Biology Laboratory database (EMBL) and GenBank. MetaBar requests and stores contextual data in compliance to the Genomic Standards Consortium specifications. The MetaBar open source code base for local installation is available under the GNU General Public License version 3 (GNU GPL3). The MetaBar software supports the typical workflow from data acquisition and field-sampling to contextual data enriched sequence submission to an INSDC database. The integration with the megx.net marine Ecological Genomics database and portal facilitates georeferenced data integration and metadata-based comparisons of sampling sites as well as interactive data visualization. 
The ample export functionalities and the INSDC submission support enable exchange of data across disciplines and safeguarding contextual data.
MetaBar - a tool for consistent contextual data acquisition and standards compliant submission
2010-01-01
Background Environmental sequence datasets are increasing at an exponential rate; however, the vast majority of them lack appropriate descriptors like sampling location, time and depth/altitude: generally referred to as metadata or contextual data. The consistent capture and structured submission of these data is crucial for integrated data analysis and ecosystems modeling. The application MetaBar has been developed, to support consistent contextual data acquisition. Results MetaBar is a spreadsheet and web-based software tool designed to assist users in the consistent acquisition, electronic storage, and submission of contextual data associated to their samples. A preconfigured Microsoft® Excel® spreadsheet is used to initiate structured contextual data storage in the field or laboratory. Each sample is given a unique identifier and at any stage the sheets can be uploaded to the MetaBar database server. To label samples, identifiers can be printed as barcodes. An intuitive web interface provides quick access to the contextual data in the MetaBar database as well as user and project management capabilities. Export functions facilitate contextual and sequence data submission to the International Nucleotide Sequence Database Collaboration (INSDC), comprising of the DNA DataBase of Japan (DDBJ), the European Molecular Biology Laboratory database (EMBL) and GenBank. MetaBar requests and stores contextual data in compliance to the Genomic Standards Consortium specifications. The MetaBar open source code base for local installation is available under the GNU General Public License version 3 (GNU GPL3). Conclusion The MetaBar software supports the typical workflow from data acquisition and field-sampling to contextual data enriched sequence submission to an INSDC database. 
The integration with the megx.net marine Ecological Genomics database and portal facilitates georeferenced data integration and metadata-based comparisons of sampling sites as well as interactive data visualization. The ample export functionalities and the INSDC submission support enable exchange of data across disciplines and safeguarding contextual data. PMID:20591175
Pathak, Rajesh K.; Baunthiyal, Mamta; Shukla, Rohit; Pandey, Dinesh; Taj, Gohar; Kumar, Anil
2017-01-01
Alternaria brassicae and Alternaria brassicicola are two major phytopathogenic fungi which cause Alternaria blight, a recalcitrant disease on Brassica crops throughout the world, which is highly destructive and responsible for significant yield losses. Since no resistant source is available against Alternaria blight, efforts have been made in the present study to identify defense inducer molecules which can induce jasmonic acid (JA) mediated defense against the disease. It is believed that a JA triggered defense response will prevent the necrotrophic mode of colonization of the Alternaria brassicae fungus. The JA receptor, COI1, is one of the potential targets for triggering JA mediated immunity through interaction with the JA signal. In the present study, a few mimicking compounds more efficient than naturally occurring JA in terms of interaction with COI1 were identified through virtual screening and molecular dynamics simulation studies. A high quality structural model of COI1 was developed using the protein sequence of Brassica rapa. This was followed by virtual screening of 767 analogs of JA from the ZINC database for interaction with COI1. Two analogs, viz. ZINC27640214 and ZINC43772052, showed more binding affinity with COI1 as compared to naturally occurring JA. Molecular dynamics simulation of COI1 and the COI1-JA complex, as well as of the best screened structural analogs of JA interacting with COI1, was done for 50 ns to validate the stability of the system. It was found that ZINC27640214 possesses efficient, stable, and good cell permeability properties. Based on the obtained results and its physicochemical properties, it is capable of mimicking JA signaling and may be used as a defense inducer for triggering JA mediated resistance against Alternaria blight, only after further validation through field trials. PMID:28487711
Chen, Kuan Chen; Lu, Richard; Iqbal, Usman; Hsu, Ko-Ching; Chen, Bi-Li; Nguyen, Phung-Anh; Yang, Hsuan-Chia; Huang, Chih-Wei; Li, Yu-Chuan Jack; Jian, Wen-Shan; Tsai, Shin-Han
2015-12-01
Drug-drug interactions have long been an active research area in clinical medicine. In Taiwan, however, the widespread use of traditional Chinese medicines (TCM) adds complexity to the topic. Therefore, it is important to examine the interactions between traditional Chinese and western medicines. The objectives were (1) to create a comprehensive database of multi-herb/western drug interactions indexed according to the ways in which physicians actually practice and (2) to measure this database's impact on the detection of adverse effects between traditional Chinese medicine compounds and western medicines. First, a multi-herb/western medicine drug interactions database was created by separating each TCM compound into its constituent herbs. Each individual herb was then checked against an existing single-herb/western drug interactions database. The data source was the National Health Insurance research database, which spans the years 1998-2011. This study estimated the interaction prevalence rate and further separated the rates according to patient characteristics, distribution by county, and hospital accreditation levels. Finally, this new database was integrated into a computer order entry module of the electronic medical records system of a regional teaching hospital. Its effects were measured for two months. The most commonly interacting Chinese herbs were Ephedrae Herba and Angelicae Sinensis Radix/Angelicae Dahuricae Radix. Ephedrae Herba contains active ingredients similar to ephedrine. Fifteen kinds of traditional Chinese medicine compounds contain Ephedrae Herba. Angelicae Sinensis Radix and Angelicae Dahuricae Radix contain ingredients similar to coumarin, a blood thinner. Nine kinds of traditional Chinese medicine compounds contained Angelicae Sinensis Radix/Angelicae Dahuricae Radix. In the period from 1998 to 2011, the prevalence of herb-drug interactions related to Ephedrae Herba was 0.18%. 
The most commonly prescribed traditional Chinese compounds were MA SHING GAN SHYR TANG (23.1%), followed by SHEAU CHING LONG TANG (15.5%) and DINQ CHUAN TANG (13.2%). The prevalence of herb-drug interactions related to Angelicae Sinensis Radix, Angelicae Dahuricae Radix was 4.59%. The most common traditional Chinese compound formula were TSANG EEL SAAN (32%), followed by HUOH SHIANG JENQ CHIH SAAN (31.4%) and SHY WUH TANG (10.7%). Once the multi-herb drug interaction database was deployed in a hospital system, there were 480 prescriptions that indicated a TCM-western drug interaction. Physicians were alerted 24 times during two months. These alerts resulted in a prescription change four times (16.7%). Due to the unique cultural factors that have resulted in widespread acceptance of both western and traditional Chinese medicine, Taiwan stands well positioned to report on the prevalence of interactions between western drugs and traditional Chinese medicine and devise ways to reduce their incidence. This study built a multi-herb/western drug interactions database, embedded inside a hospital clinical information system, and then examined the effects that drug interaction alerts had on clinician prescribing behaviour. The results demonstrated that western drug/traditional Chinese medicine interactions are prevalent and that western-trained physicians tend to change their prescribing behaviour more than traditional Chinese medicine physicians in their response to medication interaction alerts. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
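The database construction described in this abstract, separating each compound formula into its constituent herbs and checking each herb against a single-herb/western drug interaction table, can be sketched as a two-level lookup. The sketch below is an assumption-laden illustration. All formula, herb and drug names are placeholders, and the data structures are hypothetical rather than those of the hospital system.

```python
from typing import Dict, Iterable, List, Set, Tuple


def formula_interactions(
    formula: str,
    prescribed_drugs: Iterable[str],
    formula_herbs: Dict[str, List[str]],
    herb_drug_db: Dict[str, Set[str]],
) -> List[Tuple[str, str]]:
    """Return (herb, drug) pairs that should raise an alert.

    The formula is expanded into its herbs, and each herb is checked
    against the single-herb interaction table.
    """
    drugs = list(prescribed_drugs)
    alerts = []
    for herb in formula_herbs.get(formula, []):
        interacting = herb_drug_db.get(herb, set())
        for drug in drugs:
            if drug in interacting:
                alerts.append((herb, drug))
    return alerts


# Placeholder data for illustration only; not clinical information.
toy_formula_herbs = {"formula_X": ["herb_A", "herb_B", "herb_C"]}
toy_herb_drug_db = {"herb_A": {"drug_1"}, "herb_C": {"drug_1", "drug_2"}}
alerts = formula_interactions(
    "formula_X", ["drug_1", "drug_3"], toy_formula_herbs, toy_herb_drug_db
)
```

The reported alert response rate follows directly from the counts in the abstract: 4 prescription changes out of 24 alerts is 4/24 ≈ 16.7%.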
Gao, Li; Zhang, Li-Jie; Li, Sheng-Hua; Wei, Li-Li; Luo, Bin; He, Rong-Quan; Xia, Shuang
2018-03-06
MiR-452-5p has been reported to be down-regulated in prostate cancer, affecting the development of this type of cancer. However, the molecular mechanism of miR-452-5p in prostate cancer remains unclear. Therefore, we investigated the network of target genes of miR-452-5p in prostate cancer using bioinformatics analyses. We first analyzed the expression profiles and prognostic value of miR-452-5p in prostate cancer tissues from a public database. Gene Ontology (GO), the Kyoto Encyclopedia of Genes and Genomes (KEGG), PANTHER pathway analyses, and a disease ontology (DG) analysis were performed to find the molecular functions of the target genes from GSE datasets and miRWalk. Finally, we validated hub genes from the protein-protein interaction (PPI) networks of the target genes in the Human Protein Atlas (HPA) database and Gene Expression Profiling Interactive Analysis (GEPIA). Narrowing down the optimal target genes was conducted by seeking the common parts of up-regulated genes from GEPIA, down-regulated genes from GSE datasets, and predicted genes in miRWalk. Based on mining of GEO and ArrayExpress microarray chips and miRNA-Seq data in the TCGA database, which includes 1007 prostate cancer samples and 387 non-cancer samples, miR-452-5p is shown to be down-regulated in prostate cancer. GO, KEGG, and PANTHER pathway analyses suggested that the target genes might participate in important biological processes, such as transforming growth factor beta signaling and the positive regulation of brown fat cell differentiation and mesenchymal cell differentiation, as well as the Ras signaling pathway and pathways regulating the pluripotency of stem cells and arrhythmogenic right ventricular cardiomyopathy (ARVC). Nine genes-GABBR, PNISR, NTSR1, DOCK1, EREG, SFRP1, PTGS2, LEF1, and BMP2-were defined as hub genes in the PPI network. Three genes-FAM174B, SLC30A4, and SLIT1-were jointly shared by GEPIA, the GSE datasets, and miRWalk. 
Down-regulated miR-452-5p might play an essential role in the tumorigenesis of prostate cancer. Copyright © 2018. Published by Elsevier GmbH.
Chen, Yuting; Cassone, Bryan J.; Bai, Xiaodong; Redinbaugh, Margaret G.; Michel, Andrew P.
2012-01-01
Background Leafhoppers (Hemiptera: Cicadellidae) are plant-phloem feeders that are known for their ability to vector plant pathogens. The black-faced leafhopper (Graminella nigrifrons) has been identified as the only known vector for the Maize fine streak virus (MFSV), an emerging plant pathogen in the Rhabdoviridae. Within G. nigrifrons populations, individuals can be experimentally separated into three classes based on their capacity for viral transmission: transmitters, acquirers and non-acquirers. Understanding the molecular interactions between vector and virus can reveal important insights in virus immune defense and vector transmission. Results RNA sequencing (RNA-Seq) was performed to characterize the transcriptome of G. nigrifrons. A total of 38,240 ESTs of a minimum 100 bp were generated from two separate cDNA libraries consisting of virus transmitters and acquirers. More than 60% of known D. melanogaster, A. gambiae, T. castaneum immune response genes mapped to our G. nigrifrons EST database. Real time quantitative PCR (RT-qPCR) showed significant down-regulation of three genes for peptidoglycan recognition proteins (PGRP – SB1, SD, and LC) in G. nigrifrons transmitters versus control leafhoppers. Conclusions Our study is the first to characterize the transcriptome of a leafhopper vector species. Significant sequence similarity in immune defense genes existed between G. nigrifrons and other well characterized insects. The down-regulation of PGRPs in MFSV transmitters suggested a possible role in rhabdovirus transmission. The results provide a framework for future studies aimed at elucidating the molecular mechanisms of plant virus vector competence. PMID:22808205
Significance of genome-wide association studies in molecular anthropology.
Gupta, Vipin; Khadgawat, Rajesh; Sachdeva, Mohinder Pal
2009-12-01
The successful advent of a genome-wide approach in association studies raises the hopes of human geneticists for solving the genetic maze of complex traits, especially disorders. This approach, which is replete with the application of cutting-edge technology and supported by big science projects (such as the Human Genome Project and, even more importantly, the International HapMap Project) and various important databases (SNP database, CNV database, etc.), has had unprecedented success in rapidly uncovering many of the genetic determinants of complex disorders. The magnitude of this approach in the genetics of classical anthropological variables like height, skin color, eye color, and other genome diversity projects has certainly expanded the horizons of molecular anthropology. Therefore, in this article we propose a genome-wide association approach in molecular anthropological studies, drawing lessons from the exemplary study of the Wellcome Trust Case Control Consortium. We also highlight the importance and uniqueness of Indian population groups in facilitating the design and finding optimum solutions for other genome-wide association-related challenges.
Geometric database maintenance using CCTV cameras and overlay graphics
NASA Astrophysics Data System (ADS)
Oxenberg, Sheldon C.; Landell, B. Patrick; Kan, Edwin
1988-01-01
An interactive graphics system using closed circuit television (CCTV) cameras for remote verification and maintenance of a geometric world model database has been demonstrated in GE's telerobotics testbed. The database provides geometric models and locations of objects viewed by CCTV cameras and manipulated by telerobots. To update the database, an operator uses the interactive graphics system to superimpose a wireframe line drawing of an object with known dimensions on a live video scene containing that object. The methodology used is multipoint positioning to easily superimpose a wireframe graphic on the CCTV image of an object in the work scene. An enhanced version of GE's interactive graphics system will provide the object designation function for the operator control station of the Jet Propulsion Laboratory's telerobot demonstration system.
Mouse IDGenes: a reference database for genetic interactions in the developing mouse brain
Matthes, Michaela; Preusse, Martin; Zhang, Jingzhong; Schechter, Julia; Mayer, Daniela; Lentes, Bernd; Theis, Fabian; Prakash, Nilima; Wurst, Wolfgang; Trümbach, Dietrich
2014-01-01
The study of developmental processes in the mouse and other vertebrates includes the understanding of patterning along the anterior–posterior, dorsal–ventral and medial–lateral axes. Specifically, neural development is also of great clinical relevance because several human neuropsychiatric disorders such as schizophrenia, autism disorders or drug addiction and also brain malformations are thought to have neurodevelopmental origins, i.e. pathogenesis initiates during childhood and adolescence. Impacts during early neurodevelopment might also predispose to late-onset neurodegenerative disorders, such as Parkinson’s disease. The neural tube develops from its precursor tissue, the neural plate, in a patterning process that is determined by compartmentalization into morphogenetic units, the action of local signaling centers and a well-defined and locally restricted expression of genes and their interactions. While public databases provide gene expression data with spatio-temporal resolution, they usually neglect the genetic interactions that govern neural development. Here, we introduce Mouse IDGenes, a reference database for genetic interactions in the developing mouse brain. The database is highly curated and offers detailed information about gene expressions and the genetic interactions at the developing mid-/hindbrain boundary. To showcase the predictive power of interaction data, we infer new Wnt/β-catenin target genes by machine learning and validate one of them experimentally. The database is updated regularly. Moreover, it can easily be extended by the research community. Mouse IDGenes will contribute as an important resource to the research on mouse brain development, not exclusively by offering data retrieval, but also by allowing data input. Database URL: http://mouseidgenes.helmholtz-muenchen.de. PMID:25145340
Narad, Priyanka; Kumar, Abhishek; Chakraborty, Amlan; Patni, Pranav; Sengupta, Abhishek; Wadhwa, Gulshan; Upadhyaya, K C
2017-09-01
Transcription factors are trans-acting proteins that interact with specific nucleotide sequences known as transcription factor binding sites (TFBSs), and these interactions are implicated in the regulation of gene expression. Regulation of transcriptional activation of a gene often involves multiple interactions of transcription factors with various sequence elements. Identification of these sequence elements is the first step in understanding the underlying molecular mechanism(s) that regulate gene expression. For in silico identification of these sequence elements, we have developed an online computational tool named transcription factor information system (TFIS) for detecting TFBSs, for the first time using a collection of Java programs; it is mainly based on TFBS detection using position weight matrices (PWMs). The position frequency matrices (PFMs) are obtained from JASPAR and HOCOMOCO, which are open-access databases of transcription factor binding profiles. Pseudo-counts are used while converting PFMs to PWMs, and TFBS detection is carried out on the basis of a percent score taken as the threshold value. TFIS is equipped with advanced features such as direct sequence retrieval from the NCBI database using a gene identification number or accession number, detection of binding sites for a common TF in a batch of gene sequences, and TFBS detection after generating a PWM from known raw binding sequences, in addition to general detection methods. TFIS can detect the presence of potential TFBSs in both directions at the same time, which increases its efficiency. The results of this dual detection are presented in different colors specific to the orientation of the binding site. 
Results obtained by TFIS are more detailed and specific to the detected TFs because the result pages integrate informative links to various related web servers such as Gene Ontology, the PAZAR database and the Transcription Factor Encyclopedia, in addition to NCBI and UniProt. Common TFs of the amyloid beta precursor gene, such as SP1, AP1 and NF-κB, are easily detected using TFIS along with multiple binding sites. In another scenario, the embryonic developmental process, TFs of the FOX family (FOXL1 and FOXC1) were also identified. TFIS is platform-independent and publicly available, along with its support and documentation, at http://tfistool.appspot.com and http://www.bioinfoplus.com/tfis/. TFIS is licensed under the GNU General Public License, version 3 (GPL-3.0).
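The PFM-to-PWM conversion and dual-strand scan described in this abstract can be illustrated with a minimal Python sketch. It is not the TFIS implementation (which is written in Java): the pseudo-count value, the uniform background of 0.25, the min-max definition of the percent score, and all function names are assumptions made for illustration only.

```python
import math

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def pfm_to_pwm(pfm, pseudocount=0.8, background=0.25):
    """Convert a position frequency matrix (base -> counts per position)
    into a log-odds position weight matrix. The pseudo-count keeps
    zero counts from producing log(0). Values here are assumed defaults."""
    width = len(pfm["A"])
    pwm = []
    for i in range(width):
        total = sum(pfm[b][i] for b in BASES) + pseudocount
        column = {}
        for b in BASES:
            freq = (pfm[b][i] + pseudocount * background) / total
            column[b] = math.log2(freq / background)
        pwm.append(column)
    return pwm


def percent_score(pwm, site):
    """Scale the raw log-odds score to 0..100 between the worst and best
    possible scores for this matrix (one common convention, assumed here)."""
    raw = sum(col[base] for col, base in zip(pwm, site))
    best = sum(max(col.values()) for col in pwm)
    worst = sum(min(col.values()) for col in pwm)
    return 100.0 * (raw - worst) / (best - worst)


def scan(pwm, sequence, threshold=80.0):
    """Scan both orientations; report hits as
    (forward-strand start, strand, matched site, percent score)."""
    width = len(pwm)
    reverse = sequence.translate(COMPLEMENT)[::-1]
    hits = []
    for strand, seq in (("+", sequence), ("-", reverse)):
        for start in range(len(seq) - width + 1):
            site = seq[start:start + width]
            score = percent_score(pwm, site)
            if score >= threshold:
                # Map reverse-strand offsets back to forward coordinates.
                pos = start if strand == "+" else len(sequence) - start - width
                hits.append((pos, strand, site, round(score, 1)))
    return hits


# Toy matrix whose consensus is TGAC (made-up counts, not a JASPAR profile).
toy_pfm = {"A": [0, 0, 10, 0], "C": [0, 0, 0, 10],
           "G": [0, 10, 0, 0], "T": [10, 0, 0, 0]}
toy_pwm = pfm_to_pwm(toy_pfm)
hits = scan(toy_pwm, "AATGACGGGTCAAA", threshold=90.0)
```

In the toy sequence the consensus occurs once on the forward strand and once as its reverse complement (GTCA), so the scan reports one hit per orientation.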
A RESTful application programming interface for the PubMLST molecular typing and genome databases
Bray, James E.; Maiden, Martin C. J.
2017-01-01
Abstract Molecular typing is used to differentiate microorganisms at the subspecies or strain level for epidemiological investigations, infection control, public health and environmental sampling. DNA sequence-based typing methods require authoritative databases that link sequence variants to nomenclature in order to facilitate communication and comparison of identified types in national or global settings. The PubMLST website (https://pubmlst.org/) fulfils this role for over a hundred microorganisms for which it hosts curated molecular sequence typing data, providing sequence and allelic profile definitions for multi-locus sequence typing (MLST) and single-gene typing approaches. In recent years, these have expanded to cover the whole genome with schemes such as core genome MLST (cgMLST) and whole genome MLST (wgMLST) which catalogue the allelic diversity found in hundreds to thousands of genes. These approaches provide a common nomenclature for high-resolution strain characterization and comparison. Molecular typing information is linked to isolate provenance, phenotype, and increasingly genome assemblies, providing a resource for outbreak investigation and research into population structure, gene association, global epidemiology and vaccine coverage. A Representational State Transfer (REST) Application Programming Interface (API) has been developed for the PubMLST website to make these large quantities of structured molecular typing and whole genome sequence data available for programmatic access by any third party application. The API is an integral component of the Bacterial Isolate Genome Sequence Database (BIGSdb) platform that is used to host PubMLST resources, and exposes all public data within the site. In addition to data browsing, searching and download, the API supports authentication and submission of new data to curator queues. Database URL: http://rest.pubmlst.org/ PMID:29220452
Molecular Identification and Databases in Fusarium
USDA-ARS's Scientific Manuscript database
DNA sequence-based methods for identifying pathogenic and mycotoxigenic Fusarium isolates have become the gold standard worldwide. Moreover, fusarial DNA sequence data are increasing rapidly in several web-accessible databases for comparative purposes. Unfortunately, the use of Basic Alignment Sea...
Environmental and Molecular Science Laboratory Arrow
DOE Office of Scientific and Technical Information (OSTI.GOV)
2016-06-24
Arrows is a software package that combines NWChem, SQL and NoSQL databases, email, and social networks (e.g. Twitter, Tumblr) to simplify molecular and materials modeling and make these modeling capabilities accessible to all scientists and engineers. EMSL Arrows is very simple to use. The user just emails chemical reactions to arrows@emsl.pnnl.gov and then an email is sent back with thermodynamic, reaction pathway (kinetic), spectroscopy, and other results. EMSL Arrows parses the email and then searches the database for the compounds in the reactions. If a compound isn't there, an NWChem calculation is set up and submitted to calculate it. Once the calculation is finished, the results are entered into the database and then emailed back.
GPCR & company: databases and servers for GPCRs and interacting partners.
Kowalsman, Noga; Niv, Masha Y
2014-01-01
G-protein-coupled receptors (GPCRs) are a large superfamily of membrane receptors that are involved in a wide range of signaling pathways. To fulfill their tasks, GPCRs interact with a variety of partners, including small molecules, lipids and proteins. They are accompanied by different proteins during all phases of their life cycle. Therefore, GPCR interactions with their partners are of great interest in basic cell-signaling research and in drug discovery. Due to the rapid development of computers and internet communication, knowledge and data can be easily shared within the worldwide research community via freely available databases and servers. These provide an abundance of biological, chemical and pharmacological information. This chapter describes the available web resources for investigating GPCR interactions. We review about 40 freely available databases and servers, and provide a few sentences about the essence and the data they supply. For simplification, the databases and servers were grouped under the following topics: general GPCR-ligand interactions; particular families of GPCRs and their ligands; GPCR oligomerization; GPCR interactions with intracellular partners; and structural information on GPCRs. In conclusion, a multitude of useful tools are currently available. Summary tables are provided to ease navigation between the numerous and partially overlapping resources. Suggestions for future enhancements of the online tools include the addition of links from general to specialized databases and enabling usage of user-supplied template for GPCR structural modeling.
Ganai, Shabir Ahmad; Abdullah, Ehsaan; Rashid, Romana; Altaf, Mohammad
2017-01-01
Histone deacetylases (HDACs) regulate epigenetic gene expression programs by modulating chromatin architecture and are required for neuronal development. Dysregulation of HDACs and aberrant chromatin acetylation homeostasis have been implicated in various diseases ranging from cancer to neurodegenerative disorders. Histone deacetylase inhibitors (HDACi), small molecules that interfere with HDACs, have been shown to enhance acetylation of the genome and are gaining great attention as potent drugs for treating cancer and neurodegeneration. HDAC2 overexpression has been implicated in decreasing dendritic spine density and synaptic plasticity and in triggering neurodegenerative signaling. Pharmacological intervention against HDAC2, though promising, also targets the neuroprotective HDAC1 because of the high sequence identity (94%) between the two in the catalytic domain, culminating in debilitating off-target effects and hindering the intended intervention. This emphasizes the need to design HDAC2-selective inhibitors to overcome these adverse effects and to increase therapeutic efficacy. Here we report a top-down combinatorial in silico approach for identifying the structural variants that are substantial for interactions against HDAC1 and HDAC2 enzymes. We used extra-precision (XP) molecular docking and Molecular Mechanics Generalized Born Surface Area (MM-GBSA) calculations to predict the affinity of inhibitors for the HDAC1 and HDAC2 enzymes. Importantly, we employed a novel in silico strategy of coupling state-of-the-art molecular dynamics simulation (MDS) to the energetically optimized structure-based pharmacophores (e-Pharmacophores) method via MDS trajectory clustering for hypothesizing the e-Pharmacophore models. Further, we performed e-Pharmacophore-based virtual screening against the Phase database containing millions of compounds. We validated the data by performing molecular docking and MM-GBSA studies for selected hits among those retrieved. 
Our studies attributed inhibitor potency to the ability to form multiple interactions, and weak potency to few interactions. Moreover, our studies showed that a single HDAC inhibitor displays different features against the HDAC1 and HDAC2 enzymes. The high-affinity, selective HDAC2 inhibitors retrieved through e-Pharmacophore-based virtual screening will play a critical role in ameliorating neurodegenerative signaling without hampering the neuroprotective isoform (HDAC1). PMID:29170627
Ma, Yazhen; Xu, Ting; Wan, Dongshi; Ma, Tao; Shi, Sheng; Liu, Jianquan; Hu, Quanjun
2015-03-17
Soil salinity is a significant factor that impairs plant growth and agricultural productivity, and numerous efforts are underway to enhance salt tolerance of economically important plants. Populus species are widely cultivated for diverse uses. In particular, they grow in different habitats, from salty soil to mesophytic environments, and are therefore used as a model genus for elucidating physiological and molecular mechanisms of stress tolerance in woody plants. The Salinity Tolerant Poplar Database (STPD) is an integrative database for salt-tolerant poplar genome biology. Currently the STPD contains the Populus euphratica genome and its related genetic resources. P. euphratica, with its preference for salty habitats, has become a valuable genetic resource for the exploitation of tolerance characteristics in trees. This database contains curated data including genomic sequence, genes and gene functional information, non-coding RNA sequences, transposable elements, simple sequence repeats and single nucleotide polymorphism information for P. euphratica, gene expression data for P. euphratica and Populus tomentosa, and whole-genome alignments between Populus trichocarpa, P. euphratica and Salix suchowensis. The STPD provides useful searching and data mining tools, including the GBrowse genome browser, BLAST servers and a genome alignment viewer, which can be used to browse genome regions, identify similar sequences and visualize genome alignments. Datasets within the STPD can also be downloaded to perform local searches. A new Salinity Tolerant Poplar Database has been developed to assist studies of salt tolerance in trees and poplar genomics. The database will be continuously updated to incorporate new genome-wide data of related poplar species. This database will serve as an infrastructure for research on the molecular function of genes, comparative genomics, and evolution in closely related species, and will promote advances in molecular breeding within Populus. 
The STPD can be accessed at http://me.lzu.edu.cn/stpd/ .
Genome-Wide Identification of Molecular Mimicry Candidates in Parasites
Ludin, Philipp; Nilsson, Daniel; Mäser, Pascal
2011-01-01
Among the many strategies employed by parasites for immune evasion and host manipulation, one of the most fascinating is molecular mimicry. With genome sequences available for host and parasite, mimicry of linear amino acid epitopes can be investigated by comparative genomics. Here we developed an in silico pipeline for genome-wide identification of molecular mimicry candidate proteins or epitopes. The predicted proteome of a given parasite was broken down into overlapping fragments, each of which was screened for close hits in the human proteome. Control searches were carried out against unrelated, free-living eukaryotes to eliminate the generally conserved proteins, and with randomized versions of the parasite proteins to get an estimate of statistical significance. This simple but computation-intensive approach yielded interesting candidates from human-pathogenic parasites. From Plasmodium falciparum, it returned a 14 amino acid motif in several of the PfEMP1 variants identical to part of the heparin-binding domain in the immunosuppressive serum protein vitronectin. In Brugia malayi, fragments were detected that matched periphilin-1, a protein of cell-cell junctions involved in barrier formation. All the results are publicly available by means of mimicDB, a searchable online database for molecular mimicry candidates from pathogens. To our knowledge, this is the first genome-wide survey for molecular mimicry proteins in parasites. The strategy can be adapted to any pair of host and pathogen, once appropriate negative control organisms are chosen. mimicDB provides a host of new starting points to gain insights into the molecular nature of host-pathogen interactions. PMID:21408160
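The fragment-and-screen idea in this abstract can be sketched in a few lines of Python. This is a simplified illustration, not the authors' pipeline: it uses exact identity in place of "close hits", omits the randomized-sequence significance control, and all function names and the toy sequences are made up for the example.

```python
def fragments(protein, size):
    """Yield (start, fragment) for every overlapping window of a protein."""
    for start in range(len(protein) - size + 1):
        yield start, protein[start:start + size]


def fragment_index(proteome, size):
    """Collect every fragment of every protein in a proteome into a set."""
    return {frag for seq in proteome.values() for _, frag in fragments(seq, size)}


def mimicry_candidates(parasite, host, control, size=14):
    """Return parasite fragments that occur in the host proteome but not in
    the free-living control proteome (which flags generally conserved motifs)."""
    host_frags = fragment_index(host, size)
    control_frags = fragment_index(control, size)
    hits = []
    for name, seq in parasite.items():
        for start, frag in fragments(seq, size):
            if frag in host_frags and frag not in control_frags:
                hits.append((name, start, frag))
    return hits


# Toy proteomes (invented sequences, short window for readability).
parasite = {"p1": "MKTAYIAKQRQISFVK", "p2": "LLDEADGLL"}
host = {"h1": "GGGAYIAKQGGG", "h2": "AADEADGAA"}
control = {"c1": "QQDEADGQQ"}
candidates = mimicry_candidates(parasite, host, control, size=5)
```

Here the DEADG fragment shared by parasite and host is discarded because it also appears in the control organism, while the two overlapping fragments from p1 survive as candidates. A real run would replace the exact set lookup with a similarity search and add shuffled parasite sequences to estimate how many hits arise by chance.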
Morgnanesi, Dante; Heinrichs, Eric J; Mele, Anthony R; Wilkinson, Sean; Zhou, Suzanne; Kulp, John L
2015-11-01
Computational chemical biology, applied to research on hepatitis B virus (HBV), has two major branches: bioinformatics (statistical models) and first-principle methods (molecular physics). While bioinformatics focuses on statistical tools and biological databases, molecular physics uses mathematics and chemical theory to study the interactions of biomolecules. Three computational techniques most commonly used in HBV research are homology modeling, molecular docking, and molecular dynamics. Homology modeling is a computational simulation to predict protein structure and has been used to construct conformers of the viral polymerase (reverse transcriptase domain and RNase H domain) and the HBV X protein. Molecular docking is used to predict the most likely orientation of a ligand when it is bound to a protein, as well as determining an energy score of the docked conformation. Molecular dynamics is a simulation that analyzes biomolecule motions and determines conformation and stability patterns. All of these modeling techniques have aided in the understanding of resistance mutations on HBV non-nucleos(t)ide reverse-transcriptase inhibitor binding. Finally, bioinformatics can be used to study the DNA and RNA protein sequences of viruses to both analyze drug resistance and to genotype the viral genomes. Overall, with these techniques, and others, computational chemical biology is becoming more and more necessary in hepatitis B research. This article forms part of a symposium in Antiviral Research on "An unfinished story: from the discovery of the Australia antigen to the development of new curative therapies for hepatitis B." Copyright © 2015 Elsevier B.V. All rights reserved.
HepSEQ: International Public Health Repository for Hepatitis B
Gnaneshan, Saravanamuttu; Ijaz, Samreen; Moran, Joanne; Ramsay, Mary; Green, Jonathan
2007-01-01
HepSEQ is a repository for an extensive library of public health and molecular data relating to hepatitis B virus (HBV) infection collected from international sources. It is hosted by the Centre for Infections, Health Protection Agency (HPA), England, United Kingdom. This repository has been developed as a web-enabled, quality-controlled database to act as a tool for surveillance, HBV case management and for research. The web front-end for the database system can be accessed from . The format of the database system allows for comprehensive molecular, clinical and epidemiological data to be deposited into a functional database, to search and manipulate the stored data and to extract and visualize the information on epidemiological, virological, clinical, nucleotide sequence and mutational aspects of HBV infection through web front-end. Specific tools, built into the database, can be utilized to analyse deposited data and provide information on HBV genotype, identify mutations with known clinical significance (e.g. vaccine escape, precore and antiviral-resistant mutations) and carry out sequence homology searches against other deposited strains. Further mechanisms are also in place to allow specific tailored searches of the database to be undertaken. PMID:17130143
NASA Astrophysics Data System (ADS)
Choudhary, Kamal; Congo, Faical Yannick P.; Liang, Tao; Becker, Chandler; Hennig, Richard G.; Tavazza, Francesca
2017-01-01
Classical empirical potentials/force-fields (FF) provide atomistic insights into material phenomena through molecular dynamics and Monte Carlo simulations. Despite their wide applicability, a systematic evaluation of materials properties using such potentials and, especially, an easy-to-use user-interface for their comparison is still lacking. To address this deficiency, we computed energetics and elastic properties of a variety of materials such as metals and ceramics using a wide range of empirical potentials and compared them to density functional theory (DFT) as well as to experimental data, where available. The database currently consists of 3248 entries including energetics and elastic property calculations, and it is still increasing. We also include computational tools for convex-hull plots for DFT and FF calculations. The data covers 1471 materials and 116 force-fields. In addition, both the complete database and the software coding used in the process have been released for public use online (presently at http://www.ctcms.nist.gov/~knc6/periodic.html) in a user-friendly way designed to enable further material design and discovery.
Choudhary, Kamal; Congo, Faical Yannick P.; Liang, Tao; Becker, Chandler; Hennig, Richard G.; Tavazza, Francesca
2017-01-01
Classical empirical potentials/force-fields (FF) provide atomistic insights into material phenomena through molecular dynamics and Monte Carlo simulations. Despite their wide applicability, a systematic evaluation of materials properties using such potentials and, especially, an easy-to-use user-interface for their comparison is still lacking. To address this deficiency, we computed energetics and elastic properties of a variety of materials such as metals and ceramics using a wide range of empirical potentials and compared them to density functional theory (DFT) as well as to experimental data, where available. The database currently consists of 3248 entries including energetics and elastic property calculations, and it is still increasing. We also include computational tools for convex-hull plots for DFT and FF calculations. The data covers 1471 materials and 116 force-fields. In addition, both the complete database and the software coding used in the process have been released for public use online (presently at http://www.ctcms.nist.gov/~knc6/periodic.html) in a user-friendly way designed to enable further material design and discovery. PMID:28140407
LigandBox: A database for 3D structures of chemical compounds
Kawabata, Takeshi; Sugihara, Yusuke; Fukunishi, Yoshifumi; Nakamura, Haruki
2013-01-01
A database for the 3D structures of available compounds is essential for the virtual screening by molecular docking. We have developed the LigandBox database (http://ligandbox.protein.osaka-u.ac.jp/ligandbox/) containing four million available compounds, collected from the catalogues of 37 commercial suppliers, and approved drugs and biochemical compounds taken from KEGG_DRUG, KEGG_COMPOUND and PDB databases. Each chemical compound in the database has several 3D conformers with hydrogen atoms and atomic charges, which are ready to be docked into receptors using docking programs. The 3D conformations were generated using our molecular simulation program package, myPresto. Various physical properties, such as aqueous solubility (LogS) and carcinogenicity have also been calculated to characterize the ADME-Tox properties of the compounds. The Web database provides two services for compound searches: a property/chemical ID search and a chemical structure search. The chemical structure search is performed by a descriptor search and a maximum common substructure (MCS) search combination, using our program kcombu. By specifying a query chemical structure, users can find similar compounds among the millions of compounds in the database within a few minutes. Our database is expected to assist a wide range of researchers, in the fields of medical science, chemical biology, and biochemistry, who are seeking to discover active chemical compounds by the virtual screening. PMID:27493549
LigandBox: A database for 3D structures of chemical compounds.
Kawabata, Takeshi; Sugihara, Yusuke; Fukunishi, Yoshifumi; Nakamura, Haruki
2013-01-01
A database for the 3D structures of available compounds is essential for the virtual screening by molecular docking. We have developed the LigandBox database (http://ligandbox.protein.osaka-u.ac.jp/ligandbox/) containing four million available compounds, collected from the catalogues of 37 commercial suppliers, and approved drugs and biochemical compounds taken from KEGG_DRUG, KEGG_COMPOUND and PDB databases. Each chemical compound in the database has several 3D conformers with hydrogen atoms and atomic charges, which are ready to be docked into receptors using docking programs. The 3D conformations were generated using our molecular simulation program package, myPresto. Various physical properties, such as aqueous solubility (LogS) and carcinogenicity have also been calculated to characterize the ADME-Tox properties of the compounds. The Web database provides two services for compound searches: a property/chemical ID search and a chemical structure search. The chemical structure search is performed by a descriptor search and a maximum common substructure (MCS) search combination, using our program kcombu. By specifying a query chemical structure, users can find similar compounds among the millions of compounds in the database within a few minutes. Our database is expected to assist a wide range of researchers, in the fields of medical science, chemical biology, and biochemistry, who are seeking to discover active chemical compounds by the virtual screening.
Database-Guided Discovery of Potent Peptides to Combat HIV-1 or Superbugs
Wang, Guangshun
2013-01-01
Antimicrobial peptides (AMPs), small host defense proteins, are indispensable for the protection of multicellular organisms such as plants and animals from infection. The number of AMPs discovered per year increased steadily since the 1980s. Over 2,000 natural AMPs from bacteria, protozoa, fungi, plants, and animals have been registered into the antimicrobial peptide database (APD). The majority of these AMPs (>86%) possess 11–50 amino acids with a net charge from 0 to +7 and hydrophobic percentages between 31–70%. This article summarizes peptide discovery on the basis of the APD. The major methods are the linguistic model, database screening, de novo design, and template-based design. Using these methods, we identified various potent peptides against human immunodeficiency virus type 1 (HIV-1) or methicillin-resistant Staphylococcus aureus (MRSA). While the stepwise designed anti-HIV peptide is disulfide-linked and rich in arginines, the ab initio designed anti-MRSA peptide is linear and rich in leucines. Thus, there are different requirements for antiviral and antibacterial peptides, which could kill pathogens via different molecular targets. The biased amino acid composition in the database-designed peptides, or natural peptides such as θ-defensins, requires the use of the improved two-dimensional NMR method for structural determination to avoid the publication of misleading structure and dynamics. In the case of human cathelicidin LL-37, structural determination requires 3D NMR techniques. The high-quality structure of LL-37 provides a solid basis for understanding its interactions with membranes of bacteria and other pathogens. In conclusion, the APD database is a comprehensive platform for storing, classifying, searching, predicting, and designing potent peptides against pathogenic bacteria, viruses, fungi, parasites, and cancer cells. PMID:24276259
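The length, net charge and hydrophobic-percentage ranges quoted in this abstract can be turned into a simple screening filter. The sketch below is only illustrative: the charge rule (Lys/Arg count as +1, Asp/Glu as -1, with histidine and the termini ignored) and the set of residues treated as hydrophobic are assumptions, not the APD's documented definitions, and the function names are hypothetical.

```python
POSITIVE = set("KR")            # counted as +1 each (assumption)
NEGATIVE = set("DE")            # counted as -1 each (assumption)
HYDROPHOBIC = set("AILMFVWC")   # assumed hydrophobic residue set


def net_charge(peptide):
    """Approximate net charge from side chains only."""
    return sum(aa in POSITIVE for aa in peptide) - sum(aa in NEGATIVE for aa in peptide)


def hydrophobic_percent(peptide):
    """Percentage of residues that belong to the assumed hydrophobic set."""
    return 100.0 * sum(aa in HYDROPHOBIC for aa in peptide) / len(peptide)


def in_typical_amp_window(peptide):
    """True if the peptide falls in the ranges the abstract reports for most
    natural AMPs: 11-50 residues, net charge 0 to +7, 31-70% hydrophobic."""
    return (11 <= len(peptide) <= 50
            and 0 <= net_charge(peptide) <= 7
            and 31 <= hydrophobic_percent(peptide) <= 70)


# Human cathelicidin LL-37, the peptide discussed in the abstract.
LL37 = "LLGDFFRKSKEKIGKEFKRIVQRIKDFLRNLVPRTES"
ll37_summary = (len(LL37), net_charge(LL37), in_typical_amp_window(LL37))
```

Under these assumptions LL-37 has 37 residues and a net charge of +6, so it falls inside the window. Such a filter only narrows a candidate list; it says nothing about actual activity.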
ERIC Educational Resources Information Center
Battle, Gary M.; Allen, Frank H.; Ferrence, Gregory M.
2010-01-01
A series of online interactive teaching units have been developed that illustrate the use of experimentally measured three-dimensional (3D) structures to teach fundamental chemistry concepts. The units integrate a 500-structure subset of the Cambridge Structural Database specially chosen for their pedagogical value. The units span a number of key…
ERIC Educational Resources Information Center
Lamothe, Alain R.
2011-01-01
The purpose of this paper is to report the results of a quantitative analysis exploring the interaction and relationship between the online database and electronic journal collections at the J. N. Desmarais Library of Laurentian University. A very strong relationship exists between the number of searches and the size of the online database…
Rallapalli, P M; Kemball-Cook, G; Tuddenham, E G; Gomez, K; Perkins, S J
2013-07-01
Factor IX (FIX) is important in the coagulation cascade, being activated to FIXa on cleavage. Defects in the human F9 gene frequently lead to hemophilia B. The aim was to assess 1113 unique F9 mutations corresponding to 3721 patient entries in a new and up-to-date interactive web database alongside the FIXa protein structure. The mutations database was built using MySQL and structural analyses were based on a homology model for the human FIXa structure based on closely related crystal structures. Mutations have been found in 336 (73%) out of 461 residues in FIX. There were 812 unique point mutations, 182 deletions, 54 polymorphisms, 39 insertions and 26 others that together comprise a total of 1113 unique variants. The 64 unique mild severity mutations in the mature protein with known circulating protein phenotypes include 15 (23%) quantitative type I mutations and 41 (64%) predominantly qualitative type II mutations. Inhibitors were described in 59 reports (1.6%) corresponding to 25 unique mutations. The interactive database provides insights into mechanisms of hemophilia B. Type II mutations are deduced to disrupt predominantly those structural regions involved with functional interactions. The interactive features of the database will assist in making judgments about patient management. © 2013 International Society on Thrombosis and Haemostasis.
Integrating In Silico Resources to Map a Signaling Network
Liu, Hanqing; Beck, Tim N.; Golemis, Erica A.; Serebriiskii, Ilya G.
2013-01-01
The abundance of publicly available life science databases offers a wealth of information that can support interpretation of experimentally derived data and greatly enhance hypothesis generation. Protein interaction and functional networks are not simply new renditions of existing data: they provide the opportunity to gain insights into the specific physical and functional role a protein plays as part of the biological system. In this chapter, we describe different in silico tools that can quickly and conveniently retrieve data from existing data repositories and discuss how the available tools are best utilized for different purposes. While emphasizing protein-protein interaction databases (e.g., BioGrid and IntAct), we also introduce metasearch platforms such as STRING and GeneMANIA, pathway databases (e.g., BioCarta and Pathway Commons), text mining approaches (e.g., PubMed and Chilibot), and resources for drug-protein interactions, genetic information for model organisms and gene expression information based on microarray data mining. Furthermore, we provide a simple step-by-step protocol for building customized protein-protein interaction networks in Cytoscape, a powerful network assembly and visualization program, integrating data retrieved from these various databases. As we illustrate, generation of composite interaction networks enables investigators to extract significantly more information about a given biological system than utilization of a single database or sole reliance on primary literature. PMID:24233784
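The idea of a composite network built from several databases can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source names, protein identifiers and the `merge_networks` helper are hypothetical, and the chapter itself performs this step in Cytoscape rather than in code.

```python
def merge_networks(sources):
    """Merge edge lists from several sources into one undirected network.

    sources: dict mapping a source name to an iterable of (a, b) pairs.
    Returns a dict mapping each undirected edge to the set of sources
    that report it.
    """
    merged = {}
    for name, edges in sources.items():
        for a, b in edges:
            key = tuple(sorted((a, b)))  # (a, b) and (b, a) are the same edge
            merged.setdefault(key, set()).add(name)
    return merged

# Hypothetical edge lists; not taken from any real database.
sources = {
    "source_a": [("P1", "P2"), ("P2", "P3")],
    "source_b": [("P2", "P1"), ("P3", "P4")],
}
merged = merge_networks(sources)
# Edges reported by more than one source are often treated as better supported.
shared = [edge for edge, names in merged.items() if len(names) > 1]
```

Keeping the set of supporting sources per edge, rather than a plain union, preserves provenance for later filtering.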
Toxico-Cheminformatics and QSPR Modeling of the Carcinogenic Potency Database
Report on the development of a tiered, confirmatory scheme for prediction of chemical carcinogenicity based on QSAR studies of compounds with available mutagenic and carcinogenic data. For 693 such compounds from the Carcinogenic Potency Database characterized molecular topologic...
Complex molecular assemblies at hand via interactive simulations.
Delalande, Olivier; Férey, Nicolas; Grasseau, Gilles; Baaden, Marc
2009-11-30
Studying complex molecular assemblies interactively is becoming an increasingly appealing approach to molecular modeling. Here we focus on interactive molecular dynamics (IMD) as a textbook example for interactive simulation methods. Such simulations can be useful in exploring and generating hypotheses about the structural and mechanical aspects of biomolecular interactions. For the first time, we carry out low-resolution coarse-grain IMD simulations. Such simplified modeling methods currently appear to be more suitable for interactive experiments and represent a well-balanced compromise between an important gain in computational speed versus a moderate loss in modeling accuracy compared to higher resolution all-atom simulations. This is particularly useful for initial exploration and hypothesis development for rare molecular interaction events. We evaluate which applications are currently feasible using molecular assemblies from 1900 to over 300,000 particles. Three biochemical systems are discussed: the guanylate kinase (GK) enzyme, the outer membrane protease T and the soluble N-ethylmaleimide-sensitive factor attachment protein receptors complex involved in membrane fusion. We induce large conformational changes, carry out interactive docking experiments, probe lipid-protein interactions and are able to sense the mechanical properties of a molecular model. Furthermore, such interactive simulations facilitate exploration of modeling parameters for method improvement. For the purpose of these simulations, we have developed a freely available software library called MDDriver. It uses the IMD protocol from NAMD and facilitates the implementation and application of interactive simulations. With MDDriver it becomes very easy to render any particle-based molecular simulation engine interactive. Here we use its implementation in the Gromacs software as an example. Copyright 2009 Wiley Periodicals, Inc.
Liu, Fu-Feng; Liu, Zhen; Bai, Shu; Dong, Xiao-Yan; Sun, Yan
2012-04-14
Aggregation of amyloid-β (Aβ) peptides correlates with the pathology of Alzheimer's disease. However, the inter-molecular interactions between Aβ protofibril remain elusive. Herein, molecular mechanics Poisson-Boltzmann surface area analysis based on all-atom molecular dynamics simulations was performed to study the inter-molecular interactions in Aβ(17-42) protofibril. It is found that the nonpolar interactions are the important forces to stabilize the Aβ(17-42) protofibril, while electrostatic interactions play a minor role. Through free energy decomposition, 18 residues of the Aβ(17-42) are identified to provide interaction energy lower than -2.5 kcal/mol. The nonpolar interactions are mainly provided by the main chain of the peptide and the side chains of nine hydrophobic residues (Leu17, Phe19, Phe20, Leu32, Leu34, Met35, Val36, Val40, and Ile41). However, the electrostatic interactions are mainly supplied by the main chains of six hydrophobic residues (Phe19, Phe20, Val24, Met35, Val36, and Val40) and the side chains of the charged residues (Glu22, Asp23, and Lys28). In the electrostatic interactions, the overwhelming majority of hydrogen bonds involve the main chains of Aβ as well as the guanidinium group of the charged side chain of Lys28. The work has thus elucidated the molecular mechanism of the inter-molecular interactions between Aβ monomers in Aβ(17-42) protofibril, and the findings are considered critical for exploring effective agents for the inhibition of Aβ aggregation.
NASA Astrophysics Data System (ADS)
Liu, Fu-Feng; Liu, Zhen; Bai, Shu; Dong, Xiao-Yan; Sun, Yan
2012-04-01
Aggregation of amyloid-β (Aβ) peptides correlates with the pathology of Alzheimer's disease. However, the inter-molecular interactions between Aβ protofibril remain elusive. Herein, molecular mechanics Poisson-Boltzmann surface area analysis based on all-atom molecular dynamics simulations was performed to study the inter-molecular interactions in Aβ17-42 protofibril. It is found that the nonpolar interactions are the important forces to stabilize the Aβ17-42 protofibril, while electrostatic interactions play a minor role. Through free energy decomposition, 18 residues of the Aβ17-42 are identified to provide interaction energy lower than -2.5 kcal/mol. The nonpolar interactions are mainly provided by the main chain of the peptide and the side chains of nine hydrophobic residues (Leu17, Phe19, Phe20, Leu32, Leu34, Met35, Val36, Val40, and Ile41). However, the electrostatic interactions are mainly supplied by the main chains of six hydrophobic residues (Phe19, Phe20, Val24, Met35, Val36, and Val40) and the side chains of the charged residues (Glu22, Asp23, and Lys28). In the electrostatic interactions, the overwhelming majority of hydrogen bonds involve the main chains of Aβ as well as the guanidinium group of the charged side chain of Lys28. The work has thus elucidated the molecular mechanism of the inter-molecular interactions between Aβ monomers in Aβ17-42 protofibril, and the findings are considered critical for exploring effective agents for the inhibition of Aβ aggregation.
Nguyen, Phuong T V; Yu, Haibo; Keller, Paul A
2017-03-11
The chikungunya virus (CHIKV) envelope glycoproteins are considered important potential targets for anti-CHIKV drug discovery due to their crucial roles in virus attachment and virus entry. In this study, using two available crystal structures of the immature and mature forms of envelope glycoproteins, virtual screenings based on blind dockings and focused dockings were carried out to identify potential binding pockets and hit compounds for the virus. The chemical library database of compounds, NCI Diversity Set II, was used in these docking studies. In addition to reproducing previously reported examples, new binding pockets were identified, e.g., Pocket 2 in the 3N40, and Pocket 2 and Pocket 3 in the 3N42. Convergences in conformational sampling in docking using AutoDock Vina were evaluated. An analysis of docking results was carried out to understand interactions of the envelope glycoprotein complexes. Some key residues for interactions, for example Gly91 and His230, are identified as possessing important roles in the fusion process.
The trans-kingdom identification of negative regulators of pathogen hypervirulence.
Brown, Neil A; Urban, Martin; Hammond-Kosack, Kim E
2016-01-01
Modern society and global ecosystems are increasingly under threat from pathogens, which cause a plethora of human, animal, invertebrate and plant diseases. Of increasing concern is the trans-kingdom tendency for increased pathogen virulence that is beginning to emerge in natural, clinical and agricultural settings. The study of pathogenicity has revealed multiple examples of convergently evolved virulence mechanisms. Originally described as rare, but increasingly common, are interactions where a single gene deletion in a pathogenic species causes hypervirulence. This review utilised the pathogen-host interaction database (www.PHI-base.org) to identify 112 hypervirulent mutations from 37 pathogen species, and subsequently interrogates the trans-kingdom, conserved, molecular, biochemical and cellular themes that cause hypervirulence. This study investigates 22 animal and 15 plant pathogens including 17 bacterial and 17 fungal species. Finally, the evolutionary significance and trans-kingdom requirement for negative regulators of hypervirulence and the implication of pathogen hypervirulence and emerging infectious diseases on society are discussed. © FEMS 2015.
Integrated analysis of drug-induced gene expression profiles predicts novel hERG inhibitors.
Babcock, Joseph J; Du, Fang; Xu, Kaiping; Wheelan, Sarah J; Li, Min
2013-01-01
Growing evidence suggests that drugs interact with diverse molecular targets mediating both therapeutic and toxic effects. Prediction of these complex interactions from chemical structures alone remains challenging, as compounds with different structures may possess similar toxicity profiles. In contrast, predictions based on systems-level measurements of drug effect may reveal pharmacologic similarities not evident from structure or known therapeutic indications. Here we utilized drug-induced transcriptional responses in the Connectivity Map (CMap) to discover such similarities among diverse antagonists of the human ether-à-go-go related (hERG) potassium channel, a common target of promiscuous inhibition by small molecules. Analysis of transcriptional profiles generated in three independent cell lines revealed clusters enriched for hERG inhibitors annotated using a database of experimental measurements (hERGcentral) and clinical indications. As a validation, we experimentally identified novel hERG inhibitors among the unannotated drugs in these enriched clusters, suggesting transcriptional responses may serve as predictive surrogates of cardiotoxicity complementing existing functional assays.
Linking disease-associated genes to regulatory networks via promoter organization
Döhr, S.; Klingenhoff, A.; Maier, H.; de Angelis, M. Hrabé; Werner, T.; Schneider, R.
2005-01-01
Pathway- or disease-associated genes may participate in more than one transcriptional co-regulation network. Such gene groups can be readily obtained by literature analysis or by high-throughput techniques such as microarrays or protein-interaction mapping. We developed a strategy that defines regulatory networks by in silico promoter analysis, finding potentially co-regulated subgroups without a priori knowledge. Pairs of transcription factor binding sites conserved in orthologous genes (vertically) as well as in promoter sequences of co-regulated genes (horizontally) were used as seeds for the development of promoter models representing potential co-regulation. This approach was applied to a Maturity Onset Diabetes of the Young (MODY)-associated gene list, which yielded two models connecting functionally interacting genes within MODY-related insulin/glucose signaling pathways. Additional genes functionally connected to our initial gene list were identified by database searches with these promoter models. Thus, data-driven in silico promoter analysis allowed integrating molecular mechanisms with biological functions of the cell. PMID:15701758
Integrated Analysis of Drug-Induced Gene Expression Profiles Predicts Novel hERG Inhibitors
Babcock, Joseph J.; Du, Fang; Xu, Kaiping; Wheelan, Sarah J.; Li, Min
2013-01-01
Growing evidence suggests that drugs interact with diverse molecular targets mediating both therapeutic and toxic effects. Prediction of these complex interactions from chemical structures alone remains challenging, as compounds with different structures may possess similar toxicity profiles. In contrast, predictions based on systems-level measurements of drug effect may reveal pharmacologic similarities not evident from structure or known therapeutic indications. Here we utilized drug-induced transcriptional responses in the Connectivity Map (CMap) to discover such similarities among diverse antagonists of the human ether-à-go-go related (hERG) potassium channel, a common target of promiscuous inhibition by small molecules. Analysis of transcriptional profiles generated in three independent cell lines revealed clusters enriched for hERG inhibitors annotated using a database of experimental measurements (hERGcentral) and clinical indications. As a validation, we experimentally identified novel hERG inhibitors among the unannotated drugs in these enriched clusters, suggesting transcriptional responses may serve as predictive surrogates of cardiotoxicity complementing existing functional assays. PMID:23936032
The trans-kingdom identification of negative regulators of pathogen hypervirulence
Brown, Neil A.; Urban, Martin; Hammond-Kosack, Kim E.
2015-01-01
Modern society and global ecosystems are increasingly under threat from pathogens, which cause a plethora of human, animal, invertebrate and plant diseases. Of increasing concern is the trans-kingdom tendency for increased pathogen virulence that is beginning to emerge in natural, clinical and agricultural settings. The study of pathogenicity has revealed multiple examples of convergently evolved virulence mechanisms. Originally described as rare, but increasingly common, are interactions where a single gene deletion in a pathogenic species causes hypervirulence. This review utilised the pathogen–host interaction database (www.PHI-base.org) to identify 112 hypervirulent mutations from 37 pathogen species, and subsequently interrogates the trans-kingdom, conserved, molecular, biochemical and cellular themes that cause hypervirulence. This study investigates 22 animal and 15 plant pathogens including 17 bacterial and 17 fungal species. Finally, the evolutionary significance and trans-kingdom requirement for negative regulators of hypervirulence and the implication of pathogen hypervirulence and emerging infectious diseases on society are discussed. PMID:26468211
Formation of a new archetypal Metal-Organic Framework from a simple monatomic liquid
NASA Astrophysics Data System (ADS)
Metere, Alfredo; Oleynikov, Peter; Dzugutov, Mikhail; O'Keeffe, Michael
2014-12-01
We report a molecular-dynamics simulation of a single-component system of particles interacting via a spherically symmetric potential that is found to form, upon cooling from a liquid state, a low-density porous crystalline phase. Its structure analysis demonstrates that the crystal can be described by a net with a topology that belongs to the class of topologies characteristic of the Metal-Organic Frameworks (MOFs). The observed net is new, and it is now included in the Reticular Chemistry Structure Resource database. The observation that a net topology characteristic of MOF crystals, which are known to be formed by a coordination-driven self-assembly process, can be reproduced by a thermodynamically stable configuration of a simple single-component system of particles opens a possibility of using these models in studies of MOF nets. It also indicates that structures with MOF topology, as well as other low-density porous crystalline structures can possibly be produced in colloidal systems of spherical particles, with an appropriate tuning of interparticle interaction.
Just Working with the Cellular Machine: A High School Game for Teaching Molecular Biology
ERIC Educational Resources Information Center
Cardoso, Fernanda Serpa; Dumpel, Renata; Gomes da Silva, Luisa B.; Rodrigues, Carlos R.; Santos, Dilvani O.; Cabral, Lucio Mendes; Castro, Helena C.
2008-01-01
Molecular biology is a subject that is difficult to comprehend because of its high complexity, and thus requires new teaching approaches. Herein, we developed an interdisciplinary board game involving the human immune system response against a bacterial infection for teaching molecular biology at high school. Initially, we created a database with several…
Martins-de-Souza, Daniel; Cassoli, Juliana S; Nascimento, Juliana M; Hensley, Kenneth; Guest, Paul C; Pinzon-Velasco, Andres M; Turck, Christoph W
2015-10-01
Collapsin response mediator protein-2 (CRMP2) is a CNS protein involved in neuronal development, axonal and neuronal growth, cell migration, and protein trafficking. Recent studies have linked perturbations in CRMP2 function to neurodegenerative disorders such as Alzheimer's disease, neuropathic pain, and Batten disease, and to psychiatric disorders such as schizophrenia. Like most proteins, CRMP2 functions through interactions with a molecular network of proteins and other molecules. Here, we have attempted to identify additional proteins of the CRMP2 interactome to provide further leads about its roles in neurological functions. We used a combined co-immunoprecipitation and shotgun proteomic approach in order to identify CRMP2 protein partners. We identified 78 CRMP2 protein partners not previously reported in public protein interaction databases. These were involved in seven biological processes, which included cell signaling, growth, metabolism, trafficking, and immune function, according to Gene Ontology classifications. Furthermore, 32 different molecular functions were found to be associated with these proteins, such as RNA binding, ribosomal functions, transporter activity, receptor activity, serine/threonine phosphatase activity, cell adhesion, cytoskeletal protein binding and catalytic activity. In silico pathway interactome construction revealed a highly connected network with the most overrepresented functions corresponding to semaphorin interactions, along with axon guidance and WNT5A signaling. Taken together, these findings suggest that the CRMP2 pathway is critical for regulating neuronal and synaptic architecture. Further studies along these lines might uncover novel biomarkers and drug targets for use in drug discovery. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Santos, Eliane Macedo Sobrinho; Santos, Hércules Otacílio; Dos Santos Dias, Ivoneth; Santos, Sérgio Henrique; Batista de Paula, Alfredo Maurício; Feltenberger, John David; Sena Guimarães, André Luiz; Farias, Lucyana Conceição
2016-01-01
The pathogenesis of odontogenic tumors is not well known. It is important to identify genetic deregulations and molecular alterations. This study aimed to investigate, through bioinformatic analysis, the possible genes involved in the pathogenesis of ameloblastoma (AM) and keratocystic odontogenic tumor (KCOT). Genes involved in the pathogenesis of AM and KCOT were identified in GeneCards. The gene list was expanded, and the gene interaction network was mapped using the STRING software. The "weighted number of links" (WNL) was calculated to identify "leader genes" (highest WNL). Genes were ranked by the K-means method and the Kruskal-Wallis test was used (P<0.001). The total interactions score (TIS) was also calculated using all interaction data generated by the STRING database, in order to achieve global connectivity for each gene. The topological and ontological analyses were performed using Cytoscape software and the BinGO plugin. Literature review data were used to corroborate the bioinformatics data. CDK1 was identified as the leader gene for AM. In the KCOT group, the results show PCNA and TP53. Both tumors exhibit a power law behavior. Our topological analysis suggested leader genes possibly important in the pathogenesis of AM and KCOT, by clustering coefficient calculated for both odontogenic tumors (0.028 for AM, zero for KCOT). The results obtained in the scatter diagram suggest an important relationship of these genes with the molecular processes involved in AM and KCOT. Ontological analysis for both AM and KCOT demonstrated different mechanisms. Bioinformatics analyses were confirmed through literature review. These results may suggest the involvement of promising genes for a better understanding of the pathogenesis of AM and KCOT.
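A minimal sketch of the leader-gene ranking idea follows. It assumes that the WNL of a gene is the sum of the association scores on its links, which the abstract does not state explicitly. The edge list and its scores are invented for illustration and are not STRING output, and the K-means and Kruskal-Wallis steps are omitted.

```python
from collections import defaultdict

def weighted_number_of_links(edges):
    """Sum link scores per gene; edges are (gene_a, gene_b, score) tuples."""
    wnl = defaultdict(float)
    for a, b, score in edges:
        # Each link contributes its score to both of its endpoints.
        wnl[a] += score
        wnl[b] += score
    return dict(wnl)

# Hypothetical scores for illustration only (not taken from STRING).
edges = [
    ("CDK1", "PCNA", 0.9),
    ("CDK1", "TP53", 0.8),
    ("PCNA", "TP53", 0.7),
    ("CDK1", "CCNB1", 0.95),
]
wnl = weighted_number_of_links(edges)
# Genes sorted from highest to lowest WNL; the top entries are "leader" candidates.
ranking = sorted(wnl, key=wnl.get, reverse=True)
```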
Evolutionary origins of a novel host plant detoxification gene in butterflies.
Fischer, Hanna M; Wheat, Christopher W; Heckel, David G; Vogel, Heiko
2008-05-01
Chemical interactions between plants and their insect herbivores provide an excellent opportunity to study the evolution of species interactions on a molecular level. Here, we investigate the molecular evolutionary events that gave rise to a novel detoxifying enzyme (nitrile-specifier protein [NSP]) in the butterfly family Pieridae, previously identified as a coevolutionary key innovation. By generating and sequencing expressed sequence tags, genomic libraries, and screening databases we found NSP to be a member of an insect-specific gene family, which we characterized and named the NSP-like gene family. Members consist of variable tandem repeats, are gut expressed, and are found across Insecta evolving in a dynamic, ongoing birth-death process. In the Lepidoptera, multiple copies of single-domain major allergen genes are present and originate via tandem duplications. Multiple domain genes are found solely within the brassicaceous-feeding Pieridae butterflies, one of them being NSP and another called major allergen (MA). Analyses suggest that NSP and its paralog MA have a unique single-domain evolutionary origin, being formed by intragenic domain duplication followed by tandem whole-gene duplication. Duplicates subsequently experienced a period of relaxed constraint followed by an increase in constraint, perhaps after neofunctionalization. NSP and its paralog MA are still experiencing high rates of change, reflecting a dynamic evolution consistent with the known role of NSP in plant-insect interactions. Our results provide direct evidence for the hypothesis that gene duplication is one of the driving forces for speciation and adaptation, showing that both within- and whole-gene tandem duplications are a powerful force underlying evolutionary adaptation.
Integrated inference and evaluation of host–fungi interaction networks
Remmele, Christian W.; Luther, Christian H.; Balkenhol, Johannes; Dandekar, Thomas; Müller, Tobias; Dittrich, Marcus T.
2015-01-01
Fungal microorganisms frequently lead to life-threatening infections. Within this group of pathogens, the commensal Candida albicans and the filamentous fungus Aspergillus fumigatus are by far the most important causes of invasive mycoses in Europe. Specific molecular interactions between the fungal pathogen and its human host are a key capability for host invasion and immune response evasion. Experimentally validated knowledge about these crucial interactions is rare in the literature and even specialized host–pathogen databases mainly focus on bacterial and viral interactions whereas information on fungi is still sparse. To establish large-scale host–fungi interaction networks on a systems biology scale, we develop an extended inference approach based on protein orthology and data on gene functions. Using human and yeast intraspecies networks as templates, we derive a large network of pathogen–host interactions (PHI). Rigorous filtering and refinement steps based on cellular localization and pathogenicity information of predicted interactors yield a primary scaffold of fungi–human and fungi–mouse interaction networks. Specific enrichment of known pathogenicity-relevant genes indicates the biological relevance of the predicted PHI. A detailed inspection of functionally relevant subnetworks reveals novel host–fungal interaction candidates such as the Candida virulence factor PLB1 and the anti-fungal host protein APP. Our results demonstrate the applicability of interolog-based prediction methods for host–fungi interactions and underline the importance of filtering and refinement steps to attain biologically more relevant interactions. This integrated network framework can serve as a basis for future analyses of high-throughput host–fungi transcriptome and proteome data. PMID:26300851
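The core of an interolog-based prediction can be sketched as follows: an interaction known in a template network is transferred to a pathogen–host pair when one template protein has a pathogen ortholog and the other has a host ortholog. The function name, the identifiers and the ortholog tables are hypothetical, and the filtering by cellular localization and pathogenicity described in the abstract is not reproduced here.

```python
def infer_interologs(template_edges, pathogen_orthologs, host_orthologs):
    """Transfer template interactions (a, b) to predicted pathogen-host pairs.

    pathogen_orthologs and host_orthologs map a template protein to the set
    of its orthologs in the pathogen or in the host, respectively.
    """
    predicted = set()
    for a, b in template_edges:
        # Interactions are undirected, so try both orientations.
        for x, y in ((a, b), (b, a)):
            for p in pathogen_orthologs.get(x, ()):
                for h in host_orthologs.get(y, ()):
                    predicted.add((p, h))
    return predicted

# Hypothetical identifiers for illustration only.
template_edges = [("T1", "T2"), ("T3", "T4")]
pathogen_orthologs = {"T1": {"CaA"}, "T4": {"CaB"}}
host_orthologs = {"T2": {"HsX"}, "T3": {"HsY"}}
predicted = infer_interologs(template_edges, pathogen_orthologs, host_orthologs)
```

A real pipeline would follow this step with the refinement filters, since unfiltered orthology transfer tends to produce many candidates.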
SFINX-a drug-drug interaction database designed for clinical decision support systems.
Böttiger, Ylva; Laine, Kari; Andersson, Marine L; Korhonen, Tuomas; Molin, Björn; Ovesjö, Marie-Louise; Tirkkonen, Tuire; Rane, Anders; Gustafsson, Lars L; Eiermann, Birgit
2009-06-01
The aim was to develop a drug-drug interaction database (SFINX) to be integrated into decision support systems or to be used in website solutions for clinical evaluation of interactions. Key elements such as substance properties and names, drug formulations, text structures and references were defined before development of the database. Standard operating procedures for literature searches, text writing rules and a classification system for clinical relevance and documentation level were determined. ATC codes, CAS numbers and country-specific codes for substances were identified and quality assured to ensure safe integration of SFINX into other data systems. Much effort was put into giving short and practical advice regarding clinically relevant drug-drug interactions. SFINX includes over 8,000 interaction pairs and is integrated into Swedish and Finnish computerised decision support systems. Over 31,000 physicians and pharmacists are receiving interaction alerts through SFINX. User feedback is collected for continuous improvement of the content. SFINX is a potentially valuable tool delivering instant information on drug interactions during prescribing and dispensing.
Exploring molecular networks using MONET ontology.
Silva, João Paulo Müller da; Lemke, Ney; Mombach, José Carlos; Souza, José Guilherme Camargo de; Sinigaglia, Marialva; Vieira, Renata
2006-03-31
The description of the complex molecular network responsible for cell behavior requires new tools to integrate large quantities of experimental data in the design of biological information systems. These tools could be used in the characterization of these networks and in the formulation of relevant biological hypotheses. The building of an ontology is a crucial step because it integrates in a coherent framework the concepts necessary to accomplish such a task. We present MONET (molecular network), an extensible ontology and an architecture designed to facilitate the integration of data originating from different public databases in a single, well-documented relational database that is compatible with the MONET formal definition. We also present an example of an application that can easily be implemented using these tools.
Synthesizing and databasing fossil calibrations: divergence dating and beyond
Ksepka, Daniel T.; Benton, Michael J.; Carrano, Matthew T.; Gandolfo, Maria A.; Head, Jason J.; Hermsen, Elizabeth J.; Joyce, Walter G.; Lamm, Kristin S.; Patané, José S. L.; Phillips, Matthew J.; Polly, P. David; Van Tuinen, Marcel; Ware, Jessica L.; Warnock, Rachel C. M.; Parham, James F.
2011-01-01
Divergence dating studies, which combine temporal data from the fossil record with branch length data from molecular phylogenetic trees, represent a rapidly expanding approach to understanding the history of life. The National Evolutionary Synthesis Center hosted the first Fossil Calibrations Working Group (3–6 March, 2011, Durham, NC, USA), bringing together palaeontologists, molecular evolutionists and bioinformatics experts to present perspectives from disciplines that generate, model and use fossil calibration data. Presentations and discussions focused on channels for interdisciplinary collaboration, best practices for justifying, reporting and using fossil calibrations and roadblocks to synthesis of palaeontological and molecular data. Bioinformatics solutions were proposed, with the primary objective being a new database for vetted fossil calibrations with linkages to existing resources, targeted for a 2012 launch. PMID:21525049
Balaur, Irina; Saqi, Mansoor; Barat, Ana; Lysenko, Artem; Mazein, Alexander; Rawlings, Christopher J; Ruskin, Heather J; Auffray, Charles
2017-10-01
The development of colorectal cancer (CRC), the third most common cancer type, has been associated with deregulations of cellular mechanisms stimulated by both genetic and epigenetic events. StatEpigen is a manually curated and annotated database, containing information on interdependencies between genetic and epigenetic signals, and specialized currently for CRC research. Although StatEpigen provides a well-developed graphical user interface for information retrieval, advanced queries involving associations between multiple concepts can benefit from more detailed graph representation of the integrated data. This can be achieved by using a graph database (NoSQL) approach. Data were extracted from StatEpigen and imported to our newly developed EpiGeNet, a graph database for storage and querying of conditional relationships between molecular (genetic and epigenetic) events observed at different stages of colorectal oncogenesis. We illustrate the enhanced capability of EpiGeNet for exploration of different queries related to colorectal tumor progression; specifically, we demonstrate the query process for (i) stage-specific molecular events, (ii) most frequently observed genetic and epigenetic interdependencies in colon adenoma, and (iii) paths connecting key genes reported in CRC and associated events. The EpiGeNet framework offers improved capability for management and visualization of data on molecular events specific to CRC initiation and progression.
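Query type (iii), finding a path that connects two events, can be illustrated with a plain breadth-first search. This is a sketch under stated assumptions rather than EpiGeNet's own query mechanism, which the abstract does not describe; the node labels and the adjacency structure are hypothetical placeholders, not StatEpigen data.

```python
from collections import deque

def shortest_path(graph, start, goal):
    """Breadth-first search; returns the node list of a shortest path or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# Hypothetical conditional relationships between molecular events.
graph = {
    "event_A": ["event_B"],
    "event_B": ["event_C", "event_D"],
    "event_C": [],
    "event_D": [],
}
path = shortest_path(graph, "event_A", "event_C")
```

A graph database would express the same traversal declaratively; the search above only shows what such a path query computes.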
The BioGRID Interaction Database: 2011 update
Stark, Chris; Breitkreutz, Bobby-Joe; Chatr-aryamontri, Andrew; Boucher, Lorrie; Oughtred, Rose; Livstone, Michael S.; Nixon, Julie; Van Auken, Kimberly; Wang, Xiaodong; Shi, Xiaoqi; Reguly, Teresa; Rust, Jennifer M.; Winter, Andrew; Dolinski, Kara; Tyers, Mike
2011-01-01
The Biological General Repository for Interaction Datasets (BioGRID) is a public database that archives and disseminates genetic and protein interaction data from model organisms and humans (http://www.thebiogrid.org). BioGRID currently holds 347 966 interactions (170 162 genetic, 177 804 protein) curated from both high-throughput data sets and individual focused studies, as derived from over 23 000 publications in the primary literature. Complete coverage of the entire literature is maintained for budding yeast (Saccharomyces cerevisiae), fission yeast (Schizosaccharomyces pombe) and thale cress (Arabidopsis thaliana), and efforts to expand curation across multiple metazoan species are underway. The BioGRID houses 48 831 human protein interactions that have been curated from 10 247 publications. Current curation drives are focused on particular areas of biology to enable insights into conserved networks and pathways that are relevant to human health. The BioGRID 3.0 web interface contains new search and display features that enable rapid queries across multiple data types and sources. An automated Interaction Management System (IMS) is used to prioritize, coordinate and track curation across international sites and projects. BioGRID provides interaction data to several model organism databases, resources such as Entrez-Gene and other interaction meta-databases. The entire BioGRID 3.0 data collection may be downloaded in multiple file formats, including PSI MI XML. Source code for BioGRID 3.0 is freely available without any restrictions. PMID:21071413
Ghahremanpour, Mohammad M; van Maaren, Paul J; van der Spoel, David
2018-04-10
Data quality as well as library size are crucial issues for force field development. In order to predict molecular properties in a large chemical space, the foundation to build force fields on needs to encompass a large variety of chemical compounds. The tabulated molecular physicochemical properties also need to be accurate. Due to the limited transparency in data used for development of existing force fields it is hard to establish data quality and reusability is low. This paper presents the Alexandria library as an open and freely accessible database of optimized molecular geometries, frequencies, electrostatic moments up to the hexadecupole, electrostatic potential, polarizabilities, and thermochemistry, obtained from quantum chemistry calculations for 2704 compounds. Values are tabulated and where available compared to experimental data. This library can assist systematic development and training of empirical force fields for a broad range of molecules.
Pragmatic precision oncology: the secondary uses of clinical tumor molecular profiling.
Rioth, Matthew J; Thota, Ramya; Staggs, David B; Johnson, Douglas B; Warner, Jeremy L
2016-07-01
Precision oncology increasingly utilizes molecular profiling of tumors to determine treatment decisions with targeted therapeutics. The molecular profiling data is valuable in the treatment of individual patients as well as for multiple secondary uses. The objective was to automatically parse, categorize, and aggregate clinical molecular profile data generated during cancer care and to use these data to address multiple secondary use cases. A system to parse, categorize and aggregate molecular profile data was created. A naïve Bayesian classifier categorized results according to clinical groups. The accuracy of these systems was validated against a published expertly-curated subset of molecular profiling data. Following one year of operation, 819 samples have been accurately parsed and categorized to generate a data repository of 10,620 genetic variants. The database has been used for operational, clinical trial, and discovery science research. A real-time database of molecular profiling data is a pragmatic solution to several knowledge management problems in the practice and science of precision oncology. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Molecular docking based screening of compounds against VP40 from Ebola virus
M Alam El-Din, Hanaa; A. Loutfy, Samah; Fathy, Nasra; H Elberry, Mostafa; M Mayla, Ahmed; Kassem, Sara; Naqvi, Asif
2016-01-01
Ebola virus causes severe and often fatal hemorrhagic fevers in humans. The 2014 Ebola epidemic affected multiple countries. The virus matrix protein (VP40) plays a central role in virus assembly and budding. Since there is no FDA-approved vaccine or medicine against Ebola viral infection, discovering new compounds with different binding patterns against it is required. Therefore, we aim to identify small molecules that target the Arg 134 RNA binding and active site of VP40 protein. A total of 1800 molecules were retrieved from the PubChem compound database based on Structure Similarity and Conformers of pyrimidine-2,4-dione. A molecular docking approach using the Lamarckian Genetic Algorithm was carried out to find potent inhibitors for VP40 based on calculated ligand-protein pairwise interaction energies. The grid maps representing the protein were calculated using AutoGrid, and the grid size was set to 60 × 60 × 60 points with a grid spacing of 0.375 Å. Ten independent docking runs were carried out for each ligand and results were clustered according to the 1.0 Å RMSD criterion. The post-docking analysis showed that binding energies ranged from -8.87 to 0.6 kcal/mol. We report 7 molecules, which showed promising ADMET results, LD-50, as well as H-bond interaction in the binding pocket. The small molecules discovered could act as potential inhibitors for VP40 and could interfere with the virus assembly and budding process. PMID:28149054
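The grid and clustering parameters quoted in this abstract can be made concrete with a short sketch. Two assumptions are made here: the 60 points per axis are treated as 60 intervals (so the box edge is points times spacing), and clustering is done greedily against the lowest-energy member of each cluster. The function names are illustrative, and this is not claimed to be the docking program's own algorithm; symmetry and atom reordering are ignored.

```python
import math


def grid_box_edge(n_points, spacing):
    """Approximate edge length (in Å) covered by n_points intervals of the given spacing."""
    return n_points * spacing


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


def cluster_poses(poses, tolerance=1.0):
    """Greedy clustering of (energy, coords) poses.

    Poses are visited from lowest to highest energy; each pose joins the first
    cluster whose lowest-energy member lies within `tolerance` Å RMSD,
    otherwise it starts a new cluster.
    """
    clusters = []
    for energy, coords in sorted(poses, key=lambda p: p[0]):
        for cluster in clusters:
            if rmsd(coords, cluster[0][1]) <= tolerance:
                cluster.append((energy, coords))
                break
        else:
            clusters.append([(energy, coords)])
    return clusters


if __name__ == "__main__":
    # 60 points at 0.375 Å spacing span roughly 22.5 Å per axis.
    print(grid_box_edge(60, 0.375))
```

Under these assumptions the search box is about 22.5 Å on each side, which gives a sense of how much of the protein surface the docking covered.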
Picot, Marie C N; Zengin, Gokhan; Mollica, Adriano; Stefanucci, Azzurra; Carradori, Simone; Mahomoodally, Mohamad F
2017-01-01
Mangiferin was identified in the crude methanol extract, ethyl acetate, and n-butanol fractions of Aphloia theiformis (Vahl.) Benn. This study aimed to analyze the plausible binding modes of mangiferin to key enzymes linked to diabetes type 2 (DT2), obesity, hypertension, Alzheimer's disease, and urolithiasis using molecular docking. Crystallographic structures of α-amylase, α-glucosidase, glycogen phosphorylase (GP), pancreatic lipase, cholesterol esterase (CEase), angiotensin-I-converting enzyme (ACE), acetyl cholinesterase (AChE), and urease available in the Protein Data Bank were docked to mangiferin using Gold 6.0 software. We showed that mangiferin bound to all enzymes mostly through π-π interactions and hydrogen bonds. Mangiferin was docked to both allosteric and orthosteric sites of α-glucosidase by π-π interactions. However, several hydrogen bonds were observed at the orthosteric position, suggesting a preference for this site. The docking of mangiferin on AChE with the catalytic pocket occupied by paraoxon could be attributed to π-π stacking involving the amino acid residues Trp341 and Trp124. This study provides insight into the molecular interaction of mangiferin with the studied enzymes and can be considered a valuable tool for designing new drugs for better management of these diseases. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
NASA Astrophysics Data System (ADS)
Hu, Xin; Legler, Patricia M.; Southall, Noel; Maloney, David J.; Simeonov, Anton; Jadhav, Ajit
2014-07-01
Botulinum neurotoxin serotype A (BoNT/A) is the most lethal toxin among the Tier 1 Select Agents. Development of potent and selective small molecule inhibitors against BoNT/A zinc metalloprotease remains a challenging problem due to its exceptionally large substrate binding surface and conformational plasticity. The exosites of the catalytic domain of BoNT/A are intriguing alternative sites for small molecule intervention, but their suitability for inhibitor design remains largely unexplored. In this study, we employed two recently identified exosite inhibitors, D-chicoric acid and lomofungin, to probe the structural features of the exosites and molecular mechanisms of synergistic inhibition. The results showed that D-chicoric acid favors binding at the α-exosite, whereas lomofungin preferentially binds at the β-exosite by mimicking the substrate β-sheet binding interaction. Molecular dynamics simulations and binding interaction analysis of the exosite inhibitors with BoNT/A revealed key elements and hotspots that likely contribute to the inhibitor binding and synergistic inhibition. Finally, we performed database virtual screening for novel inhibitors of BoNT/A targeting the exosites. Hits C1 and C2 showed non-competitive inhibition and likely target the α- and β-exosites, respectively. The identified exosite inhibitors may provide novel candidates for structure-based development of therapeutics against BoNT/A intoxication.
Validation and extraction of molecular-geometry information from small-molecule databases.
Long, Fei; Nicholls, Robert A; Emsley, Paul; Gražulis, Saulius; Merkys, Andrius; Vaitkus, Antanas; Murshudov, Garib N
2017-02-01
A freely available small-molecule structure database, the Crystallography Open Database (COD), is used for the extraction of molecular-geometry information on small-molecule compounds. The results are used for the generation of new ligand descriptions, which are subsequently used by macromolecular model-building and structure-refinement software. To increase the reliability of the derived data, and therefore the new ligand descriptions, the entries from this database were subjected to very strict validation. The selection criteria made sure that the crystal structures used to derive atom types, bond and angle classes are of sufficiently high quality. Any suspicious entries at a crystal or molecular level were removed from further consideration. The selection criteria included (i) the resolution of the data used for refinement (entries solved at 0.84 Å resolution or higher) and (ii) the structure-solution method (structures must be from a single-crystal experiment and all atoms of generated molecules must have full occupancies), as well as basic sanity checks such as (iii) consistency between the valences and the number of connections between atoms, (iv) acceptable bond-length deviations from the expected values and (v) detection of atomic collisions. The derived atom types and bond classes were then validated using high-order moment-based statistical techniques. The results of the statistical analyses were fed back to fine-tune the atom typing. The developed procedure was repeated four times, resulting in fine-grained atom typing, bond and angle classes. The procedure will be repeated in the future as and when new entries are deposited in the COD. The whole procedure can also be applied to any source of small-molecule structures, including the Cambridge Structural Database and the ZINC database.
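Selection criteria (i) and (ii) above lend themselves to a small filter sketch. The record type and its field names below are hypothetical (real COD entries are CIF files with a different structure), and only the criteria with values stated in the abstract are encoded: resolution of 0.84 Å or better, a single-crystal experiment, and full occupancy for every atom. The sanity checks (iii) to (v) are omitted because the abstract gives no thresholds for them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entry:
    """Hypothetical, simplified view of one small-molecule database entry."""
    resolution: float            # in Å; a smaller value means higher resolution
    single_crystal: bool         # True if solved from a single-crystal experiment
    occupancies: List[float] = field(default_factory=list)  # one value per atom


def passes_selection(entry, max_resolution=0.84):
    """Return True if the entry meets criteria (i) and (ii) from the abstract."""
    good_resolution = entry.resolution <= max_resolution
    full_occupancy = bool(entry.occupancies) and all(
        occ == 1.0 for occ in entry.occupancies
    )
    return good_resolution and entry.single_crystal and full_occupancy


if __name__ == "__main__":
    entries = [
        Entry(0.80, True, [1.0, 1.0, 1.0]),
        Entry(0.90, True, [1.0, 1.0]),
        Entry(0.80, True, [1.0, 0.5]),
    ]
    print([passes_selection(e) for e in entries])
```

Keeping the filter as a pure function makes it easy to rerun when new entries are deposited, which matches the repeated procedure the abstract describes.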
Profiling lethal factor interacting proteins from human stomach using T7 phage display screening.
Cardona-Correa, Albin; Rios-Velazquez, Carlos
2016-05-01
The anthrax lethal factor (LF) is a zinc dependent metalloproteinase that cleaves the majority of mitogen-activated protein kinase kinases and a member of NOD-like receptor proteins, inducing cell apoptosis. Despite efforts to fully understand the Bacillus anthracis toxin components, the gastrointestinal (GI) anthrax mechanisms have not been fully elucidated. Previous studies demonstrated gastric ulceration, and a substantial bacterial growth rate in Peyer's patches. However, the complete molecular pathways of the disease that result in tissue damage by LF proteolytic activity remain unclear. In the present study, to identify the profile of the proteins potentially involved in GI anthrax, protein‑protein interactions were investigated using human stomach T7 phage display (T7PD) cDNA libraries. T7PD is a high throughput technique that allows the expression of cloned DNA sequences as peptides on the phage surface, enabling the selection and identification of protein ligands. A wild type and mutant LF (E687A) were used to differentiate interaction sites. A total of 124 clones were identified from 194 interacting‑phages, at both the DNA and protein level, by in silico analysis. Databases revealed that the selected candidates were proteins from different families including lipase, peptidase‑A1 and cation transport families, among others. Furthermore, individual T7PD candidates were tested against LF in order to detect their specificity to the target molecule, resulting in 10 LF‑interacting peptides. With a minimum concentration of LF for interaction at 1 µg/ml, the T7PD isolated pepsin A3 pre‑protein (PAP) demonstrated affinity to both types of LF. In addition, PAP was isolated in various lengths for the same protein, exhibiting common regions following PRALINE alignment. These findings will help elucidate and improve the understanding of the molecular pathogenesis of GI anthrax, and aid in the development of potential therapeutic agents.
PolySac3DB: an annotated data base of 3 dimensional structures of polysaccharides.
Sarkar, Anita; Pérez, Serge
2012-11-14
Polysaccharides are ubiquitously present in the living world. Their structural versatility makes them important and interesting components in numerous biological and technological processes ranging from structural stabilization to a variety of immunologically important molecular recognition events. The knowledge of polysaccharide three-dimensional (3D) structure is important in studying carbohydrate-mediated host-pathogen interactions, interactions with other bio-macromolecules, drug design and vaccine development as well as material science applications or production of bio-ethanol. PolySac3DB is an annotated database that contains the 3D structural information of 157 polysaccharide entries that have been collected from an extensive screening of scientific literature. They have been systematically organized using standard names in the field of carbohydrate research into 18 categories representing polysaccharide families. Structure-related information includes the saccharides making up the repeat unit(s) and their glycosidic linkages, the expanded 3D representation of the repeat unit, unit cell dimensions and space group, helix type, diffraction diagram(s) (when applicable), experimental and/or simulation methods used for structure description, link to the abstract of the publication, reference and the atomic coordinate files for visualization and download. The database is accompanied by a user-friendly graphical user interface (GUI). It features interactive displays of polysaccharide structures and customized search options for beginners and experts, respectively. The site also serves as an information portal for polysaccharide structure determination techniques. The web-interface also references external links where other carbohydrate-related resources are available. PolySac3DB is established to maintain information on the detailed 3D structures of polysaccharides. 
All the data and features are available via the web-interface utilizing the search engine and can be accessed at http://polysac3db.cermav.cnrs.fr.
Quiapim, Andréa C.; Brito, Michael S.; Bernardes, Luciano A.S.; daSilva, Idalete; Malavazi, Iran; DePaoli, Henrique C.; Molfetta-Machado, Jeanne B.; Giuliatti, Silvana; Goldman, Gustavo H.; Goldman, Maria Helena S.
2009-01-01
The success of plant reproduction depends on pollen-pistil interactions occurring at the stigma/style. These interactions vary depending on the stigma type: wet or dry. Tobacco (Nicotiana tabacum) represents a model of wet stigma, and its stigmas/styles express genes to accomplish the appropriate functions. For a large-scale study of gene expression during tobacco pistil development and preparation for pollination, we generated 11,216 high-quality expressed sequence tags (ESTs) from stigmas/styles and created the TOBEST database. These ESTs were assembled in 6,177 clusters, from which 52.1% are pistil transcripts/genes of unknown function. The 21 clusters with the highest number of ESTs (putative higher expression levels) correspond to genes associated with defense mechanisms or pollen-pistil interactions. The database analysis unraveled tobacco sequences homologous to the Arabidopsis (Arabidopsis thaliana) genes involved in specifying pistil identity or determining normal pistil morphology and function. Additionally, 782 independent clusters were examined by macroarray, revealing 46 stigma/style preferentially expressed genes. Real-time reverse transcription-polymerase chain reaction experiments validated the pistil-preferential expression for nine out of 10 genes tested. A search for these 46 genes in the Arabidopsis pistil data sets demonstrated that only 11 sequences, with putative equivalent molecular functions, are expressed in this dry stigma species. The reverse search for the Arabidopsis pistil genes in the TOBEST exposed a partial overlap between these dry and wet stigma transcriptomes. The TOBEST represents the most extensive survey of gene expression in the stigmas/styles of wet stigma plants, and our results indicate that wet and dry stigmas/styles express common as well as distinct genes in preparation for the pollination process. PMID:19052150
Abnormal DNA methylation may contribute to the progression of osteosarcoma.
Chen, Xiao-Gang; Ma, Liang; Xu, Jia-Xin
2018-01-01
The identification of optimal methylation biomarkers to achieve maximum diagnostic ability remains a challenge. The present study aimed to elucidate the potential molecular mechanisms underlying osteosarcoma (OS) using DNA methylation analysis. Based on the GSE36002 dataset obtained from the Gene Expression Omnibus database, differentially methylated genes were extracted between patients with OS and controls using t‑tests. Subsequently, hierarchical clustering was performed to segregate the samples into two distinct clusters, OS and normal. Gene Ontology (GO) and pathway enrichment analyses for differentially methylated genes were performed using the Database for Annotation, Visualization and Integrated Discovery tool. A protein‑protein interaction (PPI) network was established, followed by hub gene identification. Using the cut‑off threshold of ≥0.2 average β‑value difference, 3,725 unique CpGs (2,862 genes) were identified to be differentially methylated between the OS and normal groups. Among these 2,862 genes, 510 genes were differentially hypermethylated and 2,352 were differentially hypomethylated. The differentially hypermethylated genes were primarily involved in 20 GO terms, and the top 3 terms were associated with potassium ion transport. For differentially hypomethylated genes, GO functions principally included passive transmembrane transporter activity, channel activity and metal ion transmembrane transporter activity. In addition, a total of 10 significant pathways were enriched by differentially hypomethylated genes; notably, neuroactive ligand‑receptor interaction was the most significant pathway. Based on a connectivity degree >90, 7 hub genes were selected from the PPI network, including neuromedin U (NMU; degree=103) and NMU receptor 1 (NMUR1; degree=103). 
Functional terms (potassium ion transport, transmembrane transporter activity, and neuroactive ligand‑receptor interaction) and hub genes (NMU and NMUR1) may serve as potential targets for the treatment and diagnosis of OS.
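The two numeric thresholds in this abstract, an average β-value difference of at least 0.2 and a connectivity degree above 90, can be sketched as follows. The function names and the input layout are assumptions made for illustration, and the t-test step mentioned in the abstract is left out, so this shows only the thresholding logic rather than the study's full pipeline.

```python
from collections import Counter
from statistics import mean


def classify_cpg(case_betas, control_betas, cutoff=0.2):
    """Label one CpG site by the difference in mean β-value (case minus control)."""
    delta = mean(case_betas) - mean(control_betas)
    if delta >= cutoff:
        return "hyper"
    if delta <= -cutoff:
        return "hypo"
    return "unchanged"


def hub_genes(edges, min_degree=90):
    """Return {gene: degree} for genes whose degree exceeds min_degree.

    `edges` is an iterable of (gene_a, gene_b) pairs from an undirected
    protein-protein interaction network; each pair is assumed to appear once.
    """
    degree = Counter()
    for gene_a, gene_b in edges:
        degree[gene_a] += 1
        degree[gene_b] += 1
    return {gene: d for gene, d in degree.items() if d > min_degree}


if __name__ == "__main__":
    print(classify_cpg([0.9, 0.7], [0.3, 0.3]))
    # Toy star network: one hypothetical gene linked to 91 partners.
    toy_edges = [("hub", f"partner_{i}") for i in range(91)]
    print(hub_genes(toy_edges))
```

The reported counts are consistent with this kind of split: 510 hypermethylated plus 2,352 hypomethylated genes gives the 2,862 differentially methylated genes stated above.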