Science.gov

Sample records for 250-kv embl enantiomorphic

  1. An enantiomorphic blumlein impulse generator

    SciTech Connect

    Rinehart, L.F.; Buttram, M.T.; Crowe, W.R.; Clark, R.S.; Lundstrom, J.M.; Patterson, P.E.

    1992-01-01

    Working designs exist for 1 GW, 1 kHz ultra-wideband (UWB) sources (e.g. SNIPER). As these generators are pressed to higher peak powers and repetition rates, insulation, energy loss due to stray capacitance, and system efficiency (including power supplies and modulators) become critical issues. The EnantioMorphic (mirror image) BLumlein (EMBL) is a new type of vector inversion transmission line pulser which is designed to alleviate some of these problems. The design goals for EMBL are: >500 kV, approximately 1 kHz rep-rate, and <100 ps risetime in a 50 ohm geometry. In addition to the pulse forming line (PFL), EMBL also requires a high rep-rate modulator, primary switch, and peaking switch, which will be described. Empirical design equations for peaking switch performance are included.
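As a rough sanity check on the design goals quoted above, the sketch below works out the peak power that a 500 kV pulse would deliver into a 50 ohm load and the approximate bandwidth implied by a 100 ps risetime. The matched resistive load and the 0.35 / risetime rule of thumb (which holds for a single-pole response) are assumptions made for illustration; neither comes from the abstract.

```python
def peak_power_watts(voltage_v: float, impedance_ohm: float) -> float:
    """Peak power into a resistive load: P = V^2 / Z (matched load assumed)."""
    return voltage_v ** 2 / impedance_ohm


def bandwidth_hz(risetime_s: float) -> float:
    """Approximate 3 dB bandwidth from 10-90% risetime (single-pole rule of thumb)."""
    return 0.35 / risetime_s


# Design goals from the abstract: >500 kV, <100 ps, 50 ohm.
power = peak_power_watts(500e3, 50.0)
bandwidth = bandwidth_hz(100e-12)
print(f"peak power ~ {power / 1e9:.1f} GW, bandwidth ~ {bandwidth / 1e9:.1f} GHz")
```

Because the goals are stated as bounds (>500 kV, <100 ps), both figures should be read as lower limits under these assumptions.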

  2. An enantiomorphic blumlein impulse generator

    SciTech Connect

    Rinehart, L.F.; Buttram, M.T.; Crowe, W.R.; Clark, R.S.; Lundstrom, J.M.; Patterson, P.E.

    1992-07-01

    Working designs exist for 1 GW, 1 kHz ultra-wideband (UWB) sources (e.g. SNIPER). As these generators are pressed to higher peak powers and repetition rates, insulation, energy loss due to stray capacitance, and system efficiency (including power supplies and modulators) become critical issues. The EnantioMorphic (mirror image) BLumlein (EMBL) is a new type of vector inversion transmission line pulser which is designed to alleviate some of these problems. The design goals for EMBL are: >500 kV, approximately 1 kHz rep-rate, and <100 ps risetime in a 50 ohm geometry. In addition to the pulse forming line (PFL), EMBL also requires a high rep-rate modulator, primary switch, and peaking switch, which will be described. Empirical design equations for peaking switch performance are included.

  3. Showing Enantiomorphous Crystals of Tartaric Acid

    ERIC Educational Resources Information Center

    Andrade-Gamboa, Julio

    2007-01-01

    Most of the articles and textbooks that show drawings of enantiomorphous crystals use a view that makes it hard to appreciate that they are non-superimposable mirror images of one another. If a graphical presentation of crystal chirality is not evident, the main attribute of crystal enantiomorphism cannot be recognized by students. The classic…

  4. The EMBL Nucleotide Sequence Database.

    PubMed

    Stoesser, G; Tuli, M A; Lopez, R; Sterk, P

    1999-01-01

    The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl.html) constitutes Europe's primary nucleotide sequence resource. Main sources for DNA and RNA sequences are direct submissions from individual researchers, genome sequencing projects and patent applications. While automatic procedures allow incorporation of sequence data from large-scale genome sequencing centres and from the European Patent Office (EPO), the preferred submission tool for individual submitters is Webin (WWW). Through all stages, dataflow is monitored by EBI biologists communicating with the sequencing groups. In collaboration with DDBJ and GenBank the database is produced, maintained and distributed at the European Bioinformatics Institute (EBI). Database releases are produced quarterly and are distributed on CD-ROM. Network services allow access to the most up-to-date data collection via Internet and World Wide Web interface. EBI's Sequence Retrieval System (SRS) is a Network Browser for Databanks in Molecular Biology, integrating and linking the main nucleotide and protein databases, plus many specialised databases. For sequence similarity searching a variety of tools (e.g. Blitz, Fasta, Blast etc) are available for external users to compare their own sequences against the most currently available data in the EMBL Nucleotide Sequence Database and SWISS-PROT. PMID:9847133

  5. Enantiomorphism of kaolinite: Manifestation at the levels of elementary layer and microcrystals

    SciTech Connect

    Samotoin, N. D.

    2011-03-15

    The right and left forms of the argillaceous mineral kaolinite (Al2Si2O5(OH)4), which is widespread in nature, have been revealed for the first time by transmission electron microscopy and gold decoration in vacuum. The enantiomorphic forms of this mineral are established at the level of the elementary 7 Å layer, which determines the kaolinite structure, and at the level of nano- and microcrystals typical of this mineral. Both kaolinite forms are widespread in ancient and young weathering crusts. Enantiomorphic kaolinite microcrystals are formed in two ways: through the periodic formation of 2D nuclei and via helical growth, which is dominant for both kaolinite forms. The right- and left-handed kaolinite forms are observed in the samples under study with equal probability.

  6. Circular polarization of light by planet Mercury and enantiomorphism of its surface minerals.

    PubMed

    Meierhenrich, Uwe J; Thiemann, Wolfram H P; Barbier, Bernard; Brack, André; Alcaraz, Christian; Nahon, Laurent; Wolstencroft, Ray

    2002-04-01

    Different mechanisms for the generation of circular polarization by the surface of planets and satellites are described. The observed values for Venus, the Moon, Mars, and Jupiter, obtained by photo-polarimetric measurements with Earth-based telescopes, agreed with theory. However, for planet Mercury asymmetric parameters in the circular polarization were measured that do not fit with calculations. For BepiColombo, the ESA cornerstone mission 5 to Mercury, we propose to investigate this phenomenon using a concept which includes two instruments. The first instrument is a high-resolution optical polarimeter, capable of determining and mapping the circular polarization by remote scanning of Mercury's surface from the Mercury Planetary Orbiter MPO. The second instrument is an in situ sensor for the detection of the enantiomorphism of surface crystals and minerals, proposed to be included in the Mercury Lander MSE. PMID:12185675

  7. The ChEMBL database as linked open data

    PubMed Central

    2013-01-01

    Background: Making data available as Linked Data using Resource Description Framework (RDF) promotes integration with other web resources. RDF documents can natively link to related data, and others can link back using Uniform Resource Identifiers (URIs). RDF makes the data machine-readable and uses extensible vocabularies for additional information, making it easier to scale up inference and data analysis. Results: This paper describes recent developments in an ongoing project converting data from the ChEMBL database into RDF triples. Relative to earlier versions, this updated version of ChEMBL-RDF uses recently introduced ontologies, including CHEMINF and CiTO; exposes more information from the database; and is now available as dereferenceable, linked data. To demonstrate these new features, we present novel use cases showing further integration with other web resources, including Bio2RDF, Chem2Bio2RDF, and ChemSpider, and showing the use of standard ontologies for querying. Conclusions: We have illustrated the advantages of using open standards and ontologies to link the ChEMBL database to other databases. Using those links and the knowledge encoded in standards and ontologies, the ChEMBL-RDF resource creates a foundation for integrated semantic web cheminformatics applications, such as the presented decision support. PMID:23657106
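To make the idea of "RDF triples" linked by URIs concrete, here is a minimal sketch that serializes subject-predicate-object statements as N-Triples lines. The `example.org` URIs are hypothetical placeholders, not the identifiers used by ChEMBL-RDF; only the `rdfs:label` and `owl:sameAs` predicates are standard vocabulary terms.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def ntriple(subject: str, predicate: str, obj: str, literal: bool = False) -> str:
    """Serialize one triple as an N-Triples line (URIs in <>, literals quoted)."""
    if literal:
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        obj_term = f'"{escaped}"'
    else:
        obj_term = f"<{obj}>"
    return f"<{subject}> <{predicate}> {obj_term} ."


# Hypothetical URIs, for illustration only.
molecule = "http://example.org/chembl/molecule/CHEMBL25"
triples = [
    ntriple(molecule, RDFS_LABEL, "aspirin", literal=True),
    # Linking out to another resource is just another triple with a URI object.
    ntriple(molecule, OWL_SAME_AS, "http://example.org/other-db/compound/2157"),
]
print("\n".join(triples))
```

A production workflow would more likely use an RDF library than hand-built strings; the point here is only that each statement is a URI-addressed triple that other datasets can link to.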

  8. The ChEMBL bioactivity database: an update.

    PubMed

    Bento, A Patrícia; Gaulton, Anna; Hersey, Anne; Bellis, Louisa J; Chambers, Jon; Davies, Mark; Krüger, Felix A; Light, Yvonne; Mak, Lora; McGlinchey, Shaun; Nowotka, Michal; Papadatos, George; Santos, Rita; Overington, John P

    2014-01-01

    ChEMBL is an open large-scale bioactivity database (https://www.ebi.ac.uk/chembl), previously described in the 2012 Nucleic Acids Research Database Issue. Since then, a variety of new data sources and improvements in functionality have contributed to the growth and utility of the resource. In particular, more comprehensive tracking of compounds from research stages through clinical development to market is provided through the inclusion of data from United States Adopted Name applications; a new richer data model for representing drug targets has been developed; and a number of methods have been put in place to allow users to more easily identify reliable data. Finally, access to ChEMBL is now available via a new Resource Description Framework format, in addition to the web-based interface, data downloads and web services. PMID:24214965

  9. The ChEMBL bioactivity database: an update

    PubMed Central

    Bento, A. Patrícia; Gaulton, Anna; Hersey, Anne; Bellis, Louisa J.; Chambers, Jon; Davies, Mark; Krüger, Felix A.; Light, Yvonne; Mak, Lora; McGlinchey, Shaun; Nowotka, Michal; Papadatos, George; Santos, Rita; Overington, John P.

    2014-01-01

    ChEMBL is an open large-scale bioactivity database (https://www.ebi.ac.uk/chembl), previously described in the 2012 Nucleic Acids Research Database Issue. Since then, a variety of new data sources and improvements in functionality have contributed to the growth and utility of the resource. In particular, more comprehensive tracking of compounds from research stages through clinical development to market is provided through the inclusion of data from United States Adopted Name applications; a new richer data model for representing drug targets has been developed; and a number of methods have been put in place to allow users to more easily identify reliable data. Finally, access to ChEMBL is now available via a new Resource Description Framework format, in addition to the web-based interface, data downloads and web services. PMID:24214965

  10. Enantiomorphous Periodic Mesoporous Organosilica-Based Nanocomposite Hydrogel Scaffolds for Cell Adhesion and Cell Enrichment.

    PubMed

    Kehr, Nermin Seda

    2016-03-14

    The chemical functionalization of nanomaterials with bioactive molecules has been used as an effective tool to mimic extracellular matrix (ECM) and to study the cell-material interaction in tissue engineering applications. In this respect, this study demonstrates the use of enantiomerically functionalized periodic mesoporous organosilicas (PMO) for the generation of new multifunctional 3D nanocomposite (NC) hydrogels to control the affinity of cells to the hydrogel surfaces and so to control the enrichment of cells and simultaneous drug delivery in 3D network. The functionalization of PMO with enantiomers of bioactive molecules, preparation of their nanocomposite hydrogels, and the stereoselective interaction of them with selected cell types are described. The results show that the affinity of cells to the respective NC hydrogel scaffolds is affected by the nature of the biomolecule and its enantiomers, which is more pronounced in serum containing media. The differentiation of enantiomorphous NC hydrogels by cells is used to enrich one cell type from a mixture of two cells. Finally, PMO are utilized as nanocontainers to release two different dye molecules as a proof of principle for multidrug delivery in 3D NC hydrogel scaffolds. PMID:26811946

  11. Conformerism, enantiomorphism and double catemer motifs in para-substituted nostoclide analogues

    NASA Astrophysics Data System (ADS)

    Teixeira, Róbson Ricardo; Barbosa, Luiz Claudio Almeida; Valero Antolinez, Isabel; de Souza Corrêa, Rodrigo; Martins, Felipe Terra; Doriguetto, Antônio Carlos

    2016-02-01

    We have here elucidated the crystal structures of five nostoclide analogues. A common feature in all compounds is a substituent at the para-position of the benzylidene group. Compounds with either bromine (3) or hydroxyl (4) as para-substituent crystallize with Z' = 2 as a result of conformerism. It was also observed that Z' > 1 in the compound with a para-dimethylamino substituent (1). However, its four crystallographically independent molecules are conformationally similar. They are not related by crystallographic symmetry due to the offset packing of their C-H…O=C nonclassical hydrogen-bonded double chains. This compound (1) has also crystallized in a chiral space group (P21) despite the lack of a stereocenter. This enantiomorphism is related to the presence of only one of the two mirror benzyl conformations, with the phenyl ring at the equatorial position opposite the lactone oxygen atom. The molecular mean plane of the nostoclide analogues is characterized by a high level of planarity, except in the brominated compound, where two twisted conformations occurred due to rotations about the single-bond axis of the benzylidene group. The benzyl conformation is the greatest difference between the two crystallographically independent molecules of the para-hydroxylated compound (4). The crystal packing of the compounds is marked by a double catemer motif assembled through C-H…O=C nonclassical hydrogen bonds, although C-H…π interactions do play an important role in stabilizing the crystal packing of some compounds of the series.

  12. ChEMBL web services: streamlining access to drug discovery data and utilities

    PubMed Central

    Davies, Mark; Nowotka, Michał; Papadatos, George; Dedman, Nathan; Gaulton, Anna; Atkinson, Francis; Bellis, Louisa; Overington, John P.

    2015-01-01

    ChEMBL is now a well-established resource in the fields of drug discovery and medicinal chemistry research. The ChEMBL database curates and stores standardized bioactivity, molecule, target and drug data extracted from multiple sources, including the primary medicinal chemistry literature. Programmatic access to ChEMBL data has been improved by a recent update to the ChEMBL web services (version 2.0.x, https://www.ebi.ac.uk/chembl/api/data/docs), which exposes significantly more data from the underlying database and introduces new functionality. To complement the data-focused services, a utility service (version 1.0.x, https://www.ebi.ac.uk/chembl/api/utils/docs), which provides RESTful access to commonly used cheminformatics methods, has also been concurrently developed. The ChEMBL web services can be used together or independently to build applications and data processing workflows relevant to drug discovery and chemical biology. PMID:25883136
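As an illustration of the programmatic access described above, the following sketch builds request URLs against the data-service base URL given in the abstract. The resource name `molecule`, the identifier `CHEMBL25`, and the `.json` suffix convention are assumptions made for the example and should be checked against the service documentation; `fetch_json` is a hypothetical helper and is not called here, so the snippet runs without network access.

```python
import json
from urllib.request import urlopen

# Base URL taken from the abstract (minus the trailing /docs page).
BASE_URL = "https://www.ebi.ac.uk/chembl/api/data"


def resource_url(resource: str, identifier: str, fmt: str = "json") -> str:
    """Build a URL for a single record; the path layout is an assumption."""
    return f"{BASE_URL}/{resource}/{identifier}.{fmt}"


def fetch_json(url: str, timeout: float = 30.0) -> dict:
    """Hypothetical helper: GET the URL and decode the JSON body."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


print(resource_url("molecule", "CHEMBL25"))
```

Keeping URL construction separate from the network call makes the request logic easy to test offline and to reuse in a larger data-processing workflow.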

  13. The EMBL-EBI bioinformatics web and programmatic tools framework.

    PubMed

    Li, Weizhong; Cowley, Andrew; Uludag, Mahmut; Gur, Tamer; McWilliam, Hamish; Squizzato, Silvano; Park, Young Mi; Buso, Nicola; Lopez, Rodrigo

    2015-07-01

    Since 2009 the EMBL-EBI Job Dispatcher framework has provided free access to a range of mainstream sequence analysis applications. These include sequence similarity search services (https://www.ebi.ac.uk/Tools/sss/) such as BLAST, FASTA and PSI-Search, multiple sequence alignment tools (https://www.ebi.ac.uk/Tools/msa/) such as Clustal Omega, MAFFT and T-Coffee, and other sequence analysis tools (https://www.ebi.ac.uk/Tools/pfa/) such as InterProScan. Through these services users can search mainstream sequence databases such as ENA, UniProt and Ensembl Genomes, utilising a uniform web interface or systematically through Web Services interfaces (https://www.ebi.ac.uk/Tools/webservices/) using common programming languages, and obtain enriched results with novel visualisations. Integration with EBI Search (https://www.ebi.ac.uk/ebisearch/) and the dbfetch retrieval service (https://www.ebi.ac.uk/Tools/dbfetch/) further expands the usefulness of the framework. New tools and updates such as NCBI BLAST+, InterProScan 5 and PfamScan, new categories such as RNA analysis tools (https://www.ebi.ac.uk/Tools/rna/), new databases such as ENA non-coding, WormBase ParaSite, Pfam and Rfam, and new workflow methods, together with the retirement of deprecated services, ensure that the framework remains relevant to today's biological community. PMID:25845596
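The dbfetch retrieval service mentioned above is typically addressed with a query string. The sketch below shows one way such a request URL might be assembled; the endpoint path `dbfetch` under the URL from the abstract and the parameter names `db`, `id`, `format` and `style` are assumptions for illustration, as are the example database and accession.

```python
from urllib.parse import urlencode

# Service location from the abstract plus an assumed endpoint name.
DBFETCH_URL = "https://www.ebi.ac.uk/Tools/dbfetch/dbfetch"


def dbfetch_url(db: str, entry_id: str, fmt: str = "fasta", style: str = "raw") -> str:
    """Build a retrieval URL; parameter names are assumed, not confirmed by the abstract."""
    query = urlencode({"db": db, "id": entry_id, "format": fmt, "style": style})
    return f"{DBFETCH_URL}?{query}"


# Example database name and accession are placeholders.
print(dbfetch_url("uniprotkb", "P12345"))
```

Using `urlencode` rather than string concatenation keeps identifiers containing reserved characters safely escaped.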

  14. Sr14Sn3As12 and Eu14Sn3As12: enantiomorph-like Zintl compounds.

    PubMed

    Liu, Xiao-Cun; Pan, Ming-Yan; Xia, Sheng-Qing; Tao, Xu-Tang

    2015-09-21

    Two new chiral Zintl compounds, Sr14Sn3As12 and Eu14Sn3As12, were synthesized from tin-flux reactions, and the structures were determined by using single-crystal X-ray diffraction. Both compounds crystallize in the trigonal space group R3 (No. 146, Z = 3) with the anion structures containing various units: dumbbell-shaped [Sn2As6](12-) dimers, [SnAs3](7-) triangular pyramids, and isolated As(3-) anions. Very interestingly, these two compounds exhibit opposite chirality in the observed crystal structures, resembling enantiomorphs. Detailed structure analyses suggest possible steric effects among the anion clusters, and on the basis of the calculated electronic structures, substantial electron lone pairs exist on the anions of both compounds, which may provide a hint to understanding the origination of chirality in these intermetallic compounds. PMID:26361335

  15. SureChEMBL: a large-scale, chemically annotated patent document database.

    PubMed

    Papadatos, George; Davies, Mark; Dedman, Nathan; Chambers, Jon; Gaulton, Anna; Siddle, James; Koks, Richard; Irvine, Sean A; Pettersson, Joe; Goncharoff, Nicko; Hersey, Anne; Overington, John P

    2016-01-01

    SureChEMBL is a publicly available large-scale resource containing compounds extracted from the full text, images and attachments of patent documents. The data are extracted from the patent literature according to an automated text and image-mining pipeline on a daily basis. SureChEMBL provides access to a previously unavailable, open and timely set of annotated compound-patent associations, complemented with sophisticated combined structure and keyword-based search capabilities against the compound repository and patent document corpus; given the wealth of knowledge hidden in patent documents, analysis of SureChEMBL data has immediate applications in drug discovery, medicinal chemistry and other commercial areas of chemical science. Currently, the database contains 17 million compounds extracted from 14 million patent documents. Access is available through a dedicated web-based interface and data downloads at: https://www.surechembl.org/. PMID:26582922

  16. SureChEMBL: a large-scale, chemically annotated patent document database

    PubMed Central

    Papadatos, George; Davies, Mark; Dedman, Nathan; Chambers, Jon; Gaulton, Anna; Siddle, James; Koks, Richard; Irvine, Sean A.; Pettersson, Joe; Goncharoff, Nicko; Hersey, Anne; Overington, John P.

    2016-01-01

    SureChEMBL is a publicly available large-scale resource containing compounds extracted from the full text, images and attachments of patent documents. The data are extracted from the patent literature according to an automated text and image-mining pipeline on a daily basis. SureChEMBL provides access to a previously unavailable, open and timely set of annotated compound-patent associations, complemented with sophisticated combined structure and keyword-based search capabilities against the compound repository and patent document corpus; given the wealth of knowledge hidden in patent documents, analysis of SureChEMBL data has immediate applications in drug discovery, medicinal chemistry and other commercial areas of chemical science. Currently, the database contains 17 million compounds extracted from 14 million patent documents. Access is available through a dedicated web-based interface and data downloads at: https://www.surechembl.org/. PMID:26582922

  17. Activity, assay and target data curation and quality in the ChEMBL database.

    PubMed

    Papadatos, George; Gaulton, Anna; Hersey, Anne; Overington, John P

    2015-09-01

    The emergence of a number of publicly available bioactivity databases, such as ChEMBL, PubChem BioAssay and BindingDB, has raised awareness about the topics of data curation, quality and integrity. Here we provide an overview and discussion of the current and future approaches to activity, assay and target data curation of the ChEMBL database. This curation process involves several manual and automated steps and aims to: (1) maximise data accessibility and comparability; (2) improve data integrity and flag outliers, ambiguities and potential errors; and (3) add further curated annotations and mappings thus increasing the usefulness and accuracy of the ChEMBL data for all users and modellers in particular. Issues related to activity, assay and target data curation and integrity along with their potential impact for users of the data are discussed, alongside robust selection and filter strategies in order to avoid or minimise these, depending on the desired application. PMID:26201396

  18. The SWISS-PROT protein sequence data bank and its supplement TrEMBL in 1998.

    PubMed Central

    Bairoch, A; Apweiler, R

    1998-01-01

    SWISS-PROT (http://www.expasy.ch/) is a curated protein sequence database which strives to provide a high level of annotations (such as the description of the function of a protein, its domains structure, post-translational modifications, variants, etc.), a minimal level of redundancy and high level of integration with other databases. Recent developments of the database include: an increase in the number and scope of model organisms; cross-references to two additional databases; a variety of new documentation files and improvements to TrEMBL, a computer annotated supplement to SWISS-PROT. TrEMBL consists of entries in SWISS-PROT-like format derived from the translation of all coding sequences (CDS) in the EMBL nucleotide sequence database, except the CDS already included in SWISS-PROT. PMID:9399796
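The core operation behind TrEMBL as described above, deriving protein entries by translating coding sequences, can be sketched in a few lines. This version assumes the standard genetic code, a CDS that starts in frame, and translation that stops at the first stop codon; how the actual TrEMBL pipeline handles alternative genetic codes, partial sequences and annotation is not covered by the abstract.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(cds: str) -> str:
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    cds = cds.upper().replace("U", "T")  # accept RNA input as well
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE.get(cds[i:i + 3], "X")  # 'X' for ambiguous codons
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate_cds("ATGGCCTAA"))
```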

  19. The SWISS-PROT protein sequence data bank and its supplement TrEMBL in 1999.

    PubMed Central

    Bairoch, A; Apweiler, R

    1999-01-01

    SWISS-PROT is a curated protein sequence database which strives to provide a high level of annotation (such as the description of the function of a protein, its domain structure, post-translational modifications, variants, etc.), a minimal level of redundancy and high level of integration with other databases. Recent developments of the database include: cross-references to additional databases; a variety of new documentation files and improvements to TrEMBL, a computer annotated supplement to SWISS-PROT. TrEMBL consists of entries in SWISS-PROT-like format derived from the translation of all coding sequences (CDS) in the EMBL nucleotide sequence database, except the CDS already included in SWISS-PROT. The URLs for SWISS-PROT on the WWW are: http://www.expasy.ch/sprot and http://www.ebi.ac.uk/sprot PMID:9847139

  20. The EBI Search engine: providing search and retrieval functionality for biological data from EMBL-EBI.

    PubMed

    Squizzato, Silvano; Park, Young Mi; Buso, Nicola; Gur, Tamer; Cowley, Andrew; Li, Weizhong; Uludag, Mahmut; Pundir, Sangya; Cham, Jennifer A; McWilliam, Hamish; Lopez, Rodrigo

    2015-07-01

    The European Bioinformatics Institute (EMBL-EBI, https://www.ebi.ac.uk) provides free and unrestricted access to data across all major areas of biology and biomedicine. Searching and extracting knowledge across these domains requires a fast and scalable solution that addresses the requirements of domain experts as well as casual users. We present the EBI Search engine, referred to here as 'EBI Search', an easy-to-use fast text search and indexing system with powerful data navigation and retrieval capabilities. API integration provides access to analytical tools, allowing users to further investigate the results of their search. The interconnectivity that exists between data resources at EMBL-EBI provides easy, quick and precise navigation and a better understanding of the relationship between different data types including sequences, genes, gene products, proteins, protein domains, protein families, enzymes and macromolecular structures, together with relevant life science literature. PMID:25855807

  1. The EBI Search engine: providing search and retrieval functionality for biological data from EMBL-EBI

    PubMed Central

    Squizzato, Silvano; Park, Young Mi; Buso, Nicola; Gur, Tamer; Cowley, Andrew; Li, Weizhong; Uludag, Mahmut; Pundir, Sangya; Cham, Jennifer A.; McWilliam, Hamish; Lopez, Rodrigo

    2015-01-01

    The European Bioinformatics Institute (EMBL-EBI—https://www.ebi.ac.uk) provides free and unrestricted access to data across all major areas of biology and biomedicine. Searching and extracting knowledge across these domains requires a fast and scalable solution that addresses the requirements of domain experts as well as casual users. We present the EBI Search engine, referred to here as ‘EBI Search’, an easy-to-use fast text search and indexing system with powerful data navigation and retrieval capabilities. API integration provides access to analytical tools, allowing users to further investigate the results of their search. The interconnectivity that exists between data resources at EMBL-EBI provides easy, quick and precise navigation and a better understanding of the relationship between different data types including sequences, genes, gene products, proteins, protein domains, protein families, enzymes and macromolecular structures, together with relevant life science literature. PMID:25855807

  2. High-quality protein knowledge resource: SWISS-PROT and TrEMBL.

    PubMed

    O'Donovan, Claire; Martin, Maria Jesus; Gattiker, Alexandre; Gasteiger, Elisabeth; Bairoch, Amos; Apweiler, Rolf

    2002-09-01

    SWISS-PROT is a curated protein sequence database which strives to provide a high level of annotation (such as the description of the function of a protein, its domain structure, post-translational modifications, variants, etc.), a minimal level of redundancy and a high level of integration with other databases. Together with its automatically annotated supplement TrEMBL, it provides a comprehensive and high-quality view of the current state of knowledge about proteins. Ongoing developments include the further improvement of functional and automatic annotation in the databases including evidence attribution with particular emphasis on the human, archaeal and bacterial proteomes and the provision of additional resources such as the International Protein Index (IPI) and XML format of SWISS-PROT and TrEMBL to the user community. PMID:12230036

  3. The SWISS-PROT protein sequence data bank and its supplement TrEMBL.

    PubMed Central

    Bairoch, A; Apweiler, R

    1997-01-01

    SWISS-PROT is a curated protein sequence database which strives to provide a high level of annotations (such as the description of the function of a protein, structure of its domains, post-translational modifications, variants, etc.), a minimal level of redundancy and high level of integration with other databases. Recent developments of the database include: an increase in the number and scope of model organisms; cross-references to two additional databases; a variety of new documentation files and the creation of TrEMBL, a computer annotated supplement to SWISS-PROT. This supplement consists of entries in SWISS-PROT-like format derived from the translation of all coding sequences (CDS) in the EMBL nucleotide sequence database, except the CDS already included in SWISS-PROT. PMID:9016499

  4. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000

    PubMed Central

    Bairoch, Amos; Apweiler, Rolf

    2000-01-01

    SWISS-PROT is a curated protein sequence database which strives to provide a high level of annotation (such as the description of the function of a protein, its domains structure, post-translational modifications, variants, etc.), a minimal level of redundancy and high level of integration with other databases. Recent developments of the database include format and content enhancements, cross-references to additional databases, new documentation files and improvements to TrEMBL, a computer-annotated supplement to SWISS-PROT. TrEMBL consists of entries in SWISS-PROT-like format derived from the translation of all coding sequences (CDSs) in the EMBL Nucleotide Sequence Database, except the CDSs already included in SWISS-PROT. We also describe the Human Proteomics Initiative (HPI), a major project to annotate all known human sequences according to the quality standards of SWISS-PROT. SWISS-PROT is available at: http://www.expasy.ch/sprot/ and http://www.ebi.ac.uk/swissprot/ PMID:10592178

  5. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000.

    PubMed

    Bairoch, A; Apweiler, R

    2000-01-01

    SWISS-PROT is a curated protein sequence database which strives to provide a high level of annotation (such as the description of the function of a protein, its domains structure, post-translational modifications, variants, etc.), a minimal level of redundancy and high level of integration with other databases. Recent developments of the database include format and content enhancements, cross-references to additional databases, new documentation files and improvements to TrEMBL, a computer-annotated supplement to SWISS-PROT. TrEMBL consists of entries in SWISS-PROT-like format derived from the translation of all coding sequences (CDSs) in the EMBL Nucleotide Sequence Database, except the CDSs already included in SWISS-PROT. We also describe the Human Proteomics Initiative (HPI), a major project to annotate all known human sequences according to the quality standards of SWISS-PROT. SWISS-PROT is available at: http://www.expasy.ch/sprot/ and http://www.ebi.ac.uk/swissprot/ PMID:10592178

  6. Using EMBL-EBI services via Web interface and programmatically via Web Services

    PubMed Central

    Lopez, Rodrigo; Cowley, Andrew; Li, Weizhong; McWilliam, Hamish

    2015-01-01

    The European Bioinformatics Institute (EMBL-EBI) provides access to a wide range of databases and analysis tools that are of key importance in bioinformatics. As well as providing Web interfaces to these resources, Web Services are available using SOAP and REST protocols that enable programmatic access to our resources and allow their integration into other applications and analytical workflows. This unit describes the various options available to a typical researcher or bioinformatician who wishes to use our resources via Web interface or programmatically via a range of programming languages. PMID:25501941

  7. An integrated pipeline for sample preparation and characterization at the EMBL@PETRA3 synchrotron facilities.

    PubMed

    Boivin, Stephane; Kozak, Sandra; Rasmussen, Gry; Nemtanu, Ioana Maria; Vieira, Vanessa; Meijers, Rob

    2016-02-15

    The characterization of macromolecular samples at synchrotrons has traditionally been restricted to direct exposure to X-rays, but beamline automation and diversification of the user community has led to the establishment of complementary characterization facilities off-line. The Sample Preparation and Characterization (SPC) facility at the EMBL@PETRA3 synchrotron provides synchrotron users access to a range of biophysical techniques for preliminary or parallel sample characterization, to optimize sample usage at the beamlines. Here we describe a sample pipeline from bench to beamline, to assist successful structural characterization using small angle X-ray scattering (SAXS) or macromolecular X-ray crystallography (MX). The SPC has developed a range of quality control protocols to assess incoming samples and to suggest optimization protocols. A high-throughput crystallization platform has been adapted to reach a broader user community, to include chemists and biologists that are not experts in structural biology. The SPC in combination with the beamline and computational facilities at EMBL Hamburg provide a full package of integrated facilities for structural biology and can serve as model for implementation of such resources for other infrastructures. PMID:26255961

  8. Similarity Mapplet: Interactive Visualization of the Directory of Useful Decoys and ChEMBL in High Dimensional Chemical Spaces.

    PubMed

    Awale, Mahendra; Reymond, Jean-Louis

    2015-08-24

    An Internet portal accessible at www.gdb.unibe.ch has been set up to automatically generate color-coded similarity maps of the ChEMBL database in relation to up to two sets of active compounds taken from the enhanced Directory of Useful Decoys (eDUD), a random set of molecules, or up to two sets of user-defined reference molecules. These maps visualize the relationships between the selected compounds and ChEMBL in six different high dimensional chemical spaces, namely MQN (42-D molecular quantum numbers), SMIfp (34-D SMILES fingerprint), APfp (20-D shape fingerprint), Xfp (55-D pharmacophore fingerprint), Sfp (1024-bit substructure fingerprint), and ECfp4 (1024-bit extended connectivity fingerprint). The maps are supplied in the form of Java-based desktop applications called "similarity mapplets", which allow interactive content browsing and are linked to a "Multifingerprint Browser for ChEMBL" (also accessible directly at www.gdb.unibe.ch) to perform nearest neighbor searches. One can obtain six similarity mapplets of ChEMBL relative to random reference compounds, 606 similarity mapplets relative to single eDUD active sets, 30,300 similarity mapplets relative to pairs of eDUD active sets, and any number of similarity mapplets relative to user-defined reference sets to help visualize the structural diversity of compound series in drug optimization projects and their relationship to other known bioactive compounds. PMID:26207526
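
    The nearest neighbor searches mentioned in this record can be illustrated with a minimal sketch. It assumes compounds are represented as integer descriptor vectors (as MQN-style descriptors are) and that neighbors are ranked by city-block (Manhattan) distance; the distance metric, the 4-D toy vectors, the compound names, and the function names are all hypothetical choices for illustration and are not taken from the abstract.

```python
from typing import Dict, List, Sequence, Tuple


def city_block(a: Sequence[int], b: Sequence[int]) -> int:
    """City-block (Manhattan) distance between two equal-length descriptor vectors."""
    if len(a) != len(b):
        raise ValueError("descriptor vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_neighbors(
    query: Sequence[int],
    library: Dict[str, Sequence[int]],
    k: int,
) -> List[Tuple[str, int]]:
    """Return the k library entries closest to the query as (name, distance) pairs."""
    scored = [(name, city_block(query, vec)) for name, vec in library.items()]
    # Sort by distance first, then by name so that ties are resolved deterministically.
    scored.sort(key=lambda item: (item[1], item[0]))
    return scored[:k]


# Hypothetical 4-D integer descriptors; real MQN vectors have 42 components.
toy_library = {
    "cmpd_A": (6, 1, 0, 2),
    "cmpd_B": (6, 2, 0, 2),
    "cmpd_C": (12, 0, 3, 5),
    "cmpd_D": (7, 1, 1, 2),
}
query = (6, 1, 1, 2)

hits = nearest_neighbors(query, toy_library, k=2)
print(hits)
```

    A brute-force scan like this is only practical for small libraries; it is meant to show the ranking idea, not how the portal itself is implemented.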

  9. Chemical Space Mapping and Structure-Activity Analysis of the ChEMBL Antiviral Compound Set.

    PubMed

    Klimenko, Kyrylo; Marcou, Gilles; Horvath, Dragos; Varnek, Alexandre

    2016-08-22

    Curation, standardization and data fusion of the antiviral information present in the ChEMBL public database led to the definition of a robust data set, providing an association of antiviral compounds to seven broadly defined antiviral activity classes. Generative topographic mapping (GTM) subjected to evolutionary tuning was then used to produce maps of the antiviral chemical space, providing an optimal separation of compound families associated with the different antiviral classes. The ability to pinpoint the specific spots occupied (responsibility patterns) on a map by various classes of antiviral compounds opened the way for a GTM-supported search for privileged structural motifs, typical for each antiviral class. The privileged locations of antiviral classes were analyzed in order to highlight underlying privileged common structural motifs. Unlike in classical medicinal chemistry, where privileged structures are, almost always, predefined scaffolds, privileged structural motif detection based on GTM responsibility patterns has the decisive advantage of being able to automatically capture the nature ("resolution detail"-scaffold, detailed substructure, pharmacophore pattern, etc.) of the relevant structural motifs. Responsibility patterns were found to represent underlying structural motifs of various natures-from very fuzzy (groups of various "interchangeable" similar scaffolds), to the classical scenario in medicinal chemistry (underlying motif actually being the scaffold), to very precisely defined motifs (specifically substituted scaffolds). PMID:27410486

  10. The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003.

    PubMed

    Boeckmann, Brigitte; Bairoch, Amos; Apweiler, Rolf; Blatter, Marie-Claude; Estreicher, Anne; Gasteiger, Elisabeth; Martin, Maria J; Michoud, Karine; O'Donovan, Claire; Phan, Isabelle; Pilbout, Sandrine; Schneider, Michel

    2003-01-01

    The SWISS-PROT protein knowledgebase (http://www.expasy.org/sprot/ and http://www.ebi.ac.uk/swissprot/) connects amino acid sequences with the current knowledge in the Life Sciences. Each protein entry provides an interdisciplinary overview of relevant information by bringing together experimental results, computed features and sometimes even contradictory conclusions. Detailed expertise that goes beyond the scope of SWISS-PROT is made available via direct links to specialised databases. SWISS-PROT provides annotated entries for all species, but concentrates on the annotation of entries from human (the HPI project) and other model organisms to ensure the presence of high quality annotation for representative members of all protein families. Part of the annotation can be transferred to other family members, as is already done for microbes by the High-quality Automated and Manual Annotation of microbial Proteomes (HAMAP) project. Protein families and groups of proteins are regularly reviewed to keep up with current scientific findings. Complementarily, TrEMBL strives to comprise all protein sequences that are not yet represented in SWISS-PROT, by incorporating a perpetually increasing level of mostly automated annotation. Researchers are welcome to contribute their knowledge to the scientific community by submitting relevant findings to SWISS-PROT at swiss-prot@expasy.org. PMID:12520024

  11. The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003

    PubMed Central

    Boeckmann, Brigitte; Bairoch, Amos; Apweiler, Rolf; Blatter, Marie-Claude; Estreicher, Anne; Gasteiger, Elisabeth; Martin, Maria J.; Michoud, Karine; O'Donovan, Claire; Phan, Isabelle; Pilbout, Sandrine; Schneider, Michel

    2003-01-01

    The SWISS-PROT protein knowledgebase (http://www.expasy.org/sprot/ and http://www.ebi.ac.uk/swissprot/) connects amino acid sequences with the current knowledge in the Life Sciences. Each protein entry provides an interdisciplinary overview of relevant information by bringing together experimental results, computed features and sometimes even contradictory conclusions. Detailed expertise that goes beyond the scope of SWISS-PROT is made available via direct links to specialised databases. SWISS-PROT provides annotated entries for all species, but concentrates on the annotation of entries from human (the HPI project) and other model organisms to ensure the presence of high quality annotation for representative members of all protein families. Part of the annotation can be transferred to other family members, as is already done for microbes by the High-quality Automated and Manual Annotation of microbial Proteomes (HAMAP) project. Protein families and groups of proteins are regularly reviewed to keep up with current scientific findings. Complementarily, TrEMBL strives to comprise all protein sequences that are not yet represented in SWISS-PROT, by incorporating a perpetually increasing level of mostly automated annotation. Researchers are welcome to contribute their knowledge to the scientific community by submitting relevant findings to SWISS-PROT at swiss-prot@expasy.org. PMID:12520024

  12. Submitting MIGS, MIMS, MIENS Information to EMBL and Standards and the Sequencing Pipelines of the Gordon and Betty Moore Foundation (GSC8 Meeting)

    ScienceCinema

    Vaughan, Bob [EMBL]; Kaye, Jon [Gordon and Betty Moore Foundation]

    2011-04-29

    The Genomic Standards Consortium was formed in September 2005. It is an international, open-membership working body which promotes standardization in the description of genomes and the exchange and integration of genomic data. The 2009 meeting was an activity of a five-year "Research Coordination Network" grant from the National Science Foundation and was held at the DOE Joint Genome Institute with organizational support provided by the JGI and by the University of California - San Diego. Bob Vaughan of EMBL speaks on submitting MIGS/MIMS/MIENS information to EMBL-EBI's system, followed by a brief talk from Jon Kaye of the Gordon and Betty Moore Foundation on standards and the foundation's sequencing pipelines at the Genomic Standards Consortium's 8th meeting at the DOE JGI in Walnut Creek, Calif. on Sept. 9, 2009.

  13. Submitting MIGS, MIMS, MIENS Information to EMBL and Standards and the Sequencing Pipelines of the Gordon and Betty Moore Foundation (GSC8 Meeting)

    SciTech Connect

    Vaughan, Bob; Kaye, Jon

    2009-09-09

    The Genomic Standards Consortium was formed in September 2005. It is an international, open-membership working body which promotes standardization in the description of genomes and the exchange and integration of genomic data. The 2009 meeting was an activity of a five-year "Research Coordination Network" grant from the National Science Foundation and was held at the DOE Joint Genome Institute with organizational support provided by the JGI and by the University of California - San Diego. Bob Vaughan of EMBL speaks on submitting MIGS/MIMS/MIENS information to EMBL-EBI's system, followed by a brief talk from Jon Kaye of the Gordon and Betty Moore Foundation on standards and the foundation's sequencing pipelines at the Genomic Standards Consortium's 8th meeting at the DOE JGI in Walnut Creek, Calif. on Sept. 9, 2009.

  14. Automated sample-changing robot for solution scattering experiments at the EMBL Hamburg SAXS station X33.

    PubMed

    Round, A R; Franke, D; Moritz, S; Huchler, R; Fritsche, M; Malthan, D; Klaering, R; Svergun, D I; Roessle, M

    2008-10-01

    There is a rapidly increasing interest in the use of synchrotron small-angle X-ray scattering (SAXS) for large-scale studies of biological macromolecules in solution, and this requires an adequate means of automating the experiment. A prototype has been developed of an automated sample changer for solution SAXS, where the solutions are kept in thermostatically controlled well plates allowing for operation with up to 192 samples. The measuring protocol involves controlled loading of protein solutions and matching buffers, followed by cleaning and drying of the cell between measurements. The system was installed and tested at the X33 beamline of the EMBL, at the storage ring DORIS-III (DESY, Hamburg), where it was used by over 50 external groups during 2007. At X33, a throughput of approximately 12 samples per hour, with a failure rate of sample loading of less than 0.5%, was observed. The feedback from users indicates that the ease of use and reliability of the user operation at the beamline were greatly improved compared with the manual filling mode. The changer is controlled by a client-server-based network protocol, locally and remotely. During the testing phase, the changer was operated in an attended mode to assess its reliability and convenience. Full integration with the beamline control software, allowing for automated data collection of all samples loaded into the machine with remote control from the user, is presently being implemented. The approach reported is not limited to synchrotron-based SAXS but can also be used on laboratory and neutron sources. PMID:25484841

  15. Automated sample-changing robot for solution scattering experiments at the EMBL Hamburg SAXS station X33

    PubMed Central

    Round, A. R.; Franke, D.; Moritz, S.; Huchler, R.; Fritsche, M.; Malthan, D.; Klaering, R.; Svergun, D. I.; Roessle, M.

    2008-01-01

    There is a rapidly increasing interest in the use of synchrotron small-angle X-ray scattering (SAXS) for large-scale studies of biological macromolecules in solution, and this requires an adequate means of automating the experiment. A prototype has been developed of an automated sample changer for solution SAXS, where the solutions are kept in thermostatically controlled well plates allowing for operation with up to 192 samples. The measuring protocol involves controlled loading of protein solutions and matching buffers, followed by cleaning and drying of the cell between measurements. The system was installed and tested at the X33 beamline of the EMBL, at the storage ring DORIS-III (DESY, Hamburg), where it was used by over 50 external groups during 2007. At X33, a throughput of approximately 12 samples per hour, with a failure rate of sample loading of less than 0.5%, was observed. The feedback from users indicates that the ease of use and reliability of the user operation at the beamline were greatly improved compared with the manual filling mode. The changer is controlled by a client–server-based network protocol, locally and remotely. During the testing phase, the changer was operated in an attended mode to assess its reliability and convenience. Full integration with the beamline control software, allowing for automated data collection of all samples loaded into the machine with remote control from the user, is presently being implemented. The approach reported is not limited to synchrotron-based SAXS but can also be used on laboratory and neutron sources. PMID:25484841

  16. BLAST2SRS, a web server for flexible retrieval of related protein sequences in the SWISS-PROT and SPTrEMBL databases

    PubMed Central

    Bimpikis, Konstantinos; Budd, Aidan; Linding, Rune; Gibson, Toby J.

    2003-01-01

    SRS (Sequence Retrieval System) is a widely used keyword search engine for querying biological databases. BLAST2 is the most widely used tool to query databases by sequence similarity search. These tools allow users to retrieve sequences by shared keyword or by shared similarity, with many public web servers available. However, with the increasingly large datasets available it is now quite common that a user is interested in some subset of homologous sequences but has no efficient way to restrict retrieval to that set. By allowing the user to control SRS from the BLAST output, BLAST2SRS (http://blast2srs.embl.de/) aims to meet this need. This server therefore combines the two ways to search sequence databases: similarity and keyword. PMID:12824420

  17. Centrosome and spindle pole body dynamics. Review and abstracts of the EMBO/EMBL Conference on Centrosomes and Spindle Pole Bodies, Heidelberg, September 13-17, 2002.

    PubMed

    2003-02-01

    Five years after the first meeting held on Centrosomes and Spindle Pole Bodies, a second meeting was organized by Tano Gonzalez, Eric Karsenti, Kip Sluder, and Mark Winey in Heidelberg, Germany. Sponsored by the gracious European community (EMBO/EMBL), the meeting was both spectacular and exhausting. The wealth of information delivered, the plethora of model systems and unique approaches described, and the free exchange of information by a cooperative and excited community of scientists overwhelmed all participants. Even the best prepared scholars could not have anticipated the avalanche of data and insights that poured from the presentations from beginning to end. Daily posters by young and senior scientists added dimension to round out the well-planned series of presentations. The meeting began with opening remarks by Eric Karsenti and Michel Bornens who reminded participants of the historical questions of the field. Where does the centrosome come from? What are the mechanisms that control centrosome assembly and duplication? How is duplication coordinated with the cell cycle? Why do some cells have centrosomes, while others do not? What are the components of the centrosome? Does the centrosome play an important role in disease? PMID:12529860

  18. Discovery of Potent Positive Allosteric Modulators of the α3β2 Nicotinic Acetylcholine Receptor by a Chemical Space Walk in ChEMBL

    PubMed Central

    2014-01-01

    While a plethora of ligands are known for the well studied α7 and α4β2 nicotinic acetylcholine receptor (nAChR), only very few ligands address the related α3β2 nAChR expressed in the central nervous system and at the neuromuscular junction. Starting with the public database ChEMBL organized in the chemical space of Molecular Quantum Numbers (MQN, a series of 42 integer value descriptors of molecular structure), a visual survey of nearest neighbors of the α7 nAChR partial agonist N-(3R)-1-azabicyclo[2.2.2]oct-3-yl-4-chlorobenzamide (PNU-282,987) pointed to N-(2-halobenzyl)-3-aminoquinuclidines as possible nAChR modulators. This simple “chemical space walk” was performed using a web-browser available at www.gdb.unibe.ch. Electrophysiological recordings revealed that these ligands represent a new and to date most potent class of positive allosteric modulators (PAMs) of the α3β2 nAChR, which also exert significant effects in vivo. The present discovery highlights the value of surveying chemical space neighbors of known drugs within public databases to uncover new pharmacology. PMID:24593915

  19. Template CoMFA Generates Single 3D-QSAR Models that, for Twelve of Twelve Biological Targets, Predict All ChEMBL-Tabulated Affinities

    PubMed Central

    Cramer, Richard D.

    2015-01-01

    The possible applicability of the new template CoMFA methodology to the prediction of unknown biological affinities was explored. For twelve selected targets, all ChEMBL binding affinities were used as training and/or prediction sets, making these 3D-QSAR models the most structurally diverse and among the largest ever. For six of the targets, X-ray crystallographic structures provided the aligned templates required as input (BACE, cdk1, chk2, carbonic anhydrase-II, factor Xa, PTP1B). For all targets including the other six (hERG, cyp3A4 binding, endocrine receptor, COX2, D2, and GABAa), six modeling protocols applied to only three familiar ligands provided six alternate sets of aligned templates. The statistical qualities of the six or seven models thus resulting for each individual target were remarkably similar. Also, perhaps unexpectedly, the standard deviations of the errors of cross-validation predictions accompanying model derivations were indistinguishable from the standard deviations of the errors of truly prospective predictions. These standard deviations of prediction ranged from 0.70 to 1.14 log units and averaged 0.89 (8x in concentration units) over the twelve targets, representing an average reduction of almost 50% in uncertainty, compared to the null hypothesis of “predicting” an unknown affinity to be the average of known affinities. These errors of prediction are similar to those from Tanimoto coefficients of fragment occurrence frequencies, the predominant approach to side effect prediction, which template CoMFA can augment by identifying additional active structural classes, by improving Tanimoto-only predictions, by yielding quantitative predictions of potency, and by providing interpretable guidance for avoiding or enhancing any specific target response. PMID:26065424
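
    The abstract's equivalence between 0.89 log units and roughly "8x in concentration units" follows from exponentiating the standard deviation. The short sketch below works through that arithmetic for the three values quoted in the abstract; it assumes base-10 logarithms (consistent with the quoted 8x figure), and the function name is hypothetical.

```python
def fold_error(sd_log_units: float) -> float:
    """Convert a standard deviation in log10 units into a multiplicative (fold) error."""
    return 10.0 ** sd_log_units


# Values quoted in the abstract: range 0.70 to 1.14 log units, average 0.89.
for label, sd in (("lowest", 0.70), ("average", 0.89), ("highest", 1.14)):
    print(f"{label}: {sd:.2f} log units corresponds to about {fold_error(sd):.1f}-fold")
```

    The average of 0.89 log units gives about 7.8-fold, which the abstract rounds to 8x.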

  20. The open-access high-throughput crystallization facility at EMBL Hamburg.

    PubMed

    Mueller-Dieckmann, Jochen

    2006-12-01

    Here, the establishment of Europe's largest high-throughput crystallization facility with open access to the general user community is reported. The facility covers every step in the crystallization process from the preparation of crystallization cocktails for initial or customized screens to the setup of hanging-drop vapour-diffusion experiments and their automatic imaging. In its first year of operation, 43 internal and 40 external users submitted over 500 samples for a total of 2985 crystallization plates. An electronic booking system for registration, the selection of experimental parameters (e.g. drop size, sample-to-reservoir ratio) and the reservation of time slots was developed. External users can choose from more than 1000 initial crystallization conditions. By default, experiments are kept for six months and are imaged 15 times during this time period. A remote viewing system is available to inspect experiments via the internet. Over 100 stock solutions are available for users wishing to design customized screens. PMID:17139079

  1. 5-Oxatricyclo[5.1.0.0(1,3)]octan-4-one, containing an enantiomorph and a racemate and not two polymorphs, is another example of a composite crystal.

    PubMed

    Herbstein, Frank H

    2003-04-01

    This new example of a composite crystal adds to the small number previously reported among inorganic and molecular materials. Revision of the nomenclature used allows more fruitful comparison with the precedents. PMID:12657822

  2. A day of systems and synthetic biology for non-experts: reflections on day 1 of the EMBL/EMBO joint conference on Science and Society.

    PubMed

    Moore, Andrew

    2009-01-01

    From understanding ageing to the creation of artificial membrane-bounded 'organisms', systems biology and synthetic biology are seen as the latest revolutions in the life sciences. They certainly represent a major change of gear, but paradigm shifts? This is open to debate, to say the least. For scientists they open up exciting ways of studying living systems, of formulating the 'laws of life', and the relationship between the origin of life, evolution and artificial biological systems. However, the ethical and societal considerations are probably indistinguishable from those of human genetics and genetically modified organisms. There are some tangible developments just around the corner for society, and as ever, our ability to understand the consequences of, and manage, our own progress lags far behind our technological abilities. Furthermore our educational systems are doing a bad job of preparing the next generation of scientists and non-scientists. PMID:19153995

  3. US Imperialism, Transmodernism and Education: A Marxist Critique

    ERIC Educational Resources Information Center

    Cole, Mike

    2004-01-01

    The author begins by discussing David Geoffrey Smith's analysis of the enantiomorphism inherent in the rhetoric of New American Imperialism. He goes on to examine critically Smith's defence of Enrique Dussel's advocacy of transmodernism as a way of understanding this enantiomorphism and of moving beyond what are seen as the constraints of both…

  4. Modelling Symmetry Classes 233 and 432.

    ERIC Educational Resources Information Center

    Dutch, Steven I.

    1986-01-01

    Offers instructions and geometrical data for constructing solids of the enantiomorphous symmetry classes 233 and 432. Provides background information for each class and highlights symmetrical relationships and construction patterns. (ML)

  5. Manifestation of optical activity in different materials

    NASA Astrophysics Data System (ADS)

    Konstantinova, A. F.; Golovina, T. G.; Konstantinov, K. K.

    2014-07-01

    Various manifestations of optical activity (OA) in crystals and organic materials are considered. Examples of optically active enantiomorphic and nonenantiomorphic crystals of 18 symmetry classes are presented. The OA of enantiomorphic organic materials as components of living nature (amino acids, sugars, and proteins) is analyzed. Questions related to the origin of life on earth are considered. Examples of differences in the enantiomers of drugs are shown. The consequences of replacing conventional left-handed amino acids with additionally right-handed amino acids for living organisms are indicated.

  6. True and false chirality, CP violation, and the breakdown of microscopic reversibility in chiral molecular and elementary particle processes

    SciTech Connect

    Barron, L.D.

    1996-07-01

    The concept of chirality is extended to cover systems that exhibit enantiomorphism on account of motion. This is achieved by applying time reversal in addition to space inversion and leads to a more precise definition of a chiral system. Although spatial enantiomorphism is sufficient to guarantee chirality in a stationary system such as a finite helix, enantiomorphous systems are not necessarily chiral when motion is involved, which leads to the concept of true and false chirality associated with time-invariant and time-noninvariant enantiomorphism, respectively. Only a truly chiral influence can induce an enantiomeric excess in a reaction that has reached true thermodynamic equilibrium (i.e., when all possible interconversion pathways have equilibrated); however, false chirality can suffice in a reaction under kinetic control due to a breakdown of microscopic reversibility analogous to that observed in particle-antiparticle processes involving the neutral K-meson as a result of CP violation, with the apparently contradictory kinetic and thermodynamic aspects being reconciled by an appeal to unitarity. This reveals that CP violation is analogous to chemical catalysis since it affects the rates of certain particle-antiparticle interconversion pathways without affecting the initial and final particle energies and hence the equilibrium thermodynamics. Consideration of falsely chiral influences, including the 'ratchet effect' arising from the associated breakdown in microscopic reversibility, greatly enlarges the range of possible chiral advantage factors in prebiotic chemical processes if far from equilibrium. © 1996 American Institute of Physics.

  7. Statistical models and NMR analysis of polymer microstructure

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Statistical models can be used in conjunction with NMR spectroscopy to study polymer microstructure and polymerization mechanisms. Thus, Bernoullian, Markovian, and enantiomorphic-site models are well known. Many additional models have been formulated over the years for additional situations. Typica...
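
    Two of the models named in this record can be sketched in their standard textbook forms for triad-level tacticity. The record itself does not give these expressions, so the formulas and parameter names below are assumptions: P_m is the probability of a meso placement in the Bernoullian (chain-end) model, and sigma is the probability of selecting the preferred monomer face at a catalytic site in the enantiomorphic-site model.

```python
from typing import Dict


def bernoullian_triads(p_m: float) -> Dict[str, float]:
    """Triad fractions when every dyad is meso with the same probability p_m."""
    if not 0.0 <= p_m <= 1.0:
        raise ValueError("p_m must lie in [0, 1]")
    return {
        "mm": p_m * p_m,
        "mr": 2.0 * p_m * (1.0 - p_m),
        "rr": (1.0 - p_m) ** 2,
    }


def enantiomorphic_site_triads(sigma: float) -> Dict[str, float]:
    """Triad fractions under enantiomorphic-site control with selectivity sigma."""
    if not 0.0 <= sigma <= 1.0:
        raise ValueError("sigma must lie in [0, 1]")
    x = sigma * (1.0 - sigma)
    return {"mm": 1.0 - 3.0 * x, "mr": 2.0 * x, "rr": x}


bern = bernoullian_triads(0.8)
site = enantiomorphic_site_triads(0.9)
print("Bernoullian, P_m = 0.8:", bern)
print("Enantiomorphic site, sigma = 0.9:", site)
```

    In this form the enantiomorphic-site model always gives mr = 2 rr, whereas the Bernoullian model gives mr² = 4 mm rr; comparing measured NMR triad intensities against these relations is one common way to judge which model is the better description.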

  8. Differential effects of opiates on the incorporation of [14C] thiamine in the central nervous system of the rat.

    PubMed

    Misra, A L; Vadlamani, N L; Pontani, R B

    1977-03-15

    Opiate agonist (morphine), pure antagonist (naloxone), mixed agonist-antagonist (nalorphine) and analgesically inactive enantiomorph (dextrorphan) produced differential stereoselective effects on the incorporation of [14C] thiamine in the central nervous system of the rats. The possible role of thiamine in opiate effects and its implications are discussed. PMID:858372

  9. Carl Friedrich Naumann and the introduction of enantio terminology: a review and analysis on the 150th anniversary.

    PubMed

    Gal, Joseph

    2007-02-01

    Enantiomorphism and enantiomorphous were the first enantio-based terms, introduced 150 years ago, by Carl Friedrich Naumann, a German crystallographer, to refer to non-superposable mirror-image crystals. The terminology was not adopted by Pasteur, the discoverer of molecular chirality, and was not embraced at first in the stereochemical context, until it was accepted in 1877 by Van't Hoff in the German edition of his proposal for the tetrahedral asymmetric carbon atom. In the 1890s the use of enantio terms began to spread in the research literature, and many new derivatives of Naumann's original two terms were subsequently introduced. Problems in the usage of some of the terms are often found in the literature, e.g., enantiomorphism is sometimes confused with chirality; enantiomeric is often misused; the meaning of some of the many derived terms, e.g., enantiosymmetric, enantioposition, etc., is unclear. All in all, Naumann should be remembered as the creator of essential terminology in the realm of chirality. PMID:17096375

  10. TARA OCEANS: A Global Analysis of Oceanic Plankton Ecosystems (2013 DOE JGI Genomics of Energy and Environment 8th Annual User Meeting)

    SciTech Connect

    Karsenti, Eric

    2013-03-01

    Eric Karsenti of EMBL delivers the closing keynote on "TARA OCEANS: A Global Analysis of Oceanic Plankton Ecosystems" at the 8th Annual Genomics of Energy & Environment Meeting on March 28, 2013 in Walnut Creek, Calif.

  11. A Molecular Switch Based on Current-Driven Rotation of an Encapsulated Cluster within a Fullerene Cage

    SciTech Connect

    Huang, Tian; Zhao, Jin; Feng, Min; Popov, Alexey A.; Yang, Shangfeng; Dunsch, Lothar; Petek, Hrvoje

    2011-12-14

    By scanning tunneling microscopy imaging and electronic structure theory, we investigate a single-molecule switch based on tunneling electron-driven rotation of a triangular Sc3N cluster within an icosahedral C80 fullerene cage among three pairs of enantiomorphic configurations. Bias-dependent action spectra and modeling implicate the antisymmetric stretch vibration of Sc3N cluster as the gateway for energy transfer from the tunneling electrons into the cluster rotation. Hierarchical switching of conductivity among multiple stationary states while maintaining a constant molecular shape, offers an advantage for the integration of endohedral fullerene-based single-molecule switches into multiple logic state molecular devices.

  12. [Morphological diversity of Pandorina morum (Müll.) Bory (Volvocaceae) colonies].

    PubMed

    Voĭtekhovskiĭ, Iu L

    2001-01-01

    Morphological variability of polyhedral colonies of green algae (Volvocaceae) was studied using some elements of the combinatorial theory of polyhedra and the theory of Diophantine equations. These colonies are considered as results of self-organization according to the topological regularities of the dissection of a sphere by convex polygons. It was shown that in three-dimensional Euclidean space only three different forms are possible for each colony of Pandorina morum (Müll.) Bory. One of them has no plane of symmetry and thus has two enantiomorphous varieties. It is suggested that the frequency spectrum of forms can be used as a potential indicator of environmental pollution. PMID:11605552

  13. Expression and crystallization of a soluble form of Drosophila fasciclin III.

    PubMed

    Strong, R K; Vaughn, D E; Bjorkman, P J; Snow, P M

    1994-08-19

    A truncated form of Drosophila fasciclin III has been engineered by site-directed mutagenesis. Secreted fasciclin III is expressed at 35 to 40 mg/l in insect cells with baculovirus carrying the recombinant gene. Single crystals of purified soluble fasciclin III have been grown by vapor diffusion versus polyethylene glycol 8000/sodium citrate at low pH. The space group is P6(1)22 or its enantiomorph P6(5)22, with unit cell dimensions a = b = 140 Å, c = 260 Å. Cryo-preserved crystals diffract to reciprocal lattice spacings beyond 3.0 Å. PMID:8064861
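
    As a worked example of what the quoted cell dimensions imply, the sketch below computes the unit cell volume. It assumes the standard hexagonal setting for P6(1)22 and P6(5)22 (alpha = beta = 90°, gamma = 120°), which the abstract does not state explicitly, and it uses the fact that either enantiomorph has 12 general positions per cell. The function name is hypothetical.

```python
import math


def hexagonal_cell_volume(a: float, c: float) -> float:
    """Volume of a hexagonal unit cell: V = a^2 * c * sin(120 degrees)."""
    return a * a * c * math.sin(math.radians(120.0))


a_angstrom = 140.0  # a = b, from the abstract
c_angstrom = 260.0  # c, from the abstract

volume = hexagonal_cell_volume(a_angstrom, c_angstrom)
# P6(1)22 and P6(5)22 each have 12 symmetry-equivalent general positions.
asymmetric_unit_volume = volume / 12.0

print(f"Unit cell volume: {volume:.3e} cubic angstroms")
print(f"Volume per asymmetric unit: {asymmetric_unit_volume:.3e} cubic angstroms")
```

    The result is about 4.4 million cubic angstroms for the cell and about 3.7 × 10^5 cubic angstroms per asymmetric unit. The two enantiomorphic space groups give identical volumes, which is one reason they cannot be told apart from cell dimensions alone.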

  14. Analysis of genetic stability at SSR loci during somatic embryogenesis in maritime pine (Pinus pinaster).

    PubMed

    Marum, Liliana; Rocheta, Margarida; Maroco, João; Oliveira, M Margarida; Miguel, Célia

    2009-04-01

    Somatic embryogenesis (SE) is a propagation tool of particular interest for accelerating the deployment of new high-performance planting stock in multivarietal forestry. However, genetic conformity in in vitro propagated plants should be assessed as early as possible, especially in long-living trees such as conifers. The main objective of this work was to study such conformity based on genetic stability at simple sequence repeat (SSR) loci during somatic embryogenesis in maritime pine (Pinus pinaster Ait.). Embryogenic cell lines (ECLs) subjected to tissue proliferation during 6, 14 or 22 months, as well as emblings regenerated from several ECLs, were analyzed. Genetic variation at seven SSR loci was detected in ECLs under proliferation conditions for all time points, and in 5 out of 52 emblings recovered from somatic embryos. Three of these five emblings showed an abnormal phenotype consisting mainly of plagiotropism and loss of apical dominance. Despite the variation found in somatic embryogenesis-derived plant material, no correlation was established between genetic stability at the analyzed loci and abnormal embling phenotype, present in 64% of the emblings. The use of microsatellites in this work was efficient for monitoring mutation events during the somatic embryogenesis in P. pinaster. These molecular markers should be useful in the implementation of new breeding and deployment strategies for improved trees using SE. PMID:19153739

  15. Bioinformatics Goes to School—New Avenues for Teaching Contemporary Biology

    PubMed Central

    Wood, Louisa; Gebhardt, Philipp

    2013-01-01

    Since 2010, the European Molecular Biology Laboratory's (EMBL) Heidelberg laboratory and the European Bioinformatics Institute (EMBL-EBI) have jointly run bioinformatics training courses developed specifically for secondary school science teachers within Europe and EMBL member states. These courses focus on introducing bioinformatics, databases, and data-intensive biology, allowing participants to explore resources and providing classroom-ready materials to support them in sharing this new knowledge with their students. In this article, we chart our progress made in creating and running three bioinformatics training courses, including how the course resources are received by participants and how these, and bioinformatics in general, are subsequently used in the classroom. We assess the strengths and challenges of our approach, and share what we have learned through our interactions with European science teachers. PMID:23785266

  16. Patterns and development of floral asymmetry in Senna (Leguminosae, Cassiinae).

    PubMed

    Marazzi, Brigitte; Endress, Peter K

    2008-01-01

    The buzz-pollinated genus Senna (Leguminosae) is outstanding for including species with monosymmetric flowers and species with diverse asymmetric, enantiomorphic (enantiostylous) flowers. To recognize patterns of homology, we dissected the floral symmetry character complex and explored corolla morphology in 60 Senna species and studied floral development of four enantiomorphic species. The asymmetry morph of a flower is correlated with the direction of spiral calyx aestivation. We recognized five patterns of floral asymmetry, resulting from different combinations of six structural elements: deflection of the carpel, deflection of the median abaxial stamen, deflection or modification in size of one lateral abaxial stamen, and modification in shape and size of one or both lower petals. Prominent corolla asymmetry begins in the early-stage bud (unequal development of lower petals). Androecium asymmetry begins either in the midstage bud (unequal development of thecae in median abaxial stamen; twisting of androecium) or at anthesis (stamen deflection). Gynoecium asymmetry begins in early bud (primordium off the median plane, ventral slit laterally oriented) or midstage to late bud (carpel deflection). In enantiostylous flowers, pronouncedly concave and robust petals of both monosymmetric and asymmetric corollas likely function to ricochet and direct pollen flow during buzz pollination. Occurrence of particular combinations of structural elements of floral symmetry in the subclades is shown. PMID:21632312

  17. Evolution of whole-body enantiomorphy in the tree snail genus Amphidromus

    PubMed Central

    SUTCHARIT, C; ASAMI, T; PANHA, S

    2007-01-01

    Diverse animals exhibit left–right asymmetry in development. However, no example of dimorphism for the left–right polarity of development (whole-body enantiomorphy) is known to persist within natural populations. In snails, whole-body enantiomorphs have repeatedly evolved as separate species. Within populations, however, snails are not expected to exhibit enantiomorphy, because of selection against the less common morph resulting from mating disadvantage. Here we present a unique example of evolutionarily stable whole-body enantiomorphy in snails. Our molecular phylogeny of South-east Asian tree snails in the genus Amphidromus indicates that enantiomorphy has likely persisted as the ancestral state over a million generations. Enantiomorphs have continuously coexisted in every population surveyed spanning a period of 10 years. Our results indicate that whole-body enantiomorphy is maintained within populations opposing the rule of directional asymmetry in animals. This study implicates the need for explicit approaches to disclosure of a maintenance mechanism and conservation of the genus. PMID:17305832

  18. The SWISS-PROT protein sequence data bank: current status.

    PubMed Central

    Bairoch, A; Boeckmann, B

    1994-01-01

    SWISS-PROT is an annotated protein sequence database established in 1986 and maintained collaboratively, since 1988, by the Department of Medical Biochemistry of the University of Geneva and the EMBL Data Library. The SWISS-PROT protein sequence data bank consists of sequence entries. Sequence entries are composed of different line types, each with its own format. For standardization purposes the format of SWISS-PROT follows as closely as possible that of the EMBL Nucleotide Sequence Database. A sample SWISS-PROT entry is shown in Figure 1. PMID:7937062

  19. 75 FR 11197 - Notice Pursuant to the National Cooperative Research and Production Act of 1993-Pistoia Alliance...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-03-10

    ...) of the Act on July 15, 2009 (74 FR 34364). The last notification was filed with the Department on... December 9, 2009 (74 FR 65157). Patricia A. Brink, Deputy Director of Operations, Antitrust Division... KINGDOM; Thomson Reuters HealthCare and Science, Philadelphia, PA; and EMBL/EBI, Hinxton,...

  20. Whole-genome sequence of Clostridium lituseburense L74, isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus

    PubMed Central

    Lee, Yookyung; Lim, Sooyeon; Rhee, Moon-Soo; Chang, Dong-Ho; Kim, Byoung-Chan

    2016-01-01

    Clostridium lituseburense L74 was isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus collected in Yeong-dong, Chuncheongbuk-do, South Korea and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession NZ_LITJ00000000. PMID:26981432

  1. From metaphor to practices: The introduction of "information engineers" into the first DNA sequence database.

    PubMed

    García-Sancho, Miguel

    2011-01-01

    This paper explores the introduction of professional systems engineers and information management practices into the first centralized DNA sequence database, developed at the European Molecular Biology Laboratory (EMBL) during the 1980s. In so doing, it complements the literature on the emergence of an information discourse after World War II and its subsequent influence in biological research. By analyzing the careers of the database creators and the computer algorithms they designed, I show that, from the mid-1960s onwards, information in biology gradually shifted from a pervasive metaphor to be embodied in practices and professionals such as those incorporated at the EMBL. I then investigate the reception of these database professionals by the EMBL biological staff, which evolved from initial disregard to necessary collaboration as the relationship between DNA, genes, and proteins turned out to be more complex than expected. The trajectories of the database professionals at the EMBL suggest that the initial subject matter of the historiography of genomics should be the long-standing practices that emerged after World War II and to a large extent originated outside biomedicine and academia. Only after addressing these practices may historians turn to their further disciplinary assemblage in fields such as bioinformatics or biotechnology. PMID:21789956

  2. Whole-genome sequence of Clostridium lituseburense L74, isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus.

    PubMed

    Lee, Yookyung; Lim, Sooyeon; Rhee, Moon-Soo; Chang, Dong-Ho; Kim, Byoung-Chan

    2016-03-01

    Clostridium lituseburense L74 was isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus collected in Yeong-dong, Chuncheongbuk-do, South Korea and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession NZ_LITJ00000000. PMID:26981432

  3. Whole genome analysis of Klebsiella pneumoniae T2-1-1 from human oral cavity.

    PubMed

    Chan, Kok-Gan; Yin, Wai-Fong; Chan, Xin-Yue

    2016-03-01

    Klebsiella pneumoniae T2-1-1 was isolated from the human tongue debris and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession JAQL00000000. PMID:26981378

  4. High Genetic and Epigenetic Stability in Coffea arabica Plants Derived from Embryogenic Suspensions and Secondary Embryogenesis as Revealed by AFLP, MSAP and the Phenotypic Variation Rate

    PubMed Central

    Bobadilla Landey, Roberto; Cenci, Alberto; Georget, Frédéric; Bertrand, Benoît; Camayo, Gloria; Dechamp, Eveline; Herrera, Juan Carlos; Santoni, Sylvain; Lashermes, Philippe; Simpson, June; Etienne, Hervé

    2013-01-01

    Embryogenic suspensions that involve extensive cell division are risky with respect to genome and epigenome instability. Elevated frequencies of somaclonal variation in embryogenic suspension-derived plants were reported in many species, including coffee. This problem could be overcome by using culture conditions that allow moderate cell proliferation. In view of true-to-type large-scale propagation of C. arabica hybrids, suspension protocols based on low 2,4-D concentrations and short proliferation periods were developed. As mechanisms leading to somaclonal variation are often complex, the phenotypic, genetic and epigenetic changes were jointly assessed so as to accurately evaluate the conformity of suspension-derived plants. The effects of embryogenic suspensions and secondary embryogenesis, used as proliferation systems, on the genetic conformity of somatic embryogenesis-derived plants (emblings) were assessed in two hybrids. When applied over a 6 month period, both systems ensured very low somaclonal variation rates, as observed through massive phenotypic observations in field plots (0.74% from 200 000 plants). Molecular AFLP and MSAP analyses performed on 145 three-year-old emblings showed that polymorphism between mother plants and emblings was extremely low, i.e. ranges of 0–0.003% and 0.07–0.18% respectively, with no significant difference between the proliferation systems for the two hybrids. No embling was found to cumulate more than three methylation polymorphisms. No relation was established between the variant phenotype (27 variants studied) and a particular MSAP pattern. Chromosome counting showed that 7 of the 11 variant emblings analyzed were characterized by the loss of 1–3 chromosomes. This work showed that both embryogenic suspensions and secondary embryogenesis are reliable for true-to-type propagation of elite material. Molecular analyses revealed that genetic and epigenetic alterations are particularly limited during coffee somatic

  5. A Multi-State Single-Molecule Switch Actuated by Rotation of an Encapsulated Cluster within a Fullerene Cage

    SciTech Connect

    Huang, Tian; Zhao, Jin; Feng, Min; Popov, Alexey A.; Yang, Shangfeng; Dunsch, Lothar; Petek, Hrvoje

    2012-11-12

    We demonstrate a single-molecule switch based on tunneling electron-driven rotation of a triangular Sc₃N cluster within an icosahedral C₈₀ fullerene cage among three pairs of enantiomorphic configurations. Scanning tunneling microscopy imaging of switching within single molecules and electronic structure theory identify the conformational isomers and their isomerization pathways. Bias-dependent action spectra and modeling identify the antisymmetric stretch vibration of the Sc₃N cluster to be the gateway for energy transfer from the tunneling electrons to the cluster rotation. Hierarchical switching of conductivity through the internal cluster motion among multiple stationary states, while maintaining a constant shape, is advantageous for the integration of endohedral fullerene-based single-molecule memory and logic devices into parallel molecular computing arc.

  6. Experimental phasing: best practice and pitfalls

    PubMed Central

    McCoy, Airlie J.; Read, Randy J.

    2010-01-01

    Developments in protein crystal structure determination by experimental phasing are reviewed, emphasizing the theoretical continuum between experimental phasing, density modification, model building and refinement. Traditional notions of the composition of the substructure and the best coefficients for map generation are discussed. Pitfalls such as determining the enantiomorph, identifying centrosymmetry (or pseudo-symmetry) in the substructure and crystal twinning are discussed in detail. An appendix introduces combined real–imaginary log-likelihood gradient map coefficients for SAD phasing and their use for substructure completion as implemented in the software Phaser. Supplementary material includes animated probabilistic Harker diagrams showing how maximum-likelihood-based phasing methods can be used to refine parameters in the case of SIR and MIR; it is hoped that these will be useful for those teaching best practice in experimental phasing methods. PMID:20382999

  7. Gyration and Permittivity of Ethylenediammonium Sulfate Crystals.

    PubMed

    Nichols, Shane; Martin, Alexander; Choi, Joshua; Kahr, Bart

    2016-06-01

    Ethylenediammonium sulfate (EDS) crystals were grown from aqueous solution and cleaved into thin (100-500 micron) plates. The 422 point group of EDS was confirmed by X-ray diffraction. The constitutive relations of EDS crystals were determined through generalized ellipsometry with an instrument that uses four photoelastic modulators (4PEM). The optical rotation at 500 nm, for example, was +22.9°/mm along the optic axis and −12.1°/mm perpendicular to the optic axis for the P4₁2₁2 crystals. Enantiomorphous twins frequently form across the (001) plane. Mirrored halves must be separated by cleavage in advance of optical measurements. Chirality 28:460-465, 2016. © 2016 Wiley Periodicals, Inc. PMID:27126891

  8. Scattering of an anisotropic sphere by an arbitrarily incident Hermite-Gaussian beam

    NASA Astrophysics Data System (ADS)

    Qu, Tan; Wu, Zhensen; Shang, Qingchao; Li, Zhengjun; Bai, Lu; Li, Haiying

    2016-02-01

    An analytic theory for the scattering of an off-axis Hermite-Gaussian (HG) beam obliquely incident on an anisotropic sphere is developed. Based on the complex-source-point method and coordinate rotation theory, a general expansion expression for an arbitrarily incident HG beam in terms of Spherical Vector Wave Functions (SVWFs) is derived, and its convergence is numerically discussed. By introducing the Fourier transformation, the internal field expressions of the anisotropic sphere are represented. With the continuous tangential boundary conditions applied, the unknown scattering coefficients are solved. The theory and code are verified from the comparisons between the degenerated cases using our theory and those in the references. Two eigenmodes inside the uniaxial anisotropic sphere are characterized. The influences of beam mode, oblique incident angles, permittivity and permeability tensors, and sphere radius on the scattered field are analyzed numerically. The scattering intensity distributions on uniaxial anisotropic sphere in xoz and yoz plane are enantiomorphous for on-axis oblique illumination.

  9. 1',5'-Anhydro-L-ribo-hexitol Adenine Nucleic Acids (α-L-HNA-A): Synthesis and Chiral Selection Properties in the Mirror Image World.

    PubMed

    D'Alonzo, Daniele; Froeyen, Mathy; Schepers, Guy; Di Fabio, Giovanni; Van Aerschot, Arthur; Herdewijn, Piet; Palumbo, Giovanni; Guaragna, Annalisa

    2015-05-15

    The synthesis and a preliminary investigation of the base pairing properties of (6' → 4')-linked 1',5'-anhydro-L-ribo-hexitol nucleic acids (α-L-HNA) have herein been reported through the study of a model oligoadenylate system in the mirror image world. Despite its considerable preorganization due to the rigidity of the "all equatorial" pyranyl sugar backbone, α-L-HNA represents a versatile informational biopolymer, in view of its capability to cross-communicate with natural and unnatural complements in both enantiomeric forms. This seems the result of an inherent flexibility of the oligonucleotide system, as witnessed by the singular formation of iso- and heterochiral associations composed of regular, enantiomorphic helical structures. The peculiar properties of α-L-HNA (and most generally of the α-HNA system) provide new elements in our understanding of the structural prerequisites ruling the stereoselectivity of the hybridization processes of nucleic acids. PMID:25853790

  10. Gram-scale synthesis of two-dimensional polymer crystals and their structure analysis by X-ray diffraction.

    PubMed

    Kory, Max J; Wörle, Michael; Weber, Thomas; Payamyar, Payam; van de Poll, Stan W; Dshemuchadse, Julia; Trapp, Nils; Schlüter, A Dieter

    2014-09-01

    The rise of graphene, a natural two-dimensional polymer (2DP) with topologically planar repeat units, has challenged synthetic chemistry, and has highlighted that accessing equivalent covalently bonded sheet-like macromolecules has, until recently, not been achieved. Here we show that non-centrosymmetric, enantiomorphic single crystals of a simple-to-make monomer can be photochemically converted into chiral 2DP crystals and cleanly reversed back to the monomer. X-ray diffraction established unequivocal structural proof for this synthetic 2DP, which has an all-carbon scaffold and can be synthesized on the gram scale. The monomer crystals are highly robust, can be easily grown to sizes greater than 1 mm and the resulting 2DP crystals exfoliated into nanometre-thin sheets. This unique combination of features suggests that these 2DPs could find use in membranes and nonlinear optics. PMID:25143212

  11. meso-Tetrahydropyranylperoxides: molecular structures in solution, in the crystal, and by DFT calculations and their isomerization to the racemate.

    PubMed

    Balaban, Teodor Silviu; Eichhöfer, Andreas; Ghiviriga, Ion; Hugo, Holger; Wenzel, Wolfgang

    2003-06-27

    The crystalline peroxide 3a is the main product (out of 10 theoretically possible) from the aerial peroxidation of all-cis-2,4,6-trimethyltetrahydropyran (2a). It has a similar structure both in solution and in the crystal as shown by nuclear Overhauser effects and X-ray analysis, respectively. Theoretical calculations at a density functional theory level (B3LYP/6-31G) provide insight into the stabilities of the different stereoisomers of this peroxide, accounting for the facile, acid-catalyzed isomerization from the meso form to the racemate. Peroxide 3b, which is the 2-tert-butyl analogue of 3a, out of 22 theoretically possible isomers, crystallizes in a similar meso form. As a result of crystal packing effects and the intrinsically (axial) chiral peroxy "chromophore" that deviates slightly from the antiperiplanar conformation, both enantiomorphic forms of 3b are encountered in the lattice. PMID:12816495

  12. Crystalline Isotactic Polar Polypropylene from the Palladium-Catalyzed Copolymerization of Propylene and Polar Monomers.

    PubMed

    Ota, Yusuke; Ito, Shingo; Kobayashi, Minoru; Kitade, Shinichi; Sakata, Kazuya; Tayano, Takao; Nozaki, Kyoko

    2016-06-20

    Moderately isospecific homopolymerization of propylene and the copolymerization of propylene and polar monomers have been achieved with palladium complexes bearing a phosphine-sulfonate ligand. Optimization of substituents on the phosphorus atom of the ligand revealed that the presence of bulky alkyl groups (e.g. menthyl) is crucial for the generation of high-molecular-weight polypropylenes (Mw ≈ 10⁴), and the substituent at the ortho-position relative to the sulfonate group influences the molecular weight and isotactic regularity of the obtained polypropylenes. Statistical analysis suggested that the introduction of substituents at the ortho-position relative to the sulfonate group favors enantiomorphic site control over chain end control in the chain propagation step. The triad isotacticity could be increased to mm=0.55-0.59, with formation of crystalline polar polypropylenes, as supported by the presence of melting points and sharp peaks in the corresponding X-ray diffraction patterns. PMID:27161896

  13. From hand to eye: the role of literacy, familiarity, graspability, and vision-for-action on enantiomorphy.

    PubMed

    Fernandes, Tânia; Kolinsky, Régine

    2013-01-01

    Literacy in a script with mirrored symbols boosts the ability to discriminate mirror images, i.e., enantiomorphy. In the present study we evaluated the impact of four factors on enantiomorphic abilities: (i) the degree of literacy of the participants; (ii) the familiarity of the material; (iii) the strength of the association between familiar objects and manipulation, i.e., graspability; and (iv) the involvement of vision-for-action in the task. Three groups of adults - unschooled illiterates, unschooled ex-illiterates, and schooled literates - participated in two experiments. In Experiment 1, participants performed a vision-for-perception task, i.e., an orientation-based same-different comparison task, on pictures of familiar objects and geometric shapes. Graspability of familiar objects and unfamiliarity of the stimuli facilitated orientation discrimination, but did not help illiterate participants to overcome their difficulties with enantiomorphy. Compared to a baseline, illiterate adults had the strongest performance drop for mirror images, whereas for plane rotations the performance drop was similar across groups. In Experiment 2, participants performed a vision-for-action task; they were asked to decide which hand they would use to grasp a familiar object according to its current position (e.g., indicating left-hand usage to grasp a cup with the handle on the left side, and right-hand usage for its mirror image). Illiterates were as skillful as literates to perform this task. The present study thus provided three important findings. First, once triggered by literacy, enantiomorphy generalizes to any visual object category, as part of vision-for-perception, i.e., in visual recognition and identification processes. Second, the impact of literacy is much stronger on enantiomorphy than on the processing of other orientation contrasts. Third, in vision-for-action tasks, illiterates are as sensitive as literates to enantiomorphic-related information. PMID:23232335

  14. Crystallization of a 2:2 complex of granulocyte-colony stimulating factor (GCSF) with the ligand-binding region of the GCSF receptor

    SciTech Connect

    Honjo, Eijiro; Tamada, Taro; Maeda, Yoshitake; Koshiba, Takumi; Matsukura, Yasuko; Okamoto, Tomoyuki; Ishibashi, Matsujiro; Tokunaga, Masao; Kuroki, Ryota

    2005-08-01

    A 2:2 complex of highly purified GCSF receptor (Ig-CRH) with GCSF was crystallized. The crystal diffracted to 2.8 Å resolution with sufficient quality for further structure determination. The granulocyte-colony stimulating factor (GCSF) receptor receives signals for regulating the maturation, proliferation and differentiation of the precursor cells of neutrophilic granulocytes. The signalling complex composed of two GCSFs (GCSF, 19 kDa) and two GCSF receptors (GCSFR, 34 kDa) consisting of an Ig-like domain and a cytokine-receptor homologous (CRH) domain was crystallized. A crystal of the complex was grown in 1.0 M sodium formate and 0.1 M sodium acetate pH 4.6 and belongs to space group P4₁2₁2 (or its enantiomorph P4₃2₁2), with unit-cell parameters a = b = 110.1, c = 331.8 Å. Unfortunately, this crystal form did not diffract beyond 5 Å resolution. Since the heterogeneity of GCSF receptor appeared to prevent the growth of good-quality crystals, the GCSF receptor was fractionated by anion-exchange chromatography. Crystals of the GCSF–fractionated GCSF receptor complex were grown as a new crystal form in 0.2 M ammonium phosphate. This new crystal form diffracted to beyond 3.0 Å resolution and belonged to space group P3₁21 (or its enantiomorph P3₂21), with unit-cell parameters a = b = 134.8, c = 105.7 Å.

  15. Stereochemical vocabulary for structures that are chiral but not asymmetric: History, analysis, and proposal for a rational terminology.

    PubMed

    Gal, Joseph

    2011-09-01

    Asymmetric objects are necessarily chiral, but a structure may be chiral and not asymmetric if it possesses one or more proper rotation axes. Chiral but not asymmetric molecules are important in chemistry and its applications, but no suitable term exists for the designation of such structures, and their terminology in the literature is confused and chaotic. Dissymmetric has been redefined by some authors as "chiral but not asymmetric," in conflict both with Pasteur's definition of the term as "not superposable on its mirror image" (without other restrictions, i.e., chiral) and the understanding of the term in stereochemistry. Moreover, dissymmetric and asymmetric are frequently confused because of their similar forms. Furthermore, dissymmetric is widely used in many other definitions in chemistry, physics, and other disciplines. Thus, dissymmetric is unsuitable in the new definition of "chiral but not asymmetric," and a new term is needed. The adjective "symmanumorphous" is therefore proposed for "chiral but not asymmetric". "Sym" (from symmetry) indicates the presence of some symmetry in the structure, and "manu" (from "manus," Latin for hand, e.g., manual, manuscript) refers to its handedness. "Morphous," from the Greek "morphē," that is, form, is widely used, for example, anthropomorphous, enantiomorphous, etc. Symmanumorphous is convenient and euphonious and at 15 characters (same as enantiomorphous) is not unduly long. The nouns "a symmanumorph" (a structure that is chiral but not asymmetric) and "symmanumorphism" (the phenomenon of chirality without asymmetry) are also proposed. The new terminology is adaptable in other languages and would contribute to creating order out of linguistic chaos. PMID:21751256

  16. Crystallization and preliminary crystallographic analysis of a family 43 β-d-xylosidase from Geobacillus stearothermophilus T-6

    SciTech Connect

    Brüx, Christian; Niefind, Karsten; Ben-David, Alon; Leon, Maya; Shoham, Gil; Shoham, Yuval; Schomburg, Dietmar

    2005-12-01

    The crystallization and preliminary X-ray analysis of a β-d-xylosidase from G. stearothermophilus T-6, a family 43 glycoside hydrolase, is described. Native and catalytic inactive mutants of the enzymes were crystallized in two different space groups, orthorhombic P2₁2₁2 and tetragonal P4₁2₁2 (or the enantiomorphic space group P4₃2₁2), using a sensitive cryoprotocol. The latter crystal form diffracted X-rays to a resolution of 2.2 Å. β-d-Xylosidases (EC 3.2.1.37) are hemicellulases that cleave single xylose units from the nonreducing end of xylooligomers. In this study, the crystallization and preliminary X-ray analysis of a β-d-xylosidase from Geobacillus stearothermophilus T-6 (XynB3), a family 43 glycoside hydrolase, is described. XynB3 is a 535-amino-acid protein with a calculated molecular weight of 61 891 Da. Purified recombinant native and catalytic inactive mutant proteins were crystallized and cocrystallized with xylobiose in two different space groups, P2₁2₁2 (unit-cell parameters a = 98.32, b = 99.36, c = 258.64 Å) and P4₁2₁2 (or the enantiomorphic space group P4₃2₁2; unit-cell parameters a = b = 140.15, c = 233.11 Å), depending on the detergent. Transferring crystals to cryoconditions required a very careful protocol. Orthorhombic crystals diffract to 2.5 Å and tetragonal crystals to 2.2 Å.

  17. The European Bioinformatics Institute's data resources 2014.

    PubMed

    Brooksbank, Catherine; Bergman, Mary Todd; Apweiler, Rolf; Birney, Ewan; Thornton, Janet

    2014-01-01

    Molecular Biology has been at the heart of the 'big data' revolution from its very beginning, and the need for access to biological data is a common thread running from the 1965 publication of Dayhoff's 'Atlas of Protein Sequence and Structure' through the Human Genome Project in the late 1990s and early 2000s to today's population-scale sequencing initiatives. The European Bioinformatics Institute (EMBL-EBI; http://www.ebi.ac.uk) is one of three organizations worldwide that provides free access to comprehensive, integrated molecular data sets. Here, we summarize the principles underpinning the development of these public resources and provide an overview of EMBL-EBI's database collection to complement the reviews of individual databases provided elsewhere in this issue. PMID:24271396

  18. The European Bioinformatics Institute in 2016: Data growth and integration

    PubMed Central

    Cook, Charles E.; Bergman, Mary Todd; Finn, Robert D.; Cochrane, Guy; Birney, Ewan; Apweiler, Rolf

    2016-01-01

    New technologies are revolutionising biological research and its applications by making it easier and cheaper to generate ever-greater volumes and types of data. In response, the services and infrastructure of the European Bioinformatics Institute (EMBL-EBI, www.ebi.ac.uk) are continually expanding: total disk capacity increases significantly every year to keep pace with demand (75 petabytes as of December 2015), and interoperability between resources remains a strategic priority. Since 2014 we have launched two new resources: the European Variation Archive for genetic variation data and EMPIAR for two-dimensional electron microscopy data, as well as a Resource Description Framework platform. We also launched the Embassy Cloud service, which allows users to run large analyses in a virtual environment next to EMBL-EBI's vast public data resources. PMID:26673705

  19. The European Bioinformatics Institute in 2016: Data growth and integration.

    PubMed

    Cook, Charles E; Bergman, Mary Todd; Finn, Robert D; Cochrane, Guy; Birney, Ewan; Apweiler, Rolf

    2016-01-01

    New technologies are revolutionising biological research and its applications by making it easier and cheaper to generate ever-greater volumes and types of data. In response, the services and infrastructure of the European Bioinformatics Institute (EMBL-EBI, www.ebi.ac.uk) are continually expanding: total disk capacity increases significantly every year to keep pace with demand (75 petabytes as of December 2015), and interoperability between resources remains a strategic priority. Since 2014 we have launched two new resources: the European Variation Archive for genetic variation data and EMPIAR for two-dimensional electron microscopy data, as well as a Resource Description Framework platform. We also launched the Embassy Cloud service, which allows users to run large analyses in a virtual environment next to EMBL-EBI's vast public data resources. PMID:26673705

  20. The European Bioinformatics Institute’s data resources 2014

    PubMed Central

    Brooksbank, Catherine; Bergman, Mary Todd; Apweiler, Rolf; Birney, Ewan; Thornton, Janet

    2014-01-01

    Molecular Biology has been at the heart of the ‘big data’ revolution from its very beginning, and the need for access to biological data is a common thread running from the 1965 publication of Dayhoff’s ‘Atlas of Protein Sequence and Structure’ through the Human Genome Project in the late 1990s and early 2000s to today’s population-scale sequencing initiatives. The European Bioinformatics Institute (EMBL-EBI; http://www.ebi.ac.uk) is one of three organizations worldwide that provides free access to comprehensive, integrated molecular data sets. Here, we summarize the principles underpinning the development of these public resources and provide an overview of EMBL-EBI’s database collection to complement the reviews of individual databases provided elsewhere in this issue. PMID:24271396

  1. MOCAT2: a metagenomic assembly, annotation and profiling framework

    PubMed Central

    Kultima, Jens Roat; Coelho, Luis Pedro; Forslund, Kristoffer; Huerta-Cepas, Jaime; Li, Simone S.; Driessen, Marja; Voigt, Anita Yvonne; Zeller, Georg; Sunagawa, Shinichi; Bork, Peer

    2016-01-01

    Summary: MOCAT2 is a software pipeline for metagenomic sequence assembly and gene prediction with novel features for taxonomic and functional abundance profiling. The automated generation and efficient annotation of non-redundant reference catalogs by propagating pre-computed assignments from 18 databases covering various functional categories allows for fast and comprehensive functional characterization of metagenomes. Availability and Implementation: MOCAT2 is implemented in Perl 5 and Python 2.7, designed for 64-bit UNIX systems and offers support for high-performance computer usage via LSF, PBS or SGE queuing systems; source code is freely available under the GPL3 license at http://mocat.embl.de. Contact: bork@embl.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153620

  2. A large-scale crop protection bioassay data set.

    PubMed

    Gaulton, Anna; Kale, Namrata; van Westen, Gerard J P; Bellis, Louisa J; Bento, A Patrícia; Davies, Mark; Hersey, Anne; Papadatos, George; Forster, Mark; Wege, Philip; Overington, John P

    2015-01-01

    ChEMBL is a large-scale drug discovery database containing bioactivity information primarily extracted from scientific literature. Due to the medicinal chemistry focus of the journals from which data are extracted, the data are currently of most direct value in the field of human health research. However, many of the scientific use-cases for the current data set are equally applicable in other fields, such as crop protection research: for example, identification of chemical scaffolds active against a particular target or endpoint, the de-convolution of the potential targets of a phenotypic assay, or the potential targets/pathways for safety liabilities. In order to broaden the applicability of the ChEMBL database and allow more widespread use in crop protection research, an extensive data set of bioactivity data of insecticidal, fungicidal and herbicidal compounds and assays was collated and added to the database. PMID:26175909

  3. Computational Method for the Systematic Identification of Analog Series and Key Compounds Representing Series and Their Biological Activity Profiles.

    PubMed

    Stumpfe, Dagmar; Dimova, Dilyana; Bajorath, Jürgen

    2016-08-25

    A computational methodology is introduced for detecting all unique series of analogs in large compound data sets, regardless of chemical relationships between analogs. No prior knowledge of core structures or R-groups is required, which are automatically determined. The approach is based upon the generation of retrosynthetic matched molecular pairs and analog networks from which distinct series are isolated. The methodology was applied to systematically extract more than 17 000 distinct series from the ChEMBL database. For comparison, analog series were also isolated from screening compounds and drugs. Known biological activities were mapped to series from ChEMBL, and in more than 13 000 of these series, key compounds were identified that represented substitution sites of all analogs within a series and its complete activity profile. The analog series, key compounds, and activity profiles are made freely available as a resource for medicinal chemistry applications. PMID:27501131

  4. Compilation of DNA sequences of Escherichia coli (update 1991)

    PubMed Central

    Kröger, Manfred; Wahl, Ralf; Rice, Peter

    1991-01-01

    We have compiled the DNA sequence data for E.coli available from the GENBANK and EMBL data libraries and over a period of several years independently from the literature. This is the third listing replacing and increasing the former listing roughly by one fifth. However, in order to save space this printed version contains DNA sequence information only. The complete compilation is now available in machine readable form from the EMBL data library (ECD release 6). After deletion of all detected overlaps a total of 1 492 282 individual bp is found to be determined till the beginning of 1991. This corresponds to a total of 31.62% of the entire E.coli chromosome consisting of about 4,720 kbp. This number may actually be higher by some extra 2.5% derived from lysogenic bacteriophage lambda and various DNA sequences already received for statistical purposes only. PMID:2041799

  5. Compilation of DNA sequences of Escherichia coli (update 1990)

    PubMed Central

    Kröger, Manfred; Wahl, Ralf; Rice, Peter

    1990-01-01

    We have compiled the DNA sequence data for E.coli available from the GENBANK and EMBL data libraries and over a period of several years independently from the literature. This is the second listing replacing and increasing the former listing roughly by one third. After deletion of all detected overlaps a total of 1 248 696 individual bp is found to be determined till the beginning of 1990. This corresponds to a total of 26.46% of the entire E.coli chromosome consisting of about 4,720 kbp. This number may actually be higher by some extra 2% derived from the sequence of lysogenic bacteriophage lambda and various insertion sequences. This compilation is now available in machine readable form from the EMBL data library. PMID:2185457

  6. Whole genome sequence of Oscheius sp. TEL-2014 entomopathogenic nematodes isolated from South Africa

    PubMed Central

    Lephoto, Tiisetso E.; Mpangase, Phelelani T.; Aron, Shaun; Gray, Vincent M.

    2016-01-01

    We present the annotation of the draft genome sequence of Oscheius sp. TEL-2014 (Genbank accession number KM492926). This entomopathogenic nematode was isolated from grassland in Suikerbosrand Nature Reserve near Johannesburg in South Africa. Oscheius sp. Strain TEL has a genome size of 110,599,558 bp and a GC content of 42.24%. The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession number LNBV00000000. PMID:27054091

  7. Draft genome sequence of extremely acidophilic bacterium Acidithiobacillus ferrooxidans DLC-5 isolated from acid mine drainage in Northeast China.

    PubMed

    Chen, Peng; Yan, Lei; Wu, Zhengrong; Xu, Ruixiang; Li, Suyue; Wang, Ningbo; Liang, Ning; Li, Hongyu

    2015-12-01

    Acidithiobacillus ferrooxidans type strain DLC-5 was isolated from Wudalianchi in Heihe of Heilongjiang Province, China. Here, we present the draft genome of strain DLC-5, which contains 4,232,149 bp in 2745 contigs with 57.628% GC content and includes 32,719 protein-coding genes and 64 tRNA-encoding genes. The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession no. JNNH00000000.1. PMID:26697393

  8. Draft genome sequence of extremely acidophilic bacterium Acidithiobacillus ferrooxidans DLC-5 isolated from acid mine drainage in Northeast China

    PubMed Central

    Chen, Peng; Yan, Lei; Wu, Zhengrong; Xu, Ruixiang; Li, Suyue; Wang, Ningbo; Liang, Ning; Li, Hongyu

    2015-01-01

    Acidithiobacillus ferrooxidans type strain DLC-5 was isolated from Wudalianchi in Heihe of Heilongjiang Province, China. Here, we present the draft genome of strain DLC-5, which contains 4,232,149 bp in 2745 contigs with 57.628% GC content and includes 32,719 protein-coding genes and 64 tRNA-encoding genes. The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession no. JNNH00000000.1. PMID:26697393

  9. Whole-genome sequence of Sunxiuqinia dokdonensis DH1(T), isolated from deep sub-seafloor sediment in Dokdo Island.

    PubMed

    Lim, Sooyeon; Chang, Dong-Ho; Kim, Byoung-Chan

    2016-09-01

    Sunxiuqinia dokdonensis DH1(T) was isolated from deep sub-seafloor sediment at a depth of 900 m below the seafloor off Seo-do (the west part of Dokdo Island) in the East Sea of the Republic of Korea and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession LGIA00000000. PMID:27437183

  10. Complete genome sequence of a giant Vibrio phage ValKK3 infecting Vibrio alginolyticus.

    PubMed

    Lal, Tamrin M; Sano, Motohiko; Hatai, Kishio; Ransangan, Julian

    2016-06-01

    This paper describes the complete sequence of a giant lytic marine myophage, Vibrio phage ValKK3, that is specific to Vibrio alginolyticus ATCC(®) 17749™. Vibrio phage ValKK3 was subjected to whole genome sequencing on the MiSeq sequencing platform and annotated using Blast2Go. The complete sequence of the ValKK3 genome was deposited in DDBJ/EMBL/GenBank under accession number KP671755. PMID:27114905

  11. Whole genome sequencing of Halomonas sp. SUBG004 isolated from Little Rann of Kutch, a desert of India.

    PubMed

    Patel, Jigna H; Thaker, Vrinda S

    2015-12-01

    A salt tolerant strain, designated as SUBG004, was isolated from the desert of India, Little Rann of Kutch. The organism is a Gram-negative, facultatively anaerobic and rod shaped bacterium. Chemotaxonomic and phylogenetic properties were consistent with its classification in the genus Halomonas. Here we report the whole genome sequence of Halomonas sp. SUBG004 deposited in DDBJ/EMBL/GenBank under accession number JPEU0100000 which provides insights for salt stress adaptation through betaine synthesis. PMID:26697321

  12. RNA-Seq analysis of urea nutrition responsive transcriptome of Oryza sativa elite indica cultivar RP Bio 226.

    PubMed

    Reddy, Mettu Madhavi; Ulaganathan, Kandasamy

    2015-12-01

    Rice yield is greatly influenced by nitrogen, and rice varieties show variation in yield. To understand the role of urea nutrition in the yield of elite indica rice cultivar RPBio-226, the urea responsive transcriptome was sequenced and analyzed. The raw reads and the Transcriptome Shotgun Assembly project have been deposited at DDBJ/EMBL/GenBank under the accession GDKM00000000. The version described in this paper is the first version, GDKM01000000. PMID:26697348

  13. Effect of site-specific modification on restriction endonucleases and DNA modification methyltransferases.

    PubMed Central

    McClelland, M; Nelson, M; Raschke, E

    1994-01-01

    Restriction endonucleases have site-specific interactions with DNA that can often be inhibited by site-specific DNA methylation and other site-specific DNA modifications. However, such inhibition cannot generally be predicted. The empirically acquired data on these effects are tabulated for over 320 restriction endonucleases. In addition, a table of known site-specific DNA modification methyltransferases and their specificities is presented along with EMBL database accession numbers for cloned genes. PMID:7937074

  14. Draft genome of iron-oxidizing bacterium Leptospirillum sp. YQP-1 isolated from a volcanic lake in the Wudalianchi volcano, China.

    PubMed

    Yan, Lei; Zhang, Shuang; Yu, Gaobo; Ni, Yongqing; Wang, Weidong; Hu, Huixin; Chen, Peng

    2015-12-01

    Leptospirillum sp. YQP-1, a member of iron-oxidizing bacteria was isolated from volcanic lake in northeast China. Here, we report the draft genome sequence of the strain YQP-1 with a total genome size of 3,103,789 bp from 85 scaffolds (104 contigs) with 58.64% G + C content. The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession no. LIEB00000000. PMID:26697362

  15. Compilation of DNA sequences of Escherichia coli K12: description of the interactive databases ECD and ECDC (update 1996).

    PubMed Central

    Kröger, M; Wahl, R

    1997-01-01

    We have compiled the DNA sequence data for Escherichia coli available from the GenBank and EMBL data libraries and independently from the literature. We provide the most definitive version of the ECD Escherichia coli database now exclusively via the World Wide Web System: http://susi.bio.uni-giessen.de/usr/local/www/html/ecdc.html. Our database encloses an assembled set of contiguous sequences. Each of these contigs compiles all available sequence information, including those derived from a variety of elder sequences. The organisation of the database allows precise physical location of each individual gene or regulatory region, even taking into consideration discrepancies in nomenclature. The WWW program allows to branch into the original EMBL and SWISSPROT datafiles. A number of links to other WWW servers is provided. A FASTA and BLAST search may be performed online. Besides the WWW format a flat file version may be obtained via ftp. The ftp version may also be obtained from the EMBL data library as part of the CD-ROM issue of the EMBL sequence database, which is released and updated every 3 months. After deletion of all detected overlaps a total of 3 588 706 individual bp has been determined up to the end of September 1996. This corresponds to a total of 77.09% of the entire E.coli chromosome consisting of approximately 4655 kb. About 479 kb (10.3%) are additionally available from Kyoto (Japan). Another 94 kb (2%) are available, but mapping has not been confirmed. Thus the total may have reached 89.4%. PMID:9016501
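
    The coverage percentages quoted in this abstract can be re-derived from the base-pair counts it gives. The short Python sketch below is only an illustrative check, not part of the original record; the variable and function names are hypothetical, and "approximately 4655 kb" is taken as exactly 4,655,000 bp for the purpose of the arithmetic.

```python
# Figures quoted in the 1996 ECD/ECDC abstract (names below are illustrative).
CHROMOSOME_BP = 4_655_000   # "approximately 4655 kb", treated as exact here
determined_bp = 3_588_706   # non-overlapping bp determined by September 1996
kyoto_bp = 479_000          # additionally available from Kyoto
unmapped_bp = 94_000        # available, but mapping not confirmed


def percent_of_chromosome(bp, total=CHROMOSOME_BP):
    """Return bp as a percentage of the assumed chromosome length."""
    return 100.0 * bp / total


determined_pct = percent_of_chromosome(determined_bp)
kyoto_pct = percent_of_chromosome(kyoto_bp)
unmapped_pct = percent_of_chromosome(unmapped_bp)
total_pct = determined_pct + kyoto_pct + unmapped_pct

print(f"determined: {determined_pct:.2f}%")
print(f"Kyoto:      {kyoto_pct:.1f}%")
print(f"unmapped:   {unmapped_pct:.1f}%")
print(f"total:      {total_pct:.1f}%")
```

    Under that assumption the three shares round to the 77.09%, 10.3% and 2% stated in the abstract, and their sum rounds to the quoted 89.4%.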

  16. Genome sequencing and annotation of Proteus sp. SAS71

    PubMed Central

    Selim, Samy; Hassan, Sherif; Hagagy, Nashwa

    2015-01-01

    We report the draft genome sequence of Proteus sp. strain SAS71, isolated from a water spring in the Aljouf region, Saudi Arabia. The draft genome size is 3,037,704 bp with a G + C content of 39.3% and contains 6 rRNA sequences (single copies of 5S, 16S & 23S rRNA). The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession no. LDIU00000000. PMID:26697338

  17. Draft genome of iron-oxidizing bacterium Leptospirillum sp. YQP-1 isolated from a volcanic lake in the Wudalianchi volcano, China

    PubMed Central

    Yan, Lei; Zhang, Shuang; Yu, Gaobo; Ni, Yongqing; Wang, Weidong; Hu, Huixin; Chen, Peng

    2015-01-01

    Leptospirillum sp. YQP-1, a member of iron-oxidizing bacteria was isolated from volcanic lake in northeast China. Here, we report the draft genome sequence of the strain YQP-1 with a total genome size of 3,103,789 bp from 85 scaffolds (104 contigs) with 58.64% G + C content. The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession no. LIEB00000000. PMID:26697362

  18. Genome sequencing and annotation of Serratia sp. strain TEL.

    PubMed

    Lephoto, Tiisetso E; Gray, Vincent M

    2015-12-01

    We present the annotation of the draft genome sequence of Serratia sp. strain TEL (GenBank accession number KP711410). This organism was isolated from entomopathogenic nematode Oscheius sp. strain TEL (GenBank accession number KM492926) collected from grassland soil and has a genome size of 5,000,541 bp and 542 subsystems. The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession number LDEG00000000. PMID:26697332

  19. JINN, an integrated software package for molecular geneticists.

    PubMed Central

    Johnsen, M

    1984-01-01

    I describe JINN, a microcomputer-based system designed to maintain and search a strain collection, to enter, modify and analyze sequences, and to use the EMBL Sequence Data Base. The major objective during development of this program has been integration of individual program modules to ensure a consistent and helpful user interface. The system is running under the CP/M operating system and requires little in the way of particular hardware configuration. PMID:6320101

  20. Genomic DNA sequence of a rice gene coding for a pullulanase-type of starch debranching enzyme.

    PubMed

    Francisco, P B; Zhang, Y; Park, S Y; Ogata, N; Yamanouchi, H; Nakamura, Y

    1998-09-01

    A genomic DNA containing a rice (Oryza sativa L., cv. Norin-8) gene coding for a pullulanase-type starch debranching enzyme (EC 3.2.1.41) was sequenced (EMBL/GenBank/DDBJ accession number AB012915). Along the 15,248 bp DNA, the pullulanase gene is split into 26 exons. The four pullulanase consensus regions are positioned in the middle portion of the sequence and are separated by long introns and 1-3 exons. Comparison of the rice cv. Norin-8 pullulanase genomic structure with that of barley pullulanase (limit dextrinase) (F. Lok et al., EMBL/GenBank/DDBJ accession number AF022725) indicates that most of the pullulanase exons are highly conserved. Alignment of the nucleotide bases of rice exon 8 with those of barley exon 8-intron 8-exon 9 fragment suggests that the 85 bp internal sequence of rice exon 8 was originally an intron, a possibility further indicated by the absence in barley and spinach (A. Renz et al., EMBL/GenBank/DDBJ accession number X83969) pullulanases of amino acid residues encoded by the 85 bp fragment. PMID:9748665

  1. Compilation of DNA sequences of Escherichia coli (update 1993).

    PubMed Central

    Kröger, M; Wahl, R; Rice, P

    1993-01-01

    We have compiled the DNA sequence data for E. coli available from the GENBANK and EMBL data libraries and over a period of several years independently from the literature. This is the fifth listing replacing and increasing the former listings substantially. However, in order to save space this printed version contains DNA sequence information only if it is publicly available in electronic form. The complete compilation including a full set of genetic map data and the E. coli protein index can be obtained in machine readable form from the EMBL data library (ECD release 15) as a part of the CD-ROM issue of the EMBL sequence database, released and updated every three months. After deletion of all detected overlaps a total of 2,353,635 individual bp is found to be determined till the end of April 1993. This corresponds to a total of 49.87% of the entire E. coli chromosome consisting of about 4,720 kbp. This number may actually be higher by 9161 bp derived from other strains of E. coli. PMID:8332520

  2. Predicting the protein targets for athletic performance-enhancing substances

    PubMed Central

    2013-01-01

    Background: The World Anti-Doping Agency (WADA) publishes the Prohibited List, a manually compiled international standard of substances and methods prohibited in-competition, out-of-competition and in particular sports. It would be ideal to be able to identify all substances that have one or more performance-enhancing pharmacological actions in an automated, fast and cost effective way. Here, we use experimental data derived from the ChEMBL database (~7,000,000 activity records for 1,300,000 compounds) to build a database model that takes into account both structure and experimental information, and use this database to predict both on-target and off-target interactions between these molecules and targets relevant to doping in sport. Results: The ChEMBL database was screened and eight well populated categories of activities (Ki, Kd, EC50, ED50, activity, potency, inhibition and IC50) were used for a rule-based filtering process to define the labels “active” or “inactive”. The “active” compounds for each of the ChEMBL families were thereby defined and these populated our bioactivity-based filtered families. A structure-based clustering step was subsequently performed in order to split families with more than one distinct chemical scaffold. This produced refined families, whose members share both a common chemical scaffold and bioactivity against a common target in ChEMBL. Conclusions: We have used the Parzen-Rosenblatt machine learning approach to test whether compounds in ChEMBL can be correctly predicted to belong to their appropriate refined families. Validation tests using the refined families gave a significant increase in predictivity compared with the filtered or with the original families. Out of 61,660 queries in our Monte Carlo cross-validation, belonging to 19,639 refined families, 41,300 (66.98%) had the parent family as the top prediction and 53,797 (87.25%) had the parent family in the top four hits. Having thus validated our approach, we used

  3. Identifying duplicate crystal structures: XTALCOMP, an open-source solution

    NASA Astrophysics Data System (ADS)

    Lonie, David C.; Zurek, Eva

    2012-03-01

    applications may consider enantiomorphic structures to be identical. Solution method: The XtalComp algorithm overcomes these issues to detect duplicate structures regardless of differences in representation. It begins by performing a Niggli reduction on the inputs, standardizing the translation vectors and orientations. A transform search is performed to identify candidate sets of rotations, reflections, and translations that potentially map the description of one crystal onto the other, solving the problems of enantiomorphs and rotationally degenerate lattices. The atomic positions resulting from each candidate transform are then compared, using a cell-expansion technique to remove periodic boundary issues. Computational noise is treated by comparing non-integer quantities using a specified tolerance. Running time: The test run provided takes less than a second to complete.
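
    The final comparison step described in this abstract (matching atomic positions within a tolerance while ignoring periodic-boundary wrap-around) can be illustrated with a small Python sketch. This is not the XtalComp code: it assumes both structures are already expressed in the same reduced cell, uses a simple minimum-image wrap on fractional coordinates instead of the cell-expansion technique the abstract mentions, and every function name is hypothetical.

```python
def frac_delta(a, b):
    """Smallest signed difference between two fractional coordinates
    on a periodic axis (result lies in the range -0.5 .. 0.5)."""
    d = (a - b) % 1.0
    return d - 1.0 if d > 0.5 else d


def same_site(p, q, tol=1e-3):
    """True if two fractional positions agree on every axis within tol."""
    return all(abs(frac_delta(a, b)) <= tol for a, b in zip(p, q))


def same_positions(atoms_a, atoms_b, tol=1e-3):
    """Compare two lists of (element, (x, y, z)) entries, ignoring order.

    Greedy matching: adequate when tol is much smaller than the
    shortest interatomic separation, which is assumed here.
    """
    if len(atoms_a) != len(atoms_b):
        return False
    unmatched = list(atoms_b)
    for elem, pos in atoms_a:
        for i, (elem_b, pos_b) in enumerate(unmatched):
            if elem == elem_b and same_site(pos, pos_b, tol):
                del unmatched[i]
                break
        else:
            return False  # no partner found for this atom
    return True


# Two descriptions of the same two-atom cell: different atom order,
# and one coordinate written as 0.9999 / 1.0 instead of 0.0.
cell_1 = [("Na", (0.0, 0.0, 0.0)), ("Cl", (0.5, 0.5, 0.5))]
cell_2 = [("Cl", (0.5, 0.5, 0.5)), ("Na", (0.9999, 1.0, 0.0))]
cell_3 = [("Na", (0.1, 0.0, 0.0)), ("Cl", (0.5, 0.5, 0.5))]

print(same_positions(cell_1, cell_2))
print(same_positions(cell_1, cell_3))
```

    A fuller duplicate check along the lines of the abstract would first apply the Niggli reduction and the candidate rotations, reflections and translations, and only then run a position comparison of this kind on each candidate.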

  4. Crystallization and preliminary X-ray analysis of a native human tRNA synthetase whose allelic variants are associated with Charcot–Marie–Tooth disease

    SciTech Connect

    Xie, Wei; Schimmel, Paul; Yang, Xiang-Lei

    2006-12-01

    Glycyl-tRNA synthetase (GlyRS) is one of a group of enzymes that catalyze the synthesis of aminoacyl-tRNAs for translation. Mutations of human and mouse GlyRSs are causally associated with Charcot–Marie–Tooth disease, the most common genetic disorder of the peripheral nervous system. As the first step towards a structure–function analysis of this disease, native human GlyRS was expressed, purified and crystallized. The crystal belonged to space group P4{sub 3}2{sub 1}2 or its enantiomorphic space group P4{sub 1}2{sub 1}2, with unit-cell parameters a = b = 91.74, c = 247.18 Å, and diffracted X-rays to 3.0 Å resolution. The asymmetric unit contained one GlyRS molecule and had a solvent content of 69%.

  5. The Time 'Onewayness' Shared by Quantum Mechanics and Relativity

    SciTech Connect

    Guzzetta, Giuseppe

    2006-11-03

    The measure of the mutation, or change, any material elementary particle unceasingly undergoes, is defined as that of the displacement of a point moving in a three-dimensional Euclidean space, at the velocity of light, on a trajectory decomposable in a rotation and a translation. The rotation accounts for the spin angular momentum of the particle, the translation for its change of location. Then, an elementary mutation is proportional to an elementary interval of universal time. The connection between space and time is such that the operation of universal time conjugation, that is, the change of sign of t, involves space inversion, so coinciding with the operation currently defined as TCP. It implies that to a given physical process, another equally possible one corresponds in which the sequence of events (that still follow the same time course) is reversed, and actors are the enantiomorphic counterparts (anti-particles instead of particles, and vice versa) of those playing in the first physical process. Since no alternative is left to any elementary particle, that exists in that it undergoes an everlasting mutation, the unidirectionality of time must not be understood as a choice between two alternative directions. Many formalisms of Special Relativity can be derived from the above definition of the mutation of a material elementary particle. Anyhow, some discordances seem to crop out whose discussion is beyond the purpose of the present paper.

  6. Crystal chemistry of layered structures formed by linear rigid silyl-capped molecules

    PubMed Central

    Lumpi, Daniel; Kautny, Paul; Stöger, Berthold; Fröhlich, Johannes

    2015-01-01

    The crystallization behavior of methylthio- or methylsulfonyl-containing spacer extended Z,Z-bis-ene–yne molecules capped with trimethylsilyl groups obtained by (tandem) thiophene ring fragmentation and of two non-spacer extended analogs were investigated. The rigid and linear molecules generally crystallized in layers whereby the flexibility of the layer interfaces formed by the silyl groups leads to a remarkably rich crystal chemistry. The molecules with benzene and thiophene spacers both crystallized with C2/c symmetry and can be considered as merotypes. Increasing the steric bulk of the core by introduction of ethylenedioxythiophene (EDOT) gave a structure incommensurately modulated in the [010] direction. Further increase of steric demand in the case of a dimethoxythiophene restored periodicity along [010] but resulted in a doubling of the c vector. Two different polytypes were observed, which feature geometrically different layer interfaces (non-OD, order–disorder, polytypes), one with a high stacking fault probability. Oxidation of the methylthio groups of the benzene-based molecule to methylsulfonyl groups led to three polymorphs (two temperature-dependent), which were analyzed by Hirshfeld surface d{sub e}/d{sub i} fingerprint plots. The analogously oxidized EDOT-based molecule crystallized as systematic twins owing to its OD polytypism. Shortening of the backbone by removal of the aryl core resulted in an enantiomorphic structure and a further shortening by removal of a methylthio-ene fragment again in a systematically twinned OD polytype. PMID:26306200

  7. Crystallization and preliminary X-ray crystallographic analysis of the heterodimeric crotoxin complex and the isolated subunits crotapotin and phospholipase A{sub 2}

    SciTech Connect

    Santos, K. F.; Murakami, M. T.; Toyama, M. H.; Marangoni, S.; Forrer, V. P.; Brandão Neto, J. R.; Polikarpov, I.; Arni, R. K.

    2007-04-01

    Crotoxin, a potent neurotoxin from the venom of the South American rattlesnake Crotalus durissus terrificus, exists as a heterodimer formed between a phospholipase A{sub 2} and a catalytically inactive acidic phospholipase A{sub 2} analogue (crotapotin). Large single crystals of the crotoxin complex and of the isolated subunits have been obtained. The crotoxin complex crystal belongs to the orthorhombic space group P2{sub 1}2{sub 1}2, with unit-cell parameters a = 38.2, b = 68.7, c = 84.2 Å, and diffracted to 1.75 Å resolution. The crystal of the phospholipase A{sub 2} domain belongs to the hexagonal space group P6{sub 1}22 (or its enantiomorph P6{sub 5}22), with unit-cell parameters a = b = 38.7, c = 286.7 Å, and diffracted to 2.6 Å resolution. The crotapotin crystal diffracted to 2.3 Å resolution; however, the highly diffuse diffraction pattern did not permit unambiguous assignment of the unit-cell parameters.

  8. On the microstructure and symmetry of apparently hexagonal BaAl{sub 2}O{sub 4}

    NASA Astrophysics Data System (ADS)

    Larsson, A.-K.; Withers, R. L.; Perez-Mato, J. M.; Fitz Gerald, J. D.; Saines, P. J.; Kennedy, B. J.; Liu, Y.

    2008-08-01

    The P6{sub 3} (a = 2a{sub p}, b = 2b{sub p}, c = c{sub p}) crystal structure reported for BaAl{sub 2}O{sub 4} at room temperature has been carefully re-investigated by a combined transmission electron microscopy and neutron powder diffraction study. It is shown that the poor fit of this P6{sub 3} (a = 2a{sub p}, b = 2b{sub p}, c = c{sub p}) structure model for BaAl{sub 2}O{sub 4} to neutron powder diffraction data is primarily due to the failure to take into account coherent scattering between different domains related by enantiomorphic twinning of the P6{sub 3}22 parent sub-structure. Fast Fourier transformation of [0 0 1] lattice images from small localized real space regions (~10 nm in diameter) is used to show that the P6{sub 3} (a = 2a{sub p}, b = 2b{sub p}, c = c{sub p}) crystal structure reported for BaAl{sub 2}O{sub 4} is not correct on the local scale. The correct local symmetry of the very small nano-domains is most likely orthorhombic or monoclinic.

  9. Robust Cross-Linked Stereocomplexes and C60 Inclusion Complexes of Vinyl-Functionalized Stereoregular Polymers Derived from Chemo/Stereoselective Coordination Polymerization.

    PubMed

    Vidal, Fernando; Falivene, Laura; Caporaso, Lucia; Cavallo, Luigi; Chen, Eugene Y-X

    2016-08-01

    The successful synthesis of highly syndiotactic polar vinyl polymers bearing the reactive pendant vinyl group on each repeat unit, which is enabled by perfectly chemoselective and highly syndiospecific coordination polymerization of divinyl polar monomers developed through this work, has allowed the construction of robust cross-linked supramolecular stereocomplexes and C60 inclusion complexes. The metal-mediated coordination polymerization of three representative polar divinyl monomers, including vinyl methacrylate (VMA), allyl methacrylate (AMA), and N,N-diallyl acrylamide (DAA) by Cs-ligated zirconocenium ester enolate catalysts under ambient conditions exhibits complete chemoselectivity and high stereoselectivity, thus producing the corresponding vinyl-functionalized polymers with high (92% rr) to quantitative (>99% rr) syndiotacticity. A combined experimental (synthetic, kinetic, and mechanistic) and theoretical (DFT) investigation has yielded a unimetallic, enantiomorphic-site-controlled propagation mechanism. Postfunctionalization of the obtained syndiotactic vinyl-functionalized polymers via the thiol-ene click and photocuring reactions readily produced the corresponding thiolated polymers and flexible cross-linked thin-film materials, respectively. Complexation of such syndiotactic vinyl-functionalized polymers with isotactic poly(methyl methacrylate) and fullerene C60 generates supramolecular crystalline helical stereocomplexes and inclusion complexes, respectively. Cross-linking of such complexes affords robust cross-linked stereocomplexes that are solvent-resistant and also exhibit considerably enhanced thermal and mechanical properties compared with the un-cross-linked stereocomplexes. PMID:27388024

  10. Cloning, purification and preliminary crystallographic analysis of a putative DNA-binding membrane protein, YmfM, from Staphylococcus aureus

    SciTech Connect

    Xu, Ling; Sedelnikova, Svetlana E.; Baker, Patrick J.; Rice, David W.

    2008-07-01

    Truncation by the removal of the C-terminal hydrophobic transmembrane anchor has enabled the overexpression of a soluble domain of S. aureus YmfM in Escherichia coli, which has then been purified and subsequently crystallized. The Staphylococcus aureus protein YmfM contains a helix–turn–helix motif and is thought to be a putative DNA-binding protein which is associated with the membrane through a C-terminal hydrophobic transmembrane anchor. Truncation of the protein by the removal of this C-terminal hydrophobic segment has enabled the overexpression of a soluble domain of S. aureus YmfM (ΔYmfM) in Escherichia coli, which has been purified and subsequently crystallized. Crystals of ΔYmfM diffract to beyond 1.0 Å resolution and belong to one of the pair of enantiomorphic tetragonal space groups P4{sub 1}2{sub 1}2 or P4{sub 3}2{sub 1}2, with unit-cell parameters a = b = 45.5, c = 72.9 Å and one molecule in the asymmetric unit. The crystals of ΔYmfM have an unusually low V{sub M} of 1.6 Å{sup 3} Da{sup −1}, which is one of the lowest values observed for any protein to date. A full structure determination is under way in order to provide insights into the function of this protein.
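
    The packing figures in this abstract can be related to one another with the usual Matthews-coefficient arithmetic. The Python sketch below is illustrative only and rests on assumptions that are not stated in the record: eight asymmetric units per cell for P4{sub 1}2{sub 1}2 / P4{sub 3}2{sub 1}2, and the common approximation that the solvent fraction is 1 - 1.23/V{sub M}. The molecular mass it prints is merely the value implied by the quoted numbers, not a reported one, and all names are hypothetical.

```python
# Values quoted in the abstract.
a = b = 45.5          # unit-cell edges in angstroms
c = 72.9
v_m = 1.6             # Matthews coefficient, cubic angstroms per dalton
molecules_per_asu = 1

# Assumptions (not from the abstract).
ASU_PER_CELL = 8                 # general positions in P4(1)2(1)2 / P4(3)2(1)2
PROTEIN_SPECIFIC_VOLUME = 1.23   # constant in the common Matthews approximation

# Tetragonal cell: all angles are 90 degrees, so the volume is a * b * c.
cell_volume = a * b * c

# V_M = cell_volume / (molecules per cell * mass), solved for the mass.
molecules_per_cell = ASU_PER_CELL * molecules_per_asu
implied_mass_da = cell_volume / (molecules_per_cell * v_m)

# Approximate solvent fraction from the Matthews relation.
solvent_fraction = 1.0 - PROTEIN_SPECIFIC_VOLUME / v_m

print(f"cell volume:      {cell_volume:.1f} cubic angstroms")
print(f"implied mass:     {implied_mass_da:.0f} Da")
print(f"solvent fraction: {solvent_fraction:.0%}")
```

    Under these assumptions the quoted cell and V{sub M} correspond to a molecule of roughly 12 kDa and a solvent content of about 23%, consistent with the abstract's remark that the packing is unusually tight.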

  11. Theoretical estimates of the anapole magnetizabilities of C{sub 4}H{sub 4}X{sub 2} cyclic molecules for X=O, S, Se, and Te

    SciTech Connect

    Pagola, G. I.; Ferraro, M. B.; Provasi, P. F.; Pelloni, S.; Lazzeretti, P.

    2014-09-07

    Calculations have been carried out for C₄H₄X₂ cyclic molecules, with X=O, S, Se, and Te, characterized by the presence of magnetic-field induced toroidal electron currents and associated orbital anapole moments. The orbital anapole induced by a static nonuniform magnetic field B, with uniform curl C=∇×B, is rationalized via a second-rank anapole magnetizability tensor a(αβ), defined as minus the second derivative of the second-order interaction energy with respect to the components C(α) and B(β). The average anapole magnetizability a̅ equals −χ̅, the pseudoscalar obtained by spatial averaging of the dipole-quadrupole magnetizability χ(α,βγ). It has different sign for D and L enantiomeric systems and can therefore be used for chiral discrimination. Therefore, in an isotropic chiral medium, a homogeneous magnetic field induces an electronic anapole A(α), having the same magnitude, but opposite sign, for two enantiomorphs.

  12. Hydrodynamic interactions between two forced objects of arbitrary shape. II. Relative translation

    NASA Astrophysics Data System (ADS)

    Goldfriend, Tomer; Diamant, Haim; Witten, Thomas A.

    2016-04-01

    We study the relative translation of two arbitrarily shaped objects, caused by their hydrodynamic interaction as they are forced through a viscous fluid in the limit of zero Reynolds number. It is well known that in the case of two rigid spheres in an unbounded fluid, the hydrodynamic interaction does not produce relative translation. More generally, such an effective pair-interaction vanishes in configurations with spatial inversion symmetry; for example, an enantiomorphic pair in mirror image positions has no relative translation. We show that the breaking of inversion symmetry by boundaries of the system accounts for the interactions between two spheres in confined geometries, as observed in experiments. The same general principle also provides new predictions for interactions in other object configurations near obstacles. We examine the time-dependent relative translation of two self-aligning objects, extending the numerical analysis of our preceding publication [Goldfriend, Diamant, and Witten, Phys. Fluids 27, 123303 (2015)], 10.1063/1.4936894. The interplay between the orientational interaction and the translational one, in most cases, leads over time to repulsion between the two objects. The repulsion is qualitatively different for self-aligning objects compared to the more symmetric case of uniform prolate spheroids. The separation between the two objects increases with time t as t^(1/3) in the former case, and more strongly, as t, in the latter.

  13. A lifelong Odyssey: from structural and morphological engineering of functional solids to bio-chirogenisis and pathological crystallization

    NASA Astrophysics Data System (ADS)

    Lahav, Meir; Leiserowitz, Leslie

    2015-11-01

    This cooperative endeavour first describes early studies in chemical crystallography, encompassing molecular packing modes, characterization of weak hydrogen bonds, the engineering of functional crystals and monitoring of reaction pathways in molecular crystals by x-ray and neutron diffraction. With the design of ‘tailor-made’ auxiliary molecules, it became possible to correlate molecular enantiomerism and crystal enantiomorphism, to control the early stages of crystal nucleation, to resolve enantiomers by crystallization, induce the precipitation of metastable polymorphs, and shed light on the role played by solvent on crystal growth. With such auxiliaries, the structure of mixed crystals was revised and the ability to perform ‘absolute’ asymmetric synthesis in host centrosymmetric crystals demonstrated. With the introduction of grazing incidence synchrotron x-ray diffraction from liquid surfaces it also became possible to design and characterize crystalline thin film architectures at the air-water interface providing a general insight on the mechanism of crystal nucleation at the molecular level, in particular that of ice and cholesterol. Finally, the collective knowhow from these studies was crucial for obtaining homochiral peptides prepared from the polymerization of racemates of amphiphilic amino acids dissolved in aqueous solution, and for experiments towards elucidating the pathological crystallization of cholesterol and the malaria pigment in Plasmodium-infected red blood cells.

  14. Two pseudo-enantiomeric forms of N-benzyl-4-hydroxy-1-methyl-2,2-dioxo-1H-2λ(6),1-benzothiazine-3-carboxamide and their analgesic properties.

    PubMed

    Ukrainets, Igor V; Shishkina, Svitlana V; Baumer, Vyacheslav N; Gorokhova, Olga V; Petrushova, Lidiya A; Sim, Galina

    2016-05-01

    The fact that molecular crystals exist as different polymorphic modifications and the identification of as many polymorphs as possible are important considerations for the pharmaceutical industry. The molecule of N-benzyl-4-hydroxy-1-methyl-2,2-dioxo-1H-2λ(6),1-benzothiazine-3-carboxamide, C17H16N2O4S, does not contain a stereogenic atom, but intramolecular hydrogen-bonding interactions engender enantiomeric chiral conformations as a labile racemic mixture. The title compound crystallized in a solvent-dependent single chiral conformation within one of two conformationally polymorphic P2₁2₁2₁ orthorhombic chiral crystals (denoted forms A and B). Each of these pseudo-enantiomorphic crystals contains one of two pseudo-enantiomeric diastereomers. Form A was obtained from methylene chloride and form B can be crystallized from N,N-dimethylformamide, ethanol, ethyl acetate or xylene. Pharmacological studies with solid-particulate suspensions have shown that crystalline form A exhibits an almost fourfold higher antinociceptive activity compared to form B. PMID:27146570

  15. On the group-theoretical approach to the study of interpenetrating nets.

    PubMed

    Baburin, Igor A

    2016-05-01

    Using group-subgroup and group-supergroup relations, a general theoretical framework is developed to describe and derive interpenetrating 3-periodic nets. The generation of interpenetration patterns is readily accomplished by replicating a single net with a supergroup G of its space group H under the condition that site symmetries of vertices and edges are the same in both H and G. It is shown that interpenetrating nets cannot be mapped onto each other by mirror reflections because otherwise edge crossings would necessarily occur in the embedding. For the same reason any other rotation or roto-inversion axes from G \ H are not allowed to intersect vertices or edges of the nets. This property significantly narrows the set of supergroups to be included in the derivation of interpenetrating nets. A procedure is described based on the automorphism group of a Hopf ring net [Alexandrov et al. (2012). Acta Cryst. A68, 484-493] to determine maximal symmetries compatible with interpenetration patterns. The proposed approach is illustrated by examples of twofold interpenetrated utp, dia and pcu nets, as well as multiple copies of enantiomorphic quartz (qtz) networks. Some applications to polycatenated 2-periodic layers are also discussed. PMID:27126113

  16. Kaolin polytypes revisited ab initio.

    PubMed

    Mercier, Patrick H J; Le Page, Yvon

    2008-04-01

    The well known 36 distinguishable transformations between adjacent kaolin layers are split into 20 energetically distinguishable transformations (EDT) and 16 enantiomorphic transformations, hereafter denoted EDT*. For infinitesimal energy contribution of interactions between non-adjacent layers, the lowest-energy models must result from either (a) repeated application of an EDT or (b) alternate application of an EDT and its EDT*. All modeling, quantum input preparation and interpretation was performed with Materials Toolkit, and quantum optimizations with VASP. Kaolinite and dickite are the lowest-energy models at zero temperature and pressure, whereas nacrite and HP-dickite are the lowest-enthalpy models under moderate pressures based on a rough enthalpy/pressure graph built from numbers given in the supplementary tables. Minor temperature dependence of this calculated 0 K graph would explain the bulk of the current observations regarding synthesis, diagenesis and transformation of kaolin minerals. Other stackings that we list have energies so competitive that they might crystallize at ambient pressure. A homometric pair of energetically distinguishable ideal models, one of them for nacrite, is exposed. The printed experimental structure of nacrite correctly corresponds to the stable member of the pair. In our opinion, all recent literature measurements of the free energy of bulk kaolinite are too negative by approximately 15 kJ mol(-1) for some unknown reason. PMID:18369284

  17. Asymmetric Autocatalysis Induced by Chiral Crystals of Achiral Tetraphenylethylenes

    NASA Astrophysics Data System (ADS)

    Kawasaki, Tsuneomi; Nakaoda, Mai; Kaito, Nobuhiro; Sasagawa, Taisuke; Soai, Kenso

    2010-02-01

    The achiral hydrocarbon tetraphenylethylene crystallizes in enantiomorphous forms (chiral space group: P2₁) to afford right- and left-handed hemihedral crystals, which can be recognized by solid-state circular dichroism spectroscopic analysis. Chiral organic crystals of tetraphenylethylene mediated enantioselective addition of diisopropylzinc to pyrimidine-5-carbaldehyde to give, in conjunction with asymmetric autocatalysis with amplification of chirality, almost enantiomerically pure (S)- and (R)-5-pyrimidyl alkanols whose absolute configurations were controlled efficiently by the crystalline chirality of the tetraphenylethylene substrate. Tetrakis(p-chlorophenyl)ethylene and tetrakis(p-bromophenyl)ethylene also show chirality in the crystalline state, which can also act as a chiral substrate and induce enantioselectivity of diisopropylzinc addition to pyrimidine-5-carbaldehyde in asymmetric autocatalysis to give enantiomerically enriched 5-pyrimidyl alkanols with the absolute configuration correlated with that of the chiral crystals. Highly enantioselective synthesis has been achieved using chiral crystals composed of achiral hydrocarbons, tetraphenylethylenes, as chiral inducers. This chemical system enables significant amplification of the amount of chirality using spontaneously formed chiral crystals of achiral organic compounds as the seed for the chirality of asymmetric autocatalysis.

  18. Corolla chirality does not contribute to directed pollen movement in Hypericum perforatum (Hypericaceae): mirror image pinwheel flowers function as radially symmetric flowers in pollination.

    PubMed

    Diller, Carolina; Fenster, Charles B

    2016-07-01

    Corolla chirality, the pinwheel arrangement of petals within a flower, is found throughout the core eudicots. In 15 families, different chiral type flowers (i.e., right or left rotated corolla) exist on the same plant, and this condition is referred to as unfixed/enantiomorphic corolla chirality. There are no investigations on the significance of unfixed floral chirality on directed pollen movement even though analogous mirror image floral designs, for example, enantiostyly, have evolved in response to selection to direct pollinator and pollen movement. Here, we examine the role of corolla chirality on directing pollen transfer, pollinator behavior, and its potential influence on disassortative mating. We quantified pollen transfer and pollinator behavior and movement for both right and left rotated flowers in two populations of Hypericum perforatum. In addition, we quantified the number of right and left rotated flowers at the individual level. Pollinators were indifferent to corolla chirality resulting in no difference in pollen deposition between right and left flowers. Corolla chirality had no effect on pollinator and pollen movement between and within chiral morphs. Unlike other mirror image floral designs, corolla chirality appears to play no role in promoting disassortative mating in this species. PMID:27547334

  19. Purification, Crystallization and Preliminary X-ray Crystallographic Studies of RAIDD Death-Domain (DD)

    SciTech Connect

    Jang, T.; Park, H

    2009-01-01

    Caspase-2 activation by formation of PIDDosome is critical for genotoxic stress induced apoptosis. PIDDosome is composed of three proteins, RAIDD, PIDD, and Caspase-2. RAIDD is an adaptor protein containing an N-terminal Caspase-Recruiting-Domain (CARD) and a C-terminal Death-Domain (DD). Its interactions with Caspase-2 and PIDD through CARD and DD respectively and formation of PIDDosome are important for the activation of Caspase-2. RAIDD DD cloned into pET26b vector was expressed in E. coli cells and purified by nickel affinity chromatography and gel filtration. Although it has been known that most DDs are not soluble under physiological conditions, RAIDD DD was soluble and interacts tightly with PIDD DD under physiological conditions. The purified RAIDD DD alone has been crystallized. Crystals are trigonal and belong to space group P3₁21 (or its enantiomorph P3₂21) with unit-cell parameters a = 56.3, b = 56.3, c = 64.9 Å and γ = 120°. The crystals were obtained at room temperature and diffracted to 2.0 Å resolution.
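    As a small worked example, the unit-cell volume follows directly from the cell parameters in this record. The sketch below assumes the cell lengths are in angstroms and reads the 120-degree angle as γ, which is what a trigonal cell on hexagonal axes requires; the function name and the value Z = 6 (the general-position multiplicity of P3₁21 and P3₂21) are assumptions introduced for illustration, not figures from the abstract.

```python
import math

# Sketch: unit-cell volume for a cell with alpha = beta = 90 degrees,
# which covers trigonal/hexagonal (gamma = 120) cells such as this one.
# Assumptions: lengths in angstroms, the 120-degree angle is gamma.

def cell_volume_ab_gamma(a: float, b: float, c: float, gamma_deg: float) -> float:
    """V = a * b * c * sin(gamma) when alpha = beta = 90 degrees."""
    return a * b * c * math.sin(math.radians(gamma_deg))


volume = cell_volume_ab_gamma(56.3, 56.3, 64.9, 120.0)

# Assumption: 6 asymmetric units per cell (general positions of P3(1)21 / P3(2)21).
volume_per_asu = volume / 6

print(f"cell volume            = {volume:.0f} cubic angstroms")
print(f"volume per asym. unit  = {volume_per_asu:.0f} cubic angstroms")
```

    Dividing the per-asymmetric-unit volume by the protein mass would give a Matthews coefficient, but the mass of the RAIDD DD construct is not given in the abstract, so that step is left out.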

  20. Octagonal symmetry in low-discrepancy β-manganese.

    PubMed

    Hornfeck, Wolfgang; Kuhn, Philipp

    2014-09-01

    A low-discrepancy cubic variant of β-Mn is presented exhibiting local octagonal symmetry upon projection along any of the three mutually perpendicular 〈100〉 axes. Ideal structural parameters are derived to be $x(8c) = (2-\sqrt{2})/16$ and $y(12d) = 1/(4\sqrt{2})$ for the P4₁32 enantiomorph. A comparison of the actual and ideal structure models of β-Mn is made in terms of the newly devised concept of geometrical discrepancy maps. Two-dimensional maps of both the geometrical star discrepancy $D^{*}$ and the minimal interatomic distance $d_{\min}$ are calculated over the combined structural parameter range $0 \leq x(8c) < 1/8$ and $1/8 \leq y(12d) < 1/4$ of generalized β-Mn type structures, showing that the 'octagonal' variant of β-Mn is almost optimal in terms of globally minimizing $D^{*}$ while at the same time globally maximizing $d_{\min}$. Geometrical discrepancy maps combine predictive and discriminatory powers to appear useful within a wide range of structural chemistry studies. PMID:25176992
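    The two ideal parameters in this record are closed-form expressions, so it is easy to evaluate them and confirm that they fall inside the parameter window the abstract scans (x(8c) from 0 up to 1/8, y(12d) from 1/8 up to 1/4). The following is only an arithmetic check; the variable and function names are made up for the example.

```python
import math

# Ideal structural parameters quoted for the P4(1)32 enantiomorph.
x_8c = (2 - math.sqrt(2)) / 16    # x(8c)  = (2 - sqrt 2) / 16
y_12d = 1 / (4 * math.sqrt(2))    # y(12d) = 1 / (4 sqrt 2)


def in_scanned_range(x: float, y: float) -> bool:
    """True if (x, y) lies in 0 <= x < 1/8 and 1/8 <= y < 1/4."""
    return 0 <= x < 1 / 8 and 1 / 8 <= y < 1 / 4


print(f"x(8c)  = {x_8c:.6f}")
print(f"y(12d) = {y_12d:.6f}")
print("inside scanned range:", in_scanned_range(x_8c, y_12d))
```

    Numerically the values are about 0.0366 and 0.1768, both inside the scanned window.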

  1. Theoretical estimates of the anapole magnetizabilities of C4H4X2 cyclic molecules for X=O, S, Se, and Te

    NASA Astrophysics Data System (ADS)

    Pagola, G. I.; Ferraro, M. B.; Provasi, P. F.; Pelloni, S.; Lazzeretti, P.

    2014-09-01

    Calculations have been carried out for C4H4X2 cyclic molecules, with X=O, S, Se, and Te, characterized by the presence of magnetic-field induced toroidal electron currents and associated orbital anapole moments. The orbital anapole induced by a static nonuniform magnetic field B, with uniform curl C=∇×B, is rationalized via a second-rank anapole magnetizability tensor a(αβ), defined as minus the second derivative of the second-order interaction energy with respect to the components C(α) and B(β). The average anapole magnetizability a̅ equals -χ̅, the pseudoscalar obtained by spatial averaging of the dipole-quadrupole magnetizability χ(α,βγ). It has different sign for D and L enantiomeric systems and can therefore be used for chiral discrimination. Therefore, in an isotropic chiral medium, a homogeneous magnetic field induces an electronic anapole A(α), having the same magnitude, but opposite sign, for two enantiomorphs.

  2. Magnetizabilities of diatomic and linear triatomic molecules in a time-independent nonuniform magnetic field.

    PubMed

    Provasi, P F; Pagola, G I; Ferraro, M B; Pelloni, S; Lazzeretti, P

    2014-08-21

    The theory of response of a molecule in the presence of a static nonuniform magnetic field with uniform gradient is reviewed and extended. Induced magnetic dipole, quadrupole, and anapole moments are expressed via multipole magnetic susceptibilities. Dependence of response properties on the origin of the coordinate system with respect to which they are defined is investigated. Relationships describing the change of multipole and anapole susceptibilities in a translation of the reference system are reported. For a single molecule, two quantities are invariant and, in principle, experimentally measurable, that is, the induced magnetic dipole and the interaction energy. The trace of a second-rank anapole susceptibility, related to a pseudoscalar obtained by spatial averaging of the dipole-quadrupole susceptibility, of different sign for D and L enantiomeric systems, is origin independent. Therefore, in an isotropic chiral medium a homogeneous magnetic field induces an electronic anapole, having the same magnitude but opposite sign for two enantiomorphs. Calculations have been carried out for a set of diatomic and linear triatomic systems characterized by the presence of magnetic-field induced toroidal electron currents. PMID:24171551

  3. Theoretical estimates of the anapole magnetizabilities of C₄H₄X₂ cyclic molecules for X=O, S, Se, and Te.

    PubMed

    Pagola, G I; Ferraro, M B; Provasi, P F; Pelloni, S; Lazzeretti, P

    2014-09-01

    Calculations have been carried out for C4H4X2 cyclic molecules, with X=O, S, Se, and Te, characterized by the presence of magnetic-field induced toroidal electron currents and associated orbital anapole moments. The orbital anapole induced by a static nonuniform magnetic field B, with uniform curl C=∇×B, is rationalized via a second-rank anapole magnetizability tensor a(αβ), defined as minus the second derivative of the second-order interaction energy with respect to the components C(α) and B(β). The average anapole magnetizability a̅ equals -χ̅, the pseudoscalar obtained by spatial averaging of the dipole-quadrupole magnetizability χ(α,βγ). It has different sign for D and L enantiomeric systems and can therefore be used for chiral discrimination. Therefore, in an isotropic chiral medium, a homogeneous magnetic field induces an electronic anapole A(α), having the same magnitude, but opposite sign, for two enantiomorphs. PMID:25194370

  4. An Improved Canine Genome and a Comprehensive Catalogue of Coding Genes and Non-Coding Transcripts

    PubMed Central

    Hoeppner, Marc P.; Lundquist, Andrew; Pirun, Mono; Meadows, Jennifer R. S.; Zamani, Neda; Johnson, Jeremy; Sundström, Görel; Cook, April; FitzGerald, Michael G.; Swofford, Ross; Mauceli, Evan; Moghadam, Behrooz Torabi; Greka, Anna; Alföldi, Jessica; Abouelleil, Amr; Aftuck, Lynne; Bessette, Daniel; Berlin, Aaron; Brown, Adam; Gearin, Gary; Lui, Annie; Macdonald, J. Pendexter; Priest, Margaret; Shea, Terrance; Turner-Maier, Jason; Zimmer, Andrew; Lander, Eric S.; di Palma, Federica

    2014-01-01

    The domestic dog, Canis familiaris, is a well-established model system for mapping trait and disease loci. While the original draft sequence was of good quality, gaps were abundant particularly in promoter regions of the genome, negatively impacting the annotation and study of candidate genes. Here, we present an improved genome build, canFam3.1, which includes 85 MB of novel sequence and now covers 99.8% of the euchromatic portion of the genome. We also present multiple RNA-Sequencing data sets from 10 different canine tissues to catalog ∼175,000 expressed loci. While about 90% of the coding genes previously annotated by EnsEMBL have measurable expression in at least one sample, the number of transcript isoforms detected by our data expands the EnsEMBL annotations by a factor of four. Syntenic comparison with the human genome revealed an additional ∼3,000 loci that are characterized as protein coding in human and were also expressed in the dog, suggesting that those were previously not annotated in the EnsEMBL canine gene set. In addition to ∼20,700 high-confidence protein coding loci, we found ∼4,600 antisense transcripts overlapping exons of protein coding genes, ∼7,200 intergenic multi-exon transcripts without coding potential, likely candidates for long intergenic non-coding RNAs (lincRNAs) and ∼11,000 transcripts were reported by two different library construction methods but did not fit any of the above categories. Of the lincRNAs, about 6,000 have no annotated orthologs in human or mouse. Functional analysis of two novel transcripts with shRNA in a mouse kidney cell line altered cell morphology and motility. All in all, we provide a much-improved annotation of the canine genome and suggest regulatory functions for several of the novel non-coding transcripts. PMID:24625832

  5. The Swiss-Prot protein knowledgebase and ExPASy: providing the plant community with high quality proteomic data and tools.

    PubMed

    Schneider, Michel; Tognolli, Michael; Bairoch, Amos

    2004-12-01

    The Swiss-Prot protein knowledgebase provides manually annotated entries for all species, but concentrates on the annotation of entries from model organisms to ensure the presence of high quality annotation of representative members of all protein families. A specific Plant Protein Annotation Program (PPAP) was started to cope with the increasing amount of data produced by the complete sequencing of plant genomes. Its main goal is the annotation of proteins from the model plant organism Arabidopsis thaliana. In addition to bibliographic references, experimental results, computed features and sometimes even contradictory conclusions, direct links to specialized databases connect amino acid sequences with the current knowledge in plant sciences. As protein families and groups of plant-specific proteins are regularly reviewed to keep up with current scientific findings, we hope that the wealth of information of Arabidopsis origin accumulated in our knowledgebase, and the numerous software tools provided on the Expert Protein Analysis System (ExPASy) web site might help to identify and reveal the function of proteins originating from other plants. Recently, a single, centralized, authoritative resource for protein sequences and functional information, UniProt, was created by joining the information contained in Swiss-Prot, Translation of the EMBL nucleotide sequence (TrEMBL), and the Protein Information Resource-Protein Sequence Database (PIR-PSD). A rising problem is that an increasing number of nucleotide sequences are not being submitted to the public databases, and thus the proteins inferred from such sequences will have difficulties finding their way to the Swiss-Prot or TrEMBL databases. PMID:15707838

  6. Trans-splicing of pre-mRNA is predicted to occur in a wide range of organisms including vertebrates.

    PubMed Central

    Dandekar, T; Sibbald, P R

    1990-01-01

    Several known trans-splicing RNA structures were used to define a canonical trans-splicing structure which was then used to perform a computer search of the EMBL nucleotide database. In addition to most known trans-splicing structures, many putative new trans-splicing sites were detected. These were found in a broad range of organisms including the vertebrates. Control experiments indicate that the search predicts known false positives at a rate of only 20%. Trans-splicing may therefore be a very wide-spread phenomenon. PMID:2395638

  7. Draft genome sequence of Acidithiobacillus ferrooxidans YQH-1

    PubMed Central

    Yan, Lei; Zhang, Shuang; Wang, Weidong; Hu, Huixin; Wang, Yanjie; Yu, Gaobo; Chen, Peng

    2015-01-01

    Acidithiobacillus ferrooxidans YQH-1 is a moderate acidophilic bacterium isolated from a river in a volcano of Northeast China. Here, we describe the draft genome of strain YQH-1, which was assembled into 123 contigs containing 3,111,222 bp with a G + C content of 58.63%. A large number of genes related to carbon dioxide fixation, dinitrogen fixation, pH tolerance, heavy metal detoxification, and oxidative stress defense were detected. The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession no. LJBT00000000. PMID:26697394

  8. Maize MeJA-responsive proteins identified by high-resolution 2-DE PAGE.

    PubMed

    Zhang, Yuliang; Pennerman, Kayla K; Yang, Fengshan; Yin, Guohua

    2015-12-01

    Exogenous methyl jasmonate (MeJA) is well-known to induce plant defense mechanisms effective against a wide variety of insect and microbial pests. High-resolution 2-DE gel electrophoresis was used to discover changes in the leaf proteome of maize exposed to MeJA. We sequenced 62 MeJA-responsive proteins by tandem mass spectroscopy, and deposited the mass spectra and identities in the EMBL-EBI PRIDE repository under reference number PXD001793. An analysis and discussion of the identified proteins in relation to maize defense against Asian corn borer is published by Zhang et al. (2015) [1]. PMID:26509185

  9. Draft genome sequence of a multidrug-resistant Chryseobacterium indologenes isolate from Malaysia

    PubMed Central

    Yu, Choo Yee; Ang, Geik Yong; Cheng, Huey Jia; Cheong, Yuet Meng; Yin, Wai-Fong; Chan, Kok-Gan

    2015-01-01

    Chryseobacterium indologenes is an emerging pathogen which poses a threat in clinical healthcare settings due to its multidrug-resistant phenotype and its common association with nosocomial infections. Here, we report the draft genome of a multidrug-resistant C. indologenes CI_885 isolated in 2014 from Malaysia. The 908,704-kb genome harbors a repertoire of putative antibiotic resistance determinants which may elucidate the molecular basis and underlying mechanisms of its resistance to various classes of antibiotics. The genome sequence has been deposited in DDBJ/EMBL/GenBank under the accession number LJOD00000000. PMID:26981402

  10. Draft genome sequence of a multidrug-resistant Chryseobacterium indologenes isolate from Malaysia.

    PubMed

    Yu, Choo Yee; Ang, Geik Yong; Cheng, Huey Jia; Cheong, Yuet Meng; Yin, Wai-Fong; Chan, Kok-Gan

    2016-03-01

    Chryseobacterium indologenes is an emerging pathogen which poses a threat in clinical healthcare settings due to its multidrug-resistant phenotype and its common association with nosocomial infections. Here, we report the draft genome of a multidrug-resistant C. indologenes CI_885 isolated in 2014 from Malaysia. The 908,704-kb genome harbors a repertoire of putative antibiotic resistance determinants which may elucidate the molecular basis and underlying mechanisms of its resistance to various classes of antibiotics. The genome sequence has been deposited in DDBJ/EMBL/GenBank under the accession number LJOD00000000. PMID:26981402

  11. Whole genome sequences and annotation of Micrococcus luteus SUBG006, a novel phytopathogen of mango

    PubMed Central

    Rakhashiya, Purvi M.; Patel, Pooja P.; Thaker, Vrinda S.

    2015-01-01

    The actinobacterium Micrococcus luteus SUBG006 was isolated from infected leaves of Mangifera indica L. vr. Nylon in Rajkot (22.30°N, 70.78°E), Gujarat, India. The genome size is 3.86 Mb with G + C content of 69.80% and contains 112 rRNA sequences (5S, 16S and 23S). The whole genome sequence has been deposited in DDBJ/EMBL/GenBank under the accession number JOKP00000000. PMID:26697318

  12. Whole genome sequences and annotation of Micrococcus luteus SUBG006, a novel phytopathogen of mango.

    PubMed

    Rakhashiya, Purvi M; Patel, Pooja P; Thaker, Vrinda S

    2015-12-01

    The actinobacterium Micrococcus luteus SUBG006 was isolated from infected leaves of Mangifera indica L. vr. Nylon in Rajkot (22.30°N, 70.78°E), Gujarat, India. The genome size is 3.86 Mb with G + C content of 69.80% and contains 112 rRNA sequences (5S, 16S and 23S). The whole genome sequence has been deposited in DDBJ/EMBL/GenBank under the accession number JOKP00000000. PMID:26697318

  13. Genome sequencing and annotation of multidrug resistant Mycobacterium tuberculosis (MDR-TB) PR10 strain

    PubMed Central

    Halim, Mohd Zakihalani A.; Jaafar, Mohammad Maaruf; Teh, Lay Kek; Ismail, Mohamad Izwan; Lee, Lian Shien; Ngeow, Yun Fong; Nor, Norazmi Mohd; Zainuddin, Zainul Fadziruddin; Tang, Thean Hock; Najimudin, Mohd Nazalan Mohd; Salleh, Mohd Zaki

    2016-01-01

    Here, we report the draft genome sequence and annotation of a multidrug resistant Mycobacterium tuberculosis strain PR10 (MDR-TB PR10) isolated from a patient diagnosed with tuberculosis. The size of the draft genome MDR-TB PR10 is 4.34 Mbp with 65.6% of G + C content and consists of 4637 predicted genes. The determinants were categorized by RAST into 400 subsystems with 4286 coding sequences and 50 RNAs. The whole genome shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession number CP010968. PMID:26981419

  14. Crystal growth and preliminary X-ray study of glutamic acid specific serine protease from Bacillus intermedius

    NASA Astrophysics Data System (ADS)

    Kuranova, I. P.; Blagova, E. V.; Levdikov, V. M.; Rudenskaya, G. N.; Balaban, N. P.; Shakirov, E. V.

    1999-01-01

    The glutamic acid specific protease (glutamyl-endopeptidase) from Bacillus intermedius, strain 3-19, was isolated and purified using ion exchange chromatography on CM-cellulose and a Mono-S FPLC column. The conditions for crystallization of the enzyme have been discussed. The crystals of the enzyme were grown using the hanging-drop vapor-diffusion technique. Crystals belong to the space group C2 with unit cell parameters of a=61.62 Å, b=55.84 Å, c=60.40 Å, β=117.6°. X-ray diffraction data to 1.68 Å resolution were collected using synchrotron radiation (EMBL, Hamburg) and an imaging plate scanner.

  15. Complete genome sequence of the gram-negative probiotic Escherichia coli strain Nissle 1917.

    PubMed

    Reister, Marten; Hoffmeier, Klaus; Krezdorn, Nicolas; Rotter, Bjoern; Liang, Chunguang; Rund, Stefan; Dandekar, Thomas; Sonnenborn, Ulrich; Oelschlaeger, Tobias A

    2014-10-10

    Escherichia coli strain Nissle 1917 (EcN) is the active principle of a probiotic preparation (trade name Mutaflor®) used for the treatment of patients with intestinal diseases such as ulcerative colitis and diarrhea. It has GRAS (generally recognized as safe) status and has been shown to be a therapeutically effective drug (Sonnenborn and Schulze, 2009). The complete genomic DNA sequence will help in identifying genes and their products which are essential for the strain's probiotic nature. Genbank/EMBL/DDBJ accession number: CP007799 (chromosome). PMID:25093936

  16. Draft genome sequence of Acidithiobacillus ferrooxidans YQH-1.

    PubMed

    Yan, Lei; Zhang, Shuang; Wang, Weidong; Hu, Huixin; Wang, Yanjie; Yu, Gaobo; Chen, Peng

    2015-12-01

    Acidithiobacillus ferrooxidans YQH-1 is a moderate acidophilic bacterium isolated from a river in a volcano of Northeast China. Here, we describe the draft genome of strain YQH-1, which was assembled into 123 contigs containing 3,111,222 bp with a G + C content of 58.63%. A large number of genes related to carbon dioxide fixation, dinitrogen fixation, pH tolerance, heavy metal detoxification, and oxidative stress defense were detected. The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession no. LJBT00000000. PMID:26697394

  17. Genome sequencing and annotation of multidrug resistant Mycobacterium tuberculosis (MDR-TB) PR10 strain.

    PubMed

    Halim, Mohd Zakihalani A; Jaafar, Mohammad Maaruf; Teh, Lay Kek; Ismail, Mohamad Izwan; Lee, Lian Shien; Ngeow, Yun Fong; Nor, Norazmi Mohd; Zainuddin, Zainul Fadziruddin; Tang, Thean Hock; Najimudin, Mohd Nazalan Mohd; Salleh, Mohd Zaki

    2016-03-01

    Here, we report the draft genome sequence and annotation of a multidrug resistant Mycobacterium tuberculosis strain PR10 (MDR-TB PR10) isolated from a patient diagnosed with tuberculosis. The size of the draft genome MDR-TB PR10 is 4.34 Mbp with 65.6% of G + C content and consists of 4637 predicted genes. The determinants were categorized by RAST into 400 subsystems with 4286 coding sequences and 50 RNAs. The whole genome shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession number CP010968. PMID:26981419

  18. Database on the structure of small ribosomal subunit RNA.

    PubMed Central

    Van de Peer, Y; Van den Broeck, I; De Rijk, P; De Wachter, R

    1994-01-01

    The database on small ribosomal subunit RNA structure contained 2824 nucleotide sequences as of June 1994. All these sequences are stored in the form of an alignment based on the adopted secondary structure model, which in turn is corroborated by the observation of compensating substitutions in the alignment. The complete database is made available to the scientific community through anonymous ftp on our server in Antwerp. A special effort was made to improve electronic retrieval, and a program is supplied that can create different file formats. The database can also be obtained from the EMBL nucleotide sequence library. PMID:7524022

  19. Draft genome sequence of the docosahexaenoic acid producing thraustochytrid Aurantiochytrium sp. T66.

    PubMed

    Liu, Bin; Ertesvåg, Helga; Aasen, Inga Marie; Vadstein, Olav; Brautaset, Trygve; Heggeset, Tonje Marita Bjerkan

    2016-06-01

    Thraustochytrids are unicellular, marine protists, and there is a growing industrial interest in these organisms, particularly because some species, including strains belonging to the genus Aurantiochytrium, accumulate high levels of docosahexaenoic acid (DHA). Here, we report the draft genome sequence of Aurantiochytrium sp. T66 (ATCC PRA-276), with a size of 43 Mbp, and 11,683 predicted protein-coding sequences. The data has been deposited at DDBJ/EMBL/Genbank under the accession LNGJ00000000. The genome sequence will contribute new insight into DHA biosynthesis and regulation, providing a basis for metabolic engineering of thraustochytrids. PMID:27222814

  20. Compilation of 5S rRNA and 5S rRNA gene sequences

    PubMed Central

    Specht, Thomas; Wolters, Jörn; Erdmann, Volker A.

    1990-01-01

    The BERLIN RNA DATABANK as of December 31, 1989, contains a total of 667 sequences of 5S rRNAs or their genes, which is an increase of 114 new sequence entries over the last compilation (1). It covers sequences from 44 archaebacteria, 267 eubacteria, 20 plastids, 6 mitochondria, 319 eukaryotes and 11 eukaryotic pseudogenes. The hardcopy shows only the list (Table 1) of those organisms whose sequences have been determined. The BERLIN RNA DATABANK uses the format of the EMBL Nucleotide Sequence Data Library complemented by a Sequence Alignment (SA) field including secondary structure information. PMID:1692116

  1. Comparability of mixed IC₅₀ data - a statistical analysis.

    PubMed

    Kalliokoski, Tuomo; Kramer, Christian; Vulpetti, Anna; Gedeck, Peter

    2013-01-01

    The biochemical half maximal inhibitory concentration (IC50) is the most commonly used metric for on-target activity in lead optimization. It is used to guide lead optimization, build large-scale chemogenomics analysis, off-target activity and toxicity models based on public data. However, the use of public biochemical IC50 data is problematic, because they are assay specific and comparable only under certain conditions. For large scale analysis it is not feasible to check each data entry manually and it is very tempting to mix all available IC50 values from public databases even if assay information is not reported. As previously reported for Ki database analysis, we first analyzed the types of errors, the redundancy and the variability that can be found in the ChEMBL IC50 database. For assessing the variability of IC50 data independently measured in two different labs, at least ten IC50 data for identical protein-ligand systems against the same target were searched in ChEMBL. As an insufficient number of cases of this type is available, the variability of IC50 data was assessed by comparing all pairs of independent IC50 measurements on identical protein-ligand systems. The standard deviation of IC50 data is only 25% larger than the standard deviation of Ki data, suggesting that mixing IC50 data from different assays, even without knowing the details of the assay conditions, only adds a moderate amount of noise to the overall data. The standard deviation of public ChEMBL IC50 data was, as expected, greater than the standard deviation of in-house intra-laboratory/inter-day IC50 data. Augmenting mixed public IC50 data by public Ki data does not deteriorate the quality of the mixed IC50 data, if the Ki is corrected by an offset. For a broad dataset such as the ChEMBL database a Ki-IC50 conversion factor of 2 was found to be the most reasonable. PMID:23613770
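
    As a rough illustration of the Ki correction mentioned in the abstract above, the sketch below applies a factor of 2 to a Ki value before pooling it with IC50 data. The abstract does not state the direction of the conversion; the code assumes IC50 ≈ 2 × Ki, which is what the Cheng-Prusoff relation gives for a competitive inhibitor when the substrate concentration equals Km. The function names and the example value are hypothetical.

```python
import math

# Assumption: the factor of 2 is applied as IC50 ~= 2 * Ki.
KI_TO_IC50_FACTOR = 2.0


def ki_to_ic50(ki_nm, factor=KI_TO_IC50_FACTOR):
    """Convert a Ki (in nM) to an IC50-like value (in nM)."""
    return ki_nm * factor


def p_value(conc_nm):
    """Negative log10 of a concentration given in nM (converted to molar)."""
    return -math.log10(conc_nm * 1e-9)


ki_nm = 50.0                      # hypothetical measurement
ic50_nm = ki_to_ic50(ki_nm)       # 100.0 nM under the assumption above
# On the logarithmic scale the factor becomes a constant offset of log10(2).
offset = p_value(ki_nm) - p_value(ic50_nm)
print(f"IC50 ~ {ic50_nm} nM, pKi - pIC50 = {offset:.3f}")
```

    On the log scale commonly used for activity data, the factor of 2 corresponds to a constant shift of about 0.3 units.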

  2. Non-redundant patent sequence databases with value-added annotations at two levels.

    PubMed

    Li, Weizhong; McWilliam, Hamish; de la Torre, Ana Richart; Grodowski, Adam; Benediktovich, Irina; Goujon, Mickael; Nauche, Stephane; Lopez, Rodrigo

    2010-01-01

    The European Bioinformatics Institute (EMBL-EBI) provides public access to patent data, including abstracts, chemical compounds and sequences. Sequences can appear multiple times due to the filing of the same invention with multiple patent offices, or the use of the same sequence by different inventors in different contexts. Information relating to the source invention may be incomplete, and biological information available in patent documents elsewhere may not be reflected in the annotation of the sequence. Search and analysis of these data have become increasingly challenging for both the scientific and intellectual-property communities. Here, we report a collection of non-redundant patent sequence databases, which cover the EMBL-Bank nucleotides patent class and the patent protein databases and contain value-added annotations from patent documents. The databases were created at two levels by the use of sequence MD5 checksums. Sequences within a level-1 cluster are 100% identical over their whole length. Level-2 clusters were defined by sub-grouping level-1 clusters based on patent family information. Value-added annotations, such as publication number corrections, earliest publication dates and feature collations, significantly enhance the quality of the data, allowing for better tracking and cross-referencing. The databases are available at: http://www.ebi.ac.uk/patentdata/nr/. PMID:19884134
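
    The two-level clustering described in the abstract above can be sketched as follows. This is a minimal illustration, not the EMBL-EBI pipeline: the entry identifiers, the patent-family mapping and the normalisation step (upper-casing and stripping whitespace before hashing) are all assumptions made for the example.

```python
import hashlib
from collections import defaultdict


def level1_clusters(entries):
    """Group (entry_id, sequence) pairs by the MD5 checksum of the sequence.

    Entries with the same checksum have identical sequences over their
    whole length.
    """
    clusters = defaultdict(list)
    for entry_id, sequence in entries:
        # Assumed normalisation: ignore case and whitespace.
        normalised = "".join(sequence.split()).upper()
        checksum = hashlib.md5(normalised.encode("ascii")).hexdigest()
        clusters[checksum].append(entry_id)
    return dict(clusters)


def level2_clusters(level1, family_of):
    """Sub-group each level-1 cluster by patent family."""
    result = {}
    for checksum, entry_ids in level1.items():
        by_family = defaultdict(list)
        for entry_id in entry_ids:
            by_family[family_of[entry_id]].append(entry_id)
        result[checksum] = dict(by_family)
    return result


# Hypothetical entries and patent families, for illustration only.
entries = [("EP1", "ACGT"), ("US1", "acgt"), ("WO1", "ACGA"), ("JP1", "AC GT")]
family_of = {"EP1": "F1", "US1": "F1", "JP1": "F2", "WO1": "F3"}

l1 = level1_clusters(entries)
l2 = level2_clusters(l1, family_of)
for checksum, families in l2.items():
    print(checksum, families)
```

    Hashing makes the identity check a dictionary lookup instead of a pairwise sequence comparison, which is presumably why checksums suit a collection of this size.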

  3. Web services at the European Bioinformatics Institute-2009

    PubMed Central

    Mcwilliam, Hamish; Valentin, Franck; Goujon, Mickael; Li, Weizhong; Narayanasamy, Menaka; Martin, Jenny; Miyar, Teresa; Lopez, Rodrigo

    2009-01-01

    The European Bioinformatics Institute (EMBL-EBI) has been providing access to mainstream databases and tools in bioinformatics since 1997. In addition to the traditional web form based interfaces, APIs exist for core data resources such as EMBL-Bank, Ensembl, UniProt, InterPro, PDB and ArrayExpress. These APIs are based on Web Services (SOAP/REST) interfaces that allow users to systematically access databases and analytical tools. From the user's point of view, these Web Services provide the same functionality as the browser-based forms. However, using the APIs frees the user from web page constraints and is ideal for the analysis of large batches of data, performing text-mining tasks and the casual or systematic evaluation of mathematical models in regulatory networks. Furthermore, these services are widespread and easy to use, requiring no prior knowledge of the technology and no more than basic experience in programming. In the following we report on new and updated services and briefly describe planned developments to be made available during the course of 2009–2010. PMID:19435877

  4. FourCSeq: analysis of 4C sequencing data

    PubMed Central

    Klein, Felix A.; Pakozdi, Tibor; Anders, Simon; Ghavi-Helm, Yad; Furlong, Eileen E. M.; Huber, Wolfgang

    2015-01-01

    Motivation: Circularized Chromosome Conformation Capture (4C) is a powerful technique for studying the spatial interactions of a specific genomic region called the ‘viewpoint’ with the rest of the genome, either in a single condition or when comparing different experimental conditions or cell types. Observed ligation frequencies typically show a strong, regular dependence on genomic distance from the viewpoint, on top of which specific interaction peaks are superimposed. Here, we address the computational task of finding these specific peaks and detecting changes between different biological conditions. Results: We model the overall trend of decreasing interaction frequency with genomic distance by fitting a smooth monotonically decreasing function to suitably transformed count data. Based on the fit, z-scores are calculated from the residuals, and high z-scores are interpreted as peaks providing evidence for specific interactions. To compare different conditions, we normalize fragment counts between samples, and call differential contact frequencies using the statistical method DESeq2 adapted from RNA-Seq analysis. Availability and implementation: A full end-to-end analysis pipeline is implemented in the R package FourCSeq available at www.bioconductor.org. Contact: felix.klein@embl.de or whuber@embl.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26034064
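
    The peak-calling idea in the abstract above (fit a monotonically decreasing trend, then score the residuals) can be sketched in a few lines. This is not the FourCSeq implementation: the sketch swaps in a non-smooth isotonic fit (pool-adjacent-violators) for the smooth monotone function, uses log2(count + 1) as the transformation, and scales residuals by their standard deviation. All of these choices, and the example counts, are assumptions.

```python
import math
import statistics


def monotone_decreasing_fit(values):
    """Least-squares non-increasing fit via pool-adjacent-violators."""
    blocks = []  # each block holds [mean, size]
    for value in values:
        blocks.append([value, 1])
        # Merge neighbouring blocks while their means increase.
        while len(blocks) > 1 and blocks[-2][0] < blocks[-1][0]:
            mean2, size2 = blocks.pop()
            mean1, size1 = blocks.pop()
            total = size1 + size2
            blocks.append([(mean1 * size1 + mean2 * size2) / total, total])
    fit = []
    for mean, size in blocks:
        fit.extend([mean] * size)
    return fit


def z_scores(counts):
    """Z-scores of residuals from the monotone trend of transformed counts."""
    transformed = [math.log2(c + 1) for c in counts]
    fit = monotone_decreasing_fit(transformed)
    residuals = [t - f for t, f in zip(transformed, fit)]
    sd = statistics.pstdev(residuals)
    return [r / sd if sd > 0 else 0.0 for r in residuals]


# Hypothetical fragment counts, ordered by distance from the viewpoint.
counts = [1000, 500, 300, 900, 200, 120, 80, 50]
z = z_scores(counts)
peak = max(range(len(z)), key=z.__getitem__)
print(f"highest z-score at fragment index {peak}")
```

    In this toy input the fourth fragment stands out against the decaying background, so it receives the highest z-score.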

  5. The identification of short linear motif-mediated interfaces within the human interactome

    PubMed Central

    Weatheritt, R. J.; Luck, K.; Petsalaki, E.; Davey, N. E.; Gibson, T. J.

    2012-01-01

    Motivation: Eukaryotic proteins are highly modular, containing multiple interaction interfaces that mediate binding to a network of regulators and effectors. Recent advances in high-throughput proteomics have rapidly expanded the number of known protein–protein interactions (PPIs); however, the molecular basis for the majority of these interactions remains to be elucidated. There has been a growing appreciation of the importance of a subset of these PPIs, namely those mediated by short linear motifs (SLiMs), particularly the canonical and ubiquitous SH2, SH3 and PDZ domain-binding motifs. However, these motif classes represent only a small fraction of known SLiMs and outside these examples little effort has been made, either bioinformatically or experimentally, to discover the full complement of motif instances. Results: In this article, interaction data are analysed to identify and characterize an important subset of PPIs, those involving SLiMs binding to globular domains. To do this, we introduce iELM, a method to identify interactions mediated by SLiMs and add molecular details of the interaction interfaces to both interacting proteins. The method identifies SLiM-mediated interfaces from PPI data by searching for known SLiM–domain pairs. This approach was applied to the human interactome to identify a set of high-confidence putative SLiM-mediated PPIs. Availability: iELM is freely available at http://elmint.embl.de Contact: toby.gibson@embl.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22328783

  6. CART—a chemical annotation retrieval toolkit

    PubMed Central

    Deghou, Samy; Zeller, Georg; Iskar, Murat; Driessen, Marja; Castillo, Mercedes; van Noort, Vera; Bork, Peer

    2016-01-01

    Motivation: Data on bioactivities of drug-like chemicals are rapidly accumulating in public repositories, creating new opportunities for research in computational systems pharmacology. However, integrative analysis of these data sets is difficult due to prevailing ambiguity between chemical names and identifiers and a lack of cross-references between databases. Results: To address this challenge, we have developed CART, a Chemical Annotation Retrieval Toolkit. As a key functionality, it matches an input list of chemical names into a comprehensive reference space to assign unambiguous chemical identifiers. In this unified space, bioactivity annotations can be easily retrieved from databases covering a wide variety of chemical effects on biological systems. Subsequently, CART can determine annotations enriched in the input set of chemicals and display these in tabular format and interactive network visualizations, thereby facilitating integrative analysis of chemical bioactivity data. Availability and Implementation: CART is available as a Galaxy web service (cart.embl.de). Source code and an easy-to-install command line tool can also be obtained from the web site. Contact: bork@embl.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27256313

  7. Plant Protein Annotation in the UniProt Knowledgebase

    PubMed Central

    Schneider, Michel; Bairoch, Amos; Wu, Cathy H.; Apweiler, Rolf

    2005-01-01

    The Swiss-Prot, TrEMBL, Protein Information Resource (PIR), and DNA Data Bank of Japan (DDBJ) protein database activities have united to form the Universal Protein Resource (UniProt) Consortium. UniProt presents three database layers: the UniProt Archive, the UniProt Knowledgebase (UniProtKB), and the UniProt Reference Clusters. The UniProtKB consists of two sections: UniProtKB/Swiss-Prot (fully manually curated entries) and UniProtKB/TrEMBL (automated annotation, classification and extensive cross-references). New releases are published fortnightly. A specific Plant Proteome Annotation Program (http://www.expasy.org/sprot/ppap/) was initiated to cope with the increasing amount of data produced by the complete sequencing of plant genomes. Through UniProt, our aim is to provide the scientific community with a single, centralized, authoritative resource for protein sequences and functional information that will allow the plant community to fully explore and utilize the wealth of information available for both plant and nonplant model organisms. PMID:15888679

  8. Compilation of DNA sequences of Escherichia coli K12: description of the interactive databases ECD and ECDC.

    PubMed Central

    Kröger, M; Wahl, R

    1998-01-01

    We have compiled the DNA sequence data for Escherichia coli K12 available from the GenBank and EMBL data libraries and independently from the literature. We provide the most definitive version of the ECD Escherichia coli database now exclusively via the World Wide Web System (http://susi.bio.uni-giessen.de/ecdc.html). Our database includes the completed genome sequence recently published by two competing groups and an assembled set of all older sequences. The organisation of the database allows precise physical location of each individual gene or regulatory region, even taking into consideration discrepancies in nomenclature. The WWW program allows the user to branch into the original EMBL and SWISS-PROT datafiles. A number of links to other WWW servers dealing with E. coli are provided. A FASTA and BLAST search may be performed online. Besides the WWW format, a flat file version may be obtained via ftp. A number of discrepancies between the two systematic sequence determinations and/or the literature have not yet been resolved. However, our database may serve as a reference source for resolution and/or the assignment of strain difference. PMID:9399797

  9. The annotation-enriched non-redundant patent sequence databases.

    PubMed

    Li, Weizhong; Kondratowicz, Bartosz; McWilliam, Hamish; Nauche, Stephane; Lopez, Rodrigo

    2013-01-01

    The EMBL-European Bioinformatics Institute (EMBL-EBI) offers public access to patent sequence data, providing a valuable service to the intellectual property and scientific communities. The non-redundant (NR) patent sequence databases comprise two-level nucleotide and protein sequence clusters (NRNL1, NRNL2, NRPL1 and NRPL2) based on sequence identity (level-1) and patent family (level-2). Annotation from the source entries in these databases is merged and enhanced with additional information from the patent literature and biological context. Corrections in patent publication numbers, kind-codes and patent equivalents significantly improve the data quality. Data are available through various user interfaces including web browser, downloads via FTP, SRS, Dbfetch and EBI-Search. Sequence similarity/homology searches against the databases are available using BLAST, FASTA and PSI-Search. In this article, we describe the data collection and annotation and also outline major changes and improvements introduced since 2009. Apart from data growth, these changes include additional annotation for singleton clusters, the identifier versioning for tracking entry change and the entry mappings between the two-level databases. Database URL: http://www.ebi.ac.uk/patentdata/nr/ PMID:23396323

  10. The HUPO PSI's molecular interaction format--a community standard for the representation of protein interaction data.

    PubMed

    Hermjakob, Henning; Montecchi-Palazzi, Luisa; Bader, Gary; Wojcik, Jérôme; Salwinski, Lukasz; Ceol, Arnaud; Moore, Susan; Orchard, Sandra; Sarkans, Ugis; von Mering, Christian; Roechert, Bernd; Poux, Sylvain; Jung, Eva; Mersch, Henning; Kersey, Paul; Lappe, Michael; Li, Yixue; Zeng, Rong; Rana, Debashis; Nikolski, Macha; Husi, Holger; Brun, Christine; Shanker, K; Grant, Seth G N; Sander, Chris; Bork, Peer; Zhu, Weimin; Pandey, Akhilesh; Brazma, Alvis; Jacq, Bernard; Vidal, Marc; Sherman, David; Legrain, Pierre; Cesareni, Gianni; Xenarios, Ioannis; Eisenberg, David; Steipe, Boris; Hogue, Chris; Apweiler, Rolf

    2004-02-01

    A major goal of proteomics is the complete description of the protein interaction network underlying cell physiology. A large number of small scale and, more recently, large-scale experiments have contributed to expanding our understanding of the nature of the interaction network. However, the necessary data integration across experiments is currently hampered by the fragmentation of publicly available protein interaction data, which exists in different formats in databases, on authors' websites or sometimes only in print publications. Here, we propose a community standard data model for the representation and exchange of protein interaction data. This data model has been jointly developed by members of the Proteomics Standards Initiative (PSI), a work group of the Human Proteome Organization (HUPO), and is supported by major protein interaction data providers, in particular the Biomolecular Interaction Network Database (BIND), Cellzome (Heidelberg, Germany), the Database of Interacting Proteins (DIP), Dana Farber Cancer Institute (Boston, MA, USA), the Human Protein Reference Database (HPRD), Hybrigenics (Paris, France), the European Bioinformatics Institute's (EMBL-EBI, Hinxton, UK) IntAct, the Molecular Interactions (MINT, Rome, Italy) database, the Protein-Protein Interaction Database (PPID, Edinburgh, UK) and the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING, EMBL, Heidelberg, Germany). PMID:14755292

  11. Compilation of DNA sequences of Escherichia coli (update 1992)

    PubMed Central

    Kröger, Manfred; Wahl, Ralf; Schachtel, Gabriel; Rice, Peter

    1992-01-01

    We have compiled the DNA sequence data for E. coli available from the GENBANK and EMBL data libraries and, over a period of several years, independently from the literature. This is the fourth listing, replacing and substantially extending the former listings. However, in order to save space, this printed version contains DNA sequence information only if it is publicly available in electronic form. The complete compilation, including a full set of genetic map data and the E. coli protein index, can be obtained in machine readable form from the EMBL data library (ECD release 10) or directly from the CD-ROM version of this supplement issue. After deletion of all detected overlaps, a total of 1 820 237 individual bp had been determined by the beginning of 1992. This corresponds to a total of 38.56% of the entire E. coli chromosome consisting of about 4,720 kbp. This number may actually be higher by some extra 2.5% derived from lysogenic bacteriophage lambda and various DNA sequences already received for other strains of E. coli. PMID:1598239
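
    The coverage figure quoted in the abstract above follows directly from the two numbers it gives. A short check, using only those values (the variable names are illustrative):

```python
# Values taken from the abstract: bp determined and chromosome size in kbp.
determined_bp = 1_820_237
chromosome_bp = 4_720 * 1000

coverage_percent = 100 * determined_bp / chromosome_bp
print(f"{coverage_percent:.2f}% of the chromosome")
```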

  12. The UniProtKB/Swiss-Prot knowledgebase and its Plant Proteome Annotation Program

    PubMed Central

    Schneider, Michel; Lane, Lydie; Boutet, Emmanuel; Lieberherr, Damien; Tognolli, Michael; Bougueleret, Lydie; Bairoch, Amos

    2009-01-01

    The UniProt knowledgebase, UniProtKB, is the main product of the UniProt consortium. It consists of two sections, UniProtKB/Swiss-Prot, the manually curated section, and UniProtKB/TrEMBL, the computer translation of the EMBL/GenBank/DDBJ nucleotide sequence database. Taken together, these two sections cover all the proteins characterized or inferred from all publicly available nucleotide sequences. The Plant Proteome Annotation Program (PPAP) of UniProtKB/Swiss-Prot focuses on the manual annotation of plant-specific proteins and protein families. Our major effort is currently directed towards the two model plants Arabidopsis thaliana and Oryza sativa. In UniProtKB/Swiss-Prot, redundancy is minimized by merging all data from different sources in a single entry. The proposed protein sequence is frequently modified after comparison with ESTs, full length transcripts or homologous proteins from other species. The information present in manually curated entries allows the reconstruction of all described isoforms. The annotation also includes proteomics data such as PTM and protein identification MS experimental results. UniProtKB and the other products of the UniProt consortium are accessible online at www.uniprot.org. PMID:19084081

  13. Plant protein annotation in the UniProt Knowledgebase.

    PubMed

    Schneider, Michel; Bairoch, Amos; Wu, Cathy H; Apweiler, Rolf

    2005-05-01

    The Swiss-Prot, TrEMBL, Protein Information Resource (PIR), and DNA Data Bank of Japan (DDBJ) protein database activities have united to form the Universal Protein Resource (UniProt) Consortium. UniProt presents three database layers: the UniProt Archive, the UniProt Knowledgebase (UniProtKB), and the UniProt Reference Clusters. The UniProtKB consists of two sections: UniProtKB/Swiss-Prot (fully manually curated entries) and UniProtKB/TrEMBL (automated annotation, classification and extensive cross-references). New releases are published fortnightly. A specific Plant Proteome Annotation Program (http://www.expasy.org/sprot/ppap/) was initiated to cope with the increasing amount of data produced by the complete sequencing of plant genomes. Through UniProt, our aim is to provide the scientific community with a single, centralized, authoritative resource for protein sequences and functional information that will allow the plant community to fully explore and utilize the wealth of information available for both plant and non-plant model organisms. PMID:15888679

  14. LISTA, LISTA-HOP and LISTA-HON: a comprehensive compilation of protein encoding sequences and its associated homology databases from the yeast Saccharomyces.

    PubMed Central

    Dölz, R; Mossé, M O; Slonimski, P P; Bairoch, A; Linder, P

    1994-01-01

    We continued our effort to make a comprehensive database (LISTA) for the yeast Saccharomyces cerevisiae. In this database each sequence has been attributed a single genetic name. In the case of duplicated sequences a simple method has been applied to distinguish between sequences of one and the same gene from non-allelic sequences of duplicated genes. If necessary, synonyms are given in the case of allelic duplicated sequences. Thus sequences can be found either by the name or by synonyms given in LISTA. Each entry contains the genetic name, the mnemonic from the EMBL data bank, the codon bias, the reference for the publication of the sequence, the chromosomal location as far as known, and the Swissprot and EMBL accession numbers. To obtain more information on the included sequences, each entry has been screened against non-redundant nucleotide and protein data bank collections resulting in LISTA-HON and LISTA-HOP. The LISTA data base can be linked to the associated data sets or to nucleotide and protein banks by the Sequence Retrieval System (SRS). PMID:7937046

  15. UniProt: the Universal Protein knowledgebase

    PubMed Central

    Apweiler, Rolf; Bairoch, Amos; Wu, Cathy H.; Barker, Winona C.; Boeckmann, Brigitte; Ferro, Serenella; Gasteiger, Elisabeth; Huang, Hongzhan; Lopez, Rodrigo; Magrane, Michele; Martin, Maria J.; Natale, Darren A.; O’Donovan, Claire; Redaschi, Nicole; Yeh, Lai-Su L.

    2004-01-01

    To provide the scientific community with a single, centralized, authoritative resource for protein sequences and functional information, the Swiss-Prot, TrEMBL and PIR protein database activities have united to form the Universal Protein Knowledgebase (UniProt) consortium. Our mission is to provide a comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase, with extensive cross-references and query interfaces. The central database will have two sections, corresponding to the familiar Swiss-Prot (fully manually curated entries) and TrEMBL (enriched with automated classification, annotation and extensive cross-references). For convenient sequence searches, UniProt also provides several non-redundant sequence databases. The UniProt NREF (UniRef) databases provide representative subsets of the knowledgebase suitable for efficient searching. The comprehensive UniProt Archive (UniParc) is updated daily from many public source databases. The UniProt databases can be accessed online (http://www.uniprot.org) or downloaded in several formats (ftp://ftp.uniprot.org/pub). The scientific community is encouraged to submit data for inclusion in UniProt. PMID:14681372

  16. The InterPro database, an integrated documentation resource for protein families, domains and functional sites.

    PubMed

    Apweiler, R; Attwood, T K; Bairoch, A; Bateman, A; Birney, E; Biswas, M; Bucher, P; Cerutti, L; Corpet, F; Croning, M D; Durbin, R; Falquet, L; Fleischmann, W; Gouzy, J; Hermjakob, H; Hulo, N; Jonassen, I; Kahn, D; Kanapin, A; Karavidopoulou, Y; Lopez, R; Marx, B; Mulder, N J; Oinn, T M; Pagni, M; Servant, F; Sigrist, C J; Zdobnov, E M

    2001-01-01

    Signature databases are vital tools for identifying distant relationships in novel sequences and hence for inferring protein function. InterPro is an integrated documentation resource for protein families, domains and functional sites, which amalgamates the efforts of the PROSITE, PRINTS, Pfam and ProDom database projects. Each InterPro entry includes a functional description, annotation, literature references and links back to the relevant member database(s). Release 2.0 of InterPro (October 2000) contains over 3000 entries, representing families, domains, repeats and sites of post-translational modification encoded by a total of 6804 different regular expressions, profiles, fingerprints and Hidden Markov Models. Each InterPro entry lists all the matches against SWISS-PROT and TrEMBL (more than 1,000,000 hits from 462,500 proteins in SWISS-PROT and TrEMBL). The database is accessible for text- and sequence-based searches at http://www.ebi.ac.uk/interpro/. Questions can be emailed to interhelp@ebi.ac.uk. PMID:11125043

  17. InterPro: an integrated documentation resource for protein families, domains and functional sites.

    PubMed

    Mulder, Nicola J; Apweiler, Rolf; Attwood, Terri K; Bairoch, Amos; Bateman, Alex; Binns, David; Biswas, Margaret; Bradley, Paul; Bork, Peer; Bucher, Phillip; Copley, Richard; Courcelle, Emmanuel; Durbin, Richard; Falquet, Laurent; Fleischmann, Wolfgang; Gouzy, Jerome; Griffith-Jones, Sam; Haft, Daniel; Hermjakob, Henning; Hulo, Nicolas; Kahn, Daniel; Kanapin, Alexander; Krestyaninova, Maria; Lopez, Rodrigo; Letunic, Ivica; Orchard, Sandra; Pagni, Marco; Peyruc, David; Ponting, Chris P; Servant, Florence; Sigrist, Christian J A

    2002-09-01

    The exponential increase in the submission of nucleotide sequences to the nucleotide sequence database by genome sequencing centres has resulted in a need for rapid, automatic methods for classification of the resulting protein sequences. There are several signature and sequence cluster-based methods for protein classification, each resource having distinct areas of optimum application owing to the differences in the underlying analysis methods. In recognition of this, InterPro was developed as an integrated documentation resource for protein families, domains and functional sites, to rationalise the complementary efforts of the individual protein signature database projects. The member databases - PRINTS, PROSITE, Pfam, ProDom, SMART and TIGRFAMs - form the InterPro core. Related signatures from each member database are unified into single InterPro entries. Each InterPro entry includes a unique accession number, functional descriptions and literature references, and links are made back to the relevant member database(s). Release 4.0 of InterPro (November 2001) contains 4,691 entries, representing 3,532 families, 1,068 domains, 74 repeats and 15 sites of post-translational modification (PTMs) encoded by different regular expressions, profiles, fingerprints and hidden Markov models (HMMs). Each InterPro entry lists all the matches against SWISS-PROT and TrEMBL (2,141,621 InterPro hits from 586,124 SWISS-PROT and TrEMBL protein sequences). The database is freely accessible for text- and sequence-based searches. PMID:12230031

  18. UniProt: the Universal Protein knowledgebase.

    PubMed

    Apweiler, Rolf; Bairoch, Amos; Wu, Cathy H; Barker, Winona C; Boeckmann, Brigitte; Ferro, Serenella; Gasteiger, Elisabeth; Huang, Hongzhan; Lopez, Rodrigo; Magrane, Michele; Martin, Maria J; Natale, Darren A; O'Donovan, Claire; Redaschi, Nicole; Yeh, Lai-Su L

    2004-01-01

    To provide the scientific community with a single, centralized, authoritative resource for protein sequences and functional information, the Swiss-Prot, TrEMBL and PIR protein database activities have united to form the Universal Protein Knowledgebase (UniProt) consortium. Our mission is to provide a comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase, with extensive cross-references and query interfaces. The central database will have two sections, corresponding to the familiar Swiss-Prot (fully manually curated entries) and TrEMBL (enriched with automated classification, annotation and extensive cross-references). For convenient sequence searches, UniProt also provides several non-redundant sequence databases. The UniProt NREF (UniRef) databases provide representative subsets of the knowledgebase suitable for efficient searching. The comprehensive UniProt Archive (UniParc) is updated daily from many public source databases. The UniProt databases can be accessed online (http://www.uniprot.org) or downloaded in several formats (ftp://ftp.uniprot.org/pub). The scientific community is encouraged to submit data for inclusion in UniProt. PMID:14681372

  19. The InterPro database, an integrated documentation resource for protein families, domains and functional sites

    PubMed Central

    Apweiler, R.; Attwood, T. K.; Bairoch, A.; Bateman, A.; Birney, E.; Biswas, M.; Bucher, P.; Cerutti, L.; Corpet, F.; Croning, M. D. R.; Durbin, R.; Falquet, L.; Fleischmann, W.; Gouzy, J.; Hermjakob, H.; Hulo, N.; Jonassen, I.; Kahn, D.; Kanapin, A.; Karavidopoulou, Y.; Lopez, R.; Marx, B.; Mulder, N. J.; Oinn, T. M.; Pagni, M.; Servant, F.; Sigrist, C. J. A.; Zdobnov, E. M.

    2001-01-01

    Signature databases are vital tools for identifying distant relationships in novel sequences and hence for inferring protein function. InterPro is an integrated documentation resource for protein families, domains and functional sites, which amalgamates the efforts of the PROSITE, PRINTS, Pfam and ProDom database projects. Each InterPro entry includes a functional description, annotation, literature references and links back to the relevant member database(s). Release 2.0 of InterPro (October 2000) contains over 3000 entries, representing families, domains, repeats and sites of post-translational modification encoded by a total of 6804 different regular expressions, profiles, fingerprints and Hidden Markov Models. Each InterPro entry lists all the matches against SWISS-PROT and TrEMBL (more than 1 000 000 hits from 462 500 proteins in SWISS-PROT and TrEMBL). The database is accessible for text- and sequence-based searches at http://www.ebi.ac.uk/interpro/. Questions can be emailed to interhelp@ebi.ac.uk. PMID:11125043

  20. The UniProtKB/Swiss-Prot knowledgebase and its Plant Proteome Annotation Program.

    PubMed

    Schneider, Michel; Lane, Lydie; Boutet, Emmanuel; Lieberherr, Damien; Tognolli, Michael; Bougueleret, Lydie; Bairoch, Amos

    2009-04-13

    The UniProt knowledgebase, UniProtKB, is the main product of the UniProt consortium. It consists of two sections, UniProtKB/Swiss-Prot, the manually curated section, and UniProtKB/TrEMBL, the computer translation of the EMBL/GenBank/DDBJ nucleotide sequence database. Taken together, these two sections cover all the proteins characterized or inferred from all publicly available nucleotide sequences. The Plant Proteome Annotation Program (PPAP) of UniProtKB/Swiss-Prot focuses on the manual annotation of plant-specific proteins and protein families. Our major effort is currently directed towards the two model plants Arabidopsis thaliana and Oryza sativa. In UniProtKB/Swiss-Prot, redundancy is minimized by merging all data from different sources in a single entry. The proposed protein sequence is frequently modified after comparison with ESTs, full length transcripts or homologous proteins from other species. The information present in manually curated entries allows the reconstruction of all described isoforms. The annotation also includes proteomics data such as PTM and protein identification MS experimental results. UniProtKB and the other products of the UniProt consortium are accessible online at www.uniprot.org. PMID:19084081

  1. Bioinformatics in protein analysis.

    PubMed

    Persson, B

    2000-01-01

    The chapter gives an overview of bioinformatic techniques of importance in protein analysis. These include database searches, sequence comparisons and structural predictions. Links to useful World Wide Web (WWW) pages are given in relation to each topic. Databases with biological information are reviewed with emphasis on databases for nucleotide sequences (EMBL, GenBank, DDBJ), genomes, amino acid sequences (Swissprot, PIR, TrEMBL, GenePept), and three-dimensional structures (PDB). Integrated user interfaces for databases (SRS and Entrez) are described. An introduction to databases of sequence patterns and protein families is also given (Prosite, Pfam, Blocks). Furthermore, the chapter describes the widespread methods for sequence comparisons, FASTA and BLAST, and the corresponding WWW services. The techniques involving multiple sequence alignments are also reviewed: alignment creation with the Clustal programs, phylogenetic tree calculation with the Clustal or Phylip packages and tree display using Drawtree, njplot or phylo_win. Finally, the chapter also treats the issue of structural prediction. Different methods for secondary structure predictions are described (Chou-Fasman, Garnier-Osguthorpe-Robson, Predator, PHD). Techniques for predicting membrane proteins, antigenic sites and post-translational modifications are also reviewed. PMID:10803381

  2. Towards a Universal SMILES representation - A standard method to generate canonical SMILES based on the InChI

    PubMed Central

    2012-01-01

    Background: There are two line notations of chemical structures that have established themselves in the field: the SMILES string and the InChI string. The InChI aims to provide a unique, or canonical, identifier for chemical structures, while SMILES strings are widely used for storage and interchange of chemical structures, but no standard exists to generate a canonical SMILES string. Results: I describe how to use the InChI canonicalisation to derive a canonical SMILES string in a straightforward way, either incorporating the InChI normalisations (Inchified SMILES) or not (Universal SMILES). This is the first description of a method to generate canonical SMILES that takes stereochemistry into account. When tested on the 1.1 m compounds in the ChEMBL database, and a 1 m compound subset of the PubChem Substance database, no canonicalisation failures were found with Inchified SMILES. Using Universal SMILES, 99.79% of the ChEMBL database was canonicalised successfully and 99.77% of the PubChem subset. Conclusions: The InChI canonicalisation algorithm can successfully be used as the basis for a common standard for canonical SMILES. While challenges remain – such as the development of a standard aromatic model for SMILES – the ability to create the same SMILES using different toolkits will mean that for the first time it will be possible to easily compare the chemical models used by different toolkits. PMID:22989151

  3. ARS Component B: structural characterization, tissue expression and regulation of the gene and protein (SLURP-1) associated with Mal de Meleda.

    PubMed

    Mastrangeli, Renato; Donini, Silvia; Kelton, Christie A; He, Chaomei; Bressan, Alessandro; Milazzo, Ferdinando; Ciolli, Veniero; Borrelli, Francesco; Martelli, Fabrizio; Biffoni, Mauro; Serlupi-Crescenzi, Ottaviano; Serani, Serenella; Micangeli, Emilia; El Tayar, Nabil; Vaccaro, Rosa; Renda, Tindaro; Lisciani, Romeo; Rossi, Mara; Papoian, Ruben

    2003-01-01

    The ARS Component B gene (EMBL ID: HSARS81S, AC: X99977) encodes a 9 kD non-glycosylated polypeptide (also known as SLURP-1, SwissProt/TrEMBL: P55000), a soluble member of the human Ly6/uPAR superfamily. ARS Component B gene mutations have been implicated in Mal de Meleda. In this study we show by immunohistochemistry that SLURP-1 (secreted Ly-6/uPAR related protein, the protein product of the ARS Component B gene) is localized to human skin, exocervix, gums, stomach and esophagus. In the epidermis, keratinocytes underlying the stratum corneum are highly positive for SLURP1 immunostaining and cultured keratinocytes secrete the expected 9 kD protein. Circulating SLURP1 is detected in human plasma and urine. In the mouse, expression is evident in skin, eye, whole lung, trachea, esophagus and stomach. Human ARS Component B mRNA expression is regulated by retinoic acid, epidermal growth factor and interferon-gamma. The tissue localization and the association with Mal de Meleda suggest that ARS Component B and its protein product SLURP1 are implicated in maintaining the physiological and structural integrity of the keratinocyte layers of the skin. PMID:14721776

  4. On the microstructure and symmetry of apparently hexagonal BaAl₂O₄

    SciTech Connect

    Larsson, A.-K.; Withers, R.L.; Perez-Mato, J.M.; Fitz Gerald, J.D.; Saines, P.J.; Kennedy, B.J.; Liu, Y.

    2008-08-15

    The P6₃ (a=2aₚ, b=2bₚ, c=cₚ) crystal structure reported for BaAl₂O₄ at room temperature has been carefully re-investigated by a combined transmission electron microscopy and neutron powder diffraction study. It is shown that the poor fit of this P6₃ (a=2aₚ, b=2bₚ, c=cₚ) structure model for BaAl₂O₄ to neutron powder diffraction data is primarily due to the failure to take into account coherent scattering between different domains related by enantiomorphic twinning of the P6₃22 parent sub-structure. Fast Fourier transforms of [0 0 1] lattice images from small localized real space regions (≈10 nm in diameter) are used to show that the P6₃ (a=2aₚ, b=2bₚ, c=cₚ) crystal structure reported for BaAl₂O₄ is not correct on the local scale. The correct local symmetry of the very small nano-domains is most likely orthorhombic or monoclinic. - Graphical abstract: The electron diffraction pattern of BaAl₂O₄ (left) is compatible with the 3-q superstructure corresponding to the conventional P6₃, a=2aₚ structure (p refers to the tridymite-related parent P6₃22 structure). Fast Fourier transforms (right) of small domains of lattice images, however, show that the local structure is in fact single q, and that the true symmetry is monoclinic or orthorhombic.

  5. Stereospecificity of amino acid hydroxamate inhibition of aminopeptidases.

    PubMed

    Wilkes, S H; Prescott, J M

    1983-11-25

    Hydroxamates of amino acids and aliphatic acids are effective inhibitors of Aeromonas proteolytica aminopeptidase (EC 3.4.11.10) and of both the cytosolic (EC 3.4.11.1) and microsomal (EC 3.4.11.2) aminopeptidases of swine kidney. Cytosolic leucine aminopeptidase and the Aeromonas enzyme were inhibited to a greater extent by D isomers than by the L enantiomorphs, manganese-activated kidney cytosolic leucine aminopeptidase being inhibited 10 times more effectively by D-leucine and D-valine hydroxamic acids than by the L isomers. The D isomers of these two compounds inhibited Aeromonas aminopeptidase to an even greater extent, with Ki values of 2 × 10⁻⁹ and 5 × 10⁻⁹, respectively, whereas the corresponding L isomers were bound 150 times less tightly. With the Aeromonas enzyme, a comparison of inhibition by racemic mixtures with that of the corresponding L isomers indicated that in all cases the contribution of the D isomer was predominant. Isocaproic hydroxamic acid inhibited this enzyme as effectively as L-leucine hydroxamic acid, indicating that the amino group orientation in the D isomer contributes to the binding efficacy. Swine kidney microsomal aminopeptidase was also inhibited by D isomers of leucine and valine hydroxamic acids but, in contrast to the other two enzymes, the inhibition was 10-fold less than that observed for the corresponding L isomers. Cytosolic leucine aminopeptidase with either 6 g atoms of zinc per mol or 12 g atoms of zinc per mol was inhibited only slightly by any of the hydroxamic acid compounds; evidently enzyme-bound manganese (or magnesium) is specific for hydroxamate binding to this aminopeptidase. PMID:6643439
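To put the 150-fold stereoselectivity reported for the Aeromonas enzyme into more familiar terms, the sketch below converts the fold difference into implied L-isomer Ki values and an approximate binding free-energy gap. It is an illustration only: the temperature (298.15 K) is an assumption not stated in the abstract, the Ki values are used exactly as quoted (the abstract gives no units), and the function name is hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fold_to_ddg(fold, temp_k=298.15):
    """Free-energy gap (kcal/mol) implied by a fold difference in Ki.

    temp_k is an ASSUMED temperature; the abstract does not state one.
    """
    return R_KCAL * temp_k * math.log(fold)


# Ki values for the D isomers as quoted in the abstract (units not given there).
ki_d = {"D-leucine hydroxamate": 2e-9, "D-valine hydroxamate": 5e-9}
FOLD = 150  # L isomers are bound 150 times less tightly

for name, ki in ki_d.items():
    print(f"{name}: Ki = {ki:.1e}, implied L-isomer Ki = {ki * FOLD:.1e}")

print(f"Implied free-energy gap: {fold_to_ddg(FOLD):.2f} kcal/mol")
```

Under these assumptions a 150-fold difference corresponds to roughly 3 kcal/mol, which is the scale of a few hydrogen bonds.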

  6. Cloning, purification and preliminary crystallographic analysis of a putative pyridoxal kinase from Bacillus subtilis

    SciTech Connect

    Newman, Joseph A.; Das, Sanjan K.; Sedelnikova, Svetlana E.; Rice, David W.

    2006-10-01

    A putative pyridoxal kinase from B. subtilis has been cloned, overexpressed, purified and crystallized and data have been collected to 2.8 Å resolution. Pyridoxal kinases (PdxK) are able to catalyse the phosphorylation of three vitamin B₆ precursors, pyridoxal, pyridoxine and pyridoxamine, to their 5′-phosphates and play an important role in the vitamin B₆ salvage pathway. Recently, the thiD gene of Bacillus subtilis was found to encode an enzyme which has the activity expected of a pyridoxal kinase despite its previous assignment as an HMPP kinase owing to higher sequence similarity. As such, this enzyme would appear to represent a new class of ‘HMPP kinase-like’ pyridoxal kinases. B. subtilis thiD has been cloned and the protein has been overexpressed in Escherichia coli, purified and subsequently crystallized in a binary complex with ADP and Mg²⁺. X-ray diffraction data have been collected from crystals to 2.8 Å resolution at 100 K. The crystals belong to a primitive tetragonal system, point group 422, and analysis of the systematic absences suggests that they belong to one of the enantiomorphic pair of space groups P4₁2₁2 or P4₃2₁2. Consideration of the space-group symmetry and unit-cell parameters (a = b = 102.9, c = 252.6 Å, α = β = γ = 90°) suggests that the crystals contain between three and six molecules in the asymmetric unit. A full structure determination is under way to provide insights into aspects of the enzyme mechanism and substrate specificity.
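The estimate of three to six molecules per asymmetric unit is the kind of result usually obtained from the Matthews coefficient, V_M = V_cell / (Z × n × MW). The sketch below reproduces that reasoning from the unit-cell parameters quoted in the abstract. The molecular weight (29 kDa) is an assumed, illustrative value that is not given in the abstract, and the function name is hypothetical; Z = 8 is the number of general positions in P4₁2₁2 and P4₃2₁2.

```python
def matthews_coefficient(a, b, c, z, n_mol, mw_da):
    """Return V_M in cubic angstroms per dalton for a cell with all angles 90 degrees."""
    cell_volume = a * b * c  # valid only because alpha = beta = gamma = 90
    return cell_volume / (z * n_mol * mw_da)


A = B = 102.9    # angstroms, from the abstract
C = 252.6        # angstroms, from the abstract
Z = 8            # asymmetric units per cell in P4(1)2(1)2 / P4(3)2(1)2
MW_DA = 29_000   # ASSUMPTION: illustrative monomer mass, not stated in the abstract

for n in range(3, 7):
    vm = matthews_coefficient(A, B, C, Z, n, MW_DA)
    print(f"{n} molecules per asymmetric unit: V_M = {vm:.2f} A^3/Da")
```

With the assumed mass, three to six molecules give V_M values of roughly 3.8 down to 1.9 Å³/Da, close to the range commonly observed for protein crystals, which is consistent with the abstract's estimate. A different molecular weight would shift these numbers.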

  7. K/Ar dating of illite in understanding fault-related diagenesis

    SciTech Connect

    Lee, M.

    1996-12-31

    K/Ar dating of diagenetic illite provides quantitative information regarding the timing of major geological events within sedimentary basins. K/Ar data, however, have to be evaluated in the framework of sedimentological, mineralogical and structural configuration. This information, together with other quantitative data, can reveal the dynamics of diagenetic events, such as timing, condition, fluid source, and their relationship to structural/tectonic processes. In several Rotliegendes gas fields in the North Sea basins, abundant 1M(cis) illite is found in some reservoirs. These samples are located near fracture/faulted zones. Synchrotron diffraction data and WIDEFIRE illite polytype modeling show that these illites consist of a single 1M(cis) enantiomorph. Morphologically, they are platy, and form grain-coating clays oriented perpendicular to the grain surface. In these samples, detrital feldspars and lithic fragments are severely altered and pore-filling quartz cements, co-precipitated with illites, are extensive. In samples away from fractured/faulted zones, detrital feldspars are less altered and only moderate to minor amounts of 1M(trans) illites are found. K/Ar data show that illite in samples closest to the fractured/faulted regions formed first. Stable oxygen isotopes of illite and hand-picked quartz cements and fluid inclusions show that hot, high-salinity (>20 wt.%), CaCl₂-rich fluids were involved in these diagenetic processes, suggestive of hydrothermal origin. Our data and other published data show that these processes may be very common in rift settings, and significant reservoir heterogeneity could be generated by fault-related diagenesis.

  8. K/Ar dating of illite in understanding fault-related diagenesis

    SciTech Connect

    Lee, M.

    1996-01-01

    K/Ar dating of diagenetic illite provides quantitative information regarding the timing of major geological events within sedimentary basins. K/Ar data, however, have to be evaluated in the framework of sedimentological, mineralogical and structural configuration. This information, together with other quantitative data, can reveal the dynamics of diagenetic events, such as timing, condition, fluid source, and their relationship to structural/tectonic processes. In several Rotliegendes gas fields in the North Sea basins, abundant 1M(cis) illite is found in some reservoirs. These samples are located near fracture/faulted zones. Synchrotron diffraction data and WIDEFIRE illite polytype modeling show that these illites consist of a single 1M(cis) enantiomorph. Morphologically, they are platy, and form grain-coating clays oriented perpendicular to the grain surface. In these samples, detrital feldspars and lithic fragments are severely altered and pore-filling quartz cements, co-precipitated with illites, are extensive. In samples away from fractured/faulted zones, detrital feldspars are less altered and only moderate to minor amounts of 1M(trans) illites are found. K/Ar data show that illite in samples closest to the fractured/faulted regions formed first. Stable oxygen isotopes of illite and hand-picked quartz cements and fluid inclusions show that hot, high-salinity (>20 wt.%), CaCl₂-rich fluids were involved in these diagenetic processes, suggestive of hydrothermal origin. Our data and other published data show that these processes may be very common in rift settings, and significant reservoir heterogeneity could be generated by fault-related diagenesis.

  9. Life Depends upon Two Kinds of Water

    PubMed Central

    Wiggins, Philippa

    2008-01-01

    Background: Many well-documented biochemical processes lack a molecular mechanism. Examples are: how ATP hydrolysis and an enzyme contrive to perform work, such as active transport; how peptides are formed from amino acids and DNA from nucleotides; how proteases cleave peptide bonds; how bone mineralises; how enzymes distinguish between sodium and potassium; how chirality of biopolymers was established prebiotically. Methodology/Principal Findings: It is shown that involvement of water in all these processes is mandatory, but the water must be of the simplified configuration in which there are only two strengths of water-water hydrogen bonds, and in which these two types of water coexist as microdomains throughout the liquid temperature range. Since they have different strengths of hydrogen bonds, the microdomains differ in all their physical and chemical properties. Solutes partition asymmetrically, generating osmotic pressure gradients which must be compensated for or abolished. Displacement of the equilibrium between high and low density waters incurs a thermodynamic cost which limits solubility, depresses ionisation of water, drives protein folding and prevents high density water from boiling at its intrinsic boiling point, which appears to be below 0°C. Active processes in biochemistry take place in sequential partial reactions, most of which release small amounts of free energy as heat. This ensures that the system is never far from equilibrium so that efficiency is extremely high. Energy transduction is neither possible nor necessary. Chirality was probably established in prebiotic clays which must have carried stable populations of high density and low density water domains. Bioactive enantiomorphs partition into low density water in which they polymerise spontaneously. Conclusions/Significance: The simplified model of water has great explanatory power. PMID:18183287

  10. Molecular characterization of Mycobacterium avium complex isolates giving discordant results in AccuProbe tests by PCR-restriction enzyme analysis, 16S rRNA gene sequencing, and DT1-DT6 PCR.

    PubMed Central

    Devallois, A; Picardeau, M; Paramasivan, C N; Vincent, V; Rastogi, N

    1997-01-01

    Based on cultural and biochemical tests, a total of 84 strains (72 clinical and 12 environmental isolates from the Caribbean Isles, Europe, and the Indian subcontinent) were identified as members of the Mycobacterium avium complex (MAC). They were further characterized with MAC, M. avium, and M. intracellulare probes of the AccuProbe system, and this was followed by selective amplification of DT6 and DT1 sequences. Seventy isolates gave concordant results; 63 were identified as M. avium, 5 were identified as M. intracellulare, and 2 remained untypeable by both methods. Fourteen isolates gave discrepant results, as they were DT1 positive but gave negative results by the M. intracellulare AccuProbe test. Consequently, a detailed molecular analysis of all DT1-positive isolates (14 discrepant strains plus 5 M. intracellulare strains) was performed by PCR-restriction analysis (PRA) of the hsp65 gene and 16S rRNA gene sequencing. The results confirmed the reported heterogeneity of M. intracellulare, as only 6 of 19 isolates (32%) gave PRA results compatible with published M. intracellulare profiles while the rest of the isolates were grouped in four previously unpublished profiles. 16S rRNA gene sequencing showed that only 8 of 19 isolates (42%) were related to M. intracellulare IWGMT 90247 (EMBL accession no. X88917), the rest being related to MCRO19 (EMBL accession no. X93030) and MIWGTMR10 (EMBL accession no. X88915). In conclusion, we have characterized a significant number of MAC isolates which were not identified by the AccuProbe test, PRA, or 16S rRNA sequencing. However, all of them were identifiable by DT1-DT6 PCR (they were DT6 negative and DT1 positive) and could be tentatively identified as M. intracellulare based on previously published observations. It is noteworthy that the majority of such isolates (14 of 19) were from the Indian subcontinent, with 12 of 14 being environmental isolates. Our study confirms the marked heterogeneity of M. intracellulare

  11. To Hit or Not to Hit, That Is the Question – Genome-wide Structure-Based Druggability Predictions for Pseudomonas aeruginosa Proteins

    PubMed Central

    Sarkar, Aurijit; Brenk, Ruth

    2015-01-01

    Pseudomonas aeruginosa is a Gram-negative bacterium known to cause opportunistic infections in immune-compromised or immunosuppressed individuals that often prove fatal. New drugs to combat this organism are therefore sought after. To this end, we subjected the gene products of predicted perturbative genes to structure-based druggability predictions using DrugPred. Making this approach suitable for large-scale predictions required the introduction of new methods for calculation of descriptors, development of a workflow to identify suitable pockets in homologous proteins and establishment of criteria to obtain valid druggability predictions based on homologs. We were able to identify 29 perturbative proteins of P. aeruginosa that may contain druggable pockets, including some with no inhibitors, or no drug-like inhibitors, deposited in ChEMBL. These proteins form promising novel targets for drug discovery against P. aeruginosa. PMID:26360059

  12. Isolation and complete genome sequencing of Mimivirus bombay, a Giant Virus in sewage of Mumbai, India.

    PubMed

    Chatterjee, Anirvan; Ali, Farhan; Bange, Disha; Kondabagil, Kiran

    2016-09-01

    We report the isolation and complete genome sequencing of a new Mimiviridae family member, infecting Acanthamoeba castellanii, from sewage in Mumbai, India. The isolated virus has a particle size of about 435 nm and a 1,182,200-bp genome. A phylogeny based on the DNA polymerase sequence placed the isolate as a new member of the Mimiviridae family lineage A, and it was named Mimivirus bombay. The extensive presence of Mimiviridae family members in different environmental niches, with remarkably similar genome size and genetic makeup, points towards an evolutionary advantage that needs to be further investigated. The complete genome sequence of Mimivirus bombay was deposited at GenBank/EMBL/DDBJ under the accession number KU761889. PMID:27330993

  13. A side effect resource to capture phenotypic effects of drugs

    PubMed Central

    Kuhn, Michael; Campillos, Monica; Letunic, Ivica; Jensen, Lars Juhl; Bork, Peer

    2010-01-01

    The molecular understanding of phenotypes caused by drugs in humans is essential for elucidating mechanisms of action and for developing personalized medicines. Side effects of drugs (also known as adverse drug reactions) are an important source of human phenotypic information, but so far research on this topic has been hampered by insufficient accessibility of data. Consequently, we have developed a public, computer-readable side effect resource (SIDER) that connects 888 drugs to 1450 side effect terms. It contains information on frequency in patients for one-third of the drug–side effect pairs. For 199 drugs, the side effect frequency of placebo administration could also be extracted. We illustrate the potential of SIDER with a number of analyses. The resource is freely available for academic research at http://sideeffects.embl.de. PMID:20087340

  14. Cardiocirculatory arrest due to electrical injury: the value of the semi-automatic defibrillator

    PubMed Central

    Siah, S.; Fouadi, F.E.; Ababou, K.; Ihrai, I.; Drissi, N.K.

    2011-01-01

    Summary: Burns caused by electrical accidents are serious because they can lead to death from cardiocirculatory arrest. Cardiocirculatory arrests induced by low-voltage current are generally due to ventricular fibrillation, which carries a fairly good prognosis if the chain of rescue is effective. Priority must be given to systematic defibrillation from the outset, using a semi-automatic defibrillator. Electrical defibrillation can immediately restore spontaneous circulatory activity. PMID:21991238

  15. Genenames.org: the HGNC resources in 2015

    PubMed Central

    Gray, Kristian A.; Yates, Bethan; Seal, Ruth L.; Wright, Mathew W.; Bruford, Elspeth A.

    2015-01-01

    The HUGO Gene Nomenclature Committee (HGNC) based at the European Bioinformatics Institute (EMBL-EBI) assigns unique symbols and names to human genes. To date the HGNC have assigned over 39 000 gene names, representing an increase of over 5000 entries in the past two years. As well as increasing the size of our database, we have continued redesigning our website http://www.genenames.org and have modified, updated and improved many aspects of the site including a faster and more powerful search, a vastly improved HCOP tool and a REST service to increase the number of ways users can retrieve our data. This article provides an overview of our current online data and resources, and highlights the changes we have made in recent years. PMID:25361968

  16. Genomic analysis of novel phytopathogenic Georgenia sp. strain SUB25

    PubMed Central

    Patel, Pooja P.; Rakhashiya, Purvi M.; Thaker, Vrinda S.

    2015-01-01

    A Gram-positive bacterium, Georgenia sp. SUB25, was isolated from infected leaves of Solanum lycopersicum L. in Rajkot (22.30°N, 70.78°E), Gujarat, India. We sequenced and analyzed Georgenia sp. SUB25, a novel plant pathogen, using a next-generation sequencing platform; the assembly yielded contigs representing a size of 4.84 Mb with 81 tRNAs and 88 rRNAs. The whole genome sequence has been deposited in DDBJ/EMBL/GenBank under the accession number JNFL00000000. This genome sequence contains Type II secretion system genes, which are involved in the pathogenicity mechanism and may help in understanding plant-microbe interaction. PMID:26484278

  17. High-quality complete genome sequence of Microbacterium sp. SUBG005, a plant pathogen

    PubMed Central

    Rakhashiya, Purvi M.; Patel, Pooja P.; Thaker, Vrinda S.

    2015-01-01

    Microbacterium sp. SUBG005 is a Gram-positive bacterium isolated from an infected leaf of Mangifera indica L. in Rajkot (22.30°N, 70.78°E), Gujarat, India. The genome of Microbacterium sp. SUBG005 contains type I secretion system genes associated with pathogenicity as well as unique heavy metal resistance genes. The genome size is 7.01 Mb with a G + C content of 64.80%, and it contains rRNA sequences. Genome sequence analysis provides information about the microbe's role in host–pathogen interaction. The whole genome sequence has been deposited in DDBJ/EMBL/GenBank under the accession number JNNT00000000. PMID:26484276
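The G + C content quoted for genome records such as this one is simply the percentage of G and C bases among the unambiguous bases of the sequence. A minimal sketch of that calculation follows; the function name and the toy sequence are illustrative assumptions, not data from the record, and ambiguous bases such as N are excluded from the denominator by choice.

```python
def gc_content(sequence):
    """Return the G + C percentage among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = sum(seq.count(base) for base in "GC")
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / acgt


# Toy sequence for illustration only (not taken from any deposited genome).
print(f"{gc_content('GGCCNNat'):.2f}%")
```

For a real assembly the same function would be applied to the concatenated contigs, or computed per contig to look for regions of atypical composition.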

  18. Construction of a chromosome specific library of human MARs and mapping of matrix attachment regions on human chromosome 19.

    PubMed Central

    Nikolaev, L G; Tsevegiyn, T; Akopov, S B; Ashworth, L K; Sverdlov, E D

    1996-01-01

    Using a novel procedure, a representative human chromosome 19-specific library was constructed of short sequences which bind preferentially to the nuclear matrix (matrix attachment regions, or MARs). Judging by 20 clones sequenced so far, the library contains > 50% of human inserts, about 90% of which are matrix-binding by the in vitro test. Computer analysis of sequences of eight human MARs did not reveal any significant homologies either with the EMBL Nucleotide Data Base entries or between the MARs themselves. Eight MARs were assigned to individual positions on the chromosome 19 physical map. The library constructed can serve as a good source of MAR sequences for comparative analysis and classification and for further chromosome mapping of MARs as well. PMID:8614638

  19. Draft genome sequence of Mameliella alba strain UMTAT08 isolated from clonal culture of toxic dinoflagellate Alexandrium tamiyavanichii.

    PubMed

    Danish-Daniel, Muhd; Han Ming, Gan; Noor, Mohd Ezhar Mohd; Yeong, Yik Sung; Usup, Gires

    2016-12-01

    Mameliella alba strain UMTAT08 was isolated from a clonal culture of the paralytic shellfish toxin-producing dinoflagellate Alexandrium tamiyavanichii. The genome of strain UMTAT08 was sequenced in order to gain insights into dinoflagellate-bacteria interactions. The draft genome sequence of strain UMTAT08 contains 5.84 Mbp with an estimated G + C content of 65%, 5717 open reading frames, 5 rRNAs and 49 tRNAs. It contains genes related to nutrient uptake, quorum sensing and environmental tolerance. Gene clusters for the biosynthesis of type 1 polyketide synthase, bacteriocin, microcin, terpene and ectoine were also identified. This suggests that the bacterium possesses diverse adaptation strategies to survive within the dinoflagellate phycosphere. The draft genome sequence and annotation have been deposited at DDBJ/EMBL/GenBank under the accession number JSUQ00000000. PMID:27625991

  20. Compilation of small ribosomal subunit RNA structures.

    PubMed Central

    Neefs, J M; Van de Peer, Y; De Rijk, P; Chapelle, S; De Wachter, R

    1993-01-01

    The database on small ribosomal subunit RNA structure contained 1804 nucleotide sequences on April 23, 1993. This number comprises 365 eukaryotic, 65 archaeal, 1260 bacterial, 30 plastidial, and 84 mitochondrial sequences. These are stored in the form of an alignment in order to facilitate the use of the database as input for comparative studies on higher-order structure and for reconstruction of phylogenetic trees. The elements of the postulated secondary structure for each molecule are indicated by special symbols. The database is available on-line directly from the authors by ftp and can also be obtained from the EMBL nucleotide sequence library by electronic mail, ftp, and on CD ROM disk. PMID:8332525

  1. High quality, small molecule-activity datasets for kinase research.

    PubMed

    Sharma, Rajan; Schürer, Stephan C; Muskal, Steven M

    2016-01-01

    Kinases regulate cell growth, movement, and death. Deregulated kinase activity is a frequent cause of disease. The therapeutic potential of kinase inhibitors has led to large amounts of published structure activity relationship (SAR) data. Bioactivity databases such as the Kinase Knowledgebase (KKB), WOMBAT, GOSTAR, and ChEMBL provide researchers with quantitative data characterizing the activity of compounds across many biological assays. The KKB, for example, contains over 1.8M kinase structure-activity data points reported in peer-reviewed journals and patents. In the spirit of fostering methods development and validation worldwide, we have extracted and have made available from the KKB 258K structure activity data points and 76K associated unique chemical structures across eight kinase targets. These data are freely available for download within this data note. PMID:27429748

  2. Whole genome sequence of the emerging oomycete pathogen Pythium insidiosum strain CDC-B5653 isolated from an infected human in the USA

    PubMed Central

    Ascunce, Marina S.; Huguet-Tapia, Jose C.; Braun, Edward L.; Ortiz-Urquiza, Almudena; Keyhani, Nemat O.; Goss, Erica M.

    2015-01-01

    Pythium insidiosum ATCC 200269 strain CDC-B5653, an isolate from necrotizing lesions on the mouth and eye of a 2-year-old boy in Memphis, Tennessee, USA, was sequenced using a combination of Illumina MiSeq (300 bp paired-end, 14 million reads) and PacBio (10 Kb fragment library, 356,001 reads). The sequencing data were assembled using SPAdes version 3.1.0, yielding a total genome size of 45.6 Mb contained in 8992 contigs, N50 of 13 Kb, 57% G + C content, and 17,867 putative protein-coding genes. This Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession JRHR00000000. PMID:26981361
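The N50 statistic quoted for this assembly is the contig length at which contigs of that length or longer account for at least half of the total assembled bases. The sketch below shows one common way to compute it; the function name and the toy contig lengths are illustrative assumptions and do not come from the deposited assembly.

```python
def n50(contig_lengths):
    """Return the length L such that contigs of length >= L cover at least half the assembly."""
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig downwards until half the total is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Toy contig lengths for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

For the toy list the total is 300, and the two longest contigs (80 + 70) already reach half of it, so the N50 is 70. A higher N50 generally indicates a less fragmented assembly, although it says nothing about correctness.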

  3. Whole genome sequence of the emerging oomycete pathogen Pythium insidiosum strain CDC-B5653 isolated from an infected human in the USA.

    PubMed

    Ascunce, Marina S; Huguet-Tapia, Jose C; Braun, Edward L; Ortiz-Urquiza, Almudena; Keyhani, Nemat O; Goss, Erica M

    2016-03-01

    Pythium insidiosum ATCC 200269 strain CDC-B5653, an isolate from necrotizing lesions on the mouth and eye of a 2-year-old boy in Memphis, Tennessee, USA, was sequenced using a combination of Illumina MiSeq (300 bp paired-end, 14 million reads) and PacBio (10 Kb fragment library, 356,001 reads). The sequencing data were assembled using SPAdes version 3.1.0, yielding a total genome size of 45.6 Mb contained in 8992 contigs, N50 of 13 Kb, 57% G + C content, and 17,867 putative protein-coding genes. This Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession JRHR00000000. PMID:26981361

  4. SNPhood: investigate, quantify and visualise the epigenomic neighbourhood of SNPs using NGS data

    PubMed Central

    Arnold, Christian; Bhat, Pooja; Zaugg, Judith B.

    2016-01-01

    Motivation: The vast majority of the many thousands of disease-associated single nucleotide polymorphisms (SNPs) lie in the non-coding part of the genome. They are likely to affect regulatory elements, such as enhancers and promoters, rather than the function of a protein. To understand the molecular mechanisms underlying genetic diseases, it is therefore increasingly important to study the effect of a SNP on nearby molecular traits such as chromatin or transcription factor binding. Results: We developed SNPhood, a user-friendly Bioconductor R package to investigate, quantify and visualise the local epigenetic neighbourhood of a set of SNPs in terms of chromatin marks or TF binding sites using data from NGS experiments. Availability and implementation: SNPhood is publicly available and maintained as an R Bioconductor package at http://bioconductor.org/packages/SNPhood/. Contact: judith.zaugg@embl.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153574

  5. Draft genome sequence of the Algerian bee Apis mellifera intermissa

    PubMed Central

    Haddad, Nizar Jamal; Loucif-Ayad, Wahida; Adjlane, Noureddine; Saini, Deepti; Manchiganti, Rushiraj; Krishnamurthy, Venkatesh; AlShagoor, Banan; Batainh, Ahmed Mahmud; Mugasimangalam, Raja

    2015-01-01

    Apis mellifera intermissa is the native honeybee subspecies of Algeria. A. m. intermissa occurs in Tunisia, Algeria and Morocco, between the Atlas and the Mediterranean and Atlantic coasts. This bee is important because of its strong ability to adapt to wide variations in climatic conditions and its favorable cleaning behavior. Here we report the draft genome sequence of this honey bee; its Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession JSUV00000000. The 240-Mb genome is being annotated and analyzed. Comparison with the genome of other Apis mellifera sub-species promises to yield insights into the evolution of adaptations to high temperature and resistance to Varroa parasite infestation. PMID:26484171

  6. Development and implementation of (Q)SAR modeling within the CHARMMing web-user interface.

    PubMed

    Weidlich, Iwona E; Pevzner, Yuri; Miller, Benjamin T; Filippov, Igor V; Woodcock, H Lee; Brooks, Bernard R

    2015-01-01

    Recent availability of large publicly accessible databases of chemical compounds and their biological activities (PubChem, ChEMBL) has inspired us to develop a web-based tool for structure activity relationship and quantitative structure activity relationship modeling to add to the services provided by CHARMMing (www.charmming.org). This new module implements some of the most recent advances in modern machine learning algorithms: Random Forest, Support Vector Machine, Stochastic Gradient Descent, Gradient Tree Boosting, and so forth. A user can import training data from PubChem Bioassay data collections directly from our interface or upload his or her own SD files which contain structures and activity information to create new models (either categorical or numerical). A user can then track the model generation process and run models on new data to predict activity. PMID:25362883

  7. DILIMOT: discovery of linear motifs in proteins.

    PubMed

    Neduva, Victor; Russell, Robert B

    2006-07-01

    Discovery of protein functional motifs is critical in modern biology. Small segments of 3-10 residues play critical roles in protein interactions, post-translational modifications and trafficking. DILIMOT (DIscovery of LInear MOTifs) is a server for the prediction of these short linear motifs within a set of proteins. Given a set of sequences sharing a common functional feature (e.g. interaction partner or localization) the method finds statistically over-represented motifs likely to be responsible for it. The input sequences are first passed through a set of filters to remove regions unlikely to contain instances of linear motifs. Motifs are then found in the remaining sequence and ranked according to a statistic that measures over-representation and conservation across homologues in related species. The results are displayed via a visual interface for easy perusal. The server is available at http://dilimot.embl.de. PMID:16845024

  8. Predicting protein disorder by analyzing amino acid sequence

    PubMed Central

    Yang, Jack Y; Yang, Mary Qu

    2008-01-01

    Background: Many protein regions and some entire proteins have no definite tertiary structure, presenting instead as dynamic, disordered ensembles under different physicochemical circumstances. These proteins and regions are known as Intrinsically Unstructured Proteins (IUP). IUP have been associated with a wide range of protein functions, along with roles in diseases characterized by protein misfolding and aggregation. Results: Identifying IUP is an important task in structural and functional genomics. We extract useful features from sequences and develop machine learning algorithms for the above task. We compare our IUP predictor with PONDRs (mainly neural-network-based predictors), disEMBL (also based on neural networks) and Globplot (based on disorder propensity). Conclusion: We find that augmenting features derived from physicochemical properties of amino acids (such as hydrophobicity, complexity, etc.) and using an ensemble method proved beneficial. The IUP predictor is a viable alternative software tool for identifying IUP protein regions and proteins. PMID:18831799

  9. TRACTS: A program to map oligopurine.oligopyrimidine and other binary DNA tracts.

    PubMed

    Gal, Moshe; Katz, Tzvi; Ovadia, Amir; Yagil, Gad

    2003-07-01

    A program to map the locations and frequencies of DNA tracts composed of only two bases ('Binary DNA') is described. The program, TRACTS (URL http://bioportal.weizmann.ac.il/tracts/tracts.html and/or http://bip.weizmann.ac.il/miwbin/servers/tracts) is of interest because long tracts composed of only two bases are highly over-represented in most genomes. In eukaryotes, oligopurine.oligopyrimidine tracts ('R.Y tracts') are found in the highest excess. In prokaryotes, W tracts predominate (A,T 'rich'). A pre-program, ANEX, parses database annotation files of GenBank and EMBL, to produce a convenient one-line list of every gene (exon, intron) in a genome. The main unit lists and analyzes tracts of the three possible binary pairs (R.Y, K.M and S.W). As an example, the results of R.Y tract mapping of the mammalian gene p53 are described. PMID:12824393
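
    The idea of a binary tract can be illustrated with a short sketch that scans one strand for maximal runs drawn from a single two-base class, using the standard IUPAC codes R (A/G), Y (C/T), K (G/T), M (A/C), S (C/G) and W (A/T). This is not the TRACTS implementation: the function name `find_tracts`, the `min_len` threshold and the example sequence are assumptions made only for illustration.

```python
import re

# IUPAC two-base classes and the bases each one allows.
BINARY_CLASSES = {
    "R": "AG",  # purines
    "Y": "CT",  # pyrimidines
    "K": "GT",  # keto
    "M": "AC",  # amino
    "S": "CG",  # strong
    "W": "AT",  # weak
}


def find_tracts(seq, min_len=10):
    """Yield (class, start, end, tract) for each maximal run of at least
    min_len bases drawn from one two-base class.

    Coordinates are 0-based with an exclusive end. The min_len default
    is an arbitrary illustrative threshold.
    """
    seq = seq.upper()
    for name, bases in BINARY_CLASSES.items():
        # The greedy quantifier makes each match a maximal run.
        pattern = re.compile("[%s]{%d,}" % (bases, min_len))
        for match in pattern.finditer(seq):
            yield name, match.start(), match.end(), match.group()


# Hypothetical toy sequence containing one purine-only run of 8 bases.
for hit in find_tracts("CCTAGGAGAAGTC", min_len=6):
    print(hit)  # ('R', 3, 11, 'AGGAGAAG')
```

    A purine run on the scanned strand corresponds to a pyrimidine run on the complementary strand, which is why R and Y are reported together as R.Y tracts. Note that in this sketch a homopolymer run such as AAAAAA is reported under every class that contains its base (R, M and W).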

  10. Reassessing Domain Architecture Evolution of Metazoan Proteins: Major Impact of Gene Prediction Errors

    PubMed Central

    Nagy, Alinda; Szláma, György; Szarka, Eszter; Trexler, Mária; Bányai, László; Patthy, László

    2011-01-01

    In view of the fact that appearance of novel protein domain architectures (DA) is closely associated with biological innovations, there is a growing interest in the genome-scale reconstruction of the evolutionary history of the domain architectures of multidomain proteins. In such analyses, however, it is usually ignored that a significant proportion of Metazoan sequences analyzed is mispredicted and that this may seriously affect the validity of the conclusions. To estimate the contribution of errors in gene prediction to differences in DA of predicted proteins, we have used the high quality manually curated UniProtKB/Swiss-Prot database as a reference. For genome-scale analysis of domain architectures of predicted proteins we focused on RefSeq, EnsEMBL and NCBI's GNOMON predicted sequences of Metazoan species with completely sequenced genomes. Comparison of the DA of UniProtKB/Swiss-Prot sequences of worm, fly, zebrafish, frog, chick, mouse, rat and orangutan with those of human Swiss-Prot entries have identified relatively few cases where orthologs had different DA, although the percentage with different DA increased with evolutionary distance. In contrast with this, comparison of the DA of human, orangutan, rat, mouse, chicken, frog, zebrafish, worm and fly RefSeq, EnsEMBL and NCBI's GNOMON predicted protein sequences with those of the corresponding/orthologous human Swiss-Prot entries identified a significantly higher proportion of domain architecture differences than in the case of the comparison of Swiss-Prot entries. Analysis of RefSeq, EnsEMBL and NCBI's GNOMON predicted protein sequences with DAs different from those of their Swiss-Prot orthologs confirmed that the higher rate of domain architecture differences is due to errors in gene prediction, the majority of which could be corrected with our FixPred protocol. We have also demonstrated that contamination of databases with incomplete, abnormal or mispredicted sequences introduces a bias in DA

  11. Draft genome sequence of Staphylococcus aureus KT/312045, an ST1-MSSA PVL positive isolated from pus sample in East Coast Malaysia.

    PubMed

    Suhaili, Zarizal; Lean, Soo-Sum; Mohamad, Noor Muzamil; Rachman, Abdul R Abdul; Desa, Mohd Nasir Mohd; Yeo, Chew Chieng

    2016-09-01

    Efforts to elucidate the molecular relatedness and epidemiology of Staphylococcus aureus in Malaysia have largely focused on methicillin-resistant S. aureus (MRSA). Here we therefore report the draft genome sequence of a methicillin-susceptible Staphylococcus aureus (MSSA) isolate of sequence type 1 (ST1) and spa type t127, carrying the Panton-Valentine Leukocidin (pvl) pathogenic determinant, isolated from a pus sample and designated strain KT/314250. The draft genome is 2.86 Mbp in size, with a G + C content of 32.7% and 2673 coding sequences. The draft genome sequence has been deposited in DDBJ/EMBL/GenBank under the accession number AOCP00000000. PMID:27508119

  12. Development and implementation of (Q)SAR modeling within the CHARMMing Web-user interface

    PubMed Central

    Weidlich, Iwona E.; Pevzner, Yuri; Miller, Benjamin T.; Filippov, Igor V.; Woodcock, H. Lee; Brooks, Bernard R.

    2014-01-01

    Recent availability of large publicly accessible databases of chemical compounds and their biological activities (PubChem, ChEMBL) has inspired us to develop a Web-based tool for SAR and QSAR modeling to add to the services provided by CHARMMing (www.charmming.org). This new module implements some of the most recent advances in modern machine learning algorithms: Random Forest, Support Vector Machine (SVM), Stochastic Gradient Descent, Gradient Tree Boosting, etc. A user can import training data from PubChem Bioassay data collections directly from our interface or upload his or her own SD files which contain structures and activity information to create new models (either categorical or numerical). A user can then track the model generation process and run models on new data to predict activity. PMID:25362883

  13. Overview of selected molecular biological databases

    SciTech Connect

    Rayl, K.D.; Gaasterland, T.

    1994-11-01

    This paper presents an overview of the purpose, content, and design of a subset of the currently available biological databases, with an emphasis on protein databases. Databases included in this summary are 3D-ALI, Berlin RNA databank, Blocks, DSSP, EMBL Nucleotide Database, EMP, ENZYME, FSSP, GDB, GenBank, HSSP, LiMB, PDB, PIR, PKCDD, ProSite, and SWISS-PROT. The goal is to provide a starting point for researchers who wish to take advantage of the myriad available databases. Rather than providing a complete explanation of each database, we present its content and form by explaining the details of typical entries. Pointers to more complete "user guides" are included, along with general information on where to search for a new database.

  14. High angular resolution slope measuring deflectometry for the characterization of ultra-precise reflective x-ray optics

    NASA Astrophysics Data System (ADS)

    Siewert, F.; Buchheim, J.; Höft, T.; Fiedler, S.; Bourenkov, G.; Cianci, M.; Signorato, R.

    2012-07-01

    Slope measuring deflectometry has become a standard technique for inspection of ultra-precise reflective optical elements for synchrotron applications. We report on the inspection of ultra-precise adaptive synchrotron mirrors (bimorph mirrors) to be used under grazing incidence conditions. The measurements were performed at the BESSY-II Optics Laboratory of the Helmholtz Zentrum Berlin using the nanometer optical component measuring machine (NOM). Based on the data obtained by the optical measurements, we simulate in this paper the characteristics of the achievable x-ray focus by ray tracing calculations, demonstrated for the bimorph mirrors of the EMBL MX1 beamline for macromolecular crystallography at DESY's synchrotron radiation source PETRA III in Hamburg.

  15. Genome Sequencing and Annotation of Mycobacterium tuberculosis PR08 strain.

    PubMed

    Jaafar, Mohammad Maaruf; Halim, Mohd Zakihalani A; Ismail, Mohamad Izwan; Shien, Lee Lian; Kek, Teh Lay; Fong, Ngeow Yun; Nor, Norazmi Mohd; Zainuddin, Zainul Fadziruddin; Hock, Tang Thean; Najimudin, Mohd Nazalan Mohd; Salleh, Mohd Zaki

    2016-03-01

    Mycobacterium tuberculosis is an acid-fast bacterial species in the family Mycobacteriaceae and is the causative agent of most cases of tuberculosis. Here, we report the genomic features of Mycobacterium tuberculosis isolated from the cerebrospinal fluid (CSF) of a patient diagnosed with both pulmonary and extrapulmonary tuberculosis (TB). The isolated strain was identified as Mycobacterium tuberculosis PR08 (MTB PR08). Genomic DNA of the MTB PR08 strain was extracted and subjected to whole genome sequencing using MiSeq (Illumina, CA, USA). The draft genome size of MTB PR08 strain is 4,292,364 bp with a G + C content of 65.2%. This strain was annotated to have 4723 genes and 48 RNAs. This whole genome shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession number CP010895. PMID:26981383

  16. Genome sequence and description of Corynebacterium ihumii sp. nov.

    PubMed Central

    Padmanabhan, Roshan; Dubourg, Grégory; Lagier, Jean-Christophe; Couderc, Carine; Michelle, Caroline; Raoult, Didier; Fournier, Pierre-Edouard

    2014-01-01

    Corynebacterium ihumii strain GD7T sp. nov. is proposed as the type strain of a new species, which belongs to the family Corynebacteriaceae of the class Actinobacteria. This strain was isolated from the fecal flora of a 62-year-old male patient, as part of a culturomics study. Corynebacterium ihumii is a Gram-positive, facultatively anaerobic, nonsporulating bacillus. Here, we describe the features of this organism, together with the high quality draft genome sequence, annotation and the comparison with other members of the genus Corynebacterium. The C. ihumii genome is 2,232,265 bp long (one chromosome but no plasmid) containing 2,125 protein-coding and 53 RNA genes, including 4 rRNA genes. The whole-genome shotgun sequence of Corynebacterium ihumii strain GD7T sp. nov. has been deposited in EMBL under accession number GCA_000403725. PMID:25197488

  17. Computing fuzzy associations for the analysis of biological literature.

    PubMed

    Perez-Iratxeta, Carolina; Keer, Harindar S; Bork, Peer; Andrade, Miguel A

    2002-06-01

    The increase of information in biology makes it difficult for researchers in any field to keep current with the literature. The MEDLINE database of scientific abstracts can be quickly scanned using electronic mechanisms. Potentially interesting abstracts can be selected by matching words joined by Boolean operators. However, this means of selecting documents is not optimal. Nonspecific queries have to be effected, resulting in large numbers of irrelevant abstracts that have to be manually scanned. To facilitate this analysis, we have developed a system that compiles a summary of subjects and related documents on the results of a MEDLINE query. For this, we have applied a fuzzy binary relation formalism that deduces relations between words present in a set of abstracts preprocessed with a standard grammatical tagger. Those relations are used to derive ensembles of related words and their associated subsets of abstracts. The algorithm can be used publicly at http://www.bork.embl-heidelberg.de/xplormed/. PMID:12074170

  18. To Hit or Not to Hit, That Is the Question - Genome-wide Structure-Based Druggability Predictions for Pseudomonas aeruginosa Proteins.

    PubMed

    Sarkar, Aurijit; Brenk, Ruth

    2015-01-01

    Pseudomonas aeruginosa is a Gram-negative bacterium known to cause opportunistic infections in immune-compromised or immunosuppressed individuals that often prove fatal. New drugs to combat this organism are therefore sought after. To this end, we subjected the gene products of predicted perturbative genes to structure-based druggability predictions using DrugPred. Making this approach suitable for large-scale predictions required the introduction of new methods for calculation of descriptors, development of a workflow to identify suitable pockets in homologous proteins and establishment of criteria to obtain valid druggability predictions based on homologs. We were able to identify 29 perturbative proteins of P. aeruginosa that may contain druggable pockets, including some with no inhibitors, or no drug-like inhibitors, deposited in ChEMBL. These proteins form promising novel targets for drug discovery against P. aeruginosa. PMID:26360059

  19. SENTRA, a database of signal transduction proteins.

    SciTech Connect

    D'Souza, M.; Romine, M. F.; Maltsev, N.; Mathematics and Computer Science; PNNL

    2000-01-01

    SENTRA, available via URL http://wit.mcs.anl.gov/WIT2/Sentra/, is a database of proteins associated with microbial signal transduction. The database currently includes the classical two-component signal transduction pathway proteins and methyl-accepting chemotaxis proteins, but will be expanded to also include other classes of signal transduction systems that are modulated by phosphorylation or methylation reactions. Although the majority of database entries are from prokaryotic systems, eukaryotic proteins with bacterial-like signal transduction domains are also included. Currently SENTRA contains signal transduction proteins in 34 complete and almost completely sequenced prokaryotic genomes, as well as sequences from 243 organisms available in public databases (SWISS-PROT and EMBL). The analysis was carried out within the framework of the WIT2 system, which is designed and implemented to support genetic sequence analysis and comparative analysis of sequenced genomes.

  20. Sentra, a database of signal transduction proteins.

    SciTech Connect

    Maltsev, N.; Marland, E.; Yu, G. X.; Bhatnagar, S.; Lusk, R.; Mathematics and Computer Science

    2002-01-01

    Sentra (http://www-wit.mcs.anl.gov/sentra) is a database of signal transduction proteins with the emphasis on microbial signal transduction. The database was updated to include classes of signal transduction systems modulated by either phosphorylation or methylation reactions such as PAS proteins and serine/threonine kinases, as well as the classical two-component histidine kinases and methyl-accepting chemotaxis proteins. Currently, Sentra contains signal transduction proteins from 43 completely sequenced prokaryotic genomes as well as sequences from SWISS-PROT and TrEMBL. Signal transduction proteins are annotated with information describing conserved domains, paralogous and orthologous sequences, and conserved chromosomal gene clusters. The newly developed user interface supports flexible search capabilities and extensive visualization of the data.

  1. The SWISS-PROT protein sequence data bank and its new supplement TREMBL.

    PubMed Central

    Bairoch, A; Apweiler, R

    1996-01-01

    SWISS-PROT is a curated protein sequence database which strives to provide a high level of annotation (such as the description of the function of a protein, its domain structure, post-translational modifications, variants, etc.), a minimal level of redundancy and a high level of integration with other databases. Recent developments of the database include: an increase in the number and scope of model organisms; cross-references to seven additional databases; a variety of new documentation files; the creation of TREMBL, an unannotated supplement to SWISS-PROT. This supplement consists of entries in SWISS-PROT-like format derived from the translation of all coding sequences (CDS) in the EMBL nucleotide sequence database, except CDS already included in SWISS-PROT. PMID:8594581

  2. Draft genome sequence of strain MC1A, a UV-resistant bacterium isolated from dry soil in Puerto Rico

    PubMed Central

    Cuebas-Irizarry, Mara F.; Pietri-Toro, Jariselle M.; Montalvo-Rodríguez, Rafael

    2016-01-01

    We report here the draft genome sequence of a novel UV-resistant bacterium isolated from dry soil on the south coast of Puerto Rico. Based on polyphasic taxonomy, strain MC1A represents a new species and the name Solirubrum puertoriconensis is proposed. Assembly was performed using NGEN Assembler into eight contigs (N50 = 1,292,788), the largest of which included 1,549,887 bp. The draft genome consists of 4,810,875 bp and has a GC content of 58.7%. Several genes related to DNA repair and UV resistance were found. The Whole Genome Shotgun project is available at DDBJ/EMBL/GenBank under the accession LNAL00000000. PMID:26981418

  3. The complete mitochondrial genome of the Antarctic stalked jellyfish, Haliclystus antarcticus Pfeffer, 1889 (Staurozoa: Stauromedusae).

    PubMed

    Li, Hsing-Hui; Sung, Ping-Jyun; Ho, Hsuan-Ching

    2016-06-01

    In the present study, the complete mitogenome of the Antarctic stalked jellyfish, Haliclystus antarcticus Pfeffer (Staurozoa: Stauromedusae), was sequenced by a next-generation sequencing method. The assembled mitogenome comprises 15,766 bp, including 13 protein-coding genes, 7 transfer RNAs, and 2 ribosomal RNA genes. The overall base composition of the Antarctic stalked jellyfish mitogenome is 26.5% A, 19.6% C, 19.8% G and 34.1% T, and it shows 90% identity to that of the sessile jelly, Haliclystus sanjuanensis, in the northeastern Pacific Ocean. The complete mitogenome of the Antarctic stalked jellyfish contributes fundamental and significant DNA molecular data for further phylogeography and evolutionary analysis for seahorse phylogeny. The complete sequence was deposited in DDBJ/EMBL/GenBank under accession number KU947038. PMID:27222813
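
    The four base percentages reported above sum to 100% and imply a G + C content of 19.6% + 19.8% = 39.4%. A minimal sketch of how such a composition can be computed from a sequence is given below; the function names and the toy sequence are illustrative assumptions, not the authors' pipeline.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, C, G and T in seq.

    Characters other than A, C, G and T (for example N) are ignored.
    """
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def gc_content(seq):
    """Return the G + C content of seq as a percentage."""
    composition = base_composition(seq)
    return composition["G"] + composition["C"]


# Hypothetical toy sequence (not the H. antarcticus mitogenome).
toy = "AACGTTTT"
print(base_composition(toy))  # {'A': 25.0, 'C': 12.5, 'G': 12.5, 'T': 50.0}
print(gc_content(toy))        # 25.0
```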

  4. High-quality draft genome sequence of Enterobacter sp. Bisph2, a glyphosate-degrading bacterium isolated from a sandy soil of Biskra, Algeria

    PubMed Central

    Benslama, Ouided; Boulahrouf, Abderrahmane

    2016-01-01

    Enterobacter sp. strain Bisph2 was isolated from a sandy soil from Biskra, Algeria and exhibits glyphosate-degrading activity. Multilocus sequence analysis of the 16S rRNA, rpoB, hsp60, gyrB and dnaJ genes demonstrated that Bisph2 might be a member of a new species of the genus Enterobacter. Genomic sequencing of Bisph2 was used to better clarify the relationships among Enterobacter species. Annotation and analysis of the genome sequence showed that the 5,535,656 bp genome of Enterobacter sp. Bisph2 consists of one chromosome and no detectable plasmid, has a 53.19% GC content and 78% of genes were assigned a putative function. The genome contains four prophage regions, of which three are intact, and no CRISPR was detected. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession JXAF00000000. PMID:27222800

  5. Draft genome sequence of Bacillus okhensis Kh10-101T, a halo-alkali tolerant bacterium from Indian saltpan.

    PubMed

    Krishna, Pilla Sankara; Sreenivas, Ara; Singh, Deepak Kumar; Shivaji, Sisinthy; Prakash, Jogadhenu S S

    2015-12-01

    We report the 4.86-Mb draft genome sequence of Bacillus okhensis strain Kh10-101T, a halo-alkali tolerant rod-shaped bacterium isolated from a salt pan near the port of Okha, India. This bacterium is a potential model to study the molecular response of bacteria to salt as well as alkaline stress, as it thrives under both high salt and high pH conditions. The draft genome consists of 4,865,284 bp with 38.2% G + C, 4952 predicted CDS, 157 tRNAs and 8 rRNAs. The sequence was deposited at DDBJ/EMBL/GenBank under the project accession JRJU00000000. PMID:26697400

  6. Draft genome sequence of the Algerian bee Apis mellifera intermissa.

    PubMed

    Haddad, Nizar Jamal; Loucif-Ayad, Wahida; Adjlane, Noureddine; Saini, Deepti; Manchiganti, Rushiraj; Krishnamurthy, Venkatesh; AlShagoor, Banan; Batainh, Ahmed Mahmud; Mugasimangalam, Raja

    2015-06-01

    Apis mellifera intermissa is the native honeybee subspecies of Algeria. A. m. intermissa occurs in Tunisia, Algeria and Morocco, between the Atlas and the Mediterranean and Atlantic coasts. This bee is important because of its strong ability to adapt to wide variations in climatic conditions and its favorable cleaning behavior. Here we report the draft genome sequence of this honey bee; its Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession JSUV00000000. The 240-Mb genome is being annotated and analyzed. Comparison with the genome of other Apis mellifera sub-species promises to yield insights into the evolution of adaptations to high temperature and resistance to Varroa parasite infestation. PMID:26484171

  7. Isolation and characterization of the human tyrosine hydroxylase gene: identification of 5' alternative splice sites responsible for multiple mRNAs

    SciTech Connect

    O'Malley, K.L.; Anhalt, M.J.; Martin, B.M.; Kelsoe, J.R.; Winfield, S.L.; Ginns, E.I.

    1987-11-03

    A full-length genomic clone for human tyrosine hydroxylase (L-tyrosine, tetrahydropteridine:oxygen oxidoreductase, EC 1.14.16.2) has been isolated. A human brain genomic library constructed in EMBL3 was screened by using a rat cDNA for tyrosine hydroxylase as a probe. Out of one million recombinant phage, one clone was identified that hybridized to both 5' and 3' rat cDNA probes. Restriction endonuclease mapping, Southern blotting, and sequence analysis revealed that, like its rodent counterpart, the human gene is single copy, contains 13 primary exons, and spans approximately 8 kilobases (kb). In contrast to the rat gene, human tyrosine hydroxylase undergoes alternative RNA processing within intron 1, generating at least three distinct mRNAs. A comparison of the human tyrosine hydroxylase and phenylalanine hydroxylase genes indicates that although both probably evolved from a common ancestral gene, major changes in the size of introns have occurred since their divergence.

  8. EXProt: a database for proteins with an experimentally verified function.

    PubMed

    Ursing, Björn M; van Enckevort, Frank H J; Leunissen, Jack A M; Siezen, Roland J

    2002-01-01

    EXProt is a non-redundant protein database containing a selection of entries from genome annotation projects and public databases, aimed at including only proteins with an experimentally verified function. In EXProt release 2.0 we have collected entries from the Pseudomonas aeruginosa community annotation project (PseudoCAP), the Escherichia coli genome and proteome database (GenProtEC) and the translated coding sequences from the Prokaryotes division of EMBL nucleotide sequence database, which are described as having an experimentally verified function. Each entry in EXProt has a unique ID number and contains information about the species, amino acid sequence, functional annotation and, in most cases, links to references in MEDLINE/PubMed and to the entry in the original database. EXProt is indexed in SRS at CMBI (http://www.cmbi.kun.nl/srs/) and can be searched with BLAST and FASTA through the EXProt web page (http://www.cmbi.kun.nl/EXProt/). PMID:11752251

  9. High-quality draft genome sequence of Enterobacter sp. Bisph2, a glyphosate-degrading bacterium isolated from a sandy soil of Biskra, Algeria.

    PubMed

    Benslama, Ouided; Boulahrouf, Abderrahmane

    2016-06-01

    Enterobacter sp. strain Bisph2 was isolated from a sandy soil from Biskra, Algeria and exhibits glyphosate-degrading activity. Multilocus sequence analysis of the 16S rRNA, rpoB, hsp60, gyrB and dnaJ genes demonstrated that Bisph2 might be a member of a new species of the genus Enterobacter. Genomic sequencing of Bisph2 was used to better clarify the relationships among Enterobacter species. Annotation and analysis of the genome sequence showed that the 5,535,656 bp genome of Enterobacter sp. Bisph2 consists of one chromosome and no detectable plasmid, has a 53.19% GC content and 78% of genes were assigned a putative function. The genome contains four prophage regions, of which three are intact, and no CRISPR was detected. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession JXAF00000000. PMID:27222800

  10. Reassessing domain architecture evolution of metazoan proteins: major impact of gene prediction errors.

    PubMed

    Nagy, Alinda; Szláma, György; Szarka, Eszter; Trexler, Mária; Bányai, László; Patthy, László

    2011-01-01

    In view of the fact that appearance of novel protein domain architectures (DA) is closely associated with biological innovations, there is a growing interest in the genome-scale reconstruction of the evolutionary history of the domain architectures of multidomain proteins. In such analyses, however, it is usually ignored that a significant proportion of Metazoan sequences analyzed is mispredicted and that this may seriously affect the validity of the conclusions. To estimate the contribution of errors in gene prediction to differences in DA of predicted proteins, we have used the high quality manually curated UniProtKB/Swiss-Prot database as a reference. For genome-scale analysis of domain architectures of predicted proteins we focused on RefSeq, EnsEMBL and NCBI's GNOMON predicted sequences of Metazoan species with completely sequenced genomes. Comparison of the DA of UniProtKB/Swiss-Prot sequences of worm, fly, zebrafish, frog, chick, mouse, rat and orangutan with those of human Swiss-Prot entries have identified relatively few cases where orthologs had different DA, although the percentage with different DA increased with evolutionary distance. In contrast with this, comparison of the DA of human, orangutan, rat, mouse, chicken, frog, zebrafish, worm and fly RefSeq, EnsEMBL and NCBI's GNOMON predicted protein sequences with those of the corresponding/orthologous human Swiss-Prot entries identified a significantly higher proportion of domain architecture differences than in the case of the comparison of Swiss-Prot entries. Analysis of RefSeq, EnsEMBL and NCBI's GNOMON predicted protein sequences with DAs different from those of their Swiss-Prot orthologs confirmed that the higher rate of domain architecture differences is due to errors in gene prediction, the majority of which could be corrected with our FixPred protocol. We have also demonstrated that contamination of databases with incomplete, abnormal or mispredicted sequences introduces a bias in DA

  11. High quality, small molecule-activity datasets for kinase research

    PubMed Central

    Sharma, Rajan; Schürer, Stephan C.; Muskal, Steven M.

    2016-01-01

    Kinases regulate cell growth, movement, and death. Deregulated kinase activity is a frequent cause of disease. The therapeutic potential of kinase inhibitors has led to large amounts of published structure activity relationship (SAR) data. Bioactivity databases such as the Kinase Knowledgebase (KKB), WOMBAT, GOSTAR, and ChEMBL provide researchers with quantitative data characterizing the activity of compounds across many biological assays. The KKB, for example, contains over 1.8M kinase structure-activity data points reported in peer-reviewed journals and patents. In the spirit of fostering methods development and validation worldwide, we have extracted and have made available from the KKB 258K structure activity data points and 76K associated unique chemical structures across eight kinase targets. These data are freely available for download within this data note. PMID:27429748

  12. The Characterization, Expression and in Silico Studies on the SLC39A13 Gene; It's Involvement in Breast Cancer

    NASA Astrophysics Data System (ADS)

    Zahari, Normawati Mohamad; Chong, Teoh Teow

    The zinc transporter superfamily is divided into four subfamilies, one of which is SLC39A. The SLC39A subfamily has 9 members. Based on our computer searches, each of the 9 sequences contains 8 transmembrane domains. Because it belongs to the zinc transporter superfamily, the SLC39A subfamily may share the same function, namely the transport of zinc ions. This paper focuses on SLC39A13. Using recombinant technology with CHO cells, we show that the recombinant protein, pcDNA5/FRT/V5-His-TOPO®-SLC39A13, has a molecular weight of 43 kD. A second study, using an immunofluorescence technique with MCF-7 cells, shows that the recombinant protein is expressed intracellularly. Both studies demonstrate that SLC39A13 is expressed in a breast cancer cell line, and that the gene is therefore involved in the development of breast cancer. Our computational studies, divided into a homology study and a sequence analysis, both support our laboratory results. The homology study using EMBL-EBI and UniProt tools concluded that SLC39A13 is a member of the SLC39A subfamily and is closely related to SLC39A7. Although the sequence analysis gives a molecular weight of 38.35 kD for SLC39A13, this is still comparable to our laboratory result. Separately, the Swiss EMBnet tool TMpred showed that SLC39A13 has 8 transmembrane domains, like other members of the SLC39A subfamily. Another analysis, using the EMBL-EBI tool PPsearch, shows that SLC39A13 has various protein motifs, such as protein kinase C phosphorylation, casein kinase II phosphorylation, leucine zipper and ASN-glycosylation sites. This information will be useful when we study its tertiary structure and simulation in the future.

  13. Chemical, target, and bioactive properties of allosteric modulation.

    PubMed

    van Westen, Gerard J P; Gaulton, Anna; Overington, John P

    2014-04-01

    Allosteric modulators are ligands for proteins that exert their effects via a different binding site than the natural (orthosteric) ligand site and hence form a conceptually distinct class of ligands for a target of interest. Here, the physicochemical and structural features of a large set of allosteric and non-allosteric ligands from the ChEMBL database of bioactive molecules are analyzed. In general allosteric modulators are relatively smaller, more lipophilic and more rigid compounds, though large differences exist between different targets and target classes. Furthermore, there are differences in the distribution of targets that bind these allosteric modulators. Allosteric modulators are over-represented in membrane receptors, ligand-gated ion channels and nuclear receptor targets, but are underrepresented in enzymes (primarily proteases and kinases). Moreover, allosteric modulators tend to bind to their targets with a slightly lower potency (5.96 log units versus 6.66 log units, p<0.01). However, this lower absolute affinity is compensated by their lower molecular weight and more lipophilic nature, leading to similar binding efficiency and surface efficiency indices. Subsequently a series of classifier models are trained, initially target class independent models followed by finer-grained target (architecture/functional class) based models using the target hierarchy of the ChEMBL database. Applications of these insights include the selection of likely allosteric modulators from existing compound collections, the design of novel chemical libraries biased towards allosteric regulators and the selection of targets potentially likely to yield allosteric modulators on screening. All data sets used in the paper are available for download. PMID:24699297
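
    The compensation described above (lower potency offset by lower molecular weight) can be illustrated with a minimal sketch. It assumes the commonly used definitions of the binding efficiency index (potency in log units divided by molecular weight in kDa) and the surface efficiency index (potency divided by polar surface area in units of 100 square angstroms); the abstract does not spell these formulas out. The two mean potencies come from the abstract, whereas the molecular weights and all function names are hypothetical and chosen only for illustration.

```python
def binding_efficiency_index(p_activity: float, mol_weight_da: float) -> float:
    """Return BEI: potency in log units divided by molecular weight in kDa."""
    if mol_weight_da <= 0:
        raise ValueError("molecular weight must be positive")
    return p_activity / (mol_weight_da / 1000.0)


def surface_efficiency_index(p_activity: float, polar_surface_area: float) -> float:
    """Return SEI: potency in log units divided by (polar surface area / 100)."""
    if polar_surface_area <= 0:
        raise ValueError("polar surface area must be positive")
    return p_activity / (polar_surface_area / 100.0)


# Mean potencies (log units) as quoted in the abstract.
ALLOSTERIC_POTENCY = 5.96
NON_ALLOSTERIC_POTENCY = 6.66

# Hypothetical molecular weights in Da; NOT taken from the paper.
ALLOSTERIC_MW = 340.0
NON_ALLOSTERIC_MW = 380.0

if __name__ == "__main__":
    bei_allo = binding_efficiency_index(ALLOSTERIC_POTENCY, ALLOSTERIC_MW)
    bei_non = binding_efficiency_index(NON_ALLOSTERIC_POTENCY, NON_ALLOSTERIC_MW)
    print(f"BEI allosteric:     {bei_allo:.2f}")
    print(f"BEI non-allosteric: {bei_non:.2f}")
```

    With these illustrative weights the two indices come out nearly equal, which is the kind of effect the abstract describes; real values depend on the actual compound sets.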

  14. Helix Nebula and CERN: A Symbiotic approach to exploiting commercial clouds

    NASA Astrophysics Data System (ADS)

    Barreiro Megino, Fernando H.; Jones, Robert; Kucharczyk, Katarzyna; Medrano Llamas, Ramón; van der Ster, Daniel

    2014-06-01

    The recent paradigm shift toward cloud computing in IT, and general interest in "Big Data" in particular, have demonstrated that the computing requirements of HEP are no longer globally unique. Indeed, the CERN IT department and LHC experiments have already made significant R&D investments in delivering and exploiting cloud computing resources. While a number of technical evaluations of interesting commercial offerings from global IT enterprises have been performed by various physics labs, further technical, security, sociological, and legal issues need to be addressed before their large-scale adoption by the research community can be envisaged. Helix Nebula - the Science Cloud is an initiative that explores these questions by joining the forces of three European research institutes (CERN, ESA and EMBL) with leading European commercial IT enterprises. The goals of Helix Nebula are to establish a cloud platform federating multiple commercial cloud providers, along with new business models, which can sustain the cloud marketplace for years to come. This contribution will summarize the participation of CERN in Helix Nebula. We will explain CERN's flagship use-case and the model used to integrate several cloud providers with an LHC experiment's workload management system. During the first proof of concept, this project contributed over 40,000 CPU-days of Monte Carlo production throughput to the ATLAS experiment with marginal manpower required. CERN's experience, together with that of ESA and EMBL, is providing great insight into the cloud computing industry and has highlighted several challenges that are being tackled in order to ease the export of scientific workloads to cloud environments.

  15. QSAR Modeling Using Large-Scale Databases: Case Study for HIV-1 Reverse Transcriptase Inhibitors.

    PubMed

    Tarasova, Olga A; Urusova, Aleksandra F; Filimonov, Dmitry A; Nicklaus, Marc C; Zakharov, Alexey V; Poroikov, Vladimir V

    2015-07-27

    Large-scale databases are important sources of training sets for various QSAR modeling approaches. Generally, these databases contain information extracted from different sources. This variety of sources can produce inconsistency in the data, defined as sometimes widely diverging activity results for the same compound against the same target. Because such inconsistency can reduce the accuracy of predictive models built from these data, we are addressing the question of how best to use data from publicly and commercially accessible databases to create accurate and predictive QSAR models. We investigate the suitability of commercially and publicly available databases to QSAR modeling of antiviral activity (HIV-1 reverse transcriptase (RT) inhibition). We present several methods for the creation of modeling (i.e., training and test) sets from two, either commercially or freely available, databases: Thomson Reuters Integrity and ChEMBL. We found that the typical predictivities of QSAR models obtained using these different modeling set compilation methods differ significantly from each other. The best results were obtained using training sets compiled for compounds tested using only one method and material (i.e., a specific type of biological assay). Compound sets aggregated by target only typically yielded poorly predictive models. We discuss the possibility of "mix-and-matching" assay data across aggregating databases such as ChEMBL and Integrity and their current severe limitations for this purpose. One of them is the general lack of complete and semantic/computer-parsable descriptions of assay methodology carried by these databases that would allow one to determine mix-and-matchability of result sets at the assay level. PMID:26046311
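
    The two data-handling ideas in this abstract, detecting widely diverging results for the same compound and target, and compiling modeling sets per assay rather than per target, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the record layout, the one-log-unit threshold, the function names and the example records are all hypothetical.

```python
from collections import defaultdict

# Each record: (compound_id, target, assay_id, activity in log units).
# These example records are invented for illustration only.
EXAMPLE_RECORDS = [
    ("cmpd-1", "HIV-1 RT", "assay-A", 7.9),
    ("cmpd-1", "HIV-1 RT", "assay-B", 5.2),
    ("cmpd-2", "HIV-1 RT", "assay-A", 6.0),
    ("cmpd-2", "HIV-1 RT", "assay-B", 6.3),
]


def flag_inconsistent(records, max_spread=1.0):
    """Return {(compound, target): spread} where repeated measurements
    differ by more than max_spread log units (an assumed threshold)."""
    grouped = defaultdict(list)
    for compound, target, _assay, value in records:
        grouped[(compound, target)].append(value)
    flagged = {}
    for key, values in grouped.items():
        spread = max(values) - min(values)
        if len(values) > 1 and spread > max_spread:
            flagged[key] = spread
    return flagged


def split_by_assay(records):
    """Group records so that each modeling set contains a single assay."""
    by_assay = defaultdict(list)
    for record in records:
        by_assay[record[2]].append(record)
    return dict(by_assay)


if __name__ == "__main__":
    print(flag_inconsistent(EXAMPLE_RECORDS))
    print(sorted(split_by_assay(EXAMPLE_RECORDS)))
```

    Splitting by assay before modeling mirrors the abstract's finding that sets built from a single method and material gave the best results, while sets aggregated by target only tended to be poorly predictive.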

  16. Structures of two Arabidopsis thaliana major latex proteins represent novel helix-grip folds

    SciTech Connect

    Lytle, Betsy L.; Song, Jikui; de la Cruz, Norberto B.; Peterson, Francis C.; Johnson, Kenneth A.; Bingman, Craig A.; Phillips, Jr., George N.; Volkman, Brian F.

    2009-06-02

    Here we report the first structures of two major latex proteins (MLPs) which display unique structural differences from the canonical Bet v 1 fold described earlier. MLP28 (Swiss-Prot/TrEMBL ID Q9SSK9), the product of gene At1g70830.1, and the At1g24000.1 gene product (Swiss-Prot/TrEMBL ID P0C0B0), proteins which share 32% sequence identity, were independently selected as foldspace targets by the Center for Eukaryotic Structural Genomics. The structure of a single domain (residues 17-173) of MLP28 was solved by NMR spectroscopy, while the full-length At1g24000.1 structure was determined by X-ray crystallography. MLP28 displays greater than 30% sequence identity to at least eight MLPs from other species. For example, the MLP28 sequence shares 64% identity to peach Pp-MLP119 and 55% identity to cucumber Csf2. In contrast, the At1g24000.1 sequence is highly divergent (see Fig. 1), containing a gap of 33 amino acids when compared with all other known MLPs. Even when the gap is excluded, the sequence identity with MLPs from other species is less than 30%. Unlike some of the MLPs from other species, none of the A. thaliana MLPs have been characterized biochemically. We show by NMR chemical shift mapping that At1g24000.1 binds progesterone, demonstrating that despite its sequence dissimilarity, the hydrophobic binding pocket is conserved and, therefore, may play a role in its biological function and that of the MLP family in general.

  17. Next generation models for storage and representation of microbial biological annotation

    PubMed Central

    2010-01-01

    Background Traditional genome annotation systems were developed in a very different computing era, one where the World Wide Web was just emerging. Consequently, these systems are built as centralized black boxes focused on generating high quality annotation submissions to GenBank/EMBL supported by expert manual curation. The exponential growth of sequence data drives a growing need for increasingly higher quality and automatically generated annotation. Typical annotation pipelines utilize traditional database technologies, clustered computing resources, Perl, C, and UNIX file systems to process raw sequence data, identify genes, and predict and categorize gene function. These technologies tightly couple the annotation software system to hardware and third party software (e.g. relational database systems and schemas). This makes annotation systems hard to reproduce, inflexible to modification over time, difficult to assess, difficult to partition across multiple geographic sites, and difficult to understand for those who are not domain experts. These systems are not readily open to scrutiny and therefore not scientifically tractable. The advent of Semantic Web standards such as Resource Description Framework (RDF) and OWL Web Ontology Language (OWL) enables us to construct systems that address these challenges in a new comprehensive way. Results Here, we develop a framework for linking traditional data to OWL-based ontologies in genome annotation. We show how data standards can decouple hardware and third party software tools from annotation pipelines, thereby making annotation pipelines easier to reproduce and assess. An illustrative example shows how TURTLE (Terse RDF Triple Language) can be used as a human readable, but also semantically-aware, equivalent to GenBank/EMBL files. Conclusions The power of this approach lies in its ability to assemble annotation data from multiple databases across multiple locations into a representation that is understandable to

  18. Construction and analysis of an hn-cDNA library derived from the p-arm of pig chromosome 12.

    PubMed

    Anderson Dear, D V; Miller, J R

    1996-09-01

    Our aim is to find unidentified genes on specific pig chromosomes or chromosome fragments. Our approach has involved the construction of a heterogeneous nuclear complementary (hn-c) DNA library of the p-arm of pig Chromosome (Chr) 12, the only pig chromosome present in the pig x hamster hybrid cell line 8990. Total RNA was extracted from the cells and first-strand synthesis of hn-cDNA carried out with random and oligo dT primers. Pig hn-cDNA was isolated by amplification of first-strand synthesized hn-cDNA with primers specific for Short Interspersed Repeat Elements (SINEs) via the polymerase chain reaction (PCR). Hn-cDNAs were size selected and cloned in E. coli XL-1 blue cells with PCR-Script as the vector. The library consisted of 6000 clones. Clone inserts were amplified by PCR with vector-specific primers, and randomly picked inserts greater than 600 bp were sequenced. Homology searches were carried out with the FASTA search program on the GenEmbl database. Thirty clones were sequenced, and of these three showed strong homologies to GenEmbl sequences: (1) to sheep, mouse, human, and rat mammary gland factor (MGF); (2) to MLN-50, a gene that is amplified in human familial breast cancer and is present on human Chr 17; the latter is homologous to pig chromosome 12; (3) to a family of unassigned overlapping human ESTs. Of the other sequenced clones, seven were over 80% homologous with pig SINE sequences; three were over 75% homologous to human LINE sequences; six displayed open reading frames over a mean distance equivalent to 50 amino acids, although these showed no significant similarities with sequences in the databases. Using this approach, we have been able to identify several new genes on the p-arm of pig Chr 12. This is the first report of gene isolation from a library derived from a pig chromosome fragment. PMID:8703117

  19. The Universal Protein Resource (UniProt).

    PubMed

    Bairoch, Amos; Apweiler, Rolf; Wu, Cathy H; Barker, Winona C; Boeckmann, Brigitte; Ferro, Serenella; Gasteiger, Elisabeth; Huang, Hongzhan; Lopez, Rodrigo; Magrane, Michele; Martin, Maria J; Natale, Darren A; O'Donovan, Claire; Redaschi, Nicole; Yeh, Lai-Su L

    2005-01-01

    The Universal Protein Resource (UniProt) provides the scientific community with a single, centralized, authoritative resource for protein sequences and functional information. Formed by uniting the Swiss-Prot, TrEMBL and PIR protein database activities, the UniProt consortium produces three layers of protein sequence databases: the UniProt Archive (UniParc), the UniProt Knowledgebase (UniProt) and the UniProt Reference (UniRef) databases. The UniProt Knowledgebase is a comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase with extensive cross-references. This centrepiece consists of two sections: UniProt/Swiss-Prot, with fully, manually curated entries; and UniProt/TrEMBL, enriched with automated classification and annotation. During 2004, tens of thousands of Knowledgebase records got manually annotated or updated; we introduced a new comment line topic: TOXIC DOSE to store information on the acute toxicity of a toxin; the UniProt keyword list got augmented by additional keywords; we improved the documentation of the keywords and are continuously overhauling and standardizing the annotation of post-translational modifications. Furthermore, we introduced a new documentation file of the strains and their synonyms. Many new database cross-references were introduced and we started to make use of Digital Object Identifiers. We also achieved in collaboration with the Macromolecular Structure Database group at EBI an improved integration with structural databases by residue level mapping of sequences from the Protein Data Bank entries onto corresponding UniProt entries. For convenient sequence searches we provide the UniRef non-redundant sequence databases. The comprehensive UniParc database stores the complete body of publicly available protein sequence data. The UniProt databases can be accessed online (http://www.uniprot.org) or downloaded in several formats (ftp://ftp.uniprot.org/pub). New releases are published every two

  20. LISTA, LISTA-HOP and LISTA-HON: a comprehensive compilation of protein encoding sequences and its associated homology databases from the yeast Saccharomyces.

    PubMed Central

    Dölz, R; Mossé, M O; Slonimski, P P; Bairoch, A; Linder, P

    1996-01-01

    We continued our effort to make a comprehensive database (LISTA) for the yeast Saccharomyces cerevisiae. As in previous editions, the genetic names are consistently associated with each sequence with a known and confirmed ORF. If necessary, synonyms are given in the case of allelic duplicated sequences. Although the first publication of a sequence gives, according to our rules, the genetic name of a gene, in some instances more commonly used names are given to avoid nomenclature problems and the use of ancient designations which are no longer used. In these cases the old designation is given as a synonym. Thus sequences can be found either by the name or by synonyms given in LISTA. Each entry contains the genetic name, the mnemonic from the EMBL data bank, the codon bias, the reference of the publication of the sequence, the chromosomal location as far as known, and the SWISSPROT and EMBL accession numbers. New entries will also contain the name from the systematic sequencing efforts. Since the release of LISTA4.1 we update the database continuously. To obtain more information on the included sequences, each entry has been screened against non-redundant nucleotide and protein data bank collections, resulting in LISTA-HON and LISTA-HOP. This release includes reports from full Smith and Waterman peptide-level searches against a non-redundant protein sequence database. The LISTA database can be linked to the associated data sets or to nucleotide and protein banks by the Sequence Retrieval System (SRS). The database is available by FTP and on the World Wide Web. PMID:8594599

  1. The Universal Protein Resource (UniProt)

    PubMed Central

    Bairoch, Amos; Apweiler, Rolf; Wu, Cathy H.; Barker, Winona C.; Boeckmann, Brigitte; Ferro, Serenella; Gasteiger, Elisabeth; Huang, Hongzhan; Lopez, Rodrigo; Magrane, Michele; Martin, Maria J.; Natale, Darren A.; O'Donovan, Claire; Redaschi, Nicole; Yeh, Lai-Su L.

    2005-01-01

    The Universal Protein Resource (UniProt) provides the scientific community with a single, centralized, authoritative resource for protein sequences and functional information. Formed by uniting the Swiss-Prot, TrEMBL and PIR protein database activities, the UniProt consortium produces three layers of protein sequence databases: the UniProt Archive (UniParc), the UniProt Knowledgebase (UniProt) and the UniProt Reference (UniRef) databases. The UniProt Knowledgebase is a comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase with extensive cross-references. This centrepiece consists of two sections: UniProt/Swiss-Prot, with fully, manually curated entries; and UniProt/TrEMBL, enriched with automated classification and annotation. During 2004, tens of thousands of Knowledgebase records got manually annotated or updated; we introduced a new comment line topic: TOXIC DOSE to store information on the acute toxicity of a toxin; the UniProt keyword list got augmented by additional keywords; we improved the documentation of the keywords and are continuously overhauling and standardizing the annotation of post-translational modifications. Furthermore, we introduced a new documentation file of the strains and their synonyms. Many new database cross-references were introduced and we started to make use of Digital Object Identifiers. We also achieved in collaboration with the Macromolecular Structure Database group at EBI an improved integration with structural databases by residue level mapping of sequences from the Protein Data Bank entries onto corresponding UniProt entries. For convenient sequence searches we provide the UniRef non-redundant sequence databases. The comprehensive UniParc database stores the complete body of publicly available protein sequence data. The UniProt databases can be accessed online (http://www.uniprot.org) or downloaded in several formats (ftp://ftp.uniprot.org/pub). New releases are published every two

  2. Splicy: a web-based tool for the prediction of possible alternative splicing events from Affymetrix probeset data

    PubMed Central

    Rambaldi, Davide; Felice, Barbara; Praz, Viviane; Bucher, Philip; Cittaro, Davide; Guffanti, Alessandro

    2007-01-01

    Background The Affymetrix™ technology is nowadays a well-established method for the analysis of gene expression profiles in cancer research studies. However, changes in gene expression levels are not the only way to link genes and disease. The existence of gene isoforms specifically linked with cancer or apoptosis is increasingly reported in the literature. Hence it is of great interest to associate the results of a gene expression study with updated evidence on the transcript structure and its possible variants. Results We present here a web-based software tool, Splicy, whose primary task is to retrieve data on the mapping of Affymetrix™ probes to single exons of gene transcripts and to display this information graphically, projected on the physical structure of the gene. Starting from a list of Affymetrix™ probesets the program produces a series of graphical displays, each relative to a transcript associated with the gene targeted by a given probe. The information on the transcript-by-transcript and exon-by-exon mapping of probe pairs can be retrieved both graphically and in the form of tab-separated files. The mapping of single probes to NCBI RefSeq or EMBL cDNAs is handled by the ISREC mapping tables used in the CleanEx Expression Reference Database Project. We currently maintain these mappings for most popular human and mouse Affymetrix™ chips, and Splicy can be queried for matches with human and mouse NCBI RefSeq or EMBL cDNAs. Conclusion Splicy generates probeset annotations and images describing the relation between the single probes and intron/exon structure of the target transcript in all its known variants. We think that Splicy will be useful for giving the researcher a clearer picture of the possible transcript variants linked with a given gene and an additional view on the interpretation of microarray experiment data. Splicy is publicly available and has been realized in the framework of a bioinformatics grant from the Italian Cancer Research Association

  3. Next Generation Models for Storage and Representation of Microbial Biological Annotation

    SciTech Connect

    Quest, Daniel J; Land, Miriam L; Brettin, Thomas S; Cottingham, Robert W

    2010-01-01

    Background Traditional genome annotation systems were developed in a very different computing era, one where the World Wide Web was just emerging. Consequently, these systems are built as centralized black boxes focused on generating high quality annotation submissions to GenBank/EMBL supported by expert manual curation. The exponential growth of sequence data drives a growing need for increasingly higher quality and automatically generated annotation. Typical annotation pipelines utilize traditional database technologies, clustered computing resources, Perl, C, and UNIX file systems to process raw sequence data, identify genes, and predict and categorize gene function. These technologies tightly couple the annotation software system to hardware and third party software (e.g. relational database systems and schemas). This makes annotation systems hard to reproduce, inflexible to modification over time, difficult to assess, difficult to partition across multiple geographic sites, and difficult to understand for those who are not domain experts. These systems are not readily open to scrutiny and therefore not scientifically tractable. The advent of Semantic Web standards such as Resource Description Framework (RDF) and OWL Web Ontology Language (OWL) enables us to construct systems that address these challenges in a new comprehensive way. Results Here, we develop a framework for linking traditional data to OWL-based ontologies in genome annotation. We show how data standards can decouple hardware and third party software tools from annotation pipelines, thereby making annotation pipelines easier to reproduce and assess. An illustrative example shows how TURTLE (Terse RDF Triple Language) can be used as a human readable, but also semantically-aware, equivalent to GenBank/EMBL files. Conclusions The power of this approach lies in its ability to assemble annotation data from multiple databases across multiple locations into a representation that is understandable to

  4. Stereochemical Recognition of Helicenes on Metal Surfaces.

    PubMed

    Ernst, Karl-Heinz

    2016-06-21

    molecular handedness from single molecules into extended two-dimensional supramolecular structures are identified. For the problem of racemate versus conglomerate crystallization, the impact of surface and molecular structure and their interplay are analyzed. This leads to detailed conclusions about the importance of the match of molecular and surface binding sites for long-range self-assembly. The absence of polar groups puts emphasis on van der Waals interaction and their maximization by steric overlap of molecular parts in enantiomeric and diastereomeric interactions. With STM as a manipulation tool, dimers are manually separated in order to analyze their chiral composition. And finally, new nonlinear cooperative effects induced by small enantiospecific bias are discovered that lead to single enantiomorphism in two-dimensional racemate crystals as well as in racemic multilayered films. By means of these model studies many details that govern chiral recognition at surfaces are rationalized. PMID:27251099

  5. Amino ketone formation and aminopropanol-dehydrogenase activity in rat-liver preparations

    PubMed Central

    Turner, J. M.; Willetts, A. J.

    1967-01-01

    1. Rat tissue homogenates convert DL-1-aminopropan-2-ol into aminoacetone. Liver homogenates have relatively high aminopropanol-dehydrogenase activity compared with kidney, heart, spleen and muscle preparations. 2. Maximum activity of liver homogenates is exhibited at pH 9·8. The Km for aminopropanol is approx. 15 mM, calculated for a single enantiomorph, and the maximum activity is approx. 9 mμmoles of aminoacetone formed/mg. wet wt. of liver/hr. at 37°. Aminoacetone is also formed from L-threonine, but less rapidly. An unidentified amino ketone is formed from DL-4-amino-3-hydroxybutyrate, the Km for which is approx. 200 mM at pH 9·8. 3. Aminopropanol-dehydrogenase activity in homogenates is inhibited non-competitively by DL-3-hydroxybutyrate, the Ki being approx. 200 mM. EDTA and other chelating agents are weakly inhibitory, and whereas potassium chloride activates slightly at low concentrations, inhibition occurs at 50–100 mM. 4. It is concluded that aminopropanol-dehydrogenase is located in mitochondria, and in contrast with L-threonine dehydrogenase can be readily solubilized from mitochondrial preparations by ultrasonic treatment. 5. Soluble extracts of disintegrated mitochondria exhibit maximum aminopropanol-dehydrogenase activity at pH 9·1. At this pH, Km values for the amino alcohol and NAD+ are approx. 200 and 1·3 mM respectively. Under optimum conditions the maximum velocity is approx. 70 mμmoles of aminoacetone formed/mg. of protein/hr. at 37°. Chelating agents and thiol reagents appear to have little effect on enzyme activity, but potassium chloride inhibits at all concentrations tested up to 80 mM. DL-3-Hydroxybutyrate is only slightly inhibitory. 6. Dehydrogenase activities for L-threonine and DL-4-amino-3-hydroxybutyrate appear to be distinct from that for aminopropanol. 7. Intraperitoneal injection of aminopropanol into rats leads to excretion of aminoacetone in the urine. Aminoacetone excretion proportional to the amount of the amino alcohol
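
    The kinetic constants quoted for liver homogenates in this abstract can be combined in a short sketch. It assumes simple Michaelis-Menten behaviour and classical (pure) non-competitive inhibition, in which the inhibitor scales the maximum velocity by 1/(1 + [I]/Ki) and leaves Km unchanged; the abstract states only that the inhibition is non-competitive, so this rate law is an assumption. The constants (Km about 15 millimolar, maximum activity about 9 millimicromoles per mg wet weight per hour, Ki about 200 millimolar) are taken from the abstract; the function name and the chosen concentrations are hypothetical.

```python
def rate_noncompetitive(substrate_mm: float, inhibitor_mm: float = 0.0,
                        vmax: float = 9.0, km_mm: float = 15.0,
                        ki_mm: float = 200.0) -> float:
    """Michaelis-Menten rate with pure non-competitive inhibition.

    v = Vmax * [S] / ((Km + [S]) * (1 + [I] / Ki))

    Concentrations are in millimolar; the result is in the units of vmax
    (here millimicromoles of aminoacetone/mg wet wt. of liver/hr).
    """
    if substrate_mm < 0 or inhibitor_mm < 0:
        raise ValueError("concentrations must be non-negative")
    return vmax * substrate_mm / ((km_mm + substrate_mm) * (1.0 + inhibitor_mm / ki_mm))


if __name__ == "__main__":
    # At [S] = Km the uninhibited rate is half of Vmax.
    print(rate_noncompetitive(15.0))
    # With inhibitor at [I] = Ki the rate is halved again.
    print(rate_noncompetitive(15.0, inhibitor_mm=200.0))
```

    Under this assumed rate law, an inhibitor concentration equal to Ki halves the rate at every substrate concentration, which is what distinguishes non-competitive from competitive inhibition.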

  6. Stepwise enforcement of the notochord and its intersection with the myoseptum: an evolutionary path leading to development of the vertebra?

    PubMed Central

    Grotmol, Sindre; Kryvi, Harald; Keynes, Roger; Krossøy, Christel; Nordvik, Kari; Totland, Geir K

    2006-01-01

    The notochord constitutes the main axial support during the embryonic and larval stages, and the arrangement of collagen fibrils within the notochord sheath is assumed to play a decisive role in determining its functional properties as a fibre-wound hydrostatic skeleton. We have found that during early ontogeny in Atlantic salmon stepwise changes occur in the configuration of the collagen fibre-winding of the notochord sheath. The sheath consists of a basal lamina, a layer of type II collagen, and an elastica externa that delimits the notochord; and these constituents are secreted in a specific order. Initially, the collagen fibrils are circumferentially arranged perpendicular to the longitudinal axis, and this specific spatial fibril configuration is maintained until hatching when the collagen becomes reorganized into distinct layers or lamellae. Within each lamella, fibrils are parallel to each other, forming helices around the longitudinal axis of the notochord, with a tangent angle of 75–80° to the cranio-caudal axis. The helical geometry shifts between adjacent lamellae, forming enantiomorphous left- and right-handed coils, respectively, thus enforcing the sheath. The observed changes in the fibre-winding configuration may reflect adaptation of the notochord to functional demands related to stage in ontogeny. When the vertebral bodies initially form as chordacentra, the collagen lamellae of the sheath in the vertebral region are fixed by the deposition of minerals; in the intervertebral region, however, they represent a pre-adaptation providing torsional stability to the intervertebral joint. Hence, these modifications of the sheath transform the notochord per se into a functional vertebral column. The elastica externa, encasing the notochord, has serrated surfaces, connected inward to the type II collagen of the sheath, and outward to type I collagen of the mesenchymal connective tissue surrounding the notochord. In a similar manner, the collagen matrix of

  7. Two highly connected POM-based hybrids varying from 2D to 3D: The use of the isomeric ligands

    SciTech Connect

    Zhang Chunjing; Pang Haijun; Hu Mixia; Li Jia; Chen Yaguang

    2009-07-15

    Through employing two isomeric ligands, isonicotinic acid (HINA) and nicotinic acid (HNA), with different electron delocalization nature, two high-dimensional hybrids based on highly connected alpha-metatungstate clusters, [Na{sub 2}(H{sub 2}O){sub 8}Ag{sub 2}(HINA){sub 3}(INA)][Na(H{sub 2}O){sub 2}Ag{sub 2}(HINA){sub 4}(H{sub 2}W{sub 12}O{sub 40})].2H{sub 2}O (1) and [Na{sub 2}(H{sub 2}O){sub 4}Ag{sub 6}(HNA){sub 2}(NA){sub 2}(H{sub 2}W{sub 12}O{sub 40})].8H{sub 2}O (2), have been conventionally synthesized and structurally characterized. 1 exhibits an unusual 1D-in-2D pseudo-polyrotaxane entangled structure, namely, the 2D sheets [Na(H{sub 2}O){sub 2}Ag{sub 2}(HINA){sub 4}(H{sub 2}W{sub 12}O{sub 40})]{sub n}{sup 3n-} are penetrated by enantiomorphous meso-helical chains [Na{sub 2}(H{sub 2}O){sub 8}Ag{sub 2}(HINA){sub 3}(INA)]{sub n}{sup 3n+}. In the 2D sheets, each [H{sub 2}W{sub 12}O{sub 40}]{sup 6-} cluster is surrounded by six Ag and two Na atoms. 2 exhibits a 3D (4, 6)-net structure with (3{sup 2}6{sup 2}7{sup 2})(3{sup 2}4{sup 4}5{sup 4}6{sup 4}7)(3{sup 2}4{sup 4}6{sup 8}7) topology, in which each [H{sub 2}W{sub 12}O{sub 40}]{sup 6-} cluster is connected with ten Ag atoms. These facts indicate that the isomeric ligands play a key role in the formation of the final structures. From 1 to 2, the connection number of the [H{sub 2}W{sub 12}O{sub 40}]{sup 6-} cluster changes from 8 to 10 and the dimensionality increases from 2 to 3. Moreover, 1 and 2 display photoluminescent properties in the blue range at room temperature. - Graphical abstract: Two high-dimensional and highly connected alpha-metatungstate compounds modified by Ag{sup I}-HINA/HNA TMCs were successfully obtained and the effect of isomeric organic ligands on the structures was systematically elucidated.

  8. Diastereoselectivity and molecular recognition in the self-assembly of double-stranded dinuclear metal complexes of the type [M2[(R,S)-tetraphos]2](PF6)2 (M = Ag and Au).

    PubMed

    Blake, Christopher J; Cook, Vernon C; Keniry, Max A; Kitto, Heather J; Rae, A David; Swiegers, Gerhard F; Willis, Anthony C; Zank, Johann; Wild, S Bruce

    2003-12-29

    The ligand (R,S)-Ph(2)PCH(2)CH(2)P(Ph)CH(2)CH(2)P(Ph)CH(2)CH(2)PPh(2), (R,S)-tetraphos, combines with silver(I) and gold(I) ions in the presence of hexafluorophosphate to diastereoselectively self-assemble the head-to-head (H,H) diastereomers of the double-stranded, dinuclear metal complexes [M(2)[(R,S)-tetraphos](2)](PF(6))(2) in which the two chiral metal centers in the complexes have M (R end of phosphine) and P (S end of phosphine) configurations. The crystal and molecular structures of the compounds have been determined: (H,H)-(M,P)-[Ag(2)[(R,S)-tetraphos](2)](PF(6))(2), monoclinic, P2(1)/c, a = 10.3784(2), b = 47.320(1), c = 17.3385(4) A, beta = 103.8963(5) degrees, Z = 4; (H,H)-(M,P)-[Au(2)[(R,S)-tetraphos](2)](PF(6))(2), monoclinic, P2(1) (No. 4, c unique axis), a = 24.385(4), b = 46.175(3), c = 14.820(4) A, Z = 8. The complexes crystallize as racemic compounds in which the unit cell in each case contains equal numbers of enantiomorphic molecules of the cation and associated anions. The cations in both structures have similar side-by-side structures of idealized C(2) symmetry, the bulk helicity of each molecule in the solid state being due solely to the twist of the central ten-membered ring containing the two metal ions of opposite configuration, which has the chiral twist-boat-chair-boat conformation. When 1 equiv each of (R,S)-tetraphos, (R,R)-(+/-)-tetraphos, (S,S)-(+)-tetraphos, 2 equiv of Ph(2)PCH(2)CH(2)PPh(2) (dppe), and 7 equiv of [AuCl(SMe(2))] in dichloromethane are allowed to react for several minutes in the presence of an excess of ammonium hexafluorophosphate in water (two phases), the products are the double-stranded digold(I) complexes in which each ligand strand has recognized itself by stereoselective self-assembly, together with [Au(dppe)(2)]PF(6). PMID:14686848

  9. Leading European Intergovernmental Research Organisations at FP6 Launch Conference

    NASA Astrophysics Data System (ADS)

    2002-11-01

    EIROforum at "European Research 2002" (Brussels, November 11-13, 2002). Last year, seven of Europe's leading intergovernmental research organisations set up a high-level co-ordination and collaboration group, known as EIROforum, cf. ESO PR 12/01. They include CERN (particle physics), EMBL (molecular biology), ESA (space activities), ESO (astronomy and astrophysics), ESRF (synchrotron radiation), ILL (neutron source) and EFDA (fusion). All of them have powerful research infrastructures and laboratories which are used by an extensive network of scientists. Together, they represent European spearheads in some of the most crucial basic and applied research fields. The EIROforum organisations will be highly visible at the upcoming EU-conference on "European Research 2002 - The European Research Area and the Framework Programme", to be held on November 11-13, 2002, at the "Palais du Heysel" in Brussels (Belgium). This meeting will be attended by more than 8000 scientists and decision-makers from all over Europe and serves to launch the 6th EC Framework Programme (2002 - 2006), which will have an important impact on Europe's R&D landscape during the coming years. A joint 400 sq.m. exhibition, featuring the individual EIROforum organisations, their current programmes and many front-line achievements in their respective areas of activity, will be set up at Stand L in Hall 11. It includes a central area, with a small cinema, displaying information about their current interactions via EIROforum. The stands will be manned throughout the conference by high-level representatives from the seven organisations. On Tuesday, November 12, 2002, 14:00 hrs, a Press Conference will take place at this exhibition stand, in the presence of the European Commissioner for Research, M. Philippe Busquin, and most of the Directors General (or equivalent) of the EIROforum organisations. The main themes will be the increasingly intense interaction and co

  10. Mapping chemical structure-activity information of HAART-drug cocktails over complex networks of AIDS epidemiology and socioeconomic data of U.S. counties.

    PubMed

    Herrera-Ibatá, Diana María; Pazos, Alejandro; Orbegozo-Medina, Ricardo Alfredo; Romero-Durán, Francisco Javier; González-Díaz, Humberto

    2015-06-01

    Using computational algorithms to design tailored drug cocktails for highly active antiretroviral therapy (HAART) on specific populations is a goal of major importance for both pharmaceutical industry and public health policy institutions. New combinations of compounds need to be predicted in order to design HAART cocktails. On the one hand, there are the biomolecular factors related to the drugs in the cocktail (experimental measure, chemical structure, drug target, assay organisms, etc.); on the other hand, there are the socioeconomic factors of the specific population (income inequalities, employment levels, fiscal pressure, education, migration, population structure, etc.) to study the relationship between the socioeconomic status and the disease. In this context, machine learning algorithms, able to seek models for problems with multi-source data, have to be used. In this work, the first artificial neural network (ANN) model is proposed for the prediction of HAART cocktails, to halt AIDS on epidemic networks of U.S. counties using information indices that codify both biomolecular and several socioeconomic factors. The data was obtained from at least three major sources. The first dataset included assays of anti-HIV chemical compounds released to ChEMBL. The second dataset is the AIDSVu database of Emory University. AIDSVu compiled AIDS prevalence for >2300 U.S. counties. The third data set included socioeconomic data from the U.S. Census Bureau. Three scales or levels were employed to group the counties according to the location or population structure codes: state, rural urban continuum code (RUCC) and urban influence code (UIC). An analysis of >130,000 pairs (network links) was performed, corresponding to AIDS prevalence in 2310 counties in U.S. vs. drug cocktails made up of combinations of ChEMBL results for 21,582 unique drugs, 9 viral or human protein targets, 4856 protocols, and 10 possible experimental measures. The best model found with the original

  11. A Multi-Platform Draft de novo Genome Assembly and Comparative Analysis for the Scarlet Macaw (Ara macao)

    PubMed Central

    Seabury, Christopher M.; Dowd, Scot E.; Seabury, Paul M.; Raudsepp, Terje; Brightsmith, Donald J.; Liboriussen, Poul; Halley, Yvette; Fisher, Colleen A.; Owens, Elaine; Viswanathan, Ganesh; Tizard, Ian R.

    2013-01-01

    Data deposition to NCBI Genomes This Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession AMXX00000000 (SMACv1.0, unscaffolded genome assembly). The version described in this paper is the first version (AMXX01000000). The scaffolded assembly (SMACv1.1) has been deposited at DDBJ/EMBL/GenBank under the accession AOUJ00000000, and is also the first version (AOUJ01000000). Strong biological interest in traits such as the acquisition and utilization of speech, cognitive abilities, and longevity catalyzed the utilization of two next-generation sequencing platforms to provide the first-draft de novo genome assembly for the large, new world parrot Ara macao (Scarlet Macaw). Despite the challenges associated with genome assembly for an outbred avian species, including 951,507 high-quality putative single nucleotide polymorphisms, the final genome assembly (>1.035 Gb) includes more than 997 Mb of unambiguous sequence data (excluding N’s). Cytogenetic analyses including ZooFISH revealed complex rearrangements associated with two scarlet macaw macrochromosomes (AMA6, AMA7), which supports the hypothesis that translocations, fusions, and intragenomic rearrangements are key factors associated with karyotype evolution among parrots. In silico annotation of the scarlet macaw genome provided robust evidence for 14,405 nuclear gene annotation models, their predicted transcripts and proteins, and a complete mitochondrial genome. Comparative analyses involving the scarlet macaw, chicken, and zebra finch genomes revealed high levels of nucleotide-based conservation as well as evidence for overall genome stability among the three highly divergent species. Application of a new whole-genome analysis of divergence involving all three species yielded prioritized candidate genes and noncoding regions for parrot traits of interest (i.e., speech, intelligence, longevity) which were independently supported by the results of previous human GWAS studies. We

  12. Enriching screening libraries with bioactive fragment space.

    PubMed

    Zhang, Na; Zhao, Hongtao

    2016-08-01

    By deconvoluting 238,073 bioactive molecules in the ChEMBL library into extended Murcko ring systems, we identified a set of 2245 ring systems present in at least 10 molecules. These ring systems belong to 2221 clusters by ECFP4 fingerprints with a minimum intracluster similarity of 0.8. Their overlap with ring systems in commercial libraries was further quantified. Our findings suggest that success of a small fragment library is driven by the convergence of effective coverage of bioactive ring systems (e.g., 10% coverage by 1000 fragments vs. 40% by 2 million HTS compounds), high enrichment of bioactive ring systems, and low molecular complexity enhancing the probability of a match with the protein targets. Reconciling with the previous studies, bioactive ring systems are underrepresented in screening libraries. As such, we propose a library of virtual fragments with key functionalities via fragmentation of bioactive molecules. Its utility is exemplified by a prospective application on protein kinase CK2, resulting in the discovery of a series of novel inhibitors with the most potent compound having an IC50 of 0.5 μM and a ligand efficiency of 0.41 kcal/mol per heavy atom. PMID:27311891
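    The ligand efficiency quoted in the record above can be related to the IC50 with a short calculation. The sketch below is only an illustration: it assumes the common approximation that the binding free energy is RT ln(IC50), with IC50 standing in for the dissociation constant, and a temperature of 298.15 K. The abstract states neither the formula nor the temperature used, and the function names are hypothetical.

```python
import math

# Gas constant in kcal/(mol*K).
GAS_CONSTANT_KCAL = 1.987204e-3


def binding_free_energy(ic50_molar, temperature_k=298.15):
    """Approximate binding free energy in kcal/mol.

    Assumption: IC50 (in mol/L) is used in place of Kd, so this is
    only a rough estimate, not the authors' stated method.
    """
    return GAS_CONSTANT_KCAL * temperature_k * math.log(ic50_molar)


def ligand_efficiency(ic50_molar, heavy_atoms, temperature_k=298.15):
    """Ligand efficiency in kcal/mol per heavy (non-hydrogen) atom."""
    return -binding_free_energy(ic50_molar, temperature_k) / heavy_atoms


def implied_heavy_atoms(ic50_molar, efficiency, temperature_k=298.15):
    """Heavy-atom count implied by an IC50 and a ligand efficiency."""
    return -binding_free_energy(ic50_molar, temperature_k) / efficiency


if __name__ == "__main__":
    ic50 = 0.5e-6  # 0.5 uM, from the abstract
    print(f"dG ~ {binding_free_energy(ic50):.2f} kcal/mol")
    print(f"implied heavy atoms ~ {implied_heavy_atoms(ic50, 0.41):.1f}")
```

    Under these assumptions, the reported values would correspond to a compound of roughly 21 heavy atoms; the abstract itself does not give that number.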

  13. Assessment of the food habits of the Moroccan dorcas gazelle in M'Sabih Talaa, west central Morocco, using the trnL approach.

    PubMed

    Ait Baamrane, Moulay Abdeljalil; Shehzad, Wasim; Ouhammou, Ahmed; Abbad, Abdelaziz; Naimi, Mohamed; Coissac, Eric; Taberlet, Pierre; Znari, Mohammed

    2012-01-01

    Food habits of the Moroccan dorcas gazelle, Gazella dorcas massaesyla, previously investigated in the 1980s using microhistological fecal analysis, in the M'Sabih Talaa Reserve, west central Morocco, were re-evaluated over three seasons (spring, summer and autumn 2009) using the trnL approach to determine the diet composition and its seasonal variation from fecal samples. Taxonomic identification was carried out using the identification originating from the database built from EMBL and the list of plant species within the reserve. The total taxonomic richness in the reserve was 130 instead of 171 species in the 1980s. The diet composition proved to be much more diversified (71 plant taxa belonging to 57 genera and 29 families) than it was 22 years ago (29 identified taxa). Thirty-four taxa were newly identified in the diet while 13 reported in 1986-87 were not found. Moroccan dorcas gazelle showed a high preference for Acacia gummifera, Anagallis arvensis, Glebionis coronaria, Cladanthus arabicus, Diplotaxis tenuisiliqua, Erodium salzmannii, Limonium thouini, Lotus arenarius and Zizyphus lotus. Seasonal variations occurred in both number (40-41 taxa in spring-summer and 49 taxa in autumn vs. respectively 23-22 and 26 in 1986-1987) and taxonomic type of eaten plant taxa. This dietary diversification could be attributed either to the difference in methods of analysis, trnL approach having a higher taxonomic resolution, or a potential change in nutritional quality of plants over time. PMID:22558187

  14. Cloning and DNA sequence of the gene coding for Bacillus stearothermophilus T-6 xylanase.

    PubMed Central

    Gat, O; Lapidot, A; Alchanati, I; Regueros, C; Shoham, Y

    1994-01-01

    Bacillus stearothermophilus T-6 produces an extracellular thermostable xylanase. Affinity-purified polyclonal serum raised against the enzyme was used to screen a genomic library of B. stearothermophilus T-6 constructed in lambda-EMBL3. Two positive phages were isolated, both containing similar 13-kb inserts, and their lysates exhibited xylanase activity. A 3,696-bp SalI-BamHI fragment containing the xylanase gene was subcloned in Escherichia coli and subsequently sequenced. The open reading frame of xylanase T-6 consists of 1,236 bp. On the basis of sequence similarity, two possible -10 and -35 regions, a ribosome-binding site at the 5' end of the gene and a potential transcriptional termination motif at the 3' end of the gene, were identified. From the previously known N-terminal amino acid sequence of xylanase T-6 and the possible ribosome-binding site, a putative 28-amino-acid signal peptide was deduced. The mature xylanase T-6 consists of 379 amino acids with a calculated molecular weight and pI of 43,808 and 6.88, respectively. Multiple alignment of beta-glycanase amino acid sequences revealed highly conserved regions. Northern (RNA) blot analysis indicated that the xylanase T-6 transcript is about 1.4 kb and that the induction of this enzyme synthesis by xylose is on the transcriptional level. PMID:8031084

  15. A phylogeny-based benchmarking test for orthology inference reveals the limitations of function-based validation.

    PubMed

    Trachana, Kalliopi; Forslund, Kristoffer; Larsson, Tomas; Powell, Sean; Doerks, Tobias; von Mering, Christian; Bork, Peer

    2014-01-01

    Accurate orthology prediction is crucial for many applications in the post-genomic era. The lack of broadly accepted benchmark tests precludes a comprehensive analysis of orthology inference. So far, functional annotation between orthologs serves as a performance proxy. However, this violates the fundamental principle of orthology as an evolutionary definition, while it is often not applicable due to limited experimental evidence for most species. Therefore, we constructed high quality "gold standard" orthologous groups that can serve as a benchmark set for orthology inference in bacterial species. Herein, we used this dataset to demonstrate 1) why a manually curated, phylogeny-based dataset is more appropriate for benchmarking orthology than other popular practices and 2) how it guides database design and parameterization through careful error quantification. More specifically, we illustrate how function-based tests often fail to identify false assignments, misjudging the true performance of orthology inference methods. We also examined how our dataset can instruct the selection of a "core" species repertoire to improve detection accuracy. We conclude that including more genomes at the proper evolutionary distances can influence the overall quality of orthology detection. The curated gene families, called Reference Orthologous Groups, are publicly available at http://eggnog.embl.de/orthobench2. PMID:25369365

  16. Isolation and characterization of halotolerant Streptomyces radiopugnans from Antarctica soil.

    PubMed

    Bhave, S V; Shanbhag, P V; Sonawane, S K; Parab, R R; Mahajan, G B

    2013-05-01

    An actinomycete wild strain PM0626271 (= MTCC 5447), producing novel antibacterial compounds, was isolated from soil collected from Antarctica. The taxonomic status of the isolate was established by a polyphasic approach. Scanning electron microscopy observations and the presence of LL-diaminopimelic acid in the cell wall hydrolysate confirmed the genus Streptomyces. Analysis of the 16S rRNA gene sequence showed highest sequence similarity to Streptomyces radiopugnans (99%). The phylogenetic tree constructed using near complete 16S rRNA gene sequences of the isolate and closely related strains revealed that although the isolate fell within the S. radiopugnans gene subclade, it was allocated a different branch in the phylogenetic tree, separating it from the majority of the radiopugnans strains. Similar to the type strain, S. radiopugnans R97(T), the Antarctica isolate displayed thermotolerance as well as resistance to (60)Co gamma radiation, up to a dose of 15 kGy. However, media and salt tolerance studies revealed that, unlike the type strain, this isolate needed higher salinity for its growth. This is the first report of S. radiopugnans isolated from the Antarctica region. The GenBank/EMBL/DDBJ accession number for the 16S rRNA gene sequence of Streptomyces radiopugnans MTCC 5447 is JQ723477. PMID:23384241

  17. The draft genome sequence of Mangrovibacter sp. strain MP23, an endophyte isolated from the roots of Phragmites karka.

    PubMed

    Behera, Pratiksha; Vaishampayan, Parag; Singh, Nitin K; Mishra, Samir R; Raina, Vishakha; Suar, Mrutyunjay; Pattnaik, Ajit K; Rastogi, Gurdeep

    2016-09-01

    To date, only one draft genome has been reported within the genus Mangrovibacter. Here, we report the second draft genome shotgun sequence, that of Mangrovibacter sp. strain MP23, which was isolated from the roots of Phragmites karka (P. karka), an invasive weed growing in the Chilika Lagoon, Odisha, India. Strain MP23 is a facultatively anaerobic, nitrogen-fixing endophytic bacterium that grows optimally at 37 °C, pH 7.0, and 1% NaCl concentration. The draft genome sequence of strain MP23 contains 4,947,475 bp with an estimated G + C content of 49.9% and a total of 4392 protein-coding genes. The genome sequence has provided information on putative genes that code for proteins involved in oxidative stress, uptake of nutrients, and nitrogen fixation that might offer niche-specific ecological fitness and explain the invasive success of P. karka in Chilika Lagoon. The draft genome sequence and annotation have been deposited at DDBJ/EMBL/GenBank under the accession number LYRP00000000. PMID:27508122
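    The G + C content reported in the record above is the percentage of guanine and cytosine among the called bases of a sequence. A minimal sketch of that calculation follows; the function name is hypothetical, ambiguous bases such as N are assumed to be excluded from the denominator, and the abstract does not say which tool or convention the authors used.

```python
def gc_content(sequence):
    """Return the G + C percentage of a nucleotide sequence.

    Assumption: only A, C, G and T count as called bases; anything
    else (for example N) is ignored rather than counted as non-GC.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    called = gc + at
    if called == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / called


if __name__ == "__main__":
    # Toy sequence for illustration only, not data from the record.
    print(f"{gc_content('ATGCGCNNAT'):.1f}% G + C")
```

    For a draft genome the same ratio would be accumulated over all contigs before dividing, so that long contigs are weighted by their length.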

  18. PTA1, an essential gene of Saccharomyces cerevisiae affecting pre-tRNA processing.

    PubMed Central

    O'Connor, J P; Peebles, C L

    1992-01-01

    We have identified an essential Saccharomyces cerevisiae gene, PTA1, that affects pre-tRNA processing. PTA1 was initially defined by a UV-induced mutation, pta1-1, that causes the accumulation of all 10 end-trimmed, intron-containing pre-tRNAs and temperature-sensitive but osmotic-remedial growth. pta1-1 does not appear to be an allele of any other known gene affecting pre-tRNA processing. Extracts prepared from pta1-1 strains had normal pre-tRNA splicing endonuclease activity. pta1-1 was suppressed by the ochre suppressor tRNA gene SUP11, indicating that the pta1-1 mutation creates a termination codon within a protein reading frame. The PTA1 gene was isolated from a genomic library by complementation of the pta1-1 growth defect. Episome-borne PTA1 directs recombination to the pta1-1 locus. PTA1 has been mapped to the left arm of chromosome I near CDC24; the gene was sequenced and could encode a protein of 785 amino acids with a molecular weight of 88,417. No other protein sequences similar to that of the predicted PTA1 gene product have been identified within the EMBL or GenBank data base. Disruption of PTA1 near the carboxy terminus of the putative open reading frame was lethal. Possible functions of the PTA1 gene product are discussed. PMID:1508188

  19. Deciphering the microbiota of Tuwa hot spring, India using shotgun metagenomic sequencing approach.

    PubMed

    Mangrola, Amitsinh; Dudhagara, Pravin; Koringa, Prakash; Joshi, C G; Parmar, Mansi; Patel, Rajesh

    2015-06-01

    Here, we report the metagenome of the Tuwa hot spring, India, obtained using a shotgun sequencing approach. The metagenome consisted of 541,379 sequences totaling 98.7 Mbp, with 46% G + C content. Metagenomic sequence reads were deposited in the EMBL database under accession number ERP009321. Community analysis showed that 99.1% of sequences belonged to bacteria, 0.3% were of eukaryotic origin, 0.2% were virus-derived and 0.05% were from archaea. Unclassified and unidentified sequences were 0.4% and 0.07%, respectively. A total of 22 bacterial phyla, including 90 families and 201 species, were observed in the hot spring metagenome. Firmicutes (97.0%), Proteobacteria (1.3%) and Actinobacteria (0.4%) were reported as the dominant bacterial phyla. In functional analysis using Clusters of Orthologous Groups (COG), 21.5% fell into the poorly characterized group. Using subsystem-based annotation, 4.0% of genes were assigned to stress responses and 3% to the metabolism of aromatic compounds. The hot spring metagenome is very rich in novel sequences affiliated with unclassified and unidentified lineages, suggesting a potential source of novel microbial species and their products. PMID:26484204

  20. In Silico Mining for Antimalarial Structure-Activity Knowledge and Discovery of Novel Antimalarial Curcuminoids.

    PubMed

    Viira, Birgit; Gendron, Thibault; Lanfranchi, Don Antoine; Cojean, Sandrine; Horvath, Dragos; Marcou, Gilles; Varnek, Alexandre; Maes, Louis; Maran, Uko; Loiseau, Philippe M; Davioud-Charvet, Elisabeth

    2016-01-01

    Malaria is a parasitic tropical disease that kills around 600,000 patients every year. The emergence of Plasmodium falciparum parasites resistant to artemisinin-based combination therapies (ACTs) represents a significant public health threat, indicating the urgent need for new effective compounds to reverse ACT resistance and cure the disease. To this end, extensive curation and homogenization of experimental anti-Plasmodium screening data from both in-house and ChEMBL sources were conducted. As a result, a coherent strategy was established that allowed compiling coherent training sets that associate compound structures with the respective antimalarial activity measurements. Seventeen of these training sets led to the successful generation of classification models discriminating whether a compound has a significant probability of being active under the specific conditions of the antimalarial test associated with each set. These models were used in consensus prediction of the most likely actives among a series of curcuminoids available in-house. Positive predictions, together with a few compounds predicted as inactive, were then submitted to experimental in vitro antimalarial testing. A large majority of the compounds predicted to be active showed antimalarial activity, whereas those predicted as inactive did not, thus experimentally validating the in silico screening approach. The consensus machine learning approach proposed herein showed its potential to reduce the cost and duration of antimalarial drug discovery. PMID:27367660
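    The record above describes a consensus prediction over seventeen classification models. One simple way such a consensus could be formed is a majority vote, sketched below. This is an assumption for illustration only: the abstract does not state which consensus rule, weighting, or threshold the authors applied, and the function name and the example votes are hypothetical.

```python
def consensus_vote(predictions, min_fraction=0.5):
    """Combine per-model active/inactive calls by simple voting.

    predictions  -- list of booleans, one per model (True = active)
    min_fraction -- the compound is called active only if the share of
                    models voting active is strictly above this value
    Returns (fraction_of_active_votes, consensus_is_active).
    """
    if not predictions:
        raise ValueError("at least one model prediction is required")
    fraction = sum(predictions) / len(predictions)
    return fraction, fraction > min_fraction


if __name__ == "__main__":
    # Hypothetical votes from seventeen models for one compound.
    votes = [True] * 12 + [False] * 5
    fraction, active = consensus_vote(votes)
    print(f"{fraction:.2f} of models vote active -> consensus active: {active}")
```

    Ranking compounds by the vote fraction, rather than by the yes/no call alone, is one way to prioritize the most likely actives for testing.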

  1. Microscopic, chemical, and molecular-biological investigation of the decayed medieval stained window glasses of two Catalonian churches

    PubMed Central

    Piñar, Guadalupe; Garcia-Valles, Maite; Gimeno-Torrente, Domingo; Fernandez-Turiel, Jose Luis; Ettenauer, Jörg; Sterflinger, Katja

    2013-01-01

    We investigated the decayed historical church window glasses of two Catalonian churches, both under Mediterranean climate. Glass surfaces were studied by scanning electron microscopy (SEM), energy dispersive spectrometry (EDS), and X-ray diffraction (XRD). Their chemical composition was determined by wavelength-dispersive spectrometry (WDS) microprobe analysis. The biodiversity was investigated by molecular methods: DNA extraction from glass, amplification by PCR targeting the 16S rRNA and ITS regions, and fingerprint analyses by denaturing gradient gel electrophoresis (DGGE). Clone libraries containing either PCR fragments of the bacterial 16S rDNA or the fungal ITS regions were screened by DGGE. Clone inserts were sequenced and compared with the EMBL database. Similarity values ranged from 89 to 100% to known bacteria and fungi. Biological activity in both sites was evidenced in the form of orange patinas, bio-pitting, and mineral precipitation. Analyses revealed complex bacterial communities consisting of members of the phyla Proteobacteria, Bacteroidetes, Firmicutes, and Actinobacteria. Fungi showed less diversity than bacteria, and species of the genera Cladosporium and Phoma were dominant. The detected Actinobacteria and fungi may be responsible for the observed bio-pitting phenomenon. Moreover, some of the detected bacteria are known for their mineral precipitation capabilities. Sequence results also showed similarities with bacteria commonly found on deteriorated stone monuments, supporting the idea that medieval stained glass biodeterioration in the Mediterranean area shows a pattern comparable to that on stone. PMID:24092957

  2. Rat Gene Mapping Using PCR-Analyzed Microsatellites

    PubMed Central

    Serikawa, T.; Kuramoto, T.; Hilbert, P.; Mori, M.; Yamada, J.; Dubay, C. J.; Lindpainter, K.; Ganten, D.; Guenet, J. L.; Lathrop, G. M.; Beckmann, J. S.

    1992-01-01

    One hundred and seventy-four rat loci which contain short tandem repeat sequences were extracted from the GenBank or EMBL data bases and used to define primers for amplification by the polymerase chain reaction (PCR) of the microsatellite regions, creating PCR-formatted sequence-tagged microsatellite sites (STMSs). One hundred and thirty-four STMSs for 118 loci, including 6 randomly cloned STMSs, were characterized: (i) PCR-analyzed loci were assigned to specific chromosomes using a panel of rat X mouse somatic cell hybrid clones. (ii) Length variation of the STMSs among 8 inbred rat strains could be visualized at 85 of 107 loci examined (79.4%). (iii) A genetic map, integrating biochemical, coat color, mutant and restriction fragment length polymorphism loci, was constructed based on the segregation of 125 polymorphic markers in seven rat backcrosses and in two F(2) crosses. Twenty four linkage groups were identified, all of which were assigned to a defined chromosome. As a reflection of the bias for coding sequences in the public data bases, the STMSs described herein are often associated with genes. Hence, the genetic map we report coincides with a gene map. The corresponding map locations of the homologous mouse and human genes are also listed for comparative mapping purposes. PMID:1628813

  3. An isolate of Arthroderma benhamiae with Trichophyton mentagrophytes var. erinacei anamorph isolated from a four-toed hedgehog (Atelerix albiventris) in Japan.

    PubMed

    Takahashi, Yoko; Haritani, Kuniko; Sano, Ayako; Takizawa, Kayoko; Fukushima, Kazutaka; Miyaji, Makoto; Nishimura, Kazuko

    2002-01-01

    A female four-toed hedgehog probably imported from Africa and kept as a pet by a family suffered from depilation and mite (Caparinia tripilis) infection. Depilated quills were inoculated on a commercially available medium and an isolate of the dermatophyte was obtained. A giant colony after 14 days of incubation on yeast extract Sabouraud's agar had a central umbo with a white granular surface and a yellow pigment ring on the reverse. The hedgehog isolate produced numerous elongated microconidia singly attached along the sides of hyphae. Macroconidia were somewhat irregular in shape and size and had 2-6 septa. Abundant intermediate-sized spores between microconidia and macroconidia and a few spirals were observed. Hair perforation and urease activity tests were positive. Maximum growth temperature was 40 °C. In the mating tests using the tester strains of both African and Americano-European races of Arthroderma benhamiae, the strain produced numerous gymnothecia only when paired with the African race mating type minus(-). In addition, 591 bases of the internal transcribed spacer region of the ribosomal RNA gene including the 5.8S region (ITS1-5.8S-ITS2) were sequenced and corresponded to those of T. mentagrophytes var. erinacei (DDBJ/EMBL/GenBank accession numbers Z97996 and Z97997) by more than 99.7%. Therefore, our case is the first isolation of A. benhamiae with T. mentagrophytes var. erinacei anamorph in Japan. PMID:12402026

  4. Prediction of Metabolic Pathway Involvement in Prokaryotic UniProtKB Data by Association Rule Mining

    PubMed Central

    Hoehndorf, Robert; Martin, Maria J.; Solovyev, Victor

    2016-01-01

    The widening gap between known proteins and their functions has encouraged the development of methods to automatically infer annotations. Automatic functional annotation of proteins is expected to meet the conflicting requirements of maximizing annotation coverage, while minimizing erroneous functional assignments. This trade-off imposes a great challenge in designing intelligent systems to tackle the problem of automatic protein annotation. In this work, we present a system that utilizes rule mining techniques to predict metabolic pathways in prokaryotes. The resulting knowledge represents predictive models that assign pathway involvement to UniProtKB entries. We evaluated the performance of our system using cross-validation. We found that it achieved very promising results in pathway identification with an F1-measure of 0.982 and an AUC of 0.987. Our prediction models were then successfully applied to 6.2 million UniProtKB/TrEMBL reference proteome entries of prokaryotes. As a result, 663,724 entries were covered, of which 436,510 lacked any previous pathway annotations. PMID:27390860
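    Association rule mining, as named in the record above, scores candidate rules of the form "if an entry has attributes A, then it has annotation C" by their support and confidence. The sketch below shows only those two standard measures on a toy data set. The attribute and pathway labels are hypothetical placeholders, and the abstract does not describe the features, thresholds, or mining algorithm the authors actually used.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    items = frozenset(itemset)
    hits = sum(1 for transaction in transactions if items <= transaction)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent.

    Defined as support(antecedent and consequent) / support(antecedent).
    """
    base = support(transactions, antecedent)
    if base == 0:
        raise ValueError("antecedent never occurs in the transactions")
    joint = support(transactions, set(antecedent) | set(consequent))
    return joint / base


# Hypothetical protein entries, each described by a set of labels.
ENTRIES = [
    frozenset({"domain_A", "taxon_X", "pathway_P"}),
    frozenset({"domain_A", "pathway_P"}),
    frozenset({"domain_A", "taxon_Y"}),
    frozenset({"domain_B", "taxon_X"}),
]

if __name__ == "__main__":
    rule_support = support(ENTRIES, {"domain_A", "pathway_P"})
    rule_confidence = confidence(ENTRIES, {"domain_A"}, {"pathway_P"})
    print(f"support={rule_support:.2f} confidence={rule_confidence:.2f}")
```

    A rule would typically be kept for prediction only if both measures exceed chosen thresholds, which is where the trade-off between coverage and erroneous assignments mentioned in the abstract comes in.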

  5. The Bioperl toolkit: Perl modules for the life sciences.

    PubMed

    Stajich, Jason E; Block, David; Boulez, Kris; Brenner, Steven E; Chervitz, Stephen A; Dagdigian, Chris; Fuellen, Georg; Gilbert, James G R; Korf, Ian; Lapp, Hilmar; Lehväslaiho, Heikki; Matsalla, Chad; Mungall, Chris J; Osborne, Brian I; Pocock, Matthew R; Schattner, Peter; Senger, Martin; Stein, Lincoln D; Stupka, Elia; Wilkinson, Mark D; Birney, Ewan

    2002-10-01

    The Bioperl project is an international open-source collaboration of biologists, bioinformaticians, and computer scientists that has evolved over the past 7 yr into the most comprehensive library of Perl modules available for managing and manipulating life-science information. Bioperl provides an easy-to-use, stable, and consistent programming interface for bioinformatics application programmers. The Bioperl modules have been successfully and repeatedly used to reduce otherwise complex tasks to only a few lines of code. The Bioperl object model has been proven to be flexible enough to support enterprise-level applications such as EnsEMBL, while maintaining an easy learning curve for novice Perl programmers. Bioperl is capable of executing analyses and processing results from programs such as BLAST, ClustalW, or the EMBOSS suite. Interoperation with modules written in Python and Java is supported through the evolving BioCORBA bridge. Bioperl provides access to data stores such as GenBank and SwissProt via a flexible series of sequence input/output modules, and to the emerging common sequence data storage format of the Open Bioinformatics Database Access project. This study describes the overall architecture of the toolkit, the problem domains that it addresses, and gives specific examples of how the toolkit can be used to solve common life-sciences problems. We conclude with a discussion of how the open-source nature of the project has contributed to the development effort. PMID:12368254

  6. Cloning of the cbhI and cbhII genes involved in cellulose utilisation by the straw mushroom Volvariella volvacea.

    PubMed

    Jia, J; Dyer, P S; Buswell, J A; Peberdy, J F

    1999-07-01

    The straw mushroom Volvariella volvacea is cultivated on substrates rich in cellulose and has been shown to produce a family of cellulolytic enzymes. A PCR-based strategy was adopted to clone genes involved in cellulose utilisation, using degenerate primers designed to amplify conserved catalytic domain sequences of cellobiohydrolases (CBHs). PCR with these primers produced two DNA fragments with sequence similarity to the cbhI and cbhII gene families detected in Trichoderma, Phanerochaete and Agaricus species. Full-length clones of these genes were obtained from an EMBL3 genomic library, and RACE-PCR was used to verify the presence of introns. The cbhI homologue has a coding region of 1722 bp, containing two introns, generating a 536-amino acid polypeptide product. The cbhII gene has a coding region of 1693 bp, containing five introns, and gives rise to a 470-amino acid polypeptide product. Northern and PCR analyses were used to study the expression of the genes. These revealed that transcripts of both genes were induced on medium containing cellulose, with cbhI being expressed more strongly than cbhII, but were repressed on medium containing glucose. PMID:10485290

  7. Assessment of the Food Habits of the Moroccan Dorcas Gazelle in M’Sabih Talaa, West Central Morocco, Using the trnL Approach

    PubMed Central

    Ait Baamrane, Moulay Abdeljalil; Shehzad, Wasim; Ouhammou, Ahmed; Abbad, Abdelaziz; Naimi, Mohamed; Coissac, Eric; Taberlet, Pierre; Znari, Mohammed

    2012-01-01

    Food habits of the Moroccan dorcas gazelle, Gazella dorcas massaesyla, previously investigated in the 1980s using microhistological fecal analysis, in the M’Sabih Talaa Reserve, west central Morocco, were re-evaluated over three seasons (spring, summer and autumn 2009) using the trnL approach to determine the diet composition and its seasonal variation from fecal samples. Taxonomic identification was carried out using the identification originating from the database built from EMBL and the list of plant species within the reserve. The total taxonomic richness in the reserve was 130 species, compared with 171 species in the 1980s. The diet composition proved to be much more diversified (71 plant taxa belonging to 57 genera and 29 families) than it was 22 years ago (29 identified taxa). Thirty-four taxa were newly identified in the diet while 13 reported in 1986–87 were not found. Moroccan dorcas gazelle showed a high preference for Acacia gummifera, Anagallis arvensis, Glebionis coronaria, Cladanthus arabicus, Diplotaxis tenuisiliqua, Erodium salzmannii, Limonium thouini, Lotus arenarius and Zizyphus lotus. Seasonal variations occurred in both number (40–41 taxa in spring-summer and 49 taxa in autumn vs. 23–22 and 26, respectively, in 1986–1987) and taxonomic type of eaten plant taxa. This dietary diversification could be attributed either to the difference in methods of analysis, the trnL approach having a higher taxonomic resolution, or to a potential change in nutritional quality of plants over time. PMID:22558187

  8. Characterization of Halorubrum sfaxense sp. nov., a New Halophilic Archaeon Isolated from the Solar Saltern of Sfax in Tunisia

    PubMed Central

    Trigui, Hana; Masmoudi, Salma; Brochier-Armanet, Céline; Maalej, Sami; Dukan, Sam

    2011-01-01

    An extremely halophilic archaeon, strain ETD6, was isolated from a marine solar saltern in Sfax, Tunisia. Analysis of the 16S rRNA gene sequence showed that the isolate was phylogenetically related to species of the genus Halorubrum within the family Halobacteriaceae, with a close relationship to Hrr. xinjiangense (99.77% identity). However, the DNA-DNA hybridization value between strain ETD6 and Hrr. xinjiangense was about 24.5%. The G+C content of the genomic DNA was 65.1 mol% (T(m)). Strain ETD6 grew in 15–35% (w/v) NaCl. The temperature and pH ranges for growth were 20–55°C and 6–9, respectively. Optimal growth occurred at 25% NaCl, 37°C, and pH 7.4. The results of the DNA hybridization against Hrr. xinjiangense and physiological and biochemical tests allowed genotypic and phenotypic differentiation of strain ETD6 from other Hrr. species. Therefore, strain ETD6 represents a novel species of the genus Halorubrum, for which the name Hrr. sfaxense sp. nov. is proposed. The GenBank/EMBL-EBI accession number is GU724599. PMID:21754938

  9. PTMcode: a database of known and predicted functional associations between post-translational modifications in proteins

    PubMed Central

    Minguez, Pablo; Letunic, Ivica; Parca, Luca; Bork, Peer

    2013-01-01

    Post-translational modifications (PTMs) are involved in the regulation and structural stabilization of eukaryotic proteins. The combination of individual PTM states is a key to modulate cellular functions as became evident in a few well-studied proteins. This combinatorial setting, dubbed the PTM code, has been proposed to be extended to whole proteomes in eukaryotes. Although we are still far from deciphering such a complex language, thousands of protein PTM sites are being mapped by high-throughput technologies, thus providing sufficient data for comparative analysis. PTMcode (http://ptmcode.embl.de) aims to compile known and predicted PTM associations to provide a framework that would enable hypothesis-driven experimental or computational analysis of various scales. In its first release, PTMcode provides PTM functional associations of 13 different PTM types within proteins in 8 eukaryotes. They are based on five evidence channels: a literature survey, residue co-evolution, structural proximity, PTMs at the same residue and location within PTM highly enriched protein regions (hotspots). PTMcode is presented as a protein-based searchable database with an interactive web interface providing the context of the co-regulation of nearly 75 000 residues in >10 000 proteins. PMID:23193284

  10. Genomic analysis of the Xp21 region around the RP3 locus

    SciTech Connect

    Navia, B.A.; Eisenman, R.E.; Bruns, G.A.

    1994-09-01

    One form of X-linked retinitis pigmentosa has been localized by deletion and linkage analysis to proximal Xp21 near the OTC locus and the proximal breakpoint of the BB deletion. A deletion junction clone, previously isolated from this region, was used to initiate a series of bidirectional walks in a human genomic library in EMBL3A. A phage contig of nearly 70 kb has been cloned and systematically searched for conserved sequences and CA repeats. A number of unique sequences around the breakpoint have been sequenced and analyzed with exon identification programs. An HTF island was identified approximately 35 kb distal to the centromeric breakpoint of the BB deletion and several CA repeat-containing areas were found in the contig. Two YACs that contain the breakpoint and surrounding region were isolated. A phage sublibrary was constructed from one of the YACs and is being used to extend the contig map further centromeric. To isolate transcripts from the region, two rounds of cDNA selection from a combined short insert human retinal and fetal brain library were performed against the pooled phage clones from the contig and against the pooled phage from the YAC derived sublibrary. Among the selected cDNAs, several unique sequences have been identified and are currently being mapped and sequenced.

  11. Genome sequence of the clover symbiont Rhizobium leguminosarum bv. trifolii strain CC275e.

    PubMed

    Delestre, Clément; Laugraud, Aurélie; Ridgway, Hayley; Ronson, Clive; O'Callaghan, Maureen; Barrett, Brent; Ballard, Ross; Griffiths, Andrew; Young, Sandra; Blond, Celine; Gerard, Emily; Wakelin, Steve

    2015-01-01

    Rhizobium leguminosarum bv. trifolii strain CC275e is a highly effective, N2-fixing microsymbiont of white clover (Trifolium repens L.). The bacterium has been widely used in both Australia and New Zealand as a clover seed inoculant and, as such, has delivered the equivalent of millions of dollars of nitrogen into these pastoral systems. R. leguminosarum strain CC275e is a rod-shaped, motile, Gram-negative, non-spore-forming bacterium. The genome was sequenced on an Illumina MiSeq instrument using a 2 × 150 bp paired-end library and assembled into 29 scaffolds. The genome size is 7,077,367 nucleotides, with a GC content of 60.9 %. The final, high-quality draft genome contains 6693 protein coding genes, close to 85 % of which were assigned to COG categories. This Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession JRXL00000000. The sequencing of this genome will enable identification of genetic traits associated with host compatibility and high N2 fixation characteristics in Rhizobium leguminosarum. The sequence will also be useful for development of strain-specific markers to assess factors associated with environmental fitness, competitiveness for host nodule occupancy, and survival on legume seeds (New Zealand Ministry of Business, Innovation and Employment program, 'Improving forage legume-rhizobia performance' contract C10X1308 and DairyNZ Ltd.). PMID:26649149
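The GC content quoted in the record above is a simple base-composition statistic. As a minimal sketch, assuming the figure is the percentage of G and C among unambiguous bases pooled over all scaffolds (the abstract does not say how it was computed, and the function name and toy sequences below are hypothetical):

```python
def gc_content(sequence: str) -> float:
    """Return the percentage of G and C among unambiguous A/C/G/T bases.

    Assumption: ambiguous symbols such as N are ignored rather than
    counted in the denominator; the abstract does not specify this.
    """
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


# Toy scaffolds (hypothetical data, not taken from the CC275e assembly).
toy_scaffolds = ["GGCCAT", "ATGCGC", "NNGGTA"]

# Pool all scaffolds before computing the percentage, so that longer
# scaffolds contribute proportionally more than shorter ones.
overall = gc_content("".join(toy_scaffolds))
print(f"GC content: {overall:.1f} %")
```

Pooling the scaffolds before dividing gives a length-weighted value; averaging per-scaffold percentages would weight a short scaffold as heavily as a long one.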

  12. STITCH 5: augmenting protein–chemical interaction networks with tissue and affinity data

    PubMed Central

    Szklarczyk, Damian; Santos, Alberto; von Mering, Christian; Jensen, Lars Juhl; Bork, Peer; Kuhn, Michael

    2016-01-01

    Interactions between proteins and small molecules are an integral part of biological processes in living organisms. Information on these interactions is dispersed over many databases, texts and prediction methods, which makes it difficult to get a comprehensive overview of the available evidence. To address this, we have developed STITCH (‘Search Tool for Interacting Chemicals’) that integrates these disparate data sources for 430 000 chemicals into a single, easy-to-use resource. In addition to the increased scope of the database, we have implemented a new network view that gives the user the ability to view binding affinities of chemicals in the interaction network. This enables the user to get a quick overview of the potential effects of the chemical on its interaction partners. For each organism, STITCH provides a global network; however, not all proteins have the same pattern of spatial expression. Therefore, only a certain subset of interactions can occur simultaneously. In the new, fifth release of STITCH, we have implemented functionality to filter out the proteins and chemicals not associated with a given tissue. The STITCH database can be downloaded in full, accessed programmatically via an extensive API, or searched via a redesigned web interface at http://stitch.embl.de. PMID:26590256

  13. eggNOG 4.5: a hierarchical orthology framework with improved functional annotations for eukaryotic, prokaryotic and viral sequences

    PubMed Central

    Huerta-Cepas, Jaime; Szklarczyk, Damian; Forslund, Kristoffer; Cook, Helen; Heller, Davide; Walter, Mathias C.; Rattei, Thomas; Mende, Daniel R.; Sunagawa, Shinichi; Kuhn, Michael; Jensen, Lars Juhl; von Mering, Christian; Bork, Peer

    2016-01-01

    eggNOG is a public resource that provides Orthologous Groups (OGs) of proteins at different taxonomic levels, each with integrated and summarized functional annotations. Developments since the latest public release include changes to the algorithm for creating OGs across taxonomic levels, making nested groups hierarchically consistent. This allows for a better propagation of functional terms across nested OGs and led to the novel annotation of 95 890 previously uncharacterized OGs, increasing overall annotation coverage from 67% to 72%. The functional annotations of OGs have been expanded to also provide Gene Ontology terms, KEGG pathways and SMART/Pfam domains for each group. Moreover, eggNOG now provides pairwise orthology relationships within OGs based on analysis of phylogenetic trees. We have also incorporated a framework for quickly mapping novel sequences to OGs based on precomputed HMM profiles. Finally, eggNOG version 4.5 incorporates a novel data set spanning 2605 viral OGs, covering 5228 proteins from 352 viral proteomes. All data are accessible for bulk downloading, as a web-service, and through a completely redesigned web interface. The new access points provide faster searches and a number of new browsing and visualization capabilities, facilitating the needs of both experts and less experienced users. eggNOG v4.5 is available at http://eggnog.embl.de. PMID:26582926

  14. The SIDER database of drugs and side effects

    PubMed Central

    Kuhn, Michael; Letunic, Ivica; Jensen, Lars Juhl; Bork, Peer

    2016-01-01

    Unwanted side effects of drugs are a burden on patients and a severe impediment in the development of new drugs. At the same time, adverse drug reactions (ADRs) recorded during clinical trials are an important source of human phenotypic data. It is therefore essential to combine data on drugs, targets and side effects into a more complete picture of the therapeutic mechanism of actions of drugs and the ways in which they cause adverse reactions. To this end, we have created the SIDER (‘Side Effect Resource’, http://sideeffects.embl.de) database of drugs and ADRs. The current release, SIDER 4, contains data on 1430 drugs, 5880 ADRs and 140 064 drug–ADR pairs, which is an increase of 40% compared to the previous version. For more fine-grained analyses, we extracted the frequency with which side effects occur from the package inserts. This information is available for 39% of drug–ADR pairs, 19% of which can be compared to the frequency under placebo treatment. SIDER furthermore contains a data set of drug indications, extracted from the package inserts using Natural Language Processing. These drug indications are used to reduce the rate of false positives by identifying medical terms that do not correspond to ADRs. PMID:26481350

  15. Molecular cloning of the cowpea leghemoglobin II gene and expression of its cDNA in Escherichia coli. Purification and characterization of the recombinant protein.

    PubMed Central

    Arredondo-Peter, R; Moran, J F; Sarath, G; Luan, P; Klucas, R V

    1997-01-01

    Cowpea (Vigna unguiculata) nodules contain three leghemoglobins (LbI, LbII, and LbIII) that are encoded by at least two genes. We have cloned and sequenced the gene that encodes for LbII (lbII), the most abundant Lb in cowpea nodules, using total DNA as the template for PCR. Primers were designed using the sequence of the soybean lbc gene. The lbII gene is 679 bp in length and codes for a predicted protein of 145 amino acids. Using sequences of the cowpea lbII gene for the synthesis of primers and total nodule RNA as the template, we cloned a cDNA for LbII into a constitutive expression vector (pEMBL19+) and then expressed it in Escherichia coli. Recombinant LbII (rLbII) and native LbII (nLbII) from cowpea nodules were purified to homogeneity using standard techniques. Properties of rLbII were compared with nLbII by partially sequencing the proteins and by sodium dodecyl sulfate- and isoelectric focusing polyacrylamide gel electrophoresis, western-blot analysis using anti-soybean Lba antibodies, tryptic and chymotryptic mapping, and spectrophotometric techniques. The data showed that the structural and spectral characteristics of rLbII and nLbII were similar. The rLbII was reversibly oxygenated/deoxygenated, showing that it is a functional hemoglobin. PMID:9193085

  16. Recent improvements to the SMART domain-based sequence annotation resource.

    PubMed

    Letunic, Ivica; Goodstadt, Leo; Dickens, Nicholas J; Doerks, Tobias; Schultz, Joerg; Mott, Richard; Ciccarelli, Francesca; Copley, Richard R; Ponting, Chris P; Bork, Peer

    2002-01-01

    SMART (Simple Modular Architecture Research Tool, http://smart.embl-heidelberg.de) is a web-based resource used for the annotation of protein domains and the analysis of domain architectures, with particular emphasis on mobile eukaryotic domains. Extensive annotation for each domain family is available, providing information relating to function, subcellular localization, phyletic distribution and tertiary structure. The January 2002 release has added more than 200 hand-curated domain models. This brings the total to over 600 domain families that are widely represented among nuclear, signalling and extracellular proteins. Annotation now includes links to the Online Mendelian Inheritance in Man (OMIM) database in cases where a human disease is associated with one or more mutations in a particular domain. We have implemented new analysis methods and updated others. New advanced queries provide direct access to the SMART relational database using SQL. This database now contains information on intrinsic sequence features such as transmembrane regions, coiled-coils, signal peptides and internal repeats. SMART output can now be easily included in users' documents. A SMART mirror has been created at http://smart.ox.ac.uk. PMID:11752305

  17. Using Molecular Initiating Events to Develop a Structural Alert Based Screening Workflow for Nuclear Receptor Ligands Associated with Hepatic Steatosis.

    PubMed

    Mellor, Claire L; Steinmetz, Fabian P; Cronin, Mark T D

    2016-02-15

    In silico models are essential for the development of integrated alternative methods to identify organ level toxicity and lead toward the replacement of animal testing. These models include (quantitative) structure-activity relationships ((Q)SARs) and, importantly, the identification of structural alerts associated with defined toxicological end points. Structural alerts are able both to predict toxicity directly and assist in the formation of categories to facilitate read-across. They are particularly important to decipher the myriad mechanisms of action that result in organ level toxicity. The aim of this study was to develop novel structural alerts for nuclear receptor (NR) ligands that are associated with inducing hepatic steatosis and to show the vast amount of existing data that is available. Current knowledge on NR agonists was extended with data from the ChEMBL database (12,713 chemicals in total) of bioactive molecules and from studying NR ligand-binding interactions within the protein database (PDB, 624 human NR structure files). A computational structural alert based workflow was developed using KNIME from these data using molecular fragments and other relevant chemical features. In total, 214 structural features were recorded computationally as SMARTS strings, and therefore, they can be used for grouping and screening during drug development and hazard assessment and provide knowledge to anchor adverse outcome pathways (AOPs) via their molecular initiating events (MIEs). PMID:26787004

  18. Analysis of the 5S RNA Pool in Arabidopsis thaliana: RNAs Are Heterogeneous and Only Two of the Genomic 5S Loci Produce Mature 5S RNA

    PubMed Central

    Cloix, Catherine; Tutois, Sylvie; Yukawa, Yasushi; Mathieu, Olivier; Cuvillier, Claudine; Espagnol, Marie-Claude; Picard, Georges; Tourmente, Sylvette

    2002-01-01

    One major 5S RNA, 120 bases long, was revealed by an analysis of mature 5S RNA from tissues, developmental stages, and polysomes in Arabidopsis thaliana. Minor 5S RNA were also found, varying from the major one by one or two base substitutions; 5S rDNA units from each 5S array of the Arabidopsis genome were isolated by PCR using CIC yeast artificial chromosomes (YACs) mapped on the different loci. By using a comparison of the 5S DNA and RNA sequences, we could show that both major and minor 5S transcripts come from only two of the genomic 5S loci: chromosome 4 and chromosome 5 major block. Other 5S loci are either not transcribed or produce rapidly degraded 5S transcripts. Analysis of the 5′- and 3′-DNA flanking sequence has permitted the definition of specific signatures for each 5S rDNA array. [EMBL accession nos: AF330825-AF331032; AF335777-AF335873.] PMID:11779838
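The record above cites its EMBL entries as two accession ranges. As a minimal sketch of how such ranges can be expanded into individual accession numbers, assuming each range is contiguous and every number inside it belongs to this study (the abstract implies but does not state this; the helper name is hypothetical):

```python
import re


def expand_accession_range(accession_range: str) -> list[str]:
    """Expand a range such as 'AF330825-AF331032' into single accessions.

    Assumption: both ends share the same letter prefix and digit width,
    and the range is inclusive.
    """
    start, end = (part.strip() for part in accession_range.split("-"))
    first = re.fullmatch(r"([A-Z]+)(\d+)", start)
    last = re.fullmatch(r"([A-Z]+)(\d+)", end)
    if not first or not last or first.group(1) != last.group(1):
        raise ValueError(f"cannot expand range: {accession_range!r}")
    prefix = first.group(1)
    width = len(first.group(2))  # keep any zero padding intact
    lo, hi = int(first.group(2)), int(last.group(2))
    if lo > hi:
        raise ValueError(f"range runs backwards: {accession_range!r}")
    return [f"{prefix}{number:0{width}d}" for number in range(lo, hi + 1)]


# The two ranges exactly as quoted in the abstract.
ranges = ["AF330825-AF331032", "AF335777-AF335873"]
accessions = [acc for rng in ranges for acc in expand_accession_range(rng)]
print(len(accessions), accessions[0], accessions[-1])
```

Under the contiguity assumption the two ranges cover 208 and 97 entries respectively.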

  19. Identification of probiotic lactobacilli used for animal feeds on the basis of 16S ribosomal RNA gene sequence.

    PubMed

    Higuchi, Wataru; Muramatsu, Mineo; Dohmae, Soshi; Takano, Tomomi; Isobe, Hirokazu; Yabe, Shizuka; Da, Shi; Baranovich, Tatiana; Yamamoto, Tatsuo

    2008-11-01

    The use of probiotics such as Lactobacillus in animal feeds has gained popularity in recent years. In this study the 16S rRNA gene sequence of L. acidophilus in two commercial agents which have been used in animal feeds, LAB-MOS and Ghenisson 22, was determined. Phylogenetic tree analysis revealed that the two agents, strain MNFLM01 in LAB-MOS and strain GAL-2 in Ghenisson 22, belonged to L. rhamnosus (a member of the L. casei group) and L. johnsonii (a member of the L. acidophilus group), respectively. Biochemical tests assigned the two as L. rhamnosus and ambiguously as L. acidophilus. The data suggest that 16S rRNA gene sequence analysis provides more accurate identification of Lactobacillus species than biochemical tests and would allow quality assurance of relevant commercial products. The 16S rRNA gene sequences of strains MNFLM01 and GAL-2 determined in this study have been submitted to DDBJ/EMBL/GenBank under accession numbers AB288235 and AB295648, respectively. PMID:19090836

  20. A Rational Approach for the Identification of Non-Hydroxamate HDAC6-Selective Inhibitors

    PubMed Central

    Goracci, Laura; Deschamps, Nathalie; Randazzo, Giuseppe Marco; Petit, Charlotte; Dos Santos Passos, Carolina; Carrupt, Pierre-Alain; Simões-Pires, Claudia; Nurisso, Alessandra

    2016-01-01

    The human histone deacetylase isoform 6 (HDAC6) has been demonstrated to play a major role in cell motility and aggresome formation, being interesting for the treatment of multiple tumour types and neurodegenerative conditions. Currently, most HDAC inhibitors in preclinical or clinical evaluations are non-selective inhibitors, characterised by a hydroxamate zinc-binding group (ZBG) showing off-target effects and mutagenicity. The identification of selective HDAC6 inhibitors with novel chemical properties has not been successful yet, also because of the absence of crystallographic information that makes the rational design of HDAC6 selective inhibitors difficult. Using HDAC inhibitory data retrieved from the ChEMBL database and ligand-based computational strategies, we identified 8 original new non-hydroxamate HDAC6 inhibitors from the SPECS database, with activity in the low μM range. The most potent and selective compound, bearing a hydrazide ZBG, was shown to increase tubulin acetylation in human cells. No effects on histone H4 acetylation were observed. To the best of our knowledge, this is the first report of an HDAC6 selective inhibitor bearing a hydrazide ZBG. Its capability to passively cross the blood-brain barrier (BBB), as observed through PAMPA assays, and its low cytotoxicity in vitro, suggested its potential for drug development. PMID:27404291

  1. BioSAXS Sample Changer: a robotic sample changer for rapid and reliable high-throughput X-ray solution scattering experiments.

    PubMed

    Round, Adam; Felisaz, Franck; Fodinger, Lukas; Gobbo, Alexandre; Huet, Julien; Villard, Cyril; Blanchet, Clement E; Pernot, Petra; McSweeney, Sean; Roessle, Manfred; Svergun, Dmitri I; Cipriani, Florent

    2015-01-01

    Small-angle X-ray scattering (SAXS) of macromolecules in solution is in increasing demand by an ever more diverse research community, both academic and industrial. To better serve user needs, and to allow automated and high-throughput operation, a sample changer (BioSAXS Sample Changer) that is able to perform unattended measurements of up to several hundred samples per day has been developed. The Sample Changer is able to handle and expose sample volumes of down to 5 µl with a measurement/cleaning cycle of under 1 min. The samples are stored in standard 96-well plates and the data are collected in a vacuum-mounted capillary with automated positioning of the solution in the X-ray beam. Fast and efficient capillary cleaning avoids cross-contamination and ensures reproducibility of the measurements. Independent temperature control for the well storage and for the measurement capillary allows the samples to be kept cool while still collecting data at physiological temperatures. The Sample Changer has been installed at three major third-generation synchrotrons: on the BM29 beamline at the European Synchrotron Radiation Facility (ESRF), the P12 beamline at the PETRA-III synchrotron (EMBL@PETRA-III) and the I22/B21 beamlines at Diamond Light Source, with the latter being the first commercial unit supplied by Bruker ASC. PMID:25615861

  2. PDEStrIAn: A Phosphodiesterase Structure and Ligand Interaction Annotated Database As a Tool for Structure-Based Drug Design.

    PubMed

    Jansen, Chimed; Kooistra, Albert J; Kanev, Georgi K; Leurs, Rob; de Esch, Iwan J P; de Graaf, Chris

    2016-08-11

    A systematic analysis is presented of the 220 phosphodiesterase (PDE) catalytic domain crystal structures present in the Protein Data Bank (PDB) with a focus on PDE-ligand interactions. The consistent structural alignment of 57 PDE ligand binding site residues enables the systematic analysis of PDE-ligand interaction fingerprints (IFPs), the identification of subtype-specific PDE-ligand interaction features, and the classification of ligands according to their binding modes. We illustrate how systematic mining of this phosphodiesterase structure and ligand interaction annotated (PDEStrIAn) database provides new insights into how conserved and selective PDE interaction hot spots can accommodate the large diversity of chemical scaffolds in PDE ligands. A substructure analysis of the cocrystallized PDE ligands in combination with those in the ChEMBL database provides a toolbox for scaffold hopping and ligand design. These analyses lead to an improved understanding of the structural requirements of PDE binding that will be useful in future drug discovery studies. PMID:26908025

  3. Analysis of a ribosomal RNA operon in the actinomycete Frankia.

    PubMed

    Normand, P; Cournoyer, B; Simonet, P; Nazaret, S

    1992-02-01

    The organisation of ribosomal RNA-encoding (rrn) genes has been studied in Frankia sp. strain ORS020606. The two rrn clusters present in Frankia strain ORS020606 were isolated from genomic banks in phage lambda EMBL3 by hybridization with oligodeoxyribonucleotide probes. The 5'-3' gene order is the usual one for bacteria: 16S-23S-5S. The two clusters are not distinguishable by restriction enzyme mapping inside the coding section, but vary considerably outside it. Sequencing showed that the 16S-rRNA-encoding gene of ORS020606 is very closely related to that of another Alnus-infective Frankia strain (Ag45/Mut15) and highly homologous to corresponding genes of Streptomyces spp. Two possible promoter sequences were detected upstream from the 16S gene, while no tRNA-encoding gene was detected in the whole operon. Regions with a high proportion of divergence for the study of phylogenetic relationships within the genus were looked for and found in the first intergenic spacer, in the 23S and in the 16S gene. PMID:1372279

  4. Cloning and expression of small cDNA fragment encoding strong antiviral peptide from Celosia cristata in Escherichia coli.

    PubMed

    Gholizadeh, A; Kohnehrouz, B Baghban; Santha, I M; Lodha, M L; Kapoor, H C

    2005-09-01

    A small cDNA fragment containing a ribosome-inactivating site was isolated from the leaf cDNA population of Celosia cristata by polymerase chain reaction (PCR). PCR was conducted linearly using a degenerate primer designed from the partially conserved peptide of ribosome-inactivating/antiviral proteins. Sequence analysis showed that it is 150 bp in length. The cDNA fragment was then cloned in a bacterial expression vector and expressed in Escherichia coli as a ~57 kD fused protein, and its presence was further confirmed by Western blot analysis. The recombinant protein was purified by affinity chromatography. The purified product showed strong antiviral activity towards tobacco mosaic virus on host plant leaves, Nicotiana glutinosa, indicating the presence of a putative antiviral determinant in the isolated cDNA product. It is speculated that the antiviral site is at, or is separate from but very close to, the ribosome-inactivating site. We nominate the short cDNA fragment reported here as a good candidate to investigate further the location of the antiviral determinants. The isolated cDNA sequence was submitted to the EMBL database under accession number AJ535714. PMID:16266271

  5. Characterization and expression analysis of a banana gene encoding 1-aminocyclopropane-1-carboxylate oxidase.

    PubMed

    Huang, P L; Do, Y Y; Huang, F C; Thay, T S; Chang, T W

    1997-04-01

    A cDNA encoding the banana 1-aminocyclopropane-1-carboxylate (ACC) oxidase has previously been isolated from a cDNA library that was constructed by extracting poly(A)+ RNA from peels of ripening banana. This cDNA, designated as pMAO2, has 1,199 bp and contains an open reading frame of 318 amino acids. In order to identify ripening-related promoters of the banana ACC oxidase gene, pMAO2 was used as a probe to screen a banana genomic library constructed in the lambda EMBL3 vector. The banana ACC oxidase MAO2 gene has four exons and three introns, with all of the boundaries between these introns and exons sharing a consensus dinucleotide sequence of GT-AG. The expression of the MAO2 gene in banana begins after the onset of ripening (stage 2) and continues into later stages of the ripening process. The accumulation of MAO2 mRNA can be induced by 1 microliter/l exogenous ethylene, and it reached a steady-state level when 100 microliters/l exogenous ethylene was present. PMID:9137825

  6. A Phylogeny-Based Benchmarking Test for Orthology Inference Reveals the Limitations of Function-Based Validation

    PubMed Central

    Larsson, Tomas; Powell, Sean; Doerks, Tobias; von Mering, Christian

    2014-01-01

    Accurate orthology prediction is crucial for many applications in the post-genomic era. The lack of broadly accepted benchmark tests precludes a comprehensive analysis of orthology inference. So far, functional annotation between orthologs serves as a performance proxy. However, this violates the fundamental principle of orthology as an evolutionary definition, while it is often not applicable due to limited experimental evidence for most species. Therefore, we constructed high quality "gold standard" orthologous groups that can serve as a benchmark set for orthology inference in bacterial species. Herein, we used this dataset to demonstrate 1) why a manually curated, phylogeny-based dataset is more appropriate for benchmarking orthology than other popular practices and 2) how it guides database design and parameterization through careful error quantification. More specifically, we illustrate how function-based tests often fail to identify false assignments, misjudging the true performance of orthology inference methods. We also examined how our dataset can instruct the selection of a “core” species repertoire to improve detection accuracy. We conclude that including more genomes at the proper evolutionary distances can influence the overall quality of orthology detection. The curated gene families, called Reference Orthologous Groups, are publicly available at http://eggnog.embl.de/orthobench2. PMID:25369365

  7. A Manual Curation Strategy to Improve Genome Annotation: Application to a Set of Haloarchael Genomes

    PubMed Central

    Pfeiffer, Friedhelm; Oesterhelt, Dieter

    2015-01-01

    Genome annotation errors are a persistent problem that impedes research in the biosciences. A manual curation effort is described that attempts to produce high-quality genome annotations for a set of haloarchaeal genomes (Halobacterium salinarum and Hbt. hubeiense, Haloferax volcanii and Hfx. mediterranei, Natronomonas pharaonis and Nmn. moolapensis, Haloquadratum walsbyi strains HBSQ001 and C23, Natrialba magadii, Haloarcula marismortui and Har. hispanica, and Halohasta litchfieldiae). Genomes are checked for missing genes, start codon misassignments, and disrupted genes. Assignments of a specific function are preferably based on experimentally characterized homologs (Gold Standard Proteins). To avoid overannotation, which is a major source of database errors, we restrict annotation to only general function assignments when support for a specific substrate assignment is insufficient. This strategy results in annotations that are resistant to the plethora of errors that compromise public databases. Annotation consistency is rigorously validated for ortholog pairs from the genomes surveyed. The annotation is regularly crosschecked against the UniProt database to further improve annotations and increase the level of standardization. Enhanced genome annotations are submitted to public databases (EMBL/GenBank, UniProt), to the benefit of the scientific community. The enhanced annotations are also publicly available via HaloLex. PMID:26042526

  8. Next-generation sequencing of microRNAs in primary human polarized macrophages.

    PubMed

    Cobos Jiménez, Viviana; Willemsen, Antonius M; Bradley, Edward J; Baas, Frank; van Kampen, Antoine H C; Kootstra, Neeltje A

    2014-12-01

    Macrophages are important for mounting inflammatory responses to tissue damage or infection by invading pathogens, and therefore modulation of their cellular functions is essential for the success of the immune system as well as for maintaining tissue homeostasis. Small non-coding RNAs are important regulatory elements of gene expression and microRNAs are the most widely known to be fundamental for the proper development of cells of the immune system. Macrophages can exhibit different phenotypes, depending on the cytokine environment they encounter in the affected tissues. We have analyzed the microRNA expression profiles during maturation of human primary monocytes into macrophages and polarization by pro- or anti-inflammatory cytokines. Here we describe the analysis of next-generation sequencing data deposited in EMBL-EBI ArrayExpress under accession number E-MTAB-1969 and associated with the study published by Cobos Jiménez and collaborators in Physiological Genomics in 2014 (1). The data presented here contributes to our understanding of microRNA expression profiles in human monocytes and macrophages and will also serve as a resource for novel microRNAs and other small RNA species expressed in these cells. PMID:26484091

  9. Recent improvements to the SMART domain-based sequence annotation resource

    PubMed Central

    Letunic, Ivica; Goodstadt, Leo; Dickens, Nicholas J.; Doerks, Tobias; Schultz, Joerg; Mott, Richard; Ciccarelli, Francesca; Copley, Richard R.; Ponting, Chris P.; Bork, Peer

    2002-01-01

    SMART (Simple Modular Architecture Research Tool, http://smart.embl-heidelberg.de) is a web-based resource used for the annotation of protein domains and the analysis of domain architectures, with particular emphasis on mobile eukaryotic domains. Extensive annotation for each domain family is available, providing information relating to function, subcellular localization, phyletic distribution and tertiary structure. The January 2002 release has added more than 200 hand-curated domain models. This brings the total to over 600 domain families that are widely represented among nuclear, signalling and extracellular proteins. Annotation now includes links to the Online Mendelian Inheritance in Man (OMIM) database in cases where a human disease is associated with one or more mutations in a particular domain. We have implemented new analysis methods and updated others. New advanced queries provide direct access to the SMART relational database using SQL. This database now contains information on intrinsic sequence features such as transmembrane regions, coiled-coils, signal peptides and internal repeats. SMART output can now be easily included in users’ documents. A SMART mirror has been created at http://smart.ox.ac.uk. PMID:11752305

  10. Heterodimeric nitrate reductase (NapAB) from Cupriavidus necator H16: purification, crystallization and preliminary X-ray analysis

    SciTech Connect

    Coelho, Catarina; González, Pablo J.; Trincão, José; Carvalho, Ana L.; Najmudin, Shabir; Moura, José J. G.; Moura, Isabel; Romão, Maria J.

    2007-06-01

    Crystals of the oxidized form of the periplasmic nitrate reductase from Cupriavidus necator were obtained using polyethylene glycol 3350 as precipitant. The periplasmic nitrate reductase from Cupriavidus necator (also known as Ralstonia eutropha) is a heterodimer that is able to reduce nitrate to nitrite. It comprises a 91 kDa catalytic subunit (NapA) and a 17 kDa subunit (NapB) that is involved in electron transfer. The larger subunit contains a molybdenum active site with a bis-molybdopterin guanine dinucleotide cofactor as well as one [4Fe–4S] cluster, while the small subunit is a di-haem c-type cytochrome. Crystals of the oxidized form of this enzyme were obtained using polyethylene glycol 3350 as precipitant. A single crystal grown at the High Throughput Crystallization Laboratory of the EMBL in Grenoble diffracted to beyond 1.5 Å at the ESRF (ID14-1), which is the highest resolution reported to date for a nitrate reductase. The unit-cell parameters are a = 142.2, b = 82.4, c = 96.8 Å, β = 100.7°, space group C2, and one heterodimer is present per asymmetric unit.
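
    As a rough plausibility check on the cell parameters quoted above, the following sketch computes the monoclinic cell volume and a Matthews coefficient. It assumes Z = 4 asymmetric units per cell for space group C2, a heterodimer mass of 91 + 17 = 108 kDa taken from the subunit masses in the abstract, and the common 1.23/V_M approximation for solvent content; the function names are illustrative and none of this is taken from the paper itself.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume of a monoclinic unit cell in cubic angstroms: V = a*b*c*sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(volume, z, mass_da):
    """Matthews coefficient V_M in cubic angstroms per dalton."""
    return volume / (z * mass_da)


# Cell parameters quoted in the abstract (angstroms and degrees).
volume = monoclinic_cell_volume(142.2, 82.4, 96.8, 100.7)

# Assumptions: Z = 4 for C2, one NapAB heterodimer (91 kDa + 17 kDa) per asymmetric unit.
vm = matthews_coefficient(volume, z=4, mass_da=91_000 + 17_000)

# Assumption: the usual approximation solvent fraction = 1 - 1.23 / V_M.
solvent_fraction = 1.0 - 1.23 / vm

print(f"V = {volume:.0f} A^3, V_M = {vm:.2f} A^3/Da, solvent = {solvent_fraction:.0%}")
```

    Under these assumptions the Matthews coefficient comes out near 2.6 Å³/Da, within the range commonly seen for protein crystals, which is consistent with one heterodimer per asymmetric unit.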

  11. The use of ionic liquids as crystallization additives allowed to overcome nanodrop scaling up problems: A success case for producing diffraction-quality crystals of a nitrate reductase

    NASA Astrophysics Data System (ADS)

    Coelho, Catarina; Trincão, José; João Romão, Maria

    2010-02-01

    The native structure of the heterodimeric periplasmic nitrate reductase (NapAB) from Cupriavidus (C.) necator was solved at 1.5 Å resolution, using one single crystal obtained at the robot facility at the EMBL, Grenoble. The reaction mechanism for this family of proteins was recently revised, based on new crystallographic evidence, and new structural studies are required to clarify this new mechanistic implication. Several nanodrop crystallization trials yielded microcrystals of the C. necator NapAB. However, scale-up attempts systematically failed and did not yield any suitable crystals. Only with the use of ionic liquids (IL) were we able to grow, in a reproducible manner, larger crystals, which diffracted X-rays to 1.7 Å resolution. By using the IL [C4mim]Cl as a crystallization additive, we achieved reproducibility in obtaining good quality crystals. Although no IL molecules could be identified in the electron density maps, the crystals grown in the presence and absence of IL have large differences in cell constants. This is the first report of the use of IL for a difficult crystallization problem. The procedure now reported can be applied for crystal optimization such as size increase or improvement of fine needles, as well as for scaling-up crystallization conditions from nanolitre to microlitre drop volumes.

  12. Sequence analysis of the complete genome of an iridovirus isolated from the tiger frog.

    PubMed

    He, Jian G; Lü, Ling; Deng, Min; He, Hua H; Weng, Shao P; Wang, Xiao H; Zhou, Song Y; Long, Qin X; Wang, Xun Z; Chan, Siu M

    2002-01-20

    We have isolated a tiger frog virus (TFV) from diseased tiger frogs, Rana tigrina rugulosa. The genome was a linear double-stranded DNA of 105,057 basepairs in length with a base composition of 55.01% G+C. About 105 open reading frames were identified with coding capacities for polypeptides ranging from 40 to 1294 amino acids. Computer-assisted analyses of the deduced amino acid sequences revealed that 39 of 105 putative gene products showed significant homology to functionally characterized proteins of other species in the GenBank/EMBL/DDBJ databases. These proteins included enzymes and structural proteins involved in virus replication, transcription, modification, and virus-host interaction. The deduced amino acid sequences of TFV gene products showed more than 90% identity to FV3, but a low degree of similarity among TFV, ISKNV, and LCDV-1. The results from this study indicated that TFV may belong to the genus Ranavirus of the family Iridoviridae. PMID:11878922

  13. Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees

    PubMed Central

    Letunic, Ivica; Bork, Peer

    2016-01-01

    Interactive Tree Of Life (http://itol.embl.de) is a web-based tool for the display, manipulation and annotation of phylogenetic trees. It is freely available and open to everyone. The current version was completely redesigned and rewritten, utilizing current web technologies for speedy and streamlined processing. Numerous new features were introduced and several new data types are now supported. Trees with up to 100,000 leaves can now be efficiently displayed. Full interactive control over precise positioning of various annotation features and an unlimited number of datasets allow the easy creation of complex tree visualizations. iTOL 3 is the first tool which supports direct visualization of the recently proposed phylogenetic placements format. Finally, iTOL's account system has been redesigned to simplify the management of trees in user-defined workspaces and projects, as it is heavily used and already handles more than 500,000 trees from more than 10,000 individual users. PMID:27095192

  14. The IUPHAR/BPS Guide to PHARMACOLOGY: an expert-driven knowledgebase of drug targets and their ligands

    PubMed Central

    Pawson, Adam J.; Sharman, Joanna L.; Benson, Helen E.; Faccenda, Elena; Alexander, Stephen P.H.; Buneman, O. Peter; Davenport, Anthony P.; McGrath, John C.; Peters, John A.; Southan, Christopher; Spedding, Michael; Yu, Wenyuan; Harmar, Anthony J.

    2014-01-01

    The International Union of Basic and Clinical Pharmacology/British Pharmacological Society (IUPHAR/BPS) Guide to PHARMACOLOGY (http://www.guidetopharmacology.org) is a new open access resource providing pharmacological, chemical, genetic, functional and pathophysiological data on the targets of approved and experimental drugs. Created under the auspices of the IUPHAR and the BPS, the portal provides concise, peer-reviewed overviews of the key properties of a wide range of established and potential drug targets, with in-depth information for a subset of important targets. The resource is the result of curation and integration of data from the IUPHAR Database (IUPHAR-DB) and the published BPS ‘Guide to Receptors and Channels’ (GRAC) compendium. The data are derived from a global network of expert contributors, and the information is extensively linked to relevant databases, including ChEMBL, DrugBank, Ensembl, PubChem, UniProt and PubMed. Each of the ∼6000 small molecule and peptide ligands is annotated with manually curated 2D chemical structures or amino acid sequences, nomenclature and database links. Future expansion of the resource will complete the coverage of all the targets of currently approved drugs and future candidate targets, alongside educational resources to guide scientists and students in pharmacological principles and techniques. PMID:24234439

  15. Identification of Tuber borchii Vittad. mycelium proteins separated by two-dimensional polyacrylamide gel electrophoresis using amino acid analysis and sequence tagging.

    PubMed

    Vallorani, L; Bernardini, F; Sacconi, C; Pierleoni, R; Pieretti, B; Piccoli, G; Buffalini, M; Stocchi, V

    2000-11-01

    This paper reports the first results in the proteome analysis of Tuber borchii Vittad. mycelium, an ectomycorrhizal fungus poorly defined genetically, but known for its generation of edible fruit bodies known as white truffles. Employing isoelectric focusing on immobilized pH gradients, followed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis, we obtained an electropherogram presenting over 800 spots within the window of isoelectric points (pI) 3.5-9 and a molecular mass of 10-200 kDa. Different reducing agents were tested in the sample preparation buffers, and the standard lysis buffer plus 2% w/v polyvinylpolypyrrolidone allowed the best solubilization and resolution of the proteins. The T. borchii proteins separated in micropreparative gels were electroblotted onto polyvinylidene difluoride membranes and visualized by Coomassie staining. Twenty-three proteins were excised and analyzed by the combination of amino acid and N-terminal analysis. One protein was identified by matching its amino acid composition, estimated isoelectric point and molecular mass against the SWISS-PROT and EMBL databases. Four spots were successfully tagged by Edman microsequencing but no homologous sequences were found in databases. PMID:11271490

  16. INsPeCT: INtegrative Platform for Cancer Transcriptomics

    PubMed Central

    Madhamshettiwar, Piyush B.; Maetschke, Stefan R.; Davis, Melissa J.; Reverter, Antonio; Ragan, Mark A.

    2014-01-01

    The emergence of transcriptomics, fuelled by high-throughput sequencing technologies, has changed the nature of cancer research and resulted in a massive accumulation of data. Computational analysis, integration, and data visualization are now major bottlenecks in cancer biology and translational research. Although many tools have been brought to bear on these problems, their use remains unnecessarily restricted to computational biologists, as many tools require scripting skills, data infrastructure, and powerful computational facilities. New user-friendly, integrative, and automated analytical approaches are required to make computational methods more generally useful to the research community. Here we present INsPeCT (INtegrative Platform for Cancer Transcriptomics), which allows users with basic computer skills to perform comprehensive in-silico analyses of microarray, ChIP-seq, and RNA-seq data. INsPeCT supports the selection of interesting genes for advanced functional analysis. Included in its automated workflows are (i) a novel analytical framework, RMaNI (regulatory module network inference), which supports the inference of cancer subtype-specific transcriptional module networks and the analysis of modules; and (ii) WGCNA (weighted gene co-expression network analysis), which infers modules of highly correlated genes across microarray samples, associated with sample traits, e.g. survival time. INsPeCT is available free of cost from Bioinformatics Resource Australia-EMBL and can be accessed at http://inspect.braembl.org.au. PMID:24653643

  17. Gramene 2016: comparative plant genomics and pathway resources

    PubMed Central

    Tello-Ruiz, Marcela K.; Stein, Joshua; Wei, Sharon; Preece, Justin; Olson, Andrew; Naithani, Sushma; Amarasinghe, Vindhya; Dharmawardhana, Palitha; Jiao, Yinping; Mulvaney, Joseph; Kumari, Sunita; Chougule, Kapeel; Elser, Justin; Wang, Bo; Thomason, James; Bolser, Daniel M.; Kerhornou, Arnaud; Walts, Brandon; Fonseca, Nuno A.; Huerta, Laura; Keays, Maria; Tang, Y. Amy; Parkinson, Helen; Fabregat, Antonio; McKay, Sheldon; Weiser, Joel; D'Eustachio, Peter; Stein, Lincoln; Petryszak, Robert; Kersey, Paul J.; Jaiswal, Pankaj; Ware, Doreen

    2016-01-01

    Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the database website adopted a new Drupal management platform. The genomes section features 39 fully assembled reference genomes that are integrated using ontology-based annotation and comparative analyses, and accessed through both visual and programmatic interfaces. Additional community data, such as genetic variation, expression and methylation, are also mapped for a subset of genomes. The Plant Reactome pathway portal (http://plantreactome.gramene.org) provides a reference resource for analyzing plant metabolic and regulatory pathways. In addition to ∼200 curated rice reference pathways, the portal hosts gene homology-based pathway projections for 33 plant species. Both the genome and pathway browsers interface with the EMBL-EBI's Expression Atlas to enable the projection of baseline and differential expression data from curated expression studies in plants. Gramene's archive website (http://archive.gramene.org) continues to provide previously reported resources on comparative maps, markers and QTL. To further aid our users, we have also introduced a live monthly educational webinar series and a Gramene YouTube channel carrying video tutorials. PMID:26553803

  18. Reactome pathway analysis to enrich biological discovery in proteomics data sets.

    PubMed

    Haw, Robin; Hermjakob, Henning; D'Eustachio, Peter; Stein, Lincoln

    2011-09-01

    Reactome (http://www.reactome.org) is an open-source, expert-authored, peer-reviewed, manually curated database of reactions, pathways and biological processes. We provide an intuitive web-based user interface to pathway knowledge and a suite of data analysis tools. The Pathway Browser is a Systems Biology Graphical Notation-like visualization system that supports manual navigation of pathways by zooming, scrolling and event highlighting, and that exploits PSI Common Query Interface web services to overlay pathways with molecular interaction data from the Reactome Functional Interaction Network and interaction databases such as IntAct, ChEMBL and BioGRID. Pathway and expression analysis tools employ web services to provide ID mapping, pathway assignment and over-representation analysis of user-supplied data sets. By applying Ensembl Compara to curated human proteins and reactions, Reactome generates pathway inferences for 20 other species. The Species Comparison tool provides a summary of results for each of these species as a table showing numbers of orthologous proteins found by pathway from which users can navigate to inferred details for specific proteins and reactions. Reactome's diverse pathway knowledge and suite of data analysis tools provide a platform for data mining, modeling and analysis of large-scale proteomics data sets. This Tutorial is part of the International Proteomics Tutorial Programme (IPTP 8). PMID:21751369
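
    The over-representation analysis mentioned above can be illustrated with a generic one-sided hypergeometric test. This is a minimal sketch only: the abstract does not state which statistic Reactome uses, and the function name, gene counts and background size below are hypothetical values chosen for illustration.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) when `draws` items are sampled without replacement from
    `population` items, of which `successes` belong to the pathway."""
    upper = min(successes, draws)
    total = comb(population, draws)
    hits = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return hits / total


# Hypothetical numbers: 20,000 background genes, a 100-gene pathway,
# a 200-gene user list, 10 of which fall in the pathway.
p_value = hypergeom_sf(10, population=20_000, successes=100, draws=200)
print(f"over-representation p-value: {p_value:.3g}")
```

    With an expected overlap of about one gene under these hypothetical numbers, observing ten yields a very small p-value; in practice such values would also be corrected for the number of pathways tested.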

  19. Reactome: a database of reactions, pathways and biological processes.

    PubMed

    Croft, David; O'Kelly, Gavin; Wu, Guanming; Haw, Robin; Gillespie, Marc; Matthews, Lisa; Caudy, Michael; Garapati, Phani; Gopinath, Gopal; Jassal, Bijay; Jupe, Steven; Kalatskaya, Irina; Mahajan, Shahana; May, Bruce; Ndegwa, Nelson; Schmidt, Esther; Shamovsky, Veronica; Yung, Christina; Birney, Ewan; Hermjakob, Henning; D'Eustachio, Peter; Stein, Lincoln

    2011-01-01

    Reactome (http://www.reactome.org) is a collaboration among groups at the Ontario Institute for Cancer Research, Cold Spring Harbor Laboratory, New York University School of Medicine and The European Bioinformatics Institute, to develop an open source curated bioinformatics database of human pathways and reactions. Recently, we developed a new web site with improved tools for pathway browsing and data analysis. The Pathway Browser is a Systems Biology Graphical Notation (SBGN)-based visualization system that supports zooming, scrolling and event highlighting. It exploits PSICQUIC web services to overlay our curated pathways with molecular interaction data from the Reactome Functional Interaction Network and external interaction databases such as IntAct, BioGRID, ChEMBL, iRefIndex, MINT and STRING. Our Pathway and Expression Analysis tools enable ID mapping, pathway assignment and overrepresentation analysis of user-supplied data sets. To support pathway annotation and analysis in other species, we continue to make orthology-based inferences of pathways in non-human species, applying Ensembl Compara to identify orthologs of curated human proteins in each of 20 other species. The resulting inferred pathway sets can be browsed and analyzed with our Species Comparison tool. Collaborations are also underway to create manually curated data sets on the Reactome framework for chicken, Drosophila and rice. PMID:21067998

  20. A Rational Approach for the Identification of Non-Hydroxamate HDAC6-Selective Inhibitors.

    PubMed

    Goracci, Laura; Deschamps, Nathalie; Randazzo, Giuseppe Marco; Petit, Charlotte; Dos Santos Passos, Carolina; Carrupt, Pierre-Alain; Simões-Pires, Claudia; Nurisso, Alessandra

    2016-01-01

    The human histone deacetylase isoform 6 (HDAC6) has been demonstrated to play a major role in cell motility and aggresome formation, being interesting for the treatment of multiple tumour types and neurodegenerative conditions. Currently, most HDAC inhibitors in preclinical or clinical evaluations are non-selective inhibitors, characterised by a hydroxamate zinc-binding group (ZBG) showing off-target effects and mutagenicity. The identification of selective HDAC6 inhibitors with novel chemical properties has not been successful yet, also because of the absence of crystallographic information that makes the rational design of HDAC6 selective inhibitors difficult. Using HDAC inhibitory data retrieved from the ChEMBL database and ligand-based computational strategies, we identified 8 original new non-hydroxamate HDAC6 inhibitors from the SPECS database, with activity in the low μM range. The most potent and selective compound, bearing a hydrazide ZBG, was shown to increase tubulin acetylation in human cells. No effects on histone H4 acetylation were observed. To the best of our knowledge, this is the first report of an HDAC6 selective inhibitor bearing a hydrazide ZBG. Its capability to passively cross the blood-brain barrier (BBB), as observed through PAMPA assays, and its low cytotoxicity in vitro, suggested its potential for drug development. PMID:27404291

  1. The X chromosome is necessary for somatic development in the dioecious Silene latifolia: cytogenetic and molecular evidence and sequencing of a haploid genome.

    PubMed

    Soukupova, Magda; Nevrtalova, Eva; Cížková, Jana; Vogel, Ivan; Cegan, Radim; Hobza, Roman; Vyskot, Boris

    2014-01-01

    Silene latifolia (or white campion) possesses a well-established sex determination system with a dominant Y chromosome in males (the mammalian type). The heteromorphic sex chromosomes X and Y in S. latifolia have largely stopped recombining; thus, we can expect a gradual genetic degeneration of the Y chromosome. It is well proven that neither diploid nor polyploid S. latifolia sporophytes can survive without at least one X, so the only life stage possessing the Y as the sole sex chromosome is the male gametophyte (pollen tube), while the female gametophyte seems to be X-dependent. Previous studies on anther-derived plants of this species showed that the obtained plants (largely haploid or dihaploid) were phenotypically and cytologically female. In this paper, we provide molecular evidence for the inviability of plants lacking the X chromosome. Using sex-specific PCR primers, we show that all plantlets and plants derived from anther cultures are female. In studying anther-derived diploid females by sequencing of X-linked markers, we demonstrate that these plants are really homozygous dihaploids. A haploid regenerant plant was sequenced (8× genome coverage) using Illumina technology. Genome data are available in the EMBL database as a standard for full genome and X chromosome assembly in this model species. Homozygous dihaploids were back-crossed with males to yield a progeny useful for the study of the evolution of the Y chromosome. PMID:24993893

  2. The IUPHAR/BPS Guide to PHARMACOLOGY: an expert-driven knowledgebase of drug targets and their ligands.

    PubMed

    Pawson, Adam J; Sharman, Joanna L; Benson, Helen E; Faccenda, Elena; Alexander, Stephen P H; Buneman, O Peter; Davenport, Anthony P; McGrath, John C; Peters, John A; Southan, Christopher; Spedding, Michael; Yu, Wenyuan; Harmar, Anthony J

    2014-01-01

    The International Union of Basic and Clinical Pharmacology/British Pharmacological Society (IUPHAR/BPS) Guide to PHARMACOLOGY (http://www.guidetopharmacology.org) is a new open access resource providing pharmacological, chemical, genetic, functional and pathophysiological data on the targets of approved and experimental drugs. Created under the auspices of the IUPHAR and the BPS, the portal provides concise, peer-reviewed overviews of the key properties of a wide range of established and potential drug targets, with in-depth information for a subset of important targets. The resource is the result of curation and integration of data from the IUPHAR Database (IUPHAR-DB) and the published BPS 'Guide to Receptors and Channels' (GRAC) compendium. The data are derived from a global network of expert contributors, and the information is extensively linked to relevant databases, including ChEMBL, DrugBank, Ensembl, PubChem, UniProt and PubMed. Each of the ∼6000 small molecule and peptide ligands is annotated with manually curated 2D chemical structures or amino acid sequences, nomenclature and database links. Future expansion of the resource will complete the coverage of all the targets of currently approved drugs and future candidate targets, alongside educational resources to guide scientists and students in pharmacological principles and techniques. PMID:24234439

  3. A Comprehensive, High-Resolution Genomic Transcript Map of Human Skeletal Muscle

    PubMed Central

    Bortoluzzi, Stefania; Rampoldi, Luca; Simionati, Barbara; Zimbello, Rosanna; Barbon, Alessandro; d’Alessi, Fabio; Tiso, Natascia; Pallavicini, Alberto; Toppo, Stefano; Cannata, Nicola; Valle, Giorgio; Lanfranchi, Gerolamo; Danieli, Gian Antonio

    1998-01-01

    We present the Human Muscle Gene Map (HMGM), the first comprehensive and updated high-resolution expression map of human skeletal muscle. The 1078 entries of the map were obtained by merging data retrieved from UniGene with the RH mapping information on 46 novel muscle transcripts, which showed no similarity to any known sequence. In the map, distances are expressed in megabase pairs. About one-quarter of the map entries represents putative novel genes. Genes known to be specifically expressed in muscle account for <4% of the total. The genomic distribution of the map entries confirmed the previous finding that muscle genes are selectively concentrated in chromosomes 17, 19, and X. Five chromosomal regions are suspected to have a significant excess of muscle genes. Present data support the hypothesis that the biochemical and functional properties of differentiated muscle cells may result from the transcription of a very limited number of muscle-specific genes along with the activity of a large number of genes, shared with other tissues, but showing different levels of expression in muscle. [The sequence data described in this paper have been submitted to the EMBL data library under accession nos. F23198–F23242.] PMID:9724327

  4. [A new unique HIV-1 recombinant form detected in Belarus].

    PubMed

    Eremin, V F; Gasich, E L; Sosinovich, S V

    2012-01-01

    Republican Research-and-Practical Center for Epidemiology and Microbiology, Ministry of Health of Belarus, Minsk. The paper presents data on the molecular genetic characteristics of a new HIV-1 recombinant form. The study has shown that the virus is classified as HIV-1 subtype B in terms of the gag gene and HIV-1 subtype A in terms of the pol and env genes. At the same time the new isolate is closer, in terms of the gag gene, to the HIV-1 DQ207943 strain isolated in Georgia; in terms of the pol gene, to the HIV-1 AF413987.1 strain isolated in Ukraine; and, in terms of the env gene, to the HIV-1 AY500393 strain isolated in Russia. Thus, the described new HIV-1 recombinant form has the following structure: BgagApolAenv. The gag, pol, and env gene sequences from the new unique HIV-1 recombinant form have been registered in the international database EMBL/GenBank/DDBJ under accession numbers FR775442.1, FN995656.1, and FR775443.1. PMID:22905420

  5. Quantitative Structure-Antioxidant Activity Models of Isoflavonoids: A Theoretical Study.

    PubMed

    Castellano, Gloria; Torrens, Francisco

    2015-01-01

    Seventeen isoflavonoids from isoflavone, isoflavanone and isoflavan classes are selected from Dalbergia parviflora. The ChEMBL database is representative from these molecules, most of which are highly drug-like. Binary rules appear risky for the selection of compounds with high antioxidant capacity in complementary xanthine/xanthine oxidase, ORAC, and DPPH model assays. Isoflavonoid structure-activity analysis shows the most important properties (log P, log D, pKa, QED, PSA, NH + OH ≈ HBD, N + O ≈ HBA). Some descriptors (PSA, HBD) are detected as more important than others (size measure Mw, HBA). Linear and nonlinear models of antioxidant potency are obtained. Weak nonlinear relationships appear between log P, etc. and antioxidant activity. The different capacity trends for the three complementary assays are explained. Isoflavonoid potency depends on the chemical form that determines their solubility. Results from isoflavonoids analysis will be useful for activity prediction of new sets of flavones and to design drugs with antioxidant capacity, which will prove beneficial for health with implications for antiageing therapy. PMID:26062128

  6. Annotating Human P-Glycoprotein Bioassay Data

    PubMed Central

    Zdrazil, Barbara; Pinto, Marta; Vasanthanathan, Poongavanam; Williams, Antony J; Balderud, Linda Zander; Engkvist, Ola; Chichester, Christine; Hersey, Anne; Overington, John P; Ecker, Gerhard F

    2012-01-01

    Huge amounts of small compound bioactivity data have been entering the public domain as a consequence of open innovation initiatives. It is now the time to carefully analyse existing bioassay data and give it a systematic structure. Our study aims to annotate prominent in vitro assays used for the determination of bioactivities of human P-glycoprotein inhibitors and substrates as they are represented in the ChEMBL and TP-search open source databases. Furthermore, the ability of data, determined in different assays, to be combined with each other is explored. As a result of this study, it is suggested that for inhibitors of human P-glycoprotein it is possible to combine data coming from the same assay type, if the cell lines used are also identical and the fluorescent or radiolabeled substrates have overlapping binding sites. In addition, it demonstrates that there is a need for larger chemically diverse datasets that have been measured in a panel of different assays. This would certainly alleviate the search for other inter-correlations between bioactivity data yielded by different assay setups. PMID:23293680

  7. The use of a mini-κ goniometer head in macromolecular crystallography diffraction experiments

    SciTech Connect

    Brockhauser, Sandor; Ravelli, Raimond B. G.; McCarthy, Andrew A.

    2013-07-01

    Hardware and software solutions for MX data-collection strategies using the EMBL/ESRF miniaturized multi-axis goniometer head are presented. Most macromolecular crystallography (MX) diffraction experiments at synchrotrons use a single-axis goniometer. This markedly contrasts with small-molecule crystallography, in which the majority of the diffraction data are collected using multi-axis goniometers. A novel miniaturized κ-goniometer head, the MK3, has been developed to allow macromolecular crystals to be aligned. It is available on the majority of the structural biology beamlines at the ESRF, as well as elsewhere. In addition, the Strategy for the Alignment of Crystals (STAC) software package has been developed to facilitate the use of the MK3 and other similar devices. Use of the MK3 and STAC is streamlined by their incorporation into online analysis tools such as EDNA. The current use of STAC and MK3 on the MX beamlines at the ESRF is discussed. It is shown that the alignment of macromolecular crystals can result in improved diffraction data quality compared with data obtained from randomly aligned crystals.

  8. Selective deuteration of tryptophan and methionine residues in maltose binding protein: a model system for neutron scattering.

    PubMed

    Laux, Valerie; Callow, Phil; Svergun, Dmitri I; Timmins, Peter A; Forsyth, V Trevor; Haertlein, Michael

    2008-07-01

    We describe methods that have been developed within the ILL-EMBL Deuteration Laboratory for the production of maltose binding protein (MBP) that has been selectively labelled either with deuterated tryptophan or deuterated methionine (single labelling), or both (double labelling). MBP is used as an important model system for biophysical studies, and selective labelling can be helpful in the analysis of small-angle neutron scattering (SANS) data, neutron reflection (NR) data, and high-resolution neutron diffraction data. The selective labelling was carried out in E. coli high-cell density cultures using auxotrophic mutants in minimal medium containing the required deuterated precursors. Five types of sample were prepared and studied: (1) unmodified hydrogenated MBP (H-MBP), (2) perdeuterated MBP (D-MBP), (3) singly labelled MBP with the tryptophan residues deuterated (D-trp MBP), (4) singly labelled MBP with methionine residues deuterated (D-met MBP) and (5) doubly labelled MBP with both tryptophan and methionine residues deuterated (D-trp/met MBP). Labelled samples were characterised by size exclusion chromatography, gel electrophoresis, light scattering and mass spectroscopy. Preliminary small-angle neutron scattering (SANS) experiments have also been carried out and show measurable differences between the SANS data recorded for the various labelled analogues. More detailed SANS experiments using these labelled MBP analogues are planned; the degree to which such data could enhance structure determination by SANS is discussed. PMID:18274740

  9. Molecular characterization of selected dermatophytes and their identification by electrophoretic mutation scanning.

    PubMed

    Cafarchia, Claudia; Otranto, Domenico; Weigl, Stefania; Campbell, Bronwyn E; Parisi, Antonio; Cantacessi, Cinzia; Mancianti, Francesca; Danesi, Patrizia; Gasser, Robin B

    2009-10-01

    Dermatophytes are fungi that can be contagious and cause infections in the keratinized skin of mammals, including humans. The etiological diagnosis of dermatophytosis relies on a combination of in vitro-culture and microscopic methods. Effective molecular tools could overcome the limitations of conventional methods of identification. In the present study, following phenetic identification as M. canis, M. fulvum, M. gypseum, T. mentagrophytes and T. terrestre, we genetically characterized key dermatophytes, employing the sequences of the first and second internal transcribed spacers of nuclear ribosomal DNA as well as part of the chitin synthase-1 gene, and assessed the utility of these DNA regions (based on levels of nucleotide variation within and among species/taxa) as markers for the classification of species and genotypes. Employing partial chitin synthase-1 gene as the marker, we also established a PCR-coupled SSCP approach as a diagnostic/analytical mutation-scanning tool. This tool should facilitate fundamental investigations of the ecology, epidemiology and population genetics of dermatophytes and, importantly, should assist in allowing a more rapid diagnosis of dermatophytoses in humans and other animals, thus overcoming the significant delays in targeted chemotherapy following diagnosis using conventional methods. (Nucleotide sequence data reported in this paper are available in the EMBL, GenBank and DDBJ databases under accession numbers FJ897707-FJ897713 (ITS-1), FJ897714-FJ897720 (ITS-2) and FJ897700-FJ897706 (pchs-1)). PMID:19862737

  10. probeCheck--a central resource for evaluating oligonucleotide probe coverage and specificity.

    PubMed

    Loy, Alexander; Arnold, Roland; Tischler, Patrick; Rattei, Thomas; Wagner, Michael; Horn, Matthias

    2008-10-01

    The web server probeCheck, freely accessible at http://www.microbial-ecology.net/probecheck, provides a pivotal forum for rapid specificity and coverage evaluations of probes and primers against selected databases of phylogenetic and functional marker genes. Currently, 24 widely used sequence collections including the Ribosomal Database Project (RDP) II, Greengenes, SILVA and the Functional Gene Pipeline/Repository can be queried. For this purpose, probeCheck integrates a new online version of the popular ARB probe match tool with free energy (DeltaG) calculations for each perfectly matched and mismatched probe-target hybrid, allowing assessment of the theoretical binding stabilities of oligo-target and non-target hybrids. For each output sequence, the accession number, the GenBank taxonomy and a link to the respective entry at GenBank, EMBL and, if applicable, the query database are displayed. Filtering options allow customizing results on the output page. In addition, probeCheck is linked with probe match tools of RDP II and Greengenes, NCBI blast, the Oligonucleotide Properties Calculator, the two-state folding tool of the DINAMelt server and the rRNA-targeted probe database probeBase. Taken together, these features provide a multifunctional platform with maximal flexibility for the user in the choice of databases and options for the evaluation of published and newly developed probes and primers. PMID:18647333

  11. iPPI-DB: an online database of modulators of protein–protein interactions

    PubMed Central

    Labbé, Céline M.; Kuenemann, Mélaine A.; Zarzycka, Barbara; Vriend, Gert; Nicolaes, Gerry A.F.; Lagorce, David; Miteva, Maria A.; Villoutreix, Bruno O.; Sperandio, Olivier

    2016-01-01

    In order to boost the identification of low-molecular-weight drugs on protein–protein interactions (PPI), it is essential to properly collect and annotate experimental data about successful examples. This provides the scientific community with the necessary information to derive trends about privileged physicochemical properties and chemotypes that maximize the likelihood of promoting a given chemical probe to the most advanced stages of development. To this end we have developed iPPI-DB (freely accessible at http://www.ippidb.cdithem.fr), a database that contains the structure, some physicochemical characteristics, the pharmacological data and the profile of the PPI targets of several hundred modulators of protein–protein interactions. iPPI-DB is accessible through a web application and can be queried according to two general approaches: using physicochemical/pharmacological criteria; or by chemical similarity to a user-defined structure input. In both cases the results are displayed as a sortable and exportable datasheet with links to external databases such as UniProt and PubMed. Furthermore, each compound in the table has a link to an individual ID card that contains its physicochemical and pharmacological profile derived from iPPI-DB data. This includes information about its binding data, ligand and lipophilic efficiencies, location in the PPI chemical space, and, importantly, similarity with known drugs, and links to external databases such as PubChem and ChEMBL. PMID:26432833

  12. Cloning and organization of seven arginine biosynthesis genes from Neisseria gonorrhoeae.

    PubMed Central

    Picard, F J; Dillon, J R

    1989-01-01

    A genomic library for Neisseria gonorrhoeae, constructed in the lambda cloning vector EMBL4, was screened for clones carrying arginine biosynthesis genes by complementation of Escherichia coli mutants. Clones complementing defects in argA, argB, argE, argG, argIF, carA, and carB were isolated. An E. coli defective in the acetylornithine deacetylase gene (argE) was complemented by the ornithine acetyltransferase gene (argJ) from N. gonorrhoeae. This heterologous complementation is reported for the first time. The carAB operon from E. coli hybridized with the gonococcal clones that carried carA or carB genes under conditions of high stringency, detecting 80% or greater similarity and showing that the nucleotide sequence of the carbamoylphosphate synthetase genes is very similar in these two organisms. Under these conditions for hybridization, the gonococcal clones carrying argB or argF genes did not hybridize with plasmids containing the corresponding E. coli gene. Cocomplementation experiments established gene linkage between carA and carB. Clones complementing a gene defect in argE were also able to complement an argA mutation. This suggests that the enzyme ornithine acetyltransferase from N. gonorrhoeae (encoded by argJ) may be able to complement both argA and argE mutations in E. coli. The arginine biosynthesis genes in N. gonorrhoeae appear to be scattered as in members of the family Pseudomonadaceae. PMID:2493452

  13. IDChase: Mitigating Identifier Migration Trap in Biological Databases

    NASA Astrophysics Data System (ADS)

    Bhattacharjee, Anupam; Islam, Aminul; Jamil, Hasan; Wildman, Derek

    A convenient mechanism to refer to large biological objects such as sequences, structures and networks is the use of identifiers or handles, commonly called IDs. IDs function as a unique place holder in an application for objects too large to be of immediate use in a table which is retrieved from a secondary archive when needed. Usually, applications use IDs of objects managed by remote databases that the applications do not have any control over such as GenBank, EMBL and UCSC. Unfortunately, IDs are generally not unique and frequently change as the objects they refer to change. Consequently, applications built using such IDs need to adapt by monitoring possible ID migration occurring in databases they do not control, or risk producing inconsistent, or out of date results, or even face loss of functionality. In this paper, we develop a wrapper based approach to recognizing ID migration in secondary databases, mapping obsolete IDs to valid new IDs, and updating databases to restore their intended functionality. We present our technique in detail using an example involving NCBI RefSeq as primary, and OCPAT as secondary databases. Based on the proposed technique, we introduce a new wrapper like tool, called IDChase, to address the ID migration problem in biological databases and as a general platform.

  14. [In silico identification of molecular mimicry between T-cell epitopes of Neisseria meningitidis B and the human proteome].

    PubMed

    Batista-Duharte, Alexander; Téllez, Bruno; Tamayo, Maybia; Portuondo, Deivys; Cabrera, Osmir; Sierra, Gustavo; Pérez, Oliver

    2013-07-01

    The objective of the study was to determine the T-cell epitopes of four of the most frequent antigenic proteins of the outer membrane of Neisseria meningitidis B, and to identify the most relevant sites for molecular mimicry with T-cell epitopes in humans. In order to do so, an in silico study -a type of study that uses bioinformatic tools- was carried out using SWISS-PROT/TrEMBL, SYFPEITHI and FASTA databases, which helped to determine the protein sequences, CD4 and CD8 T-cell epitope prediction, as well as the molecular mimicry with humans, respectively. Molecular similarity was found in several human proteins present in different organs and tissues such as: liver, skin and epithelial tissues, brain, lymphatic system and testicles. Of these, those found in testicles were more similar, showing the highest frequency of mimetic sequences. This finding shed light on the success of N. meningitidis B to colonize human tissues and the failure of certain vaccines against this bacterium, and it even helps to explain possible autoimmune reactions associated with the infection or vaccination. PMID:24100820

  15. From ontology selection and semantic web to an integrated information system for food-borne diseases and food safety.

    PubMed

    Yan, Xianghe; Peng, Yun; Meng, Jianghong; Ruzante, Juliana; Fratamico, Pina M; Huang, Lihan; Juneja, Vijay; Needleman, David S

    2011-01-01

    Several factors have hindered effective use of information and resources related to food safety due to inconsistency among semantically heterogeneous data resources, lack of knowledge on profiling of food-borne pathogens, and knowledge gaps among research communities, government risk assessors/managers, and end-users of the information. This paper discusses technical aspects in the establishment of a comprehensive food safety information system consisting of the following steps: (a) computational collection and compiling publicly available information, including published pathogen genomic, proteomic, and metabolomic data; (b) development of ontology libraries on food-borne pathogens and design automatic algorithms with formal inference and fuzzy and probabilistic reasoning to address the consistency and accuracy of distributed information resources (e.g., PulseNet, FoodNet, OutbreakNet, PubMed, NCBI, EMBL, and other online genetic databases and information); (c) integration of collected pathogen profiling data, Foodrisk.org ( http://www.foodrisk.org ), PMP, Combase, and other relevant information into a user-friendly, searchable, "homogeneous" information system available to scientists in academia, the food industry, and government agencies; and (d) development of a computational model in semantic web for greater adaptability and robustness. PMID:21431616

  16. PoSSuM v.2.0: data update and a new function for investigating ligand analogs and target proteins of small-molecule drugs.

    PubMed

    Ito, Jun-ichi; Ikeda, Kazuyoshi; Yamada, Kazunori; Mizuguchi, Kenji; Tomii, Kentaro

    2015-01-01

    PoSSuM (http://possum.cbrc.jp/PoSSuM/) is a database for detecting similar small-molecule binding sites on proteins. Since its initial release in 2011, PoSSuM has grown to provide information related to 49 million pairs of similar binding sites discovered among 5.5 million known and putative binding sites. This enlargement of the database is expected to enhance opportunities for biological and pharmaceutical applications, such as predictions of new functions and drug discovery. In this release, we have provided a new service named PoSSuM drug search (PoSSuMds) at http://possum.cbrc.jp/PoSSuM/drug_search/, in which we selected 194 approved drug compounds retrieved from ChEMBL, and detected their known binding pockets and pockets that are similar to them. Users can access and download all of the search results via a new web interface, which is useful for finding ligand analogs as well as potential target proteins. Furthermore, PoSSuMds enables users to explore the binding pocket universe within PoSSuM. Additionally, we have improved the web interface with new functions, including sortable tables and a viewer for visualizing and downloading superimposed pockets. PMID:25404129

  17. Feature expressions: creating and manipulating sequence datasets.

    PubMed

    Fristensky, B

    1993-12-25

    Annotation of features, such as introns, exons and protein coding regions in GenBank/EMBL/DDBJ entries is now standardized through use of the Features Table (FT) language. The essence of the FT language is described by the relation 'expression-->sequence', meaning that each FT expression evaluates to a sequence. For example, the expression M74750:1..50 evaluates to the first 50 bases of the sequence with accession number M74750. Because FT is intrinsic to the database definition, it can serve as a software- and platform-independent lingua franca for sequence manipulation. The XYLEM package makes it possible to create and manipulate sequence datasets using FT expressions. FEATURES is a program that resolves FT expressions into their corresponding sequences. Annotated features can be retrieved either by feature key or by expression. Even unannotated portions of a sequence can be retrieved by user-generated FT expressions. Applications of the FT language include retrieval of subsequences from large sequence entries, generation of chromosome models or artificial DNA constructs, and representation of restriction maps or mutants. PMID:8290362
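
    The 'expression-->sequence' relation described above can be illustrated with a minimal Python sketch. It is not the XYLEM FEATURES program: it handles only the simple ACCESSION:start..end form, and the `SEQUENCES` dictionary, the `evaluate` function and the 60-base sequence are all hypothetical stand-ins (the sequence is invented and is not the real M74750 entry). It assumes positions are 1-based and inclusive, which is what "the first 50 bases" for 1..50 implies.

```python
import re

# Hypothetical stand-in for a sequence database: accession -> sequence.
# The 60-base sequence is invented for illustration; it is NOT the real M74750 entry.
SEQUENCES = {"M74750": "ACGT" * 15}

# Only the simple form ACCESSION:start..end is handled in this sketch.
SIMPLE_RANGE = re.compile(r"(\w+):(\d+)\.\.(\d+)")


def evaluate(expression, sequences):
    """Resolve a simple range expression to the subsequence it denotes."""
    match = SIMPLE_RANGE.fullmatch(expression.strip())
    if match is None:
        raise ValueError(f"unsupported expression: {expression!r}")
    accession = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    sequence = sequences[accession]  # raises KeyError if the accession is unknown
    if not 1 <= start <= end <= len(sequence):
        raise ValueError(f"range {start}..{end} is outside the sequence")
    # Positions are 1-based and inclusive, so shift the start by one for slicing.
    return sequence[start - 1:end]


first_50 = evaluate("M74750:1..50", SEQUENCES)
print(len(first_50))
```

    Keeping the parser separate from the sequence store mirrors the abstract's point that an expression is independent of the software that resolves it; a fuller implementation would need to cover the remaining location forms of the feature table language.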

  18. An American mink (Neovison vison) transcriptome.

    PubMed

    Christensen, Knud; Anistoroaei, Razvan

    2014-04-01

    HiSeq2000 Illumina pair-end sequenced transcript data originating from a pool of four different tissues of a wild-type American mink yielded approximately 90 Gb of raw data. Subsequently, unique contigs were assembled by a combined approach using velvet and phrap. Of these assembled contigs, about 136 000 match the dog genome and nearly 30 000 contigs match the human transcriptome at more than 17 000 unique gene locations. Gene annotation for these contigs was performed employing custom-made scripts run in combination with comparative sequence similarity search and alignment in the dog and human genome using blast algorithms. Transcripts representing five genes known to be associated with pigmentation were reliably aligned against large mink genomic contigs derived from BAC clones. Sequence comparison between transcript and genomic data revealed seven SNPs. In this study, we generated and annotated mink transcript sequences representing more than 16 000 known genes. This is the first comprehensive transcriptome for the American mink genome, which will facilitate further development in mink expression profiling studies and provide a good annotation basis in the perspectives of a whole genome sequencing project. The project was deposited at EMBL database with the accession number PRJEB1260. PMID:24444022

  19. Characterization of the 5'-flanking region for the human fibrinogen beta gene.

    PubMed Central

    Huber, P; Dalmon, J; Courtois, G; Laurent, M; Assouline, Z; Marguerie, G

    1987-01-01

    To identify the possible regulatory sequences in the genetic expression of fibrinogen, a human genomic DNA library raised in lambda EMBL 4 phage was screened using cDNA probes coding for the A alpha, B beta and gamma chains of human fibrinogen. The entire fibrinogen locus was characterized and its organization analysed by means of hybridization and restriction mapping. Among the clones identified, a single recombinant lambda phage contained the beta gene and its 5'- and 3'-flanking regions. A 1.5 kb fragment of the immediate 5'-flanking region was sequenced and S1 mapping experiments revealed three transcription start points. Comparison of this sequence with that previously reported for the same region upstream from the human gamma gene revealed no significant homology, which suggests that the potential promoting sequences of these genes are different. In contrast, comparison of the 5'-flanking regions of human and rat beta genes revealed a 142 bp sequence of 80% homology situated 16 bp upstream from the human beta gene. This highly conserved region may well represent a potential candidate for a regulatory sequence of the human beta gene. PMID:3029722

  20. Molecular cloning and expression in Escherichia coli of an active fused Zea mays L. D-amino acid oxidase.

    PubMed

    Gholizadeh, A; Kohnehrouz, B B

    2009-02-01

    D-Amino acid oxidase (DAAO) is an FAD-dependent enzyme that metabolizes D-amino acids in microbes and animals. However, such ability has not been identified in plants so far. We predicted a complete DAAO coding sequence consisting of 1158 bp and encoding a protein of 386 amino acids. We cloned this sequence from the leaf cDNA population of maize plants that could utilize D-alanine as a nitrogen source and grow normally on media containing D-Ala at the concentrations of 100 and 1000 ppm. For more understanding of DAAO ability in maize plant, we produced a recombinant plasmid by the insertion of isolated cDNA into the pMALc2X Escherichia coli expression vector, downstream of the maltose-binding protein coding sequence. The pMALc2X-DAAO vector was used to transform the TB1 strain of E. coli cells. Under normal growth conditions, fused DAAO (with molecular weight of about 78 kDa) was expressed up to 5 mg/liter of bacterial cells. The expressed product was purified by affinity chromatography and subjected to in vitro DAAO activity assay in the presence of five different D-amino acids. Fused DAAO could oxidize D-alanine and D-aspartate, but not D-leucine, D-isoleucine, and D-serine. The cDNA sequence reported in this paper has been submitted to EMBL databases under accession number AM407717. PMID:19267668

  1. Solution NMR Structure of Lin0431 Protein from Listeria innocua Reveals High Structural Similarity with Domain II of Bacterial Transcription Antitermination Protein NusG

    PubMed Central

    Tang, Yuefeng; Xiao, Rong; Ciccosanti, Colleen; Janjua, Haleema; Lee, Dong Yup; Everett, John K.; Swapna, G.V.T.; Acton, Thomas B.; Rost, Burkhard; Montelione, Gaetano T.

    2010-01-01

    Lin0431 protein from Listeria innocua (UniProtKB/TrEMBL ID Q92EM7/Q92EM7_LISIN) was selected as a target of the Northeast Structural Genomics Consortium (target ID: LkR112). Here, we present the high-quality NMR solution structure of this protein, which is the first representative for a member of the DUF1312 domain family. Lin0431 protein exhibits a β-sandwich topology. Four anti-parallel β-strands form one face of the sandwich and the other three anti-parallel β-strands together with a short α-helix form the other face of the sandwich. Structure alignment by Dali reveals an unexpected structural similarity with domain II of NusG from Aquifex aeolicus. Analyses of the electrostatic protein surface potential and searches for protein surface cavities reveal the conserved basic charged surface cavities of both Lin0431 and domain II of AaeNusG, suggesting they may bind the negatively charged nucleic acids and/or other binding partners. The high structural similarity and similar surface features, despite the lack of recognizable sequence similarity, between Lin0431 and AaeNusG domain II suggest that the domain II of NusG and DUF1312 domains have a homologous relationship and may share similar biochemical functions. PMID:20602357

  2. Toward a general predictive QSAR model for gamma-secretase inhibitors.

    PubMed

    Ajmani, Subhash; Janardhan, Sridhara; Viswanadhan, Vellarkad N

    2013-08-01

    Gamma secretase (GS) is an appealing drug target for Alzheimer disease and cancer because of its central role in the processing of amyloid precursor protein and the notch family of proteins. In the absence of three-dimensional structure of GS, there is an urgent need for new methods for the prediction and screening of GS inhibitors, for facilitating discovery of novel GS inhibitors. The present study reports QSAR studies on diverse chemical classes comprising 233 compounds collected from the ChEMBL database. Herein, continuous [PLS regression and neural-network (NN)] and categorical QSAR models (NN and linear discriminant analysis) were developed to obtain pertinent descriptors responsible for variation of GS inhibitor potency. Also, SAR within various chemical classes of compounds is analyzed with respect to important QSAR descriptors, which revealed the significance of electronegative substitutions on aryl rings (PEOE3) in determining variation of GS inhibitor potency. Furthermore, substitution of acyclic amines with N-substituted cyclic amines appears to be favorable for enhancing GS inhibitor potency by increasing the values of sssN_Cnt and number of aliphatic rings. The models developed are statistically significant and improve our understanding of compounds contributing toward GS inhibitor potency and aid in the rational design of novel potent GS inhibitors. PMID:23612850

  3. A Rational Approach for the Identification of Non-Hydroxamate HDAC6-Selective Inhibitors

    NASA Astrophysics Data System (ADS)

    Goracci, Laura; Deschamps, Nathalie; Randazzo, Giuseppe Marco; Petit, Charlotte; Dos Santos Passos, Carolina; Carrupt, Pierre-Alain; Simões-Pires, Claudia; Nurisso, Alessandra

    2016-07-01

    The human histone deacetylase isoform 6 (HDAC6) has been demonstrated to play a major role in cell motility and aggresome formation, being interesting for the treatment of multiple tumour types and neurodegenerative conditions. Currently, most HDAC inhibitors in preclinical or clinical evaluations are non-selective inhibitors, characterised by a hydroxamate zinc-binding group (ZBG) showing off-target effects and mutagenicity. The identification of selective HDAC6 inhibitors with novel chemical properties has not been successful yet, also because of the absence of crystallographic information that makes the rational design of HDAC6 selective inhibitors difficult. Using HDAC inhibitory data retrieved from the ChEMBL database and ligand-based computational strategies, we identified 8 original new non-hydroxamate HDAC6 inhibitors from the SPECS database, with activity in the low μM range. The most potent and selective compound, bearing a hydrazide ZBG, was shown to increase tubulin acetylation in human cells. No effects on histone H4 acetylation were observed. To the best of our knowledge, this is the first report of an HDAC6 selective inhibitor bearing a hydrazide ZBG. Its capability to passively cross the blood-brain barrier (BBB), as observed through PAMPA assays, and its low cytotoxicity in vitro, suggested its potential for drug development.

  4. Expanding the fragrance chemical space for virtual screening

    PubMed Central

    2014-01-01

    The properties of fragrance molecules in the public databases SuperScent and Flavornet were analyzed to define a “fragrance-like” (FL) property range (Heavy Atom Count ≤ 21, only C, H, O, S, (O + S) ≤ 3, Hydrogen Bond Donor ≤ 1) and the corresponding chemical space including FL molecules from PubChem (NIH repository of molecules), ChEMBL (bioactive molecules), ZINC (drug-like molecules), and GDB-13 (all possible organic molecules up to 13 atoms of C, N, O, S, Cl). The FL subsets of these databases were classified by MQN (Molecular Quantum Numbers, a set of 42 integer value descriptors of molecular structure) and formatted for fast MQN-similarity searching and interactive exploration of color-coded principal component maps in form of the FL-mapplet and FL-browser applications freely available at http://www.gdb.unibe.ch. MQN-similarity is shown to efficiently recover 15 different fragrance molecule families from the different FL subsets, demonstrating the relevance of the MQN-based tool to explore the fragrance chemical space. PMID:24876890
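
    The "fragrance-like" property range quoted above can be written as a simple filter. The Python sketch below is only an illustration of those four criteria, not the authors' tooling: the function name `is_fragrance_like` and its inputs are hypothetical, and it assumes the caller already knows each molecule's element counts and hydrogen-bond donor count (deriving them from a structure would need a cheminformatics toolkit).

```python
# Elements permitted by the fragrance-like (FL) definition in the abstract.
ALLOWED_ELEMENTS = {"C", "H", "O", "S"}


def is_fragrance_like(atom_counts, h_bond_donors):
    """Apply the FL criteria to a molecule.

    atom_counts: dict mapping element symbol -> number of atoms (H included).
    h_bond_donors: hydrogen-bond donor count, supplied by the caller.
    """
    present = {element for element, n in atom_counts.items() if n > 0}
    if not present <= ALLOWED_ELEMENTS:
        return False  # contains an element other than C, H, O, S
    heavy_atoms = sum(n for element, n in atom_counts.items() if element != "H")
    oxygen_plus_sulfur = atom_counts.get("O", 0) + atom_counts.get("S", 0)
    return heavy_atoms <= 21 and oxygen_plus_sulfur <= 3 and h_bond_donors <= 1


# Vanillin, C8H8O3, one donor (the phenolic OH).
print(is_fragrance_like({"C": 8, "H": 8, "O": 3}, h_bond_donors=1))
# Limonene, C10H16, no donors.
print(is_fragrance_like({"C": 10, "H": 16}, h_bond_donors=0))
# Caffeine, C8H10N4O2, contains nitrogen and so falls outside the element set.
print(is_fragrance_like({"C": 8, "H": 10, "N": 4, "O": 2}, h_bond_donors=0))
```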

  5. Identification of Damaging nsSNVs in Human ERCC2 Gene.

    PubMed

    Fang, Shuo; Zhang, Yuntong; Xu, Miao; Xue, Chunyu; He, Lin; Cai, Lei; Xing, Xin

    2016-09-01

    The hERCC2 gene is an important DNA repair molecule for initiating Cutaneous melanoma (CM). Therefore, it is advisable to study the possible functional SNVs in hERCC2. To achieve this goal, we collected a total of 2,253 SNVs in hERCC2 from the EMBL website, of which 303 are non-synonymous single nucleotide variants (nsSNVs). Then, SIFT and PolyPhen were used to predict the damaging nsSNVs, and four nsSNVs (rs368866996, rs377739017, rs370819591, and rs121913022) were suggested to be damaging mutations. Since I-Mutant2.0 showed a decrease in stability for the mutants containing each of the four nsSNVs, a 3D protein structure was modeled. Based on the comparison of the energy after minimization, RMSD and stabilizing residues between the native and mutant proteins' structure, rs121913022 was proposed to be the most damaging variant among the nsSNVs in the hERCC2 gene by decreasing the stability of the protein. The mutant G713R of hERCC2 protein caused by rs121913022 was found to have a lower expression level than native hERCC2 protein in melanoma cells. These results suggest that rs121913022 may have potentially important clinical and drug target implications. PMID:27085493

  6. Characterization of recombinant bacteriophages containing mosquito ribosomal RNA genes

    SciTech Connect

    Park, Y.J.

    1988-01-01

    A family of nine recombinant bacteriophages containing rRNA genes from cultured cells of the mosquito, Aedes albopictus, has been isolated by screening two different genomic DNA libraries - Charon 30 and EMBL 3 - using ³²P-labeled 18S and 28S rRNA as probes. These nine recombinant bacteriophages were characterized by restriction mapping, Southern blotting, and S1 nuclease analysis. The 18S rRNA coding region contains an evolutionarily conserved EcoRI site near the 3′-end, and measures 1800 bp. The 28S rRNA genes were divided into α and β coding regions measuring 1750 bp and 2000 bp, respectively. The gap between these two regions measures about 340 bp. No insertion sequences were found in the rRNA coding regions. The entire rDNA repeat unit had a minimum length of 15.6 kb, including a nontranscribed spacer region. The non-transcribed spacer region of cloned A. albopictus rDNA contained a common series of seven PvuI sites within a 1250 bp region upstream of the 18S rRNA coding region, and a proportion of this region also showed heterogeneity both in the length and in the restriction sites.

  7. Draft genome sequence of pathogenic bacteria Vibrio parahaemolyticus strain Ba94C2, associated with acute hepatopancreatic necrosis disease isolate from South America.

    PubMed

    Restrepo, Leda; Bayot, Bonny; Betancourt, Irma; Pinzón, Andres

    2016-09-01

    Vibrio parahaemolyticus is a pathogenic bacterium that has been associated with early mortality syndrome (EMS), also known as acute hepatopancreatic necrosis disease (AHPND), which causes high mortality in shrimp farms. Pathogenic strains contain two homologous genes related to insecticidal toxin genes, PirA and PirB; these toxin genes are located on a plasmid contained within the bacteria. Genomic sequences have allowed the finding of two strains with a divergent structure related to the geographic region from where they were found. The isolates from the geographic collection of Southeast Asia and Mexico show variable regions on the plasmid genome, indicating that even though they are not alike they still conserve the toxin genes. In this paper, we report for the first time a pathogenic V. parahaemolyticus strain in shrimp from South America that showed symptoms of AHPND. The genomic analysis revealed that this strain of V. parahaemolyticus found in South America appears to be more closely related to the Southeast Asian strains than to the Mexican strains. This finding is of major importance for the shrimp industry, especially in regard to the urgent need for disease control strategies to avoid large EMS outbreaks and economic loss, and to determine its dispersion in South America. The whole-genome shotgun project of V. parahaemolyticus strain Ba94C2 has been deposited at DDBJ/EMBL/GenBank under the accession PRJNA335761. PMID:27570736

  8. STITCH 3: zooming in on protein–chemical interactions

    PubMed Central

    Kuhn, Michael; Szklarczyk, Damian; Franceschini, Andrea; von Mering, Christian; Jensen, Lars Juhl; Bork, Peer

    2012-01-01

    To facilitate the study of interactions between proteins and chemicals, we have created STITCH, an aggregated database of interactions connecting over 300 000 chemicals and 2.6 million proteins from 1133 organisms. Compared to the previous version, the number of chemicals with interactions and the number of high-confidence interactions both increase 4-fold. The database can be accessed interactively through a web interface, displaying interactions in an integrated network view. It is also available for computational studies through downloadable files and an API. As an extension in the current version, we offer the option to switch between two levels of detail, namely whether stereoisomers of a given compound are shown as a merged entity or as separate entities. Separate display of stereoisomers is necessary, for example, for carbohydrates and chiral drugs. Combining the isomers increases the coverage, as interaction databases and publications found through text mining will often refer to compounds without specifying the stereoisomer. The database is accessible at http://stitch.embl.de/. PMID:22075997

  9. [Typing of cattle leukemia virus circulating in the Ukraine].

    PubMed

    Limanskiĭ, A P; Geue, L; Limanskaia, O Iu; Beier, D

    2004-01-01

    Bovine leucosis virus (BLV), circulating in the Ukrainian territory, was characterized through the definition of its subspecies affiliation. The pro-viral BLV DNA was isolated from peripheral-blood lymphocytes of naturally-HIV-infected black-variegate animals taken from leucosis-affected farms in the Kharkov Region. The env-gene fragment of pro-viral DNA was amplified, sequenced and analyzed after the amplicon had been treated by three restriction enzymes, i.e. BamH I, Bcl I and Pvu II. According to the analysis of restriction-fragment length polymorphism, the Ukrainian BLV isolate can be classified as belonging to the Australian subspecies, i.e. to one of the 3 known subspecies. Multiple alignment and phylogenetic analysis of the env-gene fragment of BLV isolates from the EMBL database showed that evolutionarily the Ukrainian isolate is distantly located from the isolate clusters of the Belgian, Japanese and Australian subspecies and has the largest number (4) of non-coinciding nucleotides for the analyzed highly conserved locus of the BLV env-gene, which has a length of 444 nucleotide pairs. PMID:15017853
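
    The restriction-fragment analysis mentioned above can be mimicked in silico. The Python sketch below is not the authors' procedure and does not use the BLV env sequence: the test sequence is invented, and `fragment_lengths` is a hypothetical helper. It relies on the standard recognition sites of the three enzymes (BamHI G^GATCC, BclI T^GATCA, PvuII CAG^CTG) and treats the amplicon as a linear molecule; because all three sites are palindromic, scanning one strand is sufficient.

```python
# Recognition site and cut offset (bases from the start of the site, top strand).
ENZYMES = {
    "BamHI": ("GGATCC", 1),  # G^GATCC
    "BclI": ("TGATCA", 1),   # T^GATCA
    "PvuII": ("CAGCTG", 3),  # CAG^CTG (blunt)
}


def fragment_lengths(sequence, enzyme):
    """Return fragment lengths after digesting a linear sequence with one enzyme."""
    site, offset = ENZYMES[enzyme]
    sequence = sequence.upper()
    cut_positions = []
    position = sequence.find(site)
    while position != -1:
        cut_positions.append(position + offset)
        position = sequence.find(site, position + 1)
    boundaries = [0] + cut_positions + [len(sequence)]
    return [end - start for start, end in zip(boundaries, boundaries[1:])]


# Invented 47-base amplicon with one BamHI site and one PvuII site (illustration only).
amplicon = "A" * 10 + "GGATCC" + "T" * 20 + "CAGCTG" + "C" * 5
for name in ENZYMES:
    print(name, fragment_lengths(amplicon, name))
```

    Comparing such fragment-length patterns between isolates is the idea behind the polymorphism analysis; real typing would use the sequenced env fragment and reference patterns for each subspecies, which are not given in the abstract.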

  10. Exploring new scaffolds for angiotensin II receptor antagonism.

    PubMed

    Kritsi, Eftichia; Matsoukas, Minos-Timotheos; Potamitis, Constantinos; Karageorgos, Vlasios; Detsi, Anastasia; Magafa, Vasilliki; Liapakis, George; Mavromoustakos, Thomas; Zoumpoulakis, Panagiotis

    2016-09-15

    Nowadays, AT1 receptor (AT1R) antagonists (ARBs) constitute one of the most prevalent classes of antihypertensive drugs that modulate the renin-angiotensin system (RAS). Their main uses also include treatment of diabetic nephropathy (kidney damage due to diabetes) and congestive heart failure. In this direction, our study has focused on the discovery of novel agents bearing different scaffolds which may evolve into a new class of AT1 receptor antagonists. To fulfill this aim, a combination of computational approaches and biological assays was implemented. In particular, a pharmacophore model was established and served as a 3D search query to screen the ChEMBL15 database. The reliability and accuracy of the virtual screening results were improved by using molecular docking studies. In total, 4 compounds with chemical scaffolds completely different from those of potential ARBs were picked and tested for their binding affinity to the AT1 receptor. Results revealed high nanomolar to micromolar affinity (IC50) for all the compounds. In particular, compound 4 exhibited a binding affinity of 199 nM. Molecular dynamics simulations were utilized in an effort to provide a molecular basis for their binding to AT1R in accordance with their biological activities. PMID:27480029

  11. BioSAXS Sample Changer: a robotic sample changer for rapid and reliable high-throughput X-ray solution scattering experiments

    PubMed Central

    Round, Adam; Felisaz, Franck; Fodinger, Lukas; Gobbo, Alexandre; Huet, Julien; Villard, Cyril; Blanchet, Clement E.; Pernot, Petra; McSweeney, Sean; Roessle, Manfred; Svergun, Dmitri I.; Cipriani, Florent

    2015-01-01

    Small-angle X-ray scattering (SAXS) of macromolecules in solution is in increasing demand by an ever more diverse research community, both academic and industrial. To better serve user needs, and to allow automated and high-throughput operation, a sample changer (BioSAXS Sample Changer) that is able to perform unattended measurements of up to several hundred samples per day has been developed. The Sample Changer is able to handle and expose sample volumes of down to 5 µl with a measurement/cleaning cycle of under 1 min. The samples are stored in standard 96-well plates and the data are collected in a vacuum-mounted capillary with automated positioning of the solution in the X-ray beam. Fast and efficient capillary cleaning avoids cross-contamination and ensures reproducibility of the measurements. Independent temperature control for the well storage and for the measurement capillary allows the samples to be kept cool while still collecting data at physiological temperatures. The Sample Changer has been installed at three major third-generation synchrotrons: on the BM29 beamline at the European Synchrotron Radiation Facility (ESRF), the P12 beamline at the PETRA-III synchrotron (EMBL@PETRA-III) and the I22/B21 beamlines at Diamond Light Source, with the latter being the first commercial unit supplied by Bruker ASC. PMID:25615861

  12. ATDB: a uni-database platform for animal toxins

    PubMed Central

    He, Quan-Yuan; He, Quan-Ze; Deng, Xing-Can; Yao, Lei; Meng, Er; Liu, Zhong-Hua; Liang, Song-Ping

    2008-01-01

    Venomous animals possess an arsenal of toxins for predation and defense. These toxins have great diversity in function and structure as well as evolution and therefore are of value in both basic and applied research. Recently, toxinomics research using cDNA library sequencing and proteomics profiling has revealed a large number of new toxins. Although several previous groups have attempted to manage these data, most of the resulting resources are restricted to certain taxonomic groups and/or lack effective systems for data query and access. In addition, the description of the function and the classification of toxins are rather inconsistent, resulting in a barrier against exchanging and comparing the data. Here, we report the ATDB database and website, which contain more than 3235 animal toxins from UniProtKB/Swiss-Prot and TrEMBL and related toxin databases as well as published literature. A new ontology (Toxin Ontology) was constructed to standardize the toxin annotations, which includes 745 distinct terms within four term spaces. Furthermore, more than 8423 TO terms have been manually assigned to 2132 toxins by trained biologists. Queries to the database can be conducted via a user-friendly web interface at http://protchem.hunnu.edu.cn/toxin. PMID:17933766

  13. Ribosomal genes of Histoplasma capsulatum var. duboisii and var. farciminosum.

    PubMed

    Okeke, C N; Kappe, R; Zakikhani, S; Nolte, O; Sonntag, H G

    1998-11-01

    A total of 1704 basepairs of the 18S rDNA of Histoplasma capsulatum var. duboisii (HCD, strain CBS175.57) and H. capsulatum var. farciminosum (HCF, strain CBS478.64) were sequenced (EMBL accession no. Z75306 and no. Z75307). The 18S rDNA of HCD was 100% identical to a published sequence of H. capsulatum var. capsulatum (HCC). The 18S rDNA of HCF showed one transversional point mutation at the nucleotide position 114 (ref. Saccharomyces cerevisiae). Hybridization confirmed that, in the 18S rDNA of two out of five strains of HCF, guanine was substituted for cytosine at the nucleotide position 114. Furthermore, identical group 1C1 introns (403 bp) were found to be inserted after position 1165 in four out of five strains of HCF, including the two strains with point mutations in the 18S rDNA, and a slightly different group 1C1 intron (408 bp) was detected in one strain of HCC without this point mutation. Intraspecific sequence variability in the highly conserved 18S rDNA because of occurrence of introns and mutations as a possible source of error in molecular diagnostics is discussed. In addition, internal transcribed spacer regions between the 18S rDNA and the 5.8S rDNA (ITS1) of three strains of HCF, and one strain each of HCC and HCD showed significant sequence variability between varieties and strains of H. capsulatum. PMID:9916456
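
    The "transversional point mutation" mentioned above (a cytosine/guanine exchange) can be checked with a small classifier. This is a generic sketch of the textbook definition (purine-purine or pyrimidine-pyrimidine changes are transitions, purine-pyrimidine changes are transversions); the function name is hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Classify a single-nucleotide DNA substitution."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt are identical; no substitution")
    # Both bases in the same chemical class -> transition.
    if {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES:
        return "transition"
    return "transversion"


# The exchange reported at position 114 involves cytosine and guanine.
print(classify_substitution("C", "G"))
```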

  14. The relationship between target-class and the physicochemical properties of antibacterial drugs

    PubMed Central

    Mugumbate, Grace; Overington, John P.

    2015-01-01

    The discovery of novel mechanism of action (MOA) antibacterials has been associated with the concept that antibacterial drugs occupy a differentiated region of physicochemical space compared to human-targeted drugs, with, in broad terms, antibacterials having higher molecular weight, lower log P and higher polar surface area (PSA). By analysing the physicochemical properties of about 1700 approved drugs listed in the ChEMBL database, we show that antibacterials whose targets are riboproteins (i.e., composed of a complex of RNA and protein) fall outside the conventional human ‘drug-like’ chemical space, whereas antibacterials that modulate bacterial protein targets generally comply with the ‘rule-of-five’ guidelines for classical oral human drugs. Our analysis suggests a strong target-class association for antibacterials: either protein-targeted or riboprotein-targeted. There is much discussion in the literature on the failure of screening approaches to deliver novel antibacterial lead series, and linkage of this poor success rate for antibacterials with the chemical space properties of screening collections. Our analysis suggests that consideration of target-class may be an underappreciated factor in antibacterial lead discovery, and that in fact bacterial protein targets may well have similar binding site characteristics to human protein targets; it also questions the assumption that larger, more polar compounds are a key part of successful future antibacterial discovery. PMID:25975639
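
    The ‘rule-of-five’ check referred to above can be sketched as follows. The thresholds are the commonly cited Lipinski limits (molecular weight at most 500, log P at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors); the class, the function name and the two property sets are hypothetical illustrations, not values taken from ChEMBL or from the study.

```python
from dataclasses import dataclass


@dataclass
class Properties:
    mol_weight: float      # Da
    log_p: float           # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(p):
    """Count how many of the four rule-of-five limits are exceeded."""
    checks = [
        p.mol_weight > 500,
        p.log_p > 5,
        p.h_bond_donors > 5,
        p.h_bond_acceptors > 10,
    ]
    return sum(checks)


# Illustrative, made-up property sets (not real drugs).
small_lipophilic = Properties(350.0, 2.1, 2, 5)
large_polar = Properties(780.0, -1.5, 9, 14)

print(rule_of_five_violations(small_lipophilic))
print(rule_of_five_violations(large_polar))
```

    A compound is often treated as compliant when it has at most one violation; note that PSA, mentioned in the abstract, is not part of the rule itself.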

  15. Complete VAX/VMS DNA/protein sequence analysis system

    SciTech Connect

    Smith, D.W.

    1987-05-01

    A complete yet flexible system of programs and database libraries for analysis of DNA, RNA and protein sequences is implemented for VAX/VMS computers. Types of analysis include 1) construction and analysis of chimeric sequences (cloning in the VAX), 2) multiple analysis of one or more single sequences, 3) search and comparison studies using sequence libraries, and 4) direct input and analysis of experimental data. Published groups of programs, including the Staden, Los Alamos, Zuker, Pearson, and PHYLIP programs, are used. GenBank and EMBL DNA libraries and PIR and Doolittle NEWAT protein libraries are available, with associated programs. The system is tutorial, with online documentation for relevant VAX software, the programs, and the databases. The complete documentation is flexibly maintained on reserve via computer printout placed in 3-ring binders. Command files are used extensively; porting of the entire system to another VAX/VMS system requires modification of a single command. Users of the system are members of a VAX group, with automatic implementation of the system upon login. The present system occupies about 140,000 blocks, and is easily expanded, or contracted, as desired. The UCSD system is used extensively for both teaching and research purposes. Use of microcomputers emulating Tektronix 4014 graphics terminals permits saving of graphics output to disk for subsequent modification to generate high quality publishable figures.

  16. Candida famata (Debaryomyces hansenii) DNA sequences containing genes involved in riboflavin synthesis.

    PubMed

    Voronovsky, Andriy Y; Abbas, Charles A; Dmytruk, Kostyantyn V; Ishchuk, Olena P; Kshanovska, Barbara V; Sybirna, Kateryna A; Gaillardin, Claude; Sibirny, Andriy A

    2004-11-01

    Previously cloned Candida famata (Debaryomyces hansenii) strain VKM Y-9 genomic DNA fragments containing genes RIB1 (codes for GTP cyclohydrolase II), RIB2 (encodes specific reductase), RIB5 (codes for dimethylribityllumazine synthase), RIB6 (encodes dihydroxybutanone phosphate synthase) and RIB7 (codes for riboflavin synthase) were sequenced. The derived amino acid sequences of C. famata RIB genes showed extensive homology to the corresponding sequences of riboflavin synthesis enzymes of other yeast species. The highest identity was observed to homologues of D. hansenii CBS767, as C. famata is the anamorph of this hemiascomycetous yeast. The D. hansenii CBS767 RIB3 gene encoding specific deaminase was cloned. This gene successfully complemented riboflavin auxotrophy of the rib3 mutant of flavinogenic yeast, Pichia guilliermondii. Putative iron-responsive elements (potential sites for binding of the transcription factors Fep1p or Aft1p and Aft2p) were found in the upstream regions of some C. famata and D. hansenii RIB genes. The sequences of C. famata RIB genes have been submitted to the EMBL data library under Accession Nos AJ810169-AJ810173. PMID:15543522

  17. Gramene 2016: comparative plant genomics and pathway resources.

    PubMed

    Tello-Ruiz, Marcela K; Stein, Joshua; Wei, Sharon; Preece, Justin; Olson, Andrew; Naithani, Sushma; Amarasinghe, Vindhya; Dharmawardhana, Palitha; Jiao, Yinping; Mulvaney, Joseph; Kumari, Sunita; Chougule, Kapeel; Elser, Justin; Wang, Bo; Thomason, James; Bolser, Daniel M; Kerhornou, Arnaud; Walts, Brandon; Fonseca, Nuno A; Huerta, Laura; Keays, Maria; Tang, Y Amy; Parkinson, Helen; Fabregat, Antonio; McKay, Sheldon; Weiser, Joel; D'Eustachio, Peter; Stein, Lincoln; Petryszak, Robert; Kersey, Paul J; Jaiswal, Pankaj; Ware, Doreen

    2016-01-01

    Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the database website adopted a new Drupal management platform. The genomes section features 39 fully assembled reference genomes that are integrated using ontology-based annotation and comparative analyses, and accessed through both visual and programmatic interfaces. Additional community data, such as genetic variation, expression and methylation, are also mapped for a subset of genomes. The Plant Reactome pathway portal (http://plantreactome.gramene.org) provides a reference resource for analyzing plant metabolic and regulatory pathways. In addition to ∼ 200 curated rice reference pathways, the portal hosts gene homology-based pathway projections for 33 plant species. Both the genome and pathway browsers interface with the EMBL-EBI's Expression Atlas to enable the projection of baseline and differential expression data from curated expression studies in plants. Gramene's archive website (http://archive.gramene.org) continues to provide previously reported resources on comparative maps, markers and QTL. To further aid our users, we have also introduced a live monthly educational webinar series and a Gramene YouTube channel carrying video tutorials. PMID:26553803

  18. Computational classification models for predicting the interaction of compounds with hepatic organic ion importers.

    PubMed

    You, Hwan; Lee, Kyungro; Lee, Sangwon; Hwang, Sung Bo; Kim, Kwang-Yon; Cho, Kwang-Hwi; No, Kyoung Tai

    2015-10-01

    Hepatic transporters, a major determinant of pharmacokinetics, have been used to profile drug properties like efficacy. Among hepatic transporters, importers alter the concentration of the drug by facilitating the transport of a drug into a cell. Despite vast pharmacokinetic studies, the interacting mechanisms of the importers with their substrates or inhibitors are not well understood. Hence, we developed binary classification models of whether a compound is a binder or nonbinder to a hepatic transporter with experimental data of 284 compounds for four representative hepatic importers, OATP1B1, OATP1B3, OAT2, and OCT1. Support Vector Machine (SVM) along with Genetic Algorithm (GA) was used to construct the classification models of binder versus nonbinder for each target importer. To construct the models, we prepared two data sets, a training data set from Fujitsu database (284 compounds) and an external validation data set from ChEMBL database (1738 compounds). Since an experimental classification criterion between binder and nonbinder has some ambiguity, there is an intrinsic limitation to expect high predictability of the binary classification models developed with the experimental data. The predictabilities of the classification models calculated with external validation sets were 77.72%, 84.31%, 84.21%, and 76.38% for OATP1B1, OATP1B3, OAT2, and OCT1, respectively. PMID:26293543

  19. Sequence and molecular characterization of a DNA region encoding the dibenzothiophene desulfurization operon of Rhodococcus sp. strain IGTS8.

    PubMed Central

    Piddington, C S; Kovacevich, B R; Rambosek, J

    1995-01-01

    Dibenzothiophene (DBT), a model compound for sulfur-containing organic molecules found in fossil fuels, can be desulfurized to 2-hydroxybiphenyl (2-HBP) by Rhodococcus sp. strain IGTS8. Complementation of a desulfurization (dsz) mutant provided the genes from Rhodococcus sp. strain IGTS8 responsible for desulfurization. A 6.7-kb TaqI fragment cloned in Escherichia coli-Rhodococcus shuttle vector pRR-6 was found to both complement this mutation and confer desulfurization to Rhodococcus fascians, which normally is not able to desulfurize DBT. Expression of this fragment in E. coli also conferred the ability to desulfurize DBT. A molecular analysis of the cloned fragment revealed a single operon containing three open reading frames involved in the conversion of DBT to 2-HBP. The three genes were designated dszA, dszB, and dszC. Neither the nucleotide sequences nor the deduced amino acid sequences of the enzymes exhibited significant similarity to sequences obtained from the GenBank, EMBL, and Swiss-Prot databases, indicating that these enzymes are novel enzymes. Subclone analyses revealed that the gene product of dszC converts DBT directly to DBT-sulfone and that the gene products of dszA and dszB act in concert to convert DBT-sulfone to 2-HBP. PMID:7574582

  20. Identification of a mannoprotein present in the inner layer of the cell wall of Saccharomyces cerevisiae.

    PubMed Central

    Moukadiri, I; Armero, J; Abad, A; Sentandreu, R; Zueco, J

    1997-01-01

    Cell wall extracts from the double-mutant mnn1 mnn9 strain were used as the immunogen to obtain a monoclonal antibody (MAb), SAC A6, that recognizes a specific mannoprotein--which we have named Icwp--in the walls of cells of Saccharomyces cerevisiae. Icwp runs as a polydisperse band of over 180 kDa in sodium dodecyl sulfate-polyacrylamide gel electrophoresis analysis of Zymolyase extracts of cell walls, although an analysis of the secretory pattern of the mannoprotein shows that at the level of secretory vesicles, it behaves like a discrete band of 140 kDa. Immunofluorescence analysis with the MAb showed that Icwp lies at the inner layer of the cell wall, being accessible to the antibody only after the outer layer of mannoproteins is disturbed by treatment with tunicamycin. The screening of a lambda gt11 expression library enabled us to identify the open reading frame (ORF) coding for Icwp. ICWP (EMBL accession number YLR391w, frame +3) codes for 238 amino acids, of which over 40% are serine or threonine, and contains a putative N-glycosylation site and a putative glycosylphosphatidylinositol attachment signal. Both disruption and overexpression of the ORF caused increased sensitivities to calcofluor white and Congo red, while the disruption caused an increased sensitivity to Zymolyase digestion, suggesting for Icwp a structural role in association with glucan. PMID:9079899

  1. IMGT/HLA Database—a sequence database for the human major histocompatibility complex

    PubMed Central

    Robinson, James; Waller, Matthew J.; Parham, Peter; Bodmer, Julia G.; Marsh, Steven G. E.

    2001-01-01

    The IMGT/HLA Database (www.ebi.ac.uk/imgt/hla/) specialises in sequences of polymorphic genes of the HLA system, the human major histocompatibility complex (MHC). The HLA complex is located within the 6p21.3 region on the short arm of human chromosome 6 and contains more than 220 genes of diverse function. Many of the genes encode proteins of the immune system and these include the 21 highly polymorphic HLA genes, which influence the outcome of clinical transplantation and confer susceptibility to a wide range of non-infectious diseases. The database contains sequences for all HLA alleles officially recognised by the WHO Nomenclature Committee for Factors of the HLA System and provides users with online tools and facilities for their retrieval and analysis. These include allele reports, alignment tools and detailed descriptions of the source cells. The online IMGT/HLA submission tool allows both new and confirmatory sequences to be submitted directly to the WHO Nomenclature Committee. The latest version (release 1.7.0 July 2000) contains 1220 HLA alleles derived from over 2700 component sequences from the EMBL/GenBank/DDBJ databases. The HLA database provides a model which will be extended to provide specialist databases for polymorphic MHC genes of other species. PMID:11125094

  2. The IMGT/HLA sequence database.

    PubMed

    Robinson, J; Marsh, S G

    2000-01-01

    The IMGT/HLA database (www.ebi.ac.uk/imgt/hla/) specialises in sequences of the polymorphic genes of the HLA system, the human major histocompatibility complex (MHC). This complex is located within the 6p21.3 region on the short arm of human chromosome 6 and contains more than 220 genes of diverse function. Many of the genes encode proteins of the immune system and these include the 21 highly polymorphic HLA genes, which influence the outcome of clinical transplantation and confer susceptibility to a wide range of non-infectious diseases. The database contains sequences for all HLA alleles officially recognised by the WHO Nomenclature Committee for Factors of the HLA System and provides users with online tools and facilities for their retrieval and analysis. These include allele reports, alignment tools, and detailed descriptions of the source cells. The online submission tool allows both new and confirmatory sequences to be submitted directly to the WHO Nomenclature Committee. The latest version (release 1.10.0 April 2001) contains 1329 HLA alleles, 61 HLA related sequences, derived from around 3350 component sequences from the EMBL/GenBank/DDBJ databases. The IMGT/HLA database provides a model that will be extended to provide specialist databases for polymorphic MHC genes of other species. PMID:12361093

  3. IMGT/HLA Database--a sequence database for the human major histocompatibility complex.

    PubMed

    Robinson, J; Waller, M J; Parham, P; Bodmer, J G; Marsh, S G

    2001-01-01

    The IMGT/HLA Database (www.ebi.ac.uk/imgt/hla/) specialises in sequences of polymorphic genes of the HLA system, the human major histocompatibility complex (MHC). The HLA complex is located within the 6p21.3 region on the short arm of human chromosome 6 and contains more than 220 genes of diverse function. Many of the genes encode proteins of the immune system and these include the 21 highly polymorphic HLA genes, which influence the outcome of clinical transplantation and confer susceptibility to a wide range of non-infectious diseases. The database contains sequences for all HLA alleles officially recognised by the WHO Nomenclature Committee for Factors of the HLA System and provides users with online tools and facilities for their retrieval and analysis. These include allele reports, alignment tools and detailed descriptions of the source cells. The online IMGT/HLA submission tool allows both new and confirmatory sequences to be submitted directly to the WHO Nomenclature Committee. The latest version (release 1.7.0 July 2000) contains 1220 HLA alleles derived from over 2700 component sequences from the EMBL/GenBank/DDBJ databases. The HLA database provides a model which will be extended to provide specialist databases for polymorphic MHC genes of other species. PMID:11125094

  4. SARConnect: A Tool to Interrogate the Connectivity Between Proteins, Chemical Structures and Activity Data

    PubMed Central

    Eriksson, Mats; Nilsson, Ingemar; Kogej, Thierry; Southan, Christopher; Johansson, Martin; Tyrchan, Christian; Muresan, Sorel; Blomberg, Niklas; Bjäreland, Marcus

    2012-01-01

    The access and use of large-scale structure-activity relationships (SAR) is increasing as the range of targets and availability of bioactive compound-to-protein mappings expands. However, effective exploitation requires merging and normalisation of activity data, mappings to target classifications as well as visual display of chemical structure relationships. This work describes the development of the application “SARConnect” to address these issues. We discuss options for delivery and analysis of large-scale SAR data together with a set of use-cases to illustrate the design choices and utility. The main activity sources of ChEMBL [1], GOSTAR [2] and AstraZeneca’s internal system IBIS had already been integrated in Chemistry Connect [3]. For target relationships we selected human UniProtKB/Swiss-Prot [4] as our primary source of a heuristic target classification. Similarly, to explore chemical relationships we combined several methods for framework and scaffold analysis into a unified, hierarchical classification where ease of navigation was the primary goal. An application was built on TIBCO Spotfire to retrieve data for visual display. Consequently, users can explore relationships between target, activity and structure across internal, external and commercial sources that encompass approximately 3 million compounds, 2000 human proteins and 10 million activity values. Examples showing the utility of the application are given. PMID:23308082

  5. IMGT/HLA database--a sequence database for the human major histocompatibility complex.

    PubMed

    Robinson, J; Malik, A; Parham, P; Bodmer, J G; Marsh, S G

    2000-03-01

    The IMGT/HLA Database is a specialist database for sequences of the human major histocompatibility (MHC) system. It includes all the HLA sequences officially recognised and named by the WHO Nomenclature Committee for Factors of the HLA System. The database provides users with online tools and facilities for the retrieval and analysis of these sequences. These include allele reports, alignment tools and a detailed database of all source cells. The online IMGT/HLA submission tool allows the submission of both new and confirmatory allele sequences directly to the WHO Nomenclature Committee for Factors of the HLA System. The latest version (release 1.4.1, November 1999) contains 1,015 HLA alleles from over 2,270 component sequences derived from the EMBL/GenBank/DDBJ databases. From its release in December 1998 until December 1999 the IMGT/HLA website received approximately 100,000 hits. The database currently focuses on the human major histocompatibility complex but will be used as a model system to provide specialist databases for the MHC sequences of other species. PMID:10777106

  6. easyRNASeq: a bioconductor package for processing RNA-Seq data

    PubMed Central

    Delhomme, Nicolas; Padioleau, Ismaël; Furlong, Eileen E.; Steinmetz, Lars M.

    2012-01-01

    Motivation: RNA sequencing is becoming a standard for expression profiling experiments and many tools have been developed in the past few years to analyze RNA-Seq data. Numerous ‘Bioconductor’ packages are available for next-generation sequencing data loading in R, e.g. ShortRead and Rsamtools as well as to perform differential gene expression analyses, e.g. DESeq and edgeR. However, the processing tasks lying in between these require the precise interplay of many Bioconductor packages, e.g. Biostrings, IRanges or external solutions are to be sought. Results: We developed ‘easyRNASeq’, an R package that simplifies the processing of RNA sequencing data, hiding the complex interplay of the required packages behind a single functionality. Availability: The package is implemented in R (as of version 2.15) and is available from Bioconductor (as of version 2.10) at the URL: http://bioconductor.org/packages/release/bioc/html/easyRNASeq.html, where installation and usage instructions can be found. Contact: delhomme@embl.de PMID:22847932

  7. Analyzing compound activity records and promiscuity degrees in light of publication statistics

    PubMed Central

    Hu, Ye; Bajorath, Jürgen

    2016-01-01

    For the generation of contemporary databases of bioactive compounds, activity information is usually extracted from the scientific literature. However, when activity data are analyzed, source publications are typically no longer taken into consideration. Therefore, compound activity data selected from ChEMBL were traced back to thousands of original publications, activity records including compound, assay, and target information were systematically generated, and their distributions across the literature were determined. In addition, publications were categorized on the basis of activity records. Furthermore, compound promiscuity, defined as the ability of small molecules to specifically interact with multiple target proteins, was analyzed in light of publication statistics, thus adding another layer of information to promiscuity assessment. It was shown that the degree of compound promiscuity was not influenced by increasing numbers of source publications. Rather, most non-promiscuous as well as promiscuous compounds, regardless of their degree of promiscuity, originated from single publications, which emerged as a characteristic feature of the medicinal chemistry literature. PMID:27347396
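
    The bookkeeping behind this kind of analysis can be sketched in a few lines. This is only an illustration under stated assumptions: each activity record is reduced to a (compound, target, publication) triple, the promiscuity degree is taken as the number of distinct targets per compound, and the record values and function name are hypothetical rather than taken from ChEMBL.

```python
from collections import defaultdict

# Hypothetical activity records: (compound, target, source publication).
records = [
    ("cpd_A", "T1", "pub1"),
    ("cpd_A", "T2", "pub1"),
    ("cpd_B", "T1", "pub2"),
    ("cpd_B", "T1", "pub3"),
]


def summarize(records):
    """Map each compound to (promiscuity degree, number of source publications)."""
    targets = defaultdict(set)
    publications = defaultdict(set)
    for compound, target, publication in records:
        targets[compound].add(target)
        publications[compound].add(publication)
    return {c: (len(targets[c]), len(publications[c])) for c in targets}


for compound, (degree, n_pubs) in summarize(records).items():
    print(compound, "targets:", degree, "publications:", n_pubs)
```

    In this toy set, cpd_A is promiscuous (two targets) yet comes from a single publication, which is the pattern the abstract reports as typical.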

  8. Multi-output Model with Box-Jenkins Operators of Quadratic Indices for Prediction of Malaria and Cancer Inhibitors Targeting Ubiquitin- Proteasome Pathway (UPP) Proteins.

    PubMed

    Casañola-Martin, Gerardo M; Le-Thi-Thu, Huong; Pérez-Giménez, Facundo; Marrero-Ponce, Yovani; Merino-Sanjuán, Matilde; Abad, Concepción; González-Díaz, Humberto

    2016-01-01

    The ubiquitin-proteasome pathway (UPP) is the primary degradation system of short-lived regulatory proteins. Cellular processes such as the cell cycle, signal transduction, gene expression, DNA repair and apoptosis are regulated by the UPP, and dysfunctions in this system have important implications in the development of cancer, neurodegenerative, cardiac and other human pathologies. The UPP also seems to be very important in the function of eukaryotic cells of human parasites like Plasmodium falciparum, the causal agent of the neglected disease malaria. Hence, the UPP could be considered an attractive target for the development of compounds with anti-malarial or anti-cancer properties. Recent online databases like ChEMBL contain a large quantity of information in terms of pharmacological assay protocols and compounds tested as UPP inhibitors under many different conditions. This large amount of data gives new openings for the computer-aided identification of UPP inhibitors, but the intrinsic data diversity is an obstacle for the development of successful classifiers. To solve this problem, here we used the Box-Jenkins moving average operators and the atom-based quadratic molecular indices calculated with the software TOMOCOMD-CARDD (TC) to develop a quantitative model for the prediction of the multiple outputs in this complex dataset. Our multi-target model can predict results for drugs against 22 molecular or cellular targets of different organisms with accuracies above 70% in both training and validation sets. PMID:26427384
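
    One common way to read a moving average operator in this kind of multi-condition modelling is as the deviation of a compound's descriptor from the mean of that descriptor over all compounds measured under the same assay condition. The sketch below assumes that reading; the abstract does not give the exact formula, so the definition, the function name and the numbers are all hypothetical.

```python
from collections import defaultdict


def moving_average_deviations(rows):
    """For each (descriptor, condition) row, return descriptor minus the
    mean descriptor of all rows sharing that condition (assumed definition)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value, condition in rows:
        totals[condition] += value
        counts[condition] += 1
    means = {c: totals[c] / counts[c] for c in totals}
    return [value - means[condition] for value, condition in rows]


# Made-up descriptor values under two hypothetical assay conditions.
rows = [(2.0, "assay1"), (4.0, "assay1"), (10.0, "assay2")]
print(moving_average_deviations(rows))
```

    The resulting deviation terms, rather than the raw descriptors, would then serve as inputs to a single classifier covering all conditions.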

  9. Secreted Aspartic Proteinase Family of Candida tropicalis

    PubMed Central

    Zaugg, Christophe; Borg-von Zepelin, Margarete; Reichard, Utz; Sanglard, Dominique; Monod, Michel

    2001-01-01

    Medically important yeasts of the genus Candida secrete aspartic proteinases (Saps), which are of particular interest as virulence factors. Like Candida albicans, Candida tropicalis secretes in vitro one dominant Sap (Sapt1p) in a medium containing bovine serum albumin (BSA) as the sole source of nitrogen. Using the gene SAPT1 as a probe and under low-stringency hybridization conditions, three new closely related gene sequences, SAPT2 to SAPT4, encoding secreted proteinases were cloned from a C. tropicalis λEMBL3 genomic library. All bands identified by Southern blotting of EcoRI-digested C. tropicalis genomic DNA with SAPT1 could be assigned to a specific SAP gene. Therefore, the SAPT gene family of C. tropicalis is likely to contain only four members. Interestingly, the SAPT2 and SAPT3 gene products, Sapt2p and Sapt3p, which have not yet been detected in C. tropicalis cultures in vitro, were produced as active recombinant enzymes with the methylotrophic yeast Pichia pastoris as an expression system. As expected, reverse transcriptase PCR experiments revealed a strong SAPT1 signal with RNA extracted from cells grown in BSA medium. However, a weak signal was obtained with all other SAPT genes under several conditions tested, showing that these SAPT genes could be expressed at a basic level. Together, these experiments suggest that the gene products Sapt2p, Sapt3p, and Sapt4p could be produced under conditions yet to be described in vitro or during infection. PMID:11119531

  10. MAPU 2.0: high-accuracy proteomes mapped to genomes

    PubMed Central

    Gnad, Florian; Oroshi, Mario; Birney, Ewan; Mann, Matthias

    2009-01-01

    The MAPU 2.0 database contains proteomes of organelles, tissues and cell types measured by mass spectrometry (MS)-based proteomics. In contrast to other databases it is meant to contain a limited number of experiments and only those with very high-resolution and -accuracy data. MAPU 2.0 displays the proteomes of organelles, tissues and body fluids or conversely displays the occurrence of proteins of interest in all these proteomes. The new release addresses MS-specific problems including ambiguous peptide-to-protein assignments and it provides insight into general functional features on the protein level ranging from gene ontology classification to comprehensive SwissProt annotation. Moreover, the derived proteomic data are used to annotate the genomes using Distributed Annotation Service (DAS) via EnsEMBL services. MAPU 2.0 is a model for a database specifically designed for high-accuracy proteomics and a member of the ProteomExchange Consortium. It is available on line at http://www.mapuproteome.com. PMID:18948283

  11. eggNOG 4.5: a hierarchical orthology framework with improved functional annotations for eukaryotic, prokaryotic and viral sequences.

    PubMed

    Huerta-Cepas, Jaime; Szklarczyk, Damian; Forslund, Kristoffer; Cook, Helen; Heller, Davide; Walter, Mathias C; Rattei, Thomas; Mende, Daniel R; Sunagawa, Shinichi; Kuhn, Michael; Jensen, Lars Juhl; von Mering, Christian; Bork, Peer

    2016-01-01

    eggNOG is a public resource that provides Orthologous Groups (OGs) of proteins at different taxonomic levels, each with integrated and summarized functional annotations. Developments since the latest public release include changes to the algorithm for creating OGs across taxonomic levels, making nested groups hierarchically consistent. This allows for a better propagation of functional terms across nested OGs and led to the novel annotation of 95 890 previously uncharacterized OGs, increasing overall annotation coverage from 67% to 72%. The functional annotations of OGs have been expanded to also provide Gene Ontology terms, KEGG pathways and SMART/Pfam domains for each group. Moreover, eggNOG now provides pairwise orthology relationships within OGs based on analysis of phylogenetic trees. We have also incorporated a framework for quickly mapping novel sequences to OGs based on precomputed HMM profiles. Finally, eggNOG version 4.5 incorporates a novel data set spanning 2605 viral OGs, covering 5228 proteins from 352 viral proteomes. All data are accessible for bulk downloading, as a web-service, and through a completely redesigned web interface. The new access points provide faster searches and a number of new browsing and visualization capabilities, facilitating the needs of both experts and less experienced users. eggNOG v4.5 is available at http://eggnog.embl.de. PMID:26582926

  12. Cloning of the cDNA for the human ATP synthase OSCP subunit (ATP5O) by exon trapping and mapping to chromosome 21q22.1-q22.2

    SciTech Connect

    Chen, Haiming; Morris, M.A.; Rossier, C.

    1995-08-10

    Exon trapping was used to clone portions of potential genes from human chromosome 21. One trapped sequence showed striking homology with the bovine and rat ATP synthase OSCP (oligomycin sensitivity conferring protein) subunit. We subsequently cloned the full-length human ATP synthase OSCP cDNA (GDB/HGMW approved name ATP5O) from infant brain and muscle libraries and determined its nucleotide and deduced amino acid sequence (EMBL/GenBank Accession No. X83218). The encoded polypeptide contains 213 amino acids, with more than 80% identity to bovine and murine ATPase OSCP subunits and over 35% identity to Saccharomyces cerevisiae and sweet potato sequences. The human ATP5O gene is located at 21q22.1-q22.2, just proximal to D21S17, in YACs 860G11 and 838C7 of the Chumakov et al. YAC contig. The gene is expressed in all human tissues examined, most strongly in muscle and heart. This ATP5O subunit is a key structural component of the stalk of the mitochondrial respiratory chain F₁F₀-ATP synthase and as such may contribute in a gene dosage-dependent manner to the phenotype of Down syndrome (trisomy 21). 39 refs., 5 figs.

  13. Identification of endothelial antigens relevant to transplant coronary artery disease from a human endothelial cell cDNA expression library.

    PubMed

    Ationu, A

    1998-06-01

    Accelerated transplant coronary artery disease (TxCAD) results in increased expression of antiendothelial antibodies whose target antigens remain largely unidentified. One of these endothelial antigens has been identified as vimentin, a cytoskeletal protein present in cells of the blood vessel walls. In the present study, SDS-PAGE and Western blot analysis of human endothelial cell (EAHy 926) lysates probed with sera from a TxCAD patient were used to confirm immunoreactivity of antiendothelial antibodies towards several endothelial proteins. To further elucidate the identity of these putative antigens, a human endothelial cell (EAHy 926) cDNA expression library was immunoscreened with serum obtained from a TxCAD patient. Two positive cDNA clones were identified by partial nucleotide sequence analysis and GenBank/EMBL database searches for homology as the 85 kDa human CD36 antigen (a cell surface glycoprotein expressed in various cells including epithelial and endothelial cells) and a 50 kDa keratin-like protein (a member of the intermediate filament protein family expressed in epithelial cells). These results are the first to demonstrate that human CD36 antigen and a keratin-like protein may be additional target proteins for the antiendothelial antibodies associated with TxCAD. PMID:9852639

  14. De novo transcriptome analysis of an imminent biofuel crop, Camelina sativa L. using Illumina GAIIX sequencing platform and identification of SSR markers.

    PubMed

    Mudalkar, Shalini; Golla, Ramesh; Ghatty, Sreenivas; Reddy, Attipalli Ramachandra

    2014-01-01

    Camelina sativa L. is an emerging biofuel crop with potential applications in industry, medicine, cosmetics and human nutrition. The crop is unexploited owing to very limited availability of transcriptome and genomic data. In order to analyse the various metabolic pathways, we performed de novo assembly of the transcriptome on the Illumina GAIIX platform with paired-end sequencing for obtaining short reads. The sequencing output generated a FastQ file size of 2.97 GB with 10.83 million reads having a maximum read length of 101 nucleotides. The number of contigs generated was 53,854 with maximum and minimum lengths of 10,086 and 200 nucleotides respectively. These transcripts were annotated using BLAST search against the AraCyc, Swiss-Prot, TrEMBL, gene ontology and clusters of orthologous groups (KOG) databases. The genes involved in lipid metabolism were studied and the transcription factors were identified. Sequence similarity studies of Camelina with other related organisms indicated the close relatedness of Camelina with Arabidopsis. In addition, bioinformatics analysis revealed the presence of a total of 19,379 simple sequence repeats. This is the first report on Camelina sativa L. in which the transcriptome of the entire plant, including seedlings, seed, root, leaves and stem, was analysed. Our data establish an excellent resource for gene discovery and provide useful information for functional and comparative genomic studies in this promising biofuel crop. PMID:24002439
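
    The abstract above does not say which tool or thresholds were used to find the simple sequence repeats (SSRs). As a rough illustration only, the following Python sketch shows one common way perfect SSRs can be located with regular expressions. The function name find_ssrs and the minimum repeat counts are assumptions made for this example, not the authors' settings.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, copies) for perfect tandem repeats in seq.

    min_repeats maps motif length -> minimum number of copies.
    The defaults are hypothetical thresholds chosen for illustration.
    """
    if min_repeats is None:
        min_repeats = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # A motif of motif_len bases followed by at least (min_rep - 1) copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "AA", "ATAT"),
            # so each repeat is reported once, under its shortest motif.
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return hits

# Toy sequence: an (AT)7 repeat and a (CAG)5 repeat with short flanks.
example = "GGC" + "AT" * 7 + "CCT" + "CAG" * 5 + "T"
for start, motif, copies in find_ssrs(example):
    print(start, motif, copies)
```

    Applied to each assembled contig in turn, a scan of this kind yields a per-contig list of candidate SSR loci. Dedicated SSR tools also handle compound and imperfect repeats, which this sketch ignores.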

  15. Model for high-throughput screening of drug immunotoxicity--study of the anti-microbial G1 over peritoneal macrophages using flow cytometry.

    PubMed

    Tenorio-Borroto, Esvieta; Peñuelas-Rivas, Claudia G; Vásquez-Chagoyán, Juan C; Castañedo, Nilo; Prado-Prado, Francisco J; García-Mera, Xerardo; González-Díaz, Humberto

    2014-01-24

    Quantitative Structure-Activity (mt-QSAR) techniques may become an important tool for prediction of cytotoxicity and High-throughput Screening (HTS) of drugs to rationalize the drug discovery process. In this work, we train and validate, for the first time, an mt-QSAR model using the TOPS-MODE approach to calculate drug molecular descriptors and a Linear Discriminant Analysis (LDA) function. This model correctly classifies 8258 out of 9000 (Accuracy = 91.76%) multiplexing assay endpoints of 7903 drugs (including both training and validation series). Each endpoint corresponds to one out of 1418 assays, 36 molecular and cellular targets, 46 standard type measures, in two possible organisms (human and mouse). After that, we determined experimentally, for the first time, the values of EC50 = 21.58 μg/mL and Cytotoxicity = 23.6% for the anti-microbial/anti-parasite drug G1 over Balb/C mouse peritoneal macrophages using flow cytometry. In addition, the model predicts for G1 only 7 positive endpoints out of 1251 cytotoxicity assays (0.56% probability of cytotoxicity in multiple assays). The results obtained complement the toxicological studies of this important drug. This work adds a new tool to the existing pool of few methods useful for multi-target HTS of ChEMBL and other libraries of compounds towards drug discovery. PMID:24445280
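
    The two percentages quoted in the abstract above follow directly from the counts it reports. The short Python check below reproduces that arithmetic. The helper name percent is an assumption for this example, and the counts are taken from the abstract.

```python
def percent(part, total):
    """Return part/total expressed as a percentage."""
    return 100.0 * part / total

# Counts as reported in the abstract.
accuracy = percent(8258, 9000)      # correctly classified assay endpoints
positive_rate = percent(7, 1251)    # positive cytotoxicity endpoints predicted for G1

print(f"accuracy = {accuracy:.2f}%")
print(f"positive endpoints = {positive_rate:.2f}%")
```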

  16. Crystals of DhaA mutants from Rhodococcus rhodochrous NCIMB 13064 diffracted to ultrahigh resolution: crystallization and preliminary diffraction analysis

    SciTech Connect

    Stsiapanava, Alena; Koudelakova, Tana; Pavlova, Martina; Damborsky, Jiri

    2008-02-01

    Three mutants of the haloalkane dehalogenase DhaA derived from R. rhodochrous NCIMB 13064 were crystallized and diffracted to ultrahigh resolution. The enzyme DhaA from Rhodococcus rhodochrous NCIMB 13064 belongs to the haloalkane dehalogenases, which catalyze the hydrolysis of haloalkanes to the corresponding alcohols. The haloalkane dehalogenase DhaA and its variants can be used to detoxify the industrial pollutant 1,2,3-trichloropropane (TCP). Three mutants named DhaA04, DhaA14 and DhaA15 were constructed in order to study the importance of tunnels connecting the buried active site with the surrounding solvent to the enzymatic activity. All protein mutants were crystallized using the sitting-drop vapour-diffusion method. The crystals of DhaA04 belonged to the orthorhombic space group P2₁2₁2₁, while the crystals of the other two mutants DhaA14 and DhaA15 belonged to the triclinic space group P1. Native data sets were collected for the DhaA04, DhaA14 and DhaA15 mutants at beamline X11 of EMBL, DESY, Hamburg to the high resolutions of 1.30, 0.95 and 1.15 Å, respectively.

  17. A novel expression vector, designated as pHisJM, for producing recombinant His-fusion proteins.

    PubMed

    Masuda, Junko; Takayama, Eiji; Satoh, Ayano; Kojima-Aikawa, Kyoko; Suzuki, Kimihiro; Matsumoto, Isamu

    2004-10-01

    Compared to glutathione S-transferase (GST), tagging with hexahistidine residues (His) has several merits: low levels of toxicity and immunogenicity, a smaller size and no electric charge. We have constructed a novel expression vector, designated as pHisJM (EMBL/GenBank/DDBJ accession no. AB116367), for producing recombinant His-fusion proteins. This vector was constructed by replacing the GST and multiple cloning site (MCS) cassettes in pGEX-5X-3 with the hexahistidine and MCS cassettes derived from the pRSET C vector. Human annexin IV (Anx IV) was used as the target protein. His-Anx IV fusion protein was expressed using pHisJM and, as predicted, gave a 40 kDa band when immuno-stained with anti-His mAb or anti-Anx IV mAb. To compare expression efficiency, Anx IV cDNA-inserted pHisJM or pGEX-5X-3 was transformed into Escherichia coli DH5alpha, JM109, BL21 and BL21(DE3). Using pHisJM, Anx IV protein was highly expressed in all cell strains. In addition to the merits of using a His-tag, pHisJM has several advantages: 1) it has high expression efficiency; 2) it can be used in any Escherichia coli strain; and 3) it can be used in a single strain of Escherichia coli in all steps from plasmid construction to the expression of the target gene. PMID:15604794

  18. Identification and expression profiling of low oxygen regulated genes from Citrus flavedo tissues using RT-PCR differential display.

    PubMed

    Pasentsis, Konstantinos; Falara, Vasiliki; Pateraki, Irene; Gerasopoulos, Dimitrios; Kanellis, Angelos K

    2007-01-01

    The molecular basis for the adaptation of fruit tissues to low oxygen treatments remains largely unknown. RT-PCR differential display (DD) was employed to isolate anoxic and/or hypoxic genes whose expression responded to short, low-oxygen regimes. This approach led to the isolation, cloning, successful sequencing, and bioinformatic analysis of 98 transcripts from Citrus flavedo tissues that were differentially expressed in DD gels in response to 0, 0.5, 3, and 21% O(2) for 24 h. RNA blot analysis of 25 DD clones revealed that 11 genes were induced under hypoxia and/or anoxia, 11 exhibited constitutive expression and three transcripts were suppressed by low oxygen levels. Almost half of the DD cDNAs were either of unknown function or shared no apparent homology to any expressed sequences in the GenBank/EMBL databases. Six DD genes were similar to molecules of the following functions: C-compound and carbohydrate utilization, plant development, amino acid metabolism, and biosynthesis of brassinosteroids. Time-course and stress-related experiments of low O(2)-regulated genes indicated that these genes responded differently in terms of their earliness, band intensity, and their specificity to stresses, showing that some of them can be termed hypoxia- or anoxia-induced genes. PMID:17525081

  19. Role of Chemical Reactivity and Transition State Modeling for Virtual Screening.

    PubMed

    Karthikeyan, Muthukumarasamy; Vyas, Renu; Tambe, Sanjeev S; Radhamohan, Deepthi; Kulkarni, Bhaskar D

    2015-01-01

    Every drug discovery research program involves synthesis of a novel and potential drug molecule utilizing atom-efficient, economical and environment-friendly synthetic strategies. The current work focuses on the role of the reactivity-based fingerprints of compounds as filters for virtual screening using the tool ChemScore. A reactant-like (RLS) and a product-like (PLS) score can be predicted for a given compound using the binary fingerprints derived from the numerous known organic reactions which capture the molecule-molecule interactions in the form of addition, substitution, rearrangement, elimination and isomerization reactions. The reaction fingerprints were applied to large databases in biology and chemistry, namely ChEMBL, KEGG, HMDB, DSSTox, and the DrugBank database. A large network of 1113 synthetic reactions was constructed to visualize and ascertain the reactant-product mappings in the chemical reaction space. The cumulative reaction fingerprints were computed for 4000 molecules belonging to 29 therapeutic classes of compounds, and these were found capable of discriminating between the cognition disorder related and anti-allergy compounds with a reasonable accuracy of 75% and an AUC of 0.8. In this study, the transition state based fingerprints were also developed and used effectively for virtual screening in drug related databases. The methodology presented here provides an efficient handle for the rapid scoring of molecular libraries for virtual screening. PMID:26138569

  20. MitBASE pilot: a database on nuclear genes involved in mitochondrial biogenesis and its regulation in Saccharomyces cerevisiae.

    PubMed

    de Pinto, B; Malladi, S B; Altamura, N

    1999-01-01

    In the framework of the EU BIOTECH PROGRAM and within the 'MITBASE: a comprehensive and integrated database on mtDNA' project, we have prepared a pilot database (MitBASE Pilot) on nuclear genes involved in mitochondrial biogenesis and its regulation in Saccharomyces cerevisiae. MitBASE Pilot includes nuclear genes encoding mitochondrial proteins as well as nuclear genes encoding products which are localised in other sub-cellular compartments but nevertheless interact with mitochondrial functions. Genes have been classified on the basis of the mitochondrial process in which they participate and the mitochondrial phenotype of the gene knockout. The structure of the MitBASE Pilot database has been conceived for a flexible organisation of the information. An intuitive visual query system has been developed which allows users to select information in different combinations, both in the query and the output format, according to their needs. MitBASE Pilot is a relational database, is maintained at the EMBL-European Bioinformatics Institute (EBI) and is available at the World Wide Web site http://www3.ebi.ac.uk/Research/Mitbase/mitbiog.pl PMID:9847161

  1. Update of MmtDB: a Metazoa mitochondrial DNA variants database.

    PubMed Central

    Attimonelli, M; Calò, D; De Montalvo, A; Lanave, C; Sasanelli, D; Tommaseo Ponzetta, M; Saccone, C

    1998-01-01

    The present paper describes the improvements in MmtDB, a specialised database designed to collect Metazoa mitochondrial DNA variants. Priority in the data collection has been given to Metazoa for which a large number of variants is available, e.g., for humans. Starting from the sequences available in the Nucleotide Sequence Databases, the redundant sequences have been removed and new sequences from other sources have been added. Value-added information is associated with each variant sequence, e.g., analysed region, experimental method, tissue and cell lines, population data, sex, age, family code and information about the variation events (nucleotide position, involved gene, restriction site gain or loss). Cross-references are introduced to the EMBL Data Library, as well as an internal cross-referencing among MmtDB entries according to tissue, heteroplasmic, familial and haplotypic correlation. Furthermore, MmtDB has a new section, AMmtDB: Aligned Metazoan mitochondrial biosequences. MmtDB can be accessed through the World Wide Web at URL http://WWW.ba.cnr.it/[symbol: see text]areamt08/MmtDBWWW.htm PMID:9399815

  2. Deciphering the microbiota of Tuwa hot spring, India using shotgun metagenomic sequencing approach

    PubMed Central

    Mangrola, Amitsinh; Dudhagara, Pravin; Koringa, Prakash; Joshi, C.G.; Parmar, Mansi; Patel, Rajesh

    2015-01-01

    Here, we report the metagenome of the Tuwa hot spring, India, obtained using a shotgun sequencing approach. The metagenome consisted of 541,379 sequences totalling 98.7 Mbp with 46% G + C content. Metagenomic sequence reads were deposited in the EMBL database under accession number ERP009321. Community analysis showed that 99.1% of sequences belonged to bacteria, 0.3% were of eukaryotic origin, 0.2% were virus derived and 0.05% were from archaea. Unclassified and unidentified sequences accounted for 0.4% and 0.07%, respectively. A total of 22 bacterial phyla, comprising 90 families and 201 species, were observed in the hot spring metagenome. Firmicutes (97.0%), Proteobacteria (1.3%) and Actinobacteria (0.4%) were the dominant bacterial phyla. In functional analysis using Clusters of Orthologous Groups (COG), 21.5% fell into the poorly characterized group. Using subsystem-based annotation, 4.0% of genes were assigned to stress responses and 3% to the metabolism of aromatic compounds. The hot spring metagenome is rich in novel sequences affiliated with unclassified and unidentified lineages, suggesting a potential source of novel microbial species and their products. PMID:26484204

  3. The translocation (6;9), associated with a specific subtype of acute myeloid leukemia, results in the fusion of two genes, dek and can, and the expression of a chimeric, leukemia-specific dek-can mRNA.

    PubMed Central

    von Lindern, M; Fornerod, M; van Baal, S; Jaegle, M; de Wit, T; Buijs, A; Grosveld, G

    1992-01-01

    The translocation (6;9) is associated with a specific subtype of acute myeloid leukemia (AML). Previously, it was found that breakpoints on chromosome 9 are clustered in one of the introns of a large gene named Cain (can). cDNA probes derived from the 3' part of can detect an aberrant, leukemia-specific 5.5-kb transcript in bone marrow cells from t(6;9) AML patients. cDNA cloning of this mRNA revealed that it is a fusion of sequences encoded on chromosome 6 and 3' can. A novel gene on chromosome 6 which was named dek was isolated. In dek the t(6;9) breakpoints also occur in one intron. As a result the dek-can fusion gene, present in t(6;9) AML, encodes an invariable dek-can transcript. Sequence analysis of the dek-can cDNA showed that dek and can are merged without disruption of the original open reading frames and therefore the fusion mRNA encodes a chimeric DEK-CAN protein of 165 kDa. The predicted DEK and CAN proteins have molecular masses of 43 and 220 kDa, respectively. Sequence comparison with the EMBL database failed to show consistent homology with any known protein sequences. PMID:1549122

  4. INsPeCT: INtegrative Platform for Cancer Transcriptomics.

    PubMed

    Madhamshettiwar, Piyush B; Maetschke, Stefan R; Davis, Melissa J; Reverter, Antonio; Ragan, Mark A

    2014-01-01

    The emergence of transcriptomics, fuelled by high-throughput sequencing technologies, has changed the nature of cancer research and resulted in a massive accumulation of data. Computational analysis, integration, and data visualization are now major bottlenecks in cancer biology and translational research. Although many tools have been brought to bear on these problems, their use remains unnecessarily restricted to computational biologists, as many tools require scripting skills, data infrastructure, and powerful computational facilities. New user-friendly, integrative, and automated analytical approaches are required to make computational methods more generally useful to the research community. Here we present INsPeCT (INtegrative Platform for Cancer Transcriptomics), which allows users with basic computer skills to perform comprehensive in-silico analyses of microarray, ChIP-seq, and RNA-seq data. INsPeCT supports the selection of interesting genes for advanced functional analysis. Included in its automated workflows are (i) a novel analytical framework, RMaNI (regulatory module network inference), which supports the inference of cancer subtype-specific transcriptional module networks and the analysis of modules; and (ii) WGCNA (weighted gene co-expression network analysis), which infers modules of highly correlated genes across microarray samples, associated with sample traits, e.g., survival time. INsPeCT is available free of cost from Bioinformatics Resource Australia-EMBL and can be accessed at http://inspect.braembl.org.au. PMID:24653643

  5. Analysing the outer membrane subproteome of Methylococcus capsulatus (Bath) using proteomics and novel biocomputing tools.

    PubMed

    Berven, Frode S; Karlsen, Odd André; Straume, Anne Hege; Flikka, Kristian; Murrell, J Colin; Fjellbirkeland, Anne; Lillehaug, Johan R; Eidhammer, Ingvar; Jensen, Harald B

    2006-02-01

    High-resolution two-dimensional gel electrophoresis and mass spectrometry has been used to identify the outer membrane (OM) subproteome of the Gram-negative bacterium Methylococcus capsulatus (Bath). Twenty-eight unique polypeptide sequences were identified from protein samples enriched in OMs. Only six of these polypeptides had previously been identified. The predictions from novel bioinformatic methods predicting beta-barrel outer membrane proteins (OMPs) and OM lipoproteins were compared to proteins identified experimentally. BOMP ( http://www.bioinfo.no/tools/bomp ) predicted 43 beta-barrel OMPs (1.45%) from the 2,959 annotated open reading frames. This was a lower percentage than predicted from other Gram-negative proteomes (1.8-3%). More than half of the predicted BOMPs in M. capsulatus were annotated as (conserved) hypothetical proteins with significant similarity to very few sequences in Swiss-Prot or TrEMBL. The experimental data and the computer predictions indicated that the protein composition of the M. capsulatus OM subproteome was different from that of other Gram-negative bacteria studied in a similar manner. A new program, Lipo, was developed that can analyse entire predicted proteomes and give a list of recognised lipoproteins categorised according to their lipo-box similarity to known Gram-negative lipoproteins ( http://www.bioinfo.no/tools/lipo ). This report is the first using a proteomics and bioinformatics approach to identify the OM subproteome of an obligate methanotroph. PMID:16311759

  6. MobiDB 2.0: an improved database of intrinsically disordered and mobile proteins

    PubMed Central

    Potenza, Emilio; Domenico, Tomás Di; Walsh, Ian; Tosatto, Silvio C.E.

    2015-01-01

    MobiDB (http://mobidb.bio.unipd.it/) is a database of intrinsically disordered and mobile proteins. Intrinsically disordered regions are key for the function of numerous proteins. Here we provide a new version of MobiDB, a centralized source aimed at providing the most complete picture on different flavors of disorder in protein structures covering all UniProt sequences (currently over 80 million). The database features three levels of annotation: manually curated, indirect and predicted. Manually curated data is extracted from the DisProt database. Indirect data is inferred from PDB structures that are considered an indication of intrinsic disorder. The 10 predictors currently included (three ESpritz flavors, two IUPred flavors, two DisEMBL flavors, GlobPlot, VSL2b and JRONN) enable MobiDB to provide disorder annotations for every protein in absence of more reliable data. The new version also features a consensus annotation and classification for long disordered regions. In order to complement the disorder annotations, MobiDB features additional annotations from external sources. Annotations from the UniProt database include post-translational modifications and linear motifs. Pfam annotations are displayed in graphical form and are link-enabled, allowing the user to visit the corresponding Pfam page for further information. Experimental protein–protein interactions from STRING are also classified for disorder content. PMID:25361972

  7. MetaboLights: An Open-Access Database Repository for Metabolomics Data.

    PubMed

    Kale, Namrata S; Haug, Kenneth; Conesa, Pablo; Jayseelan, Kalaivani; Moreno, Pablo; Rocca-Serra, Philippe; Nainala, Venkata Chandrasekhar; Spicer, Rachel A; Williams, Mark; Li, Xuefei; Salek, Reza M; Griffin, Julian L; Steinbeck, Christoph

    2016-01-01

    MetaboLights is the first general purpose, open-access database repository for cross-platform and cross-species metabolomics research at the European Bioinformatics Institute (EMBL-EBI). Based upon the open-source ISA framework, MetaboLights provides Metabolomics Standard Initiative (MSI) compliant metadata and raw experimental data associated with metabolomics experiments. Users can upload their study datasets into the MetaboLights Repository. These studies are then automatically assigned a stable and unique identifier (e.g., MTBLS1) that can be used for publication reference. The MetaboLights Reference Layer associates metabolites with metabolomics studies in the archive and is extensively annotated with data fields such as structural and chemical information, NMR and MS spectra, target species, metabolic pathways, and reactions. The database is manually curated with no specific release schedules. MetaboLights is also recommended by journals for metabolomics data deposition. This unit provides a guide to using MetaboLights, downloading experimental data, and depositing metabolomics datasets using user-friendly submission tools. PMID:27010336

  8. Determination of internal transcribed spacer regions (ITS) in Trichomonas vaginalis isolates and differentiation among Trichomonas species.

    PubMed

    Ibáñez-Escribano, Alexandra; Nogal-Ruiz, Juan José; Arán, Vicente J; Escario, José Antonio; Gómez-Barrio, Alicia; Alderete, J F

    2014-04-01

    The nucleotide sequences of the 5.8S rRNA gene and the flanking internal transcribed spacer (ITS) regions of six Trichomonas vaginalis isolates with different metronidazole sensitivity and geographic origin were genotyped. A multiple sequence alignment was performed with different sequences of other isolates available in the GenBank/EMBL/DDBJ databases, which revealed 5 different sequence patterns. Although a stable mutation in position 66 of the ITS1 (C66T) was observed in 26% (9/34) of the T. vaginalis sequences analyzed, there was 99.7% ITS nucleotide sequence identity among isolates for this sequence. The nucleotide sequence variation among other species of the genus Trichomonas ranged from 3.4% to 9.1%. Surprisingly, the % identity between T. vaginalis and Pentatrichomonas hominis was ~83%. There was >40% divergence in the ITS sequence between T. vaginalis and Tritrichomonas spp., including Tritrichomonas augusta, Tritrichomonas muris, and Tritrichomonas nonconforma, and with Tetratrichomonas prowazeki. Dendrograms grouped the trichomonadid sequences in robust clades according to their genera. The absence of nucleotide divergence in the hypervariable ITS regions between T. vaginalis isolates suggests the early divergence of the parasite. Importantly, these data show that this ITS1-5.8S rRNA-ITS2 region is suitable for inter-species differentiation. PMID:24412628
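
    The abstract above reports percent identity values between ITS sequences but does not describe how they were calculated. As a minimal sketch, assuming two sequences that are already aligned to equal length with "-" marking gaps, percent identity might be computed as below. Excluding gapped columns from the denominator is one convention among several and is an assumption here, as is the function name percent_identity.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over ungapped columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        # Assumption: columns with a gap in either sequence are not counted.
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

# Toy aligned pair with a single substitution in ten columns.
print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))
```

    Percent divergence under the same convention is simply 100 minus this value, so the choice of how to treat gaps affects both figures.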

  9. Extraction of transcript diversity from scientific literature.

    PubMed

    Shah, Parantu K; Jensen, Lars J; Boué, Stéphanie; Bork, Peer

    2005-06-01

    Transcript diversity generated by alternative splicing and associated mechanisms contributes heavily to the functional complexity of biological systems. The numerous examples of the mechanisms and functional implications of these events are scattered throughout the scientific literature. Thus, it is crucial to have a tool that can automatically extract the relevant facts and collect them in a knowledge base that can aid the interpretation of data from high-throughput methods. We have developed and applied a composite text-mining method for extracting information on transcript diversity from the entire MEDLINE database in order to create a database of genes with alternative transcripts. It contains information on tissue specificity, number of isoforms, causative mechanisms, functional implications, and experimental methods used for detection. We have mined this resource to identify 959 instances of tissue-specific splicing. Our results in combination with those from EST-based methods suggest that alternative splicing is the preferred mechanism for generating transcript diversity in the nervous system. We provide new annotations for 1,860 genes with the potential for generating transcript diversity. We assign the MeSH term "alternative splicing" to 1,536 additional abstracts in the MEDLINE database and suggest new MeSH terms for other events. We have successfully extracted information about transcript diversity and semiautomatically generated a database, LSAT, that can provide a quantitative understanding of the mechanisms behind tissue-specific gene expression. LSAT (Literature Support for Alternative Transcripts) is publicly available at http://www.bork.embl.de/LSAT/. PMID:16103899

  10. ENZYMAP: Exploiting Protein Annotation for Modeling and Predicting EC Number Changes in UniProt/Swiss-Prot

    PubMed Central

    Silveira, Sabrina de Azevedo; de Melo-Minardi, Raquel Cardoso; da Silveira, Carlos Henrique; Santoro, Marcelo Matos; Meira Jr, Wagner

    2014-01-01

    The volume and diversity of biological data are increasing at very high rates. Vast amounts of protein sequences and structures, protein and genetic interactions and phenotype studies have been produced. The majority of data generated by high-throughput devices is automatically annotated because manually annotating them is not possible. Thus, efficient and precise automatic annotation methods are required to ensure the quality and reliability of both the biological data and associated annotations. We proposed ENZYMatic Annotation Predictor (ENZYMAP), a technique to characterize and predict EC number changes based on annotations from UniProt/Swiss-Prot using a supervised learning approach. We evaluated ENZYMAP experimentally, using test data sets from both UniProt/Swiss-Prot and UniProt/TrEMBL, and showed that predicting EC changes using selected types of annotation is possible. Finally, we compared ENZYMAP and DETECT with respect to their predictions and checked both against the UniProt/Swiss-Prot annotations. ENZYMAP was shown to be more accurate than DETECT, coming closer to the actual changes in UniProt/Swiss-Prot. Our proposal is intended to be an automatic complementary method (that can be used together with other techniques like the ones based on protein sequence and structure) that helps to improve the quality and reliability of enzyme annotations over time, suggesting possible corrections, anticipating annotation changes and propagating the implicit knowledge for the whole dataset. PMID:24586563

  11. The MIntAct project--IntAct as a common curation platform for 11 molecular interaction databases.

    PubMed

    Orchard, Sandra; Ammari, Mais; Aranda, Bruno; Breuza, Lionel; Briganti, Leonardo; Broackes-Carter, Fiona; Campbell, Nancy H; Chavali, Gayatri; Chen, Carol; del-Toro, Noemi; Duesbury, Margaret; Dumousseau, Marine; Galeota, Eugenia; Hinz, Ursula; Iannuccelli, Marta; Jagannathan, Sruthi; Jimenez, Rafael; Khadake, Jyoti; Lagreid, Astrid; Licata, Luana; Lovering, Ruth C; Meldal, Birgit; Melidoni, Anna N; Milagros, Mila; Peluso, Daniele; Perfetto, Livia; Porras, Pablo; Raghunath, Arathi; Ricard-Blum, Sylvie; Roechert, Bernd; Stutz, Andre; Tognolli, Michael; van Roey, Kim; Cesareni, Gianni; Hermjakob, Henning

    2014-01-01

    IntAct (freely available at http://www.ebi.ac.uk/intact) is an open-source, open data molecular interaction database populated by data either curated from the literature or from direct data depositions. IntAct has developed a sophisticated web-based curation tool, capable of supporting both IMEx- and MIMIx-level curation. This tool is now utilized by multiple additional curation teams, all of whom annotate data directly into the IntAct database. Members of the IntAct team supply appropriate levels of training, perform quality control on entries and take responsibility for long-term data maintenance. Recently, the MINT and IntAct databases decided to merge their separate efforts to make optimal use of limited developer resources and maximize the curation output. All data manually curated by the MINT curators have been moved into the IntAct database at EMBL-EBI and are merged with the existing IntAct dataset. Both IntAct and MINT are active contributors to the IMEx consortium (http://www.imexconsortium.org). PMID:24234451

  12. The Mechanism Research of Qishen Yiqi Formula by Module-Network Analysis

    PubMed Central

    Zheng, Shichao; Zhang, Yanling; Qiao, Yanjiang

    2015-01-01

    Qishen Yiqi formula (QSYQ) has the effect of tonifying Qi and promoting blood circulation, and is widely used to treat cardiovascular diseases with Qi deficiency and blood stasis syndrome. However, the mechanism by which QSYQ tonifies Qi and promotes blood circulation is rarely reported at the molecular or systems level. This study aimed to elucidate the mechanism of QSYQ based on protein interaction network (PIN) analysis. Target information for the active components was obtained from the ChEMBL and STITCH databases and was further used to search for protein-protein interactions in the String database. Next, the PINs of QSYQ were constructed with Cytoscape and were analyzed by gene ontology enrichment analysis based on the Markov Cluster algorithm. Finally, based on the topological parameters, the scale-free, small-world, and modularity properties of the QSYQ PINs were analyzed, and based on function modules, the mechanism of QSYQ was elucidated. The results indicated that the Qi-tonifying efficacy of QSYQ may be partly attributed to the regulation of amino acid metabolism, carbohydrate metabolism, lipid metabolism, and cAMP metabolism, while QSYQ improves blood stasis through the regulation of blood coagulation and cardiac muscle contraction. Meanwhile, the “synergy” of formula compatibility was also illuminated. PMID:26379745

  13. Isolation of nine gene sequences induced by silica in murine macrophages

    SciTech Connect

    Segade, F.; Claudio, E.; Wrobel, K.; Ramos, S.; Lazo, P.S.

    1995-03-01

    Macrophage activation by silica is the initial step in the development of silicosis. To identify genes that might be involved in silica-mediated activation, RAW 264.7 mouse macrophages were treated with silica for 48 h, and a subtracted cDNA library enriched for silica-induced genes (SIG) was constructed and differentially screened. Nine cDNA clones (designated SIG-12, -14, -20, -41, -61, -81, -91, and -111) were partially sequenced and compared with sequences in GenBank/EMBL databases. SIG-12, -14, and -20 corresponded to the genes for ribosomal proteins L13A, L32, and L26, respectively. SIG-61 is the mouse homologue of p21 RhoC. SIG-91 is identical to the 67-kDa high-affinity laminin receptor. Four genes were not identified and are novel. All of the mRNAs corresponding to the nine cloned cDNAs were inducible by silica. Steady-state levels of mRNAs in RAW 264.7 cells treated with various macrophage activators and inducers of signal transduction pathways were determined. A complex pattern of induction and repression was found, indicating that upon phagocytosis of silica particles, many regulatory mechanisms of gene expression are simultaneously triggered. 55 refs., 4 figs., 1 tab.

  14. UniProt-DAAC: domain architecture alignment and classification, a new method for automatic functional annotation in UniProtKB

    PubMed Central

    Doğan, Tunca; MacDougall, Alistair; Saidi, Rabie; Poggioli, Diego; Bateman, Alex; O’Donovan, Claire; Martin, Maria J.

    2016-01-01

    Motivation: Similarity-based methods have been widely used in order to infer the properties of genes and gene products containing little or no experimental annotation. New approaches that overcome the limitations of methods that rely solely upon sequence similarity are attracting increased attention. One of these novel approaches is to use the organization of the structural domains in proteins. Results: We propose a method for the automatic annotation of protein sequences in the UniProt Knowledgebase (UniProtKB) by comparing their domain architectures, classifying proteins based on the similarities and propagating functional annotation. The performance of this method was measured through a cross-validation analysis using the Gene Ontology (GO) annotation of a sub-set of UniProtKB/Swiss-Prot. The results demonstrate the effectiveness of this approach in detecting functional similarity, with an average F-score of 0.85. Applying the method to nearly 55.3 million uncharacterized proteins in UniProtKB/TrEMBL resulted in 44 818 178 GO term predictions for 12 172 114 proteins. 22% of these predictions were for 2 812 016 previously non-annotated protein entries, indicating the significance of the value added by this approach. Availability and implementation: The results of the method are available at: ftp://ftp.ebi.ac.uk/pub/contrib/martin/DAAC/. Contact: tdogan@ebi.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153729
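    The F-score quoted in the abstract above combines precision and recall into a single number. The abstract does not say which variant was used or how scores were averaged, so the following is only a minimal sketch assuming the common F1 form (beta = 1) computed from raw counts; the function name `f_score` and its parameters are hypothetical and are not taken from the UniProt-DAAC software.

```python
def f_score(true_pos: int, false_pos: int, false_neg: int, beta: float = 1.0) -> float:
    """Return the F-beta score from prediction counts.

    Assumption: beta = 1 (harmonic mean of precision and recall) unless
    stated otherwise. Returns 0.0 when precision or recall is undefined.
    """
    predicted = true_pos + false_pos   # everything the method predicted
    actual = true_pos + false_neg      # everything that is truly annotated
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_pos / predicted
    recall = true_pos / actual
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Illustrative counts only; these are not data from the paper.
    print(round(f_score(85, 15, 15), 2))
```

    With equal precision and recall the score equals both of them, so the illustrative counts above (precision 0.85, recall 0.85) give 0.85.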

  15. Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees.

    PubMed

    Letunic, Ivica; Bork, Peer

    2016-07-01

    Interactive Tree Of Life (http://itol.embl.de) is a web-based tool for the display, manipulation and annotation of phylogenetic trees. It is freely available and open to everyone. The current version was completely redesigned and rewritten, utilizing current web technologies for speedy and streamlined processing. Numerous new features were introduced and several new data types are now supported. Trees with up to 100,000 leaves can now be efficiently displayed. Full interactive control over precise positioning of various annotation features and an unlimited number of datasets allow the easy creation of complex tree visualizations. iTOL 3 is the first tool which supports direct visualization of the recently proposed phylogenetic placements format. Finally, iTOL's account system has been redesigned to simplify the management of trees in user-defined workspaces and projects, as it is heavily used and currently handles already more than 500,000 trees from more than 10,000 individual users. PMID:27095192

  16. Mapping of the gene encoding the melanocortin-1 (α-melanocyte stimulating hormone) receptor (MC1R) to human chromosome 16q24.3 by fluorescence in situ hybridization

    SciTech Connect

    Gantz, I.; Yamada, Tadataka; Tashiro, Takao; Konda, Yoshitaka; Shimoto, Yoshimasa; Miwa, Hiroto; Trent, J.M.

    1994-01-15

    α-Melanocyte stimulating hormone (α-MSH), a hormone originally named for its ability to regulate pigmentation of melanocytes, is a 13-amino-acid post-translational product of the pro-opiomelanocortin (POMC) gene. α-MSH and the other products of POMC processing, which share the core heptapeptide amino acid sequence Met-Glu (Gly)-His-Phe-Arg-Trp-Gly (Asp), the adrenocorticotropic hormone (ACTH), β-MSH, and γ-MSH, are collectively referred to as melanocortins. While best known for their effects on the melanocyte (pigmentation) and adrenal cortical cells (steroidogenesis), melanocortins have been postulated to function in diverse activities, including enhancement of learning and memory, control of the cardiovascular system, analgesia, thermoregulation, immunomodulation, parturition, and neurotrophism. To identify the chromosomal band encoding the human melanocortin-1 receptor gene, 1 μg of an EMBL clone coding region of the human MC1R and approximately 15 kb of surrounding DNA was labeled with biotin and hybridized to human metaphase chromosomes as previously described. The results indicate that the human MC1R gene is localized to 16q24.3. 15 refs., 1 fig.

  17. InterPro, progress and status in 2005.

    PubMed

    Mulder, Nicola J; Apweiler, Rolf; Attwood, Teresa K; Bairoch, Amos; Bateman, Alex; Binns, David; Bradley, Paul; Bork, Peer; Bucher, Phillip; Cerutti, Lorenzo; Copley, Richard; Courcelle, Emmanuel; Das, Ujjwal; Durbin, Richard; Fleischmann, Wolfgang; Gough, Julian; Haft, Daniel; Harte, Nicola; Hulo, Nicolas; Kahn, Daniel; Kanapin, Alexander; Krestyaninova, Maria; Lonsdale, David; Lopez, Rodrigo; Letunic, Ivica; Madera, Martin; Maslen, John; McDowall, Jennifer; Mitchell, Alex; Nikolskaya, Anastasia N; Orchard, Sandra; Pagni, Marco; Ponting, Chris P; Quevillon, Emmanuel; Selengut, Jeremy; Sigrist, Christian J A; Silventoinen, Ville; Studholme, David J; Vaughan, Robert; Wu, Cathy H

    2005-01-01

    InterPro, an integrated documentation resource of protein families, domains and functional sites, was created to integrate the major protein signature databases. Currently, it includes PROSITE, Pfam, PRINTS, ProDom, SMART, TIGRFAMs, PIRSF and SUPERFAMILY. Signatures are manually integrated into InterPro entries that are curated to provide biological and functional information. Annotation is provided in an abstract, Gene Ontology mapping and links to specialized databases. New features of InterPro include extended protein match views, taxonomic range information and protein 3D structure data. One of the new match views is the InterPro Domain Architecture view, which shows the domain composition of protein matches. Two new entry types were introduced to better describe InterPro entries: these are active site and binding site. PIRSF and the structure-based SUPERFAMILY are the latest member databases to join InterPro, and CATH and PANTHER are soon to be integrated. InterPro release 8.0 contains 11 007 entries, representing 2573 domains, 8166 families, 201 repeats, 26 active sites, 21 binding sites and 20 post-translational modification sites. InterPro covers over 78% of all proteins in the Swiss-Prot and TrEMBL components of UniProt. The database is available for text- and sequence-based searches via a webserver (http://www.ebi.ac.uk/interpro), and for download by anonymous FTP (ftp://ftp.ebi.ac.uk/pub/databases/interpro). PMID:15608177

  18. The InterPro Database, 2003 brings increased coverage and new features.

    PubMed

    Mulder, Nicola J; Apweiler, Rolf; Attwood, Teresa K; Bairoch, Amos; Barrell, Daniel; Bateman, Alex; Binns, David; Biswas, Margaret; Bradley, Paul; Bork, Peer; Bucher, Phillip; Copley, Richard R; Courcelle, Emmanuel; Das, Ujjwal; Durbin, Richard; Falquet, Laurent; Fleischmann, Wolfgang; Griffiths-Jones, Sam; Haft, Daniel; Harte, Nicola; Hulo, Nicolas; Kahn, Daniel; Kanapin, Alexander; Krestyaninova, Maria; Lopez, Rodrigo; Letunic, Ivica; Lonsdale, David; Silventoinen, Ville; Orchard, Sandra E; Pagni, Marco; Peyruc, David; Ponting, Chris P; Selengut, Jeremy D; Servant, Florence; Sigrist, Christian J A; Vaughan, Robert; Zdobnov, Evgueni M

    2003-01-01

    InterPro, an integrated documentation resource of protein families, domains and functional sites, was created in 1999 as a means of amalgamating the major protein signature databases into one comprehensive resource. PROSITE, Pfam, PRINTS, ProDom, SMART and TIGRFAMs have been manually integrated and curated and are available in InterPro for text- and sequence-based searching. The results are provided in a single format that rationalises the results that would be obtained by searching the member databases individually. The latest release of InterPro contains 5629 entries describing 4280 families, 1239 domains, 95 repeats and 15 post-translational modifications. Currently, the combined signatures in InterPro cover more than 74% of all proteins in SWISS-PROT and TrEMBL, an increase of nearly 15% since the inception of InterPro. New features of the database include improved searching capabilities and enhanced graphical user interfaces for visualisation of the data. The database is available via a webserver (http://www.ebi.ac.uk/interpro) and anonymous FTP (ftp://ftp.ebi.ac.uk/pub/databases/interpro). PMID:12520011

  19. InterPro, progress and status in 2005

    PubMed Central

    Mulder, Nicola J.; Apweiler, Rolf; Attwood, Teresa K.; Bairoch, Amos; Bateman, Alex; Binns, David; Bradley, Paul; Bork, Peer; Bucher, Phillip; Cerutti, Lorenzo; Copley, Richard; Courcelle, Emmanuel; Das, Ujjwal; Durbin, Richard; Fleischmann, Wolfgang; Gough, Julian; Haft, Daniel; Harte, Nicola; Hulo, Nicolas; Kahn, Daniel; Kanapin, Alexander; Krestyaninova, Maria; Lonsdale, David; Lopez, Rodrigo; Letunic, Ivica; Madera, Martin; Maslen, John; McDowall, Jennifer; Mitchell, Alex; Nikolskaya, Anastasia N.; Orchard, Sandra; Pagni, Marco; Ponting, Chris P.; Quevillon, Emmanuel; Selengut, Jeremy; Sigrist, Christian J. A.; Silventoinen, Ville; Studholme, David J.; Vaughan, Robert; Wu, Cathy H.

    2005-01-01

    InterPro, an integrated documentation resource of protein families, domains and functional sites, was created to integrate the major protein signature databases. Currently, it includes PROSITE, Pfam, PRINTS, ProDom, SMART, TIGRFAMs, PIRSF and SUPERFAMILY. Signatures are manually integrated into InterPro entries that are curated to provide biological and functional information. Annotation is provided in an abstract, Gene Ontology mapping and links to specialized databases. New features of InterPro include extended protein match views, taxonomic range information and protein 3D structure data. One of the new match views is the InterPro Domain Architecture view, which shows the domain composition of protein matches. Two new entry types were introduced to better describe InterPro entries: these are active site and binding site. PIRSF and the structure-based SUPERFAMILY are the latest member databases to join InterPro, and CATH and PANTHER are soon to be integrated. InterPro release 8.0 contains 11 007 entries, representing 2573 domains, 8166 families, 201 repeats, 26 active sites, 21 binding sites and 20 post-translational modification sites. InterPro covers over 78% of all proteins in the Swiss-Prot and TrEMBL components of UniProt. The database is available for text- and sequence-based searches via a webserver (http://www.ebi.ac.uk/interpro), and for download by anonymous FTP (ftp://ftp.ebi.ac.uk/pub/databases/interpro). PMID:15608177

  20. The Universal Protein Resource (UniProt): an expanding universe of protein information.

    PubMed

    Wu, Cathy H; Apweiler, Rolf; Bairoch, Amos; Natale, Darren A; Barker, Winona C; Boeckmann, Brigitte; Ferro, Serenella; Gasteiger, Elisabeth; Huang, Hongzhan; Lopez, Rodrigo; Magrane, Michele; Martin, Maria J; Mazumder, Raja; O'Donovan, Claire; Redaschi, Nicole; Suzek, Baris

    2006-01-01

    The Universal Protein Resource (UniProt) provides a central resource on protein sequences and functional annotation with three database components, each addressing a key need in protein bioinformatics. The UniProt Knowledgebase (UniProtKB), comprising the manually annotated UniProtKB/Swiss-Prot section and the automatically annotated UniProtKB/TrEMBL section, is the preeminent storehouse of protein annotation. The extensive cross-references, functional and feature annotations and literature-based evidence attribution enable scientists to analyse proteins and query across databases. The UniProt Reference Clusters (UniRef) speed similarity searches via sequence space compression by merging sequences that are 100% (UniRef100), 90% (UniRef90) or 50% (UniRef50) identical. Finally, the UniProt Archive (UniParc) stores all publicly available protein sequences, containing the history of sequence data with links to the source databases. UniProt databases continue to grow in size and in availability of information. Recent and upcoming changes to database contents, formats, controlled vocabularies and services are described. New download availability includes all major releases of UniProtKB, sequence collections by taxonomic division and complete proteomes. A bibliography mapping service has been added, and an ID mapping service will be available soon. UniProt databases can be accessed online at http://www.uniprot.org or downloaded at ftp://ftp.uniprot.org/pub/databases/. PMID:16381842

  1. ExPASy: The proteomics server for in-depth protein knowledge and analysis.

    PubMed

    Gasteiger, Elisabeth; Gattiker, Alexandre; Hoogland, Christine; Ivanyi, Ivan; Appel, Ron D; Bairoch, Amos

    2003-07-01

    The ExPASy (the Expert Protein Analysis System) World Wide Web server (http://www.expasy.org), is provided as a service to the life science community by a multidisciplinary team at the Swiss Institute of Bioinformatics (SIB). It provides access to a variety of databases and analytical tools dedicated to proteins and proteomics. ExPASy databases include SWISS-PROT and TrEMBL, SWISS-2DPAGE, PROSITE, ENZYME and the SWISS-MODEL repository. Analysis tools are available for specific tasks relevant to proteomics, similarity searches, pattern and profile searches, post-translational modification prediction, topology prediction, primary, secondary and tertiary structure analysis and sequence alignment. These databases and tools are tightly interlinked: a special emphasis is placed on integration of database entries with related resources developed at the SIB and elsewhere, and the proteomics tools have been designed to read the annotations in SWISS-PROT in order to enhance their predictions. ExPASy started to operate in 1993, as the first WWW server in the field of life sciences. In addition to the main site in Switzerland, seven mirror sites in different continents currently serve the user community. PMID:12824418

  2. The InterPro Database, 2003 brings increased coverage and new features

    PubMed Central

    Mulder, Nicola J.; Apweiler, Rolf; Attwood, Teresa K.; Bairoch, Amos; Barrell, Daniel; Bateman, Alex; Binns, David; Biswas, Margaret; Bradley, Paul; Bork, Peer; Bucher, Phillip; Copley, Richard R.; Courcelle, Emmanuel; Das, Ujjwal; Durbin, Richard; Falquet, Laurent; Fleischmann, Wolfgang; Griffiths-Jones, Sam; Haft, Daniel; Harte, Nicola; Hulo, Nicolas; Kahn, Daniel; Kanapin, Alexander; Krestyaninova, Maria; Lopez, Rodrigo; Letunic, Ivica; Lonsdale, David; Silventoinen, Ville; Orchard, Sandra E.; Pagni, Marco; Peyruc, David; Ponting, Chris P.; Selengut, Jeremy D.; Servant, Florence; Sigrist, Christian J. A.; Vaughan, Robert; Zdobnov, Evgueni M.

    2003-01-01

    InterPro, an integrated documentation resource of protein families, domains and functional sites, was created in 1999 as a means of amalgamating the major protein signature databases into one comprehensive resource. PROSITE, Pfam, PRINTS, ProDom, SMART and TIGRFAMs have been manually integrated and curated and are available in InterPro for text- and sequence-based searching. The results are provided in a single format that rationalises the results that would be obtained by searching the member databases individually. The latest release of InterPro contains 5629 entries describing 4280 families, 1239 domains, 95 repeats and 15 post-translational modifications. Currently, the combined signatures in InterPro cover more than 74% of all proteins in SWISS-PROT and TrEMBL, an increase of nearly 15% since the inception of InterPro. New features of the database include improved searching capabilities and enhanced graphical user interfaces for visualisation of the data. The database is available via a webserver (http://www.ebi.ac.uk/interpro) and anonymous FTP (ftp://ftp.ebi.ac.uk/pub/databases/interpro). PMID:12520011

  3. The Universal Protein Resource (UniProt): an expanding universe of protein information

    PubMed Central

    Wu, Cathy H.; Apweiler, Rolf; Bairoch, Amos; Natale, Darren A.; Barker, Winona C.; Boeckmann, Brigitte; Ferro, Serenella; Gasteiger, Elisabeth; Huang, Hongzhan; Lopez, Rodrigo; Magrane, Michele; Martin, Maria J.; Mazumder, Raja; O'Donovan, Claire; Redaschi, Nicole; Suzek, Baris

    2006-01-01

    The Universal Protein Resource (UniProt) provides a central resource on protein sequences and functional annotation with three database components, each addressing a key need in protein bioinformatics. The UniProt Knowledgebase (UniProtKB), comprising the manually annotated UniProtKB/Swiss-Prot section and the automatically annotated UniProtKB/TrEMBL section, is the preeminent storehouse of protein annotation. The extensive cross-references, functional and feature annotations and literature-based evidence attribution enable scientists to analyse proteins and query across databases. The UniProt Reference Clusters (UniRef) speed similarity searches via sequence space compression by merging sequences that are 100% (UniRef100), 90% (UniRef90) or 50% (UniRef50) identical. Finally, the UniProt Archive (UniParc) stores all publicly available protein sequences, containing the history of sequence data with links to the source databases. UniProt databases continue to grow in size and in availability of information. Recent and upcoming changes to database contents, formats, controlled vocabularies and services are described. New download availability includes all major releases of UniProtKB, sequence collections by taxonomic division and complete proteomes. A bibliography mapping service has been added, and an ID mapping service will be available soon. UniProt databases can be accessed online at http://www.uniprot.org or downloaded at ftp://ftp.uniprot.org/pub/databases/. PMID:16381842

  4. ExPASy: the proteomics server for in-depth protein knowledge and analysis

    PubMed Central

    Gasteiger, Elisabeth; Gattiker, Alexandre; Hoogland, Christine; Ivanyi, Ivan; Appel, Ron D.; Bairoch, Amos

    2003-01-01

    The ExPASy (the Expert Protein Analysis System) World Wide Web server (http://www.expasy.org), is provided as a service to the life science community by a multidisciplinary team at the Swiss Institute of Bioinformatics (SIB). It provides access to a variety of databases and analytical tools dedicated to proteins and proteomics. ExPASy databases include SWISS-PROT and TrEMBL, SWISS-2DPAGE, PROSITE, ENZYME and the SWISS-MODEL repository. Analysis tools are available for specific tasks relevant to proteomics, similarity searches, pattern and profile searches, post-translational modification prediction, topology prediction, primary, secondary and tertiary structure analysis and sequence alignment. These databases and tools are tightly interlinked: a special emphasis is placed on integration of database entries with related resources developed at the SIB and elsewhere, and the proteomics tools have been designed to read the annotations in SWISS-PROT in order to enhance their predictions. ExPASy started to operate in 1993, as the first WWW server in the field of life sciences. In addition to the main site in Switzerland, seven mirror sites in different continents currently serve the user community. PMID:12824418

  5. Variable copy number DNA sequences in rice.

    PubMed

    Kikuchi, S; Takaiwa, F; Oono, K

    1987-12-01

    We have cloned two types of variable copy number DNA sequences from the rice embryo genome. One of these sequences, which was cloned in pRB301, was amplified about 50-fold during callus formation and diminished in copy number to the embryonic level during regeneration. The other clone, named pRB401, showed the reciprocal pattern. The copy numbers of both sequences were changed even in the early developmental stage and eliminated from nuclear DNA along with growth of the plant. Sequencing analysis of the pRB301 insert revealed some open reading frames and direct repeat structures, but corresponding sequences were not identified in the EMBL and LASL DNA databases. Sequencing of the nuclear genomic fragment cloned in pRB401 revealed the presence of the 3'rps12-rps7 region of rice chloroplast DNA. Our observations suggest that during callus formation (dedifferentiation), regeneration and the growth process the copy numbers of some DNA sequences are variable and that nuclear integrated chloroplast DNA acts as a variable copy number sequence in the rice genome. Based on data showing a common sequence in mitochondria and chloroplast DNA of maize (Stern and Lonsdale 1982) and that the rps12 gene of tobacco chloroplast DNA is a divided gene (Torazawa et al. 1986), it is suggested that the sequence on the inverted repeat structure of chloroplast DNA may have the character of a movable genetic element. PMID:3481021

  6. RSAT: regulatory sequence analysis tools.

    PubMed

    Thomas-Chollier, Morgane; Sand, Olivier; Turatsinze, Jean-Valéry; Janky, Rekin's; Defrance, Matthieu; Vervisch, Eric; Brohée, Sylvain; van Helden, Jacques

    2008-07-01

    The regulatory sequence analysis tools (RSAT, http://rsat.ulb.ac.be/rsat/) is a software suite that integrates a wide collection of modular tools for the detection of cis-regulatory elements in genome sequences. The suite includes programs for sequence retrieval, pattern discovery, phylogenetic footprint detection, pattern matching, genome scanning and feature map drawing. Random controls can be performed with random gene selections or by generating random sequences according to a variety of background models (Bernoulli, Markov). Beyond the original word-based pattern-discovery tools (oligo-analysis and dyad-analysis), we recently added a battery of tools for matrix-based detection of cis-acting elements, with some original features (adaptive background models, Markov-chain estimation of P-values) that do not exist in other matrix-based scanning tools. The web server offers an intuitive interface, where each program can be accessed either separately or connected to the other tools. In addition, the tools are now available as web services, enabling their integration in programmatic workflows. Genomes are regularly updated from various genome repositories (NCBI and EnsEMBL) and 682 organisms are currently supported. Since 1998, the tools have been used by several hundreds of researchers from all over the world. Several predictions made with RSAT were validated experimentally and published. PMID:18495751
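    The abstract above mentions Bernoulli and Markov background models for generating random control sequences and estimating P-values. As a rough illustration of what such a background model is, here is a minimal sketch of an order-k Markov model estimated from training sequences, where order 0 reduces to a Bernoulli model. The function names (`train_markov`, `log_prob`), the pseudocount smoothing and the DNA alphabet are assumptions made for this example; this is not RSAT code and does not reflect its actual implementation.

```python
import math
from collections import defaultdict


def train_markov(sequences, order=1, alphabet="ACGT", pseudocount=1.0):
    """Estimate P(base | preceding `order` bases) from training sequences.

    Returns a function prob(prefix, base). Pseudocounts (an assumption of
    this sketch) keep unseen transitions from getting zero probability.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        seq = seq.upper()
        for i in range(order, len(seq)):
            counts[seq[i - order:i]][seq[i]] += 1.0

    def prob(prefix, base):
        row = counts.get(prefix, {})
        total = sum(row.values()) + pseudocount * len(alphabet)
        return (row.get(base, 0.0) + pseudocount) / total

    return prob


def log_prob(seq, prob, order=1):
    """Log-probability of `seq` under the model, conditioned on its first `order` bases."""
    seq = seq.upper()
    return sum(math.log(prob(seq[i - order:i], seq[i]))
               for i in range(order, len(seq)))


if __name__ == "__main__":
    # Toy training data, for illustration only.
    model = train_markov(["ACGTACGT"], order=1)
    print(log_prob("ACG", model, order=1))
```

    A word whose log-probability under the background is low but which occurs often in a set of co-regulated promoters is the kind of over-represented pattern that word-based discovery tools look for; the scoring statistics used by the real tools are more elaborate than this sketch.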

  7. Automated cycle sequencing with Taquenase: protocols for internal labeling, dye primer and "doublex" simultaneous sequencing.

    PubMed

    Voss, H; Nentwich, U; Duthie, S; Wiemann, S; Benes, V; Zimmermann, J; Ansorge, W

    1997-08-01

    This paper describes automated cycle sequencing protocols for internal labeling, dye primer and "doublex" simultaneous sequencing using Taquenase, a new genetically modified DNA polymerase with increased thermostability. Sequencing performance both with labeled and unlabeled primer yields uniform unambiguous signals up to the resolution limit of the sequencing gels. Primer walking with internal labeling was successfully performed on P1-derived artificial chromosome (PAC) constructs with 130-kb inserts. Taquenase, a commercially available modified thermostable sequencing enzyme (delta 280, F667Y Taq DNA polymerase), incorporates a variety of fluorescent dNTPs carrying fluorescein isothiocyanate, TexasRed or Cy5 labels during the cycle-sequencing process with higher efficiency than other thermostable DNA polymerases. Comparison to other modified Taq DNA polymerases suggests that the particular N-terminal deletion of Taquenase rather than the presence of the F667Y mutation is responsible for the efficient incorporation and extension of labeled dNTPs. Taquenase makes feasible highly accurate "doublex" simultaneous cycle sequencing on both strands of template DNA with two internal labels or two dye-labeled primers in combination with the EMBL-2-dye DNA sequencing system, ARAKIS, or with two commercial DNA sequencers. It allows up to 2000 bases at > 99% accuracy to be determined in a single reaction. PMID:9266089

  8. Molecular cloning, characterisation, and expression of a neutral trehalase from the insect pathogenic fungus Metarhizium anisopliae.

    PubMed

    Xia, Yuxian; Gao, Meiying; Clarkson, John; Charnley, A

    2002-06-01

    A neutral trehalase gene (NTH1) was isolated from a lambda EMBL3 genomic library of the insect pathogenic fungus Metarhizium anisopliae. Sequencing of the gene revealed extensive homology with other fungal neutral trehalases. The NTH1 gene exists as a single copy in the genome. Two STREs exist in the 5'UTR of NTH1, which may mediate transcriptional activation of the NTH1 gene in response to various stresses. The NTH1 gene encodes a protein of 737 amino acids with a calculated M(r) of 83.1 kDa. A cyclic adenosine 3',5'-monophosphate-dependent phosphorylation consensus site and a putative calcium binding site were found in the amino-terminal domain of NTH1, consistent with a regulatory enzyme. Expression of the trehalase cDNA was achieved in Saccharomyces cerevisiae. Southern blot analysis of RT-PCR products indicated that the neutral trehalase gene is transcribed in vitro in cell-free haemolymph of the tobacco hornworm Manduca sexta and in vivo in the early stage of infection. PMID:12383437

  9. Distinct Mutations in IRAK-4 Confer Hyporesponsiveness to Lipopolysaccharide and Interleukin-1 in a Patient with Recurrent Bacterial Infections

    PubMed Central

    Medvedev, Andrei E.; Lentschat, Arnd; Kuhns, Douglas B.; Blanco, Jorge C.G.; Salkowski, Cindy; Zhang, Shuling; Arditi, Moshe; Gallin, John I.; Vogel, Stefanie N.

    2003-01-01

    We identified previously a patient with recurrent bacterial infections who failed to respond to gram-negative LPS in vivo, and whose leukocytes were profoundly hyporesponsive to LPS and IL-1 in vitro. We now demonstrate that this patient also exhibits deficient responses in a skin blister model of aseptic inflammation. A lack of IL-18 responsiveness, coupled with diminished LPS and/or IL-1–induced nuclear factor–κB and activator protein-1 translocation, p38 phosphorylation, gene expression, and dysregulated IL-1R–associated kinase (IRAK)–1 activity in vitro support the hypothesis that the defect lies within the signaling pathway common to toll-like receptor 4, IL-1R, and IL-18R. This patient expresses a “compound heterozygous” genotype, with a point mutation (C877T in cDNA) and a two-nucleotide, AC deletion (620–621del in cDNA) encoded by distinct alleles of the IRAK-4 gene (GenBank/EMBL/DDBJ accession nos. AF445802 and AY186092). Both mutations encode proteins with an intact death domain, but a truncated kinase domain, thereby precluding expression of full-length IRAK-4 (i.e., a recessive phenotype). When overexpressed in HEK293T cells, neither truncated form augmented endogenous IRAK-1 kinase activity, and both inhibited endogenous IRAK-1 activity modestly. Thus, IRAK-4 is pivotal in the development of a normal inflammatory response initiated by bacterial or nonbacterial insults. PMID:12925671

  10. Purification of acetoacetate decarboxylase from Clostridium acetobutylicum ATCC 824 and cloning of the acetoacetate decarboxylase gene in Escherichia coli

    SciTech Connect

    Petersen, D.J.; Bennett, G.N.

    1990-11-01

    In Clostridium acetobutylicum ATCC 824, acetoacetate decarboxylase (EC 4.1.1.4) is essential for solvent production, catalyzing the decarboxylation of acetoacetate to acetone. We report here the purification of the enzyme from C. acetobutylicum ATCC 824 and the cloning and expression of the gene encoding the acetoacetate decarboxylase enzyme in Escherichia coli. A bacteriophage lambda EMBL3 library of C. acetobutylicum DNA was screened by plaque hybridization, using oligodeoxynucleotide probes derived from the N-terminal amino acid sequence obtained from the purified protein. Phage DNA from positive plaques was analyzed by Southern hybridization. Restriction mapping and subsequent subcloning of DNA fragments hybridizing to the probes localized the gene within an approximately 2.1-kb EcoRI/BglII fragment. A polypeptide with a molecular weight of approximately 28,000 corresponding to that of the purified acetoacetate decarboxylase was observed in both Western blots (immunoblots) and maxicell analysis of whole-cell extracts of E. coli harboring the clostridial gene. Although the expression of the gene is tightly regulated in C. acetobutylicum, it was well expressed in E. coli, albeit from a promoter sequence of clostridial origin.

  11. Structure of the coding region and mRNA variants of the apyrase gene from pea (Pisum sativum)

    NASA Technical Reports Server (NTRS)

    Shibata, K.; Abe, S.; Davies, E.

    2001-01-01

    Partial amino acid sequences of a 49 kDa apyrase (ATP diphosphohydrolase, EC 3.6.1.5) from the cytoskeletal fraction of etiolated pea stems were used to derive oligonucleotide DNA primers to generate a cDNA fragment of pea apyrase mRNA by RT-PCR and these primers were used to screen a pea stem cDNA library. Two almost identical cDNAs differing in just 6 nucleotides within the coding regions were found, and these cDNA sequences were used to clone genomic fragments by PCR. Two nearly identical gene fragments containing 8 exons and 7 introns were obtained. One of them (H-type) encoded the mRNA sequence described by Hsieh et al. (1996) (DDBJ/EMBL/GenBank Z32743), while the other (S-type) differed by the same 6 nucleotides as the mRNAs, suggesting that these genes may be alleles. The six nucleotide differences between these two alleles were found solely in the first exon, and these mutation sites had two types of consensus sequences. These mRNAs were found with varying lengths of 3' untranslated regions (3'-UTR). There are some similarities between the 3'-UTR of these mRNAs and those of actin and actin binding proteins in plants. The putative roles of the 3'-UTR and alternative polyadenylation sites are discussed in relation to their possible role in targeting the mRNAs to different subcellular compartments.

  12. PTMcode v2: a resource for functional associations of post-translational modifications within and between proteins

    PubMed Central

    Minguez, Pablo; Letunic, Ivica; Parca, Luca; Garcia-Alonso, Luz; Dopazo, Joaquin; Huerta-Cepas, Jaime; Bork, Peer

    2015-01-01

    The post-translational regulation of proteins is mainly driven by two molecular events, their modification by several types of moieties and their interaction with other proteins. These two processes are interdependent and together are responsible for the function of the protein in a particular cell state. Several databases focus on the prediction and compilation of protein–protein interactions (PPIs) and no less on the collection and analysis of protein post-translational modifications (PTMs), however, there are no resources that concentrate on describing the regulatory role of PTMs in PPIs. We developed several methods based on residue co-evolution and proximity to predict the functional associations of pairs of PTMs that we apply to modifications in the same protein and between two interacting proteins. In order to make data available for understudied organisms, PTMcode v2 (http://ptmcode.embl.de) includes a new strategy to propagate PTMs from validated modified sites through orthologous proteins. The second release of PTMcode covers 19 eukaryotic species from which we collected more than 300 000 experimentally verified PTMs (>1 300 000 propagated) of 69 types extracting the post-translational regulation of >100 000 proteins and >100 000 interactions. In total, we report 8 million associations of PTMs regulating single proteins and over 9.4 million interplays tuning PPIs. PMID:25361965

  13. Monoclonal antibodies specific for elongation factor Tu and complete nucleotide sequence of the tuf gene in Mycobacterium tuberculosis.

    PubMed Central

    Carlin, N I; Löfdahl, S; Magnusson, M

    1992-01-01

    Monoclonal antibodies against mycobacterial antigens were produced by immunizing LOU/C rats with live Mycobacterium bovis BCG. The antibodies were characterized by an enzyme-linked immunosorbent assay and by sodium dodecyl sulfate-polyacrylamide gel electrophoresis followed by Western blotting (immunoblotting). One antibody, MAMB 2, reactive with a 47-kDa protein was used to screen a lambda gt11 M. tuberculosis gene library (R. A. Young, B. R. Bloom, C. M. Grosskinsky, J. Ivanji, D. Thomas, and R. W. Davis, Proc. Natl. Acad. Sci. USA 82:2583-2587, 1985). Three recombinant phages reactive with MAMB 2 in plaque lysates were isolated, and part of the insert was sequenced. The mycobacterial inserts were all expressed as proteins fused with beta-galactosidase when the phages were induced as lysogens in Escherichia coli. The entire M. tuberculosis tuf gene was obtained by screening the lambda gt11 library with a DNA probe specific for the primary clones. A phage isolated from this screening was able to express the native protein in E. coli when introduced as a lysogen. A comparison of the entire gene sequence and the deduced protein sequence with the EMBL DNA and Swiss-Prot protein data libraries revealed strong homologies with elongation factors of bacteria, yeast mitochondria, and a plant chloroplast. PMID:1639483

  14. Refined elasticity sampling for Monte Carlo-based identification of stabilizing network patterns

    PubMed Central

    Childs, Dorothee; Grimbs, Sergio; Selbig, Joachim

    2015-01-01

    Motivation: Structural kinetic modelling (SKM) is a framework to analyse whether a metabolic steady state remains stable under perturbation, without requiring detailed knowledge about individual rate equations. It provides a representation of the system’s Jacobian matrix that depends solely on the network structure, steady state measurements, and the elasticities at the steady state. For a measured steady state, stability criteria can be derived by generating a large number of SKMs with randomly sampled elasticities and evaluating the resulting Jacobian matrices. The elasticity space can be analysed statistically in order to detect network positions that contribute significantly to the perturbation response. Here, we extend this approach by examining the kinetic feasibility of the elasticity combinations created during Monte Carlo sampling. Results: Using a set of small example systems, we show that the majority of sampled SKMs would yield negative kinetic parameters if they were translated back into kinetic models. To overcome this problem, a simple criterion is formulated that mitigates such infeasible models. After evaluating the small example pathways, the methodology was used to study two steady states of the neuronal TCA cycle and the intrinsic mechanisms responsible for their stability or instability. The findings of the statistical elasticity analysis confirm that several elasticities are jointly coordinated to control stability and that the main source for potential instabilities are mutations in the enzyme alpha-ketoglutarate dehydrogenase. Contact: dorothee.childs@embl.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26072485
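
    The sampling step described in this abstract (draw random elasticities, build the Jacobian, test stability) can be sketched on a toy system. The following is a minimal illustration only, not the authors' implementation and not their kinetic-feasibility criterion. It assumes a hypothetical two-metabolite pathway (-> X1 -> X2 ->) in which X2 may inhibit or activate the first reaction, unit fluxes and concentrations, and uniformly sampled elasticities; all function names are made up for this example. The Jacobian is formed as J = Lambda * theta, with Lambda[i][j] = N[i][j] * v_j / x_i, and a 2x2 Jacobian is taken as stable when its trace is negative and its determinant is positive.

```python
import random


def mat_mul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def is_stable_2x2(jac):
    """A 2x2 Jacobian has only negative real parts iff trace < 0 and det > 0."""
    trace = jac[0][0] + jac[1][1]
    det = jac[0][0] * jac[1][1] - jac[0][1] * jac[1][0]
    return trace < 0 and det > 0


def stable_fraction(n_samples=10000, flux=1.0, x1=1.0, x2=1.0, seed=0):
    """Fraction of randomly sampled models whose steady state is stable."""
    rng = random.Random(seed)
    # Toy stoichiometry (assumption): v1 makes X1, v2 converts X1 to X2, v3 removes X2.
    stoich = [[1, -1, 0],
              [0, 1, -1]]
    conc = [x1, x2]
    # Lambda[i][j] = N[i][j] * v_j / x_i; all fluxes are equal at steady state.
    lam = [[stoich[i][j] * flux / conc[i] for j in range(3)] for i in range(2)]
    stable = 0
    for _ in range(n_samples):
        feedback = rng.uniform(-1.0, 1.0)   # elasticity of v1 w.r.t. X2 (assumed range)
        e_v2_x1 = 1.0 - rng.random()        # elasticity of v2 w.r.t. X1, in (0, 1]
        e_v3_x2 = 1.0 - rng.random()        # elasticity of v3 w.r.t. X2, in (0, 1]
        theta = [[0.0, feedback],
                 [e_v2_x1, 0.0],
                 [0.0, e_v3_x2]]
        if is_stable_2x2(mat_mul(lam, theta)):
            stable += 1
    return stable / n_samples


if __name__ == "__main__":
    print("stable fraction:", stable_fraction())
```

    In this toy system a sampled model is stable exactly when the elasticity of the outflow with respect to X2 exceeds the feedback elasticity, so positive feedback is the only source of instability here.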

  15. Expanding the fragrance chemical space for virtual screening.

    PubMed

    Ruddigkeit, Lars; Awale, Mahendra; Reymond, Jean-Louis

    2014-01-01

    The properties of fragrance molecules in the public databases SuperScent and Flavornet were analyzed to define a "fragrance-like" (FL) property range (Heavy Atom Count ≤ 21, only C, H, O, S, (O + S) ≤ 3, Hydrogen Bond Donor ≤ 1) and the corresponding chemical space including FL molecules from PubChem (NIH repository of molecules), ChEMBL (bioactive molecules), ZINC (drug-like molecules), and GDB-13 (all possible organic molecules up to 13 atoms of C, N, O, S, Cl). The FL subsets of these databases were classified by MQN (Molecular Quantum Numbers, a set of 42 integer value descriptors of molecular structure) and formatted for fast MQN-similarity searching and interactive exploration of color-coded principal component maps in form of the FL-mapplet and FL-browser applications freely available at http://www.gdb.unibe.ch. MQN-similarity is shown to efficiently recover 15 different fragrance molecule families from the different FL subsets, demonstrating the relevance of the MQN-based tool to explore the fragrance chemical space. PMID:24876890
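
    The FL property range quoted in this abstract is a simple rule and can be sketched as a filter. The following is a minimal illustration, not the authors' code. It assumes each molecule is already summarized as a dictionary of element counts plus a hydrogen-bond-donor count (hypothetical inputs that would normally come from a cheminformatics toolkit), and the function name is_fragrance_like is made up for this example.

```python
ALLOWED_ELEMENTS = {"C", "H", "O", "S"}


def is_fragrance_like(element_counts, hbd_count):
    """Apply the FL rule: heavy atoms <= 21, only C/H/O/S, (O + S) <= 3, HBD <= 1."""
    # Reject any molecule containing an element other than C, H, O or S.
    for element, count in element_counts.items():
        if count > 0 and element not in ALLOWED_ELEMENTS:
            return False
    # Heavy atoms are all atoms except hydrogen.
    heavy_atoms = sum(n for el, n in element_counts.items() if el != "H")
    o_plus_s = element_counts.get("O", 0) + element_counts.get("S", 0)
    return heavy_atoms <= 21 and o_plus_s <= 3 and hbd_count <= 1


if __name__ == "__main__":
    vanillin = {"C": 8, "H": 8, "O": 3}           # C8H8O3, one hydroxyl donor
    caffeine = {"C": 8, "H": 10, "N": 4, "O": 2}  # contains nitrogen
    print(is_fragrance_like(vanillin, hbd_count=1))
    print(is_fragrance_like(caffeine, hbd_count=0))
```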

  16. Quantitative Structure-Antioxidant Activity Models of Isoflavonoids: A Theoretical Study

    PubMed Central

    Castellano, Gloria; Torrens, Francisco

    2015-01-01

    Seventeen isoflavonoids from isoflavone, isoflavanone and isoflavan classes are selected from Dalbergia parviflora. The ChEMBL database is representative of these molecules, most of which are highly drug-like. Binary rules appear risky for the selection of compounds with high antioxidant capacity in complementary xanthine/xanthine oxidase, ORAC, and DPPH model assays. Isoflavonoid structure-activity analysis shows the most important properties (log P, log D, pKa, QED, PSA, NH + OH ≈ HBD, N + O ≈ HBA). Some descriptors (PSA, HBD) are detected as more important than others (size measure Mw, HBA). Linear and nonlinear models of antioxidant potency are obtained. Weak nonlinear relationships appear between log P, etc. and antioxidant activity. The different capacity trends for the three complementary assays are explained. Isoflavonoid potency depends on the chemical form that determines their solubility. Results from isoflavonoid analysis will be useful for activity prediction of new sets of flavones and to design drugs with antioxidant capacity, which will prove beneficial for health with implications for antiageing therapy. PMID:26062128

  17. Remote access to ACNUC nucleotide and protein sequence databases at PBIL.

    PubMed

    Gouy, Manolo; Delmotte, Stéphane

    2008-04-01

    The ACNUC biological sequence database system provides powerful and fast query and extraction capabilities to a variety of nucleotide and protein sequence databases. The collection of ACNUC databases served by the Pôle Bio-Informatique Lyonnais includes the EMBL, GenBank, RefSeq and UniProt nucleotide and protein sequence databases and a series of other sequence databases that support comparative genomics analyses: HOVERGEN and HOGENOM containing families of homologous protein-coding genes from vertebrate and prokaryotic genomes, respectively; Ensembl and Genome Reviews for analyses of prokaryotic and of selected eukaryotic genomes. This report describes the main features of the ACNUC system and the access to ACNUC databases from any internet-connected computer. Such access was made possible by the definition of a remote ACNUC access protocol and the implementation of Application Programming Interfaces between the C, Python and R languages and this communication protocol. Two retrieval programs for ACNUC databases, Query_win, with a graphical user interface and raa_query, with a command line interface, are also described. Altogether, these bioinformatics tools provide users with either ready-to-use means of querying remote sequence databases through a variety of selection criteria, or a simple way to endow application programs with an extensive access to these databases. Remote access to ACNUC databases is open to all and fully documented (http://pbil.univ-lyon1.fr/databases/acnuc/acnuc.html). PMID:17825976

  18. Translation calibration of inverse-kappa goniometers in macromolecular crystallography

    PubMed Central

    Brockhauser, Sandor; White, Kristopher I.; McCarthy, Andrew A.; Ravelli, Raimond B. G.

    2011-01-01

    Precise and convenient crystal reorientation is of experimental importance in macromolecular crystallography (MX). The development of multi-axis goniometers, such as the ESRF/EMBL mini-κ, necessitates the corresponding development of calibration procedures that can be used for the setup, maintenance and troubleshooting of such devices. While traditional multi-axis goniometers require all rotation axes to intersect the unique point of the sample position, recently developed miniaturized instruments for sample reorientation in MX are not as restricted. However, the samples must always be re-centred following a change in orientation. To overcome this inconvenience and allow the use of multi-axis goniometers without the fundamental restriction of having all axes intersecting in the same point, an automatic translation correction protocol has been developed for such instruments. It requires precise information about the direction and location of the rotation axes. To measure and supply this information, a general, easy-to-perform translation calibration (TC) procedure has also been developed. The TC procedure is routinely performed on most MX beamlines at the ESRF and some results are presented for reference. PMID:21487180

  19. antiSMASH 2.0—a versatile platform for genome mining of secondary metabolite producers

    PubMed Central

    Blin, Kai; Medema, Marnix H.; Kazempour, Daniyal; Fischbach, Michael A.; Breitling, Rainer; Takano, Eriko; Weber, Tilmann

    2013-01-01

    Microbial secondary metabolites are a potent source of antibiotics and other pharmaceuticals. Genome mining of their biosynthetic gene clusters has become a key method to accelerate their identification and characterization. In 2011, we developed antiSMASH, a web-based analysis platform that automates this process. Here, we present the highly improved antiSMASH 2.0 release, available at http://antismash.secondarymetabolites.org/. For the new version, antiSMASH was entirely re-designed using a plug-and-play concept that allows easy integration of novel predictor or output modules. antiSMASH 2.0 now supports input of multiple related sequences simultaneously (multi-FASTA/GenBank/EMBL), which allows the analysis of draft genomes comprising multiple contigs. Moreover, direct analysis of protein sequences is now possible. antiSMASH 2.0 has also been equipped with the capacity to detect additional classes of secondary metabolites, including oligosaccharide antibiotics, phenazines, thiopeptides, homo-serine lactones, phosphonates and furans. The algorithm for predicting the core structure of the cluster end product is now also covering lantipeptides, in addition to polyketides and non-ribosomal peptides. The antiSMASH ClusterBlast functionality has been extended to identify sub-clusters involved in the biosynthesis of specific chemical building blocks. The new features currently make antiSMASH 2.0 the most comprehensive resource for identifying and analyzing novel secondary metabolite biosynthetic pathways in microorganisms. PMID:23737449

  20. GlobPlot: exploring protein sequences for globularity and disorder

    PubMed Central

    Linding, Rune; Russell, Robert B.; Neduva, Victor; Gibson, Toby J.

    2003-01-01

    A major challenge in the proteomics and structural genomics era is to predict protein structure and function, including identification of those proteins that are partially or wholly unstructured. Non-globular sequence segments often contain short linear peptide motifs (e.g. SH3-binding sites) which are important for protein function. We present here a new tool for discovery of such unstructured, or disordered regions within proteins. GlobPlot (http://globplot.embl.de) is a web service that allows the user to plot the tendency within the query protein for order/globularity and disorder. We show examples with known proteins where it successfully identifies inter-domain segments containing linear motifs, and also apparently ordered regions that do not contain any recognised domain. GlobPlot may be useful in domain hunting efforts. The plots indicate that instances of known domains may often contain additional N- or C-terminal segments that appear ordered. Thus GlobPlot may be of use in the design of constructs corresponding to globular proteins, as needed for many biochemical studies, particularly structural biology. GlobPlot has a pipeline interface—GlobPipe—for the advanced user to do whole proteome analysis. GlobPlot can also be used as a generic infrastructure package for graphical displaying of any possible propensity. PMID:12824398

  1. Diopatra neapolitana and Diopatra marocensis from the Portuguese coast: Morphological and genetic comparison

    NASA Astrophysics Data System (ADS)

    Rodrigues, Ana Maria; Pires, Adília; Mendo, Sónia; Quintino, Victor

    2009-12-01

    This paper reports the presence of Diopatra marocensis in European waters, for which Diopatra neapolitana was the only species recognized until recently. Both species coexist in transitional waters, where D. marocensis may be mistaken for young specimens of D. neapolitana. The population of D. marocensis studied in the coastal shelf can be traced back to 1997 and is increasing in density, apparently benefiting from a local anthropogenic organic enrichment source. This study emphasizes the main morphological characteristics that allow discriminating the two species and uses a molecular approach through the mitochondrial DNA genes 16S rDNA and COI (cytochrome c oxidase subunit I) analysis to confirm their distinction. The percentage of nucleotide divergence of the 16S and COI genes between the two species was 14% and 17%, respectively. The nucleotide sequence was conserved among all specimens of the same species for the 16S gene, and the differences observed between individuals of the same species for the COI gene always corresponded to a silent alteration with no amino acid change. The nucleotide sequences of the two genes of both species were also compared to the sequences of Diopatra aciculata deposited in the EMBL database. The divergence values between Diopatra marocensis and D. aciculata were 14% and 18% for 16S and COI, respectively, whereas those between Diopatra neapolitana and D. aciculata were 1% and 5% for 16S and COI, respectively. Phylogenetic analysis was performed to deduce relationships among the Diopatra species studied. This analysis showed that D. marocensis and D. neapolitana are in different clades and thus could be considered different species, whereas D. aciculata and D. neapolitana are in sister clades, thus emphasising their similarities, already noticed at a morphological level.
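
    A percentage of nucleotide divergence such as the values reported in this abstract can be computed from a pairwise alignment. The following is a minimal sketch that assumes an uncorrected p-distance (the share of differing positions among compared positions); the abstract does not state which distance measure was used, and the function name and example sequences are made up for illustration.

```python
def percent_divergence(seq_a, seq_b):
    """Uncorrected p-distance (in percent) between two aligned sequences.

    Positions where either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped rather than counted as differences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * differing / compared


if __name__ == "__main__":
    # Two made-up aligned fragments differing at 2 of 10 positions.
    print(percent_divergence("ACGTACGTAC", "ACGTTCGTAA"))
```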

  2. Tuning hERG out: Antitarget QSAR Models for Drug Development

    PubMed Central

    Braga, Rodolpho C.; Alves, Vinícius M.; Silva, Meryck F. B.; Muratov, Eugene; Fourches, Denis; Tropsha, Alexander; Andrade, Carolina H.

    2015-01-01

    Several non-cardiovascular drugs have been withdrawn from the market due to their inhibition of hERG K+ channels that can potentially lead to severe heart arrhythmia and death. As hERG safety testing is a mandatory FDA-required procedure, there is a considerable interest for developing predictive computational tools to identify and filter out potential hERG blockers early in the drug discovery process. In this study, we aimed to generate predictive and well-characterized quantitative structure–activity relationship (QSAR) models for hERG blockage using the largest publicly available dataset of 11,958 compounds from the ChEMBL database. The models have been developed and validated according to OECD guidelines using four types of descriptors and four different machine-learning techniques. The classification accuracies discriminating blockers from non-blockers were as high as 0.83–0.93 on external set. Model interpretation revealed several SAR rules, which can guide structural optimization of some hERG blockers into non-blockers. We have also applied the generated models for screening the World Drug Index (WDI) database and identify putative hERG blockers and non-blockers among currently marketed drugs. The developed models can reliably identify blockers and non-blockers, which could be useful for the scientific community. A freely accessible web server has been developed allowing users to identify putative hERG blockers and non-blockers in chemical libraries of their interest (http://labmol.farmacia.ufg.br/predherg). PMID:24805060

  3. Increased macroH2A1.1 Expression Correlates with Poor Survival of Triple-Negative Breast Cancer Patients

    PubMed Central

    Lavigne, Anne-Claire; Castells, Magali; Mermet, Jérôme; Kocanova, Silvia; Dalvai, Mathieu; Bystricky, Kerstin

    2014-01-01

    Purpose: Epithelial-Mesenchymal Transition (EMT) features appear to be key events in development and progression of breast cancer. Epigenetic modifications contribute to the establishment and maintenance of cancer subclasses, as well as to the EMT process. Whether histone variants contribute to these transformations is not known. We investigated the relative expression levels of histone macroH2A1 splice variants and correlated them with breast cancer status/prognosis/types. Methods: To detect differential expression of macroH2A1 variant mRNAs in breast cancer cells and tumor samples, we used the following databases: GEO, EMBL-EBI and publisher databases (May-August 2012). We extracted macroH2A1.1/macroH2A1 mRNA ratios and performed correlation studies on intrinsic molecular subclasses of breast cancer and on molecular characteristics of EMT. Associations between molecular and survival data were determined. Results: We found increased macroH2A1.1/macroH2A1 mRNA ratios to be associated with the claudin-low intrinsic subtype in breast cancer cell lines. At the molecular level this association translates into a positive correlation between macroH2A1 ratios and molecular characteristics of the EMT process. Moreover, untreated Triple Negative Breast Cancers presenting a high macroH2A1.1 mRNA ratio exhibit a poor outcome. Conclusion: These results provide first evidence that macroH2A1.1 could be exploited as an actor in the maintenance of a transient cellular state in EMT progress towards metastatic development of breast tumors. PMID:24911873

  4. Evolutionary and functional analysis of fructose bisphosphate aldolase of plant parasitic nematodes.

    PubMed

    Prasad, Cvs Siva; Gupta, Saurabh; Kumar, Himansu; Tiwari, Murlidhar

    2013-01-01

    The essential and ubiquitous enzyme fructose bisphosphate aldolase (FBPA) has been a good target for controlling the various types of infections caused by pathogens and parasites. The parasitic infections of nematodes are a major concern of the scientific community, leading to biochemical characterization of this enzyme. In this work we have developed a small dataset of all types of FBPA sequences collected from publicly available databases (EMBL, NCBI and UniProt). The phylogenetic study shows that evolutionary relationships among sequences of FBPA are clustered into three main groups. FBPA sequences of Globodera rostochiensis (FBPA_GR) and Heterodera glycines (FBPA_HG) are placed in group II, sharing a similar evolutionary relationship. The catalytic mechanism of these enzymes depends upon the class of aldolase to which they belong. The class of enzyme has been confirmed on the basis of sequence and structural similarity with the template structure of class I FBPA. To confirm the catalytic mechanism of the above-mentioned model structures, the known substrate fructose-1,6-bisphosphate (FBP) and the competitive inhibitor mannitol-1,6-bisphosphate (MBP) were docked at the known catalytic site of the enzyme of interest. The comparative docking analysis shows that the enzyme-substrate complex forms a similar Schiff base intermediate and conducts C(3)-C(4) bond cleavage by forming hydrogen bonds with the reaction-catalyzing Glu-191, the reactive Lys-150, and the Schiff-base-forming Lys-233. On the other hand, the enzyme-inhibitor noncovalent complex forms a carbinolamine precursor, and proton transfer through the formation of a hydrogen bond between MBP O(2) and Glu-191 enables stabilization of the carbinolamine transition state, which confirms the similar inhibition mechanism. Thus we conclude that Plant Parasitic Nematodes (PPNs) have an evolutionary and functional relationship with the class I aldolase enzyme. Hence, FBPA can be targeted to control plant parasitic nematodes. PMID:23390337

  5. Optimization of protein purification and characterization using Thermofluor screens.

    PubMed

    Boivin, Stephane; Kozak, Sandra; Meijers, Rob

    2013-10-01

    The efficient large scale production of recombinant proteins depends on the careful conditioning of the protein as it is isolated and purified to homogeneity. Low protein stability leads to low purification yields as a result of protein degradation, precipitation and folding instability. It is often necessary to go through several iterations of trial-and-error to optimize the homogeneity, stability and solubility of the protein sample. We have set up Thermofluor assays to identify customized protocols for the preparation and characterization of individual protein constructs. We apply a two-step approach: we first screen for global parameters, followed by a search for protein-specific additives. The first screen has been designed in such a way, that it is possible to discern global stability trends according to pH, salt concentration, buffer type and concentration. The second screen contains small molecules that can affect the folding, aggregation state and solubility of the protein construct and also includes small molecules that specifically bind and stabilize proteins. The screens are designed to evaluate purification and storage protocols, and aim to provide hints to optimize these protocols. The home-made screens have been tested on more than 200 different protein constructs at the Sample Preparation and Characterization (SPC) facility at EMBL Hamburg. We describe which RT-PCR machines can be adapted to perform Thermofluor assays, what are the necessary experimental conditions to set up a screen, some leads on how to interpret the data and we give several examples of Thermofluor applications beyond stability screens. PMID:23948764

  6. Induced defence responses of contrasting bread wheat genotypes under differential salt stress imposition.

    PubMed

    Singh, Archana; Bhushan, Bharat; Gaikwad, Kishor; Yadav, O P; Kumar, Suresh; Rai, R D

    2015-02-01

    Plants, being sessile in nature, have developed mechanisms to cope with high salt concentrations in the soil. In this study, the effects of NaCl (50-200 mM) on expression of high-affinity potassium transporters (HKTs), antioxidant enzymes and their isozyme profiles were investigated in two contrasting bread wheat (Triticum aestivum L.) genotypes viz., HD2329 (salt-sensitive) and Kharchia65 (salt-tolerant). Kharchia65 can successfully grow in salt affected soils, while HD2329 cannot tolerate salt stress. Differential expression studies of two HKT genes (TaHKT2;1.1 and TaHKT2;3.1) revealed their up-regulated expression (~1.5-fold) in the salt-sensitive HD2329 and down-regulated (~5-fold) inducible expression in the salt-tolerant genotype (Kharchia65). Specific activity of antioxidant enzymes, viz. superoxide dismutase (SOD), peroxidase (POX), ascorbate peroxidase (APX), catalase (CAT) and glutathione reductase (GR) was found to be higher in the salt-tolerant genotype. Isozyme profile of two (POX and GR) antioxidant enzymes showed polymorphism between salt-tolerant and salt-sensitive genotypes. A new gene TaHKT2;3.1 was also identified and its expression profile and role in salt stress tolerance in wheat was also studied. Partial sequences of the TaHKT2;1.1 and TaHKT2;3.1 genes from bread wheat were submitted to the EMBL GenBank database. Our findings indicated that defence responses to salt stress were induced differentially in contrasting bread wheat genotypes, which provides evidence for functional correlation between salt stress tolerance and differential biochemical and molecular expression patterns in bread wheat. PMID:26040114

  7. The obligate respiratory supercomplex from Actinobacteria.

    PubMed

    Kao, Wei-Chun; Kleinschroth, Thomas; Nitschke, Wolfgang; Baymann, Frauke; Neehaul, Yashvin; Hellwig, Petra; Richers, Sebastian; Vonck, Janet; Bott, Michael; Hunte, Carola

    2016-10-01

    Actinobacteria are closely linked to human life as industrial producers of bioactive molecules and as human pathogens. Respiratory cytochrome bcc complex and cytochrome aa3 oxidase are key components of their aerobic energy metabolism. They form a supercomplex in the actinobacterial species Corynebacterium glutamicum. With comprehensive bioinformatics and phylogenetic analysis we show that genes for cyt bcc-aa3 supercomplex are characteristic for Actinobacteria (Actinobacteria and Acidimicrobiia, except the anaerobic orders Actinomycetales and Bifidobacteriales). An obligatory supercomplex is likely, due to the lack of genes encoding alternative electron transfer partners such as mono-heme cyt c. Instead, subunit QcrC of bcc complex, here classified as short di-heme cyt c, will provide the exclusive electron transfer link between the complexes as in C. glutamicum. Purified to high homogeneity, the C. glutamicum bcc-aa3 supercomplex contained all subunits and cofactors as analyzed by SDS-PAGE, BN-PAGE, absorption and EPR spectroscopy. Highly uniform supercomplex particles in electron microscopy analysis support a distinct structural composition. The supercomplex possesses a dimeric stoichiometry with a ratio of a-type, b-type and c-type hemes close to 1:1:1. Redox titrations revealed a low potential bcc complex (Em(ISP)=+160mV, Em(bL)=-291mV, Em(bH)=-163mV, Em(cc)=+100mV) fine-tuned for oxidation of menaquinol and a mixed potential aa3 oxidase (Em(CuA)=+150mV, Em(a/a3)=+143/+317mV) mediating between low and high redox potential to accomplish dioxygen reduction. The generated molecular model supports a stable assembled supercomplex with defined architecture which permits energetically efficient coupling of menaquinol oxidation and dioxygen reduction in one supramolecular entity. PMID:27472998

  8. Benzo- and thienobenzo- diazepines: multi-target drugs for CNS disorders.

    PubMed

    Mendonça Júnior, F J B; Scotti, L; Ishiki, H; Botelho, S P S; Da Silva, M S; Scotti, M T

    2015-01-01

    Benzodiazepines (BZ or BZD) are a class of GABAergic psychoactive chemicals used as hypnotics, for sedation, in the treatment of anxiety, and in other CNS disorders. These drugs include alprazolam (Xanax), diazepam (Valium), clonazepam (Klonopin), and others. There are two distinct types of pharmacological binding sites for benzodiazepines in the brain (BZ1 and BZ2), these sites are on GABA-A receptors, and are classified as short, intermediate, or long-acting. From the thienobenzodiazepine class (TBZ), Olanzapine (2-methyl-4-(4-methyl-1-piperazinyl)-10H-thieno[2,3-b][1,5]benzodiazepine) (Zyprexa) was used as an example to demonstrate the antagonism of this class of compounds for multiple receptors including: dopamine D1-D5, α-adrenoreceptor, histamine H1, muscarinic M1-M5 and 5-HT2A, 5-HT2B, 5-HT2C, 5-HT3 and 5-HT6 receptors. Olanzapine is an atypical antipsychotic agent, structurally related to clozapine, and extensively used for the treatment of schizophrenia, bipolar disorder-associated mania, and the behavioral symptoms of Alzheimer's disease. The functional blockade of these multiple receptors contributes to the wide range of its pharmacologic and therapeutic activities, having relatively few side effects when compared to other antipsychotic agents. Thienobenzodiazepines (such as Olanzapine) are characterized as multi-receptor-targeted-acting agents. This mini-review discusses these 2 drug classes that act on the central nervous system, the main active compounds used, and the various receptors with which they interact. In addition, we propose 12 olanzapine analogues, and generated Random Forest models, from a data set obtained from the ChEMBL database, to classify the structures as active or inactive against 5 dopamine receptors (D1, D2, D3, D4, D5 and D6), and dopamine transporter. PMID:25694077

  9. Evaluating the Impact of Different Sequence Databases on Metaproteome Analysis: Insights from a Lab-Assembled Microbial Mixture

    PubMed Central

    Tanca, Alessandro; Palomba, Antonio; Deligios, Massimo; Cubeddu, Tiziana; Fraumene, Cristina; Biosa, Grazia; Pagnozzi, Daniela; Addis, Maria Filippa; Uzzau, Sergio

    2013-01-01

    Metaproteomics enables the investigation of the protein repertoire expressed by complex microbial communities. However, to unleash its full potential, refinements in bioinformatic approaches for data analysis are still needed. In this context, sequence database selection represents a major challenge. This work assessed the impact of different databases in metaproteomic investigations by using a mock microbial mixture including nine diverse bacterial and eukaryotic species, which was subjected to shotgun metaproteomic analysis. Then, both the microbial mixture and the single microorganisms were subjected to next generation sequencing to obtain experimental metagenomic- and genomic-derived databases, which were used along with public databases (namely, NCBI, UniProtKB/SwissProt and UniProtKB/TrEMBL, parsed at different taxonomic levels) to analyze the metaproteomic dataset. First, a quantitative comparison in terms of number and overlap of peptide identifications was carried out among all databases. As a result, only 35% of peptides were common to all database classes; moreover, genus/species-specific databases provided up to 17% more identifications compared to databases with generic taxonomy, while the metagenomic database enabled a slight increment with respect to public databases. Then, database behavior in terms of false discovery rate and peptide degeneracy was critically evaluated. Public databases with generic taxonomy exhibited a markedly different trend compared to their counterparts. Finally, the reliability of taxonomic attribution according to the lowest common ancestor approach (using MEGAN and Unipept software) was assessed. The level of misassignments varied among the different databases, and specific thresholds based on the number of taxon-specific peptides were established to minimize false positives. This study confirms that database selection has a significant impact in metaproteomics, and provides critical indications for improving depth and
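
    The lowest common ancestor (LCA) approach mentioned in this abstract can be illustrated with a minimal sketch. The toy taxonomy, the `PARENT` table, and both function names are hypothetical; this is not how MEGAN or Unipept are implemented, only the core idea of assigning a peptide to the deepest taxon shared by all organisms it matches.

```python
from typing import Dict, List, Optional

# Hypothetical toy taxonomy: child -> parent (the root has parent None).
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "Bacteria": "root",
    "Eukaryota": "root",
    "Escherichia": "Bacteria",
    "Escherichia coli": "Escherichia",
    "Lactobacillus": "Bacteria",
    "Lactobacillus casei": "Lactobacillus",
    "Saccharomyces": "Eukaryota",
    "Saccharomyces cerevisiae": "Saccharomyces",
}


def lineage(taxon: str) -> List[str]:
    """Return the path from the root down to the given taxon."""
    path: List[str] = []
    node: Optional[str] = taxon
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path[::-1]


def lowest_common_ancestor(taxa: List[str]) -> str:
    """Return the deepest taxon shared by the lineages of all input taxa."""
    lineages = [lineage(t) for t in taxa]
    lca = "root"
    # Walk down level by level until the lineages disagree.
    for level in zip(*lineages):
        if len(set(level)) != 1:
            break
        lca = level[0]
    return lca
```

    A peptide found only in Escherichia coli keeps its species-level label, whereas a peptide shared with Lactobacillus casei is pushed up to Bacteria. This shows why degenerate peptides reduce taxonomic resolution.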

  10. Characterization of the bovine C alpha gene.

    PubMed Central

    Brown, W R; Rabbani, H; Butler, J E; Hammarström, L

    1997-01-01

    The complete genomic sequence of a bovine C alpha gene is reported here. The genomic sequence was obtained from a C alpha phage clone that had been cloned from a genomic EMBL4 phage vector library. The C alpha sequence had previously been expressed as a chimeric antibody and identified as IgA using IgA-specific antibodies. Intron/exon boundaries were determined by comparison of the genomic sequence with an expressed bovine C alpha sequence obtained from spleen by reverse transcription-polymerase chain reaction (RT-PCR). Analysis of 50 Swedish bovine genomic DNA samples using genomic blots and five different restriction enzymes failed to detect evidence of polymorphism. However, PstI digests of Brown Swiss DNA showed a restriction fragment length polymorphism (RFLP), suggesting that at least two allelic variants of bovine IgA exist. Comparison of the deduced amino acid sequence of bovine IgA with sequences available for other species indicated that the highest homology was with that of swine, another artiodactyl. This was the highest homology observed for all mammalian IgA compared except for that between IgA1 and IgA2 in humans. Bovine IgA shares with rabbit IgA3 and IgA4, an additional N-linked glycosylation site at position 282. However, the collective data indicate that cattle are like swine and rodents and unlike rabbits in having a single locus of the gene encoding IgA of this species. PMID:9203958
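
    How a PstI RFLP arises can be sketched in a few lines. PstI recognizes CTGCAG and cuts after the A on the top strand (CTGCA^G). The two toy alleles in the usage note are invented for illustration and are not bovine sequences; the function name is likewise hypothetical.

```python
from typing import List

PSTI_SITE = "CTGCAG"   # PstI recognition sequence
PSTI_CUT_OFFSET = 5    # top-strand cut position within the site: CTGCA^G


def fragment_lengths(sequence: str,
                     site: str = PSTI_SITE,
                     cut_offset: int = PSTI_CUT_OFFSET) -> List[int]:
    """Return fragment lengths from a complete digest of a linear sequence."""
    seq = sequence.upper()
    cuts: List[int] = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]
```

    For the invented alleles `"AAAACTGCAGTTTTTTCTGCAGGG"` and `"AAAACTGCAGTTTTTTCTGCAAGG"`, the single base change destroys the second site, so the digest yields fragments of 9, 12 and 3 bases for the first allele but 9 and 15 for the second. A difference of this kind is what shows up as distinct bands on a genomic blot.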

  11. Identification and expression analysis of the cyclophilin gene in Kandelia candel under stress of salt.

    PubMed

    Huang, Wei; Lin, Qi Feng; Li, Guan Yi; Zhao, Wen Ming

    2003-06-01

    Two cDNA fragments, named SRGKC2 and SRGKC3, encoding cyclophilin in Kandelia candel were isolated by representational difference analysis of cDNA. The two cDNA fragments were 282 bp and 160 bp, respectively. Sequence analysis shows that both SRGKC2 and SRGKC3 come from the same gene region, and SRGKC3 is a part of SRGKC2. In addition, SRGKC2 displayed 90% sequence identity over a region of 84 amino acids to the cyclophilin from Euphorbia esula, and SRGKC3 displayed 93% sequence identity over a region of 47 amino acids to that from fava bean. Northern blotting showed that the expression of SRGKC2 was suppressed under salt stress. Based on the sequence of SRGKC2, a full-length cDNA (KCCYP1) was isolated by RACE (the sequence data have been submitted to the EMBL databases under accession No. AY150052). The full-length cDNA was about 0.9 kb and contained an open reading frame (ORF) of 516 bp coding for 172 amino acid residues, with an isoelectric point of 8.57 and a molecular weight of 18.2 kD. Motif A of the ATP/GTP-binding site in KCCYP1 appears at amino acid residues 41-49, and a seven-amino-acid insertion is present at residues 48-54. The expression patterns of SRGKC2 in various species were also investigated. PMID:12966731

  12. MMpI: A Wide Range of Available Compounds of Matrix Metalloproteinase Inhibitors.

    PubMed

    Muvva, Charuvaka; Patra, Sanjukta; Venkatesan, Subramanian

    2016-01-01

    Matrix metalloproteinases (MMPs) are a family of zinc-dependent proteinases involved in the regulation of the extracellular signaling and structural matrix environment of cells and tissues. MMPs are considered as promising targets for the treatment of many diseases. Therefore, creation of database on the inhibitors of MMP would definitely accelerate the research activities in this area due to its implication in above-mentioned diseases and associated limitations in the first and second generation inhibitors. In this communication, we report the development of a new MMpI database which provides resourceful information for all researchers working in this field. It is a web-accessible, unique resource that contains detailed information on the inhibitors of MMP including small molecules, peptides and MMP Drug Leads. The database contains entries of ~3000 inhibitors including ~72 MMP Drug Leads and ~73 peptide based inhibitors. This database provides the detailed molecular and structural details which are necessary for the drug discovery and development. The MMpI database contains physical properties, 2D and 3D structures (mol2 and pdb format files) of inhibitors of MMP. Other data fields are hyperlinked to PubChem, ChEMBL, BindingDB, DrugBank, PDB, MEROPS and PubMed. The database has extensive searching facility with MMpI ID, IUPAC name, chemical structure and with the title of research article. The MMP inhibitors provided in MMpI database are optimized using Python-based Hierarchical Environment for Integrated Xtallography (Phenix) software. MMpI Database is unique and it is the only public database that contains and provides the complete information on the inhibitors of MMP. Database URL: http://clri.res.in/subramanian/databases/mmpi/index.php. PMID:27509041

  13. Characterization of Insertions of IS476 and Two Newly Identified Insertion Sequences, IS1478 and IS1479, in Xanthomonas campestris pv. campestris

    PubMed Central

    Chen, Jiann-Hwa; Hsieh, Yu-Ying; Hsiau, Su-Lian; Lo, Ta-Chun; Shau, Chen-Chun

    1999-01-01

    Thirty-two plasmid insertion mutants were independently isolated from two strains of Xanthomonas campestris pv. campestris in Taiwan. Of the 32 mutants, 14 (44%), 8 (25%), and 4 (12%) mutants resulted from separate insertions of an IS3 family member, IS476, and two new insertion sequences (IS), IS1478 and IS1479. While IS1478 does not have significant sequence homology with any IS elements in the EMBL/GenBank/DDBJ database, IS1479 demonstrated 73% sequence homology with IS1051 in X. campestris pv. dieffenbachiae, 62% homology with IS52 in Pseudomonas syringae pv. glycinea, and 60% homology with IS5 in Escherichia coli. Based on the predicted transposase sequences as well as the terminal nucleotide sequences, IS1478 by itself constitutes a new subfamily of the widespread IS5 family, whereas IS1479, along with IS1051, IS52, and IS5, belongs to the IS5 subfamily of the IS5 family. All but one of the IS476 insertions had duplications of 4 bp at the target sites without sequence preference and were randomly distributed. An IS476 insertion carried a duplication of 952 bp at the target site. A model for generating these long direct repeats is proposed. Insertions of IS1478 and IS1479, on the other hand, were not random, and IS1478 and IS1479 each showed conservation of PyPuNTTA and PyTAPu sequences (Py is a pyrimidine, Pu is a purine, and N is any nucleotide) for duplications at the target sites. The results of Southern blot hybridization analysis indicated that multiple copies of IS476, IS1478, and IS1479 are present in the genomes of all seven X. campestris pv. campestris strains tested and several X. campestris pathovars. PMID:9973349
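
    The target-site consensus notation used in this abstract (Py = pyrimidine, Pu = purine, N = any nucleotide) maps directly onto regular-expression character classes. The following is a minimal sketch. The constant and function names are hypothetical, and the test sequences in the usage note are invented.

```python
import re
from typing import List

# Character classes for the notation in the abstract.
PY = "[CT]"     # pyrimidine
PU = "[AG]"     # purine
ANY = "[ACGT]"  # any nucleotide

# The two consensus sequences, PyPuNTTA and PyTAPu, as regular expressions.
PYPUNTTA = re.compile(PY + PU + ANY + "TTA")
PYTAPU = re.compile(PY + "TA" + PU)


def find_sites(pattern: "re.Pattern[str]", sequence: str) -> List[int]:
    """Return 0-based start positions of all matches, including overlapping ones."""
    seq = sequence.upper()
    # A zero-width lookahead lets overlapping occurrences be reported.
    lookahead = re.compile("(?=" + pattern.pattern + ")")
    return [m.start() for m in lookahead.finditer(seq)]
```

    For example, `find_sites(PYPUNTTA, "GGCATTTAGG")` reports the single candidate site CATTTA starting at position 2, and `find_sites(PYTAPU, "ACTAGT")` reports CTAG at position 1.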

  14. Substrate-Driven Mapping of the Degradome by Comparison of Sequence Logos

    PubMed Central

    Fuchs, Julian E.; von Grafenstein, Susanne; Huber, Roland G.; Kramer, Christian; Liedl, Klaus R.

    2013-01-01

    Sequence logos are frequently used to illustrate substrate preferences and specificity of proteases. Here, we employed the compiled substrates of the MEROPS database to introduce a novel metric for comparison of protease substrate preferences. The constructed similarity matrix of 62 proteases can be used to intuitively visualize similarities in protease substrate readout via principal component analysis and construction of protease specificity trees. Since our new metric is solely based on substrate data, we can engraft the protease tree including proteolytic enzymes of different evolutionary origin. Thereby, our analyses confirm pronounced overlaps in substrate recognition not only between proteases closely related on sequence basis but also between proteolytic enzymes of different evolutionary origin and catalytic type. To illustrate the applicability of our approach we analyze the distribution of targets of small molecules from the ChEMBL database in our substrate-based protease specificity trees. We observe a striking clustering of annotated targets in tree branches even though these grouped targets do not necessarily share similarity on protein sequence level. This highlights the value and applicability of knowledge acquired from peptide substrates in drug design of small molecules, e.g., for the prediction of off-target effects or drug repurposing. Consequently, our similarity metric allows to map the degradome and its associated drug target network via comparison of known substrate peptides. The substrate-driven view of protein-protein interfaces is not limited to the field of proteases but can be applied to any target class where a sufficient amount of known substrate data is available. PMID:24244149
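
    As background for the sequence logos discussed here, the height of each logo column is conventionally its information content, log2(alphabet size) minus the Shannon entropy of the residue frequencies at that position. The sketch below computes only that standard quantity, without small-sample correction. It is not the similarity metric introduced in the paper, and the function name and toy peptides are hypothetical.

```python
import math
from collections import Counter
from typing import List


def column_information(peptides: List[str], alphabet_size: int = 20) -> List[float]:
    """Per-position information content (bits) of aligned, equal-length peptides."""
    if not peptides:
        raise ValueError("at least one peptide is required")
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    info: List[float] = []
    for i in range(length):
        counts = Counter(p[i] for p in peptides)
        total = sum(counts.values())
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        info.append(math.log2(alphabet_size) - entropy)
    return info
```

    A fully conserved position reaches the maximum of log2(20), about 4.32 bits, while a position split evenly between two residues loses exactly one bit.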

  15. Assessing strategies for improved superfamily recognition

    PubMed Central

    Sillitoe, Ian; Dibley, Mark; Bray, James; Addou, Sarah; Orengo, Christine

    2005-01-01

    There are more than 200 completed genomes and over 1 million nonredundant sequences in public repositories. Although the structural data are more sparse (~13,000 nonredundant structures solved to date), several powerful sequence-based methodologies now allow these structures to be mapped onto related regions in a significant proportion of genome sequences. We review a number of publicly available strategies for providing structural annotations for genome sequences, and we describe the protocol adopted to provide CATH structural annotations for completed genomes. In particular, we assess the performance of several sequence-based protocols employing Hidden Markov model (HMM) technologies for superfamily recognition, including a new approach (SAMOSA [sequence augmented models of structure alignments]) that exploits multiple structural alignments from the CATH domain structure database when building the models. Using a data set of remote homologs detected by structure comparison and manually validated in CATH, a single-seed HMM library was able to recognize 76% of the data set. Including the SAMOSA models in the HMM library showed little gain in homolog recognition, although a slight improvement in alignment quality was observed for very remote homologs. However, using an expanded 1D-HMM library, CATH-ISL increased the coverage to 86%. The single-seed HMM library has been used to annotate the protein sequences of 120 genomes from all three major kingdoms, allowing up to 70% of the genes or partial genes to be assigned to CATH superfamilies. It has also been used to recruit sequences from Swiss-Prot and TrEMBL into CATH domain superfamilies, expanding the CATH database eightfold. PMID:15937274

  16. Human monomethylarsonic acid (MMA(V)) reductase is a member of the glutathione-S-transferase superfamily.

    PubMed

    Zakharyan, R A; Sampayo-Reyes, A; Healy, S M; Tsaprailis, G; Board, P G; Liebler, D C; Aposhian, H V

    2001-08-01

    The drinking of water containing large amounts of inorganic arsenic is a worldwide major public health problem because of arsenic carcinogenicity. Yet an understanding of the specific mechanism(s) of inorganic arsenic toxicity has been elusive. We have now partially purified the rate-limiting enzyme of inorganic arsenic metabolism, human liver MMA(V) reductase, using ion exchange, molecular exclusion, and hydroxyapatite chromatography. When SDS-beta-mercaptoethanol-PAGE was performed on the most purified fraction, seven protein bands were obtained. Each band was excised from the gel, sequenced by LC-MS/MS and identified according to the SWISS-PROT and TrEMBL Protein Sequence databases. Human liver MMA(V) reductase is 100% identical, over 92% of sequence that we analyzed, with the recently discovered human glutathione-S-transferase Omega class hGSTO 1-1. Recombinant human GSTO1-1 had MMA(V) reductase activity with K(m) and V(max) values comparable to those of human liver MMA(V) reductase. The partially purified human liver MMA(V) reductase had glutathione S-transferase (GST) activity. MMA(V) reductase activity was competitively inhibited by the GST substrate, 1-chloro 2,4-dinitrobenzene and also by the GST inhibitor, deoxycholate. Western blot analysis of the most purified human liver MMA(V) reductase showed one band when probed with hGSTO1-1 antiserum. We propose that MMA(V) reductase and hGSTO 1-1 are identical proteins. PMID:11511179
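
    The competitive inhibition reported in this abstract follows the textbook Michaelis-Menten form, in which the inhibitor raises the apparent K(m) by a factor (1 + [I]/K(i)) while V(max) is unchanged. A minimal sketch, with a hypothetical function name and illustrative values only (the abstract gives no kinetic constants):

```python
def competitive_rate(substrate: float, vmax: float, km: float,
                     inhibitor: float = 0.0, ki: float = float("inf")) -> float:
    """Michaelis-Menten rate with a competitive inhibitor.

    v = Vmax * [S] / (Km * (1 + [I]/Ki) + [S])
    """
    if ki <= 0:
        raise ValueError("Ki must be positive")
    apparent_km = km * (1.0 + inhibitor / ki)
    return vmax * substrate / (apparent_km + substrate)
```

    With [S] equal to K(m) and no inhibitor, the rate is half of V(max). Adding inhibitor at [I] = K(i) doubles the apparent K(m) and lowers the rate to one third of V(max), yet at saturating substrate the rate still approaches V(max), which is the signature of competitive inhibition.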

  17. Human histone gene organization: Nonregular arrangement within a large cluster

    SciTech Connect

    Albig, W.; Meergans, K.; Doenecke, D.

    1997-03-01

    We have previously located the genes of the five human main type H1 genes and the gene encoding the testicular subtype H1t to the region 21.1 to 22.2 on the short arm of chromosome 6. To investigate the organization of the histone genes in this region, we isolated two YACs from a human YAC library by PCR screening with primers specific for histone H1.1. This screen revealed two YAC clones. YAC Y23 (corresponding to ICRFy901D1223) contains an insert of about 480 kb, whereas the smaller YAC 4A (corresponding to ICRFy900C104) spans about 340 kb and is completely covered by YAC Y23. We have subcloned the YAC inserts in cosmids, determined the linear orientation of the cosmids by cosmid walking, and constructed a restriction map of the entire region by mapping the individual cosmids using partial digests and hybridization with labeled oligonucleotides complementary to the cos site of the vector. Hybridization analysis, subcloning, restriction mapping, and sequencing revealed that most of the previously isolated phage and cosmid clones containing histone genes are part of this YAC including the clones containing the four human main type H1 histone genes H1.1 to H1.4, the H1t gene, and core histone genes. Thirty-five histone genes map within 260 kb of the YAC Y23 insert. All newly identified histone genes were sequenced, and the sequences were deposited with the EMBL nucleotide sequence database. The histone H1.5 gene is not part of this region, and we therefore conclude that the H1.5 gene and the associated core histone genes form a separate subcluster within this chromosomal region. 53 refs., 4 figs., 1 tab.

  18. Whole-Genome Sequence of Chryseobacterium oranimense, a Colistin-Resistant Bacterium Isolated from a Cystic Fibrosis Patient in France

    PubMed Central

    Sharma, Poonam; Gupta, Sushim Kumar; Diene, Seydina M.

    2015-01-01

    For the first time, we report the whole-genome sequence analysis of Chryseobacterium oranimense G311, a multidrug-resistant bacterium, from a cystic fibrosis patient in France, including resistance to colistin. Whole-genome sequencing of C. oranimense G311 was performed using Ion Torrent PGM, and RAST, the EMBL-EBI server, and the Antibiotic Resistance Gene-ANNOTation (ARG-ANNOT) database were used for annotation of all genes, including antibiotic resistance (AR) genes. General features of the C. oranimense G311 draft genome were compared to the other available genomes of Chryseobacterium gleum and Chryseobacterium sp. strain CF314. C. oranimense G311 was found to be resistant to all β-lactams, including imipenem, and to colistin. The genome size of C. oranimense G311 is 4,457,049 bp in length, with 37.70% GC content. We found 27 AR genes in the genome, including β-lactamase genes which showed little similarity to the known β-lactamase genes and could likely be novel. We found the type I polyketide synthase operon followed by a zeaxanthin glycosyltransferase gene in the genome, which could impart the yellow pigmentation of the isolate. We located the O-antigen biosynthesis cluster, and we also discovered a novel capsular polysaccharide biosynthesis cluster. We also found known mutations in the orthologs of the pmrA (E8D), pmrB (L208F and P360Q), and lpxA (G68D) genes. We speculate that the presence of the capsular cluster and mutations in these genes could explain the resistance of this bacterium to colistin. We demonstrate that whole-genome sequencing was successfully applied to decipher the resistome of a multidrug resistance bacterium associated with cystic fibrosis patients. PMID:25583710
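
    The GC content quoted in this abstract is simply the share of G and C among the unambiguous bases of the assembly. A minimal sketch, assuming ambiguous symbols such as N are excluded from the denominator (the abstract does not say how they were handled); the function name is hypothetical:

```python
def gc_content(sequence: str) -> float:
    """Return GC content as a percentage of unambiguous (A, C, G, T) bases."""
    seq = sequence.upper()
    unambiguous = sum(seq.count(base) for base in "ACGT")
    if unambiguous == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / unambiguous
```

    Applied to the reported figures, 37.70% of 4,457,049 bp corresponds to roughly 1.68 million G or C bases.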

  19. Benchmarking the Predictive Power of Ligand Efficiency Indices in QSAR.

    PubMed

    Cortes-Ciriano, Isidro

    2016-08-22

    Compound physicochemical properties favoring in vitro potency are not always correlated to desirable pharmacokinetic profiles. Therefore, using potency (i.e., IC50) as the main criterion to prioritize candidate drugs at early stage drug discovery campaigns has been questioned. Yet, the vast majority of the virtual screening models reported in the medicinal chemistry literature predict the biological activity of compounds by regressing in vitro potency on topological or physicochemical descriptors. Two studies published in this journal showed that higher predictive power on external molecules can be achieved by using ligand efficiency indices as the dependent variable instead of a metric of potency (IC50) or binding affinity (Ki). The present study aims to provide the thorough assessment, so far lacking, of the predictive power of ligand efficiency indices in QSAR. To this end, the predictive power of 11 ligand efficiency indices has been benchmarked across four algorithms (Gradient Boosting Machines, Partial Least Squares, Random Forest, and Support Vector Machines), two descriptor types (Morgan fingerprints and physicochemical descriptors), and 29 data sets collected from the literature and the ChEMBL database. Ligand efficiency metrics led to the highest predictive power on external molecules irrespective of the descriptor type or algorithm used, with an R² difference on the test set of ∼0.3 units, a difference of ∼0.4 units when modeling small data sets, and a normalized RMSE decrease of >0.1 units in some cases. Polarity indices, such as SEI and NSEI, led to higher predictive power than metrics based on molecular size, i.e., BEI, NBEI, and LE. LELP, which comprises a polarity factor (cLogP) and a size parameter (LE), consistently led to the most predictive models, suggesting that these two properties convey a complementary predictive signal. Overall, this study suggests that using ligand efficiency indices as the dependent variable might be an efficient strategy to model
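
    Several of the indices named in this abstract can be computed from potency and a few simple properties. The sketch below uses commonly cited definitions, with LE approximated as 1.37 × pIC50 per heavy atom. These are assumptions, not taken from the abstract, and the exact formulas benchmarked in the study may differ. NSEI and NBEI are omitted, and all function names are hypothetical.

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 (-log10 of the molar IC50)."""
    return 9.0 - math.log10(ic50_nm)


def ligand_efficiency(pic50: float, heavy_atoms: int) -> float:
    """LE, approximated as 1.37 * pIC50 / heavy atom count (kcal/mol per atom)."""
    return 1.37 * pic50 / heavy_atoms


def binding_efficiency_index(pic50: float, mol_weight_da: float) -> float:
    """BEI: pIC50 divided by molecular weight expressed in kDa."""
    return pic50 / (mol_weight_da / 1000.0)


def surface_efficiency_index(pic50: float, polar_surface_area: float) -> float:
    """SEI: pIC50 divided by polar surface area in units of 100 square angstroms."""
    return pic50 / (polar_surface_area / 100.0)


def lelp(clogp: float, le: float) -> float:
    """LELP: cLogP divided by LE, combining a polarity and a size term."""
    return clogp / le
```

    For an illustrative compound with IC50 = 10 nM, 25 heavy atoms, a molecular weight of 400 Da and a polar surface area of 80 square angstroms, these give pIC50 = 8, LE ≈ 0.44, BEI = 20 and SEI = 10. In a QSAR setting, one of these values would replace pIC50 as the regression target.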

  20. Whole-genome sequence of Chryseobacterium oranimense, a colistin-resistant bacterium isolated from a cystic fibrosis patient in France.

    PubMed

    Sharma, Poonam; Gupta, Sushim Kumar; Diene, Seydina M; Rolain, Jean-Marc

    2015-03-01

    For the first time, we report the whole-genome sequence analysis of Chryseobacterium oranimense G311, a multidrug-resistant bacterium, from a cystic fibrosis patient in France, including resistance to colistin. Whole-genome sequencing of C. oranimense G311 was performed using Ion Torrent PGM, and RAST, the EMBL-EBI server, and the Antibiotic Resistance Gene-ANNOTation (ARG-ANNOT) database were used for annotation of all genes, including antibiotic resistance (AR) genes. General features of the C. oranimense G311 draft genome were compared to the other available genomes of Chryseobacterium gleum and Chryseobacterium sp. strain CF314. C. oranimense G311 was found to be resistant to all β-lactams, including imipenem, and to colistin. The genome size of C. oranimense G311 is 4,457,049 bp in length, with 37.70% GC content. We found 27 AR genes in the genome, including β-lactamase genes which showed little similarity to the known β-lactamase genes and could likely be novel. We found the type I polyketide synthase operon followed by a zeaxanthin glycosyltransferase gene in the genome, which could impart the yellow pigmentation of the isolate. We located the O-antigen biosynthesis cluster, and we also discovered a novel capsular polysaccharide biosynthesis cluster. We also found known mutations in the orthologs of the pmrA (E8D), pmrB (L208F and P360Q), and lpxA (G68D) genes. We speculate that the presence of the capsular cluster and mutations in these genes could explain the resistance of this bacterium to colistin. We demonstrate that whole-genome sequencing was successfully applied to decipher the resistome of a multidrug resistance bacterium associated with cystic fibrosis patients. PMID:25583710

  1. Molecular cloning of a mouse DNA repair gene that complements the defect of group-A xeroderma pigmentosum.

    PubMed Central

    Tanaka, K; Satokata, I; Ogita, Z; Uchida, T; Okada, Y

    1989-01-01

    For isolation of the gene responsible for xeroderma pigmentosum (XP) complementation group A, plasmid pSV2gpt and genomic DNA from a mouse embryo were cotransfected into XP2OSSV cells, a group-A XP cell line. Two primary UV-resistant XP transfectants were isolated from about 1.6 × 10^5 pSV2gpt-transformed XP colonies. pSV2gpt and genomic DNA from the primary transfectants were again cotransfected into XP2OSSV cells and a secondary UV-resistant XP transfectant was obtained by screening about 4.8 × 10^5 pSV2gpt-transformed XP colonies. The secondary transfectant retained fewer mouse repetitive sequences. A mouse gene that complements the defect of XP2OSSV cells was cloned into an EMBL3 vector from the genome of a secondary transfectant. Transfections of the cloned DNA also conferred UV resistance on another group-A XP cell line but not on XP cell lines of group C, D, F, or G. Northern blot analysis of poly(A)+ RNA with a subfragment of cloned mouse DNA repair gene as the probe revealed that an approximately 1.0 kilobase mRNA was transcribed in the donor mouse embryo and secondary transfectant, and approximately 1.0- and approximately 1.3-kilobase mRNAs were transcribed in normal human cells, but none of these mRNAs was detected in three strains of group-A XP cells. These results suggest that the cloned DNA repair gene is specific for group-A XP and may be the mouse homologue of the group-A XP human gene. PMID:2748601

  2. Nucleotide sequence of the tobacco (Nicotiana tabacum) anionic peroxidase gene

    SciTech Connect

    Diaz-De-Leon, F.; Klotz, K.L.; Lagrimini, L.M.

    1993-03-01

    Peroxidases have been implicated in numerous physiological processes including lignification (Grisebach, 1981), wound-healing (Espelie et al., 1986), phenol oxidation (Lagrimini, 1991), pathogen defense (Ye et al., 1990), and the regulation of cell elongation through the formation of interchain covalent bonds between various cell wall polymers (Fry, 1986; Goldberg et al., 1986; Bradley et al., 1992). However, a complete description of peroxidase action in vivo is not available because of the vast number of potential substrates and the existence of multiple isoenzymes. The tobacco anionic peroxidase is one of the better-characterized isoenzymes. This enzyme has been shown to oxidize a number of significant plant secondary compounds in vitro including cinnamyl alcohols, phenolic acids, and indole-3-acetic acid (Maeder, 1980; Lagrimini, 1991). A cDNA encoding the enzyme has been obtained, and this enzyme was shown to be expressed at the highest levels in lignifying tissues (xylem and tracheary elements) and also in epidermal tissue (Lagrimini et al., 1987). It was shown at this time that there were four distinct copies of the anionic peroxidase gene in tobacco (Nicotiana tabacum). A tobacco genomic DNA library was constructed in the λ phage EMBL3, from which two unique peroxidase genes were sequenced. One of these clones, λPOD1, was designated as a pseudogene when the exonic sequences were found to differ from the cDNA sequences by 1%, and several frame shifts in the coding sequences indicated a dysfunctional gene (the authors' unpublished results). The other clone, λPOD3, described in this manuscript, was designated as the functional tobacco anionic peroxidase gene because of 100% homology with the cDNA. Significant structural elements include an AS-2 box implicated in shoot-specific expression (Lam and Chua, 1989), a TATA box, and two intervening sequences. 10 refs., 1 tab.

  3. Data collection with a tailored X-ray beam size at 2.69 Å wavelength (4.6 keV): sulfur SAD phasing of Cdc23Nterm

    PubMed Central

    Cianci, Michele; Groves, Matthew R.; Barford, David; Schneider, Thomas R.

    2016-01-01

    The capability to reach wavelengths of up to 3.1 Å at the newly established EMBL P13 beamline at PETRA III, the new third-generation synchrotron at DESY in Hamburg, provides the opportunity to explore very long wavelengths to harness the sulfur anomalous signal for phase determination. Data collection at λ = 2.69 Å (4.6 keV) allowed the crystal structure determination by sulfur SAD phasing of Cdc23Nterm, a subunit of the multimeric anaphase-promoting complex (APC/C). At this energy, Cdc23Nterm has an expected Bijvoet ratio 〈|F anom|〉/〈F〉 of 2.2%, with 282 residues, including six cysteines and five methionine residues, and two molecules in the asymmetric unit (65.4 kDa; 12 Cys and ten Met residues). Selectively illuminating two separate portions of the same crystal with an X-ray beam of 50 µm in diameter allowed crystal twinning to be overcome. The crystals diffracted to 3.1 Å resolution, with unit-cell parameters a = b = 61.2, c = 151.5 Å, and belonged to space group P43. The refined structure to 3.1 Å resolution has an R factor of 18.7% and an R free of 25.9%. This paper reports the structure solution, related methods and a discussion of the instrumentation. PMID:26960127
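
    The pairing of 2.69 Å with 4.6 keV in this record follows from E = hc/λ, with hc ≈ 12.398 keV·Å. A minimal sketch of the conversion in both directions, with hypothetical function names:

```python
HC_KEV_ANGSTROM = 12.398419843  # Planck constant times speed of light, in keV·Å


def energy_kev(wavelength_angstrom: float) -> float:
    """Photon energy in keV for a wavelength given in angstroms."""
    return HC_KEV_ANGSTROM / wavelength_angstrom


def wavelength_angstrom(energy_in_kev: float) -> float:
    """Wavelength in angstroms for a photon energy given in keV."""
    return HC_KEV_ANGSTROM / energy_in_kev
```

    `energy_kev(2.69)` gives about 4.61 keV, matching the value in the title, and the 3.1 Å limit quoted for the beamline corresponds to about 4.0 keV.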

  4. Isolation and phylogenetic footprinting analysis of the 5'-regulatory region of the floral homeotic gene OrcPI from Orchis italica (Orchidaceae).

    PubMed

    Aceto, Serena; Cantone, Carmela; Chiaiese, Pasquale; Ruotolo, Gianluca; Sica, Maria; Gaudio, Luciano

    2010-01-01

    The nucleotide sequences of regulatory elements from homologous genes can be strongly divergent. Phylogenetic footprinting, a comparative analysis of noncoding regions, can detect putative transcription factor binding sites (TFBSs) shared among the regulatory regions of 2 or more homologous genes. These conserved motifs have the potential to serve the same regulatory function in distantly related taxa. We isolated the 5'-noncoding region of the OrcPI gene, a MADS-box transcription factor involved in flower development in Orchis italica, using the thermal asymmetric interlaced polymerase chain reaction technique. This region (comprising 1352 bp) induced transient beta-glucuronidase expression in the petal tissue of white Rosa hybrida flowers and represents the 5'-regulatory sequence of the OrcPI gene. Phylogenetic footprinting analysis detected conserved regions within the 5'-regulatory sequence of OrcPI and the homologous regions of Oryza sativa, Lilium regale, and Arabidopsis thaliana. Some of these sequences are known TFBSs described in databases of plant regulatory elements. Nucleotide sequence data reported are available in the DDBJ/EMBL/GenBank databases under the following accession numbers: AF198055 promoter region of the PISTILLATA (PI) gene of A. thaliana; AB094985 cDNA of OrcPI (PI/GLOBOSA [PI/GLO] homologue) of O. italica; AB378089 5'-regulatory region of the OrcPI gene of O. italica; AP008211 putative promoter region of OSMADS2 (PI/GLO homologue) of O. sativa; AP008207 putative promoter region of OSMADS4 (PI/GLO homologue) of O. sativa; and AB158292 putative promoter region of the PI/GLO homologue of L. regale. PMID:19861638

  5. GOLD: The Genomes Online Database

    DOE Data Explorer

    Kyrpides, Nikos; Liolios, Dinos; Chen, Amy; Tavernarakis, Nektarios; Hugenholtz, Philip; Markowitz, Victor; Bernal, Alex

    Since its inception in 1997, GOLD has continuously monitored genome sequencing projects worldwide and has provided the community with a unique centralized resource that integrates diverse information related to Archaea, Bacteria, Eukaryotic and more recently Metagenomic sequencing projects. As of September 2007, GOLD recorded 639 completed genome projects. These projects have their complete sequence deposited into the public archival sequence databases such as GenBank, EMBL, and DDBJ. From the total of 639 complete and published genome projects as of 9/2007, 527 were bacterial, 47 were archaeal and 65 were eukaryotic. In addition to the complete projects, there were 2158 ongoing sequencing projects. 1328 of those were bacterial, 59 archaeal and 771 eukaryotic projects. Two types of metadata are provided by GOLD: (i) project metadata and (ii) organism/environment metadata. GOLD CARD pages for every project are available from the link of every GOLD_STAMP ID. The information in every one of these pages is organized into three tables: (a) Organism information, (b) Genome project information and (c) External links. [The Genomes On Line Database (GOLD) in 2007: Status of genomic and metagenomic projects and their associated metadata, Konstantinos Liolios, Konstantinos Mavromatis, Nektarios Tavernarakis and Nikos C. Kyrpides, Nucleic Acids Research Advance Access published online on November 2, 2007, Nucleic Acids Research, doi:10.1093/nar/gkm884]

    The basic tables in the GOLD database that can be browsed or searched include the following information:

    • Gold Stamp ID
    • Organism name
    • Domain
    • Links to information sources
    • Size and link to a map, when available
    • Chromosome number, plasmid number, and GC content
    • A link for downloading the actual genome data
    • Institution that did the sequencing
    • Funding source
    • Database where information resides
    • Publication status and information

    • Laminin alpha 5, a major transcript of normal and malignant rat liver epithelial cells, is differentially expressed in developing and adult liver.

      PubMed

      Seebacher, T; Medina, J L; Bade, E G

      1997-11-25

      The laminin family of extracellular matrix glycoproteins plays a major role in cell migration and differentiation and in tumor cell invasion. As previously shown, the laminin deposited by normal and malignant rat liver epithelial cells in their extracellular matrix (ECM) and into their ECM migration tracks does not contain a typical (EHS-like) alpha 1 heavy chain. By RT-PCR screening we have now identified two alpha chains among a total of five additional laminin chains produced by these cells. Three of the newly identified chains were not previously known for the rat. Their sequences have been deposited in the EMBL nucleotide sequence data bank. The alpha 5 chain now identified is expressed at comparably high levels by both the normal and the malignant liver epithelial cells. The chain is also expressed in fetal liver together with the alpha 2 and beta 2 chains, but it is only vestigially expressed in the mature organ as shown by RT-PCR. These results suggest for alpha 5 a role in development and production of the chain by only a small subset of cells in adult liver. At the level of detection used, no changes were observed in regenerating liver after partial hepatectomy. In addition to the alpha 5 chain, the cultured cells express the beta 1 and beta 2 light chains, indicating the expression of more than one laminin isoform by the same cell line. The expression of the alpha 5 chain and of the other new non-EHS isoform chains was also analyzed in various tissues. The malignant liver epithelial cells, but not their nontumorigenic parental cells, also express, in addition to the alpha 5 chain, the alpha 2 chain, which is expressed at a high level by the NBT II bladder carcinoma cell line, suggesting a relationship with malignancy. PMID:9417868

    • Open Source Bayesian Models. 3. Composite Models for Prediction of Binned Responses

      PubMed Central

      2016-01-01

      Bayesian models constructed from structure-derived fingerprints have been a popular and useful method for drug discovery research when applied to bioactivity measurements that can be effectively classified as active or inactive. The results can be used to rank candidate structures according to their probability of activity, and this ranking benefits from the high degree of interpretability when structure-based fingerprints are used, making the results chemically intuitive. Besides selecting an activity threshold, building a Bayesian model is fast and requires few or no parameters or user intervention. The method also does not suffer from such acute overtraining problems as quantitative structure–activity relationships or quantitative structure–property relationships (QSAR/QSPR). This makes it an approach highly suitable for automated workflows that are independent of user expertise or prior knowledge of the training data. We now describe a new method for creating a composite group of Bayesian models to extend the method to work with multiple states, rather than just binary. Incoming activities are divided into bins, each covering a mutually exclusive range of activities. For each of these bins, a Bayesian model is created to model whether or not the compound belongs in the bin. Analyzing putative molecules using the composite model involves making a prediction for each bin and examining the relative likelihood for each assignment, for example, highest value wins. The method has been evaluated on a collection of hundreds of data sets extracted from ChEMBL v20 and validated data sets for ADME/Tox and bioactivity. PMID:26750305
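      The composite scheme described above (divide activities into mutually exclusive bins, build one "in this bin or not" model per bin, then let the highest-scoring bin win) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the add-one smoothed naive Bayes scorer, the toy feature sets, the activity values and the bin edges are all hypothetical, and real fingerprints would replace the letter features.

```python
import math
from collections import Counter


def assign_bin(value, edges):
    """Return the bin index for value; edges are ascending boundaries between bins."""
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)


class BinModel:
    """Add-one smoothed naive Bayes scorer for 'in this bin' versus 'not in this bin'.

    Simplification (an assumption of this sketch): only features present in
    the query contribute to the score.
    """

    def __init__(self, samples, labels):
        self.n_in = sum(labels)
        self.n_out = len(labels) - self.n_in
        self.in_counts = Counter()
        self.out_counts = Counter()
        for feats, is_in in zip(samples, labels):
            (self.in_counts if is_in else self.out_counts).update(feats)

    def score(self, feats):
        # Log-odds of membership: smoothed prior plus one term per feature.
        s = math.log((self.n_in + 1) / (self.n_out + 1))
        for f in feats:
            p_in = (self.in_counts[f] + 1) / (self.n_in + 2)
            p_out = (self.out_counts[f] + 1) / (self.n_out + 2)
            s += math.log(p_in / p_out)
        return s


def train_composite(samples, activities, edges):
    """Train one binary model per bin."""
    bins = [assign_bin(a, edges) for a in activities]
    return [BinModel(samples, [b == i for b in bins]) for i in range(len(edges) + 1)]


def predict_bin(models, feats):
    """Score every bin and return the index of the highest score."""
    scores = [m.score(feats) for m in models]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical toy data: feature sets stand in for structure-derived fingerprints.
samples = [{"a", "b"}, {"a"}, {"c"}, {"c", "d"}, {"e"}, {"e", "f"}]
activities = [4.5, 4.8, 6.2, 6.5, 8.1, 8.4]
edges = [5.0, 7.0]  # three bins: <5, 5 to <7, >=7

models = train_composite(samples, activities, edges)
print(predict_bin(models, {"a"}), predict_bin(models, {"c"}), predict_bin(models, {"e", "f"}))
```

      Comparing the per-bin scores directly is the "highest value wins" rule mentioned in the abstract; other ways of weighing the relative likelihoods would slot into `predict_bin`.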

    • MMpI: A Wide Range of Available Compounds of Matrix Metalloproteinase Inhibitors

      PubMed Central

      Muvva, Charuvaka; Patra, Sanjukta; Venkatesan, Subramanian

      2016-01-01

      Matrix metalloproteinases (MMPs) are a family of zinc-dependent proteinases involved in the regulation of the extracellular signaling and structural matrix environment of cells and tissues. MMPs are considered promising targets for the treatment of many diseases. Therefore, the creation of a database of MMP inhibitors would accelerate research in this area, given the implication of MMPs in these diseases and the limitations of first- and second-generation inhibitors. In this communication, we report the development of a new MMpI database that provides useful information for all researchers working in this field. It is a web-accessible, unique resource that contains detailed information on the inhibitors of MMP, including small molecules, peptides and MMP Drug Leads. The database contains entries of ~3000 inhibitors including ~72 MMP Drug Leads and ~73 peptide-based inhibitors. This database provides the detailed molecular and structural details that are necessary for drug discovery and development. The MMpI database contains physical properties and 2D and 3D structures (mol2 and pdb format files) of inhibitors of MMP. Other data fields are hyperlinked to PubChem, ChEMBL, BindingDB, DrugBank, PDB, MEROPS and PubMed. The database supports extensive searching by MMpI ID, IUPAC name, chemical structure and research article title. The MMP inhibitors provided in the MMpI database are optimized using the Python-based Hierarchical Environment for Integrated Xtallography (Phenix) software. The MMpI database is unique: it is the only public database that contains and provides complete information on the inhibitors of MMP. Database URL: http://clri.res.in/subramanian/databases/mmpi/index.php. PMID:27509041

    • Panagrellus redivivus ornithine decarboxylase: structure of the gene, expression in Escherichia coli and characterization of the recombinant protein.

      PubMed Central

      Niemann, G; von Besser, H; Walter, R D

      1996-01-01

      A Southern blot analysis of the Panagrellus redivivus ornithine decarboxylase (ODC) gene suggests that it is a single-copy gene that resides on a genomic 3.2 kb EcoRI fragment. Phage clones possessing ODC gene sequences were isolated from a genomic EMBL-4 library and purified. The phage DNA inserts were analysed and a 3.2 kb EcoRI fragment containing the entire ODC gene was isolated. The nucleotide sequence analysis of this fragment reveals that the gene is interrupted by two introns of 47 and 49 bp. In the 5' non-translated region of the gene, putative AP1, VPE2 and c-Myc binding sites were identified. The ODC cDNA was expressed in a bacterial system as a His-fusion protein and the enzyme was purified by Ni(2+)-chelating affinity chromatography. The subunit molecular mass, as deduced from the cDNA and shown by SDS/PAGE, is 47.1 kDa. On the basis of gel filtration analyses it is shown that the active enzyme is a dimer. The specific enzyme activity was determined to be 4.2 μmol CO2/min/mg protein. The enzyme is dependent on pyridoxal 5-phosphate as a cofactor, and the presence of dithioerythritol or other thiol-reducing agents is essential for maximal activity. The Km value for L-ornithine was determined as 44 μM. The Ki values for putrescine, alpha-difluoromethylornithine, alpha-hydrazino-ornithine and alpha-methylornithine were calculated as 51, 34, 0.34 and 42 μM respectively. PMID:8694755

    • Privacy-preserving search for chemical compound databases

      PubMed Central

      2015-01-01

      Background Searching for similar compounds in a database is the most important process for in-silico drug screening. Since a query compound is an important starting point for the new drug, a query holder, who is afraid of the query being monitored by the database server, usually downloads all the records in the database and uses them in a closed network. However, a serious dilemma arises when the database holder also wants to output no information except for the search results, and such a dilemma prevents the use of many important data resources. Results In order to overcome this dilemma, we developed a novel cryptographic protocol that enables database searching while keeping both the query holder's privacy and database holder's privacy. Generally, the application of cryptographic techniques to practical problems is difficult because versatile techniques are computationally expensive while computationally inexpensive techniques can perform only trivial computation tasks. In this study, our protocol is successfully built only from an additive-homomorphic cryptosystem, which allows only addition performed on encrypted values but is computationally efficient compared with versatile techniques such as general purpose multi-party computation. In an experiment searching ChEMBL, which consists of more than 1,200,000 compounds, the proposed method was 36,900 times faster in CPU time and 12,000 times as efficient in communication size compared with general purpose multi-party computation. Conclusion We proposed a novel privacy-preserving protocol for searching chemical compound databases. The proposed method, easily scaling for large-scale databases, may help to accelerate drug discovery research by making full use of unused but valuable data that includes sensitive information. PMID:26678650
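      The property the abstract relies on, that an additive-homomorphic cryptosystem lets a server add values it cannot read, can be illustrated with a toy example. The sketch below uses a textbook Paillier scheme with tiny primes; the choice of Paillier, the key size, the fingerprint vectors and all function names are assumptions made for illustration, and this is neither the protocol proposed in the paper nor secure as written.

```python
import math
import random


def keygen(p, q):
    """Build a toy Paillier key pair from two primes (far too small for real use)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1
    # mu is the modular inverse of L(g^lam mod n^2), where L(x) = (x - 1) // n.
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
    return (n, g), (lam, mu)


def encrypt(pub, m):
    n, g = pub
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)


def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n


def add_encrypted(pub, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    n, _ = pub
    return (c1 * c2) % (n * n)


pub, priv = keygen(1009, 1013)

# Query holder: encrypts each bit of a hypothetical fingerprint.
query_bits = [1, 0, 1, 1, 0, 1, 0, 0]
enc_query = [encrypt(pub, b) for b in query_bits]

# Database holder: sums the encrypted query bits at positions where its own
# plaintext fingerprint has a 1, without ever decrypting anything.
db_bits = [1, 1, 1, 0, 0, 1, 0, 0]
enc_overlap = encrypt(pub, 0)
for c, d in zip(enc_query, db_bits):
    if d:
        enc_overlap = add_encrypted(pub, enc_overlap, c)

# Query holder: decrypts the number of shared bits.
overlap = decrypt(pub, priv, enc_overlap)
print(overlap)
```

      Only addition on ciphertexts is used here, which matches the restriction named in the abstract; a practical protocol would also need to hide intermediate values from both parties.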

    • Molecular cloning and structural characterization of the human histidase gene (HAL)

      SciTech Connect

      Suchi, Mariko; Sano, Hirofumi; Mizuno, Haruo; Wada, Yoshiro

      1995-09-01

      Histidase (EC 4.3.1.3) is a cytosolic enzyme that catalyzes the nonoxidative deamination of histidine to urocanic acid. Histidinemia, resulting from reduced histidase activity as reported in Cambridge stock his/his mice and in humans, is the most frequent inborn metabolic error in Japan. The histidase chromosomal gene (HAL) was isolated from a λEMBL-3 human genomic library using the human histidase cDNA as a probe. Restriction mapping and Southern blot analysis of the isolated clones reveal a single-copy gene spanning approximately 25 kb and consisting of 21 exons. Exon 1 encodes only 5′ untranslated sequence of liver histidase mRNA, with protein coding beginning in exon 2. A rarely observed 5′ GC, similar to that reported in the human P-450(SCC) gene, is present in intron 20. All other splicing junctions adhere to the canonical GT/AG rule. A TATA box sequence is located 25 bp upstream of the liver histidase transcription initiation site determined by S1 nuclease protection analysis. Several liver- and epidermis-specific transcription factor binding sites, including C/EBP, NFIL6, HNF5, AP2/KER1, MNF, and others, are also identified in the 5′ flanking region. Consistent with the hepatic and epidermal expression of histidase, this finding suggests that histidase transcription may be regulated by these factors. We further identify a polymorphism (A to G transition) in the histidase coding region of exon 16. The human histidase genomic structure presented here should facilitate the molecular investigation of symptomatic and asymptomatic forms of histidinemia. 69 refs., 4 figs., 1 tab.

    • Amyotrophic Lateral Sclerosis Type 20 - In Silico Analysis and Molecular Dynamics Simulation of hnRNPA1

      PubMed Central

      Krebs, Bruna Baumgarten

      2016-01-01

      Amyotrophic Lateral Sclerosis (ALS) is a fatal neurodegenerative disease that affects the upper and lower motor neurons. 5–10% of cases are genetically inherited, including ALS type 20, which is caused by mutations in the hnRNPA1 gene. The goals of this work are to analyze the effects of non-synonymous single nucleotide polymorphisms (nsSNPs) on hnRNPA1 protein function, to model the complete tridimensional structure of the protein using computational methods and to assess structural and functional differences between the wild type and its variants through Molecular Dynamics simulations. nsSNP, PhD-SNP, Polyphen2, SIFT, SNAP, SNPs&GO, SNPeffect and PROVEAN were used to predict the functional effects of nsSNPs. Ab initio modeling of hnRNPA1 was made using Rosetta and refined using KoBaMIN. The structure was validated by PROCHECK, Rampage, ERRAT, Verify3D, ProSA and Qmean. TM-align was used for the structural alignment. FoldIndex, DICHOT, ELM, D2P2, Disopred and DisEMBL were used to predict disordered regions within the protein. Amino acid conservation analysis was assessed by Consurf, and the molecular dynamics simulations were performed using GROMACS. Mutations D314V and D314N were predicted to increase amyloid propensity, and predicted as deleterious by at least three algorithms, while mutation N73S was predicted as neutral by all the algorithms. D314N and D314V occur in a highly conserved amino acid. The Molecular Dynamics results indicate that all mutations increase protein stability when compared to the wild type. Mutants D314N and N319S showed higher overall dimensions and accessible surface when compared to the wild type. The flexibility level of the C-terminal residues of hnRNPA1 is affected by all mutations, which may affect protein function, especially regarding the protein ability to interact with other proteins. PMID:27414033

    • Transcriptional profiling in pearl millet (Pennisetum glaucum L.R. Br.) for identification of differentially expressed drought responsive genes.

      PubMed

      Choudhary, Minakshi; Jayanand; Padaria, Jasdeep Chatrath

      2015-04-01

      Pearl millet (Pennisetum glaucum) is an important cereal of traditional farming systems that has the natural ability to withstand various abiotic stresses. The present study aims at the identification and validation of major differentially expressed genes in response to drought stress in P. glaucum by Suppression Subtractive Hybridization (SSH) analysis. Twenty-two-day-old seedlings of P. glaucum cultivar PPMI741 were subjected to drought stress by treatment with 30% polyethylene glycol for 30 min (T1), 2 h (T2), 4 h (T3), 8 h (T4), 16 h (T5), 24 h (T6) and 48 h (T7), with stress monitored by examining the relative water content (RWC) of the seedlings. Total RNA was isolated to construct a drought-responsive subtractive cDNA library through SSH, which was sequenced to identify the genes differentially expressed in response to drought stress and validated by qRT-PCR. A total of 745 ESTs were assembled into a collection of 299 unigenes comprising 52 contigs and 247 singletons. All 745 ESTs were submitted to ENA-EMBL databases (Accession no. HG516611-HG517355). After analysis, 10 differentially expressed genes were validated by qRT-PCR, namely Abscisic stress ripening protein, Ascorbate peroxidase, Inosine-5'-monophosphate dehydrogenase, Putative beta-1,3-glucanase, Glyoxalase, Rab7, Aspartic proteinase Oryzasin, DnaJ-like protein and Calmodulin-like protein. The identified ESTs reveal a major portion of the stress-responsive transcriptome that may help unravel the molecular basis underlying the tolerance of pearl millet (Pennisetum glaucum) to drought stress. These genes could be utilized for transgenic breeding or transferred to crop plants through marker-assisted selection for the development of better drought-resistant cultivars with enhanced adaptability to survive harsh environmental conditions. PMID:25964713
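      The figures in this abstract can be cross-checked with simple arithmetic: the size of the quoted accession range against the stated number of ESTs, and the contig and singleton counts against the number of unigenes. The sketch below assumes the accession range is contiguous and inclusive, which the abstract implies but does not state; the function name is illustrative.

```python
import re


def accession_range_size(first, last):
    """Count accessions in an inclusive range such as HG516611 to HG517355.

    Assumes both accessions share the same letter prefix and that the
    numeric parts are contiguous.
    """
    m_first = re.fullmatch(r"([A-Z]+)(\d+)", first)
    m_last = re.fullmatch(r"([A-Z]+)(\d+)", last)
    if not m_first or not m_last or m_first.group(1) != m_last.group(1):
        raise ValueError("accessions must share a letter prefix")
    return int(m_last.group(2)) - int(m_first.group(2)) + 1


n_ests = accession_range_size("HG516611", "HG517355")
n_unigenes = 52 + 247  # contigs + singletons, as quoted in the abstract

print(n_ests, n_unigenes)
```

      Both values agree with the abstract: the range spans 745 accessions, and 52 contigs plus 247 singletons give 299 unigenes.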

    • Historeceptomic Fingerprints for Drug-Like Compounds

      PubMed Central

      Shmelkov, Evgeny; Grigoryan, Arsen; Swetnam, James; Xin, Junyang; Tivon, Doreen; Shmelkov, Sergey V.; Cardozo, Timothy

      2015-01-01

      Most drugs exert their beneficial and adverse effects through their combined action on several different molecular targets (polypharmacology). The true molecular fingerprint of the direct action of a drug has two components: the ensemble of all the receptors upon which a drug acts and their level of expression in organs/tissues. Conversely, the fingerprint of the adverse effects of a drug may derive from its action in bystander tissues. The ensemble of targets is almost always only partially known. Here we describe an approach improving upon and integrating both components: in silico identification of a more comprehensive ensemble of targets for any drug weighted by the expression of those receptors in relevant tissues. Our system combines more than 300,000 experimentally determined bioactivity values from the ChEMBL database and 4.2 billion molecular docking scores. We integrated these scores with gene expression data for human receptors across a panel of human tissues to produce drug-specific tissue-receptor (historeceptomics) scores. A statistical model was designed to identify significant scores, which define an improved fingerprint representing the unique activity of any drug. These multi-dimensional historeceptomic fingerprints describe, in a novel, intuitive, and easy to interpret style, the holistic, in vivo picture of the mechanism of any drug's action. Valuable applications in drug discovery and personalized medicine, including the identification of molecular signatures for drugs with polypharmacologic modes of action, detection of tissue-specific adverse effects of drugs, matching molecular signatures of a disease to drugs, target identification for bioactive compounds with unknown receptors, and hypothesis generation for drug/compound phenotypes may be enabled by this approach. The system has been deployed at drugable.org for access through a user-friendly web site. PMID:26733872

    • Automatic Discovery and Inferencing of Complex Bioinformatics Web Interfaces

      SciTech Connect

      Ngu, A; Rocco, D; Critchlow, T; Buttler, D

      2003-12-22

      The World Wide Web provides a vast resource to genomics researchers in the form of web-based access to distributed data sources, e.g. BLAST sequence homology search interfaces. However, the process of seeking the desired scientific information is still very tedious and frustrating. While there are several known servers for genomic data (e.g., GenBank, EMBL, NCBI) that are shared and accessed frequently, new data sources are created each day in laboratories all over the world. The sharing of these newly discovered genomics results is hindered by the lack of a common interface or data exchange mechanism. Moreover, the number of autonomous genomics sources and their rate of change outpace the speed at which they can be manually identified, meaning that the available data is not being utilized to its full potential. An automated system that can find, classify, describe and wrap new sources without tedious and low-level coding of source-specific wrappers is needed to help scientists access hundreds of dynamically changing bioinformatics web data sources through a single interface. A correct classification of any kind of Web data source must address both the capability of the source and the conversation/interaction semantics inherent in the design of the Web data source. In this paper, we propose an automatic approach to classifying Web data sources that takes into account both the capability and the conversational semantics of the source. The ability to discover the interaction pattern of a Web source leads to increased accuracy in the classification process. At the same time, it facilitates the extraction of process semantics, which is necessary for the automatic generation of wrappers that can interact correctly with the sources.

    • The Protein Information Resource: an integrated public resource of functional annotation of proteins

      PubMed Central

      Wu, Cathy H.; Huang, Hongzhan; Arminski, Leslie; Castro-Alvear, Jorge; Chen, Yongxing; Hu, Zhang-Zhi; Ledley, Robert S.; Lewis, Kali C.; Mewes, Hans-Werner; Orcutt, Bruce C.; Suzek, Baris E.; Tsugita, Akira; Vinayaka, C. R.; Yeh, Lai-Su L.; Zhang, Jian; Barker, Winona C.

      2002-01-01

      The Protein Information Resource (PIR) serves as an integrated public resource of functional annotation of protein data to support genomic/proteomic research and scientific discovery. The PIR, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), produces the PIR-International Protein Sequence Database (PSD), the major annotated protein sequence database in the public domain, containing about 250 000 proteins. To improve protein annotation and the coverage of experimentally validated data, a bibliography submission system is developed for scientists to submit, categorize and retrieve literature information. Comprehensive protein information is available from iProClass, which includes family classification at the superfamily, domain and motif levels, structural and functional features of proteins, as well as cross-references to over 40 biological databases. To provide timely and comprehensive protein data with source attribution, we have introduced a non-redundant reference protein database, PIR-NREF. The database consists of about 800 000 proteins collected from PIR-PSD, SWISS-PROT, TrEMBL, GenPept, RefSeq and PDB, with composite protein names and literature data. To promote database interoperability, we provide XML data distribution and open database schema, and adopt common ontologies. The PIR web site (http://pir.georgetown.edu/) features data mining and sequence analysis tools for information retrieval and functional identification of proteins based on both sequence and annotation information. The PIR databases and other files are also available by FTP (ftp://nbrfa.georgetown.edu/pir_databases). PMID:11752247

    • Identification of a novel yolk protein in the hermatypic coral Galaxea fascicularis.

      PubMed

      Hayakawa, Hideki; Andoh, Tadashi; Watanabe, Toshiki

      2007-03-01

      The reef-building (or hermatypic) coral Galaxea fascicularis (Anthozoa, Hexacorallia, Scleractinia) has an annual reproductive cycle. Females of G. fascicularis release packages (or 'bundles') of eggs for external fertilization, whereas male individuals form bundles consisting of sperm and infertile 'pseudo-eggs' that are thought to confer buoyancy to the male bundle. In the egg of G. fascicularis, four proteins (GfEP-1 to 4) were found to be stored in high abundance, and three of them (GfEP-1, 2 and 3) are generated by processing of a vitellogenin (Vg)-like precursor. In the present study, a cDNA encoding GfEP-4 was cloned and its sequence determined (GenBank/EMBL/DDBJ accession no. AB259859). The amino acid sequence of this protein does not exhibit similarity to known proteins, including Vgs or other yolk proteins found in some invertebrates. The expression of GfEP-4 mRNA was observed in females, and also in the majority of males examined, although expression levels were lower than in females. The GfEP-4 protein was detected in pseudo-eggs, where its concentration was 20-100 times lower than in eggs. In contrast, GfEP-1, 2 and 3 were not detected in pseudo-eggs. A protein (28 kDa) which cross-reacted with anti-GfEP-4 antibodies was detected in eggs of the coral Montipora digitata, suggesting the possibility that homologs of this protein are present in the eggs of other scleractinian corals. PMID:17551245

    • An improved approach for predicting drug-target interaction: proteochemometrics to molecular docking.

      PubMed

      Shaikh, Naeem; Sharma, Mahesh; Garg, Prabha

      2016-02-23

      Proteochemometric (PCM) methods, which use descriptors of both the interacting species, i.e. drug and the target, are being successfully employed for the prediction of drug-target interactions (DTI). However, the unavailability of non-interacting datasets and the determination of the applicability domain (AD) of a model are main concerns in PCM modeling. In the present study, traditional PCM modeling was improved by devising novel methodologies for reliable negative dataset generation and fingerprint-based AD analysis. In addition, various types of descriptors and classifiers were evaluated for their performance. The Random Forest and Support Vector Machine models outperformed the other classifiers (accuracies >98% and >89% for 10-fold cross validation and external validation, respectively). The type of protein descriptors had negligible effect on the developed models, encouraging the use of sequence-based descriptors over the structure-based descriptors. To establish the practical utility of the built models, targets were predicted for approved anticancer drugs of natural origin. The molecular recognition interactions between the predicted drug-target pairs were quantified with the help of a reverse molecular docking approach. The majority of predicted targets are known for anticancer therapy. These results thus correlate well with the anticancer potential of the selected drugs. Interestingly, out of all predicted DTIs, thirty were found to be reported in the ChEMBL database, further validating the adopted methodology. The outcome of this study suggests that the proposed approach, involving use of the improved PCM methodology and molecular docking, can be successfully employed to elucidate the intricate mode of action of drug molecules as well as to reposition them for new therapeutic applications. PMID:26822863

    • Characterization and genome functional analysis of a novel metamitron-degrading strain Rhodococcus sp. MET via both triazinone and phenyl rings cleavage.

      PubMed

      Fang, Hua; Xu, Tianheng; Cao, Duantao; Cheng, Longyin; Yu, Yunlong

      2016-01-01

      A novel bacterium capable of utilizing metamitron as the sole source of carbon and energy was isolated from contaminated soil and identified as Rhodococcus sp. MET based on its morphological characteristics, BIOLOG GP2 microplate profile, and 16S rDNA phylogeny. Genome sequencing and functional annotation of the isolate MET showed a 6,340,880 bp genome with a 62.47% GC content and 5,987 protein-coding genes. In total, 5,907 genes were annotated with the COG, GO, KEGG, Pfam, Swiss-Prot, TrEMBL, and nr databases. The degradation rate of metamitron by the isolate MET obviously increased with increasing substrate concentrations from 1 to 10 mg/l and subsequently decreased at 100 mg/l. The optimal pH and temperature for metamitron biodegradation were 7.0 and 20-30 °C, respectively. Based on genome annotation of the metamitron degradation genes and the metabolites detected by HPLC-MS/MS, the following metamitron biodegradation pathways were proposed: 1) Metamitron was transformed into 2-(3-hydrazinyl-2-ethyl)-hydrazono-2-phenylacetic acid by triazinone ring cleavage and further mineralization; 2) Metamitron was converted into 3-methyl-4-amino-6(2-hydroxy-muconic acid)-1,2,4-triazine-5(4H)-one by phenyl ring cleavage and further mineralization. The coexistence of diverse mineralization pathways indicates that our isolate may effectively bioremediate triazinone herbicide-contaminated soils. PMID:27578531

    • Analysis of Edg-Like LPA Receptor-Ligand Interactions.

      PubMed

      Balogh, Balazs; Pazmany, Tamas; Matyus, Peter

      2015-01-01

      The phospholipid derivative lysophosphatidic acid (LPA) serves as a signalling molecule through the activation of LPA receptors, which belong to the G-protein-coupled receptors. From a pharmacological point of view, the ('EDG-like') LPA1-3 receptors have attracted much attention, and our study therefore focuses on these subtypes. The LPA1 receptors are widely expressed in the human body; interestingly, LPA1 might have a role in the pathomechanism of obesity. In order to recognize key structural features of the molecular interactions of human LPA1 with its agonists, we built up the 3D structure of LPA1 through homology modeling. Next, LPA1 agonists and antagonists were docked into the model. The mode of binding and the interactions between ligands and key amino acids (R3.28 and Q3.29) were consistent with mutagenesis assays and previously published models, indicating that this model is able to discriminate high-affinity compounds and may be useful for the development of novel agonists of LPA1. Homology models were also constructed for LPA2 and LPA3. All available agonists with published EC50 values, antagonists with IC50 values and compounds with Ki values for any of LPA1, LPA2 or LPA3 were collected from the ChEMBL database and were docked into the corresponding model. Our models for the LPA1-3 receptors can discriminate high-affinity compounds identified in silico HTS studies and may be useful for the development of novel agonists of LPA receptors. With a better understanding of the differences between LPA1-3 receptors, new, selective agonists and antagonists could be designed, which could be used in the therapy of various diseases with a better side-effect profile. PMID:25686617

  1. [Prediction of short loops in the proteins with internal disorder].

    PubMed

    Deriusheva, E I; Galzitskaia, O V; Serdiuk, I N

    2008-01-01

    A new capability of the FoldUnfold program for predicting short disordered regions (loops), obtained by using a short window width (3 amino acid residues), is described. For three representatives of the G protein family, the FoldUnfold program predicted almost all short loops, and the results agree well with the X-ray structure data. We have classified the loops predicted in the Ras-p21 protein structure into two types. In the first type, loops have high values of the Debye-Waller factor typical of the so-called functional loops (flexible loops). In the other type, loops have lower values of the Debye-Waller factor and can be considered as loops connecting secondary structure elements (rigid loops). When the predictions of our program are compared with the results of other programs (PONDR, RONN, DisEMBL, PreLINK, IUPred, GlobPlot 2, FoldIndex), it is seen that our program enables far better prediction of short loop positions. Applying FoldUnfold to the ubiquitin-like domain of h-PLIC-2 makes it possible to define the boundary between the structured and unstructured regions in proteins with a large proportion of disordered regions. The FoldUnfold program defines a clear boundary between the structured and unstructured regions at amino acid residues 30-31, whereas each of the other programs places the boundary anywhere from the 28th through the 70th amino acid residue. PMID:19140328

  2. Antagonistic activity of Bacillus sp. obtained from an Algerian oilfield and chemical biocide THPS against sulfate-reducing bacteria consortium inducing corrosion in the oil industry.

    PubMed

    Gana, Mohamed Lamine; Kebbouche-Gana, Salima; Touzi, Abdelkader; Zorgani, Mohamed Amine; Pauss, André; Lounici, Hakim; Mameri, Nabil

    2011-03-01

    The present study examines the antagonistic potential of the nonpathogenic strain B21 against a sulfate-reducing bacteria (SRB) consortium. The inhibitory effects of strain B21 were compared with those of the chemical biocide tetrakishydroxymethylphosphonium sulfate (THPS), generally used in the petroleum industry. The biological inhibitor performed much better. SRB grown in coculture with the antagonist strain B21 showed a decline in growth and reduced production of sulfides, with consumption of sulfate. The observed effect appears greater than the effect caused by the tested biocide (THPS). Strain B21, a dominant facultative aerobic species, requires salt concentrations above 5% (w/v) for growth, with an optimal concentration of 10-15%. Phylogenetic analysis based on partial 16S rRNA gene sequences showed that strain B21 is a member of the genus Bacillus, being most closely related to Bacillus qingdaonensis DQ115802 (94.0% sequence similarity), Bacillus aidingensis DQ504377 (94.0%), and Bacillus salarius AY667494 (92.2%). Comparative analysis of partial 16S rRNA gene sequence data plus physiological, biochemical, and phenotypic features of the novel isolate and related species of Bacillus indicated that strain B21 may represent a novel species within the genus Bacillus, named Bacillus sp. (EMBL, FR671419). The results of this study indicate the application potential of Bacillus strain B21 as a biocontrol agent to fight corrosion in the oil industry. PMID:20949304

  3. Degradation of 4-amylphenol and 4-hexylphenol by a new activated sludge isolate of Pseudomonas veronii and proposal for a new subspecies status.

    PubMed

    Ajithkumar, Bindu; Ajithkumar, Vasudevan P; Iriye, Ryozo

    2003-01-01

    Novel Pseudomonas strains INA04, INA05, and INA06, were isolated from activated sludge. Strain INA06 was found to degrade long chain alkylphenols such as 4-n-amylphenol and 4-n-hexylphenol as the sole source of carbon, apart from co-metabolic degradation of 4-n-nonylphenol in the presence of phenol, while INA04 and INA05 could grow on phenol, but could not grow well on alkylphenols. Induction studies on strain INA06 revealed a broad substrate-specific phenol hydroxylase, for the metabolism of phenol and alkylphenols, inducible with phenol or para-substituted alkylphenol. They bore close resemblance to members of Pseudomonas sensu stricto. 16S rDNA sequence homology of INA06 was closest to P. veronii (99.7%). DNA-DNA hybridization pointed out higher linkage (64% similarity) to the type strain of P. veronii than to other species of Pseudomonas sensu stricto (>60%). The BOX-PCR profile of all INA strains was similar, but different from that of P. veronii. Since biochemical characteristics were similar to those of P. veronii, and genetic relatedness was at the margin of species differentiation level (70%), we propose these strains to be treated as a new subspecies of P. veronii. The type strain of this new subspecies, named P. veronii subsp. inensis subsp. nov., is strain INA06. The accession number of strain INA05 is CIP 107595=JCM11829, and that of INA06 is CIP107594(T)=JCM11828(T). The 16S rDNA sequence accession number (DDBJ/EMBL/GenBank) of strain INA06 is AB056120. PMID:12576154

  4. Inferring multi-target QSAR models with taxonomy-based multi-task learning

    PubMed Central

    2013-01-01

    Background: A plethora of studies indicate that the development of multi-target drugs is beneficial for complex diseases like cancer. Accurate QSAR models for each of the desired targets assist the optimization of a lead candidate by the prediction of affinity profiles. Often, the targets of a multi-target drug are sufficiently similar such that, in principle, knowledge can be transferred between the QSAR models to improve the model accuracy. In this study, we present two different multi-task algorithms from the field of transfer learning that can exploit the similarity between several targets to transfer knowledge between the target specific QSAR models. Results: We evaluated the two methods on simulated data and a data set of 112 human kinases assembled from the public database ChEMBL. The relatedness between the kinase targets was derived from the taxonomy of the human kinome. The experiments show that multi-task learning increases the performance compared to training separate models on both types of data given a sufficient similarity between the tasks. On the kinase data, the best multi-task approach improved the mean squared error of the QSAR models of 58 kinase targets. Conclusions: Multi-task learning is a valuable approach for inferring multi-target QSAR models for lead optimization. The application of multi-task learning is most beneficial if knowledge can be transferred from a similar task with a lot of in-domain knowledge to a task with little in-domain knowledge. Furthermore, the benefit increases with a decreasing overlap between the chemical space spanned by the tasks. PMID:23842210

  5. Sequence and transcriptional start site of the Pseudomonas aeruginosa outer membrane porin protein F gene.

    PubMed Central

    Duchêne, M; Schweizer, A; Lottspeich, F; Krauss, G; Marget, M; Vogel, K; von Specht, B U; Domdey, H

    1988-01-01

    Porin F is one of the major proteins of the outer membrane of Pseudomonas aeruginosa. It forms water-filled pores of variable size. Porin F is a candidate for a vaccine against P. aeruginosa because it antigenically cross-reacts in all serotype strains of the International Antigenic Typing Scheme. We have isolated the gene for porin F from a lambda EMBL3 bacteriophage library by using oligodeoxynucleotide hybridization probes and have determined its nucleotide sequence. Different peptide sequences obtained from isolated porin F confirmed the deduced protein sequence. The mature protein consists of 326 amino acid residues and has a molecular weight of 35,250. The precursor contains an N-terminal signal peptide of 24 amino acid residues. S1 protection and primer extension experiments, together with Northern (RNA) blots, indicate that the mRNA coding for porin F is monocistronic with short untranslated regions of about 58 bases at the 5' end and about 47 bases at the 3' end. The sequences in the -10 and -35 regions upstream of the transcriptional start site are closely related to the Escherichia coli promoter consensus sequences, which explains why the porin F gene is expressed in E. coli under the control of its own promoter. The amino acid sequence of porin F is not homologous to the different E. coli porins OmpF, OmpC, LamB, and PhoE. On the other hand, a highly homologous region of 30 amino acids between the OmpA proteins of different enteric bacteria and porin F of P. aeruginosa was detected. The core region of the homology to E. coli OmpA had 11 of 12 amino acid residues in common. PMID:2447060

  6. Get Your Atoms in Order--An Open-Source Implementation of a Novel and Robust Molecular Canonicalization Algorithm.

    PubMed

    Schneider, Nadine; Sayle, Roger A; Landrum, Gregory A

    2015-10-26

    Finding a canonical ordering of the atoms in a molecule is a prerequisite for generating a unique representation of the molecule. The canonicalization of a molecule is usually accomplished by applying some sort of graph relaxation algorithm, the most common of which is the Morgan algorithm. There are known issues with that algorithm that lead to noncanonical atom orderings as well as problems when it is applied to large molecules like proteins. Furthermore, each cheminformatics toolkit or software provides its own version of a canonical ordering, most based on unpublished algorithms, which also complicates the generation of a universal unique identifier for molecules. We present an alternative canonicalization approach that uses a standard stable-sorting algorithm instead of a Morgan-like index. Two new invariants that allow canonical ordering of molecules with dependent chirality as well as those with highly symmetrical cyclic graphs have been developed. The new approach proved to be robust and fast when tested on the 1.45 million compounds of the ChEMBL 20 data set in different scenarios like random renumbering of input atoms or SMILES round tripping. Our new algorithm is able to generate a canonical order of the atoms of protein molecules within a few milliseconds. The novel algorithm is implemented in the open-source cheminformatics toolkit RDKit. With this paper, we provide a reference Python implementation of the algorithm that could easily be integrated in any cheminformatics toolkit. This provides a first step toward a common standard for canonical atom ordering to generate a universal unique identifier for molecules other than InChI. PMID:26441310
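
As a rough illustration of ranking atoms by sorting invariants, here is a much-simplified, pure-Python sketch. It is not the RDKit implementation or the algorithm from the paper: it starts from an assumed invariant (element symbol, degree), refines ranks from neighbour ranks until they stop changing, and deliberately omits tie-breaking and the chirality and symmetry invariants the abstract mentions. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def _dense_ranks(keys: Sequence[tuple]) -> List[int]:
    """Map each key to its position among the sorted distinct keys (ties share a rank)."""
    order = {key: rank for rank, key in enumerate(sorted(set(keys)))}
    return [order[key] for key in keys]


def refine_ranks(elements: Sequence[str], bonds: Sequence[Tuple[int, int]]) -> List[int]:
    """Return a rank per atom that does not depend on the input atom numbering.

    Symmetry-equivalent atoms keep the same rank, because no tie-breaking is done.
    """
    n = len(elements)
    neighbours: List[List[int]] = [[] for _ in range(n)]
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)

    # Assumed starting invariant: (element symbol, number of neighbours).
    ranks = _dense_ranks([(elements[i], len(neighbours[i])) for i in range(n)])

    while True:
        # Refine each atom's key with the sorted ranks of its neighbours.
        keys = [
            (ranks[i], tuple(sorted(ranks[j] for j in neighbours[i])))
            for i in range(n)
        ]
        new_ranks = _dense_ranks(keys)
        if new_ranks == ranks:
            return ranks
        ranks = new_ranks


if __name__ == "__main__":
    # Ethanol heavy atoms written in two different input orders.
    print(refine_ranks(["C", "C", "O"], [(0, 1), (1, 2)]))
    print(refine_ranks(["O", "C", "C"], [(0, 1), (1, 2)]))
```

In the two ethanol orderings each atom receives the same rank regardless of where it appears in the input, which is the property a canonical ordering needs. The tied terminal carbons of propane show why a full algorithm also needs a tie-breaking step.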

  7. Differentiation of Phylogenetically Related Slowly Growing Mycobacteria Based on 16S-23S rRNA Gene Internal Transcribed Spacer Sequences

    PubMed Central

    Roth, Andreas; Fischer, Marga; Hamid, Mohamed E.; Michalke, Sabine; Ludwig, Wolfgang; Mauch, Harald

    1998-01-01

    Interspecific polymorphisms of the 16S rRNA gene (rDNA) are widely used for species identification of mycobacteria. 16S rDNA sequences, however, do not vary greatly within a species, and they are either indistinguishable in some species, for example, in Mycobacterium kansasii and M. gastri, or highly similar, for example, in M. malmoense and M. szulgai. We determined 16S-23S rDNA internal transcribed spacer (ITS) sequences of 60 strains in the genus Mycobacterium representing 13 species (M. avium, M. conspicuum, M. gastri, M. genavense, M. kansasii, M. malmoense, M. marinum, M. shimoidei, M. simiae, M. szulgai, M. triplex, M. ulcerans, and M. xenopi). An alignment of these sequences together with additional sequences available in the EMBL database (for M. intracellulare, M. phlei, M. smegmatis, and M. tuberculosis) was established according to primary- and secondary-structure similarities. Comparative sequence analysis applying different treeing methods grouped the strains into species-specific clusters with low sequence divergence between strains belonging to the same species (0 to 2%). The ITS-based tree topology only partially correlated to that based on 16S rDNA, but the main branching orders were preserved, notably, the division of fast-growing from slowly growing mycobacteria, separate branching for M. simiae, M. genavense, and M. triplex, and distinct branches for M. xenopi and M. shimoidei. Comparisons of M. gastri with M. kansasii and M. malmoense with M. szulgai revealed ITS sequence similarities of 93 and 88%, respectively. M. marinum and M. ulcerans possessed identical ITS sequences. Our results show that ITS sequencing represents a supplement to 16S rRNA gene sequences for the differentiation of closely related species. Slowly growing mycobacteria show a high sequence variation in the ITS; this variation has the potential to be used for the development of probes as a rapid approach to mycobacterial identification. PMID:9431937
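
Similarity figures such as the 93% and 88% quoted above are typically percent identities over an alignment. The sketch below shows one common way to compute such a value for two pre-aligned sequences. The abstract does not say how gaps were handled, so skipping gapped columns is an assumption here, and the sequences in the usage example are toy strings, not real ITS data.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical positions between two aligned sequences.

    Assumption: columns where either sequence has a gap are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned sequences differing at one of ten positions.
    print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```

Other conventions (for example, counting gap columns as mismatches) would give lower values for the same alignment, so the gap rule should be stated whenever identities are compared across studies.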

  8. Analysis of Litopenaeus vannamei Transcriptome Using the Next-Generation DNA Sequencing Technique

    PubMed Central

    Li, Chaozheng; Weng, Shaoping; Chen, Yonggui; Yu, Xiaoqiang; Lü, Ling; Zhang, Haiqing; He, Jianguo; Xu, Xiaopeng

    2012-01-01

    Background: Pacific white shrimp (Litopenaeus vannamei), the major species of farmed shrimps in the world, has been attracting extensive studies, which require more and more genome background knowledge. The currently available transcriptome data of L. vannamei are insufficient for research requirements, and have not been adequately assembled and annotated. Methodology/Principal Findings: This is the first study that used a next-generation high-throughput DNA sequencing technique, the Solexa/Illumina GA II method, to analyze the transcriptome from whole bodies of L. vannamei larvae. More than 2.4 Gb of raw data were generated, and 109,169 unigenes with a mean length of 396 bp were assembled using the SOAP denovo software. 73,505 unigenes (>200 bp) with good quality sequences were selected and subjected to annotation analysis, among which 37.80% could be matched in the NCBI Nr database, 37.3% in Swissprot, and 44.1% in TrEMBL. Using the BLAST and BLAST2Go software, 11,153 unigenes were classified into 25 Clusters of Orthologous Groups of proteins (COG) categories, 8171 unigenes were assigned into 51 Gene ontology (GO) functional groups, and 18,154 unigenes were divided into 220 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. To primarily verify part of the results of assembly and annotations, 12 assembled unigenes that are homologous to many embryo development-related genes were chosen and subjected to RT-PCR for electrophoresis and Sanger sequencing analyses, and to real-time PCR for expression profile analyses during embryo development. Conclusions/Significance: The L. vannamei transcriptome analyzed using the next-generation sequencing technique enriches the information of L. vannamei genes, which will facilitate our understanding of the genome background of crustaceans, and promote the studies on L. vannamei. PMID:23071809

  9. Benchmarking the next generation of homology inference tools

    PubMed Central

    Saripella, Ganapathi Varma; Sonnhammer, Erik L. L.; Forslund, Kristoffer

    2016-01-01

    Motivation: Over the last decades, vast numbers of sequences were deposited in public databases. Bioinformatics tools allow homology and consequently functional inference for these sequences. New profile-based homology search tools have been introduced, allowing reliable detection of remote homologs, but have not been systematically benchmarked. To provide such a comparison, which can guide bioinformatics workflows, we extend and apply our previously developed benchmark approach to evaluate the ‘next generation’ of profile-based approaches, including CS-BLAST, HHSEARCH and PHMMER, in comparison with the non-profile based search tools NCBI-BLAST, USEARCH, UBLAST and FASTA. Method: We generated challenging benchmark datasets based on protein domain architectures within either the PFAM + Clan, SCOP/Superfamily or CATH/Gene3D domain definition schemes. From each dataset, homologous and non-homologous protein pairs were aligned using each tool, and standard performance metrics calculated. We further measured congruence of domain architecture assignments in the three domain databases. Results: CSBLAST and PHMMER had overall highest accuracy. FASTA, UBLAST and USEARCH showed large trade-offs of accuracy for speed optimization. Conclusion: Profile methods are superior at inferring remote homologs but the difference in accuracy between methods is relatively small. PHMMER and CSBLAST stand out with the highest accuracy, yet still at a reasonable computational cost. Additionally, we show that less than 0.1% of Swiss-Prot protein pairs considered homologous by one database are considered non-homologous by another, implying that these classifications represent equivalent underlying biological phenomena, differing mostly in coverage and granularity. Availability and Implementation: Benchmark datasets and all scripts are placed at (http://sonnhammer.org/download/Homology_benchmark). Contact: forslund@embl.de Supplementary information: Supplementary data are available at

  10. Characterization of genes for an alternative nitrogenase in the cyanobacterium Anabaena variabilis.

    PubMed Central

    Thiel, T

    1993-01-01

    Anabaena variabilis ATCC 29413 is a heterotrophic, nitrogen-fixing cyanobacterium that has been reported to fix nitrogen and reduce acetylene to ethane in the absence of molybdenum. DNA from this strain hybridized well at low stringency to the nitrogenase 2 (vnfDGK) genes of Azotobacter vinelandii. The hybridizing region was cloned from a lambda EMBL3 genomic library of A. variabilis, mapped, and sequenced. The deduced amino acid sequences of the vnfD and vnfK genes of A. variabilis showed only about 56% similarity to the nifDK genes of Anabaena sp. strain PCC 7120 but were 76 to 86% similar to the anfDK or vnfDK genes of A. vinelandii. The organization of the vnf gene cluster in A. variabilis was similar to that of A. vinelandii. However, in A. variabilis, the vnfG gene was fused to vnfD; hence, this gene is designated vnfDG. A vnfH gene was not contiguous with the vnfDG gene and has not yet been identified. A mutant strain, in which a neomycin resistance cassette was inserted into the vnf cluster, grew well in a medium lacking a source of fixed nitrogen in the presence of molybdenum but grew poorly when vanadium replaced molybdenum. In contrast, the parent strain grew equally well in media containing either molybdenum or vanadium. The vnf genes were transcribed in the absence of molybdenum, with or without vanadium. The vnf gene cluster did not hybridize to chromosomal DNA from Anabaena sp. strain PCC 7120 or from the heterotrophic strains, Nostoc sp. strain Mac and Nostoc sp. strain ATCC 29150. A hybridizing ClaI fragment very similar in size to the A. variabilis ClaI fragment was present in DNA isolated from several independent, cultured isolates of Anabaena sp. from the Azolla symbiosis. PMID:8407800

  11. Purification and cloning of a proline 3-hydroxylase, a novel enzyme which hydroxylates free L-proline to cis-3-hydroxy-L-proline.

    PubMed Central

    Mori, H; Shibasaki, T; Yano, K; Ozaki, A

    1997-01-01

    Proline 3-hydroxylase was purified from Streptomyces sp. strain TH1, and its structural gene was cloned. The purified enzyme hydroxylated free L-proline to cis-3-hydroxy-L-proline and showed properties of a 2-oxoglutarate-dependent dioxygenase (H. Mori, T. Shibasaki, Y. Uosaki, K. Ochiai, and A. Ozaki, Appl. Environ. Microbiol. 62:1903-1907, 1996). The molecular mass of the purified enzyme was 35 kDa as determined by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. The isoelectric point of the enzyme was 4.3. The optimal pH and temperature were 7.0 and 35 degrees C, respectively. The Km values were 0.56 and 0.11 mM for L-proline and 2-oxoglutarate, respectively. The kcat value of hydroxylation was 3.2 s^-1. Determined N-terminal and internal amino acid sequences of the purified protein were not found in the SwissProt protein database. A DNA fragment of 74 bp was amplified by PCR with degenerate primers based on the determined N-terminal amino acid sequence. With this fragment as a template, a digoxigenin-labeled N-terminal probe was synthesized by PCR. A 6.5-kbp chromosome fragment was cloned by colony hybridization with the labeled probe. The determined DNA sequence of the cloned fragment revealed an 870-bp open reading frame (ORF 3), encoding a protein of 290 amino acids with a calculated molecular weight of 33,158. No sequence homolog was found in EMBL, GenBank, and DDBJ databases. ORF 3 was expressed in Escherichia coli DH1. Recombinants showed hydroxylating activity five times higher than that of the original bacterium, Streptomyces sp. strain TH1. It was concluded that the ORF 3 encodes functional proline 3-hydroxylase. PMID:9294421
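
To make the reported kinetic constants concrete, the sketch below evaluates the Michaelis-Menten expression v/[E] = kcat·[S]/(Km + [S]) using the L-proline values quoted above (Km = 0.56 mM, kcat = 3.2 s^-1). It assumes simple single-substrate behaviour with 2-oxoglutarate saturating; the abstract does not state the kinetic model, and the function name is hypothetical.

```python
KM_PROLINE_MM = 0.56   # Km for L-proline, from the abstract (mM)
KCAT_PER_S = 3.2       # kcat of hydroxylation, from the abstract (s^-1)


def turnover_rate(substrate_mM: float,
                  km_mM: float = KM_PROLINE_MM,
                  kcat_per_s: float = KCAT_PER_S) -> float:
    """Per-enzyme rate (s^-1) under the assumed single-substrate Michaelis-Menten model."""
    if substrate_mM < 0:
        raise ValueError("substrate concentration must be non-negative")
    if km_mM <= 0:
        raise ValueError("Km must be positive")
    return kcat_per_s * substrate_mM / (km_mM + substrate_mM)


if __name__ == "__main__":
    for s in (0.1, 0.56, 5.6):
        print(f"[L-proline] = {s} mM -> {turnover_rate(s):.2f} s^-1")
```

At [S] = Km the rate is half of kcat by definition, which gives a quick sanity check on any implementation of the formula.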

  12. BioSAXS Sample Changer: a robotic sample changer for rapid and reliable high-throughput X-ray solution scattering experiments

    SciTech Connect

    Round, Adam; Felisaz, Franck; Fodinger, Lukas; Gobbo, Alexandre; Huet, Julien; Villard, Cyril; Blanchet, Clement E.; Roessle, Manfred; Svergun, Dmitri I.

    2015-01-01

    A robotic sample changer for solution X-ray scattering experiments optimized for speed and to use the minimum amount of material has been developed. This system is now in routine use at three high-brilliance European synchrotron sites, each capable of several hundred measurements per day. Small-angle X-ray scattering (SAXS) of macromolecules in solution is in increasing demand by an ever more diverse research community, both academic and industrial. To better serve user needs, and to allow automated and high-throughput operation, a sample changer (BioSAXS Sample Changer) that is able to perform unattended measurements of up to several hundred samples per day has been developed. The Sample Changer is able to handle and expose sample volumes of down to 5 µl with a measurement/cleaning cycle of under 1 min. The samples are stored in standard 96-well plates and the data are collected in a vacuum-mounted capillary with automated positioning of the solution in the X-ray beam. Fast and efficient capillary cleaning avoids cross-contamination and ensures reproducibility of the measurements. Independent temperature control for the well storage and for the measurement capillary allows the samples to be kept cool while still collecting data at physiological temperatures. The Sample Changer has been installed at three major third-generation synchrotrons: on the BM29 beamline at the European Synchrotron Radiation Facility (ESRF), the P12 beamline at the PETRA-III synchrotron (EMBL@PETRA-III) and the I22/B21 beamlines at Diamond Light Source, with the latter being the first commercial unit supplied by Bruker ASC.

  13. Versatile sample environments and automation for biological solution X-ray scattering experiments at the P12 beamline (PETRA III, DESY)

    PubMed Central

    Blanchet, Clement E.; Spilotros, Alessandro; Schwemmer, Frank; Graewert, Melissa A.; Kikhney, Alexey; Jeffries, Cy M.; Franke, Daniel; Mark, Daniel; Zengerle, Roland; Cipriani, Florent; Fiedler, Stefan; Roessle, Manfred; Svergun, Dmitri I.

    2015-01-01

    A high-brilliance synchrotron P12 beamline of the EMBL located at the PETRA III storage ring (DESY, Hamburg) is dedicated to biological small-angle X-ray scattering (SAXS) and has been designed and optimized for scattering experiments on macromolecular solutions. Scatterless slits reduce the parasitic scattering, a custom-designed miniature active beamstop ensures accurate data normalization and the photon-counting PILATUS 2M detector enables the background-free detection of weak scattering signals. The high flux and small beam size allow for rapid experiments with exposure time down to 30–50 ms covering the resolution range from about 300 to 0.5 nm. P12 possesses a versatile and flexible sample environment system that caters for the diverse experimental needs required to study macromolecular solutions. These include an in-vacuum capillary mode for standard batch sample analyses with robotic sample delivery and for continuous-flow in-line sample purification and characterization, as well as an in-air capillary time-resolved stopped-flow setup. A novel microfluidic centrifugal mixing device (SAXS disc) is developed for a high-throughput screening mode using sub-microlitre sample volumes. Automation is a key feature of P12; it is controlled by a beamline meta server, which coordinates and schedules experiments from either standard or nonstandard operational setups. The integrated SASFLOW pipeline automatically checks for consistency, and processes and analyses the data, providing near real-time assessments of overall parameters and the generation of low-resolution models within minutes of data collection. These advances, combined with a remote access option, allow for rapid high-throughput analysis, as well as time-resolved and screening experiments for novice and expert biological SAXS users. PMID:25844078

  14. Next-generation sequencing: a challenge to meet the increasing demand for training workshops in Australia.

    PubMed

    Watson-Haigh, Nathan S; Shang, Catherine A; Haimel, Matthias; Kostadima, Myrto; Loos, Remco; Deshpande, Nandan; Duesing, Konsta; Li, Xi; McGrath, Annette; McWilliam, Sean; Michnowicz, Simon; Moolhuijzen, Paula; Quenette, Steve; Revote, Jerico Nico De Leon; Tyagi, Sonika; Schneider, Maria V

    2013-09-01

    The widespread adoption of high-throughput next-generation sequencing (NGS) technology among the Australian life science research community is highlighting an urgent need to up-skill biologists in tools required for handling and analysing their NGS data. There is currently a shortage of cutting-edge bioinformatics training courses in Australia as a consequence of a scarcity of skilled trainers with time and funding to develop and deliver training courses. To address this, a consortium of Australian research organizations, including Bioplatforms Australia, the Commonwealth Scientific and Industrial Research Organisation and the Australian Bioinformatics Network, has been collaborating with the EMBL-EBI training team. A group of Australian bioinformaticians attended the train-the-trainer workshop to improve training skills in developing and delivering bioinformatics workshop curriculum. A 2-day NGS workshop was jointly developed to provide hands-on knowledge and understanding of typical NGS data analysis workflows. The road show-style workshop was successfully delivered at five geographically distant venues in Australia using the newly established Australian NeCTAR Research Cloud. We highlight the challenges we had to overcome at different stages from design to delivery, including the establishment of an Australian bioinformatics training network and the computing infrastructure and resource development. A virtual machine image, workshop materials and scripts for configuring a machine with workshop contents have all been made available under a Creative Commons Attribution 3.0 Unported License. This means participants continue to have convenient access to an environment with which they had become familiar, and bioinformatics trainers are able to access and reuse these resources. PMID:23543352

  15. Molecular cloning of a mouse DNA repair gene that complements the defect of group-A xeroderma pigmentosum

    SciTech Connect

    Tanaka, K.; Satokata, I.; Ogita, Z.; Uchida, T.; Okada, Y.

    1989-07-01

    For isolation of the gene responsible for xeroderma pigmentosum (XP) complementation group A, plasmid pSV2gpt and genomic DNA from a mouse embryo were cotransfected into XP2OSSV cells, a group-A XP cell line. Two primary UV-resistant XP transfectants were isolated from about 1.6 × 10^5 pSV2gpt-transformed XP colonies. pSV2gpt and genomic DNA from the primary transfectants were again cotransfected into XP2OSSV cells and a secondary UV-resistant XP transfectant was obtained by screening about 4.8 × 10^5 pSV2gpt-transformed XP colonies. The secondary transfectant retained fewer mouse repetitive sequences. A mouse gene that complements the defect of XP2OSSV cells was cloned into an EMBL3 vector from the genome of a secondary transfectant. Transfections of the cloned DNA also conferred UV resistance on another group-A XP cell line but not on XP cell lines of group C, D, F, or G. Northern blot analysis of poly(A)+ RNA with a subfragment of cloned mouse DNA repair gene as the probe revealed that an approximately 1.0-kilobase mRNA was transcribed in the donor mouse embryo and secondary transfectant, and approximately 1.0- and approximately 1.3-kilobase mRNAs were transcribed in normal human cells, but none of these mRNAs was detected in three strains of group-A XP cells. These results suggest that the cloned DNA repair gene is specific for group-A XP and may be the mouse homologue of the group-A XP human gene.

  16. Whole genome analysis of Vietnamese G2P[4] rotavirus strains possessing the NSP2 gene sharing an ancestral sequence with Chinese sheep and goat rotavirus strains.

    PubMed

    Do, Loan Phuong; Doan, Yen Hai; Nakagomi, Toyoko; Gauchan, Punita; Kaneko, Miho; Agbemabiese, Chantal; Dang, Anh Duc; Nakagomi, Osamu

    2015-10-01

    Because imminent introduction into Vietnam of a vaccine against Rotavirus A is anticipated, baseline information on the whole genome of representative strains is needed to understand changes in circulating strains that may occur after vaccine introduction. In this study, the whole genomes of two G2P[4] strains detected in Nha Trang, Vietnam in 2008 were sequenced, this being the last period during which virtually no rotavirus vaccine was used in this country. The two strains were found to be >99.9% identical in sequence and had a typical DS-1 like G2-P[4]-I2-R2-C2-M2-A2-N2-T2-E2-H2 genotype constellation. Analysis of the Vietnamese strains with >184 G2P[4] strains retrieved from GenBank/EMBL/DDBJ DNA databases placed the Vietnamese strains in one of the lineages commonly found among contemporary strains, with the exception of the NSP2 and NSP4 genes. The NSP2 genes were found to belong to a previously undescribed lineage that diverged from Chinese sheep and goat rotavirus strains, including a Chinese rotavirus vaccine strain LLR with 95% nucleotide identity; the time of their most recent common ancestor was 1975. The NSP4 genes were found to belong, together with Thai and USA strains, to an emergent lineage (VIII), adding further diversity to ever diversifying NSP4 lineages. Thus, there is a need to enhance surveillance of locally-circulating strains from both children and animals at the whole genome level to address the effect of rotavirus vaccines on changing strain distribution. PMID:26382233

  17. Heterogeneous classifier fusion for ligand-based virtual screening: or, how decision making by committee can be a good thing.

    PubMed

    Riniker, Sereina; Fechner, Nikolas; Landrum, Gregory A

    2013-11-25

    The concept of data fusion - the combination of information from different sources describing the same object with the expectation to generate a more accurate representation - has found application in a very broad range of disciplines. In the context of ligand-based virtual screening (VS), data fusion has been applied to combine knowledge from either different active molecules or different fingerprints to improve similarity search performance. Machine-learning (ML) methods based on fusion of multiple homogeneous classifiers, in particular random forests, have also been widely applied in the ML literature. The heterogeneous version of classifier fusion - fusing the predictions from different model types - has been less explored. Here, we investigate heterogeneous classifier fusion for ligand-based VS using three different ML methods, RF, naïve Bayes (NB), and logistic regression (LR), with four 2D fingerprints, atom pairs, topological torsions, RDKit fingerprint, and circular fingerprint. The methods are compared using a previously developed benchmarking platform for 2D fingerprints which is extended to ML methods in this article. The original data sets are filtered for difficulty, and a new set of challenging data sets from ChEMBL is added. Data sets were also generated for a second use case: starting from a small set of related actives instead of diverse actives. The final fused model consistently outperforms the other approaches across the broad variety of targets studied, indicating that heterogeneous classifier fusion is a very promising approach for ligand-based VS. The new data sets together with the adapted source code for ML methods are provided in the Supporting Information. PMID:24171408
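
The general idea of fusing predictions from different model types can be sketched as follows. The abstract does not state which fusion rule was used, so the mean-rank rule below is an assumption chosen for illustration, and the model names, molecule identifiers and scores in the usage example are invented.

```python
from typing import Dict, List


def ranks_from_scores(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank molecules 1..n, highest score first; ties are broken by molecule id."""
    ordered = sorted(scores, key=lambda mol: (-scores[mol], mol))
    return {mol: position + 1 for position, mol in enumerate(ordered)}


def fuse_by_mean_rank(model_scores: List[Dict[str, float]]) -> List[str]:
    """Order molecules by their mean rank across models (assumed fusion rule)."""
    if not model_scores:
        raise ValueError("at least one model is required")
    molecules = set(model_scores[0])
    if any(set(scores) != molecules for scores in model_scores):
        raise ValueError("all models must score the same molecules")
    per_model = [ranks_from_scores(scores) for scores in model_scores]
    mean_rank = {
        mol: sum(ranks[mol] for ranks in per_model) / len(per_model)
        for mol in molecules
    }
    return sorted(molecules, key=lambda mol: (mean_rank[mol], mol))


if __name__ == "__main__":
    # Invented activity probabilities from three hypothetical models.
    rf = {"m1": 0.9, "m2": 0.2, "m3": 0.6}
    nb = {"m1": 0.4, "m2": 0.1, "m3": 0.8}
    lr = {"m1": 0.7, "m2": 0.3, "m3": 0.5}
    print(fuse_by_mean_rank([rf, nb, lr]))
```

Working on ranks rather than raw scores sidesteps the fact that different model types produce scores on different scales; averaging calibrated probabilities would be an alternative design.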

  18. LabDisk for SAXS: a centrifugal microfluidic sample preparation platform for small-angle X-ray scattering.

    PubMed

    Schwemmer, Frank; Blanchet, Clement E; Spilotros, Alessandro; Kosse, Dominique; Zehnle, Steffen; Mertens, Haydyn D T; Graewert, Melissa A; Rössle, Manfred; Paust, Nils; Svergun, Dmitri I; von Stetten, Felix; Zengerle, Roland; Mark, Daniel

    2016-03-23

    We present a centrifugal microfluidic LabDisk for protein structure analysis via small-angle X-ray scattering (SAXS) on synchrotron beamlines. One LabDisk prepares 120 different measurement conditions, grouped into six dilution matrices. Each dilution matrix: (1) features automatic generation of 20 different measurement conditions from three input liquids and (2) requires only 2.5 μl of protein solution, which corresponds to a tenfold reduction in sample volume in comparison to the state of the art. Total hands on time for preparation of 120 different measurement conditions is less than 5 min. Read-out is performed on disk within the synchrotron beamline P12 at EMBL Hamburg (PETRA III, DESY). We demonstrate: (1) aliquoting of 40 nl aliquots for five different liquids typically used in SAXS and (2) confirm fluidic performance of aliquoting, merging, mixing and read-out from SAXS experiments (2.7-4.4% CV of protein concentration). We apply the LabDisk for SAXS for basic analysis methods, such as measurement of the radius of gyration, and advanced analysis methods, such as the ab initio calculation of 3D models. The suitability of the LabDisk for SAXS for protein structure analysis under different environmental conditions is demonstrated for glucose isomerase under varying protein and NaCl concentrations. We show that the apparent radius of gyration of the negatively charged glucose isomerase decreases with increasing protein concentration at low salt concentration. At high salt concentration the radius of gyration (Rg) does not change with protein concentrations. Such experiments can be performed by a non-expert, since the LabDisk for SAXS does not require attachment of tubings or pumps and can be filled with regular pipettes. The new platform has the potential to introduce routine high-throughput SAXS screening of protein structures with minimal input volumes to the regular operation of synchrotron beamlines. PMID:26931639

  19. Improving protein identification from peptide mass fingerprinting through a parameterized multi-level scoring algorithm and an optimized peak detection.

    PubMed

    Gras, R; Müller, M; Gasteiger, E; Gay, S; Binz, P A; Bienvenut, W; Hoogland, C; Sanchez, J C; Bairoch, A; Hochstrasser, D F; Appel, R D

    1999-12-01

    We have developed a new algorithm to identify proteins by means of peptide mass fingerprinting. Starting from the matrix-assisted laser desorption/ionization-time-of-flight (MALDI-TOF) spectra and environmental data such as species, isoelectric point and molecular weight, as well as chemical modifications or number of missed cleavages of a protein, the program performs a fully automated identification of the protein. The first step is a peak detection algorithm, which allows precise and fast determination of peptide masses, even if the peaks are of low intensity or they overlap. In the second step the masses and environmental data are used by the identification algorithm to search in protein sequence databases (SWISS-PROT and/or TrEMBL) for protein entries that match the input data. Consequently, a list of candidate proteins is selected from the database, and a score calculation provides a ranking according to the quality of the match. To define the most discriminating scoring calculation we analyzed the respective role of each parameter in two directions. The first one is based on filtering and exploratory effects, while the second direction focuses on the levels where the parameters intervene in the identification process. Thus, according to our analysis, all input parameters contribute to the score, however with different weights. Since it is difficult to estimate the weights in advance, they have been computed with a genetic algorithm, using a training set of 91 protein spectra with their environmental data. We tested the resulting scoring calculation on a test set of ten proteins and compared the identification results with those of other peptide mass fingerprinting programs. PMID:10612280

  20. UniProtKB/Swiss-Prot.

    PubMed

    Boutet, Emmanuel; Lieberherr, Damien; Tognolli, Michael; Schneider, Michel; Bairoch, Amos

    2007-01-01

    The Swiss Institute of Bioinformatics (SIB), the European Bioinformatics Institute (EBI), and the Protein Information Resource (PIR) form the Universal Protein Resource (UniProt) consortium. Its main goal is to provide the scientific community with a central resource for protein sequences and functional information. The UniProt consortium maintains the UniProt KnowledgeBase (UniProtKB) and several supplementary databases including the UniProt Reference Clusters (UniRef) and the UniProt Archive (UniParc). (1) UniProtKB is a comprehensive protein sequence knowledgebase that consists of two sections: UniProtKB/Swiss-Prot, which contains manually annotated entries, and UniProtKB/TrEMBL, which contains computer-annotated entries. UniProtKB/Swiss-Prot entries contain information curated by biologists and provide users with cross-links to about 100 external databases and with access to additional information or tools. (2) The UniRef databases (UniRef100, UniRef90, and UniRef50) define clusters of protein sequences that share 100, 90, or 50% identity. (3) The UniParc database stores and maps all publicly available protein sequence data, including obsolete data excluded from UniProtKB. The UniProt databases can be accessed online (http://www.uniprot.org/) or downloaded in several formats (ftp://ftp.uniprot.org/pub). New releases are published every 2 weeks. The purpose of this chapter is to present a guided tour of a UniProtKB/Swiss-Prot entry, paying particular attention to the specificities of plant protein annotation. We will also present some of the tools and databases that are linked to each entry. PMID:18287689

  1. Next-generation sequencing: a challenge to meet the increasing demand for training workshops in Australia

    PubMed Central

    Watson-Haigh, Nathan S.; Shang, Catherine A.; Haimel, Matthias; Kostadima, Myrto; Loos, Remco; Deshpande, Nandan; Duesing, Konsta; Li, Xi; McGrath, Annette; McWilliam, Sean; Michnowicz, Simon; Moolhuijzen, Paula; Quenette, Steve; Revote, Jerico Nico De Leon; Tyagi, Sonika; Schneider, Maria V.

    2013-01-01

    The widespread adoption of high-throughput next-generation sequencing (NGS) technology among the Australian life science research community is highlighting an urgent need to up-skill biologists in tools required for handling and analysing their NGS data. There is currently a shortage of cutting-edge bioinformatics training courses in Australia as a consequence of a scarcity of skilled trainers with time and funding to develop and deliver training courses. To address this, a consortium of Australian research organizations, including Bioplatforms Australia, the Commonwealth Scientific and Industrial Research Organisation and the Australian Bioinformatics Network, have been collaborating with the EMBL-EBI training team. A group of Australian bioinformaticians attended the train-the-trainer workshop to improve training skills in developing and delivering bioinformatics workshop curriculum. A 2-day NGS workshop was jointly developed to provide hands-on knowledge and understanding of typical NGS data analysis workflows. The road show–style workshop was successfully delivered at five geographically distant venues in Australia using the newly established Australian NeCTAR Research Cloud. We highlight the challenges we had to overcome at different stages from design to delivery, including the establishment of an Australian bioinformatics training network and the computing infrastructure and resource development. A virtual machine image, workshop materials and scripts for configuring a machine with workshop contents have all been made available under a Creative Commons Attribution 3.0 Unported License. This means participants continue to have convenient access to an environment they had become familiar with, and bioinformatics trainers are able to access and reuse these resources. PMID:23543352

  2. Use of a selection technique to identify the diversity of binding sites for the yeast RAP1 transcription factor.

    PubMed Central

    Graham, I R; Chambers, A

    1994-01-01

    We have used the technique known as selected and amplified binding (SAAB) to isolate binding sites for the yeast transcription factor RAP1 from a degenerate pool of oligonucleotides. A total of 47 sequences were isolated, of which two were shown to be contaminating non-RAP1 binding sites. After excluding these two sequences the remainder of the sequences were used to derive a new consensus binding site for RAP1. The new consensus 5' A/G T A/G C A C C C A N N C C/A C C 3' is a significant extension of the existing consensus (4). It is longer by two base pairs at the 5' end and is significantly more constrained at the 3' end. An analysis of the combinations of mis-matches in individual SAAB sequences, compared to the consensus RAP1 binding site, has allowed us to analyse the structure of the RAP1 binding site in some detail. The binding site can be sub-divided into three regions; a core binding site, a 5' flanking region and a 3' flanking region. The core binding site, consisting of the sequence 5'CACCCA3', is critical for recognition by RAP1. The less conserved flanking regions are not as important. Interactions between RAP1 and these regions probably stabilise the interaction between RAP1 and the core binding site. Each of the sequences isolated in the SAAB analysis was used to search release 78 of the EMBL+GenBank DNA data base. The searches identified 102 potential binding sites for RAP1 within promoters of yeast genes. Images PMID:8121795
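    The database search described in this abstract can be illustrated with a small sketch. The code below is not the authors' search method; it simply turns the consensus quoted above into a regular expression and scans the forward strand of a sequence. The function names (`consensus_to_regex`, `find_sites`) and the test sequence are hypothetical, and exact matching is an assumption: the abstract notes that individual SAAB sequences carry mismatches, which this sketch does not model.

```python
import re

# Consensus quoted in the abstract, one entry per position.
# "A/G" means either base is accepted; "N" means any base.
CONSENSUS = ["A/G", "T", "A/G", "C", "A", "C", "C", "C", "A",
             "N", "N", "C", "C/A", "C", "C"]


def consensus_to_regex(positions):
    """Translate a per-position consensus into a regular expression string."""
    parts = []
    for p in positions:
        if p == "N":
            parts.append("[ACGT]")
        elif "/" in p:
            parts.append("[" + p.replace("/", "") + "]")
        else:
            parts.append(p)
    return "".join(parts)


def find_sites(sequence, positions=CONSENSUS):
    """Return (start, match) pairs for exact consensus matches.

    Only the given strand is scanned (assumption: the caller handles the
    reverse complement). A lookahead is used so overlapping sites are found.
    """
    pattern = re.compile("(?=(" + consensus_to_regex(positions) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Made-up sequence containing one site; the core CACCCA sits inside it.
    print(find_sites("TTATACACCCAGGCACCTT"))
```

    Scanning the reverse complement and tolerating mismatches in the flanking regions would be natural extensions, since the abstract reports that the flanks are less conserved than the core.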

  3. Presence of two transcribed malate synthase genes in an n-alkane-utilizing yeast, Candida tropicalis.

    PubMed

    Hikida, M; Atomi, H; Fukuda, Y; Aoki, A; Hishida, T; Teranishi, Y; Ueda, M; Tanaka, A

    1991-12-01

    The presence of two genomic DNA regions encoding malate synthase (MS) was shown by Southern blot analysis of the genomic DNA from an n-alkane-assimilating yeast, Candida tropicalis, using a partial MS cDNA probe, in accordance with the fact that two types of partial MS cDNAs have previously been isolated. This was also confirmed by the restriction mapping of the two genes screened from the yeast lambda EMBL library. Nucleotide sequence analysis of the respective genomic DNAs, named MS-1 gene and MS-2 gene, revealed that both regions encoding MS had the same length of 1,653 base pairs, corresponding to 551 amino acids (molecular mass of MS-1, 62,448 Da; MS-2, 62,421 Da). Although 29 nucleotide pairs differed in the sequences of the coding regions, the number of amino acid replacements was only one: 159Asn (MS-1) → 159Ser (MS-2). In the 5'-flanking regions, there were replacements of four nucleotide pairs, deletion of one pair, and insertion of four pairs. In spite of the fact that two genomic genes were present and transcribed, RNA blot analysis demonstrated that only one band (about 2 kb) was observable even when the carbon sources in the cultivation medium were changed. A comparison of the amino acid sequences was made with MSs of rape (Brassica napus L.), cucumber seed, pumpkin seed, Escherichia coli, and Hansenula polymorpha. A high homology was observed among these enzymes, the results indicating that the protein structure was relatively well conserved through the evolution of the molecule.(ABSTRACT TRUNCATED AT 250 WORDS) PMID:1794980

  4. Insights from the GC content analysis of 76 genome survey sequences (GSS) from Elaeis oleiferaψ

    PubMed Central

    Bhore, Subhash J; Kassim, Amelia; Shah, Farida H

    2010-01-01

    South American oil-palm (Elaeis oleifera) is not cultivated on a large scale in tropical countries like Malaysia due to the low yield of palm oil derived from its fruit mesocarp. However, its fruit mesocarp oil contains about 68.6 % oleic acid (C18:1), which is more than double that of the commercially cultivated oil palm, E. guineensis Jacq Tenera (hybrid of Dura (♀) x Pisifera (♂)). It is also known that E. oleifera is a good source of tocotrienols and carotenoids. Therefore, it is of interest to know the genome sequence of E. oleifera. The objective of this study is to generate genome survey sequences (GSS) to gain insight into the GC content of the E. oleifera genome. The nuclear genomic DNA isolated from young leaf‐tissues was digested with EcoRI and NdeI/DraI restriction enzymes, and three genomic DNA libraries were constructed using Lambda ZAP‐II, pGEM®‐T Easy, and pDONR 222™ as cloning vectors. The 76 generated GSSs were analyzed using bioinformatics tools. The analysis indicates that the adenine, cytosine, guanine and thymine contents of the generated GSSs are 30%, 20%, 20%, and 30%, respectively. In conclusion, based on the GC content analysis of the 76 randomly isolated GSSs, we hypothesize that the GC content of the E. oleifera genome is 40%. This hypothesized 40% GC content is expected to remain close to the GC content obtained from whole genome analysis. ψThe nucleotide sequence data reported in this paper have been submitted to the dbGSS division of the international DNA database (GenBank/DDBJ/EMBL) under accession numbers: DX575945- DX575972 and EI798032-EI798079. Abbreviations: gDNA - nuclear genomic DNA, GSSs - genome survey sequences, SAOP - South American oil‐palm. PMID:21364775

  5. Structure-Based Consensus Scoring Scheme for Selecting Class A Aminergic GPCR Fragments.

    PubMed

    Kelemen, Ádám A; Kiss, Róbert; Ferenczy, György G; Kovács, László; Flachner, Beáta; Lőrincz, Zsolt; Keserű, György M

    2016-02-22

    Aminergic G-protein coupled receptors (GPCRs) represent well-known targets of central nervous-system related diseases. In this study a structure-based consensus virtual screening scheme was developed for designing targeted fragment libraries against class A aminergic GPCRs. Nine representative aminergic GPCR structures were selected by first clustering available X-ray structures and then choosing the one in each cluster that performs best in self-docking calculations. A consensus scoring protocol was developed using known promiscuous aminergic ligands and decoys as a training set. The consensus score (FrACS-fragment aminergic consensus score) calculated for the optimized protein ensemble showed improved enrichments in most cases as compared to stand-alone structures. Retrospective validation was carried out on public screening data for aminergic targets (5-HT1 serotonin receptor, TA1 trace-amine receptor) showing 8-17-fold enrichments using an ensemble of aminergic receptor structures. The performance of the structure based FrACS in combination with our ligand-based prefilter (FrAGS) was investigated both in a retrospective validation on the ChEMBL database and in a prospective validation on an in-house fragment library. In prospective validation virtual fragment hits were tested on 5-HT6 serotonin receptors not involved in the development of FrACS. Six out of the 36 experimentally tested fragments exhibited remarkable antagonist efficacies, and 4 showed IC50 values in the low micromolar or submicromolar range in a cell-based assay. Both retrospective and prospective validations revealed that the methodology is suitable for designing focused class A GPCR fragment libraries from large screening decks, commercial compound collections, or virtual databases. PMID:26760056

  6. The 2016 database issue of Nucleic Acids Research and an updated molecular biology database collection

    PubMed Central

    Rigden, Daniel J.; Fernández-Suárez, Xosé M.; Galperin, Michael Y.

    2016-01-01

    The 2016 Database Issue of Nucleic Acids Research starts with overviews of the resources provided by three major bioinformatics centers, the U.S. National Center for Biotechnology Information (NCBI), the European Bioinformatics Institute (EMBL-EBI) and Swiss Institute for Bioinformatics (SIB). Also included are descriptions of 62 new databases and updates on 95 databases that have been previously featured in NAR plus 17 previously described elsewhere. A number of papers in this issue deal with resources on nucleic acids, including various kinds of non-coding RNAs and their interactions, molecular dynamics simulations of nucleic acid structure, and two databases of super-enhancers. The protein database section features important updates on the EBI's Pfam, PDBe and PRIDE databases, as well as a variety of resources on pathways, metabolomics and metabolic modeling. This issue also includes updates on popular metagenomics resources, such as MG-RAST, EBI Metagenomics, and probeBASE, as well as a newly compiled Human Pan-Microbe Communities database. A significant fraction of the new and updated databases are dedicated to the genetic basis of disease, primarily cancer, and various aspects of drug research, including resources for patented drugs, their side effects, withdrawn drugs, and potential drug targets. A further six papers present updated databases of various antimicrobial and anticancer peptides. The entire Database Issue is freely available online on the Nucleic Acids Research website (http://nar.oxfordjournals.org/). The NAR online Molecular Biology Database Collection, http://www.oxfordjournals.org/nar/database/c/, has been updated with the addition of 88 new resources and removal of 23 obsolete websites, which brought the current listing to 1685 databases. PMID:26740669

  7. Making Transporter Models for Drug-Drug Interaction Prediction Mobile.

    PubMed

    Ekins, Sean; Clark, Alex M; Wright, Stephen H

    2015-10-01

    The past decade has seen increased numbers of studies publishing ligand-based computational models for drug transporters. Although they generally use small experimental data sets, these models can provide insights into structure-activity relationships for the transporter. In addition, such models have helped to identify new compounds as substrates or inhibitors of transporters of interest. We recently proposed that many transporters are promiscuous and may require profiling of new chemical entities against multiple substrates for a specific transporter. Furthermore, it should be noted that virtually all of the published ligand-based transporter models are only accessible to those involved in creating them and, consequently, are rarely shared effectively. One way to surmount this is to make models shareable or more accessible. The development of mobile apps that can access such models is highlighted here. These apps can be used to predict ligand interactions with transporters using Bayesian algorithms. We used recently published transporter data sets (MATE1, MATE2K, OCT2, OCTN2, ASBT, and NTCP) to build preliminary models in a commercial tool and in open software that can deliver the model in a mobile app. In addition, several transporter data sets extracted from the ChEMBL database were used to illustrate how such public data and models can be shared. Predicting drug-drug interactions for various transporters using computational models is potentially within reach of anyone with an iPhone or iPad. Such tools could help prioritize which substrates should be used for in vivo drug-drug interaction testing and enable open sharing of models. PMID:26199424

  8. Characterization and genome functional analysis of a novel metamitron-degrading strain Rhodococcus sp. MET via both triazinone and phenyl rings cleavage

    PubMed Central

    Fang, Hua; Xu, Tianheng; Cao, Duantao; Cheng, Longyin; Yu, Yunlong

    2016-01-01

    A novel bacterium capable of utilizing metamitron as the sole source of carbon and energy was isolated from contaminated soil and identified as Rhodococcus sp. MET based on its morphological characteristics, BIOLOG GP2 microplate profile, and 16S rDNA phylogeny. Genome sequencing and functional annotation of the isolate MET showed a 6,340,880 bp genome with a 62.47% GC content and 5,987 protein-coding genes. In total, 5,907 genes were annotated with the COG, GO, KEGG, Pfam, Swiss-Prot, TrEMBL, and nr databases. The degradation rate of metamitron by the isolate MET obviously increased with increasing substrate concentrations from 1 to 10 mg/l and subsequently decreased at 100 mg/l. The optimal pH and temperature for metamitron biodegradation were 7.0 and 20–30 °C, respectively. Based on genome annotation of the metamitron degradation genes and the metabolites detected by HPLC-MS/MS, the following metamitron biodegradation pathways were proposed: 1) Metamitron was transformed into 2-(3-hydrazinyl-2-ethyl)-hydrazono-2-phenylacetic acid by triazinone ring cleavage and further mineralization; 2) Metamitron was converted into 3-methyl-4-amino-6(2-hydroxy-muconic acid)-1,2,4-triazine-5(4H)-one by phenyl ring cleavage and further mineralization. The coexistence of diverse mineralization pathways indicates that our isolate may effectively bioremediate triazinone herbicide-contaminated soils. PMID:27578531

  9. Insights into corn genes derived from large-scale cDNA sequencing.

    PubMed

    Alexandrov, Nickolai N; Brover, Vyacheslav V; Freidin, Stanislav; Troukhan, Maxim E; Tatarinova, Tatiana V; Zhang, Hongyu; Swaller, Timothy J; Lu, Yu-Ping; Bouck, John; Flavell, Richard B; Feldmann, Kenneth A

    2009-01-01

    We present a large portion of the transcriptome of Zea mays, including ESTs representing 484,032 cDNA clones from 53 libraries and 36,565 fully sequenced cDNA clones, out of which 31,552 clones are non-redundant. These and other previously sequenced transcripts have been aligned with available genome sequences and have provided new insights into the characteristics of gene structures and promoters within this major crop species. We found that although the average number of introns per gene is about the same in corn and Arabidopsis, corn genes have more alternatively spliced isoforms. Examination of the nucleotide composition of coding regions reveals that corn genes, as well as genes of other Poaceae (Grass family), can be divided into two classes according to the GC content at the third position in the amino acid encoding codons. Many of the transcripts that have lower GC content at the third position have dicot homologs but the high GC content transcripts tend to be more specific to the grasses. The high GC content class is also enriched with intronless genes. Together this suggests that an identifiable class of genes in plants is associated with the Poaceae divergence. Furthermore, because many of these genes appear to be derived from ancestral genes that do not contain introns, this evolutionary divergence may be the result of horizontal gene transfer from species not only with different codon usage but possibly that did not have introns, perhaps outside of the plant kingdom. By comparing the cDNAs described herein with the non-redundant set of corn mRNAs in GenBank, we estimate that there are about 50,000 different protein coding genes in Zea. All of the sequence data from this study have been submitted to DDBJ/GenBank/EMBL under accession numbers EU940701-EU977132 (FLI cDNA) and FK944382-FL482108 (EST). PMID:18937034
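    The two gene classes in this abstract are defined by GC content at the third codon position (often called GC3). The sketch below shows one way to compute that quantity for a single coding sequence. It is not the authors' pipeline: the function name `gc3_percent` is hypothetical, the reading frame is assumed to start at the first base, and no class threshold is given because the abstract does not state one.

```python
def gc3_percent(cds):
    """Percentage of G or C among third-codon-position bases.

    Assumes the sequence starts in frame; trailing bases that do not
    complete a codon are dropped. Any stop codon present is counted like
    every other codon, which is a simplification.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3   # length of the whole-codon prefix
    third_bases = cds[2:usable:3]      # bases at positions 3, 6, 9, ...
    if not third_bases:
        raise ValueError("sequence shorter than one codon")
    gc = sum(1 for base in third_bases if base in "GC")
    return 100.0 * gc / len(third_bases)


if __name__ == "__main__":
    # Made-up CDS: codons ATG GCC AAA TAA, third bases G, C, A, A.
    print(gc3_percent("ATGGCCAAATAA"))
```

    Computing this value for every transcript and plotting the distribution would be one way to look for the two classes the abstract describes.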

  10. Dictionary-driven prokaryotic gene finding.

    PubMed

    Shibuya, Tetsuo; Rigoutsos, Isidore

    2002-06-15

    Gene identification, also known as gene finding or gene recognition, is among the important problems of molecular biology that have been receiving increasing attention with the advent of large scale sequencing projects. Previous strategies for solving this problem can be categorized into essentially two schools of thought: one school employs sequence composition statistics, whereas the other relies on database similarity searches. In this paper, we propose a new gene identification scheme that combines the best characteristics from each of these two schools. In particular, our method determines gene candidates among the ORFs that can be identified in a given DNA strand through the use of the Bio-Dictionary, a database of patterns that covers essentially all of the currently available sample of the natural protein sequence space. Our approach relies entirely on the use of redundant patterns as the agents on which the presence or absence of genes is predicated and does not employ any additional evidence, e.g. ribosome-binding site signals. The Bio-Dictionary Gene Finder (BDGF), the algorithm's implementation, is a single computational engine able to handle the gene identification task across distinct archaeal and bacterial genomes. The engine exhibits performance that is characterized by simultaneous very high values of sensitivity and specificity, and a high percentage of correctly predicted start sites. Using a collection of patterns derived from an old (June 2000) release of the Swiss-Prot/TrEMBL database that contained 451 602 proteins and fragments, we demonstrate our method's generality and capabilities through an extensive analysis of 17 complete archaeal and bacterial genomes. Examples of previously unreported genes are also shown and discussed in detail. PMID:12060689

  11. Improved Multiplex PCR Using Conserved and Species-Specific 16S rRNA Gene Primers for Simultaneous Detection of Actinobacillus actinomycetemcomitans, Bacteroides forsythus, and Porphyromonas gingivalis

    PubMed Central

    Tran, Simon Dangtuan; Rudney, Joel D.

    1999-01-01

    Among putative periodontal pathogens, Actinobacillus actinomycetemcomitans, Bacteroides forsythus, and Porphyromonas gingivalis are most convincingly implicated as etiological agents in periodontitis. Therefore, techniques for detection of those three species would be of value. We previously published a description of a multiplex PCR that detects A. actinomycetemcomitans and P. gingivalis. The present paper presents an improvement on that technique, which now allows more sensitive detection of all three periodontal pathogens. Sensitivity was determined by testing serial dilutions of A. actinomycetemcomitans, B. forsythus, and P. gingivalis cells. Primer specificity was tested against (i) all gene sequences from the GenBank-EMBL database, (ii) six A. actinomycetemcomitans, one B. forsythus, and four P. gingivalis strains, (iii) eight different species of oral bacteria, and (iv) supra- and subgingival plaque samples from 20 healthy subjects and subgingival plaque samples from 10 patients with periodontitis. The multiplex PCR had a detection limit of 10 A. actinomycetemcomitans, 10 P. gingivalis, and 100 B. forsythus cells. Specificity was confirmed by the fact that (i) none of our forward primers were homologous to the 16S rRNA genes of other oral species, (ii) amplicons of predicted size were detected for all A. actinomycetemcomitans, B. forsythus, and P. gingivalis strains tested, and (iii) no amplicons were detected for the eight other bacterial species. A. actinomycetemcomitans, B. forsythus, and P. gingivalis were detected in 6 of 20, 1 of 20, and 11 of 20 of supragingival plaque samples, respectively, and 4 of 20, 7 of 20, and 13 of 20 of subgingival plaque samples, respectively, from periodontally healthy subjects. Among patients with periodontitis, the organisms were detected in 7 of 10, 10 of 10, and 7 of 10 samples, respectively. The simultaneous detection of three periodontal pathogens is an advantage of this technique over conventional PCR assays. PMID

  12. Structural-functional characterization of the cathodic haemoglobin of the conger eel Conger conger: molecular modelling study of an additional phosphate-binding site.

    PubMed Central

    Pellegrini, Mariagiuseppina; Giardina, Bruno; Verde, Cinzia; Carratore, Vito; Olianas, Alessandra; Sollai, Luigi; Sanna, Maria T; Castagnola, Massimo; di Prisco, Guido

    2003-01-01

    The protein sequence data for the alpha- and beta-chains have been deposited in the SWISS-PROT and TrEMBL protein knowledgebase under the accession numbers P83479 and P83478 respectively. The Conger conger (conger eel) haemoglobin (Hb) system is made of three components, one of which, the so-called cathodic Hb, representing approx. 20% of the total pigment, has been purified and characterized from both a structural and functional point of view. Stripped Hb showed a reverse Bohr effect, high oxygen affinity and slightly low cooperativity in the absence of any effector. Addition of saturating GTP strongly influences the pH dependence of the oxygen affinity, since the reverse Bohr effect, observed under stripped conditions, is converted into a small normal Bohr effect. A further investigation of the GTP effect on oxygen affinity, carried out by fitting its titration curve, demonstrated the presence of two independent binding sites. Therefore, on the basis of the amino acid sequence of the alpha- and beta-chains, which have been determined, a computer modelling study has been performed. The data suggest that C. conger cathodic Hb may bind organic phosphates at two distinct binding sites located along the central cavity of the tetramer by hydrogen bonds and/or electrostatic interactions with amino acid residues of both chains, which have been identified. Among these residues, the two Lys-alpha(G6) (where the letter refers to the haemoglobin helix and the number to the amino acid position in the helix) appear to have a key role in the GTP movement from the external binding region to the internal central cavity of the tetrameric molecule. PMID:12646043

  13. Differentiation of Bacillus pumilus and Bacillus safensis Using MALDI-TOF-MS

    PubMed Central

    Branquinho, Raquel; Sousa, Clara; Lopes, João; Pintado, Manuela E.; Peixe, Luísa V.; Osório, Hugo

    2014-01-01

    Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS), despite being increasingly used as a method for microbial identification, still presents limitations concerning the differentiation of closely related species. Bacillus pumilus and Bacillus safensis are species of biotechnological and pharmaceutical significance that are difficult to differentiate by conventional methodologies. In this study, using a well-characterized collection of B. pumilus and B. safensis isolates, we demonstrated the suitability of MALDI-TOF-MS combined with chemometrics to accurately and rapidly identify them. Moreover, characteristic species-specific ion masses were tentatively assigned, using UniProtKB/Swiss-Prot and UniProtKB/TrEMBL databases and primary literature. Delineation of the B. pumilus (ions at m/z 5271 and 6122) and B. safensis (ions at m/z 5288, 5568 and 6413) species was supported by a congruent characteristic protein pattern. Moreover, using a chemometric approach, the score plot created by partial least square discriminant analysis (PLSDA) of mass spectra demonstrated the presence of two individualized clusters, each one enclosing isolates belonging to a species-specific spectral group. The generated pool of species-specific proteins comprised mostly ribosomal proteins and SASPs. In B. pumilus the specific ion at m/z 5271 was associated with a small acid-soluble spore protein (SASP O) or with 50S protein L35, whereas in B. safensis specific ions at m/z 5288 and 5568 were associated with SASP J and P, respectively, and an ion at m/z 6413 with 50S protein L32. Thus, the resulting unique protein profile, combined with chemometric analysis, proved to be a valuable tool for B. pumilus and B. safensis discrimination, allowing their reliable, reproducible and rapid identification. PMID:25314655

  14. The 2016 database issue of Nucleic Acids Research and an updated molecular biology database collection.

    PubMed

    Rigden, Daniel J; Fernández-Suárez, Xosé M; Galperin, Michael Y

    2016-01-01

    The 2016 Database Issue of Nucleic Acids Research starts with overviews of the resources provided by three major bioinformatics centers, the U.S. National Center for Biotechnology Information (NCBI), the European Bioinformatics Institute (EMBL-EBI) and Swiss Institute for Bioinformatics (SIB). Also included are descriptions of 62 new databases and updates on 95 databases that have been previously featured in NAR plus 17 previously described elsewhere. A number of papers in this issue deal with resources on nucleic acids, including various kinds of non-coding RNAs and their interactions, molecular dynamics simulations of nucleic acid structure, and two databases of super-enhancers. The protein database section features important updates on the EBI's Pfam, PDBe and PRIDE databases, as well as a variety of resources on pathways, metabolomics and metabolic modeling. This issue also includes updates on popular metagenomics resources, such as MG-RAST, EBI Metagenomics, and probeBASE, as well as a newly compiled Human Pan-Microbe Communities database. A significant fraction of the new and updated databases are dedicated to the genetic basis of disease, primarily cancer, and various aspects of drug research, including resources for patented drugs, their side effects, withdrawn drugs, and potential drug targets. A further six papers present updated databases of various antimicrobial and anticancer peptides. The entire Database Issue is freely available online on the Nucleic Acids Research website (http://nar.oxfordjournals.org/). The NAR online Molecular Biology Database Collection, http://www.oxfordjournals.org/nar/database/c/, has been updated with the addition of 88 new resources and removal of 23 obsolete websites, which brought the current listing to 1685 databases. PMID:26740669

  15. Evolutionary and functional analysis of fructose bisphosphate aldolase of plant parasitic nematodes

    PubMed Central

    Prasad, CVS Siva; Gupta, Saurabh; Kumar, Himansu; Tiwari, Murlidhar

    2013-01-01

    The essential and ubiquitous enzyme fructose bisphosphate aldolase (FBPA) has been a good target for controlling the various types of infections caused by pathogens and parasites. Parasitic infections by nematodes are a major concern of the scientific community, which has led to biochemical characterization of this enzyme. In this work we have developed a small dataset of all types of FBPA sequences collected from publicly available databases (EMBL, NCBI and UniProt). The phylogenetic study shows that the FBPA sequences cluster into three main groups. FBPA sequences of Globodera rostochiensis (FBPA_GR) and Heterodera glycines (FBPA_HG) are placed in group II, sharing a similar evolutionary relationship. The catalytic mechanism of these enzymes depends on the class of aldolase to which they belong. The class of the enzyme has been confirmed on the basis of sequence and structural similarity with the template structure of class I FBPA. To confirm the catalytic mechanism of the above-mentioned model structures, the known substrate fructose-1,6-bisphosphate (FBP) and the competitive inhibitor mannitol-1,6-bisphosphate (MBP) were docked at the known catalytic site of the enzyme of interest. The comparative docking analysis shows that the enzyme-substrate complex forms a similar Schiff base intermediate and carries out C3–C4 bond cleavage through hydrogen bonding with the reaction-catalyzing Glu-191, the reactive Lys-150, and the Schiff base-forming Lys-233. On the other hand, the noncovalent enzyme-inhibitor complex forms a carbinolamine precursor, and proton transfer through a hydrogen bond between MBP O2 and Glu191 stabilizes the carbinolamine transition state, which confirms the similar inhibition mechanism. Thus we conclude that plant parasitic nematodes (PPNs) have an evolutionary and functional relationship with the class I aldolase enzyme. Hence, FBPA can be targeted to control plant parasitic nematodes. PMID:23390337

  16. Combinatorial phenotypic screen uncovers unrecognized family of extended thiourea inhibitors with copper-dependent anti-staphylococcal activity.

    PubMed

    Dalecki, Alex G; Malalasekera, Aruni P; Schaaf, Kaitlyn; Kutsch, Olaf; Bossmann, Stefan H; Wolschendorf, Frank

    2016-04-01

    The continuous rise of multi-drug resistant pathogenic bacteria has become a significant challenge for the health care system. In particular, novel drugs to treat infections of methicillin-resistant Staphylococcus aureus strains (MRSA) are needed, but traditional drug discovery campaigns have largely failed to deliver clinically suitable antibiotics. More than simply new drugs, new drug discovery approaches are needed to combat bacterial resistance. The recently described phenomenon of copper-dependent inhibitors has galvanized research exploring the use of metal-coordinating molecules to harness copper's natural antibacterial properties for therapeutic purposes. Here, we describe the results of the first concerted screening effort to identify copper-dependent inhibitors of Staphylococcus aureus. A standard library of 10 000 compounds was assayed for anti-staphylococcal activity, with hits defined as those compounds with a strict copper-dependent inhibitory activity. A total of 53 copper-dependent hit molecules were uncovered, similar to the copper independent hit rate of a traditionally executed campaign conducted in parallel on the same library. Most prominent was a hit family with an extended thiourea core structure, termed the NNSN motif. This motif resulted in copper-dependent and copper-specific S. aureus inhibition, while simultaneously being well tolerated by eukaryotic cells. Importantly, we could demonstrate that copper binding by the NNSN motif is highly unusual and likely responsible for the promising biological qualities of these compounds. A subsequent chemoinformatic meta-analysis of the ChEMBL chemical database confirmed the NNSNs as an unrecognized staphylococcal inhibitor, despite the family's presence in many chemical screening libraries. Thus, our copper-biased screen has proven able to discover inhibitors within previously screened libraries, offering a mechanism to reinvigorate exhausted molecular collections. PMID:26935206

  17. "Helix Nebula - the Science Cloud", a European Science Driven Cross-Domain Initiative Implemented via an Active PPP Set-Up

    NASA Astrophysics Data System (ADS)

    Lengert, W.; Mondon, E.; Bégin, M. E.; Ferrer, M.; Vallois, F.; DelaMar, J.

    2015-12-01

    Helix Nebula, a European science cross-domain initiative building on an active PPP, is aiming to implement the concept of an open science commons[1] while using a cloud hybrid model[2] as the proposed implementation solution. This approach allows leveraging and merging of complementary data intensive Earth Science disciplines (e.g. instrumentation[3] and modeling), without introducing significant changes in the contributors' operational set-up. Considering the seamless integration with life-science (e.g. EMBL), scientific exploitation of meteorological, climate, and Earth Observation data and models opens an enormous potential for new big data science. The work of Helix Nebula has shown that it is feasible to interoperate publicly funded infrastructures, such as EGI [5] and GEANT [6], with commercial cloud services. Such hybrid systems are in the interest of the existing users of publicly funded infrastructures and funding agencies because they will provide "freedom and choice" over the type of computing resources to be consumed and the manner in which they can be obtained. But to offer such freedom and choice across a spectrum of suppliers, various issues such as intellectual property, legal responsibility, service quality agreements and related issues need to be addressed. Finding solutions to these issues is one of the goals of the Helix Nebula initiative. [1] http://www.egi.eu/news-and-media/publications/OpenScienceCommons_v3.pdf [2] http://www.helix-nebula.eu/events/towards-the-european-open-science-cloud [3] e.g. https://sentinel.esa.int/web/sentinel/sentinel-data-access [5] http://www.egi.eu/ [6] http://www.geant.net/

  18. Amyotrophic Lateral Sclerosis Type 20 - In Silico Analysis and Molecular Dynamics Simulation of hnRNPA1.

    PubMed

    Krebs, Bruna Baumgarten; De Mesquita, Joelma Freire

    2016-01-01

    Amyotrophic Lateral Sclerosis (ALS) is a fatal neurodegenerative disease that affects the upper and lower motor neurons. 5-10% of cases are genetically inherited, including ALS type 20, which is caused by mutations in the hnRNPA1 gene. The goals of this work are to analyze the effects of non-synonymous single nucleotide polymorphisms (nsSNPs) on hnRNPA1 protein function, to model the complete tridimensional structure of the protein using computational methods and to assess structural and functional differences between the wild type and its variants through Molecular Dynamics simulations. nsSNP, PhD-SNP, Polyphen2, SIFT, SNAP, SNPs&GO, SNPeffect and PROVEAN were used to predict the functional effects of nsSNPs. Ab initio modeling of hnRNPA1 was made using Rosetta and refined using KoBaMIN. The structure was validated by PROCHECK, Rampage, ERRAT, Verify3D, ProSA and Qmean. TM-align was used for the structural alignment. FoldIndex, DICHOT, ELM, D2P2, Disopred and DisEMBL were used to predict disordered regions within the protein. Amino acid conservation analysis was assessed by Consurf, and the molecular dynamics simulations were performed using GROMACS. Mutations D314V and D314N were predicted to increase amyloid propensity, and predicted as deleterious by at least three algorithms, while mutation N73S was predicted as neutral by all the algorithms. D314N and D314V occur in a highly conserved amino acid. The Molecular Dynamics results indicate that all mutations increase protein stability when compared to the wild type. Mutants D314N and N319S showed higher overall dimensions and accessible surface when compared to the wild type. The flexibility level of the C-terminal residues of hnRNPA1 is affected by all mutations, which may affect protein function, especially regarding the protein ability to interact with other proteins. PMID:27414033

  19. Intermolecular Vibrations of Hydrophobic Amino Acids

    NASA Astrophysics Data System (ADS)

    Williams, Michael Roy Casselman

    -TDS) was used to measure the absorption spectra of low-frequency vibrational modes for a variety of hydrophobic amino acids in the solid (polycrystalline) state. The THz-TDS technique uses ultrafast (<50 fs) pulses of light from a visible/near-IR laser to generate single-cycle pulses of THz (far-IR) light. Pulses from the ultrafast laser are also used to coherently gate a THz detector, allowing phase-sensitive measurements of the THz electric field. In some cases, Raman scattering spectra of the polycrystalline hydrophobic amino acid samples were measured as well, using an Ar+ laser and a triple monochromator to detect signals at the low Raman-shift values corresponding to the far-IR. THz-TDS was used to measure the low-frequency vibrational absorption spectra of pure L- and pure D-valine crystals as well as the racemic cocrystal, DL-valine. As expected, the L- and D-valine THz-TDS absorption spectra are identical to one another (they are enantiomorphous crystals) but very different from the spectrum of DL-valine. In the process of these experiments, it was discovered that it was possible to prepare two distinct polymorphs (different crystalline arrangements) of DL-valine by varying the conditions under which stock material was recrystallized. Once crystallized in a particular form, both polymorphs remained (meta)stable at all temperatures investigated (from 80 K to room temperature), i.e., no phase transformation was observed. The THz-TDS and Raman spectra of the two polymorphs of DL-valine were measured. In addition, THz-TDS and Raman spectra of DL-leucine were measured; this substance has a crystal structure closely analogous to one of the DL-valine polymorphs. The temperature-dependence of the THz-TDS spectrum of each material was also measured. At lower temperatures, it is generally expected that intermolecular vibration frequencies increase (blueshift) due to a shrinking unit cell (effectively squeezing the oscillator potential into a smaller space

  20. Analysis of bacterial DNA in synovial tissue of Tunisian patients with reactive and undifferentiated arthritis by broad-range PCR, cloning and sequencing

    PubMed Central

    Siala, Mariam; Jaulhac, Benoit; Gdoura, Radhouane; Sibilia, Jean; Fourati, Hela; Younes, Mohamed; Baklouti, Sofien; Bargaoui, Naceur; Sellami, Slaheddine; Znazen, Abir; Barthel, Cathy; Collin, Elody; Hammami, Adnane; Sghir, Abdelghani

    2008-01-01

    Introduction: Bacteria and/or their antigens have been implicated in the pathogenesis of reactive arthritis (ReA). Several studies have reported the presence of bacterial antigens and nucleic acids of bacteria other than those specified by diagnostic criteria for ReA in joint specimens from patients with ReA and various arthritides. The present study was conducted to detect any bacterial DNA and identify bacterial species that are present in the synovial tissue of Tunisian patients with reactive arthritis and undifferentiated arthritis (UA) using PCR, cloning and sequencing. Methods: We examined synovial tissue samples from 28 patients: six patients with ReA and nine with UA, and a control group consisting of seven patients with rheumatoid arthritis and six with osteoarthritis (OA). Using broad-range bacterial PCR producing a 1,400-base-pair fragment from the 16S rRNA gene, at least 24 clones were sequenced for each synovial tissue sample. To identify the corresponding bacteria, DNA sequences were compared with sequences from the EMBL (European Molecular Biology Laboratory) database. Results: Bacterial DNA was detected in 75% of the 28 synovial tissue samples. DNA from 68 different bacterial species was found in ReA and UA samples, whereas DNA from 12 bacteria was detected in control group samples. Most of the bacterial DNAs detected were from skin or intestinal bacteria. DNA from bacteria known to trigger ReA, such as Shigella flexneri and Shigella sonnei, was detected in ReA and UA samples of synovial tissue and not in control samples. DNA from various bacterial species detected in this study has not previously been found in synovial samples. Conclusion: This study is the first to use broad-range PCR targeting the full 16S rRNA gene for detection of bacterial DNA in synovial tissue. We detected DNA from a wide spectrum of bacterial species, including those known to be involved in ReA and others not previously associated with ReA or related arthritis. The pathogenic

  1. Drug Design for CNS Diseases: Polypharmacological Profiling of Compounds Using Cheminformatic, 3D-QSAR and Virtual Screening Methodologies

    PubMed Central

    Nikolic, Katarina; Mavridis, Lazaros; Djikic, Teodora; Vucicevic, Jelica; Agbaba, Danica; Yelekci, Kemal; Mitchell, John B. O.

    2016-01-01

    HIGHLIGHTS: Many CNS targets are being explored for multi-target drug design. New databases and cheminformatic methods enable prediction of primary pharmaceutical target and off-targets of compounds. QSAR, virtual screening and docking methods increase the potential of rational drug design. The diverse cerebral mechanisms implicated in Central Nervous System (CNS) diseases together with the heterogeneous and overlapping nature of phenotypes indicate that multitarget strategies may be appropriate for the improved treatment of complex brain diseases. Understanding how the neurotransmitter systems interact is also important in optimizing therapeutic strategies. Pharmacological intervention on one target will often influence another one, such as the well-established serotonin-dopamine interaction or the dopamine-glutamate interaction. It is now accepted that drug action can involve plural targets and that polypharmacological interaction with multiple targets, to address disease in more subtle and effective ways, is a key concept for development of novel drug candidates against complex CNS diseases. A multi-target therapeutic strategy for Alzheimer's disease resulted in the development of very effective Multi-Target Designed Ligands (MTDL) that act on both the cholinergic and monoaminergic systems, and also retard the progression of neurodegeneration by inhibiting amyloid aggregation. Many compounds already in databases have been investigated as ligands for multiple targets in drug-discovery programs. A probabilistic method, the Parzen-Rosenblatt Window approach, was used to build a "predictor" model using data collected from the ChEMBL database. The model can be used to predict both the primary pharmaceutical target and off-targets of a compound based on its structure. Several multi-target ligands were selected for further study, as compounds with possible additional beneficial pharmacological activities. Based on all these findings, it is concluded that multipotent

  2. De Novo Characterization of the Mung Bean Transcriptome and Transcriptomic Analysis of Adventitious Rooting in Seedlings Using RNA-Seq

    PubMed Central

    Li, Shi-Weng; Shi, Rui-Fang; Leng, Yan

    2015-01-01

    Adventitious rooting is the most important mechanism underlying vegetative propagation and an important strategy for plant propagation under environmental stress. The present study was conducted to obtain transcriptomic data and examine gene expression using RNA-Seq and bioinformatics analysis, thereby providing a foundation for understanding the molecular mechanisms controlling adventitious rooting. Three cDNA libraries constructed from mRNA samples from mung bean hypocotyls during adventitious rooting were sequenced. These three samples generated a total of 73 million, 60 million, and 59 million 100-bp reads, respectively. These reads were assembled into 78,697 unigenes with an average length of 832 bp, totaling 65 Mb. The unigenes were aligned against six public protein databases, and 29,029 unigenes (36.77%) were annotated using BLASTx. Among them, 28,225 (35.75%) and 28,119 (35.62%) unigenes had homologs in the TrEMBL and NCBI non-redundant (Nr) databases, respectively. Of these unigenes, 21,140 were assigned to gene ontology classes, and a total of 11,990 unigenes were classified into 25 KOG functional categories. A total of 7,357 unigenes were annotated to 4,524 KOs, and 4,651 unigenes were mapped onto 342 KEGG pathways using BLAST comparison against the KEGG database. A total of 11,717 unigenes were differentially expressed (fold change>2) during the root induction stage, with 8,772 unigenes down-regulated and 2,945 unigenes up-regulated. A total of 12,737 unigenes were differentially expressed during the root initiation stage, with 9,303 unigenes down-regulated and 3,434 unigenes up-regulated. A total of 5,334 unigenes were differentially expressed between the root induction and initiation stage, with 2,167 unigenes down-regulated and 3,167 unigenes up-regulated. qRT-PCR validation of the 39 genes with known functions indicated a strong correlation (92.3%) with the RNA-Seq data. The GO enrichment, pathway mapping, and gene expression profiles reveal
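
The counts quoted in this record can be cross-checked with simple arithmetic. The sketch below is only an illustration and is not part of the study: it takes the figures as printed in the abstract and checks that each differentially expressed total equals the sum of its down- and up-regulated counts, and that the assembly size follows from the unigene count and average length. The variable and function names are hypothetical.

```python
# Figures copied from the abstract: (total DE unigenes, down-regulated, up-regulated).
# The dictionary keys are illustrative labels, not terms defined by the authors.
de_counts = {
    "root induction": (11_717, 8_772, 2_945),
    "root initiation": (12_737, 9_303, 3_434),
    "induction vs initiation": (5_334, 2_167, 3_167),
}


def is_consistent(total, down, up):
    """Return True when the reported total equals down + up."""
    return total == down + up


for stage, (total, down, up) in de_counts.items():
    print(f"{stage}: total matches down + up -> {is_consistent(total, down, up)}")

# Approximate assembly size: number of unigenes times the (rounded) average length.
n_unigenes = 78_697
mean_length_bp = 832
assembly_bp = n_unigenes * mean_length_bp
print(f"approximate assembly size: {assembly_bp / 1e6:.1f} Mb")
```

Because the average length is itself rounded, the product only approximates the "65 Mb" total given in the abstract.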

  3. Sequential Application of Ligand and Structure Based Modeling Approaches to Index Chemicals for Their hH4R Antagonism

    PubMed Central

    Basile, Livia; Milardi, Danilo; Zeidan, Mouhammed; Raiyn, Jamal; Guccione, Salvatore; Rayan, Anwar

    2014-01-01

    The human histamine H4 receptor (hH4R), a member of the G-protein coupled receptors (GPCR) family, is an increasingly attractive drug target. It plays a key role in many cell pathways and many hH4R ligands are studied for the treatment of several inflammatory, allergic and autoimmune disorders, as well as for analgesic activity. Due to the difficulties in the experimental elucidation of hH4R structure, virtual screening campaigns are normally run on homology based models. However, a wealth of information about the chemical properties of GPCR ligands has also accumulated over the last few years and an appropriate combination of this ligand-based knowledge with structure-based molecular modeling studies emerges as a promising strategy for computer-assisted drug design. Here, two chemoinformatics techniques, the Intelligent Learning Engine (ILE) and Iterative Stochastic Elimination (ISE) approach, were used to index chemicals for their hH4R bioactivity. An application of the prediction model on an external test set composed of more than 160 hH4R antagonists picked from the ChEMBL database gave an enrichment factor of 16.4. A virtual high throughput screening on the ZINC database was carried out, picking ∼4000 chemicals highly indexed as H4R antagonist candidates. Next, a series of 3D models of hH4R were generated by molecular modeling and molecular dynamics simulations performed in fully atomistic lipid membranes. The efficacy of the hH4R 3D models in discriminating between actives and non-actives was checked and the 3D model with the best performance was chosen for further docking studies performed on the focused library. The output of these docking studies was a consensus library of 11 highly active scored drug candidates. Our findings suggest that a sequential combination of ligand-based chemoinformatics approaches with structure-based ones has the potential to improve the success rate in discovering new biologically active GPCR drugs and increase the

  4. Molecular analysis of the glpFKX regions of Escherichia coli and Shigella flexneri.

    PubMed Central

    Truniger, V; Boos, W; Sweet, G

    1992-01-01

    We have identified a new gene, glpX, belonging to the glp regulon of Escherichia coli, located directly downstream of the glpK gene. The transcription of glpX is inducible with glycerol and sn-glycerol-3-phosphate and is constitutive in a glpR mutant. glpX is the third gene in the glpFKX operon. The function of GlpX remains unknown. GlpX has an apparent molecular weight of 40,000 on sodium dodecyl sulfate-polyacrylamide gels. In addition to determining the E. coli glpX sequence, we also sequenced the corresponding glpFKX region originating from Shigella flexneri, which after transfer into E. coli was instrumental in elucidating the function of glpF in glycerol transport (D. P. Richey and E. C. C. Lin, J. Bacteriol. 112:784-790, 1972). Sequencing of the glpFKX region of this hybrid strain revealed an amber mutation instead of the tryptophan 215 codon in glpF. The most striking difference between the E. coli and S. flexneri DNA was found directly behind glpK, where two repetitive (REP) sequences were present in S. flexneri, but not in the E. coli sequence. The presence or absence of these REP sequences had no effect on transport or on growth on glycerol. Not including the REP sequence-containing region, only 1.1% of a total of 2,167 bp sequenced was different in the two sequences. Comparison of the sequence with those in the EMBL data library revealed a 99% identity between the last third of glpX and the first part of a gene called mvrA. We show that the cloned mvrA gene (M. Morimyo, J. Bacteriol. 170:2136-2142, 1988) originated from the 88-min region of the Escherichia coli chromosome and not, as reported, from the 7-min region and that the gene product identified as MvrA is in fact encoded by a gene distal to glpX. PMID:1400248

  5. The genetic diversity of Citrus dwarfing viroid populations is mainly dependent on the infected host species.

    PubMed

    Tessitori, Matilde; Rizza, Serena; Reina, Antonella; Causarano, Giovanni; Di Serio, Francesco

    2013-03-01

    As with viruses, viroids infect their hosts as polymorphic populations of variants. Identifying possible sources of genetic variability is significant in the case of the species Citrus dwarfing viroid (CDVd) which has been proposed as a dwarfing agent for high-density citrus plantings. Here, a natural CDVd isolate (CMC) was used as an inoculum source for long-term (25 years) and short-term (1 year) bioassays in different citrus host species. Characterization of progenies indicated that the genetic stability of CDVd populations was high in certain hosts (trifoliate orange, Troyer citrange, Etrog citron, Navelina sweet orange), which preserve viroid populations similar to the original CMC isolate even after 25 years. By contrast, CDVd variant populations in Interdonato lemon and Volkamer lemon were completely different to those in the inoculated sources, highlighting how influential the host is on the genetic variability of CDVd populations. Implications for risk assessment of CDVd as a dwarfing agent are discussed. The GenBank/EMBL/DDBJ accession numbers for the complete sequences of the Citrus dwarfing viroid variants are JF970266.1 for H2-2, JF970267.1 for H2-7, EU938647.1 for H6-2, EU938651.1 for H6-10, JF970268.1 for H10-7, EU938652.1 for H14-13, EU938653.1 for H14-14, JF970269.1 for H14-16, EU938648.1 for H15-9, EU938649.1 for H16-2, JF970265.1 for H16-9, EU938654.1 for H16-13, EU938650.1 for H20-3, JF970270.1 for H20-7, EU938641.1 for PR-1, EU938642.1 for PR-3, EU938643.1 for PR-7, EU938644.1 for CR-1, EU938639.1 for VR-4, JF12070.1 for VR-15, JF812069.1 for LS-4, EU938640.1 for LS-10 and JF970264.1 for LS-11. PMID:23152366

  6. Genomic Anatomy of a Premier Major Histocompatibility Complex Paralogous Region on Chromosome 1q21–q22

    PubMed Central

    Shiina, Takashi; Ando, Asako; Suto, Yumiko; Kasai, Fumio; Shigenari, Atsuko; Takishima, Nobusada; Kikkawa, Eri; Iwata, Kyoko; Kuwano, Yuko; Kitamura, Yuka; Matsuzawa, Yumiko; Sano, Kazumi; Nogami, Masahiro; Kawata, Hisako; Li, Suyun; Fukuzumi, Yasuhito; Yamazaki, Masaaki; Tashiro, Hiroyuki; Tamiya, Gen; Kohda, Atsushi; Okumura, Katsuzumi; Ikemura, Toshimichi; Soeda, Eiichi; Mizuki, Nobuhisa; Kimura, Minoru; Bahram, Seiamak; Inoko, Hidetoshi

    2001-01-01

    Human chromosomes 1q21–q25, 6p21.3–22.2, 9q33–q34, and 19p13.1–p13.4 carry clusters of paralogous loci, to date best defined by the flagship 6p MHC region. They have presumably been created by two rounds of large-scale genomic duplications around the time of vertebrate emergence. Phylogenetically, the 1q21–25 region seems most closely related to the 6p21.3 MHC region, as it is only the MHC paralogous region that includes bona fide MHC class I genes, the CD1 and MR1 loci. Here, to clarify the genomic structure of this model MHC paralogous region as well as to gain insight into the evolutionary dynamics of the entire quadriplication process, a detailed analysis of a critical 1.7 megabase (Mb) region was performed. To this end, a composite, deep, YAC, BAC, and PAC contig encompassing all five CD1 genes and linking the centromeric +P5 locus to the telomeric KRTC7 locus was constructed. Within this contig a 1.1-Mb BAC and PAC core segment joining CD1D to FCER1A was fully sequenced and thoroughly analyzed. This led to the mapping of a total of 41 genes (12 expressed genes, 12 possibly expressed genes, and 17 pseudogenes), among which 31 were novel. The latter include 20 olfactory receptor (OR) genes, 9 of which are potentially expressed. Importantly, CD1, SPTA1, OR, and FCER1A belong to multigene families, which have paralogues in the other three regions. Furthermore, it is noteworthy that 12 of the 13 expressed genes in the 1q21–q22 region around the CD1 loci are immunologically relevant. In addition to CD1A-E, these include SPTA1, MNDA, IFI-16, AIM2, BL1A, FY and FCER1A. This functional convergence of structurally unrelated genes is reminiscent of the 6p MHC region, and perhaps represents the emergence of yet another antigen presentation gene cluster, in this case dedicated to lipid/glycolipid antigens rather than antigen-derived peptides. [The nucleotide sequence data reported in this paper have been submitted to the DDBJ, EMBL, and GenBank databases under

  7. Synthesis of the vitamin E amino acid esters with an enhanced anticancer activity and in silico screening for new antineoplastic drugs.

    PubMed

    Gagic, Zarko; Ivkovic, Branka; Srdic-Rajic, Tatjana; Vucicevic, Jelica; Nikolic, Katarina; Agbaba, Danica

    2016-06-10

    virtual screening of the ChEMBL database identified new compounds with a potential antiproliferative activity on MCF-7 and on multi-drug resistant MDA-MB 231 breast cancer cells. PMID:27063330

  8. Ectopic pregnancy in a semi-rural region of Africa: epidemiological, clinical and therapeutic aspects in a series of 74 cases treated at the Sangmelima District Hospital in southern Cameroon

    PubMed Central

    Kenfack, Bruno; Noubom, Michel; Bongoe, Adamo; Tsatedem, Faustin Atemkeng; Ngono, Modeste; Tsague, Georges Nguefack; Mboudou, Emile

    2012-01-01

    Ectopic pregnancy (EP) is a frequent cause of morbidity, and sometimes mortality, among women of reproductive age. Its etiology is not clearly established. Its clinical presentation is polymorphic and its treatment methods are highly varied. This work was carried out to study its epidemiological, clinical and therapeutic aspects in a resource-limited rural area of Africa. This is a descriptive cross-sectional study over a period of three years, covering 74 cases of EP treated at the Sangmelima District Hospital. The material used consisted of an anonymous data collection form, patient records, and the operating theatre register. During the study period, 2142 live births were recorded, giving an EP rate of 3.45%. Unmarried women and those with a history of STIs were the most affected. The mean delay between symptom onset and admission was 132 h. The mean gestational age at diagnosis was 8.14 weeks. The diagnosis was clinical in 61% of cases. The contralateral adnexa was clinically normal in 53% of cases. Treatment was surgical from the outset in 97% of cases. No deaths were observed. EP is frequent in this rural area; patients present at a late stage, diagnosis is mainly clinical, and treatment is surgical by laparotomy. PMID:23396682
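
The 3.45% rate quoted in this record appears to be the number of cases expressed per 100 live births. A minimal sketch of that arithmetic, assuming this is the definition used (the abstract does not spell it out) and using hypothetical variable names:

```python
# Figures quoted in the abstract.
cases = 74            # ectopic pregnancies treated during the study period
live_births = 2142    # live births recorded over the same period

# Assumption: the rate is cases per 100 live births.
rate_percent = cases / live_births * 100
print(f"rate: {rate_percent:.2f}%")
```

Under this assumption the result rounds to the 3.45% given in the abstract.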

  9. Implications of structural genomics target selection strategies: Pfam5000, whole genome, and random approaches

    SciTech Connect

    Chandonia, John-Marc; Brenner, Steven E.

    2004-07-14

    The structural genomics project is an international effort to determine the three-dimensional shapes of all important biological macromolecules, with a primary focus on proteins. Target proteins should be selected according to a strategy which is medically and biologically relevant, of good value, and tractable. As an option to consider, we present the Pfam5000 strategy, which involves selecting the 5000 most important families from the Pfam database as sources for targets. We compare the Pfam5000 strategy to several other proposed strategies that would require similar numbers of targets. These include complete solution of several small to moderately sized bacterial proteomes, partial coverage of the human proteome, and random selection of approximately 5000 targets from sequenced genomes. We measure the impact that successful implementation of these strategies would have upon structural interpretation of the proteins in Swiss-Prot, TrEMBL, and 131 complete proteomes (including 10 of eukaryotes) from the Proteome Analysis database at EBI. Solving the structures of proteins from the 5000 largest Pfam families would allow accurate fold assignment for approximately 68 percent of all prokaryotic proteins (covering 59 percent of residues) and 61 percent of eukaryotic proteins (40 percent of residues). More fine-grained coverage which would allow accurate modeling of these proteins would require an order of magnitude more targets. The Pfam5000 strategy may be modified in several ways, for example to focus on larger families, bacterial sequences, or eukaryotic sequences; as long as secondary consideration is given to large families within Pfam, coverage results vary only slightly. In contrast, focusing structural genomics on a single tractable genome would have only a limited impact on structural knowledge of other proteomes: a significant fraction (about 30-40 percent of the proteins, and 40-60 percent of the residues) of each proteome is classified in small

  10. Helix Nebula: Enabling federation of existing data infrastructures and data services to an overarching cross-domain e-infrastructure

    NASA Astrophysics Data System (ADS)

    Lengert, Wolfgang; Farres, Jordi; Lanari, Riccardo; Casu, Francesco; Manunta, Michele; Lassalle-Balier, Gerard

    2014-05-01

    Helix Nebula has established a growing public private partnership of more than 30 commercial cloud providers, SMEs, and publicly funded research organisations and e-infrastructures. The Helix Nebula strategy is to establish a federated cloud service across Europe. Three high-profile flagships, sponsored by CERN (high energy physics), EMBL (life sciences) and ESA/DLR/CNES/CNR (earth science), have been deployed and extensively tested within this federated environment. The commitments behind these initial flagships have created a critical mass that attracts suppliers and users to the initiative, to work together towards an "Information as a Service" market place. Significant progress in implementing the following 4 programmatic goals (as outlined in the strategic Plan Ref.1) has been achieved. Goal #1: Establish a Cloud Computing Infrastructure for the European Research Area (ERA) serving as a platform for innovation and evolution of the overall infrastructure. Goal #2: Identify and adopt suitable policies for trust, security and privacy on a European-level can be provided by the European Cloud Computing framework and infrastructure. Goal #3: Create a light-weight governance structure for the future European Cloud Computing Infrastructure that involves all the stakeholders and can evolve over time as the infrastructure, services and user-base grow. Goal #4: Define a funding scheme involving the three stakeholder groups (service suppliers, users, EC and national funding agencies) into a Public-Private-Partnership model to implement a Cloud Computing Infrastructure that delivers a sustainable business environment adhering to European level policies. Now in 2014 a first version of this generic cross-domain e-infrastructure is ready to go into operations building on federation of European industry and contributors (data, tools, knowledge, ...). This presentation describes how Helix Nebula is being used in the domain of earth science focusing on geohazards. The

  11. Transcriptome Sequencing and Differential Gene Expression Analysis of Delayed Gland Morphogenesis in Gossypium australe during Seed Germination

    PubMed Central

    Tao, Tao; Zhao, Liang; Lv, Yuanda; Chen, Jiedan; Hu, Yan; Zhang, Tianzhen; Zhou, Baoliang

    2013-01-01

    The genus Gossypium is a globally important crop that is used to produce textiles, oil and protein. However, gossypol, which is found in cultivated cottonseed, is toxic to humans and non-ruminant animals. Efforts have been made to breed improved cultivated cotton with lower gossypol content. The delayed gland morphogenesis trait possessed by some Australian wild cotton species may enable the widespread, direct usage of cottonseed. However, the mechanisms underlying delayed gland morphogenesis are still unknown. Here, we sequenced the first Australian wild cotton species (Gossypium australe) and a diploid cotton species (Gossypium arboreum) using the Illumina Hiseq 2000 RNA-seq platform to help elucidate the mechanisms underlying gossypol synthesis and gland development. Paired-end Illumina short reads were de novo assembled into 226,184, 213,257 and 275,434 transcripts, clustering into 61,048, 47,908 and 72,985 individual clusters with N50 lengths of 1,710 bp, 1,544 bp and 1,743 bp, respectively. The clustered Unigenes were searched against three public protein databases (TrEMBL, SwissProt and RefSeq) and the nucleotide and protein sequences of Gossypium raimondii using BLASTx and BLASTn. A total of 21,987, 17,209 and 25,325 Unigenes were annotated. Of these, 18,766 (85.4%), 14,552 (84.6%) and 21,374 (84.4%) Unigenes could be assigned to GO-term classifications. We identified and analyzed 13,884 differentially expressed Unigenes by clustering and functional enrichment. Terpenoid-related biosynthesis pathways showed differentially regulated expression patterns between the two cotton species. Phylogenetic analysis of the terpene synthases family was also carried out to clarify the classifications of TPSs. RNA-seq data from two distinct cotton species provide comprehensive transcriptome annotation resources and global gene expression profiles during seed germination and gland and gossypol formation. These data may be used to further elucidate various mechanisms and help
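
The GO-assignment percentages in this record follow directly from the paired counts. The sketch below is an illustration only: it recomputes each percentage from the numbers printed in the abstract, taken in the order listed. The labels "set 1" to "set 3" are hypothetical, since the abstract does not name the three assemblies.

```python
# (annotated Unigenes, Unigenes assigned to GO terms), in the order given in the abstract.
# The keys are placeholder labels; the abstract does not name the three sets.
annotation_counts = {
    "set 1": (21_987, 18_766),
    "set 2": (17_209, 14_552),
    "set 3": (25_325, 21_374),
}


def go_fraction_percent(annotated, go_assigned):
    """Share of annotated Unigenes that received a GO term, in percent."""
    return go_assigned / annotated * 100


go_percent = {
    name: round(go_fraction_percent(annotated, go_assigned), 1)
    for name, (annotated, go_assigned) in annotation_counts.items()
}
for name, pct in go_percent.items():
    print(f"{name}: {pct}% of annotated Unigenes assigned to GO terms")
```

Rounded to one decimal place, these agree with the 85.4%, 84.6% and 84.4% reported in the abstract.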

  12. Malignant melanoma: a rare tumor of the nasal cavities - a series of 10 cases

    PubMed Central

    Errachdi, Amal; Epala, Brice Nkoua; Asabbane, Amal; Kabbali, Naoual; Hemmich, Mariem; Kebdani, Tayeb; Benjaafar, Noureddine

    2014-01-01

    Malignant melanoma of the nasal cavities is a rare but very aggressive tumor, with complex treatment and a poor prognosis. Its treatment is in principle essentially surgical, supplemented by radiotherapy. The aim of this work is to report the clinical, therapeutic and outcome characteristics of melanomas of the nasal cavities. We retrospectively analyzed 10 cases of nasal cavity melanoma followed at the National Institute of Oncology in Rabat. Rhinoscopy with biopsy provided histological confirmation of the diagnosis of melanoma. The staging work-up included computed tomography or magnetic resonance imaging of the facial skeleton, a chest X-ray and an abdominal ultrasound. In our series, the median age was 67.5 years, with a female predominance (7 women and 3 men). The median time to discovery was 6 months. Two patients were metastatic from the outset, and all tumors were locally advanced at diagnosis. Seven patients underwent surgery, with involved surgical margins in 2 cases, and 3 patients were inoperable. Two patients were irradiated after surgery and 2 patients received chemotherapy, which was stopped at the time of progression. Two patients relapsed after treatment, and one patient was in poor general condition and received palliative care only. All patients died, with a median survival of 12 months. Mucosal malignant melanoma of the nasal cavities, although rare, remains a condition with a poor prognosis and poses management problems. PMID:25404963

  13. Cloning and expression of the liver and muscle isoforms of ovine carnitine palmitoyltransferase 1: residues within the N-terminus of the muscle isoform influence the kinetic properties of the enzyme.

    PubMed Central

    Price, Nigel T; Jackson, Vicky N; van der Leij, Feike R; Cameron, Jacqueline M; Travers, Maureen T; Bartelds, Beatrijs; Huijkman, Nicolette C; Zammit, Victor A

    2003-01-01

    The nucleotide sequence data reported will appear in DDBJ, EMBL, GenBank(R) and GSDB Nucleotide Sequence Databases; the sequences of ovine CPT1A and CPT1B cDNAs have the accession numbers Y18387 and AJ272435 respectively and the partial adipose tissue and liver CPT1A clones have the accession numbers Y18830 and Y18829 respectively. Fatty acid and ketone body metabolism differ considerably between monogastric and ruminant species. The regulation of the key enzymes involved may differ accordingly. Carnitine palmitoyltransferase 1 (CPT 1) is the key locus for the control of long-chain fatty acid beta-oxidation and liver ketogenesis. Previously we showed that CPT 1 kinetics in sheep and rat liver mitochondria differ. We cloned cDNAs for both isoforms [liver- (L-) and muscle- (M-)] of ovine CPT 1 in order to elucidate the structural features of these proteins and their genes (CPT1A and CPT1B). Their deduced amino acid sequences show a high degree of conservation compared with orthologues from other mammalian species, with the notable exception of the N-terminus of ovine M-CPT 1. These differences were also present in bovine M-CPT 1, whose N-terminal sequence we determined. In addition, the 5'-end of the sheep CPT1B cDNA suggested a different promoter architecture when compared with previously characterized CPT1B genes. Northern blotting revealed differences in tissue distribution for both CPT1A and CPT1B transcripts compared with other species. In particular, ovine CPT1B mRNA was less tissue restricted, and the predominant transcript in the pancreas was CPT1B. Expression in yeast allowed kinetic characterization of the two native enzymes, and of a chimaera in which the distinctive N-terminal segment of ovine M-CPT 1 was replaced with that from rat M-CPT 1. The ovine N-terminal segment influences the kinetics of the enzyme for both its substrates, such that the K(m) for palmitoyl-CoA is decreased and that for carnitine is increased for the chimaera, relative to the

  14. Mappability of drug-like space: towards a polypharmacologically competent map of drug-relevant compounds.

    PubMed

    Sidorov, Pavel; Gaspar, Helena; Marcou, Gilles; Varnek, Alexandre; Horvath, Dragos

    2015-12-01

    Intuitive, visual rendering--mapping--of high-dimensional chemical spaces (CS) is an important topic in chemoinformatics. Such maps have so far been dedicated to specific compound collections--either limited series of known activities, or large, even exhaustive enumerations of molecules, but without associated property data. Typically, they were challenged to answer some classification problem with respect to those same molecules, admired for their aesthetic virtues and then forgotten--because they were set-specific constructs. This work addresses the question of whether a general, compound set-independent map can be generated, and the claim of "universality" quantitatively justified, with respect to all the structure-activity information available so far--or, more realistically, an exploitable but significant fraction thereof. The "universal" CS map is expected to project molecules from the initial CS into a lower-dimensional space that is neighborhood behavior-compliant with respect to a large panel of ligand properties. Such a map should be able to discriminate actives from inactives, or even support quantitative neighborhood-based, parameter-free property prediction (regression) models, for a wide panel of targets and target families. It should be polypharmacologically competent, without requiring any target-specific parameter fitting. This work describes an evolutionary growth procedure for such maps, based on generative topographic mapping, followed by the validation of their polypharmacological competence. Validation was achieved with respect to a maximum of exploitable structure-activity information, covering all of the Homo sapiens proteins in the ChEMBL database, antiparasitic and antiviral data, etc. Five evolved maps satisfactorily solved hundreds of activity-based ligand classification challenges for targets, and even in vivo properties independent from training data. They also withstood chemogenomics-related challenges, as cumulated responsibility vectors

  15. Sequence analysis of PER-1 extended-spectrum beta-lactamase from Pseudomonas aeruginosa and comparison with class A beta-lactamases.

    PubMed Central

    Nordmann, P; Naas, T

    1994-01-01

    We have determined the nucleotide sequence (EMBL accession number Z21957) of the cloned chromosomal PER-1 extended-spectrum beta-lactamase gene from a Pseudomonas aeruginosa RNL-1 clinical isolate. blaPER-1 corresponds to a 924-bp open reading frame which encodes a polypeptide of 308 amino acids. This open reading frame is preceded by a -10 and a -35 region consistent with a putative P. aeruginosa promoter. Primer extension analysis of the PER-1 mRNA start revealed that this promoter was active in P. aeruginosa but not in Escherichia coli, in which PER-1 expression was driven by vector promoter sequences. N-terminal sequencing identified the PER-1 26-amino-acid leader peptide and enabled us to calculate the molecular mass (30.8 kDa) of the PER-1 mature form. Analysis of the percent GC content of blaPER-1 and of its 5' upstream sequences, as well as the codon usage for blaPER-1, indicated that blaPER-1 may have been inserted into P. aeruginosa genomic DNA from a nonpseudomonad bacterium. The PER-1 gene showed very low homology with other beta-lactamase genes at the DNA level. By using computer methods, assessment of the extent of identity between PER-1 and 10 beta-lactamase amino acid sequences indicated that PER-1 is a class A beta-lactamase. PER-1 shares around 27% amino acid identity with the sequenced extended-spectrum beta-lactamases of the TEM-SHV series and MEN-1 from Enterobacteriaceae species. The use of parsimony methods showed that PER-1 is not more closely related to gram-negative than to gram-positive bacterial class A beta-lactamases. Surprisingly, among class A beta-lactamases, PER-1 was most closely related to the recently reported CFXA from Bacteroides vulgatus, with which it shared 40% amino acid identity. This work indicates that non-Enterobacteriaceae species such as P. aeruginosa may possess class A extended-spectrum beta-lactamase genes possibly resulting from intergeneric DNA transfer. PMID:8141562
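
    Two quantities in this abstract are easy to check by hand: the percent GC content of a sequence and the number of codons in an open reading frame. The following is a minimal sketch under stated assumptions; the helper names `gc_percent` and `codons_in_orf` are hypothetical, the example sequence is a toy string rather than blaPER-1, and the codon count matches the reported 308 amino acids only if the quoted 924 bp excludes the stop codon.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100 * gc / len(seq)


def codons_in_orf(length_bp):
    """Number of complete codons in an open reading frame of the given length."""
    if length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return length_bp // 3


toy_gc = gc_percent("ATGCGC")       # toy sequence, not the real gene
per1_codons = codons_in_orf(924)    # ORF length quoted in the abstract
```

    Comparing `gc_percent` for a gene with the value for its host genome is the kind of check the abstract alludes to when it suggests a nonpseudomonad origin; the sketch does not reproduce the authors' codon-usage analysis.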

  16. Magna Carta for Researchers

    NASA Astrophysics Data System (ADS)

    2006-12-01

    Today, Janez Potočnik, European Commissioner for Science and Research, received a statement of support for the European Charter for Researchers and the Code of Conduct for the Recruitment of Researchers from EIROforum. "The EIROforum partners warmly welcome this valuable initiative by the European Commission", said Prof. William G. Stirling, Director General of ESRF and present Chairman of EIROforum. "This is an important step towards the implementation of the European Research Area." [ESO PR Photo 47a/06: Janez Potočnik, European Commissioner for Science and Research, receives the statement of support from Bill Stirling, Director General of ESRF and present Chairman of EIROforum.] The European Charter for Researchers addresses the roles, responsibilities and entitlements of researchers and their employers or funding organisations. It aims at ensuring that the relationship between these parties contributes to successful performance in the generation, transfer and sharing of knowledge, and to the career development of researchers. The Code of Conduct for the Recruitment of Researchers aims to improve recruitment, to make selection procedures fairer and more transparent and proposes different means of judging merit. Merit should not just be measured on the number of publications but on a wider range of evaluation criteria, such as teaching, supervision, teamwork, knowledge transfer, management and public awareness activities. [ESO PR Photo 47b/06: The signature of the statement of support last November. From left to right: Richard Wagner, Director of the ILL, David Southwood, Scientific Director of ESA, Robert Aymar, Director General of CERN, Bill Stirling, Director General of ESRF, Catherine Cesarsky, Director General of ESO, Francesco Romanelli, EFDA-JET leader and Silke Schumacher, Coordinator International Relations and Communication of the EMBL.] In their statement, signed at the EIROforum Assembly on 15 November 2006, the seven

  17. Probing the origins of human acetylcholinesterase inhibition via QSAR modeling and molecular docking.

    PubMed

    Simeon, Saw; Anuwongcharoen, Nuttapat; Shoombuatong, Watshara; Malik, Aijaz Ahmad; Prachayasittikul, Virapong; Wikberg, Jarl E S; Nantasenamat, Chanin

    2016-01-01

    Alzheimer's disease (AD) is a chronic neurodegenerative disease which leads to the gradual loss of neuronal cells. Several hypotheses for AD exist (e.g., the cholinergic, amyloid and tau hypotheses). As per the cholinergic hypothesis, the deficiency of choline is responsible for AD; therefore, the inhibition of AChE is a lucrative therapeutic strategy for the treatment of AD. Acetylcholinesterase (AChE) is an enzyme that catalyzes the breakdown of the neurotransmitter acetylcholine, which is essential for cognition and memory. A large non-redundant data set of 2,570 compounds with reported IC50 values against AChE was obtained from ChEMBL and employed in a quantitative structure-activity relationship (QSAR) study so as to gain insights into the origin of their bioactivity. AChE inhibitors were described by a set of 12 fingerprint descriptors and predictive models were constructed from 100 different data splits using random forest. Generated models afforded R(2), [Formula: see text] and [Formula: see text] values in ranges of 0.66-0.93, 0.55-0.79 and 0.56-0.81 for the training set, 10-fold cross-validated set and external set, respectively. The best model built using the substructure count was selected according to the OECD guidelines and it afforded R(2), [Formula: see text] and [Formula: see text] values of 0.92 ± 0.01, 0.78 ± 0.06 and 0.78 ± 0.05, respectively. Furthermore, Y-scrambling was applied to evaluate the possibility of chance correlation of the predictive model. Subsequently, a thorough analysis of the substructure fingerprint count was conducted to provide informative insights into the inhibitory activity of AChE inhibitors. Moreover, Kennard-Stone sampling of the actives was applied to select 30 diverse compounds for further molecular docking studies in order to gain structural insights into the origin of AChE inhibition. Site-moiety mapping of compounds from the diversity set revealed three binding anchors encompassing both hydrogen bonding and van der Waals
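
    The Kennard-Stone step mentioned in this abstract (picking a small, diverse subset of compounds) can be sketched as a max-min selection. This is an illustration under assumptions, not the authors' code: the abstract does not state the distance metric or descriptor space, so Euclidean distance on plain numeric vectors is assumed, and the function name `kennard_stone` is hypothetical.

```python
import math


def kennard_stone(points, k):
    """Return indices of k diverse points chosen by Kennard-Stone selection.

    Start from the two mutually most distant points, then repeatedly add the
    point whose distance to its nearest already-selected point is largest.
    Euclidean distance is an assumption made for this sketch.
    """
    n = len(points)
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= number of points")

    # Seed with the most distant pair.
    best = (-1.0, 0, 1)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            if d > best[0]:
                best = (d, i, j)
    selected = [best[1], best[2]]

    # Track each point's distance to its nearest selected point.
    min_dist = [min(math.dist(p, points[s]) for s in selected) for p in points]

    while len(selected) < k:
        chosen = set(selected)
        nxt = max((i for i in range(n) if i not in chosen),
                  key=lambda i: min_dist[i])
        selected.append(nxt)
        for i in range(n):
            min_dist[i] = min(min_dist[i], math.dist(points[i], points[nxt]))
    return selected


# Toy 2-D "descriptor" vectors (hypothetical data).
toy_points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (5.0, 0.0)]
toy_subset = kennard_stone(toy_points, 3)
```

    In the study, the analogous call would select 30 compounds from the set of actives; the pairwise seeding step is quadratic in the number of points, which is acceptable for a few thousand compounds.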

  18. Isolation of pregnancy-associated glycoproteins from placenta of the American bison (Bison bison) at first half of pregnancy.

    PubMed

    Kiewisz, Jolanta; Sousa, Noelita Melo de; Beckers, Jean-Francois; Vervaecke, Hilde; Panasiewicz, Grzegorz; Szafranska, Bozena

    2008-01-01

    This paper describes the successful purification and characterisation of pregnancy-associated glycoproteins (PAG) extracted from placenta (3-4 months) of American bison (Amb). Chorionic AmbPAG proteins were purified from foetal cotyledonary tissues (CT) and liquid cotyledonary-carrying proteins (LCP) leaking from damaged cells. Our protocols proved useful for identifying AmbPAG proteins, especially from the LCP fraction. The AmbPAGs were extracted, precipitated and eluted during DEAE cellulose chromatography. The richest protein fractions were further chromatographed on VVA (Vicia villosa agglutinin affinity column), then characterised by mono- and bi-dimensional electrophoresis, Western blot and N-terminal amino acid (aa) sequence. After being transferred to PVDF membranes, three VVA-purified AmbPAG isoforms differing in molecular masses and isoelectric points (pI 4-4.6) were selected for sequencing. The N-terminal 25-aa sequence of the 72-kDa CT form of AmbPAG was identified as completely new (RGSNLTSLPLQNVIDLFYVGNITIG). Two other AmbPAG proteins purified from different sources (74kDa CT and 76kDa LCP forms; RGSNLTIHPLRNIRDIFYVGNITIG) were identical or corresponded to the N-terminus of various bovine PAGs (boPAG). The two AmbPAGs (74kDa CT and 76kDa LCP) revealed a micro-sequence identical to boPAG7, and were similar mainly to bovine PAG4, -6, -15 and -17 precursors that were identified by full-length sequencing derived from cDNA cloning. The novel sequence of the AmbPAG (72kDa CT) was related to some boPAG and various other ruminant PAG precursors (caprine and ovine). All three identified AmbPAG sequences were also relatively similar to mature forms of purified native boPAG (56-75kDa) proteins. This is the first report indicating aa sequences of native AmbPAG proteins purified from placenta (CT and LCP) of bison species. The N-terminal sequences of the AmbPAGs have been deposited in the EMBL-EBI database (UniProtKB; Accession Nos.: P

  19. Simple perineal anoplasty for the treatment of low anorectal malformations in adults: two case reports

    PubMed Central

    Echchaoui, Abdelmoughit; Benyachou, Malika; Hafidi, Jawad; Fathi, Nahed; Mohammadine, Elhamid; ELmazouz, Samir; Gharib, Nour-eddine; Abbassi, Abdellah

    2014-01-01

    Anorectal malformations in adults are rare congenital anomalies of the digestive tract that predominate in females. Our study concerns two cases of low anorectal malformation seen and treated in adulthood by two teams (plastic and visceral surgeons) at Avicenne Hospital in Rabat. The first was a 24-year-old man with anal dyschezia; the second was an 18-year-old woman with an anovulvar malformation. The clinical features combined with radiological imaging (barium enema and anorectal manometry) confirmed a low anorectal malformation. Both cases were corrected by sphincter reconstruction and anal reimplantation with perineal anoplasty. The postoperative course was uneventful, with no skin compromise or necrosis, and the paraffin gauze dressing was changed daily. The functional outcome (continence) was favorable in both patients. Presentation of anorectal malformations in adulthood is rare; their etiology is poorly understood, and they occur sporadically. The clinical features, together with imaging (barium enema, pelvic MRI), endoscopy and anorectal manometry, confirm the diagnosis and classify these anomalies into 3 types: low, intermediate and high. Low forms are treated from the outset by anal reimplantation and simple perineal anoplasty, as in our two cases; in some cases they can be treated by anorectal pull-through combined with a V-Y plasty, which allows correct anatomical placement of the anus. High or intermediate forms, by contrast, require complex surgery, often with a temporary digestive diversion. Unlike the other forms, low forms have a favorable functional prognosis. PMID:25667689

  20. Place of surgery in the management of breast cancer in women at the Yalgado Ouedraogo University Hospital Center: a series of 81 cases

    PubMed Central

    Zongo, Nayi; Millogo-Traore, Timonga Françoise Danielle; Bagre, Sidpawalmdé Carine; Bagué, Abdoul-Halim; Ouangre, Edgar; Zida, Maurice; Bambara, Aboubacar; Bambara, Tozoula Augustin; Traoré, Si Simon

    2015-01-01

    The aim was to study the place of surgery in the management of breast cancer at the Yalgado Ouédraogo University Hospital Center. We conducted a prospective, descriptive study over ten (10) months on the place of surgery in breast cancer. It was carried out in the departments of gynecology-obstetrics and of visceral and digestive surgery of the Yalgado Ouédraogo University Hospital Center. The indications, procedures and results of surgery were taken into account. We collected 81 breast cancers. The mean time to consultation was 14.26 months. T3 to T4 tumors accounted for 82.71% of cases. Thirty-eight patients (46.91%) underwent surgery. Neoadjuvant chemotherapy was given in 29.63% of cases. Thirty-four patients (41.97%) were operable from the outset. The procedure was a Madden mastectomy in 94.74% of cases and palliative "toilet" surgery in 2 cases (5.26%). Adjuvant chemotherapy was given to 52.63% of the operated patients. Lymphocele-type complications were noted in 23.68% of cases and were treated by evacuating punctures. The indications for surgery are limited by diagnostic delay, which goes hand in hand with advanced stages of breast cancer. The absence of radiotherapy makes breast-conserving surgery difficult to practice, and mastectomy still occupies an important place. Early diagnosis would make it possible to increase the surgical indications. PMID:26848364

  1. Bioinformatics approach to evaluate differential gene expression of M1/M2 macrophage phenotypes and antioxidant genes in atherosclerosis.

    PubMed

    da Rocha, Ricardo Fagundes; De Bastiani, Marco Antônio; Klamt, Fábio

    2014-11-01

    Atherosclerosis is a pro-inflammatory process intrinsically related to systemic redox impairments. Macrophages play a major role in disease development. The specific involvement of classically activated, M1 (pro-inflammatory), or alternatively activated, M2 (anti-inflammatory), macrophages in plaque formation and disease progression is still not established. Thus, based on meta-data analysis of public microarray datasets, we compared differential gene expression levels of the human antioxidant genes (HAG) and M1/M2 genes between early and advanced human atherosclerotic plaques, and among peripheral macrophages (with or without foam cell induction by oxidized low density lipoprotein, oxLDL) from healthy and atherosclerotic subjects. Two independent datasets, GSE28829 and GSE9874, were selected from the Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo/) repository. Functional interactions were obtained with STRING (http://string-db.org/) and Medusa (http://coot.embl.de/medusa/). Statistical analysis was performed with ViaComplex(®) (http://lief.if.ufrgs.br/pub/biosoftwares/viacomplex/) and gene set enrichment analysis (http://www.broadinstitute.org/gsea/index.jsp). Bootstrap analysis demonstrated that the activity (expression) of the HAG and M1 gene sets was significantly increased in advanced compared to early atherosclerotic plaque. Increased expression of the HAG, M1, and M2 gene sets was found in peripheral macrophages from atherosclerotic subjects compared to peripheral macrophages from healthy subjects, while only the M1 gene set was increased in foam cells from atherosclerotic subjects compared to foam cells from healthy subjects. However, the M1 gene set was decreased in foam cells from healthy subjects compared to peripheral macrophages from healthy subjects, while no differences were found in foam cells from atherosclerotic subjects compared to peripheral macrophages from atherosclerotic subjects. Our data suggest that, unlike in cancer, in atherosclerosis there is

  2. Molecular cloning, gene organization and expression of the human UDP-GalNAc:Neu5Acalpha2-3Galbeta-R beta1,4-N-acetylgalactosaminyltransferase responsible for the biosynthesis of the blood group Sda/Cad antigen: evidence for an unusual extended cytoplasmic domain.

    PubMed Central

    Montiel, Maria-Dolores; Krzewinski-Recchi, Marie-Ange; Delannoy, Philippe; Harduin-Lepers, Anne

    2003-01-01

    The nucleotide sequences of the short and long transcripts of beta1,4-N-acetylgalactosaminyltransferase have been submitted to the DDBJ, EMBL, GenBank(R) and GSDB Nucleotide Sequence Databases under accession nos AJ517770 and AJ517771 respectively. The human Sd(a) antigen is formed through the addition of an N-acetylgalactosamine residue via a beta1,4-linkage to a sub-terminal galactose residue substituted with an alpha2,3-linked sialic acid residue. We have taken advantage of the previously cloned mouse cDNA sequence of the UDP-GalNAc:Neu5Acalpha2-3Galbeta-R beta1,4-N-acetylgalactosaminyltransferase (Sd(a) beta1,4GalNAc transferase) to screen the human EST and genomic databases and to identify the corresponding human gene. The sequence spans over 35 kb of genomic DNA on chromosome 17 and comprises at least 12 exons. As judged by reverse transcription PCR, the human gene is expressed widely since it is detected in various amounts in almost all cell types studied. Northern blot analysis indicated that five Sd(a) beta1,4GalNAc transferase transcripts of 8.8, 6.1, 4.7, 3.8 and 1.65 kb were highly expressed in colon and to a lesser extent in kidney, stomach, ileum and rectum. The complete coding nucleotide sequence was amplified from Caco-2 cells. Interestingly, the alternative use of two first exons, named E1(S) and E1(L), leads to the production of two transcripts. These nucleotide sequences potentially give rise to two proteins of 506 and 566 amino acid residues, identical in sequence with the exception of their cytoplasmic tails. The short form is highly similar (74% identity) to the mouse enzyme whereas the long form shows an unusually long cytoplasmic tail of 66 amino acid residues that has not yet been described for any other mammalian glycosyltransferase. Upon transient transfection in Cos-7 cells of the common catalytic domain, a soluble form of the protein was obtained, which catalysed the transfer of GalNAc residues to alpha2,3-sialylated acceptor

  3. Mapping of the first preferentially expressed cDNA in human fetal cochlea to human 14q11.2-12 and to a region of homologous synteny on mouse chromosome 12

    SciTech Connect

    Robertson, N.G.; Weremowicz, S.; Kovatch, K.A.

    1994-09-01

    We have isolated a cDNA, Coch-5B2 (D14S564E) from a human fetal cochlear cDNA library by subtractive hybridization and differential screening methods. This is the first cDNA to date shown to be expressed preferentially in human fetal cochlea (membranous labyrinth). On Northern blot of a panel of 14 human fetal tissue RNAs including cochlea, brain, liver, spleen, skeletal muscle, kidney, lung, skin, thymus, adrenal, small intestine, eye, sternal cartilage, and cultured fibroblasts, very high level expression of D14S564E is seen only in cochlea; very faint bands are discernible in brain and eye. Sequence comparison of this clone to sequences in GenBank/EMBL data bases shows no match to any known genes, indicating that it represents a novel cochlear sequence. Chromosome localization of this cochlear cDNA may provide insight into a region of the human genome to which human deafness disorders may map. We have assigned D14S564E to human chromosome 14 using the NIGMS human/rodent somatic cell hybrid mapping panel 1, and regionally to q11.2-q12 by fluorescence in situ hybridization (FISH). Besides detection of the human genomic band on the hybrid panel, genomic bands were seen for mouse and hamster, demonstrating evolutionary conservation of D14S564E. By FISH, signal was detected on human 14q11.2-q12 in 20 metaphases. In 3 metaphases, signal was present on both chromosome 14s. The mouse homolog of this cochlear cDNA was also used to probe human metaphases by FISH: signal was detected in the same region, 14q11.2-12, as the human clone in 5 metaphases, confirming human mapping data and homology to the human cDNA. The human cochlear D14S564E was genetically mapped in the mouse to chromosome 12, in a region of homology with human 14q11.2-q12. This region on mouse 12 contains the asp-1 (audiogenic seizure prone) locus and future studies will be directed at determining whether D14S564E is a candidate gene for this disorder.

  4. Drug Design for CNS Diseases: Polypharmacological Profiling of Compounds Using Cheminformatic, 3D-QSAR and Virtual Screening Methodologies.

    PubMed

    Nikolic, Katarina; Mavridis, Lazaros; Djikic, Teodora; Vucicevic, Jelica; Agbaba, Danica; Yelekci, Kemal; Mitchell, John B O

    2016-01-01

    HIGHLIGHTS: Many CNS targets are being explored for multi-target drug design. New databases and cheminformatic methods enable prediction of primary pharmaceutical target and off-targets of compounds. QSAR, virtual screening and docking methods increase the potential of rational drug design. The diverse cerebral mechanisms implicated in Central Nervous System (CNS) diseases together with the heterogeneous and overlapping nature of phenotypes indicated that multitarget strategies may be appropriate for the improved treatment of complex brain diseases. Understanding how the neurotransmitter systems interact is also important in optimizing therapeutic strategies. Pharmacological intervention on one target will often influence another one, such as the well-established serotonin-dopamine interaction or the dopamine-glutamate interaction. It is now accepted that drug action can involve plural targets and that polypharmacological interaction with multiple targets, to address disease in more subtle and effective ways, is a key concept for development of novel drug candidates against complex CNS diseases. A multi-target therapeutic strategy for Alzheimer's disease resulted in the development of very effective Multi-Target Designed Ligands (MTDL) that act on both the cholinergic and monoaminergic systems, and also retard the progression of neurodegeneration by inhibiting amyloid aggregation. Many compounds already in databases have been investigated as ligands for multiple targets in drug-discovery programs. A probabilistic method, the Parzen-Rosenblatt Window approach, was used to build a "predictor" model using data collected from the ChEMBL database. The model can be used to predict both the primary pharmaceutical target and off-targets of a compound based on its structure. Several multi-target ligands were selected for further study, as compounds with possible additional beneficial pharmacological activities. Based on all these findings, it is concluded that multipotent ligands
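
    The Parzen-Rosenblatt window idea named in this abstract can be illustrated with a small kernel-density classifier: each candidate target is scored by the average kernel similarity between the query compound and the known ligands of that target, and targets are then ranked. The sketch below is an assumption-laden illustration, not the published predictor: the Gaussian kernel, the bandwidth, the numeric descriptor vectors and the names `parzen_scores` and `rank_targets` are all hypothetical choices.

```python
import math


def parzen_scores(query, training, bandwidth=1.0):
    """Score each label by the mean Gaussian kernel value around `query`.

    `training` is a list of (vector, label) pairs. The kernel is left
    unnormalized; with one shared bandwidth the constant is the same for
    every label and does not change the ranking.
    """
    sums, counts = {}, {}
    for vector, label in training:
        sq_dist = sum((a - b) ** 2 for a, b in zip(query, vector))
        kernel = math.exp(-sq_dist / (2 * bandwidth ** 2))
        sums[label] = sums.get(label, 0.0) + kernel
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def rank_targets(query, training, bandwidth=1.0):
    """Return labels ordered from most to least likely for the query."""
    scores = parzen_scores(query, training, bandwidth)
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical ligand descriptors labelled with hypothetical targets.
toy_training = [
    ((0.0, 0.0), "target_A"),
    ((0.1, 0.0), "target_A"),
    ((5.0, 5.0), "target_B"),
]
toy_ranking = rank_targets((0.0, 0.1), toy_training)
```

    Read this way, the first entry of the ranking plays the role of the predicted primary target and the remaining high-scoring entries play the role of candidate off-targets.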

  5. Comparative genomics of citric-acid producing Aspergillus niger ATCC 1015 versus enzyme-producing CBS 513.88

    SciTech Connect

    Grigoriev, Igor V.; Baker, Scott E.; Andersen, Mikael R.; Salazar, Margarita P.; Schaap, Peter J.; Vondervoot, Peter J.I. van de; Culley, David; Thykaer, Jette; Frisvad, Jens C.; Nielsen, Kristen F.; Albang, Richard; Albermann, Kaj; Berka, Randy M.; Braus, Gerhard H.; Braus-Stromeyer, Susanna A.; Corrochano, Luis M.; Dai, Ziyu; Dijck, Piet W.M. van; Hofmann, Gerald; Lasure, Linda L.; Magnusson, Jon K.; Meijer, Susan L.; Nielsen, Jakob B.; Nielsen, Michael L.; Ooyen, Albert J.J. van; Panther, Kathyrn S.; Pel, Herman J.; Poulsen, Lars; Samson, Rob A.; Stam, Hen; Tsang, Adrian; Brink, Johannes M. van den; Atkins, Alex; Aerts, Andrea; Shapiro, Harris; Pangilinan, Jasmyn; Salamov, Asaf; Lou, Yigong; Lindquist, Erika; Lucas, Susan; Grimwood, Jane; Kubicek, Christian P.; Martinez, Diego; Peij, Noel N.M.E. van; Roubos, Johannes A.; Nielsen, Jens

    2011-04-28

    The whole genome sequence for A. niger ATCC 1015 is available from NCBI under acc. no ACJE00000000. The updated sequence for A. niger CBS 513.88 is available from EMBL under acc. no AM269948-AM270415. The sequence data from the phylogeny study has been submitted to NCBI (GU296686-296739). Microarray data from this study is submitted to GEO as series GSE10983. Accession for reviewers is possible through: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi token GSE10983] The dsmM_ANIGERa_coll511030F library and platform information is deposited at GEO under number GPL6758.

  6. Characterization of Two Microbial Isolates from Andean Lakes in Bolivia

    NASA Technical Reports Server (NTRS)

    Demergasso, C.; Blamey, J.; Escudero, L.; Chong, G.; Casamayor, E. O.; Cabrol, N. A.; Grin, E. A.; Hock, A.; Kiss, A.; Borics, G.

    2004-01-01

    miniprep protocol. The 16S rRNA genes were amplified by PCR using both Bacteria- and Archaea-universal primer sets: 27f and 1492r, and 21f and 1492r, respectively. Sequences of the 16S rRNA gene were determined and initially compared with reference sequences contained in the EMBL nucleotide sequence database by using the BLAST program and were subsequently aligned with 16S rRNA reference sequences in the ARB package (http://www.mikro.biologie.tu-muenchen.de). Aligned sequences were inserted within a stable phylogenetic tree by using the ARB parsimony tool. In this work we report the morphology and phylogenetic characterization of two isolates from Laguna Blanca sediments.

  7. Clinical course and prognosis of urothelial bladder tumors in young patients

    PubMed Central

    Bouzouita, Abderrazak; Saadi, Ahmed; Kerkeni, Walid; Chakroun, Marouene; Cherif, Mohamed; Ayed, Haroun; Selmi, Selim; Derouiche, Amine; Benslama, Riadh Mohamed; Chebil, Mohamed

    2016-01-01

    Abstract. Aim: Urothelial tumors of the bladder are rare in young adults. Their course and prognosis remain controversial. We report our experience with 54 patients. Methods: Between 1990 and 2010, 54 patients under 40 years of age at diagnosis were treated for transitional cell carcinoma of the bladder. We studied the course of these tumors by dividing the patients into two groups (under 30 years, and 30 to 40 years). Results: The tumor was non-muscle-invasive in 37 cases and muscle-invasive in 17 cases. The non-muscle-invasive tumors were stage Ta in 20 cases and grade I–II in 36 cases. The prognosis of these tumors was better before the age of 30, with a recurrence rate of 15.3% and no progression. For patients aged 30 to 40, the recurrence rate was 33.3%, and 25% of the recurrent tumors showed stage progression. For muscle-invasive tumors, the prognosis was poor (locally advanced in nine cases and metastatic from the outset in five cases). Conclusion: The course of non-muscle-invasive tumors appeared better before the age of 30. Between 30 and 40 years, the course approached that seen in elderly patients. Invasive tumors were often advanced and aggressive, suggesting a particular evolutionary potential. PMID:27330577

  8. The glucuronic acid utilization gene cluster from Bacillus stearothermophilus T-6.

    PubMed

    Shulami, S; Gat, O; Sonenshein, A L; Shoham, Y

    1999-06-01

    A lambda-EMBL3 genomic library of Bacillus stearothermophilus T-6 was screened for hemicellulolytic activities, and five independent clones exhibiting beta-xylosidase activity were isolated. The clones overlap each other and together represent a 23.5-kb chromosomal segment. The segment contains a cluster of xylan utilization genes, which are organized in at least three transcriptional units. These include the gene for the extracellular xylanase, xylanase T-6; part of an operon coding for an intracellular xylanase and a beta-xylosidase; and a putative 15.5-kb-long transcriptional unit, consisting of 12 genes involved in the utilization of alpha-D-glucuronic acid (GlcUA). The first four genes in the potential GlcUA operon (orf1, -2, -3, and -4) code for a putative sugar transport system with characteristic components of the binding-protein-dependent transport systems. The most likely natural substrate for this transport system is aldotetraouronic acid [2-O-alpha-(4-O-methyl-alpha-D-glucuronosyl)-xylotriose] (MeGlcUAXyl3). The following two genes code for an intracellular alpha-glucuronidase (aguA) and a beta-xylosidase (xynB). Five more genes (kdgK, kdgA, uxaC, uxuA, and uxuB) encode proteins that are homologous to enzymes involved in galacturonate and glucuronate catabolism. The gene cluster also includes a potential regulatory gene, uxuR, the product of which resembles repressors of the GntR family. The apparent transcriptional start point of the cluster was determined by primer extension analysis and is located 349 bp from the initial ATG codon. The potential operator site is a perfect 12-bp inverted repeat located downstream from the promoter between nucleotides +170 and +181. Gel retardation assays indicated that UxuR binds specifically to this sequence and that this binding is efficiently prevented in vitro by MeGlcUAXyl3, the most likely molecular inducer. PMID:10368143
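    The "perfect 12-bp inverted repeat" mentioned above can be illustrated with a small check. This is a minimal sketch, assuming that "perfect inverted repeat" means a sequence identical to its own reverse complement with no spacer; the 12-bp example string is hypothetical and is not the operator sequence reported in the paper.

```python
# Translation table mapping each base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    seq = seq.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    return seq.translate(_COMPLEMENT)[::-1]


def is_perfect_inverted_repeat(seq: str) -> bool:
    """True if the sequence reads the same as its reverse complement.

    Assumption: no spacer between the two half-sites, so the whole
    sequence must be self-complementary.
    """
    return reverse_complement(seq) == seq.upper()


if __name__ == "__main__":
    hypothetical_site = "AAATTGCAATTT"  # illustrative 12-bp sequence, not from the paper
    print(len(hypothetical_site), is_perfect_inverted_repeat(hypothetical_site))
```

    A single mismatch in either half-site makes the check fail, which is what distinguishes a perfect inverted repeat from an imperfect one.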

  9. Leading European Research Organisations Join Forces in EIROFORUM

    NASA Astrophysics Data System (ADS)

    2001-05-01

    Since the early 1950s, a number of powerful research infrastructures and laboratories which are used by an extensive network of scientists have been developed and deployed within Europe by European Intergovernmental Research Organisations (EIRO). Together, they represent European spearheads in some of the most crucial basic and applied research fields. Seven of these organisations have set up a co-ordination and collaboration group (EIROFORUM) with their top executives (Directors General or equivalent) as members. They include CERN (particle physics), EMBL (molecular biology), ESA (space activities), ESO (astronomy and astrophysics), ESRF (synchrotron radiation), ILL (neutron source) and EFDA (fusion). A primary goal of EIROFORUM is to play an active and constructive role in promoting the quality and impact of European Research. In particular, the group will be a basis for effective, high-level inter-organisational interaction and co-ordination. It will mobilise its substantial combined expertise in basic research and in the management of large international projects for the benefit of European research and development. This will be possible by exploiting the existing intimate links between the member organisations and their respective European research communities. According to the EIROFORUM Charter, the main aims of the collaboration are to: 1. Encourage and facilitate discussions among its members on issues of common interest, which are relevant to research and development. 2. Maximise the scientific return and optimise the use of resources by sharing relevant developments and results, whenever feasible. 3. Co-ordinate the education and outreach activities of the organisations, including technology transfer and public understanding. 4. Take an active part, in collaboration with other European scientific organisations, in taking a forward-look at promising and/or developing research directions and priorities, in particular in relation to new large

  10. Antitubercular specific activity of ibuprofen and the other 2-arylpropanoic acids using the HT-SPOTi whole-cell phenotypic assay

    PubMed Central

    Guzman, Juan D; Evangelopoulos, Dimitrios; Gupta, Antima; Birchall, Kristian; Mwaigwisya, Solomon; Saxty, Barbara; McHugh, Timothy D; Gibbons, Simon; Malkinson, John; Bhakta, Sanjib

    2013-01-01

    Objectives Lead antituberculosis (anti-TB) molecules with novel mechanisms of action are urgently required to fuel the anti-TB drug discovery pipeline. The aim of this study was to validate the use of the high-throughput spot culture growth inhibition (HT-SPOTi) assay for screening libraries of compounds against Mycobacterium tuberculosis and to study the inhibitory effect of ibuprofen (IBP) and the other 2-arylpropanoic acids on the growth inhibition of M tuberculosis and other mycobacterial species. Methods The HT-SPOTi method was validated not only with known drugs but also with a library of 47 confirmed anti-TB active compounds published in the ChEMBL database. Three over-the-counter non-steroidal anti-inflammatory drugs were also included in the screening. The 2-arylpropanoic acids, including IBP, were comprehensively evaluated against phenotypically and physiologically different strains of mycobacteria, and their cytotoxicity was determined against murine RAW264.7 macrophages. Furthermore, a comparative bioinformatic analysis was employed to propose a potential mycobacterial target. Results IBP showed antitubercular properties while carprofen was the most potent among the 2-arylpropanoic class. A 3,5-dinitro-IBP derivative was found to be more potent than IBP but equally selective. Other synthetic derivatives of IBP were less active, and the free carboxylic acid of IBP seems to be essential for its anti-TB activity. IBP, carprofen and the 3,5-dinitro-IBP derivative exhibited activity against multidrug-resistant isolates and stationary phase bacilli. On the basis of the human targets of the 2-arylpropanoic analgesics, the protein initiation factor infB (Rv2839c) of M tuberculosis was proposed as a potential molecular target. Conclusions The HT-SPOTi method can be employed reliably and reproducibly to screen the antimicrobial potency of different compounds. IBP demonstrated specific antitubercular activity, while carprofen was the most selective agent among the

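  11. Lung cancer: care pathway in the radiotherapy department of the National Institute of Oncology in Rabat [Cancer pulmonaire: parcours de soins au service de radiothérapie à l'institut national d'oncologie de Rabat]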

    PubMed Central

    Lachgar, Amine; Sahli, Nadir; Toulba, Ahmedou; Kebdani, Tayeb; Benjaafar, Noureddine

    2015-01-01

    The aim of this study is to explain the discrepancy between the large number of patients with locally advanced lung cancer seeking consultations in the radiotherapy department and the small number of patients actually treated. The study describes the care pathway of patients admitted to the radiotherapy department of the National Institute of Oncology in Rabat between March 1, 2011 and February 29, 2012 for management of inoperable and/or unresectable lung cancer. Data were collected from clinical records, the new-patient register of the institute's admissions office, and the consultation and treatment appointment registers of the radiotherapy department. A total of 117 patients were included. Disease stage could be determined in only 102 patients: 53 had non-metastatic cancer and 49 had metastatic cancer. Among patients with non-metastatic cancer, palliative radiotherapy was delivered to 9 patients, radiotherapy was contraindicated in 2 patients, neoadjuvant chemotherapy was given to 7 patients, and upfront concurrent chemoradiotherapy was proposed to 35 patients, but only 34 patients were able to receive their first session of curative-intent radiotherapy. This study allowed us to describe the care pathway of our patients and to identify its critical points, for which we propose corrective measures. PMID:26523190

  12. Comparing Class A GPCRs to bitter taste receptors: Structural motifs, ligand interactions and agonist-to-antagonist ratios.

    PubMed

    Di Pizio, Antonella; Levit, Anat; Slutzki, Michal; Behrens, Maik; Karaman, Rafik; Niv, Masha Y

    2016-01-01

    G protein-coupled receptors (GPCRs) are seven transmembrane (TM) proteins that play a key role in human physiology. The GPCR superfamily comprises about 800 members, classified into several classes, with rhodopsin-like Class A being the largest and most studied thus far. A huge component of the human repertoire consists of the chemosensory GPCRs, including ∼400 odorant receptors, 25 bitter taste receptors (TAS2Rs), which are thought to guard the organism from consuming poisons, and sweet and umami TAS1R heteromers, which indicate the nutritive value of food. The location of the binding site of TAS2Rs is similar to that of Class A GPCRs. However, most of the known bitter ligands are agonists, with only a few antagonists documented thus far. The agonist-to-antagonist ratios of Class A GPCRs vary, but in general are much lower than for TAS2Rs. For a set of well-studied GPCRs, a gradual change in agonist-to-antagonist ratios is observed when comparing low (10 μM)- and high (10 nM)-affinity ligand sets from ChEMBL and the DrugBank set of drugs. This shift reflects pharmaceutical bias toward the therapeutically desirable pharmacology for each of these GPCRs, while the 10 μM sets possibly represent the native tendency of the receptors toward either agonists or antagonists. Analyzing ligand-GPCR interactions in 56 X-ray structures representative of currently available structural data, we find that the N-terminus, TM1 and TM2 are more involved in binding of antagonists than of agonists. On the other hand, ECL2 tends to be more involved in binding of agonists. This is of interest, since TAS2Rs harbor variations on the typical Class A sequence motifs, including the absence of the ECL2-TM3 disulfide bridge. This suggests an alternative mode of regulation of conformational states for TAS2Rs, with a potentially less stabilized inactive state. The comparison of TAS2Rs and Class A GPCRs structural features and the pharmacology of their ligands highlights the intricacies of

  13. Isolation and characterization of two overlapping cosmid clones from the 4q35 region, near the facioscapulohumeral muscular dystrophy locus

    SciTech Connect

    Deidda, G.; Grisanti, P.; Vigneti, E.

    1994-09-01

    The gene for facioscapulohumeral muscular dystrophy (FSHD) has been localized by linkage analysis to the 4q35 region. The most telomeric probe, p13E-11, has been shown to detect 4q35 DNA rearrangements in both sporadic and familial cases of the disease. With the aim of constructing a detailed physical map of the 4q35 region and searching for the mutant gene, we used the p13E-11 probe to isolate cosmid clones from a human genomic library in a pCos-EMBL 2 vector. Two positive clones were isolated, clones 3 and 5, which partially overlap and carry human genomic inserts of 42 and 45 kb, respectively. The cosmids share a common region containing the p13E-11 region and a stretch of KpnI units consisting of 3.2 kb tandemly repeated sequences (about 10). The restriction maps were constructed using the following enzymes: BamHI, BglII, EcoRI, EcoRV, KpnI and SfiI. Clone 3 extends 4 kb upstream of C5 and stops within the Kpn repeats. Clone 5 extends 4 kb downstream from the Kpn repeats and presents an additional EcoRI site. Clone 5 contains a stretch of Kpn sequences of nearly 32 kb, corresponding to 10 Kpn repeats; clone 3 contains a stretch of 29 kb corresponding to 9 Kpn repeats, as determined by PFGE analysis of partial digestion of the clones. Clone 5 seems to contain the entire EcoRI region prone to rearrangements in FSHD patients. From clone 5 several subclones were obtained, from the Kpn region and from the region spanning from the last Kpn repeat to the cloning site. No single copy sequences were detected. Subclones from the 3′ end region contain beta-satellite or Sau3A-like sequences. In situ hybridization with the whole C5 cosmid shows hybridization signals at the tip of chromosome 4 (4q35) and chromosome 10 (10q26), in the pericentromeric region of chromosome 1 (1q12) and in the p12 region of the acrocentric chromosomes (chr. 21, 22, 13, 14, 15).

  14. Using collective expert judgements to evaluate quality measures of mass spectrometry images

    PubMed Central

    Palmer, Andrew; Ovchinnikova, Ekaterina; Thuné, Mikael; Lavigne, Régis; Guével, Blandine; Dyatlov, Andrey; Vitek, Olga; Pineau, Charles; Borén, Mats; Alexandrov, Theodore

    2015-01-01

    and the Matlab source code for data processing can be found at: https://github.com/alexandrovteam/IMS_quality. Contact: theodore.alexandrov@embl.de PMID:26072506

  15. The BioPrompt-box: an ontology-based clustering tool for searching in biological databases

    PubMed Central

    Corsi, Claudio; Ferragina, Paolo; Marangoni, Roberto

    2007-01-01

    the retrieved documents such as the references to Gene Ontology, the taxonomy lineage, the organism and the keywords. Of course, the approach is flexible enough to leave room for future additions of other meta-information. The ultimate goal of the clustering process is to provide the user with several different readings of the (maybe numerous) query results and show possible hidden correlations among them, thus improving their browsing and understanding. Conclusion Bpb is a powerful search engine that makes it very easy to perform complex queries over the indexed databanks (currently only UNIPROT is considered). The ontology-based clustering approach is efficient and effective, and could thus be applied successfully to larger databanks, like GenBank or EMBL. PMID:17430575

  16. Structure and Sequence of the Human Fast Skeletal Troponin T (TNNT3) Gene: Insight Into the Evolution of the Gene and the Origin of the Developmentally Regulated Isoforms

    PubMed Central

    Stefancsik, Raymund; Randall, Jeffrey D.; Mao, Chengjian

    2003-01-01

    We describe the cloning, sequencing and structure of the human fast skeletal troponin T (TNNT3) gene located on chromosome 11p15.5. The single-copy gene encodes 19 exons and 18 introns. Eleven of these exons, 1–3, 9–15 and 18, are constitutively spliced, whereas exons 4–8 are alternatively spliced. The gene contains an additional subset of developmentally regulated and alternatively spliced exons, including a foetal exon located between exon 8 and 9 and exon 16 or α (adult) and 17 or β (foetal and neonatal). Exon phasing suggests that the majority of the alternatively spliced exons located at the 5′ end of the gene may have evolved as a result of exon shuffling, because they are of the same phase class. In contrast, the 3′ exons encoding an evolutionarily conserved heptad repeat domain, shared by both TnT and troponin I (TnI), may be remnants of an ancient ancestral gene. The sequence of the 5′ flanking region shows that the putative promoter contains motifs including binding sites for MyoD, MEF-2 and several transcription factors which may play a role in transcriptional regulation and tissue-specific expression of TnT. The coding region of TNNT3 exhibits strong similarity to the corresponding rat sequence. However, unlike the rat TnT gene, TNNT3 possesses two repeat regions of CCA and TC. The exclusive presence of these repetitive elements in the human gene indicates divergence in the evolutionary dynamics of mammalian TnT genes. Homologous muscle-specific splicing enhancer motifs are present in the introns upstream and downstream of the foetal exon, and may play a role in the developmental pattern of alternative splicing of the gene. The genomic correlates of TNNT3 are relevant to our understanding of the evolution and regulation of expression of the gene, as well as the structure and function of the protein isoforms. The nucleotide sequence of TNNT3 has been submitted to EMBL/GenBank under Accession No. AF026276. PMID:18629027

  17. Phylogenomic Study of Burkholderia glathei-like Organisms, Proposal of 13 Novel Burkholderia Species and Emended Descriptions of Burkholderia sordidicola, Burkholderia zhejiangensis, and Burkholderia grimmiae

    PubMed Central

    Peeters, Charlotte; Meier-Kolthoff, Jan P.; Verheyde, Bart; De Brandt, Evie; Cooper, Vaughn S.; Vandamme, Peter

    2016-01-01

    68415T). Furthermore, we present emended descriptions of the species Burkholderia sordidicola, Burkholderia zhejiangensis and Burkholderia grimmiae. The GenBank/EMBL/DDBJ accession numbers for the 16S rRNA and gyrB gene sequences determined in this study are LT158612-LT158624 and LT158625-LT158641, respectively. PMID:27375597

  18. Bioinformatic and phylogenetic analysis of the CLAVATA3/EMBRYO-SURROUNDING REGION (CLE) and the CLE-LIKE signal peptide genes in the Pinophyta

    PubMed Central

    2014-01-01

    Background There is a rapidly growing awareness that plant peptide signalling molecules are numerous and varied and they are known to play fundamental roles in angiosperm plant growth and development. Two closely related peptide signalling molecule families are the CLAVATA3-EMBRYO-SURROUNDING REGION (CLE) and CLE-LIKE (CLEL) genes, which encode precursors of secreted peptide ligands that have roles in meristem maintenance and root gravitropism. Progress in peptide signalling molecule research in gymnosperms has lagged behind that of angiosperms. We therefore sought to identify CLE and CLEL genes in gymnosperms and conduct a comparative analysis of these gene families with angiosperms. Results We undertook a meta-analysis of the GenBank/EMBL/DDBJ gymnosperm EST database and the Picea abies and P. glauca genomes and identified 93 putative CLE genes and 11 CLEL genes among eight Pinophyta species, in the genera Cryptomeria, Pinus and Picea. The predicted conifer CLE and CLEL protein sequences had close phylogenetic relationships with their homologues in Arabidopsis. Notably, perfect conservation of the active CLE dodecapeptide in presumed orthologues of the Arabidopsis CLE41/44-TRACHEARY ELEMENT DIFFERENTIATION (TDIF) protein, an inhibitor of tracheary element (xylem) differentiation, was seen in all eight conifer species. We cloned the Pinus radiata CLE41/44-TDIF orthologues. These genes were preferentially expressed in phloem in planta as expected, but unexpectedly, also in differentiating tracheary element (TE) cultures. Surprisingly, transcript abundances of these TE differentiation-inhibitors sharply increased during early TE differentiation, suggesting that some cells differentiate into phloem cells in addition to TEs in these cultures. Applied CLE13 and CLE41/44 peptides inhibited root elongation in Pinus radiata seedlings. We show evidence that two CLEL genes are alternatively spliced via 3′-terminal acceptor exons encoding separate CLEL peptides

  19. Do not hesitate to use Tversky-and other hints for successful active analogue searches with feature count descriptors.

    PubMed

    Horvath, Dragos; Marcou, Gilles; Varnek, Alexandre

    2013-07-22

    This study is an exhaustive analysis of the neighborhood behavior over a large coherent data set (ChEMBL target/ligand pairs of known Ki, for 165 targets with >50 associated ligands each). It focuses on similarity-based virtual screening (SVS) success defined by the ascertained optimality index. This is a weighted compromise between purity and retrieval rate of active hits in the neighborhood of an active query. One key issue addressed here is the impact of Tversky asymmetric weighting of query vs candidate features (represented as integer-value ISIDA colored fragment/pharmacophore triplet count descriptor vectors). The nearly three-quarters of a million independent SVS runs showed that Tversky scores with a strong bias in favor of query-specific features are, by far, the most successful and the least failure-prone out of a set of nine other dissimilarity scores. These include classical Tanimoto, which failed to defend its privileged status in practical SVS applications. Tversky performance is not significantly conditioned by tuning of its bias parameter α. Both initial "guesses" of α = 0.9 and 0.7 were more successful than Tanimoto (which, in turn, was better than Euclid). Tversky was eventually tested in exhaustive similarity searching within the library of 1.6 M commercial + bioactive molecules at http://infochim.u-strasbg.fr/webserv/VSEngine.html, comparing favorably to Tanimoto in terms of "scaffold hopping" propensity. Therefore, it should be used at least as often as, and perhaps in parallel to, Tanimoto in SVS. Analysis with respect to query subclasses highlighted relationships of query complexity (simply expressed in terms of pharmacophore pattern counts) and/or target nature vs SVS success likelihood. SVS using more complex queries are more robust with respect to the choice of their operational premises (descriptors, metric). Yet, they are best handled by "pro-query" Tversky scores at α > 0.5. Among simpler queries, one may distinguish between "growable" (allowing for active
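    The asymmetric weighting described above can be sketched for integer count descriptors. This is a minimal illustration, not the authors' implementation: it assumes the common count-vector generalization of the Tversky index (shared counts via element-wise minimum), and the function name, the default weights, and the choice of a separate candidate weight `beta` are all assumptions made for this example.

```python
from typing import Sequence


def tversky_similarity(
    query: Sequence[int],
    candidate: Sequence[int],
    alpha: float = 0.9,  # weight on query-specific counts (assumed default)
    beta: float = 0.1,   # weight on candidate-specific counts (assumed default)
) -> float:
    """Tversky similarity between two non-negative count vectors.

    shared          = sum of element-wise minima
    query_only      = counts present in the query but not the candidate
    candidate_only  = counts present in the candidate but not the query
    score = shared / (shared + alpha*query_only + beta*candidate_only)
    """
    if len(query) != len(candidate):
        raise ValueError("descriptor vectors must have the same length")
    shared = sum(min(q, c) for q, c in zip(query, candidate))
    query_only = sum(max(q - c, 0) for q, c in zip(query, candidate))
    candidate_only = sum(max(c - q, 0) for q, c in zip(query, candidate))
    denominator = shared + alpha * query_only + beta * candidate_only
    # Two all-zero vectors are treated as identical (a convention, not a rule).
    return shared / denominator if denominator else 1.0


if __name__ == "__main__":
    q = [2, 1, 0]  # hypothetical query descriptor counts
    c = [1, 1, 3]  # hypothetical candidate descriptor counts
    print(tversky_similarity(q, c))                      # pro-query weighting
    print(tversky_similarity(q, c, alpha=1.0, beta=1.0)) # reduces to Tanimoto on counts
```

    With alpha well above beta, features missing from the candidate are penalized more heavily than extra features the candidate carries, which is one way to read the "pro-query" bias; setting both weights to 1 recovers the count-based Tanimoto coefficient, and a dissimilarity can be taken as one minus the score.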

  20. Comprehensive analysis of expressed sequence tags from cultivated and wild radish (Raphanus spp.)

    PubMed Central

    2013-01-01

    Background Radish (Raphanus sativus L., 2n = 2× = 18) is an economically important vegetable crop worldwide. A large collection of radish expressed sequence tags (ESTs) has been generated but remains largely uncharacterized. Results In this study, approximately 315,000 ESTs derived from 22 Raphanus cDNA libraries from 18 different genotypes were analyzed, for the purpose of gene and marker discovery and to evaluate large-scale genome duplication and phylogenetic relationships among Raphanus spp. The ESTs were assembled into 85,083 unigenes, of which 90%, 65%, 89% and 89% had homologous sequences in the GenBank nr, SwissProt, TrEMBL and Arabidopsis protein databases, respectively. A total of 66,194 (78%) could be assigned at least one gene ontology (GO) term. Comparative analysis identified 5,595 gene families unique to radish that were significantly enriched with genes related to small molecule metabolism, as well as 12,899 specific to the Brassicaceae that were enriched with genes related to seed oil body biogenesis and responses to phytohormones. The analysis further indicated that the divergence of radish and Brassica rapa occurred approximately 8.9-14.9 million years ago (MYA), following a whole-genome duplication event (12.8-21.4 MYA) in their common ancestor. An additional whole-genome duplication event in radish occurred at 5.1-8.4 MYA, after its divergence from B. rapa. A total of 13,570 simple sequence repeats (SSRs) and 28,758 high-quality single nucleotide polymorphisms (SNPs) were also identified. Using a subset of SNPs, the phylogenetic relationships of eight different accessions of Raphanus were inferred. Conclusion Comprehensive analysis of radish ESTs provided new insights into radish genome evolution and the phylogenetic relationships of different radish accessions. Moreover, the radish EST sequences and the associated SSR and SNP markers described in this study represent a valuable resource for radish functional genomics studies and