GlycomeDB – integration of open-access carbohydrate structure databases
Ranzinger, René; Herget, Stephan; Wetter, Thomas; von der Lieth, Claus-Wilhelm
2008-01-01
Background Although carbohydrates are the third major class of biological macromolecules, after proteins and DNA, there is neither a comprehensive database for carbohydrate structures nor an established universal structure encoding scheme for computational purposes. Funding for further development of the Complex Carbohydrate Structure Database (CCSD or CarbBank) ceased in 1997, and since then several initiatives have developed independent databases with partially overlapping foci. For each database, different encoding schemes for residues and sequence topology were designed. Therefore, it is virtually impossible to obtain an overview of all deposited structures or to compare the contents of the various databases. Results We have implemented procedures that download the structures contained in the seven major databases, including GLYCOSCIENCES.de, the Consortium for Functional Glycomics (CFG), the Kyoto Encyclopedia of Genes and Genomes (KEGG) and the Bacterial Carbohydrate Structure Database (BCSDB). We have created a new database called GlycomeDB, containing all structures, their taxonomic annotations and references (IDs) for the original databases. More than 100,000 datasets were imported, resulting in more than 33,000 unique sequences now encoded in GlycomeDB using the universal format GlycoCT. Inconsistencies were found in all public databases; these were discussed and corrected in multiple feedback rounds with the responsible curators. Conclusion GlycomeDB is a new, publicly available database for carbohydrate sequences with a unified, all-encompassing structure encoding format and NCBI taxonomic referencing. The database is updated weekly and can be downloaded free of charge. The Java application GlycoUpdateDB is also available for establishing and updating a local installation of GlycomeDB. With the advent of GlycomeDB, the distributed islands of knowledge in glycomics are now bridged to form a single resource. PMID:18803830
myPhyloDB: a local web server for the storage and analysis of metagenomic data.
Manter, Daniel K; Korsa, Matthew; Tebbe, Caleb; Delgado, Jorge A
2016-01-01
myPhyloDB v.1.1.2 is a user-friendly personal database with a browser interface designed to facilitate the storage, processing, analysis, and distribution of microbial community populations (e.g. 16S metagenomics data). myPhyloDB archives raw sequencing files and allows for easy selection of any combination of project(s)/sample(s) from all available data in the database. The data processing capabilities of myPhyloDB are also flexible enough to allow the upload and storage of pre-processed data, or the use of the built-in Mothur pipeline to automate the processing of raw sequencing data. myPhyloDB provides several analytical tools (e.g. analysis of covariance, t-tests, linear regression, differential abundance (DESeq2), and principal coordinates analysis (PCoA)) and normalization tools (rarefaction, DESeq2, and proportion) for the comparative analysis of taxonomic abundance, species richness and species diversity for projects of various types (e.g. human-associated, human gut microbiome, air, soil, and water) at any taxonomic level(s) desired. Finally, since myPhyloDB is a local web server, users can quickly distribute data between colleagues and end-users by simply granting others access to their personal myPhyloDB database. myPhyloDB is available at http://www.ars.usda.gov/services/software/download.htm?softwareid=472, and more information along with tutorials can be found on our website, http://www.myphylodb.org. Database URL: http://www.myphylodb.org. Published by Oxford University Press 2016. This work is written by US Government employees and is in the public domain in the United States.
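Of the normalization options listed, proportion normalization is the simplest to state precisely: each taxon's count is divided by the sample's total read count. A minimal sketch in plain Python (the function name is illustrative, not part of myPhyloDB's code):

```python
def proportion_normalize(counts):
    """Convert raw taxon counts into relative abundances that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no reads")
    return {taxon: n / total for taxon, n in counts.items()}

sample = {"Firmicutes": 600, "Bacteroidetes": 300, "Proteobacteria": 100}
print(proportion_normalize(sample))
# {'Firmicutes': 0.6, 'Bacteroidetes': 0.3, 'Proteobacteria': 0.1}
```

Rarefaction and DESeq2 normalization address the same library-size problem with subsampling and model-based size factors, respectively.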
NGSmethDB 2017: enhanced methylomes and differential methylation.
Lebrón, Ricardo; Gómez-Martín, Cristina; Carpena, Pedro; Bernaola-Galván, Pedro; Barturen, Guillermo; Hackenberg, Michael; Oliver, José L
2017-01-04
The 2017 update of NGSmethDB stores whole-genome methylomes generated from short-read data sets obtained by whole-genome bisulfite sequencing (WGBS) technology. To generate high-quality methylomes, stringent quality controls were integrated with third-party software, and a two-step mapping process was added to exploit the advantages of the new genome assembly models. The samples were all profiled under constant parameter settings, thus enabling comparative downstream analyses. Besides a significant increase in the number of samples, NGSmethDB now includes two additional data types, which are a valuable resource for the discovery of methylation epigenetic biomarkers: (i) differentially methylated single cytosines; and (ii) methylation segments (i.e. genome regions of homogeneous methylation). The NGSmethDB back-end is now based on MongoDB, a NoSQL hierarchical database using JSON-formatted documents and dynamic schemas, thus accelerating sample comparative analyses. Besides conventional database dumps, track hubs were implemented, which improve database access, visualization in genome browsers and comparative analyses against third-party annotations. In addition, the database can also be accessed through a RESTful API. Lastly, a Python client and a multiplatform virtual machine allow for program-driven access from the user's desktop. This way, private methylation data can be compared to NGSmethDB without the need to upload them to public servers. Database website: http://bioinfo2.ugr.es/NGSmethDB. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
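Program-driven access through a RESTful API like NGSmethDB's usually reduces to constructing query URLs. The sketch below uses only the Python standard library; the `/api` path segment and parameter names are invented for illustration and should be checked against the actual NGSmethDB documentation:

```python
from urllib.parse import urlencode

# Host taken from the paper; the "/api" route layout below is an assumption.
BASE = "http://bioinfo2.ugr.es/NGSmethDB/api"

def region_url(assembly, chrom, start, end):
    """Build a REST-style URL for methylation values in a genomic interval."""
    return f"{BASE}/{assembly}/{chrom}?{urlencode({'start': start, 'end': end})}"

print(region_url("hg38", "chr1", 10000, 20000))
# http://bioinfo2.ugr.es/NGSmethDB/api/hg38/chr1?start=10000&end=20000
```

The same URL-building pattern underlies the Python client mentioned in the abstract, which wraps such requests so users never assemble URLs by hand.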
Molecular signatures database (MSigDB) 3.0.
Liberzon, Arthur; Subramanian, Aravind; Pinchback, Reid; Thorvaldsdóttir, Helga; Tamayo, Pablo; Mesirov, Jill P
2011-06-15
Well-annotated gene sets representing the universe of the biological processes are critical for meaningful and insightful interpretation of large-scale genomic data. The Molecular Signatures Database (MSigDB) is one of the most widely used repositories of such sets. We report the availability of a new version of the database, MSigDB 3.0, with over 6700 gene sets, a complete revision of the collection of canonical pathways and experimental signatures from publications, enhanced annotations and upgrades to the web site. MSigDB is freely available for non-commercial use at http://www.broadinstitute.org/msigdb.
The Research Potential of the Electronic OED Database at the University of Waterloo: A Case Study.
ERIC Educational Resources Information Center
Berg, Donna Lee
1991-01-01
Discusses the history and structure of the online database of the second edition of the Oxford English Dictionary (OED) and the software tools developed at the University of Waterloo to manipulate the unusually complex database. Four sample searches that indicate some types of problems that might be encountered are appended. (DB)
SinEx DB: a database for single exon coding sequences in mammalian genomes.
Jorquera, Roddy; Ortiz, Rodrigo; Ossandon, F; Cárdenas, Juan Pablo; Sepúlveda, Rene; González, Carolina; Holmes, David S
2016-01-01
Eukaryotic genes are typically interrupted by intragenic, noncoding sequences termed introns. However, some genes lack introns in their coding sequence (CDS) and are generally known as 'single exon genes' (SEGs). In this work, a SEG is defined as a nuclear, protein-coding gene that lacks introns in its CDS. Whereas many public databases of eukaryotic multi-exon genes are available, there are only two specialized databases for SEGs. The present work addresses the need for a more extensive and diverse database by creating SinEx DB, a publicly available, searchable database of predicted SEGs from 10 completely sequenced mammalian genomes, including human. SinEx DB houses the DNA and protein sequence information of these SEGs and includes their functional predictions (KOG) and the relative distribution of these functions within species. The information is stored in a relational database built with MySQL Server 5.1.33, and the complete dataset of SEG sequences and their functional predictions is available for download. SinEx DB can be interrogated by: (i) browsing a phylogenetic schema, (ii) carrying out BLAST searches against the in-house SinEx DB of SEGs and (iii) an advanced search mode in which the database can be searched by keywords and any combination of searches by species and predicted functions. SinEx DB provides a rich source of information for advancing our understanding of the evolution and function of SEGs. Database URL: www.sinex.cl. © The Author(s) 2016. Published by Oxford University Press.
dbDSM: a manually curated database for deleterious synonymous mutations.
Wen, Pengbo; Xiao, Peng; Xia, Junfeng
2016-06-15
Synonymous mutations (SMs), which change the sequence of a gene without directly altering the amino acid sequence of the encoded protein, were long thought to have no functional consequences. They are often assumed to be neutral in models of mutation and selection and were completely ignored in many studies. However, accumulating experimental evidence has demonstrated that these mutations exert their impact on gene function via splicing accuracy, mRNA stability, translation fidelity, protein folding and expression, and some of these mutations are implicated in human diseases. To the best of our knowledge, there is still no database specifically focusing on disease-related SMs. We have developed a new database called dbDSM (database of Deleterious Synonymous Mutation), a continually updated database that collects, curates and manages available human disease-related SM data obtained from the published literature. In the current release, dbDSM collects 1936 SM-disease association entries, including 1289 SMs and 443 human diseases, from ClinVar, GRASP, GWAS Catalog, GWASdb, the PolymiRTS database, PubMed and Web of Knowledge. Additionally, we provide users a link to download all the data in dbDSM and a link to submit novel data into the database. We hope dbDSM will be a useful resource for investigating the roles of SMs in human disease. dbDSM is freely available online at http://bioinfo.ahu.edu.cn:8080/dbDSM/index.jsp with all major browsers supported. jfxia@ahu.edu.cn Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
PhamDB: a web-based application for building Phamerator databases.
Lamine, James G; DeJong, Randall J; Nelesen, Serita M
2016-07-01
PhamDB is a web application which creates databases of bacteriophage genes, grouped by gene similarity. It is backwards compatible with the existing Phamerator desktop software while providing an improved database creation workflow. Key features include a graphical user interface, validation of uploaded GenBank files, and the ability to import phages from existing databases, modify existing databases and queue multiple jobs. Source code and installation instructions for Linux, Windows and Mac OS X are freely available at https://github.com/jglamine/phage. PhamDB is also distributed as a Docker image which can be managed via Kitematic. This Docker image contains the application and all third-party software dependencies as a pre-configured system, and is freely available via the installation instructions provided. snelesen@calvin.edu. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
RiboDB Database: A Comprehensive Resource for Prokaryotic Systematics.
Jauffrit, Frédéric; Penel, Simon; Delmotte, Stéphane; Rey, Carine; de Vienne, Damien M; Gouy, Manolo; Charrier, Jean-Philippe; Flandrois, Jean-Pierre; Brochier-Armanet, Céline
2016-08-01
Ribosomal proteins (r-proteins) are increasingly used as an alternative to ribosomal RNA for prokaryotic systematics. However, their routine use is difficult because r-proteins are often not annotated, or wrongly annotated, in complete genome sequences, and there is currently no dedicated exhaustive database of r-proteins. RiboDB aims to fill this gap. This weekly updated comprehensive database allows the fast and easy retrieval of r-protein sequences from publicly available complete prokaryotic genome sequences. The current version of RiboDB contains 90 r-proteins from 3,750 prokaryotic complete genomes encompassing 38 phyla/major classes and 1,759 different species. RiboDB is accessible at http://ribodb.univ-lyon1.fr and through ACNUC interfaces. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
NASA Technical Reports Server (NTRS)
Li, Chung-Sheng (Inventor); Smith, John R. (Inventor); Chang, Yuan-Chi (Inventor); Jhingran, Anant D. (Inventor); Padmanabhan, Sriram K. (Inventor); Hsiao, Hui-I (Inventor); Choy, David Mun-Hien (Inventor); Lin, Jy-Jine James (Inventor); Fuh, Gene Y. C. (Inventor); Williams, Robin (Inventor)
2004-01-01
Methods and apparatus for providing a multi-tier object-relational database architecture are disclosed. In one illustrative embodiment of the present invention, a multi-tier database architecture comprises an object-relational database engine as a top tier, one or more domain-specific extension modules as a bottom tier, and one or more universal extension modules as a middle tier. The individual extension modules of the bottom tier operationally connect with the one or more universal extension modules which, themselves, operationally connect with the database engine. The domain-specific extension modules preferably provide such functions as search, index, and retrieval services of images, video, audio, time series, web pages, text, XML, spatial data, etc. The domain-specific extension modules may include one or more IBM DB2 extenders, Oracle data cartridges and/or Informix datablades, although other domain-specific extension modules may be used.
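The three-tier wiring described above — domain-specific modules registering with a universal middle tier that the engine talks to — can be sketched abstractly. The class names below are invented for illustration and do not come from the patent:

```python
class DomainExtension:
    """Bottom tier: a domain-specific service (e.g. image or text search)."""
    def __init__(self, domain):
        self.domain = domain

    def search(self, query):
        return f"{self.domain} results for {query!r}"

class UniversalExtension:
    """Middle tier: routes engine requests to registered domain extensions."""
    def __init__(self):
        self.extensions = {}

    def register(self, ext):
        self.extensions[ext.domain] = ext

    def search(self, domain, query):
        return self.extensions[domain].search(query)

class DatabaseEngine:
    """Top tier: the object-relational engine delegates to the middle tier."""
    def __init__(self, universal):
        self.universal = universal

    def query(self, domain, q):
        return self.universal.search(domain, q)

middle = UniversalExtension()
middle.register(DomainExtension("image"))
engine = DatabaseEngine(middle)
print(engine.query("image", "sunset"))  # image results for 'sunset'
```

The middle tier insulates the engine from the individual extenders/cartridges/datablades, which is the architectural point the claims emphasize.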
IMGT, the International ImMunoGeneTics database.
Lefranc, M P; Giudicelli, V; Busin, C; Bodmer, J; Müller, W; Bontrop, R; Lemaitre, M; Malik, A; Chaume, D
1998-01-01
IMGT, the international ImMunoGeneTics database, is an integrated database specialising in Immunoglobulins (Ig), T cell Receptors (TcR) and the Major Histocompatibility Complex (MHC) of all vertebrate species, created by Marie-Paule Lefranc, CNRS, Montpellier II University, Montpellier, France (lefranc@ligm.crbm.cnrs-mop.fr). IMGT includes three databases: LIGM-DB (for Ig and TcR), MHC/HLA-DB and PRIMER-DB (the last two in development). IMGT comprises expertly annotated sequences and alignment tables. LIGM-DB contains more than 23 000 Immunoglobulin and T cell Receptor sequences from 78 species. MHC/HLA-DB contains Class I and Class II Human Leucocyte Antigen alignment tables. An IMGT tool, DNAPLOT, developed for Ig, TcR and MHC sequence alignments, is also available. IMGT works in close collaboration with the EMBL database. IMGT's goals are to establish common data access to all immunogenetics data, including nucleotide and protein sequences, oligonucleotide primers, gene maps and other genetic data of Ig, TcR and MHC molecules, and to provide user-friendly graphical data access. IMGT has important implications in medical research (repertoires in autoimmune diseases, AIDS, leukemias, lymphomas), therapeutic approaches (antibody engineering), and genome diversity and genome evolution studies. IMGT is freely available at http://imgt.cnusc.fr:8104. PMID:9399859
Ndhlovu, Andrew; Durand, Pierre M; Hazelhurst, Scott
2015-01-01
The evolutionary rate at codon sites across protein-coding nucleotide sequences represents a valuable tier of information for aligning sequences, inferring homology and constructing phylogenetic profiles. However, a comprehensive resource cataloguing the evolutionary rate at codon sites and their corresponding nucleotide and protein domain sequence alignments has not been developed. To address this gap in knowledge, EvoDB (an Evolutionary rates DataBase) was compiled. Nucleotide sequences and their corresponding protein domain data, including the associated seed alignments from the PFAM-A (protein family) database, were used to estimate evolutionary rate (ω = dN/dS) profiles at codon sites for each entry. EvoDB contains 98.83% of the gapped nucleotide sequence alignments and 97.1% of the evolutionary rate profiles for the corresponding information in PFAM-A. As the identification of codon sites under positive selection and their position in a sequence profile is usually the most sought-after information for molecular evolutionary biologists, evolutionary rate profiles were determined under the M2a model using the CODEML algorithm in the PAML (Phylogenetic Analysis by Maximum Likelihood) suite of software. Validation of nucleotide sequences against amino acid data was implemented to ensure high data quality. EvoDB is a catalogue of the evolutionary rate profiles and provides the corresponding phylogenetic trees, PFAM-A alignments and annotated accession identifier data. In addition, the database can be explored and queried using known evolutionary rate profiles to identify domains under similar evolutionary constraints and pressures. EvoDB is a resource for evolutionary and phylogenetic studies and presents a tier of information untapped by current databases. © The Author(s) 2015. Published by Oxford University Press.
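The ω = dN/dS ratio that EvoDB catalogues has a standard interpretation: ω > 1 suggests positive selection at a site, ω < 1 purifying selection, and ω ≈ 1 neutral evolution. A small illustrative helper (not part of EvoDB or PAML):

```python
def classify_site(dn, ds):
    """Interpret the omega ratio (dN/dS) at a codon site.

    dn: nonsynonymous substitution rate; ds: synonymous substitution rate.
    """
    if ds == 0:
        raise ValueError("dS is zero; omega is undefined")
    omega = dn / ds
    if omega > 1:
        return "positive selection"
    if omega < 1:
        return "purifying selection"
    return "neutral evolution"

print(classify_site(0.8, 0.2))  # positive selection
print(classify_site(0.1, 0.5))  # purifying selection
```

In practice CODEML's M2a model estimates site-class ω values and posterior probabilities rather than applying a hard threshold like this sketch.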
HTT-DB: horizontally transferred transposable elements database.
Dotto, Bruno Reis; Carvalho, Evelise Leis; Silva, Alexandre Freitas; Duarte Silva, Luiz Fernando; Pinto, Paulo Marcos; Ortiz, Mauro Freitas; Wallau, Gabriel Luz
2015-09-01
Horizontal transfer of transposable elements (HTT) among eukaryotes was discovered in the mid-1980s. Since then, >300 new cases have been described. New findings about HTT are revealing the evolutionary impact of this phenomenon on host genomes. In order to provide an up-to-date, interactive and expandable database for such events, we developed the HTT-DB database. HTT-DB allows easy access to most reported HTT cases along with rich information about each case. Moreover, it allows the user to generate tables and graphs based on searches using transposable element and/or host species classification and to export them in several formats. This database is freely available on the web at http://lpa.saogabriel.unipampa.edu.br:8080/httdatabase. HTT-DB was developed based on Java and MySQL with all major browsers supported. Tools and software packages used are free for personal or non-profit projects. bdotto82@gmail.com or gabriel.wallau@gmail.com. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
WholeCellSimDB: a hybrid relational/HDF database for whole-cell model predictions.
Karr, Jonathan R; Phillips, Nolan C; Covert, Markus W
2014-01-01
Mechanistic 'whole-cell' models are needed to develop a complete understanding of cell physiology. However, extracting biological insights from whole-cell models requires running and analyzing large numbers of simulations. We developed WholeCellSimDB, a database for organizing whole-cell simulations. WholeCellSimDB was designed to enable researchers to search simulation metadata to identify simulations for further analysis, and to quickly slice and aggregate simulation results data. In addition, WholeCellSimDB enables users to share simulations with the broader research community. The database uses a hybrid relational/hierarchical data format architecture to efficiently store and retrieve both simulation setup metadata and results data. WholeCellSimDB provides a graphical Web-based interface to search, browse, plot and export simulations; a JavaScript Object Notation (JSON) Web service to retrieve data for Web-based visualizations; a command-line interface to deposit simulations; and a Python API to retrieve data for advanced analysis. Overall, we believe WholeCellSimDB will help researchers use whole-cell models to advance basic biological science and bioengineering. Database URL: http://www.wholecellsimdb.org. Source code repository: http://github.com/CovertLab/WholeCellSimDB. © The Author(s) 2014. Published by Oxford University Press.
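Since the JSON Web service returns simulation metadata and results as JSON documents, client-side handling is plain JSON parsing. The payload below is invented to illustrate the pattern; the real WholeCellSimDB response schema may differ:

```python
import json

# An invented JSON payload shaped like a simulation-metadata reply; the real
# WholeCellSimDB response schema may differ.
payload = '''{
  "simulation": {"id": 42, "organism": "M. genitalium", "length_s": 28800},
  "states": [{"name": "Mass", "unit": "fg"}, {"name": "Growth", "unit": "fg/s"}]
}'''

doc = json.loads(payload)
state_names = [state["name"] for state in doc["states"]]
print(state_names)  # ['Mass', 'Growth']
```

Keeping metadata in a relational store while results live in hierarchical (HDF) files lets the service answer such metadata queries cheaply before touching bulk results data.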
Database resources of the National Center for Biotechnology Information.
2016-01-04
The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank(®) nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (PubMed Central (PMC), Bookshelf and PubReader), health (ClinVar, dbGaP, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen), genomes (BioProject, Assembly, Genome, BioSample, dbSNP, dbVar, Epigenomics, the Map Viewer, Nucleotide, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser and the Trace Archive), genes (Gene, Gene Expression Omnibus (GEO), HomoloGene, PopSet and UniGene), proteins (Protein, the Conserved Domain Database (CDD), COBALT, Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB) and Protein Clusters) and chemicals (Biosystems and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for most of these databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. Published by Oxford University Press on behalf of Nucleic Acids Research 2015. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Salati, Michele; Pompili, Cecilia; Refai, Majed; Xiumè, Francesco; Sabbatini, Armando; Brunelli, Alessandro
2014-06-01
The aim of the present study was to verify whether the implementation of an electronic health record (EHR) in our thoracic surgery unit allows creation of a high-quality clinical database while saving time and costs. Before August 2011, multiple individuals compiled the on-paper documents/records and a single data manager inputted selected data into the database (traditional database, tDB). Since the adoption of an EHR in August 2011, multiple individuals have been responsible for compiling the EHR, which automatically generates a real-time database (EHR-based database, eDB), without the need for a data manager. During the initial period of implementation of the EHR, periodic meetings were held with all physicians involved in the use of the EHR in order to monitor and standardize the data registration process. Data quality of the first 100 anatomical lung resections recorded in the eDB was assessed by measuring the total number of missing values (MVs: existing non-reported value) and inaccurate values (wrong data) occurring in 95 core variables. The average MV of the eDB was compared with the one occurring in the same variables of the last 100 records registered in the tDB. A learning curve was constructed by plotting the number of MVs in the electronic database and tDB with the patients arranged by the date of registration. The tDB and eDB had similar MVs (0.74 vs 1, P = 0.13). The learning curve showed an initial phase including about 35 records, where MV in the eDB was higher than that in the tDB (1.9 vs 0.74, P = 0.03), and a subsequent phase, where the MV was similar in the two databases (0.7 vs 0.74, P = 0.6). The inaccuracy rate of these two phases in the eDB was stable (0.5 vs 0.3, P = 0.3). Using EHR saved an average of 9 min per patient, totalling 15 h saved for obtaining a dataset of 100 patients with respect to the tDB. The implementation of EHR allowed streamlining the process of clinical data recording. It saved time and human resource costs, without compromising the quality of data. © The Author 2014. Published by Oxford University Press on behalf of the European Association for Cardio-Thoracic Surgery. All rights reserved.
DB-PABP: a database of polyanion-binding proteins
Fang, Jianwen; Dong, Yinghua; Salamat-Miller, Nazila; Middaugh, C. Russell
2008-01-01
The interactions between polyanions (PAs) and polyanion-binding proteins (PABPs) have been found to play significant roles in many essential biological processes including intracellular organization, transport and protein folding. Furthermore, many neurodegenerative disease-related proteins are PABPs. Thus, a better understanding of PA/PABP interactions may not only enhance our understandings of biological systems but also provide new clues to these deadly diseases. The literature in this field is widely scattered, suggesting the need for a comprehensive and searchable database of PABPs. The DB-PABP is a comprehensive, manually curated and searchable database of experimentally characterized PABPs. It is freely available and can be accessed online at http://pabp.bcf.ku.edu/DB_PABP/. The DB-PABP was implemented as a MySQL relational database. An interactive web interface was created using Java Server Pages (JSP). The search page of the database is organized into a main search form and a section for utilities. The main search form enables custom searches via four menus: protein names, polyanion names, the source species of the proteins and the methods used to discover the interactions. Available utilities include a commonality matrix, a function of listing PABPs by the number of interacting polyanions and a string search for author surnames. The DB-PABP is maintained at the University of Kansas. We encourage users to provide feedback and submit new data and references. PMID:17916573
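The four search menus map naturally onto a parameterized SQL query against a relational schema. A toy sketch using SQLite as a stand-in for the actual DB-PABP MySQL schema (table and column names are invented):

```python
import sqlite3

# SQLite stands in for the DB-PABP MySQL back-end; all names are invented.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE pabp (protein TEXT, polyanion TEXT, species TEXT, method TEXT)")
con.executemany("INSERT INTO pabp VALUES (?, ?, ?, ?)", [
    ("tau", "heparin", "Homo sapiens", "SPR"),
    ("alpha-synuclein", "heparin", "Homo sapiens", "ITC"),
])

# Each of the four search menus contributes one WHERE-clause term.
rows = con.execute(
    "SELECT protein FROM pabp WHERE polyanion = ? AND species = ?",
    ("heparin", "Homo sapiens")).fetchall()
print(sorted(r[0] for r in rows))  # ['alpha-synuclein', 'tau']
```

The commonality-matrix utility mentioned above is essentially an aggregate over the same table, counting shared polyanions per protein pair.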
The Gypsy Database (GyDB) of mobile genetic elements: release 2.0
Llorens, Carlos; Futami, Ricardo; Covelli, Laura; Domínguez-Escribá, Laura; Viu, Jose M.; Tamarit, Daniel; Aguilar-Rodríguez, Jose; Vicente-Ripolles, Miguel; Fuster, Gonzalo; Bernet, Guillermo P.; Maumus, Florian; Munoz-Pomer, Alfonso; Sempere, Jose M.; Latorre, Amparo; Moya, Andres
2011-01-01
This article introduces the second release of the Gypsy Database of Mobile Genetic Elements (GyDB 2.0): a research project devoted to the evolutionary dynamics of viruses and transposable elements based on their phylogenetic classification (per lineage and protein domain). The Gypsy Database (GyDB) is a long-term project that is continuously progressing, and that owing to the high molecular diversity of mobile elements requires to be completed in several stages. GyDB 2.0 has been powered with a wiki to allow other researchers participate in the project. The current database stage and scope are long terminal repeats (LTR) retroelements and relatives. GyDB 2.0 is an update based on the analysis of Ty3/Gypsy, Retroviridae, Ty1/Copia and Bel/Pao LTR retroelements and the Caulimoviridae pararetroviruses of plants. Among other features, in terms of the aforementioned topics, this update adds: (i) a variety of descriptions and reviews distributed in multiple web pages; (ii) protein-based phylogenies, where phylogenetic levels are assigned to distinct classified elements; (iii) a collection of multiple alignments, lineage-specific hidden Markov models and consensus sequences, called GyDB collection; (iv) updated RefSeq databases and BLAST and HMM servers to facilitate sequence characterization of new LTR retroelement and caulimovirus queries; and (v) a bibliographic server. GyDB 2.0 is available at http://gydb.org. PMID:21036865
PCoM-DB Update: A Protein Co-Migration Database for Photosynthetic Organisms.
Takabayashi, Atsushi; Takabayashi, Saeka; Takahashi, Kaori; Watanabe, Mai; Uchida, Hiroko; Murakami, Akio; Fujita, Tomomichi; Ikeuchi, Masahiko; Tanaka, Ayumi
2017-01-01
The identification of protein complexes is important for the understanding of protein structure and function and the regulation of cellular processes. We used blue-native PAGE and tandem mass spectrometry to identify protein complexes systematically, and built a web database, the protein co-migration database (PCoM-DB, http://pcomdb.lowtem.hokudai.ac.jp/proteins/top), to provide prediction tools for protein complexes. PCoM-DB provides migration profiles for any given protein of interest, and allows users to compare them with migration profiles of other proteins, showing the oligomeric states of proteins and thus identifying potential interaction partners. The initial version of PCoM-DB (launched in January 2013) included protein complex data for Synechocystis whole cells and Arabidopsis thaliana thylakoid membranes. Here we report PCoM-DB version 2.0, which includes new data sets and analytical tools. Additional data are included from whole cells of the pelagic marine picocyanobacterium Prochlorococcus marinus, the thermophilic cyanobacterium Thermosynechococcus elongatus, the unicellular green alga Chlamydomonas reinhardtii and the bryophyte Physcomitrella patens. The Arabidopsis protein data now include data for intact mitochondria, intact chloroplasts, chloroplast stroma and chloroplast envelopes. The new tools comprise a multiple-protein search form and a heat map viewer for protein migration profiles. Users can compare migration profiles of a protein of interest among different organelles or compare migration profiles among different proteins within the same sample. For Arabidopsis proteins, users can compare migration profiles of a protein of interest with putative homologous proteins from non-Arabidopsis organisms. The updated PCoM-DB will help researchers find novel protein complexes and estimate their evolutionary changes in the green lineage. © The Author 2017. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Chang, Yi-Chien; Hu, Zhenjun; Rachlin, John; Anton, Brian P; Kasif, Simon; Roberts, Richard J; Steffen, Martin
2016-01-04
The COMBREX database (COMBREX-DB; combrex.bu.edu) is an online repository of information related to (i) experimentally determined protein function, (ii) predicted protein function, and (iii) relationships among proteins of unknown function and various types of experimental data, including molecular function, protein structure, and associated phenotypes. The database was created as part of the novel COMBREX (COMputational BRidges to EXperiments) effort aimed at accelerating the rate of gene function validation. It currently holds information on ∼3.3 million known and predicted proteins from over 1000 completely sequenced bacterial and archaeal genomes. The database also contains a prototype recommendation system for helping users identify those proteins whose experimental determination of function would be most informative for predicting function for other proteins within protein families. The emphasis on documenting experimental evidence for function predictions, and the prioritization of uncharacterized proteins for experimental testing, distinguish COMBREX from other publicly available microbial genomics resources. This article describes updates to COMBREX-DB since an initial description in the 2011 NAR Database Issue. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Web client and ODBC access to legacy database information: a low cost approach.
Sanders, N. W.; Mann, N. H.; Spengler, D. M.
1997-01-01
A new method has been developed for the Department of Orthopaedics of Vanderbilt University Medical Center to access departmental clinical data. Previously this data was stored only in the medical center's mainframe DB2 database; it is now additionally stored in a departmental SQL database. Access to this data is available via any ODBC-compliant front-end or a web client. With a small budget and no full-time staff, we were able to give our department on-line access to many years' worth of patient data that was previously inaccessible. PMID:9357735
Ameur, Adam; Bunikis, Ignas; Enroth, Stefan; Gyllensten, Ulf
2014-01-01
CanvasDB is an infrastructure for management and analysis of genetic variants from massively parallel sequencing (MPS) projects. The system stores SNP and indel calls in a local database, designed to handle very large datasets, to allow for rapid analysis using simple commands in R. Functional annotations are included in the system, making it suitable for direct identification of disease-causing mutations in human exome- (WES) or whole-genome sequencing (WGS) projects. The system has a built-in filtering function implemented to simultaneously take into account variant calls from all individual samples. This enables advanced comparative analysis of variant distribution between groups of samples, including detection of candidate causative mutations within family structures and genome-wide association by sequencing. In most cases, these analyses are executed within just a matter of seconds, even when there are several hundreds of samples and millions of variants in the database. We demonstrate the scalability of canvasDB by importing the individual variant calls from all 1092 individuals present in the 1000 Genomes Project into the system, over 4.4 billion SNPs and indels in total. Our results show that canvasDB makes it possible to perform advanced analyses of large-scale WGS projects on a local server. Database URL: https://github.com/UppsalaGenomeCenter/CanvasDB. © The Author(s) 2014. Published by Oxford University Press.
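The comparative filtering canvasDB describes (simultaneously considering calls from all samples, e.g. within a family structure) can be illustrated with a minimal sketch. CanvasDB itself is R-based; this Python version, with invented sample names and variant calls, only shows the underlying set logic, not the actual canvasDB interface.

```python
# Minimal sketch of group-wise variant filtering in the style canvasDB
# describes: keep variants shared by all affected samples and absent
# from all unaffected samples. Sample names and calls are invented.

def filter_candidates(calls, affected, unaffected):
    """calls maps sample name -> set of variant keys (chrom, pos, alt)."""
    shared = set.intersection(*(calls[s] for s in affected))
    excluded = set.union(*(calls[s] for s in unaffected)) if unaffected else set()
    return shared - excluded

calls = {
    "child1": {("chr1", 100, "A"), ("chr2", 200, "T")},
    "child2": {("chr1", 100, "A"), ("chr3", 300, "G")},
    "parent": {("chr2", 200, "T")},
}
# Variants present in both children but not in the parent
candidates = filter_candidates(calls, ["child1", "child2"], ["parent"])
```

With real WGS data the per-sample sets would hold millions of keys, but set intersection and difference remain fast, which is consistent with the sub-second analyses the abstract reports.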
PGSB PlantsDB: updates to the database framework for comparative plant genome research.
Spannagl, Manuel; Nussbaumer, Thomas; Bader, Kai C; Martis, Mihaela M; Seidel, Michael; Kugler, Karl G; Gundlach, Heidrun; Mayer, Klaus F X
2016-01-04
PGSB (Plant Genome and Systems Biology; formerly MIPS) PlantsDB (http://pgsb.helmholtz-muenchen.de/plant/index.jsp) is a database framework for the comparative analysis and visualization of plant genome data. The resource has been updated with new data sets and types as well as specialized tools and interfaces to address user demands for intuitive access to complex plant genome data. In its latest incarnation, we have re-worked both the layout and navigation structure and implemented new keyword search options and a new BLAST sequence search functionality. Actively involved in corresponding sequencing consortia, PlantsDB has dedicated special efforts to the integration and visualization of complex Triticeae genome data, especially for barley, wheat and rye. We enhanced CrowsNest, a tool to visualize syntenic relationships between genomes, with data from the wheat sub-genome progenitor Aegilops tauschii and added functionality to the PGSB RNASeqExpressionBrowser. GenomeZipper results were integrated for the genomes of barley, rye, wheat and perennial ryegrass, and interactive access is granted through PlantsDB interfaces. Data exchange and cross-linking between PlantsDB and other plant genome databases is stimulated by the transPLANT project (http://transplantdb.eu/). © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
TMC-SNPdb: an Indian germline variant database derived from whole exome sequences.
Upadhyay, Pawan; Gardi, Nilesh; Desai, Sanket; Sahoo, Bikram; Singh, Ankita; Togar, Trupti; Iyer, Prajish; Prasad, Ratnam; Chandrani, Pratik; Gupta, Sudeep; Dutt, Amit
2016-01-01
Cancer is predominantly a somatic disease. A mutant allele present in a cancer cell genome is considered somatic when it is absent from the paired normal genome as well as from public SNP databases. The current build of dbSNP, the most comprehensive public SNP database, however, inadequately represents several non-European Caucasian populations, posing a limitation in cancer genomic analyses of data from these populations. We present the Tata Memorial Centre SNP database (TMC-SNPdb) as the first open-source, flexible, upgradable and freely available SNP database (accessible through dbSNP build 149 and ANNOVAR), representing 114 309 unique germline variants generated from whole exome data of 62 normal samples derived from cancer patients of Indian origin. The TMC-SNPdb is presented with a companion subtraction tool that can be executed from the command line or through an easy-to-use graphical user interface, with the ability to deplete additional Indian population-specific SNPs over and above the dbSNP and 1000 Genomes databases. Using an institutionally generated whole exome data set of 132 samples of Indian origin, we demonstrate that TMC-SNPdb could deplete 42, 33 and 28% of false-positive somatic events after dbSNP depletion in Indian-origin tongue, gallbladder and cervical cancer samples, respectively. Beyond cancer somatic analyses, we anticipate utility of the TMC-SNPdb in several Mendelian germline diseases. In addition to dbSNP build 149 and ANNOVAR, the TMC-SNPdb along with the subtraction tool is available for download in the public domain at the following Database URL: http://www.actrec.gov.in/pi-webpages/AmitDutt/TMCSNP/TMCSNPdp.html. © The Author(s) 2016. Published by Oxford University Press.
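The subtraction step the TMC-SNPdb companion tool performs can be sketched in a few lines: putative somatic calls seen in a population-specific germline database are removed as likely germline polymorphisms. This is an illustrative Python sketch with invented records, not the tool's actual code or file formats.

```python
# Sketch of population-SNP subtraction as described for TMC-SNPdb:
# drop calls whose (chrom, pos, ref, alt) key appears in a
# population-specific germline SNP set. All records are invented.

def subtract_population_snps(somatic_calls, population_db):
    """Keep only calls absent from the population SNP database."""
    return [
        v for v in somatic_calls
        if (v["chrom"], v["pos"], v["ref"], v["alt"]) not in population_db
    ]

population_db = {("chr5", 1500, "G", "A")}  # known germline variant
somatic_calls = [
    {"chrom": "chr5", "pos": 1500, "ref": "G", "alt": "A"},      # removed
    {"chrom": "chr7", "pos": 140453136, "ref": "A", "alt": "T"},  # retained
]
filtered = subtract_population_snps(somatic_calls, population_db)
```

In practice this subtraction runs after dbSNP and 1000 Genomes depletion, which is why the abstract quotes the additional percentage of false-positive events removed.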
TransAtlasDB: an integrated database connecting expression data, metadata and variants
Adetunji, Modupeore O; Lamont, Susan J; Schmidt, Carl J
2018-01-01
Abstract High-throughput transcriptome sequencing (RNAseq) is the universally applied method for target-free transcript identification and gene expression quantification, generating huge amounts of data. Difficulty in accessing such data and interpreting results can be a major impediment to postulating suitable hypotheses, so an innovative storage solution that addresses limitations such as hard disk storage requirements, efficiency and reproducibility is paramount. By offering a uniform data storage and retrieval mechanism, various data can be compared and easily investigated. We present TransAtlasDB, a system that incorporates a hybrid architecture of relational and NoSQL databases for fast and efficient storage, processing and querying of large datasets from transcript expression analysis, together with corresponding metadata as well as gene-associated variants (such as SNPs) and their predicted gene effects. TransAtlasDB provides a data model for accurate storage of the large amount of data derived from RNAseq analysis, along with methods of interacting with the database either through the web interface or via command-line data management workflows, written in Perl, whose functionality simplifies the storage and manipulation of the massive amounts of data generated by RNAseq analysis. The database application is currently modeled to handle analysis data from agricultural species and will be expanded to include more species groups. Overall, TransAtlasDB aims to serve as an accessible repository for the large, complex results files derived from RNAseq gene expression profiling and variant analysis. Database URL: https://modupeore.github.io/TransAtlasDB/ PMID:29688361
MobiDB-lite: fast and highly specific consensus prediction of intrinsic disorder in proteins.
Necci, Marco; Piovesan, Damiano; Dosztányi, Zsuzsanna; Tosatto, Silvio C E
2017-05-01
Intrinsic disorder (ID) is established as an important feature of protein sequences. Its use in proteome annotation is however hampered by the availability of many methods with similar performance at the single residue level, which have mostly not been optimized to predict long ID regions of size comparable to domains. Here, we have focused on providing a single consensus-based prediction, MobiDB-lite, optimized for highly specific (i.e. few false positive) predictions of long disorder. The method uses eight different predictors to derive a consensus which is then filtered for spurious short predictions. Consensus prediction is shown to outperform the single methods when annotating long ID regions. MobiDB-lite can be useful in large-scale annotation scenarios and has indeed already been integrated in the MobiDB, DisProt and InterPro databases. MobiDB-lite is available as part of the MobiDB database from URL: http://mobidb.bio.unipd.it/. An executable can be downloaded from URL: http://protein.bio.unipd.it/mobidblite/. silvio.tosatto@unipd.it. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
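The consensus idea behind MobiDB-lite (agreement among several per-residue predictors, followed by removal of spurious short predictions) can be sketched as below. The majority threshold and minimum region length here are invented for illustration and are not the published MobiDB-lite parameters.

```python
# Sketch of a consensus disorder predictor: per-residue majority vote
# over several predictors, then suppression of short disorder regions.
# Thresholds are illustrative, not MobiDB-lite's actual settings.

def consensus_disorder(predictions, majority=5, min_len=20):
    """predictions: equal-length 0/1 lists, one per predictor."""
    n = len(predictions[0])
    raw = [1 if sum(p[i] for p in predictions) >= majority else 0
           for i in range(n)]
    result, i = [0] * n, 0
    while i < n:
        if raw[i]:
            j = i
            while j < n and raw[j]:
                j += 1          # extend the disorder region
            if j - i >= min_len:  # keep only long regions
                for k in range(i, j):
                    result[k] = 1
            i = j
        else:
            i += 1
    return result

# 6 of 8 predictors call the first 25 residues disordered
preds = [[1] * 25 + [0] * 15 for _ in range(6)] + [[0] * 40 for _ in range(2)]
res = consensus_disorder(preds)
```

Filtering at the region level rather than the residue level is what makes this style of consensus highly specific for long, domain-sized ID regions.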
Release of ToxCastDB and ExpoCastDB databases
EPA has released two databases - the Toxicity Forecaster database (ToxCastDB) and a database of chemical exposure studies (ExpoCastDB) - that scientists and the public can use to access chemical toxicity and exposure data. ToxCastDB users can search and download data from over 50...
The MAR databases: development and implementation of databases specific for marine metagenomics.
Klemetsen, Terje; Raknes, Inge A; Fu, Juan; Agafonov, Alexander; Balasundaram, Sudhagar V; Tartari, Giacomo; Robertsen, Espen; Willassen, Nils P
2018-01-04
We introduce the marine databases; MarRef, MarDB and MarCat (https://mmp.sfb.uit.no/databases/), which are publicly available resources that promote marine research and innovation. These data resources, which have been implemented in the Marine Metagenomics Portal (MMP) (https://mmp.sfb.uit.no/), are collections of richly annotated and manually curated contextual (metadata) and sequence databases representing three tiers of accuracy. While MarRef is a database for completely sequenced marine prokaryotic genomes, which represent a marine prokaryote reference genome database, MarDB includes all incomplete sequenced prokaryotic genomes regardless level of completeness. The last database, MarCat, represents a gene (protein) catalog of uncultivable (and cultivable) marine genes and proteins derived from marine metagenomics samples. The first versions of MarRef and MarDB contain 612 and 3726 records, respectively. Each record is built up of 106 metadata fields including attributes for sampling, sequencing, assembly and annotation in addition to the organism and taxonomic information. Currently, MarCat contains 1227 records with 55 metadata fields. Ontologies and controlled vocabularies are used in the contextual databases to enhance consistency. The user-friendly web interface lets the visitors browse, filter and search in the contextual databases and perform BLAST searches against the corresponding sequence databases. All contextual and sequence databases are freely accessible and downloadable from https://s1.sfb.uit.no/public/mar/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
dbHiMo: a web-based epigenomics platform for histone-modifying enzymes.
Choi, Jaeyoung; Kim, Ki-Tae; Huh, Aram; Kwon, Seomun; Hong, Changyoung; Asiegbu, Fred O; Jeon, Junhyun; Lee, Yong-Hwan
2015-01-01
Over the past two decades, epigenetics has evolved into a key concept for understanding regulation of gene expression. Among many epigenetic mechanisms, covalent modifications such as acetylation and methylation of lysine residues on core histones emerged as a major mechanism in epigenetic regulation. Here, we present the database for histone-modifying enzymes (dbHiMo; http://hme.riceblast.snu.ac.kr/) aimed at facilitating functional and comparative analysis of histone-modifying enzymes (HMEs). HMEs were identified by applying a search pipeline built upon profile hidden Markov model (HMM) to proteomes. The database incorporates 11,576 HMEs identified from 603 proteomes including 483 fungal, 32 plants and 51 metazoan species. The dbHiMo provides users with web-based personalized data browsing and analysis tools, supporting comparative and evolutionary genomics. With comprehensive data entries and associated web-based tools, our database will be a valuable resource for future epigenetics/epigenomics studies. © The Author(s) 2015. Published by Oxford University Press.
Database resources of the National Center for Biotechnology Information.
2015-01-01
The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank(®) nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (Bookshelf, PubMed Central (PMC) and PubReader); medical genetics (ClinVar, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen); genes and genomics (BioProject, BioSample, dbSNP, dbVar, Epigenomics, Gene, Gene Expression Omnibus (GEO), Genome, HomoloGene, the Map Viewer, Nucleotide, PopSet, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser, Trace Archive and UniGene); and proteins and chemicals (Biosystems, COBALT, the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB), Protein Clusters, Protein and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for many of these databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov. Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by (a) US Government employee(s) and is in the public domain in the US.
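The Entrez search and retrieval operations mentioned above are also exposed programmatically through NCBI's E-utilities, which are driven by parameterized URLs. The sketch below only composes an ESearch request URL (it does not contact the server); the query term is an arbitrary example.

```python
# Compose an NCBI E-utilities ESearch URL. Only URL construction is
# shown here; performing the request would return matching record IDs.
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def esearch_url(db, term, retmax=20):
    """Build an ESearch URL for the given Entrez database and query."""
    return f"{EUTILS}/esearch.fcgi?" + urlencode(
        {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    )

url = esearch_url("pubmed", "GlycomeDB[Title]")
```

The same parameter style carries over to the other E-utilities (EFetch, ESummary, ELink), with `db` selecting among the Entrez databases listed in the abstract.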
MeT-DB V2.0: elucidating context-specific functions of N6-methyl-adenosine methyltranscriptome
Liu, Hui; Wang, Huaizhi; Wei, Zhen; Zhang, Songyao; Hua, Gang; Zhang, Shao-Wu; Zhang, Lin; Gao, Shou-Jiang
2018-01-01
Abstract Methyltranscriptome is an exciting new area that studies the mechanisms and functions of methylation in transcripts. A knowledge base with systematic collection and curation of context-specific transcriptome-wide methylations is critical for elucidating their biological functions as well as for developing bioinformatics tools. Since its inception in 2014, MeT-DB (Liu, H., Flores, M.A., Meng, J., Zhang, L., Zhao, X., Rao, M.K., Chen, Y. and Huang, Y. (2015) MeT-DB: a database of transcriptome methylation in mammalian cells. Nucleic Acids Res., 43, D197–D203) has become an important resource for methyltranscriptome research, especially in the N6-methyl-adenosine (m6A) research community. Here, we report MeT-DB v2.0, the significantly improved second version, which is entirely redesigned to focus more on elucidating context-specific m6A functions. MeT-DB v2.0 has a major increase in context-specific m6A peaks and single-base sites, predicted from 185 samples for 7 species from 26 independent studies. Moreover, it is also integrated with a new database for targets of m6A readers, erasers and writers, and expanded with more collections of functional data. The redesigned MeT-DB v2.0 web interface and genome browser provide more friendly, powerful and informative ways to query and visualize the data. More importantly, MeT-DB v2.0 offers for the first time a series of tools specifically designed for understanding m6A functions. MeT-DB v2.0 will be a valuable resource for m6A methyltranscriptome research. The MeT-DB v2.0 database is available at http://compgenomics.utsa.edu/MeTDB/ and http://www.xjtlu.edu.cn/metdb2. PMID:29126312
Anderson, Beth M.; Stevens, Michael C.; Glahn, David C.; Assaf, Michal; Pearlson, Godfrey D.
2013-01-01
We present a modular, high-performance, open-source database system that incorporates popular neuroimaging database features with novel peer-to-peer sharing and a simple installation. An increasing number of imaging centers have created a massive amount of neuroimaging data since fMRI became popular more than 20 years ago, with much of that data unshared. The Neuroinformatics Database (NiDB) provides a stable platform to store and manipulate neuroimaging data and addresses several of the impediments to data sharing presented by the INCF Task Force on Neuroimaging Datasharing, including 1) motivation to share data, 2) technical issues, and 3) standards development. NiDB solves these problems by 1) minimizing PHI use and providing a cost-effective, simple, locally stored platform, 2) storing and associating all data (including genome) with a subject and creating a peer-to-peer sharing model, and 3) defining a sample, normalized data storage structure that is used in NiDB. NiDB not only simplifies the local storage and analysis of neuroimaging data, but also enables simple sharing of raw data and analysis methods, which may encourage further sharing. PMID:23912507
Certifiable database generation for SVS
NASA Astrophysics Data System (ADS)
Schiefele, Jens; Damjanovic, Dejan; Kubbat, Wolfgang
2000-06-01
In future aircraft cockpits, SVS will be used to display 3D physical and virtual information to pilots. A review of prototype and production Synthetic Vision Displays (SVD) from Euro Telematic, UPS Advanced Technologies, Universal Avionics, VDO-Luftfahrtgeratewerk, and NASA is discussed. As data sources, terrain, obstacle, navigation, and airport data are needed. Jeppesen-Sanderson, Inc. and Darmstadt Univ. of Technology are currently developing certifiable methods for the acquisition, validation, and processing of terrain, obstacle, and airport databases. The acquired data will be integrated into a High-Quality Database (HQ-DB). This database is the master repository; it contains all information relevant to all types of aviation applications. From the HQ-DB, SVS-relevant data is retrieved, converted, decimated, and adapted into an SVS Real-Time Onboard Database (RTO-DB). The process of data acquisition, verification, and data processing will be defined in a way that allows certification within DO-200A and new RTCA/EUROCAE standards for airport and terrain data. The open formats proposed will be established and evaluated for industrial usability. Finally, a NASA-industry cooperation to develop industrial SVS products under the umbrella of the NASA Aviation Safety Program (ASP) is introduced. A key element of the SVS NASA-ASP is the Jeppesen-led task to develop methods for world-wide database generation and certification. Jeppesen will build three airport databases that will be used in flight trials with NASA aircraft.
NGSmethDB 2017: enhanced methylomes and differential methylation
Lebrón, Ricardo; Gómez-Martín, Cristina; Carpena, Pedro; Bernaola-Galván, Pedro; Barturen, Guillermo; Hackenberg, Michael; Oliver, José L.
2017-01-01
The 2017 update of NGSmethDB stores whole-genome methylomes generated from short-read data sets obtained by bisulfite sequencing (WGBS) technology. To generate high-quality methylomes, stringent quality controls were integrated with third-party software, also adding a two-step mapping process to exploit the advantages of the new genome assembly models. The samples were all profiled under constant parameter settings, thus enabling comparative downstream analyses. Besides a significant increase in the number of samples, NGSmethDB now includes two additional data types, which are a valuable resource for the discovery of methylation epigenetic biomarkers: (i) differentially methylated single cytosines; and (ii) methylation segments (i.e. genome regions of homogeneous methylation). The NGSmethDB back-end is now based on MongoDB, a NoSQL hierarchical database using JSON-formatted documents and dynamic schemas, thus accelerating sample comparative analyses. Besides conventional database dumps, track hubs were implemented, which improved database access, visualization in genome browsers and comparative analyses against third-party annotations. In addition, the database can also be accessed through a RESTful API. Lastly, a Python client and a multiplatform virtual machine allow for program-driven access from the user's desktop. This way, private methylation data can be compared to NGSmethDB without the need to upload them to public servers. Database website: http://bioinfo2.ugr.es/NGSmethDB. PMID:27794041
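A differentially methylated single cytosine, the first of the two new data types above, is conceptually just a position whose methylation ratio differs between samples by more than some cutoff. The toy sketch below illustrates that idea in Python; the positions, ratios and cutoff are invented and this is not NGSmethDB's actual calling procedure.

```python
# Toy illustration of calling differentially methylated single
# cytosines: positions covered in both samples whose methylation
# ratios differ by at least a cutoff. Values and cutoff are invented.

def differential_cytosines(sample_a, sample_b, min_diff=0.25):
    """Each sample maps (chrom, pos) -> methylation ratio in [0, 1]."""
    shared = sample_a.keys() & sample_b.keys()
    return sorted(p for p in shared
                  if abs(sample_a[p] - sample_b[p]) >= min_diff)

a = {("chr1", 1000): 0.90, ("chr1", 1050): 0.50}
b = {("chr1", 1000): 0.10, ("chr1", 1050): 0.45}
dm = differential_cytosines(a, b)
```

Runs of such positions with homogeneous methylation are what the abstract's second data type, methylation segments, summarizes at the region level.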
The Toxicity Reference Database (ToxRefDB) contains approximately 30 years and $2 billion worth of animal studies. ToxRefDB allows scientists and the interested public to search and download thousands of animal toxicity testing results for hundreds of chemicals that were previously found only in paper documents. Currently, there are 474 chemicals in ToxRefDB, primarily data-rich pesticide active ingredients, but the number will continue to expand.
MPIC: a mitochondrial protein import components database for plant and non-plant species.
Murcha, Monika W; Narsai, Reena; Devenish, James; Kubiszewski-Jakubiak, Szymon; Whelan, James
2015-01-01
In the 2 billion years since the endosymbiotic event that gave rise to mitochondria, variations in mitochondrial protein import have evolved across different species. With the genomes of an increasing number of plant species sequenced, it is possible to gain novel insights into mitochondrial protein import pathways. We have generated the Mitochondrial Protein Import Components (MPIC) Database (DB; http://www.plantenergy.uwa.edu.au/applications/mpic) providing searchable information on the protein import apparatus of plant and non-plant mitochondria. An in silico analysis was carried out, comparing the mitochondrial protein import apparatus from 24 species representing various lineages from Saccharomyces cerevisiae (yeast) and algae to Homo sapiens (human) and higher plants, including Arabidopsis thaliana (Arabidopsis), Oryza sativa (rice) and other more recently sequenced plant species. Each of these species was extensively searched and manually assembled for analysis in the MPIC DB. The database presents an interactive diagram in a user-friendly manner, allowing users to select their import component of interest. The MPIC DB presents an extensive resource facilitating detailed investigation of the mitochondrial protein import machinery and allowing patterns of conservation and divergence to be recognized that would otherwise have been missed. To demonstrate the usefulness of the MPIC DB, we present a comparative analysis of the mitochondrial protein import machinery in plants and non-plant species, revealing plant-specific features that have evolved. © The Author 2014. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Crasto, Chiquito J.; Marenco, Luis N.; Liu, Nian; Morse, Thomas M.; Cheung, Kei-Hoi; Lai, Peter C.; Bahl, Gautam; Masiar, Peter; Lam, Hugo Y.K.; Lim, Ernest; Chen, Huajin; Nadkarni, Prakash; Migliore, Michele; Miller, Perry L.; Shepherd, Gordon M.
2009-01-01
This article presents the latest developments in neuroscience information dissemination through the SenseLab suite of databases: NeuronDB, CellPropDB, ORDB, OdorDB, OdorMapDB, ModelDB and BrainPharm. These databases include information related to: (i) neuronal membrane properties and neuronal models, and (ii) genetics, genomics, proteomics and imaging studies of the olfactory system. We describe here: the new features for each database, the evolution of SenseLab’s unifying database architecture and instances of SenseLab database interoperation with other neuroscience online resources. PMID:17510162
Analyzing GAIAN Database (GaianDB) on a Tactical Network
2015-11-30
we connected 3 Raspberry Pis running GaianDB and our augmented version of splatform to a network of 3 CSRs. The Raspberry Pi is a low power, low...based on Debian from a connected secure digital high capacity (SDHC) card or a universal serial bus (USB) device. The Raspberry Pi comes equipped with...requirements, capabilities, and cost make the Raspberry Pi a useful device for sensor experimentation. From there, we performed 3 types of benchmarks
Lee, Myunggyo; Lee, Kyubum; Yu, Namhee; Jang, Insu; Choi, Ikjung; Kim, Pora; Jang, Ye Eun; Kim, Byounggun; Kim, Sunkyu; Lee, Byungwook; Kang, Jaewoo; Lee, Sanghyuk
2017-01-04
Fusion genes are an important class of therapeutic targets and prognostic markers in cancer. ChimerDB is a comprehensive database of fusion genes encompassing analysis of deep sequencing data and manual curation. In this update, the database coverage was enhanced considerably by adding two new modules of The Cancer Genome Atlas (TCGA) RNA-Seq analysis and PubMed abstract mining. ChimerDB 3.0 is composed of three modules: ChimerKB, ChimerPub and ChimerSeq. ChimerKB represents a knowledgebase including 1066 manually curated fusion genes compiled from public resources of fusion genes with experimental evidence. ChimerPub includes 2767 fusion genes obtained from text mining of PubMed abstracts. The ChimerSeq module is designed to archive the fusion candidates from deep sequencing data. Importantly, we have analyzed RNA-Seq data of the TCGA project covering 4569 patients in 23 cancer types using two reliable programs, FusionScan and TopHat-Fusion. The new user interface supports diverse search options and graphic representation of fusion gene structure. ChimerDB 3.0 is available at http://ercsb.ewha.ac.kr/fusiongene/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
AOP-DB Frontend: A user interface for the Adverse Outcome Pathways Database.
The EPA Adverse Outcome Pathway Database (AOP-DB) is a database resource that aggregates association relationships between AOPs, genes, chemicals, diseases, pathways, species orthology information, and ontologies. The AOP-DB frontend is a simple yet powerful AOP-DB user interface in...
Neuroimaging Data Sharing on the Neuroinformatics Database Platform
Book, Gregory A; Stevens, Michael; Assaf, Michal; Glahn, David; Pearlson, Godfrey D
2015-01-01
We describe the Neuroinformatics Database (NiDB), an open-source database platform for archiving, analysis, and sharing of neuroimaging data. Data from the multi-site projects Autism Brain Imaging Data Exchange (ABIDE), Bipolar-Schizophrenia Network on Intermediate Phenotypes parts one and two (B-SNIP1, B-SNIP2), and Monetary Incentive Delay task (MID) are available for download from the public instance of NiDB, with more projects sharing data as it becomes available. As demonstrated by making several large datasets available, NiDB is an extensible platform appropriately suited to archive and distribute shared neuroimaging data. PMID:25888923
Mostaguir, Khaled; Hoogland, Christine; Binz, Pierre-Alain; Appel, Ron D
2003-08-01
The Make 2D-DB tool has been previously developed to help build federated two-dimensional gel electrophoresis (2-DE) databases on one's own web site. The purpose of our work is to extend the strength of the first package and to build a more efficient environment. Such an environment should be able to fulfill the different needs and requirements arising from both the growing use of 2-DE techniques and the increasing amount of distributed experimental data.
Kaas, Quentin; Ruiz, Manuel; Lefranc, Marie-Paule
2004-01-01
IMGT/3Dstructure-DB and IMGT/StructuralQuery are a novel 3D structure database and a new tool for immunological proteins. They are part of IMGT, the international ImMunoGeneTics information system®, a high-quality integrated knowledge resource specializing in immunoglobulins (IG), T cell receptors (TR), major histocompatibility complex (MHC) and related proteins of the immune system (RPI) of human and other vertebrate species, which consists of databases, Web resources and interactive on-line tools. IMGT/3Dstructure-DB data are described according to the IMGT Scientific chart rules based on the IMGT-ONTOLOGY concepts. IMGT/3Dstructure-DB provides IMGT gene and allele identification of IG, TR and MHC proteins with known 3D structures, domain delimitations, amino acid positions according to the IMGT unique numbering and renumbered coordinate flat files. Moreover, IMGT/3Dstructure-DB provides 2D graphical representations (or Colliers de Perles) and results of contact analysis. The IMGT/StructuralQuery tool allows search of this database based on specific structural characteristics. IMGT/3Dstructure-DB and IMGT/StructuralQuery are freely available at http://imgt.cines.fr. PMID:14681396
Wu, Jiaxin; Wu, Mengmeng; Li, Lianshuo; Liu, Zhuo; Zeng, Wanwen; Jiang, Rui
2016-01-01
The recent advancement of next-generation sequencing technology has enabled fast and low-cost detection of all genetic variants across the entire human genome, making whole-genome sequencing a trend in the study of disease-causing genetic variants. Nevertheless, a repository that collects predictions of the functionally damaging effects of human genetic variants is still lacking, though it is well recognized that such predictions play a central role in the analysis of whole-genome sequencing data. To fill this gap, we developed a database named dbWGFP (a database and web server of human whole-genome single nucleotide variants and their functional predictions) that contains functional predictions and annotations of nearly 8.58 billion possible human whole-genome single nucleotide variants. Specifically, this database integrates 48 functional predictions calculated by 17 popular computational methods and 44 valuable annotations obtained from various data sources. Standalone software, user-friendly query services and free downloads of this database are available at http://bioinfo.au.tsinghua.edu.cn/dbwgfp. dbWGFP provides a valuable resource for the analysis of whole-genome sequencing, exome sequencing and SNP array data, thereby complementing existing data sources and computational resources in deciphering the genetic bases of human inherited diseases. © The Author(s) 2016. Published by Oxford University Press.
Clima, Rosanna; Preste, Roberto; Calabrese, Claudia; Diroma, Maria Angela; Santorsola, Mariangela; Scioscia, Gaetano; Simone, Domenico; Shen, Lishuang; Gasparre, Giuseppe; Attimonelli, Marcella
2017-01-04
The HmtDB resource hosts a database of human mitochondrial genome sequences from individuals with healthy and disease phenotypes. The database is intended to support both population geneticists and clinicians undertaking the task of assessing the pathogenicity of specific mtDNA mutations. The wide application of next-generation sequencing (NGS) has provided an enormous volume of high-resolution data at a low price, increasing the availability of human mitochondrial sequencing data and calling for a significant expansion of HmtDB's data content, which has more than tripled in the current release. We here describe additional novel features, including: (i) a complete, user-friendly restyling of the web interface, (ii) links to the command-line stand-alone and web versions of the MToolBox package, an up-to-date tool to reconstruct and analyze human mitochondrial DNA from NGS data and (iii) the implementation of the Reconstructed Sapiens Reference Sequence (RSRS) as the mitochondrial reference sequence. The overall update renders HmtDB an even handier and more useful resource, as it enables more rapid data access, processing and analysis. HmtDB is accessible at http://www.hmtdb.uniba.it/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Blin, Kai; Medema, Marnix H; Kottmann, Renzo; Lee, Sang Yup; Weber, Tilmann
2017-01-04
Secondary metabolites produced by microorganisms are the main source of bioactive compounds in use as antimicrobial and anticancer drugs, fungicides, herbicides and pesticides. In the last decade, the increasing availability of microbial genomes has established genome mining as a very important method for the identification of their biosynthetic gene clusters (BGCs). One of the most popular tools for this task is antiSMASH. However, so far antiSMASH has been limited to computing results de novo for user-submitted genomes and only partially connects these with BGCs from other organisms. Therefore, we developed the antiSMASH database, a simple but highly useful new resource for browsing antiSMASH-annotated BGCs in the 3907 bacterial genomes currently in the database and performing advanced search queries combining multiple search criteria. antiSMASH-DB is available at http://antismash-db.secondarymetabolites.org/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
PylotDB - A Database Management, Graphing, and Analysis Tool Written in Python
DOE Office of Scientific and Technical Information (OSTI.GOV)
Barnette, Daniel W.
2012-01-04
PylotDB, written completely in Python, provides a user interface (UI) with which to interact with, analyze, graph data from, and manage open-source databases such as MySQL. The UI spares the user from needing in-depth knowledge of the database application programming interface (API). PylotDB allows the user to generate various kinds of plots from user-selected data; generate statistical information on text as well as numerical fields; back up and restore databases; compare database tables across different databases as well as across different servers; extract information from any field to create new fields; generate, edit, and delete databases, tables, and fields; generate or read into a table CSV data; and perform similar operations. Since much of the database information is brought under the control of the Python language, PylotDB is not intended for huge databases, for which MySQL and Oracle, for example, are better suited. It is better suited for the smaller databases typically needed by a small research group, and can also be used as a learning tool for database applications in general.
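The kind of query-plus-summary operation that PylotDB wraps in a UI can be sketched in a few lines of Python. This is a minimal illustration only: it uses the standard-library `sqlite3` module as a stand-in for a MySQL connection, and the table and column names are invented.

```python
import sqlite3
import statistics

# Stand-in for a small research-group database (sqlite3 here; PylotDB targets MySQL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (sample TEXT, runtime REAL)")
conn.executemany("INSERT INTO runs VALUES (?, ?)",
                 [("a", 1.2), ("b", 1.5), ("c", 0.9), ("d", 1.4)])

# The kind of operation the UI hides: select a numeric field and summarize it.
values = [row[0] for row in conn.execute("SELECT runtime FROM runs")]
print(len(values))                          # number of records: 4
print(round(statistics.mean(values), 3))    # mean runtime: 1.25
print(round(statistics.stdev(values), 3))   # sample standard deviation
```

A tool like PylotDB would then feed such summaries into plotting widgets rather than printing them.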
PlantNATsDB: a comprehensive database of plant natural antisense transcripts.
Chen, Dijun; Yuan, Chunhui; Zhang, Jian; Zhang, Zhao; Bai, Lin; Meng, Yijun; Chen, Ling-Ling; Chen, Ming
2012-01-01
Natural antisense transcripts (NATs), one type of regulatory RNA, occur prevalently in plant genomes and play significant roles in physiological and pathological processes. Although their important biological functions have been widely reported, a comprehensive database has been lacking until now. Consequently, we constructed a plant NAT database (PlantNATsDB) covering approximately 2 million NAT pairs in 69 plant species. GO annotation and currently available high-throughput small RNA sequencing data were integrated to investigate the biological function of NATs. PlantNATsDB provides various user-friendly web interfaces to facilitate the presentation of NATs and an integrated, graphical network browser to display the complex networks formed by different NATs. Moreover, a 'Gene Set Analysis' module based on GO annotation was designed to identify statistically significantly overrepresented GO categories in a specific NAT network. PlantNATsDB is currently the most comprehensive resource of NATs in the plant kingdom and can serve as a reference database for investigating the regulatory function of NATs. PlantNATsDB is freely available at http://bis.zju.edu.cn/pnatdb/.
dbPTM 2016: 10-year anniversary of a resource for post-translational modification of proteins.
Huang, Kai-Yao; Su, Min-Gang; Kao, Hui-Ju; Hsieh, Yun-Chung; Jhong, Jhih-Hua; Cheng, Kuang-Hao; Huang, Hsien-Da; Lee, Tzong-Yi
2016-01-04
Owing to the importance of the post-translational modifications (PTMs) of proteins in regulating biological processes, the dbPTM (http://dbPTM.mbc.nctu.edu.tw/) was developed as a comprehensive database of experimentally verified PTMs from several databases with annotations of potential PTMs for all UniProtKB protein entries. For this 10th anniversary of dbPTM, the updated resource provides not only a comprehensive dataset of experimentally verified PTMs, supported by the literature, but also an integrative interface for accessing all available databases and tools that are associated with PTM analysis. As well as collecting experimental PTM data from 14 public databases, this update manually curates over 12 000 modified peptides, including the emerging S-nitrosylation, S-glutathionylation and succinylation, from approximately 500 research articles, which were retrieved by text mining. As the number of available PTM prediction methods increases, this work compiles a non-homologous benchmark dataset to evaluate the predictive power of online PTM prediction tools. An increasing interest in the structural investigation of PTM substrate sites motivated the mapping of all experimental PTM peptides to protein entries of Protein Data Bank (PDB) based on database identifier and sequence identity, which enables users to examine spatially neighboring amino acids, solvent-accessible surface area and side-chain orientations for PTM substrate sites on tertiary structures. Since drug binding in PDB is annotated, this update identified over 1100 PTM sites that are associated with drug binding. The update also integrates metabolic pathways and protein-protein interactions to support the PTM network analysis for a group of proteins. Finally, the web interface is redesigned and enhanced to facilitate access to this resource. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
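The peptide-to-structure mapping step that dbPTM describes (locating an experimentally observed modified peptide within a full-length protein sequence to obtain the absolute PTM site position) can be sketched as follows. The sequence, peptide and modification offset below are invented for illustration; dbPTM's actual pipeline additionally matches database identifiers and PDB chains.

```python
# Map a modified peptide back onto its full-length protein to get the
# absolute (1-based) PTM site position. All data here are illustrative.
protein = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
peptide, mod_offset = "QRQISFVK", 3   # modification on the 4th peptide residue (0-based 3)

start = protein.find(peptide)         # substring match; real pipelines align
if start >= 0:
    site = start + mod_offset + 1     # 1-based position in the protein
    print(protein[site - 1], site)    # → I 12
```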
Non-B DB v2.0: a database of predicted non-B DNA-forming motifs and its associated tools.
Cer, Regina Z; Donohue, Duncan E; Mudunuri, Uma S; Temiz, Nuri A; Loss, Michael A; Starner, Nathan J; Halusa, Goran N; Volfovsky, Natalia; Yi, Ming; Luke, Brian T; Bacolla, Albino; Collins, Jack R; Stephens, Robert M
2013-01-01
The non-B DB, available at http://nonb.abcc.ncifcrf.gov, catalogs predicted non-B DNA-forming sequence motifs, including Z-DNA, G-quadruplex, A-phased repeats, inverted repeats, mirror repeats, direct repeats and their corresponding subsets: cruciforms, triplexes and slipped structures, in several genomes. Version 2.0 of the database revises and re-implements the motif discovery algorithms to better align with accepted definitions and thresholds for motifs, expands the non-B DNA-forming motifs coverage by including short tandem repeats and adds key visualization tools to compare motif locations relative to other genomic annotations. Non-B DB v2.0 extends the ability for comparative genomics by including re-annotation of the five organisms reported in non-B DB v1.0, human, chimpanzee, dog, macaque and mouse, and adds seven additional organisms: orangutan, rat, cow, pig, horse, platypus and Arabidopsis thaliana. Additionally, the non-B DB v2.0 provides an overall improved graphical user interface and faster query performance.
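As an illustration of one motif class catalogued above, a minimal inverted-repeat scan might look like the following. This is our own sketch of the motif definition (an arm followed, after a short spacer, by its reverse complement, the geometry that allows cruciform formation); it is not non-B DB's validated discovery algorithm, and the arm and spacer thresholds are arbitrary.

```python
# Toy inverted-repeat (cruciform candidate) scan; thresholds are illustrative.
COMP = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(COMP)[::-1]

def find_inverted_repeats(seq, arm=4, max_spacer=3):
    """Return (start, arm_seq, spacer_len) for each left arm followed,
    after <= max_spacer bases, by its reverse complement."""
    hits = []
    for i in range(len(seq) - 2 * arm + 1):
        left = seq[i:i + arm]
        for spacer in range(max_spacer + 1):
            j = i + arm + spacer
            right = seq[j:j + arm]
            if len(right) == arm and right == revcomp(left):
                hits.append((i, left, spacer))
    return hits

print(find_inverted_repeats("TTGACGAAACGTCTT"))  # → [(2, 'GACG', 3)]
```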
NASA Astrophysics Data System (ADS)
Block, K. A.; Lehnert, K. A.; Johansson, A. K.; Herzberg, C. T.; Stern, R. J.; Bloomer, S.; Gerard-Little, P.; Paul, M.; Raye, U.; Sou, N.
2007-12-01
Geoinformatics resources are indispensable tools for researchers and educators at the forefront of geoscience. One example is PetDB (http://www.petdb.org), which serves as a data resource and reference in a broad suite of studies of the solid earth and is cited in over 160 peer-reviewed articles. The ongoing success of geochemical and petrological database projects such as PetDB, SedDB, and the EarthChem Deep Lithosphere dataset depends on addressing disciplinary interest and scientific need. A new generation of scientists who understand and utilize online data resources therefore possesses a unique advantage over researchers with limited experience using online databases, in that they can help shape the way the resources evolve. In an effort to foster awareness and further research goals, students and faculty from the University of Texas at Dallas, Rutgers University, and Columbia University have partnered with researchers at the Lamont-Doherty Earth Observatory to provide training in the use and development of geochemical databases to undergraduate and graduate students. Student internships lasting between six weeks and two months consisted of familiarization with relational databases at every level. Internships were developed to extend and apply students' prior knowledge to the development of data resources, to nurture interest in geochemistry and petrology, and to encourage students to pursue graduate studies by engaging them in current scientific topics. Students were mentored one-on-one and assigned to data compilation in specific topics, with the intent of providing background in the literature that can be used in future research papers.
Outcomes of the internships include the development of a new petrological dataset of samples from the Central Atlantic Magmatic Province (CAMP), expansion of a database of mantle xenoliths (EarthChem Deep Lithosphere Dataset) that will serve as a major component to a doctoral dissertation, and the development of a classification for mantle peridotites. These efforts are expected to have a significant impact on long-standing research issues and will provide insight into the processes involving the breakup of Pangea, the influence of large igneous provinces on mass extinctions, and the evolution of the North American lithospheric mantle.
Ni, Ming; Ye, Fuqiang; Zhu, Juanjuan; Li, Zongwei; Yang, Shuai; Yang, Bite; Han, Lu; Wu, Yongge; Chen, Ying; Li, Fei; Wang, Shengqi; Bo, Xiaochen
2014-12-01
Numerous public microarray datasets are valuable resources for the scientific community. Several online tools have made great strides in using these data by querying related datasets with users' own gene signatures or expression profiles. However, dataset annotation and result exhibition still need to be improved. ExpTreeDB is a database that allows for queries on human and mouse microarray experiments from Gene Expression Omnibus with gene signatures or profiles. Compared with similar applications, ExpTreeDB pays more attention to dataset annotation and result visualization. We introduced a multiple-level annotation system to depict and organize the original experiments. For example, a tamoxifen-treated cell line experiment is hierarchically annotated as 'agent→drug→estrogen receptor antagonist→tamoxifen'. Consequently, retrieved results are exhibited in an interactive tree-structured graphic, which provides an overview of related experiments and might enlighten users on key items of interest. The database is freely available at http://biotech.bmi.ac.cn/ExpTreeDB. The web site is implemented in Perl, PHP, R, MySQL and Apache. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
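The multiple-level annotation idea can be sketched as a path lookup: each experiment carries an ordered annotation path, and a query for any term retrieves every experiment whose path passes through it. The record IDs, the second and third paths, and the helper function below are illustrative, not ExpTreeDB's actual data or API.

```python
# Each experiment is annotated with an ordered general-to-specific path.
# Only the tamoxifen path comes from the abstract; the rest is invented.
annotations = {
    "E-001": ["agent", "drug", "estrogen receptor antagonist", "tamoxifen"],
    "E-002": ["agent", "drug", "estrogen receptor antagonist", "fulvestrant"],
    "E-003": ["agent", "genetic perturbation", "knockdown"],
}

def experiments_under(term):
    """Experiments whose annotation path passes through `term`."""
    return sorted(exp for exp, path in annotations.items() if term in path)

print(experiments_under("estrogen receptor antagonist"))  # → ['E-001', 'E-002']
print(experiments_under("agent"))                         # → all three records
```

Grouping results by shared path prefixes is what makes the tree-structured display possible.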
Development of the geometry database for the CBM experiment
NASA Astrophysics Data System (ADS)
Akishina, E. P.; Alexandrov, E. I.; Alexandrov, I. N.; Filozova, I. A.; Friese, V.; Ivanov, V. V.
2018-01-01
The paper describes the current state of the Geometry Database (Geometry DB) for the CBM experiment. The main purpose of this database is to provide convenient tools for: (1) managing the geometry modules; (2) assembling various versions of the CBM setup as a combination of geometry modules and additional files. The CBM users of the Geometry DB may use both GUI (Graphical User Interface) and API (Application Programming Interface) tools for working with it.
TR32DB - Management of Research Data in a Collaborative, Interdisciplinary Research Project
NASA Astrophysics Data System (ADS)
Curdt, Constanze; Hoffmeister, Dirk; Waldhoff, Guido; Lang, Ulrich; Bareth, Georg
2015-04-01
The management of research data in a well-structured and documented manner is essential in the context of collaborative, interdisciplinary research environments (e.g. across various institutions). Consequently, the set-up and use of a research data management (RDM) system like a data repository or project database is necessary. These systems should accompany and support scientists during the entire research life cycle (e.g. data collection, documentation, storage, archiving, sharing, publishing) and operate cross-disciplinarily in interdisciplinary research projects. The challenges and problems of RDM are well-known. Consequently, the set-up of a user-friendly, well-documented, sustainable RDM system is essential, as are user support and further assistance. In the framework of the Transregio Collaborative Research Centre 32 'Patterns in Soil-Vegetation-Atmosphere Systems: Monitoring, Modelling, and Data Assimilation' (CRC/TR32), funded by the German Research Foundation (DFG), an RDM system was self-designed and implemented. The CRC/TR32 project database (TR32DB, www.tr32db.de) has been operating online since early 2008. The TR32DB handles all data created by the involved project participants from several institutions (e.g. the Universities of Cologne, Bonn, Aachen, and the Research Centre Jülich) and research fields (e.g. soil and plant sciences, hydrology, geography, geophysics, meteorology, remote sensing). Very heterogeneous research data are considered, resulting from field measurement campaigns, meteorological monitoring, remote sensing, laboratory studies and modelling approaches. Furthermore, outcomes such as publications, conference contributions, PhD reports and corresponding images are also covered. The TR32DB project database is set up in cooperation with the Regional Computing Centre of the University of Cologne (RRZK) and is located in this hardware environment.
The TR32DB system architecture is composed of three main components: (i) a file-based data storage including backup, (ii) a database-based storage for administrative data and metadata, and (iii) a web interface for user access. The TR32DB offers the common features of RDM systems. These include data storage, entry of corresponding metadata via a user-friendly input wizard, search and download of data depending on user permission, as well as secure internal exchange of data. In addition, a Digital Object Identifier (DOI) can be allocated for specific datasets, and several web mapping components are supported (e.g. Web-GIS and map search). The centrepiece of the TR32DB is the self-designed and implemented CRC/TR32-specific metadata schema. This enables the documentation of all involved, heterogeneous data with accurate, interoperable metadata. The TR32DB Metadata Schema is set up using a multi-level approach and supports several metadata standards and schemes (e.g. Dublin Core, ISO 19115, INSPIRE, DataCite). Furthermore, metadata properties focused on the CRC/TR32 background (e.g. CRC/TR32-specific keywords) and the supported data types are included. Mandatory, optional and automatic metadata properties are specified. Overall, the TR32DB is designed and implemented according to the needs of the CRC/TR32 (e.g. a huge amount of heterogeneous data) and the demands of the DFG (e.g. cooperation with a computing centre). The application of a self-designed, project-specific, interoperable metadata schema enables the accurate documentation of all CRC/TR32 data. The implementation of the TR32DB in the hardware environment of the RRZK ensures access to the data after the end of the CRC/TR32 funding in 2018.
AOP-DB Frontend: A user interface for the Adverse Outcome Pathways Database
The EPA Adverse Outcome Pathway Database (AOP-DB) is a database resource that aggregates association relationships between AOPs, genes, chemicals, diseases, pathways, species orthology information, and ontologies. The AOP-DB frontend is a simple yet powerful user interface in the for...
CottonDB: A resource for cotton genome research
USDA-ARS?s Scientific Manuscript database
CottonDB (http://cottondb.org/) is a database and web resource for cotton genomic and genetic research. Created in 1995, CottonDB was among the first plant genome databases established by the USDA-ARS. Accessed through a website interface, the database aims to be a convenient, inclusive medium of ...
S/MARt DB: a database on scaffold/matrix attached regions.
Liebich, Ines; Bode, Jürgen; Frisch, Matthias; Wingender, Edgar
2002-01-01
S/MARt DB, the S/MAR transaction database, is a relational database covering scaffold/matrix attached regions (S/MARs) and nuclear matrix proteins that are involved in the chromosomal attachment to the nuclear scaffold. The data are mainly extracted from original publications, but a World Wide Web interface for direct submissions is also available. S/MARt DB is closely linked to the TRANSFAC database on transcription factors and their binding sites. It is freely accessible through the World Wide Web (http://transfac.gbf.de/SMARtDB/) for non-profit research.
The BiolAD-DB system : an informatics system for clinical and genetic data.
Nielsen, David A; Leidner, Marty; Haynes, Chad; Krauthammer, Michael; Kreek, Mary Jeanne
2007-01-01
The Biology of Addictive Diseases-Database (BiolAD-DB) system is a research bioinformatics system for archiving, analyzing, and processing complex clinical and genetic data. The database schema employs design principles for handling complex clinical information, such as response items in genetic questionnaires. Data access and validation are provided by the BiolAD-DB client application, which features a data validation engine tightly coupled to a graphical user interface. Data integrity is provided by the password-protected, SQL-compliant BiolAD-DB server and database. BiolAD-DB tools further provide functionalities for generating customized reports and views. The BiolAD-DB system schema, client, and installation instructions are freely available at http://www.rockefeller.edu/biolad-db/.
Liu, Wanting; Xiang, Lunping; Zheng, Tingkai; Jin, Jingjie; Zhang, Gong
2018-01-04
Translation is a key regulatory step linking the transcriptome and the proteome. The two major methods of translatome investigation are RNC-seq (sequencing of translating mRNA) and Ribo-seq (ribosome profiling). To facilitate the investigation of translation, we built the comprehensive database TranslatomeDB (http://www.translatomedb.net/), which provides collection and integrated analysis of published and user-generated translatome sequencing data. The current version includes 2453 Ribo-seq, 10 RNC-seq and their 1394 corresponding mRNA-seq datasets in 13 species. The database emphasizes analysis functions in addition to the dataset collections. Differential gene expression (DGE) analysis can be performed between any two datasets of the same species and type, on both the transcriptome and translatome levels. The translation indices (translation ratio, elongation velocity index and translational efficiency) can be calculated to quantitatively evaluate translational initiation efficiency and elongation velocity. All datasets were analyzed using a unified, robust, accurate and experimentally verifiable pipeline based on the FANSe3 mapping algorithm and edgeR for DGE analyses. TranslatomeDB also allows users to upload their own datasets and utilize the identical unified pipeline to analyze their data. We believe that TranslatomeDB is a comprehensive platform and knowledgebase for translatome and proteome research, freeing biologists from the complex searching, analysis and comparison of huge sequencing datasets without the need for local computational power. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
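One of the indices mentioned, translational efficiency, is commonly computed per gene as the ratio of translatome abundance (Ribo-seq or RNC-seq) to mRNA-seq abundance. The sketch below shows that idea with invented abundance values; the exact normalization TranslatomeDB applies may differ.

```python
# Illustrative per-gene translational efficiency (TE): translatome / mRNA.
# Abundance values are invented; real pipelines use normalized read counts.
ribo_rpkm = {"geneA": 120.0, "geneB": 15.0, "geneC": 40.0}
mrna_rpkm = {"geneA": 60.0, "geneB": 30.0, "geneC": 40.0}

te = {g: ribo_rpkm[g] / mrna_rpkm[g]
      for g in ribo_rpkm if mrna_rpkm.get(g, 0) > 0}   # skip unexpressed genes
print(te)  # geneA is translated ~2x per transcript, geneB ~0.5x
```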
A Tutorial in Creating Web-Enabled Databases with Inmagic DB/TextWorks through ODBC.
ERIC Educational Resources Information Center
Breeding, Marshall
2000-01-01
Explains how to create Web-enabled databases. Highlights include Inmagic's DB/Text WebPublisher product called DB/TextWorks; ODBC (Open Database Connectivity) drivers; Perl programming language; HTML coding; Structured Query Language (SQL); Common Gateway Interface (CGI) programming; and examples of HTML pages and Perl scripts. (LRW)
The Toxicity Reference Database (ToxRefDB) is a publicly accessible resource that contains 40+ years of in vivo dose-response toxicological studies. ToxRefDB provides curated in vivo toxicity data for systematic evaluation of a continuously expanding catalog of chemicals, and co...
Gasc, Cyrielle; Constantin, Antony; Jaziri, Faouzi; Peyret, Pierre
2017-01-01
The detection and identification of bacterial pathogens involved in acts of bio- and agroterrorism are essential to avoid pathogen dispersal in the environment and propagation within the population. Conventional molecular methods, such as PCR amplification, DNA microarrays or shotgun sequencing, are subject to various limitations when assessing environmental samples, which can lead to inaccurate findings. We developed a hybridization capture strategy that uses a set of oligonucleotide probes to target and enrich biomarkers of interest in environmental samples. Here, we present the Oligonucleotide Capture Probes for Pathogen Identification Database (OCaPPI-Db), an online capture probe database containing a set of 1,685 oligonucleotide probes allowing for the detection and identification of 30 biothreat agents up to the species level. This probe set can be used in its entirety as a comprehensive diagnostic tool or can be restricted to a set of probes targeting a specific pathogen or virulence factor according to the user's needs. Database URL: http://ocappidb.uca.works. © The Author(s) 2017. Published by Oxford University Press.
Development of Vision Based Multiview Gait Recognition System with MMUGait Database
Ng, Hu; Tan, Wooi-Haw; Tong, Hau-Lee
2014-01-01
This paper describes the acquisition setup and development of a new gait database, MMUGait. This database consists of 82 subjects walking under normal conditions and 19 subjects walking with 11 covariate factors, captured under two views. This paper also proposes a multiview model-based gait recognition system with a joint detection approach that performs well under different walking trajectories and covariate factors, including self-occluded or externally occluded silhouettes. In the proposed system, the process begins by enhancing the human silhouette to remove artifacts. Next, the width and height of the body are obtained. Subsequently, the joint angular trajectories are determined once the body joints are automatically detected. Lastly, the crotch height and step size of the walking subject are determined. The extracted features are smoothed with a Gaussian filter to eliminate the effect of outliers, then normalized with linear scaling, followed by feature selection prior to the classification process. The classification experiments carried out on the MMUGait database were benchmarked against the SOTON Small DB from the University of Southampton. Results showed correct classification rates above 90% for all the databases. The proposed approach is found to outperform other approaches on the SOTON Small DB in most cases. PMID:25143972
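The linear-scaling normalization step applied to the extracted gait features can be sketched generically as min-max scaling to a fixed range. This is the textbook formulation, not the authors' exact implementation:

```python
# Min-max (linear scaling) normalization of a feature vector to [lo, hi].
def linear_scale(values, lo=0.0, hi=1.0):
    vmin, vmax = min(values), max(values)
    if vmax == vmin:                       # constant feature: map to lo
        return [lo for _ in values]
    return [lo + (hi - lo) * (v - vmin) / (vmax - vmin) for v in values]

# e.g. crotch-height measurements across frames, in arbitrary units
print(linear_scale([10, 20, 40]))  # → [0.0, 0.333..., 1.0]
```

Scaling every feature to a common range keeps large-magnitude features (e.g. height in pixels) from dominating small ones (e.g. joint angles) during classification.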
SilkPathDB: a comprehensive resource for the study of silkworm pathogens.
Li, Tian; Pan, Guo-Qing; Vossbrinck, Charles R; Xu, Jin-Shan; Li, Chun-Feng; Chen, Jie; Long, Meng-Xian; Yang, Ming; Xu, Xiao-Fei; Xu, Chen; Debrunner-Vossbrinck, Bettina A; Zhou, Ze-Yang
2017-01-01
Silkworm pathogens have heavily impeded the development of the sericultural industry and play important roles in lepidopteran ecology; some of them are also used as biological insecticides. Rapid advances in omics studies of silkworm pathogens have produced a large amount of data, which need to be brought together centrally in a coherent and systematic manner. This will facilitate the reuse of these data for further analysis. We have collected genomic data for 86 silkworm pathogens from four taxa (fungi, microsporidia, bacteria and viruses) and from four lepidopteran hosts, and developed the open-access Silkworm Pathogen Database (SilkPathDB) to make this information readily available. The implementation of SilkPathDB integrates Drupal and GBrowse as a graphical interface to a Chado relational database which houses all of the datasets involved. The genomes have been assembled and annotated for comparative purposes and allow the search and analysis of homologous sequences, transposable elements, protein subcellular locations, including secreted proteins, and gene ontology. We believe that SilkPathDB will aid researchers in the identification of silkworm parasites, the understanding of the mechanisms of silkworm infection, and the study of the developmental ecology (gene expression) of silkworm parasites and their hosts. http://silkpathdb.swu.edu.cn. © The Author(s) 2017. Published by Oxford University Press.
CicerTransDB 1.0: a resource for expression and functional study of chickpea transcription factors.
Gayali, Saurabh; Acharya, Shankar; Lande, Nilesh Vikram; Pandey, Aarti; Chakraborty, Subhra; Chakraborty, Niranjan
2016-07-29
Transcription factor (TF) databases are a major resource for systematic studies of TFs in specific species as well as of related family members. Even though there are several publicly available multi-species databases, the information on the amount and diversity of TFs within individual species is fragmented, especially for newly sequenced genomes of non-model species of agricultural significance. We constructed CicerTransDB (Cicer Transcription Factor Database), the first database of its kind, which provides a centralized, putatively complete list of TFs in a food legume, chickpea. CicerTransDB, available at www.cicertransdb.esy.es, is based on the chickpea (Cicer arietinum L.) annotation v 1.0. The database is an outcome of a genome-wide domain study and manual classification of TF families. It provides information not only on the gene, but also on gene ontology, domain and motif architecture. CicerTransDB v 1.0 comprises information on 1124 genes of chickpea and enables the user not only to search, browse and download sequences but also to retrieve sequence features. CicerTransDB also provides several single-click interfaces connecting to various other databases to ease further analysis. Several web APIs integrated into the database allow end-users direct access to data. A critical comparison of CicerTransDB with PlantTFDB (Plant Transcription Factor Database) revealed 68 novel TFs in the chickpea genome, hitherto unexplored. Database URL: http://www.cicertransdb.esy.es.
LocSigDB: a database of protein localization signals
Negi, Simarjeet; Pandey, Sanjit; Srinivasan, Satish M.; Mohammed, Akram; Guda, Chittibabu
2015-01-01
LocSigDB (http://genome.unmc.edu/LocSigDB/) is a manually curated database of experimental protein localization signals for eight distinct subcellular locations; primarily in a eukaryotic cell, with brief coverage of bacterial proteins. Proteins must be localized in their appropriate subcellular compartments to perform their desired functions. Mislocalization of proteins to unintended locations is a causative factor in many human diseases; therefore, a collection of known sorting signals will help support many important areas of biomedical research. By performing an extensive literature study, we compiled a collection of 533 experimentally determined localization signals, along with the proteins that harbor such signals. Each signal in LocSigDB is annotated with its localization, source and PubMed references, and is linked to the proteins in the UniProt database, along with organism information, that contain the same amino acid pattern as the given signal. From the LocSigDB webserver, users can download the whole database or browse/search for data using an intuitive query interface. To date, LocSigDB is the most comprehensive compendium of protein localization signals for eight distinct subcellular locations. Database URL: http://genome.unmc.edu/LocSigDB/ PMID:25725059
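Matching a protein against a collection of signal patterns, the kind of lookup LocSigDB supports, can be sketched with regular expressions. The C-terminal KDEL ER-retention motif and the classic monopartite nuclear localization pattern are well-known examples, but the regex encodings and the helper below are our own illustration, not LocSigDB's data format.

```python
import re

# Illustrative signal patterns (real signal names, invented regex encodings).
SIGNALS = {
    "ER retention (C-terminal)": re.compile(r"KDEL$"),
    "Nuclear localization (classic monopartite)": re.compile(r"K[KR][A-Z][KR]"),
}

def annotate(seq):
    """Return the names of all signal patterns found in `seq`."""
    return [name for name, pat in SIGNALS.items() if pat.search(seq)]

print(annotate("MALWTRLRPLLALLALWPPPPAKDEL"))  # → ['ER retention (C-terminal)']
print(annotate("MKKRKV"))                     # matches the NLS pattern
```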
OriDB, the DNA replication origin database updated and extended.
Siow, Cheuk C; Nieduszynska, Sian R; Müller, Carolin A; Nieduszynski, Conrad A
2012-01-01
OriDB (http://www.oridb.org/) is a database containing collated genome-wide mapping studies of confirmed and predicted replication origin sites. The original database collated and curated Saccharomyces cerevisiae origin mapping studies. Here, we report that the OriDB database and web site have been revamped to improve user accessibility to curated data sets, to greatly increase the number of curated origin mapping studies, and to include the collation of replication origin sites in the fission yeast Schizosaccharomyces pombe. The revised database structure underlies these improvements and will facilitate further expansion in the future. The updated OriDB for S. cerevisiae is available at http://cerevisiae.oridb.org/ and for S. pombe at http://pombe.oridb.org/.
Providing R-Tree Support for Mongodb
NASA Astrophysics Data System (ADS)
Xiang, Longgang; Shao, Xiaotian; Wang, Dehao
2016-06-01
Supporting large amounts of spatial data is a significant characteristic of modern databases. However, unlike mature relational databases such as Oracle and PostgreSQL, most of the current burgeoning NoSQL databases are not well designed for storing geospatial data, which is becoming increasingly important in various fields. In this paper, we propose a novel method to provide an R-tree index, together with the corresponding spatial range query and nearest-neighbour query functions, for MongoDB, one of the most prevalent NoSQL databases. First, after an in-depth analysis of MongoDB's features, we devise an efficient tabular document structure which flattens the R-tree index into MongoDB collections. We then discuss the relevant mechanisms of R-tree operations and how to integrate the R-tree into MongoDB. Finally, we present experimental results which show that our proposed method outperforms the built-in spatial index of MongoDB. Our research will greatly facilitate big data management with MongoDB in a variety of geospatial information applications.
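The idea of flattening an R-tree into documents can be sketched with plain dictionaries standing in for MongoDB documents: each node becomes one record holding its minimum bounding rectangle (MBR) and child references, and a range query descends only into subtrees whose MBR intersects the query box. The field names and layout here are invented for illustration; the paper's actual collection schema is the authors' own design.

```python
# Toy flattened R-tree: one record per node, children referenced by id.
nodes = {
    "root": {"leaf": False, "children": ["n1", "n2"],
             "mbr": [(0, 0), (10, 10)]},   # minimum bounding rectangle
    "n1":   {"leaf": True, "entries": [("pt_a", (1, 1)), ("pt_b", (2, 3))],
             "mbr": [(1, 1), (2, 3)]},
    "n2":   {"leaf": True, "entries": [("pt_c", (8, 9))],
             "mbr": [(8, 9), (8, 9)]},
}

def intersects(mbr, lo, hi):
    """Does the node's MBR overlap the query box [lo, hi]?"""
    (x1, y1), (x2, y2) = mbr
    return x1 <= hi[0] and x2 >= lo[0] and y1 <= hi[1] and y2 >= lo[1]

def range_query(node_id, lo, hi):
    """Descend only into subtrees whose MBR intersects the query box."""
    node = nodes[node_id]
    if not intersects(node["mbr"], lo, hi):
        return []
    if node["leaf"]:
        return [oid for oid, (x, y) in node["entries"]
                if lo[0] <= x <= hi[0] and lo[1] <= y <= hi[1]]
    return [oid for child in node["children"] for oid in range_query(child, lo, hi)]

print(range_query("root", (0, 0), (5, 5)))  # → ['pt_a', 'pt_b']
```

In the paper's setting, each `nodes[...]` lookup would instead be a document fetch from a MongoDB collection, which is why a compact, flat node layout matters for performance.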
HuMiChip: Development of a Functional Gene Array for the Study of Human Microbiomes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tu, Q.; Deng, Ye; Lin, Lu
Microbiomes play very important roles in nutrition, health and disease by interacting with their hosts. Based on sequence data currently available in the public domain, we have developed a functional gene array to monitor both organismal and functional gene profiles of the normal microbiota of human and mouse hosts; this array is called the human and mouse microbiota array (HMM-Chip). First, seed sequences were identified from KEGG databases and used to construct a seed database (seedDB) containing 136 gene families in 19 metabolic pathways closely related to human and mouse microbiomes. Second, a mother database (motherDB) was constructed with 81 genomes of bacterial strains (54 from gut and 27 from oral environments) and 16 metagenomes, and was used for gene selection and probe design. Gene prediction was performed by Glimmer3 for bacterial genomes and by the Metagene program for metagenomes. In total, 228,240 and 801,599 genes were identified for bacterial genomes and metagenomes, respectively. The motherDB was then searched against the seedDB using the HMMer program, and gene sequences in the motherDB that were highly homologous to seed sequences in the seedDB were used for probe design with the CommOligo software. Probes of different degrees of specificity, including gene-specific, inclusive and exclusive group-specific probes, were selected. All candidate probes were checked against the motherDB and NCBI databases for specificity. Finally, 7,763 probes covering 91.2% (12,601 out of 13,814) of the HMMer-confirmed sequences from 75 bacterial genomes and 16 metagenomes were selected. The developed HMM-Chip is able to detect the diversity and abundance of functional genes, the gene expression of microbial communities, and, potentially, the interactions of microorganisms and their hosts.
A Benchmark and Comparative Study of Video-Based Face Recognition on COX Face Database.
Huang, Zhiwu; Shan, Shiguang; Wang, Ruiping; Zhang, Haihong; Lao, Shihong; Kuerban, Alifu; Chen, Xilin
2015-12-01
Face recognition with still face images has been widely studied, while research on video-based face recognition remains relatively inadequate, especially in terms of benchmark datasets and comparisons. Real-world video-based face recognition applications require techniques for three distinct scenarios: 1) Video-to-Still (V2S); 2) Still-to-Video (S2V); and 3) Video-to-Video (V2V), respectively taking video or still images as query or target. To the best of our knowledge, few datasets and evaluation protocols have been benchmarked for all three scenarios. To facilitate the study of this specific topic, this paper contributes a benchmarking and comparative study based on a newly collected still/video face database, named COX Face DB. Specifically, we make three contributions. First, we collect and release a large-scale still/video face database to simulate video surveillance with three different video-based face recognition scenarios (i.e., V2S, S2V, and V2V). Second, for benchmarking the three scenarios designed on our database, we review and experimentally compare a number of existing set-based methods. Third, we further propose a novel Point-to-Set Correlation Learning (PSCL) method, and experimentally show that it can be used as a promising baseline method for V2S/S2V face recognition on COX Face DB. Extensive experimental results clearly demonstrate that video-based face recognition needs more effort, and that our COX Face DB is a good benchmark database for evaluation.
New tools and methods for direct programmatic access to the dbSNP relational database.
Saccone, Scott F; Quan, Jiaxi; Mehta, Gaurang; Bolze, Raphael; Thomas, Prasanth; Deelman, Ewa; Tischfield, Jay A; Rice, John P
2011-01-01
Genome-wide association studies often incorporate information from public biological databases in order to provide a biological reference for interpreting the results. The dbSNP database is an extensive source of information on single nucleotide polymorphisms (SNPs) for many different organisms, including humans. We have developed free software that will download and install a local MySQL implementation of the dbSNP relational database for a specified organism. We have also designed a system for classifying dbSNP tables in terms of common tasks we wish to accomplish using the database. For each task we have designed a small set of custom tables that facilitate task-related queries and provide entity-relationship diagrams for each task composed from the relevant dbSNP tables. In order to expose these concepts and methods to a wider audience we have developed web tools for querying the database and browsing documentation on the tables and columns to clarify the relevant relational structure. All web tools and software are freely available to the public at http://cgsmd.isi.edu/dbsnpq. Resources such as these for programmatically querying biological databases are essential for viably integrating biological information into genetic association experiments on a genome-wide scale.
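The abstract's "small set of custom tables that facilitate task-related queries" can be illustrated with a minimal sketch: prebuild one narrow table that answers one common question from a join over the base tables. Here sqlite3 stands in for the local MySQL install, and all table and column names (`snp`, `snp_loc`, `task_position`) are invented for illustration, not dbSNP's real schema:

```python
import sqlite3

# Base tables, as a local relational mirror might hold them.
db = sqlite3.connect(":memory:")  # stand-in for the local MySQL database
db.executescript("""
    CREATE TABLE snp (snp_id INTEGER PRIMARY KEY, validated INTEGER);
    CREATE TABLE snp_loc (snp_id INTEGER, chrom TEXT, pos INTEGER);
""")
db.executemany("INSERT INTO snp VALUES (?, ?)", [(334, 1), (999, 0)])
db.executemany("INSERT INTO snp_loc VALUES (?, ?, ?)",
               [(334, "11", 5227002), (999, "1", 12345)])

# Task-specific custom table: one prebuilt join serving the task
# "look up the chromosome position of a validated SNP by id".
db.execute("""
    CREATE TABLE task_position AS
        SELECT s.snp_id, l.chrom, l.pos
        FROM snp s JOIN snp_loc l ON s.snp_id = l.snp_id
        WHERE s.validated = 1
""")
row = db.execute("SELECT chrom, pos FROM task_position WHERE snp_id = 334").fetchone()
print(row)  # → ('11', 5227002)
```

The benefit is that downstream queries touch one small, task-shaped table instead of re-deriving the join across the full relational structure each time.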
ReprDB and panDB: minimalist databases with maximal microbial representation.
Zhou, Wei; Gay, Nicole; Oh, Julia
2018-01-18
Profiling of shotgun metagenomic samples is hindered by the lack of a unified microbial reference genome database that (i) assembles genomic information from all open-access microbial genomes, (ii) has a relatively small size, and (iii) is compatible with various metagenomic read mapping tools. Moreover, computational tools to rapidly compile and update such databases to accommodate the rapid increase in new reference genomes do not exist. As a result, database-guided analyses often fail to profile a substantial fraction of metagenomic shotgun sequencing reads from complex microbiomes. We report pipelines that efficiently traverse all open-access microbial genomes and assemble non-redundant genomic information. The pipelines result in two species-resolution microbial reference databases of relatively small size: reprDB, which assembles microbial representative or reference genomes, and panDB, for which we developed a novel iterative alignment algorithm to identify and assemble non-redundant genomic regions across multiple sequenced strains. With these databases, we were able to assign taxonomic labels and genome positions to the majority of metagenomic reads from human skin and gut microbiomes, a significant improvement over a previous database-guided analysis of the same datasets. reprDB and panDB leverage the rapid increase in the number of open-access microbial genomes to more fully profile metagenomic samples. Additionally, the databases exclude redundant sequence information to avoid inflating storage and memory requirements and indexing or analysis times. Finally, the novel iterative alignment algorithm significantly increases efficiency in pan-genome identification and can be useful in comparative genomic analyses.
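The panDB idea — iteratively adding strains while keeping only the genomic regions not yet represented — can be caricatured with interval arithmetic on a shared coordinate system. This is a toy of the non-redundancy principle only, not the authors' alignment algorithm (which operates on sequence alignments, not precomputed coordinates):

```python
def novel_regions(known, candidate):
    """Return the parts of `candidate` intervals not covered by `known`.
    Intervals are half-open (start, end) on a shared coordinate system."""
    out = []
    for start, end in candidate:
        cur = start
        for ks, ke in sorted(known):
            if ke <= cur or ks >= end:
                continue  # no overlap with the uncovered remainder
            if ks > cur:
                out.append((cur, ks))  # uncovered gap before this block
            cur = max(cur, ke)
        if cur < end:
            out.append((cur, end))
    return out

def build_pan(strains):
    """Iteratively add each strain, keeping only its not-yet-seen regions."""
    known, pan = [], {}
    for name, intervals in strains:
        novel = novel_regions(known, intervals)
        pan[name] = novel
        known.extend(novel)
    return pan

pan = build_pan([("strainA", [(0, 100)]),
                 ("strainB", [(50, 150)]),   # first 50 units redundant
                 ("strainC", [(40, 60)])])   # fully redundant
print(pan["strainB"])  # → [(100, 150)]
print(pan["strainC"])  # → []
```

Each strain contributes only its novel regions to the growing pan-genome, which is what keeps the final database small while preserving full representation.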
Rosenfeld, Aaron M; Meng, Wenzhao; Luning Prak, Eline T; Hershberg, Uri
2017-01-15
As high-throughput sequencing of B cells becomes more common, the need for tools to analyze the large quantity of data also increases. This article introduces ImmuneDB, a system for analyzing vast amounts of heavy chain variable region sequences and exploring the resulting data. It can take raw FASTA/FASTQ data as input, identify genes, determine clones and construct lineages, as well as provide information such as selection pressure and mutation analysis. It uses an industry-leading database, MySQL, to provide fast analysis and avoid the complexities of using error-prone flat files. ImmuneDB is freely available at http://immunedb.com. A demo of the ImmuneDB web interface is available at http://immunedb.com/demo. Contact: Uh25@drexel.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
ProCarDB: a database of bacterial carotenoids.
Nupur, L N U; Vats, Asheema; Dhanda, Sandeep Kumar; Raghava, Gajendra P S; Pinnaka, Anil Kumar; Kumar, Ashwani
2016-05-26
Carotenoids have important functions in bacteria, ranging from harvesting light energy to neutralizing oxidants and acting as virulence factors. However, information pertaining to carotenoids is scattered throughout the literature. Furthermore, information about the genes/proteins involved in the biosynthesis of carotenoids has increased tremendously in the post-genomic era. A web server providing information about microbial carotenoids in a structured manner is therefore needed and will be a valuable resource for the scientific community working with microbial carotenoids. Here, we have created a manually curated, open-access, comprehensive compilation of bacterial carotenoids named ProCarDB (Prokaryotic Carotenoid Database). ProCarDB includes 304 unique carotenoids arising from 50 biosynthetic pathways distributed among 611 prokaryotes. ProCarDB provides important information on carotenoids, such as 2D and 3D structures, molecular weight, molecular formula, SMILES, InChI, InChIKey, IUPAC name, KEGG Id, PubChem Id and ChEBI Id. The database also provides NMR data, UV-vis absorption data, IR data, MS data and HPLC data, which play key roles in the identification of carotenoids. An important feature of this database is the extension of biosynthetic pathways from the literature and through the presence of the genes/enzymes in different organisms. The information contained in the database was mined from the published literature and from databases such as KEGG, PubChem, ChEBI, LipidBank, LPSN and Uniprot. The database integrates user-friendly browsing and searching with carotenoid analysis tools to help the user. We believe that this database will serve as a major information centre for researchers working on bacterial carotenoids.
HypoxiaDB: a database of hypoxia-regulated proteins
Khurana, Pankaj; Sugadev, Ragumani; Jain, Jaspreet; Singh, Shashi Bala
2013-01-01
There has been intense interest in the cellular response to hypoxia, and a large number of differentially expressed proteins have been identified through various high-throughput experiments. These valuable data are scattered, and there have been no systematic attempts to document the various proteins regulated by hypoxia. Compilation, curation and annotation of these data are important in deciphering their role in hypoxia and hypoxia-related disorders. Therefore, we have compiled HypoxiaDB, a database of hypoxia-regulated proteins. It is a comprehensive, manually curated, non-redundant catalog of proteins whose expression has been shown experimentally to be altered at different levels and durations of hypoxia. The database currently contains 72,000 manually curated entries covering 3500 proteins extracted from 73 peer-reviewed publications selected from PubMed. HypoxiaDB is distinctive from other generalized databases: (i) it compiles tissue-specific protein expression changes under different levels and durations of hypoxia, and provides manually curated literature references to support the inclusion of each protein in the database and establish its association with hypoxia. (ii) For each protein, HypoxiaDB integrates data on gene ontology, KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways, protein–protein interactions, protein families (Pfam), OMIM (Online Mendelian Inheritance in Man), PDB (Protein Data Bank) structures and homology to other sequenced genomes. (iii) It also provides pre-compiled information on hypoxia proteins which would otherwise require tedious computational analysis, including chromosomal location and identifiers such as Entrez, HGNC, Unigene, Uniprot, Ensembl, Vega, GI numbers and GenBank accession numbers associated with the protein. These are cross-linked to the respective public databases, connecting HypoxiaDB to external repositories.
(iv) In addition, HypoxiaDB provides an online sequence-similarity search tool for users to compare their protein sequences against the HypoxiaDB protein database. We hope that HypoxiaDB will enrich our knowledge of hypoxia-related biology and eventually lead to the development of novel hypotheses and advancements in diagnostic and therapeutic activities. HypoxiaDB is freely accessible for academic and non-profit users at http://www.hypoxiadb.com. Database URL: http://www.hypoxiadb.com PMID:24178989
Ndhlovu, Andrew; Durand, Pierre M.; Hazelhurst, Scott
2015-01-01
The evolutionary rate at codon sites across protein-coding nucleotide sequences represents a valuable tier of information for aligning sequences, inferring homology and constructing phylogenetic profiles. However, a comprehensive resource cataloguing the evolutionary rate at codon sites and their corresponding nucleotide and protein domain sequence alignments has not been developed. To address this gap, EvoDB (an Evolutionary rates DataBase) was compiled. Nucleotide sequences and their corresponding protein domain data, including the associated seed alignments from the PFAM-A (protein family) database, were used to estimate evolutionary rate (ω = dN/dS) profiles at codon sites for each entry. EvoDB contains 98.83% of the gapped nucleotide sequence alignments and 97.1% of the evolutionary rate profiles for the corresponding information in PFAM-A. As the identification of codon sites under positive selection and their position in a sequence profile is usually the most sought-after information for molecular evolutionary biologists, evolutionary rate profiles were determined under the M2a model using the CODEML algorithm in the PAML (Phylogenetic Analysis by Maximum Likelihood) suite of software. Validation of nucleotide sequences against amino acid data was implemented to ensure high data quality. EvoDB is a catalogue of evolutionary rate profiles and provides the corresponding phylogenetic trees, PFAM-A alignments and annotated accession identifier data. In addition, the database can be explored and queried using known evolutionary rate profiles to identify domains under similar evolutionary constraints and pressures. EvoDB is a resource for evolutionary and phylogenetic studies and presents a tier of information untapped by current databases. Database URL: http://www.bioinf.wits.ac.za/software/fire/evodb PMID:26140928
The Kepler DB: a database management system for arrays, sparse arrays, and binary data
NASA Astrophysics Data System (ADS)
McCauliff, Sean; Cote, Miles T.; Girouard, Forrest R.; Middour, Christopher; Klaus, Todd C.; Wohler, Bill
2010-07-01
The Kepler Science Operations Center stores pixel values for approximately six million pixels collected every 30 minutes, as well as data products that are generated as a result of running the Kepler science processing pipeline. The Kepler Database management system (Kepler DB) was created to act as the repository of this information. After one year of flight usage, Kepler DB is managing 3 TiB of data and is expected to grow to over 10 TiB over the course of the mission. Kepler DB is a non-relational, transactional database where data are represented as one-dimensional arrays, sparse arrays or binary large objects. We will discuss Kepler DB's APIs, implementation, usage and deployment at the Kepler Science Operations Center.
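The three value types the abstract names — dense one-dimensional arrays, sparse arrays and binary large objects — can be sketched as differently encoded payloads behind one flat key-value store. The store layout and key scheme below are invented for illustration; only the type taxonomy comes from the abstract:

```python
import struct

# Toy of Kepler DB's value types, all stored against a key in a flat store.
store = {}  # key -> (kind, payload); stand-in for the transactional store

def put_array(key, values):
    """Dense 1-D float64 array packed into a compact binary payload."""
    store[key] = ("array", struct.pack(f"<{len(values)}d", *values))

def put_sparse(key, pairs):
    """Sparse array kept as (index, value) pairs, omitting the zeros."""
    store[key] = ("sparse", list(pairs))

def get_array(key):
    kind, payload = store[key]
    assert kind == "array"
    return list(struct.unpack(f"<{len(payload) // 8}d", payload))

put_array("pixel:42", [101.5, 99.0, 100.25])  # one pixel's cadence series
put_sparse("flags:42", [(1, 1)])              # e.g. a single data-gap flag
print(get_array("pixel:42"))  # → [101.5, 99.0, 100.25]
```

Packing dense time series as raw binary keeps per-value overhead near zero, which matters when six million pixels are sampled every half hour.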
The Kepler DB, a Database Management System for Arrays, Sparse Arrays and Binary Data
NASA Technical Reports Server (NTRS)
McCauliff, Sean; Cote, Miles T.; Girouard, Forrest R.; Middour, Christopher; Klaus, Todd C.; Wohler, Bill
2010-01-01
The Kepler Science Operations Center stores pixel values for approximately six million pixels collected every 30 minutes, as well as data products that are generated as a result of running the Kepler science processing pipeline. The Kepler Database (Kepler DB) management system was created to act as the repository of this information. After one year of flight usage, Kepler DB is managing 3 TiB of data and is expected to grow to over 10 TiB over the course of the mission. Kepler DB is a non-relational, transactional database where data are represented as one-dimensional arrays, sparse arrays or binary large objects. We will discuss Kepler DB's APIs, implementation, usage and deployment at the Kepler Science Operations Center.
WholeCellSimDB: a hybrid relational/HDF database for whole-cell model predictions
Karr, Jonathan R.; Phillips, Nolan C.; Covert, Markus W.
2014-01-01
Mechanistic ‘whole-cell’ models are needed to develop a complete understanding of cell physiology. However, extracting biological insights from whole-cell models requires running and analyzing large numbers of simulations. We developed WholeCellSimDB, a database for organizing whole-cell simulations. WholeCellSimDB was designed to enable researchers to search simulation metadata to identify simulations for further analysis, and quickly slice and aggregate simulation results data. In addition, WholeCellSimDB enables users to share simulations with the broader research community. The database uses a hybrid relational/hierarchical data format architecture to efficiently store and retrieve both simulation setup metadata and results data. WholeCellSimDB provides a graphical Web-based interface to search, browse, plot and export simulations; a JavaScript Object Notation (JSON) Web service to retrieve data for Web-based visualizations; a command-line interface to deposit simulations; and a Python API to retrieve data for advanced analysis. Overall, we believe WholeCellSimDB will help researchers use whole-cell models to advance basic biological science and bioengineering. Database URL: http://www.wholecellsimdb.org Source code repository URL: http://github.com/CovertLab/WholeCellSimDB PMID:25231498
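WholeCellSimDB's hybrid layout — searchable setup metadata in a relational store, bulk results in a hierarchical binary format — can be sketched as follows. sqlite3 stands in for the relational side and a packed binary file stands in for HDF5 (h5py is not assumed to be installed); the schema and helper names are illustrative, not the project's actual design:

```python
import os
import sqlite3
import struct
import tempfile

# Relational side: small, indexable metadata that queries can search.
meta = sqlite3.connect(":memory:")
meta.execute("CREATE TABLE sim (id INTEGER PRIMARY KEY, organism TEXT, path TEXT)")

def deposit(sim_id, organism, results):
    """Write bulk results to a binary file; record its path in the metadata."""
    path = os.path.join(tempfile.gettempdir(), f"whole_cell_sim_{sim_id}.bin")
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{len(results)}d", *results))
    meta.execute("INSERT INTO sim VALUES (?, ?, ?)", (sim_id, organism, path))

def fetch(sim_id):
    """Look up the results file via the metadata table, then decode it."""
    path, = meta.execute("SELECT path FROM sim WHERE id = ?", (sim_id,)).fetchone()
    data = open(path, "rb").read()
    return list(struct.unpack(f"<{len(data) // 8}d", data))

deposit(1, "M. genitalium", [0.1, 0.2, 0.3])
print(fetch(1))  # → [0.1, 0.2, 0.3]
```

The split lets metadata queries ("find all simulations of organism X") stay fast while gigabyte-scale results never pass through the relational engine.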
Kales, S N; Freyman, R L; Hill, J M; Polyhronopoulos, G N; Aldrich, J M; Christiani, D C
2001-07-01
We investigated firefighters' hearing relative to general population data to adjust for age-expected hearing loss. For five groups of male firefighters with increasing mean ages, we compared their hearing thresholds at the 50th and 90th percentiles with normative and age- and sex-matched hearing data from the International Standards Organization (databases A and B). At the 50th percentile, from a mean age of 28 to a mean age of 53 years, relative to databases A and B, the firefighters lost an excess of 19 to 23 dB, 20 to 23 dB, and 16 to 19 dB at 3000, 4000, and 6000 Hz, respectively. At the 90th percentile, from a mean age of 28 to a mean age of 53 years, relative to databases A and B, the firefighters lost an excess of 12 to 20 dB, 38 to 44 dB, 41 to 45 dB, and 22 to 28 dB at 2000, 3000, 4000, and 6000 Hz, respectively. The results are consistent with accelerated hearing loss in excess of age-expected loss among the firefighters, especially at or above the 90th percentile.
HBVPathDB: a database of HBV infection-related molecular interaction network.
Zhang, Yi; Bo, Xiao-Chen; Yang, Jing; Wang, Sheng-Qi
2005-03-21
HBVPathDB was built to describe the interactions of molecules and genes between hepatitis B virus (HBV) and its host, in order to understand how viral and host genes and molecules are networked to form a biological system and to elucidate the mechanism of HBV infection. The knowledge of HBV infection-related reactions is organized into various kinds of pathways with carefully drawn graphs in HBVPathDB. Pathway information is stored in a relational database management system (DBMS), currently the most efficient way to manage large amounts of data, and querying is implemented with the powerful Structured Query Language (SQL). The search engine is written in PHP with embedded SQL, and a web retrieval interface built with HTML is provided for searching. We present the first version of HBVPathDB, an HBV infection-related molecular interaction network database composed of 306 pathways involving 1,050 molecules. With carefully drawn graphs, the pathway information stored in HBVPathDB can be browsed in an intuitive way. We have developed an easy-to-use interface for flexible access to the details of the database, and convenient software to query and browse its pathway information. Four search options (category search, gene search, description search and unitized search) are supported by the search engine of the database. The database is freely available at http://www.bio-inf.net/HBVPathDB/HBV/. HBVPathDB already contains a considerable amount of HBV infection-related pathway information, which is suitable for in-depth analysis of the molecular interaction network of virus and host. It integrates pathway datasets with convenient software for querying, browsing and visualization, giving users more opportunity to identify regulatory key molecules as potential drug targets and to explore possible mechanisms of HBV infection based on gene expression datasets.
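The SQL-backed search the abstract describes — pathways in relational tables, with a description search layered on top — amounts to a parameterized `LIKE` query. A minimal sketch, using sqlite3 in place of the production DBMS and an invented two-column schema:

```python
import sqlite3

# Pathways stored relationally; a "description search" is a LIKE query.
# Schema and sample rows are invented for illustration.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE pathway (id INTEGER PRIMARY KEY, category TEXT, descr TEXT)")
db.executemany("INSERT INTO pathway VALUES (?, ?, ?)", [
    (1, "signaling", "NF-kB activation by HBx"),
    (2, "immune", "Interferon response to HBV infection"),
])

def description_search(term):
    """Return descriptions of pathways whose description contains `term`.
    Parameter substitution keeps the user's term out of the SQL text."""
    rows = db.execute(
        "SELECT id, descr FROM pathway WHERE descr LIKE ?", (f"%{term}%",))
    return [descr for _, descr in rows]

print(description_search("HBV"))  # → ['Interferon response to HBV infection']
```

The category and gene searches would be analogous queries against other columns or joined tables.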
Baldwin, Thomas T; Basenko, Evelina; Harb, Omar; Brown, Neil A; Urban, Martin; Hammond-Kosack, Kim E; Bregitzer, Phil P
2018-06-01
There is no comprehensive repository for generated mutants of Fusarium graminearum or for data associated with these mutants; instead, researchers have relied on several independent and non-integrated databases. FgMutantDb was designed as a simple spreadsheet, accessible globally on the web, that functions as a centralized source of information on F. graminearum mutants. FgMutantDb aids in the maintenance and sharing of mutants within a research community. It will also serve as a platform for disseminating prepublication results, as well as negative results that often go unreported. Additionally, the highly curated information on mutants in FgMutantDb will be shared with other databases (FungiDB, Ensembl, PhytoPath and PHI-base) through updating reports. Here we describe the creation and potential usefulness of FgMutantDb to the F. graminearum research community, and provide a tutorial on its use. This type of database could easily be emulated for other fungal species.
PharmDB-K: Integrated Bio-Pharmacological Network Database for Traditional Korean Medicine
Lee, Ji-Hyun; Park, Kyoung Mii; Han, Dong-Jin; Bang, Nam Young; Kim, Do-Hee; Na, Hyeongjin; Lim, Semi; Kim, Tae Bum; Kim, Dae Gyu; Kim, Hyun-Jung; Chung, Yeonseok; Sung, Sang Hyun; Surh, Young-Joon; Kim, Sunghoon; Han, Byung Woo
2015-01-01
Despite the growing attention given to Traditional Medicine (TM) worldwide, there is no well-known, publicly available, integrated bio-pharmacological Traditional Korean Medicine (TKM) database for researchers in drug discovery. In this study, we have constructed PharmDB-K, which offers comprehensive information relating to TKM-associated drugs (compound), disease indication, and protein relationships. To explore the underlying molecular interaction of TKM, we integrated fourteen different databases, six Pharmacopoeias, and literature, and established a massive bio-pharmacological network for TKM and experimentally validated some cases predicted from the PharmDB-K analyses. Currently, PharmDB-K contains information about 262 TKMs, 7,815 drugs, 3,721 diseases, 32,373 proteins, and 1,887 side effects. One of the unique sets of information in PharmDB-K includes 400 indicator compounds used for standardization of herbal medicine. Furthermore, we are operating PharmDB-K via phExplorer (a network visualization software) and BioMart (a data federation framework) for convenient search and analysis of the TKM network. Database URL: http://pharmdb-k.org, http://biomart.i-pharm.org. PMID:26555441
New tools and methods for direct programmatic access to the dbSNP relational database
Saccone, Scott F.; Quan, Jiaxi; Mehta, Gaurang; Bolze, Raphael; Thomas, Prasanth; Deelman, Ewa; Tischfield, Jay A.; Rice, John P.
2011-01-01
Genome-wide association studies often incorporate information from public biological databases in order to provide a biological reference for interpreting the results. The dbSNP database is an extensive source of information on single nucleotide polymorphisms (SNPs) for many different organisms, including humans. We have developed free software that will download and install a local MySQL implementation of the dbSNP relational database for a specified organism. We have also designed a system for classifying dbSNP tables in terms of common tasks we wish to accomplish using the database. For each task we have designed a small set of custom tables that facilitate task-related queries and provide entity-relationship diagrams for each task composed from the relevant dbSNP tables. In order to expose these concepts and methods to a wider audience we have developed web tools for querying the database and browsing documentation on the tables and columns to clarify the relevant relational structure. All web tools and software are freely available to the public at http://cgsmd.isi.edu/dbsnpq. Resources such as these for programmatically querying biological databases are essential for viably integrating biological information into genetic association experiments on a genome-wide scale. PMID:21037260
HIVsirDB: a database of HIV inhibiting siRNAs.
Tyagi, Atul; Ahmed, Firoz; Thakur, Nishant; Sharma, Arun; Raghava, Gajendra P S; Kumar, Manoj
2011-01-01
Human immunodeficiency virus (HIV) is responsible for millions of deaths every year. The current treatment involves the use of multiple antiretroviral agents that may harm patients due to their toxic nature. RNA interference (RNAi), a potent candidate for the future treatment of HIV, uses short interfering RNAs (siRNA/shRNA) to silence HIV genes. In this study, we have created HIVsirDB, a database of siRNAs responsible for silencing HIV genes. HIVsirDB is a manually curated database of HIV-inhibiting siRNAs that provides comprehensive information about each siRNA or shRNA. Information was collected and compiled from the literature and public resources. The database contains around 750 siRNAs, including 75 partially complementary siRNAs differing by one or more bases from their target sites, and over 100 escape mutant sequences. Each HIVsirDB record contains sixteen fields, including siRNA sequence, HIV strain, targeted genome region, efficacy and conservation of target sequences. To assist users, several tools have been integrated into the database: (i) siRNAmap, for mapping siRNAs onto a target sequence; (ii) HIVsirblast, for BLAST searches against the database; and (iii) siRNAalign, for aligning siRNAs. HIVsirDB is a freely accessible database of siRNAs which can silence or degrade HIV genes. It covers 26 HIV strains and 28 cell types. This database will be very useful for developing models for predicting the efficacy of HIV-inhibiting siRNAs. In summary, this is a useful resource for researchers working in the field of siRNA-based HIV therapy. HIVsirDB is accessible at http://crdd.osdd.net/raghava/hivsir/.
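The core of a tool like the abstract's siRNAmap — locating where an siRNA guide strand targets an mRNA — reduces to finding the guide's reverse complement in the target. The sketch below is a toy stand-in written under that assumption, not the actual siRNAmap implementation; real tools also tolerate mismatches:

```python
def revcomp(seq):
    """Reverse complement of an RNA sequence (A<->U, C<->G)."""
    return seq.translate(str.maketrans("ACGU", "UGCA"))[::-1]

def sirna_map(sirna, target):
    """Report every 0-based position in `target` where the antisense
    siRNA guide binds, i.e. where its reverse complement occurs."""
    site = revcomp(sirna)
    hits, i = [], target.find(site)
    while i != -1:
        hits.append(i)
        i = target.find(site, i + 1)
    return hits

target = "AUGGCAUCGAUCGGAUCC"
guide = revcomp("GCAUCGAUC")  # build a guide complementary to positions 3..11
print(sirna_map(guide, target))  # → [3]
```

Exact-match search keeps the example short; extending it to allow one or two mismatches (as escape-mutant analysis requires) would replace `str.find` with a sliding-window Hamming-distance check.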
Respiratory cancer database: An open access database of respiratory cancer gene and miRNA.
Choubey, Jyotsna; Choudhari, Jyoti Kant; Patel, Ashish; Verma, Mukesh Kumar
2017-01-01
The Respiratory cancer database (RespCanDB) is a genomic and proteomic database of cancers of the respiratory organs. It also includes information on medicinal plants used for the treatment of various respiratory cancers, with the structures of their active constituents, as well as pharmacological and chemical information on drugs associated with various respiratory cancers. Data in RespCanDB have been manually collected from published research articles and from other databases, and integrated using MySQL, a relational database management system. MySQL manages all data in the back-end and provides commands to retrieve and store the data in the database. The web interface of the database has been built in ASP. RespCanDB is expected to contribute to the scientific community's understanding of respiratory cancer biology, as well as to the development of new ways of diagnosing and treating respiratory cancer. Currently, the database contains oncogenomic information on lung cancer, laryngeal cancer and nasopharyngeal cancer. Data for other cancers, such as oral and tracheal cancers, will be added in the near future. The URL of RespCanDB is http://ridb.subdic-bioinformatics-nitrr.in/.
Advancements in web-database applications for rabies surveillance.
Rees, Erin E; Gendron, Bruno; Lelièvre, Frédérick; Coté, Nathalie; Bélanger, Denise
2011-08-02
Protection of public health from rabies is informed by the analysis of surveillance data from human and animal populations. In Canada, public health, agricultural and wildlife agencies at the provincial and federal level are responsible for rabies disease control, and this has led to multiple agency-specific data repositories. Aggregation of agency-specific data into one database application would enable more comprehensive data analyses and effective communication among participating agencies. In Québec, RageDB was developed to house surveillance data for the raccoon rabies variant, representing the next generation of web-based database applications that provide a key resource for the protection of public health. RageDB incorporates data from, and grants access to, all agencies responsible for the surveillance of raccoon rabies in Québec. Technological advancements of RageDB over previous rabies surveillance databases include (1) automatic integration of multi-agency data and diagnostic results on a daily basis; (2) a web-based data editing interface that enables authorized users to add, edit and extract data; and (3) an interactive dashboard to help visualize data simply and efficiently in table, chart and cartographic formats. Furthermore, RageDB stores data from citizens who voluntarily report sightings of rabies-suspect animals. We also discuss how sightings data can indicate public perception of the risk of raccoon rabies and thus aid in directing the allocation of disease-control resources for protecting public health. RageDB provides an example of the evolution of spatio-temporal database applications for the storage, analysis and communication of disease surveillance data. The database was fast and inexpensive to develop by using open-source technologies, simple and efficient design strategies, and shared web hosting. The database increases communication among agencies collaborating to protect human health from raccoon rabies.
Furthermore, health agencies have real-time access to a wide assortment of data documenting new developments in the raccoon rabies epidemic and this enables a more timely and appropriate response.
Advancements in web-database applications for rabies surveillance
2011-01-01
Background Protection of public health from rabies is informed by the analysis of surveillance data from human and animal populations. In Canada, public health, agricultural and wildlife agencies at the provincial and federal level are responsible for rabies disease control, and this has led to multiple agency-specific data repositories. Aggregation of agency-specific data into one database application would enable more comprehensive data analyses and effective communication among participating agencies. In Québec, RageDB was developed to house surveillance data for the raccoon rabies variant, representing the next generation of web-based database applications that provide a key resource for the protection of public health. Results RageDB incorporates data from, and grants access to, all agencies responsible for the surveillance of raccoon rabies in Québec. Technological advancements of RageDB over previous rabies surveillance databases include 1) automatic integration of multi-agency data and diagnostic results on a daily basis; 2) a web-based data editing interface that enables authorized users to add, edit and extract data; and 3) an interactive dashboard to help visualize data simply and efficiently in table, chart and cartographic formats. Furthermore, RageDB stores data from citizens who voluntarily report sightings of rabies-suspect animals. We also discuss how sightings data can indicate public perception of the risk of raccoon rabies and thus aid in directing the allocation of disease-control resources for protecting public health. Conclusions RageDB provides an example of the evolution of spatio-temporal database applications for the storage, analysis and communication of disease surveillance data. The database was fast and inexpensive to develop by using open-source technologies, simple and efficient design strategies, and shared web hosting. The database increases communication among agencies collaborating to protect human health from raccoon rabies.
Furthermore, health agencies have real-time access to a wide assortment of data documenting new developments in the raccoon rabies epidemic and this enables a more timely and appropriate response. PMID:21810215
Ko, Seung Hyun; Han, Kyungdo; Lee, Yong Ho; Noh, Junghyun; Park, Cheol Young; Kim, Dae Jung; Jung, Chang Hee; Lee, Ki Up; Ko, Kyung Soo
2018-04-01
Korea's National Healthcare Program, the National Health Insurance Service (NHIS), a government-affiliated agency under the Korean Ministry of Health and Welfare, covers the entire Korean population. The NHIS supervises all medical services in Korea and maintains a systematic National Health Information database (DB). A health information DB system including all of the claims, medications, death information, and health check-ups, both in the general population and in patients with various diseases, is not common worldwide. On June 9, 2014, the NHIS signed a memorandum of understanding with the Korean Diabetes Association (KDA) to provide limited open access to its DB. By October 31, 2017, seven papers had been published through this collaborative research project. These studies were conducted to investigate the past and current status of type 2 diabetes mellitus and its complications and management in Korea. This review is a brief summary of the collaborative projects between the KDA and the NHIS over the last 3 years. Depending on the analysis, the national health check-up DB or the claims DB was used, and the age categories and study periods varied across studies. Copyright © 2018 Korean Diabetes Association.
Database resources of the National Center for Biotechnology Information
2016-01-01
The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (PubMed Central (PMC), Bookshelf and PubReader), health (ClinVar, dbGaP, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen), genomes (BioProject, Assembly, Genome, BioSample, dbSNP, dbVar, Epigenomics, the Map Viewer, Nucleotide, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser and the Trace Archive), genes (Gene, Gene Expression Omnibus (GEO), HomoloGene, PopSet and UniGene), proteins (Protein, the Conserved Domain Database (CDD), COBALT, Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB) and Protein Clusters) and chemicals (Biosystems and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for most of these databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. PMID:26615191
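The Entrez system mentioned above is also scriptable through NCBI's public E-utilities endpoints. As a minimal sketch, the snippet below only builds an ESearch URL for a PubMed query; the query term is an arbitrary example, and a full client would also add an `email`/`api_key` parameter and follow up with EFetch.

```python
from urllib.parse import urlencode

# Base endpoint for the NCBI E-utilities.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def esearch_url(db, term, retmax=20):
    """Build an ESearch query URL for an Entrez database."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"

# Example query: search PubMed titles for "GlycomeDB".
url = esearch_url("pubmed", "GlycomeDB[Title]")
```

Fetching `url` with any HTTP client returns a JSON list of matching PubMed IDs, which can then be passed to the EFetch endpoint.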
NeuroTransDB: highly curated and structured transcriptomic metadata for neurodegenerative diseases.
Bagewadi, Shweta; Adhikari, Subash; Dhrangadhariya, Anjani; Irin, Afroza Khanam; Ebeling, Christian; Namasivayam, Aishwarya Alex; Page, Matthew; Hofmann-Apitius, Martin; Senger, Philipp
2015-01-01
Neurodegenerative diseases are chronic debilitating conditions, characterized by progressive loss of neurons, that represent a significant health care burden as the global elderly population continues to grow. Over the past decade, high-throughput technologies such as the Affymetrix GeneChip microarrays have provided new perspectives into the pathomechanisms underlying neurodegeneration. Public transcriptomic data repositories, namely Gene Expression Omnibus and curated ArrayExpress, enable researchers to conduct integrative meta-analyses, increasing the power to detect differentially regulated genes in disease and to explore patterns of gene dysregulation across biologically related studies. The reliability of retrospective, large-scale integrative analyses depends on an appropriate combination of related datasets, in turn requiring detailed meta-annotations capturing the experimental setup. In most cases, we observe huge variation in compliance with defined standards for submitted metadata in public databases. Much of the information needed to complete or refine meta-annotations is distributed across the associated publications. For example, tissue preparation or comorbidity information is frequently described in an article's supplementary tables. Several value-added databases have employed additional manual efforts to overcome this limitation. However, none of these databases provides annotations that distinguish human and animal models in the context of neurodegeneration. Therefore, adopting a more specific disease focus, in combination with dedicated disease ontologies, will better empower the selection of comparable studies with refined annotations to address the research question at hand. In this article, we describe the detailed development of NeuroTransDB, a manually curated database containing metadata annotations for neurodegenerative disease studies.
The database contains more than 20 dimensions of metadata annotations within 31 mouse, 5 rat and 45 human studies, defined in collaboration with domain disease experts. We describe the step-by-step guidelines used to prioritize studies from public archives and to curate their metadata, and we discuss the key challenges encountered. Curated metadata for Alzheimer's disease gene expression studies are available for download. Database URL: www.scai.fraunhofer.de/NeuroTransDB.html. © The Author(s) 2015. Published by Oxford University Press.
dbWFA: a web-based database for functional annotation of Triticum aestivum transcripts
Vincent, Jonathan; Dai, Zhanwu; Ravel, Catherine; Choulet, Frédéric; Mouzeyar, Said; Bouzidi, M. Fouad; Agier, Marie; Martre, Pierre
2013-01-01
The functional annotation of genes based on sequence homology with genes from model species genomes is time-consuming because it is necessary to mine several unrelated databases. The aim of the present work was to develop a functional annotation database for common wheat Triticum aestivum (L.). The database, named dbWFA, is based on the reference NCBI UniGene set, an expressed gene catalogue built by expressed sequence tag clustering, and on full-length coding sequences retrieved from the TriFLDB database. Information from good-quality heterogeneous sources, including annotations for model plant species Arabidopsis thaliana (L.) Heynh. and Oryza sativa L., was gathered and linked to T. aestivum sequences through BLAST-based homology searches. Even though the complexity of the transcriptome cannot yet be fully appreciated, we developed a tool to easily and promptly obtain information from multiple functional annotation systems (Gene Ontology, MapMan bin codes, MIPS Functional Categories, PlantCyc pathway reactions and TAIR gene families). The use of dbWFA is illustrated here with several query examples. We were able to assign a putative function to 45% of the UniGenes and 81% of the full-length coding sequences from TriFLDB. Moreover, comparison of the annotation of the whole T. aestivum UniGene set along with curated annotations of the two model species assessed the accuracy of the annotation provided by dbWFA. To further illustrate the use of dbWFA, genes specifically expressed during the early cell division or late storage polymer accumulation phases of T. aestivum grain development were identified using a clustering analysis and then annotated using dbWFA. The annotation of these two sets of genes was consistent with previous analyses of T. aestivum grain transcriptomes and proteomes. Database URL: urgi.versailles.inra.fr/dbWFA/ PMID:23660284
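Once the BLAST-based best hits are precomputed, the homology transfer described above reduces to a lookup from wheat transcript to model-species annotation. This is a hypothetical sketch, not dbWFA's actual schema: the transcript IDs, model-gene IDs and GO terms below are invented for illustration.

```python
# Best BLAST hit of each wheat transcript in a model species (illustrative).
BEST_HIT = {"Ta.1234": "AT1G01010", "Ta.5678": "Os01g0100100"}

# Functional annotations of the model-species genes (illustrative GO terms).
MODEL_GO = {"AT1G01010": ["GO:0003700"], "Os01g0100100": ["GO:0016301"]}

def annotate(transcript):
    """Transfer GO annotations to a wheat transcript via its best hit."""
    hit = BEST_HIT.get(transcript)
    return MODEL_GO.get(hit, []) if hit else []
```

Transcripts without a hit, like the 55% of UniGenes left unannotated, simply receive an empty list.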
NASA Technical Reports Server (NTRS)
Rilee, Michael Lee; Kuo, Kwo-Sen
2017-01-01
The SpatioTemporal Adaptive Resolution Encoding (STARE) is a unifying scheme encoding geospatial and temporal information for organizing data on scalable computing/storage resources, minimizing expensive data transfers. STARE provides a compact representation that turns set-logic functions into integer operations, e.g. conditional sub-setting, taking into account representative spatiotemporal resolutions of the data in the datasets. STARE geo-spatiotemporally aligns data placements of diverse data on massive parallel resources to maximize performance. Automating important scientific functions (e.g. regridding) and computational functions (e.g. data placement) allows scientists to focus on domain-specific questions instead of expending their efforts and expertise on data processing. With STARE-enabled automation, SciDB (Scientific Database) plus STARE provides a database interface, reducing costly data preparation, increasing the volume and variety of interoperable data, and easing result sharing. Using SciDB plus STARE as part of an integrated analysis infrastructure dramatically eases combining diametrically different datasets.
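The actual STARE encoding is a hierarchical geospatial index; the toy sketch below only illustrates the general idea of turning locations into integer keys whose set intersections implement conditional subsetting. It is a quadrant-subdivision stand-in, not the real STARE algorithm.

```python
def quad_key(lat, lon, level):
    """Encode a lat/lon into an integer by recursive quadrant subdivision.
    A toy stand-in for hierarchical indexes like STARE (not the real scheme)."""
    lat_lo, lat_hi, lon_lo, lon_hi = -90.0, 90.0, -180.0, 180.0
    key = 0
    for _ in range(level):
        lat_mid = (lat_lo + lat_hi) / 2
        lon_mid = (lon_lo + lon_hi) / 2
        quadrant = 0
        if lat >= lat_mid:
            quadrant |= 2
            lat_lo = lat_mid
        else:
            lat_hi = lat_mid
        if lon >= lon_mid:
            quadrant |= 1
            lon_lo = lon_mid
        else:
            lon_hi = lon_mid
        key = (key << 2) | quadrant  # append 2 bits per level
    return key

# Conditional subsetting becomes integer set logic:
a = {quad_key(45.0, -93.0, 8), quad_key(44.9, -93.1, 8)}
b = {quad_key(45.0, -93.0, 8), quad_key(10.0, 10.0, 8)}
common = a & b  # cells present in both datasets
```

The level parameter plays the role of the representative spatial resolution: coarser levels produce shorter keys that match wider regions.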
Silva, Cristina; Fresco, Paula; Monteiro, Joaquim; Rama, Ana Cristina Ribeiro
2013-08-01
Evidence-Based Practice requires health care decisions to be based on the best available evidence. The "Information Mastery" model proposes that clinicians should use sources of information whose relevance and validity have already been evaluated, provided at the point of care. Drug databases (DB) allow easy and fast access to information and have the benefit of more frequent content updates. Relevant information, in the context of drug therapy, is that which supports safe and effective use of medicines. Accordingly, the European Guideline on the Summary of Product Characteristics (EG-SmPC) was used as a standard to evaluate the inclusion of relevant information contents in DB. The aim was to develop and test a method to evaluate the relevancy of DB contents by assessing the inclusion of information items deemed relevant for effective and safe drug use. The method comprised: hierarchical organisation and selection of the principles defined in the EG-SmPC; definition of criteria to assess inclusion of selected information items; creation of a categorisation and quantification system that allows score calculation; calculation of relative differences (RD) of scores for comparison with an "ideal" database, defined as the one that achieves the best quantification possible for each of the information items; and a pilot test on a sample of 9 drug databases, using 10 drugs frequently associated in the literature with morbidity and mortality and also widely consumed in Portugal. The main outcome measure was the calculation of individual and global scores for clinically relevant information items of drug monographs in databases, using the categorisation and quantification system created. A--Method development: selection of sections, subsections, relevant information items and corresponding requisites; system to categorise and quantify their inclusion; score and RD calculation procedure.
B--Pilot test: calculated scores for the 9 databases; globally, all databases evaluated differed significantly from the "ideal" database; some DB performed better, but performance was inconsistent at the subsection level within the same DB. The method developed allows quantification of the inclusion of relevant information items in DB and comparison with an "ideal" database. It is necessary to consult diverse DB in order to find all the relevant information needed to support clinical drug use.
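The abstract does not reproduce the categorisation system itself, so the sketch below assumes a simple 0-2 inclusion score per information item and computes a relative difference (RD) against the "ideal" database, which by construction scores the maximum on every item. Item names and scores are illustrative.

```python
# Toy scoring: each information item gets a quantified inclusion level
# (0 = absent, 1 = partial, 2 = complete); item names are illustrative.
IDEAL = {"indications": 2, "dosage": 2, "interactions": 2, "contraindications": 2}

def relative_difference(db_scores, ideal=IDEAL):
    """Relative difference of a database's total score vs the ideal database."""
    total = sum(db_scores.get(item, 0) for item in ideal)
    best = sum(ideal.values())
    return (best - total) / best  # 0.0 means identical to the ideal

# A database with partial dosage info and no interactions section:
rd = relative_difference(
    {"indications": 2, "dosage": 1, "interactions": 0, "contraindications": 2}
)
```

Here the example database falls 3 of 8 possible points short of the ideal, giving an RD of 0.375.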
LHCb Conditions database operation assistance systems
NASA Astrophysics Data System (ADS)
Clemencic, M.; Shapoval, I.; Cattaneo, M.; Degaudenzi, H.; Santinelli, R.
2012-12-01
The Conditions Database (CondDB) of the LHCb experiment provides versioned, time dependent geometry and conditions data for all LHCb data processing applications (simulation, high level trigger (HLT), reconstruction, analysis) in a heterogeneous computing environment ranging from user laptops to the HLT farm and the Grid. These different use cases impose front-end support for multiple database technologies (Oracle and SQLite are used). Sophisticated distribution tools are required to ensure timely and robust delivery of updates to all environments. The content of the database has to be managed to ensure that updates are internally consistent and externally compatible with multiple versions of the physics application software. In this paper we describe three systems that we have developed to address these issues. The first system is a CondDB state tracking extension to the Oracle 3D Streams replication technology, to trap cases when the CondDB replication was corrupted. The second is an automated distribution system for the SQLite-based CondDB, which also provides smart backup and checkout mechanisms for the CondDB managers and LHCb users, respectively. The third is a system to verify and monitor the internal (CondDB self-consistency) and external (LHCb physics software vs. CondDB) compatibility. The first two systems are used in production in the LHCb experiment and have achieved the desired goal of higher flexibility and robustness for the management and operation of the CondDB. The third has been fully designed and is currently being implemented.
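The core operation of any conditions database, looking up the value valid at a given time, can be modelled as a search over sorted validity intervals. The sketch below is an illustrative minimal model, not the LHCb CondDB API; tag names and times are invented.

```python
import bisect

class ConditionsDB:
    """Minimal sketch of time-dependent condition lookup: each inserted
    value is valid from its start time until the next start time."""

    def __init__(self):
        self._starts = []   # sorted validity start times
        self._values = []

    def insert(self, since, value):
        i = bisect.bisect_left(self._starts, since)
        self._starts.insert(i, since)
        self._values.insert(i, value)

    def lookup(self, time):
        # Find the last interval whose start time is <= the query time.
        i = bisect.bisect_right(self._starts, time) - 1
        if i < 0:
            raise KeyError("no condition valid at %r" % time)
        return self._values[i]

db = ConditionsDB()
db.insert(0, "alignment-v1")    # valid from t=0
db.insert(100, "alignment-v2")  # supersedes v1 from t=100
```

Versioning, as in the real CondDB, would add a second dimension (a tag) on top of this time axis.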
Informatics in radiology: use of CouchDB for document-based storage of DICOM objects.
Rascovsky, Simón J; Delgado, Jorge A; Sanz, Alexander; Calvo, Víctor D; Castrillón, Gabriel
2012-01-01
Picture archiving and communication systems traditionally have depended on schema-based Structured Query Language (SQL) databases for imaging data management. To optimize database size and performance, many such systems store a reduced set of Digital Imaging and Communications in Medicine (DICOM) metadata, discarding informational content that might be needed in the future. As an alternative to traditional database systems, document-based key-value stores recently have gained popularity. These systems store documents containing key-value pairs that facilitate data searches without predefined schemas. Document-based key-value stores are especially suited to archive DICOM objects because DICOM metadata are highly heterogeneous collections of tag-value pairs conveying specific information about imaging modalities, acquisition protocols, and vendor-supported postprocessing options. The authors used an open-source document-based database management system (Apache CouchDB) to create and test two such databases; CouchDB was selected for its overall ease of use, capability for managing attachments, and reliance on HTTP and Representational State Transfer standards for accessing and retrieving data. A large database was created first in which the DICOM metadata from 5880 anonymized magnetic resonance imaging studies (1,949,753 images) were loaded by using a Ruby script. To provide the usual DICOM query functionality, several predefined "views" (standard queries) were created by using JavaScript. For performance comparison, the same queries were executed in both the CouchDB database and a SQL-based DICOM archive. The capabilities of CouchDB for attachment management and database replication were separately assessed in tests of a similar, smaller database. 
Results showed that CouchDB allowed efficient storage and interrogation of all DICOM objects; with the use of information retrieval algorithms such as map-reduce, all the DICOM metadata stored in the large database were searchable with only a minimal increase in retrieval time over that with the traditional database management system. Results also indicated possible uses for document-based databases in data mining applications such as dose monitoring, quality assurance, and protocol optimization. RSNA, 2012
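In CouchDB, the predefined "views" mentioned above are JavaScript map functions. The Python sketch below emulates one over a few mock DICOM documents to show the map step of map-reduce querying; the documents and the tag subset are invented for illustration.

```python
# Mock DICOM metadata documents, as key-value pairs (illustrative tags).
docs = [
    {"_id": "1", "Modality": "MR", "StudyDate": "20120101", "SeriesDescription": "T1 axial"},
    {"_id": "2", "Modality": "MR", "StudyDate": "20120315", "SeriesDescription": "T2 FLAIR"},
    {"_id": "3", "Modality": "CT", "StudyDate": "20120101", "SeriesDescription": "Head"},
]

def map_by_modality(doc):
    # CouchDB equivalent: function(doc) { emit(doc.Modality, doc._id); }
    if "Modality" in doc:
        yield doc["Modality"], doc["_id"]

# A view is the sorted collection of all emitted key/value pairs.
view = sorted(kv for doc in docs for kv in map_by_modality(doc))
mr_ids = [v for k, v in view if k == "MR"]  # "query": all MR studies
```

Because documents need no predefined schema, heterogeneous vendor-specific tags are simply ignored by views that do not emit them.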
BioPepDB: an integrated data platform for food-derived bioactive peptides.
Li, Qilin; Zhang, Chao; Chen, Hongjun; Xue, Jitong; Guo, Xiaolei; Liang, Ming; Chen, Ming
2018-03-12
Food-derived bioactive peptides play critical roles in regulating most biological processes and have considerable biological, medical and industrial importance. However, a large amount of data on active peptides, including sequence, function, source, commercial product information and references, remains poorly integrated. BioPepDB is a searchable database of food-derived bioactive peptides and their related articles, comprising more than four thousand bioactive peptide entries. Moreover, BioPepDB provides modules for prediction and hydrolysis simulation to aid the discovery of novel peptides. It can serve as a reference database for investigating the function of different bioactive peptides. BioPepDB is available at http://bis.zju.edu.cn/biopepdbr/ . The web page utilises Apache, PHP5 and MySQL to provide the user interface for accessing the database and predicting novel peptides. The database itself is operated on a specialised server.
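A hydrolysis-simulation module of the kind BioPepDB offers can be approximated by enzyme cleavage rules. The sketch below implements the common simplified trypsin rule (cleave after K or R, except before P); real simulators support many enzymes and missed cleavages.

```python
def tryptic_fragments(peptide):
    """Simulate tryptic hydrolysis: cleave after K or R unless the next
    residue is P. A simplified rule, for illustration only."""
    fragments, start = [], 0
    for i, aa in enumerate(peptide):
        if aa in "KR" and (i + 1 == len(peptide) or peptide[i + 1] != "P"):
            fragments.append(peptide[start:i + 1])
            start = i + 1
    if start < len(peptide):
        fragments.append(peptide[start:])  # C-terminal remainder
    return fragments
```

For example, in the sequence AKGRPLYK the R is not cleaved because it is followed by P, yielding the fragments AK and GRPLYK.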
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fisher, D
Concerns about the long-term viability of SFS as the metadata store for HPSS have been increasing. A concern that Transarc may discontinue support for SFS motivates us to consider alternative means to store HPSS metadata. The obvious alternative is a commercial database. Commercial databases have the necessary characteristics for storage of HPSS metadata records. They are robust and scalable and can easily accommodate the volume of data that must be stored. They provide programming interfaces, transactional semantics and a full set of maintenance and performance enhancement tools. A team was organized within the HPSS project to study and recommend an approach for the replacement of SFS. Members of the team are David Fisher, Jim Minton, Donna Mecozzi, Danny Cook, Bart Parliman and Lynn Jones. We examined several possible solutions to the problem of replacing SFS, and recommended on May 22, 2000, in a report to the HPSS Technical and Executive Committees, to change HPSS into a database application over either Oracle or DB2. We recommended either Oracle or DB2 on the basis of market share and technical suitability. Oracle and DB2 are dominant offerings in the market, and it is in the best interest of HPSS to use a major player's product. Both databases provide a suitable programming interface. Transaction management functions, support for multi-threaded clients and data manipulation languages (DML) are available. These findings were supported in meetings held with technical experts from both companies. In both cases, the evidence indicated that either database would provide the features needed to host HPSS.
Baron, Robert V; Conley, Yvette P; Gorin, Michael B; Weeks, Daniel E
2015-03-18
When studying the genetics of a human trait, we typically have to manage both genome-wide and targeted genotype data. There can be overlap of both people and markers from different genotyping experiments; the overlap can introduce several kinds of problems. Most times the overlapping genotypes are the same, but sometimes they are different. Occasionally, the lab will return genotypes using a different allele labeling scheme (for example 1/2 vs A/C). Sometimes, the genotype for a person/marker index is unreliable or missing. Further, over time some markers are merged and bad samples are re-run under a different sample name. We need a consistent picture of the subset of data we have chosen to work with even though there might possibly be conflicting measurements from multiple data sources. We have developed the dbVOR database, which is designed to hold data efficiently for both genome-wide and targeted experiments. The data are indexed for fast retrieval by person and marker. In addition, we store pedigree and phenotype data for our subjects. The dbVOR database allows us to select subsets of the data by several different criteria and to merge their results into a coherent and consistent whole. Data may be filtered by: family, person, trait value, markers, chromosomes, and chromosome ranges. The results can be presented in columnar, Mega2, or PLINK format. dbVOR serves our needs well. It is freely available from https://watson.hgen.pitt.edu/register . Documentation for dbVOR can be found at https://watson.hgen.pitt.edu/register/docs/dbvor.html .
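One of the reconciliation problems dbVOR addresses, the same genotype reported under different allele labeling schemes (1/2 vs A/C), can be sketched as a normalization step followed by a consensus check. The label mapping below is illustrative; in practice it would come from each marker's annotation.

```python
# Illustrative mapping from numeric allele labels to base labels.
RELABEL = {"1": "A", "2": "C"}

def normalize(genotype):
    """Map allele labels to a common scheme and sort to ignore allele order."""
    a, b = genotype.split("/")
    a, b = (RELABEL.get(x, x) for x in (a, b))
    return "/".join(sorted((a, b)))

def merge_call(call_a, call_b):
    """Return the consensus call from two sources, or None on a true conflict."""
    na, nb = normalize(call_a), normalize(call_b)
    return na if na == nb else None
```

After normalization, "1/2" and "A/C" merge cleanly, while "1/1" vs "C/C" is flagged as a genuine conflict to be resolved by the curator.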
Krassowski, Michal; Paczkowska, Marta; Cullion, Kim; Huang, Tina; Dzneladze, Irakli; Ouellette, B F Francis; Yamada, Joseph T; Fradet-Turcotte, Amelie
2018-01-01
Interpretation of genetic variation is needed for deciphering genotype-phenotype associations, mechanisms of inherited disease, and cancer driver mutations. Millions of single nucleotide variants (SNVs) in human genomes are known and thousands are associated with disease. An estimated 21% of disease-associated amino acid substitutions corresponding to missense SNVs are located in protein sites of post-translational modifications (PTMs), chemical modifications of amino acids that extend protein function. ActiveDriverDB is a comprehensive human proteo-genomics database that annotates disease mutations and population variants through the lens of PTMs. We integrated >385,000 published PTM sites with ∼3.6 million substitutions from The Cancer Genome Atlas (TCGA), the ClinVar database of disease genes, and human genome sequencing projects. The database includes site-specific interaction networks of proteins, upstream enzymes such as kinases, and drugs targeting these enzymes. We also predicted network-rewiring impact of mutations by analyzing gains and losses of kinase-bound sequence motifs. ActiveDriverDB provides detailed visualization, filtering, browsing and searching options for studying PTM-associated mutations. Users can upload mutation datasets interactively and use our application programming interface in pipelines. Integrative analysis of mutations and PTMs may help decipher molecular mechanisms of phenotypes and disease, as exemplified by case studies of TP53, BRCA2 and VHL. The open-source database is available at https://www.ActiveDriverDB.org. PMID:29126202
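The central annotation question here, whether a missense substitution lands in or near a PTM site, reduces to a window test around known site positions. The gene names, site positions and the ±7-residue window in this sketch are assumptions for illustration, not data from the database.

```python
# Known PTM site positions per protein (illustrative, not real data).
PTM_SITES = {"TP53": [15, 20, 392], "BRCA2": [70]}

def hits_ptm_region(gene, position, window=7):
    """True if a substitution at `position` lies within `window` residues
    of any known PTM site of the protein (an assumed flanking-region rule)."""
    return any(abs(position - site) <= window for site in PTM_SITES.get(gene, []))
```

A substitution at residue 18 of TP53 would be flagged as PTM-associated under this rule, while one at residue 200 would not.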
GigaDB: announcing the GigaScience database.
Sneddon, Tam P; Li, Peter; Edmunds, Scott C
2012-07-12
With the launch of GigaScience journal, here we provide insight into the accompanying database GigaDB, which allows the integration of manuscript publication with supporting data and tools. Reinforcing and upholding GigaScience's goals to promote open-data and reproducibility of research, GigaDB also aims to provide a home, when a suitable public repository does not exist, for the supporting data or tools featured in the journal and beyond. PMID:23587345
Version VI of the ESTree db: an improved tool for peach transcriptome analysis
Lazzari, Barbara; Caprera, Andrea; Vecchietti, Alberto; Merelli, Ivan; Barale, Francesca; Milanesi, Luciano; Stella, Alessandra; Pozzi, Carlo
2008-01-01
Background The ESTree database (db) is a collection of Prunus persica and Prunus dulcis EST sequences that in its current version encompasses 75,404 sequences from 3 almond and 19 peach libraries. Nine peach genotypes and four peach tissues are represented, from four fruit developmental stages. The aim of this work was to extend the existing ESTree db by adding new sequences and analysis programs. Particular care was given to the web interface, which allows querying each of the database features. Results A modular Perl pipeline is the backbone of sequence analysis in the ESTree db project. Outputs obtained during the pipeline steps are automatically arrayed into the fields of a MySQL database. Apart from standard clustering and annotation analyses, version VI of the ESTree db encompasses new tools for tandem repeat identification, annotation against genomic Rosaceae sequences, and positioning on the database of oligomer sequences that were used in a peach microarray study. Furthermore, known protein patterns and motifs were identified by comparison to PROSITE. Based on data retrieved from sequence annotation against the UniProtKB database, a script was prepared to track positions of homologous hits on the GO tree and build statistics on the distribution of ontologies across GO functional categories. EST mapping data were also integrated in the database. The PHP-based web interface was upgraded and extended. The aim of the authors was to enable querying the database according to all the biological aspects that can be investigated from the analysis of data available in the ESTree db. This is achieved by allowing multiple searches on logical subsets of sequences that represent different biological situations or features. Conclusions Version VI of the ESTree db offers a broad overview of peach gene expression.
Sequence analyses results contained in the database, extensively linked to external related resources, represent a large amount of information that can be queried via the tools offered in the web interface. Flexibility and modularity of the ESTree analysis pipeline and of the web interface allowed the authors to set up similar structures for different datasets, with limited manual intervention. PMID:18387211
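The script described above, which tracks homologous hits on the GO tree and builds statistics over functional categories, boils down to propagating each hit to its ancestor terms and tallying counts. The miniature DAG below is invented for illustration; a real implementation would load the full Gene Ontology.

```python
# Tiny invented GO-like DAG: term -> list of parent terms.
PARENTS = {
    "kinase activity": ["catalytic activity"],
    "catalytic activity": ["molecular_function"],
    "binding": ["molecular_function"],
    "molecular_function": [],
}

def ancestors_and_self(term):
    """All terms reachable from `term` by following parent links."""
    seen, stack = set(), [term]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(PARENTS.get(t, []))
    return seen

def tally(hits):
    """Count, for every term, how many hits fall at or below it."""
    counts = {}
    for term in hits:
        for t in ancestors_and_self(term):
            counts[t] = counts.get(t, 0) + 1
    return counts

counts = tally(["kinase activity", "binding"])
```

Both hits roll up to the root category, so "molecular_function" accumulates a count of 2 while each leaf keeps its own count of 1.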
MetaMetaDB: a database and analytic system for investigating microbial habitability.
Yang, Ching-chia; Iwasaki, Wataru
2014-01-01
MetaMetaDB (http://mmdb.aori.u-tokyo.ac.jp/) is a database and analytic system for investigating microbial habitability, i.e., how a prokaryotic group can inhabit different environments. The interaction between prokaryotes and the environment is a key issue in microbiology because distinct prokaryotic communities maintain distinct ecosystems. Because 16S ribosomal RNA (rRNA) sequences play pivotal roles in identifying prokaryotic species, a system that comprehensively links diverse environments to the 16S rRNA sequences of their inhabitant prokaryotes is necessary for a systematic understanding of microbial habitability. However, existing databases are biased toward culturable prokaryotes and are limited in comprehensiveness because most prokaryotes are unculturable. Recently, metagenomic and 16S rRNA amplicon sequencing approaches have generated abundant 16S rRNA sequence data that encompass unculturable prokaryotes across diverse environments; however, these data are usually buried in large databases and are difficult to access. In this study, we developed MetaMetaDB (Meta-Metagenomic DataBase), which comprehensively and compactly covers 16S rRNA sequences retrieved from public datasets. Using MetaMetaDB, users can quickly generate hypotheses regarding the types of environments a prokaryotic group may be adapted to. We anticipate that MetaMetaDB will improve our understanding of the diversity and evolution of prokaryotes.
GenoMycDB: a database for comparative analysis of mycobacterial genes and genomes.
Catanho, Marcos; Mascarenhas, Daniel; Degrave, Wim; Miranda, Antonio Basílio de
2006-03-31
Several databases and computational tools have been created with the aim of organizing, integrating and analyzing the wealth of information generated by large-scale sequencing projects of mycobacterial genomes and those of other organisms. However, with very few exceptions, these databases and tools do not allow for massive and/or dynamic comparison of these data. GenoMycDB (http://www.dbbm.fiocruz.br/GenoMycDB) is a relational database built for large-scale comparative analyses of completely sequenced mycobacterial genomes, based on their predicted protein content. Its central structure is composed of the results obtained after pair-wise sequence alignments among all the predicted proteins coded by the genomes of six mycobacteria: Mycobacterium tuberculosis (strains H37Rv and CDC1551), M. bovis AF2122/97, M. avium subsp. paratuberculosis K10, M. leprae TN, and M. smegmatis MC2 155. The database stores the computed similarity parameters of every aligned pair, providing for each protein sequence the predicted subcellular localization, the assigned cluster of orthologous groups, the features of the corresponding gene, and links to several important databases. Tables containing pairs or groups of potential homologs between selected species/strains can be produced dynamically by user-defined criteria, based on one or multiple sequence similarity parameters. In addition, searches can be restricted according to the predicted subcellular localization of the protein, the DNA strand of the corresponding gene and/or the description of the protein. Massive data search and/or retrieval are available, and different ways of exporting the result are offered. GenoMycDB provides an on-line resource for the functional classification of mycobacterial proteins as well as for the analysis of genome structure, organization, and evolution.
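GenoMycDB's dynamic selection of homolog pairs by user-defined similarity criteria can be sketched as filtering precomputed pairwise-alignment rows against thresholds. The rows, gene IDs and thresholds below are illustrative, not actual database content.

```python
# Precomputed pairwise-alignment results (illustrative rows).
pairs = [
    {"query": "Rv0001", "subject": "MT0001", "identity": 98.5, "evalue": 1e-180},
    {"query": "Rv0002", "subject": "MT0002", "identity": 45.0, "evalue": 1e-5},
    {"query": "Rv0003", "subject": "ML0003", "identity": 80.0, "evalue": 1e-60},
]

def select_homologs(rows, min_identity=70.0, max_evalue=1e-10):
    """Keep only pairs meeting user-defined similarity criteria."""
    return [r for r in rows
            if r["identity"] >= min_identity and r["evalue"] <= max_evalue]

hits = select_homologs(pairs)
```

In the real database such filters can also be combined with the predicted subcellular localization, DNA strand, or protein description of each entry.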
Rapamycin Increases Mortality in db/db Mice, a Mouse Model of Type 2 Diabetes.
Sataranatarajan, Kavithalakshmi; Ikeno, Yuji; Bokov, Alex; Feliers, Denis; Yalamanchili, Himabindu; Lee, Hak Joo; Mariappan, Meenalakshmi M; Tabatabai-Mir, Hooman; Diaz, Vivian; Prasad, Sanjay; Javors, Martin A; Ghosh Choudhury, Goutam; Hubbard, Gene B; Barnes, Jeffrey L; Richardson, Arlan; Kasinath, Balakuntalam S
2016-07-01
We examined the effect of rapamycin on the life span of a mouse model of type 2 diabetes, db/db mice. At 4 months of age, male and female C57BLKSJ-lepr (db/db) mice (db/db) were placed on either a control diet lacking rapamycin or a diet containing rapamycin, and were maintained on these diets over their life span. Rapamycin was found to reduce the life span of the db/db mice. The median survival of male db/db mice fed the control and rapamycin diets was 349 and 302 days, respectively, and the median survival of female db/db mice fed the control and rapamycin diets was 487 and 411 days, respectively. Adjusting for gender differences, rapamycin increased the mortality risk 1.7-fold in both male and female db/db mice. End-of-life pathological data showed that suppurative inflammation was the main cause of death in the db/db mice, and that it was enhanced slightly by rapamycin treatment. © The Author 2015. Published by Oxford University Press on behalf of The Gerontological Society of America. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
SalmonDB: a bioinformatics resource for Salmo salar and Oncorhynchus mykiss
Di Génova, Alex; Aravena, Andrés; Zapata, Luis; González, Mauricio; Maass, Alejandro; Iturra, Patricia
2011-01-01
SalmonDB is a new multiorganism database containing EST sequences from Salmo salar and Oncorhynchus mykiss and the whole-genome sequences of Danio rerio, Gasterosteus aculeatus, Tetraodon nigroviridis, Oryzias latipes and Takifugu rubripes, built with core components from the GMOD project, the GOPArc system and the BioMart project. The information provided by this resource includes Gene Ontology terms, metabolic pathways, SNP prediction, CDS prediction, ortholog prediction, several precalculated BLAST searches and domains. It also provides a BLAST server for matching user-provided sequences against any of the databases and an advanced query tool (BioMart) that allows easy browsing of the EST databases with user-defined criteria. These tools make the SalmonDB database a valuable resource for researchers searching for transcripts and genomic information regarding S. salar and other salmonid species. The database is expected to grow in the near future, particularly with the S. salar genome sequencing project. Database URL: http://genomicasalmones.dim.uchile.cl/ PMID:22120661
GigaDB: announcing the GigaScience database
2012-01-01
With the launch of GigaScience journal, here we provide insight into the accompanying database GigaDB, which allows the integration of manuscript publication with supporting data and tools. Reinforcing and upholding GigaScience’s goals to promote open-data and reproducibility of research, GigaDB also aims to provide a home, when a suitable public repository does not exist, for the supporting data or tools featured in the journal and beyond. PMID:23587345
Design, Implementation and Applications of 3d Web-Services in DB4GEO
NASA Astrophysics Data System (ADS)
Breunig, M.; Kuper, P. V.; Dittrich, A.; Wild, P.; Butwilowski, E.; Al-Doori, M.
2013-09-01
The object-oriented database architecture DB4GeO was originally designed to support sub-surface applications in the geo-sciences. This is reflected in DB4GeO's geometric data model as well as in its import and export functions. Initially, these functions were designed for communication with 3D geological modeling and visualization tools such as GOCAD or MeshLab. However, it soon became clear that DB4GeO was suitable for a much wider range of applications. Therefore it is natural to move away from a standalone solution and to open access to DB4GeO data via standardized OGC web services. Though REST and OGC services seem incompatible at first sight, the implementation in DB4GeO shows that an OGC-based implementation of web services may reuse parts of the DB4GeO REST implementation. Starting with initial solutions in the history of DB4GeO, this paper introduces the design, adaptation (i.e. model transformation), and first steps in the implementation of OGC Web Feature Service (WFS) and Web Processing Service (WPS) interfaces to DB4GeO data and operations. Among its capabilities, DB4GeO can provide data in different formats such as GML, GOCAD, or DB3D XML through a WFS, and can run operations such as a 3D-to-2D service or mesh simplification (Progressive Meshes) through a WPS. We then demonstrate an Android-based mobile 3D augmented reality viewer for DB4GeO that uses the Web Feature Service to visualize 3D geo-database query results. Finally, we explore future research work considering DB4GeO in the framework of the research group "Computer-Aided Collaborative Subway Track Planning in Multi-Scale 3D City and Building Models".
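A WFS GetFeature request of the kind such a front end answers is a simple key-value URL. A minimal sketch follows; the endpoint and feature type name are hypothetical, only the parameter names follow the OGC WFS key-value convention:

```python
# Sketch: assembling an OGC WFS GetFeature request. The endpoint and
# feature type are invented; the key-value parameters follow the WFS
# standard's KVP encoding.
from urllib.parse import urlencode

def getfeature_url(endpoint, type_name, output_format="GML3", max_features=10):
    params = {
        "service": "WFS",
        "version": "1.1.0",
        "request": "GetFeature",
        "typeName": type_name,
        "outputFormat": output_format,
        "maxFeatures": str(max_features),
    }
    return endpoint + "?" + urlencode(params)

url = getfeature_url("https://example.org/db4geo/wfs",  # hypothetical endpoint
                     "geo:GeologicalSurface")           # hypothetical type
print(url)
```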
Atomic analysis of protein-protein interfaces with known inhibitors: the 2P2I database.
Bourgeas, Raphaël; Basse, Marie-Jeanne; Morelli, Xavier; Roche, Philippe
2010-03-09
In the last decade, the inhibition of protein-protein interactions (PPIs) has emerged from both academic and private research as a new way to modulate the activity of proteins. Inhibitors of these original interactions are certainly the next generation of highly innovative drugs that will reach the market in the next decade. However, in silico design of such compounds still remains challenging. Here we describe this particular PPI chemical space through the presentation of 2P2I(DB), a hand-curated database dedicated to the structure of PPIs with known inhibitors. We have analyzed protein/protein and protein/inhibitor interfaces in terms of geometrical parameters, atom and residue properties, buried accessible surface area and other biophysical parameters. The interfaces found in 2P2I(DB) were then compared to those of representative datasets of heterodimeric complexes. We propose a new classification of PPIs with known inhibitors into two classes depending on the number of segments present at the interface and corresponding to either a single secondary structure element or to a more globular interacting domain. 2P2I(DB) complexes share global shape properties with standard transient heterodimer complexes, but their accessible surface areas are significantly smaller. No major conformational changes are seen between the different states of the proteins. The interfaces are more hydrophobic than general PPI interfaces, with fewer charged residues and more non-polar atoms. Finally, fifty percent of the complexes in the 2P2I(DB) dataset possess more hydrogen bonds than typical protein-protein complexes. Potential areas of study for the future are proposed, which include a new classification system consisting of specific families and the identification of PPI targets with high druggability potential based on key descriptors of the interaction.
The 2P2I database stores structural information about PPIs with known inhibitors and provides a useful tool for biologists to assess the potential druggability of their interfaces. The database can be accessed at http://2p2idb.cnrs-mrs.fr.
AgeFactDB--the JenAge Ageing Factor Database--towards data integration in ageing research.
Hühne, Rolf; Thalheim, Torsten; Sühnel, Jürgen
2014-01-01
AgeFactDB (http://agefactdb.jenage.de) is a database aimed at the collection and integration of ageing phenotype data including lifespan information. Ageing factors are considered to be genes, chemical compounds or other factors such as dietary restriction, whose action results in a changed lifespan or another ageing phenotype. Any information related to the effects of ageing factors is called an observation and is presented on observation pages. To provide concise access to the complete information for a particular ageing factor, corresponding observations are also summarized on ageing factor pages. In a first step, ageing-related data were primarily taken from existing databases such as the Ageing Gene Database--GenAge, the Lifespan Observations Database and the Dietary Restriction Gene Database--GenDR. In addition, we have started to include new ageing-related information. Based on homology data taken from the HomoloGene Database, AgeFactDB also provides observation and ageing factor pages of genes that are homologous to known ageing-related genes. These homologues are considered as candidate or putative ageing-related genes. AgeFactDB offers a variety of search and browse options, and also allows the download of ageing factor or observation lists in TSV, CSV and XML formats.
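The TSV export mentioned above could be consumed with a few lines of Python. This is a sketch only: the column names and sample rows below are invented, and the real export's schema should be checked in the downloaded file:

```python
# Sketch: reading a TSV observation list of the kind AgeFactDB exports.
# Column names and rows are invented for illustration.
import csv, io

sample_tsv = (
    "ageing_factor\tspecies\tlifespan_effect\n"
    "daf-2\tCaenorhabditis elegans\tincreased\n"
    "dietary restriction\tMus musculus\tincreased\n"
)

def load_observations(text):
    """Parse a tab-separated export into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))

obs = load_observations(sample_tsv)
print(len(obs), obs[0]["ageing_factor"])
```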
Atomic Spectra Bibliography Databases at NIST
NASA Astrophysics Data System (ADS)
Kramida, Alexander
2010-03-01
NIST's Atomic Spectroscopy Data Center maintains three online bibliographic databases (BDs) [http://physics.nist.gov/PhysRefData/ASBib1/index.html]: Atomic Energy Levels and Spectra (AEL BD), Atomic Transition Probability (ATP BD), and Atomic Spectral Line Broadening (ALB BD). This year marks new releases of these BDs: AEL BD v.2.0, ATP BD v.9.0, and ALB BD v.3.0. These releases incorporate significant improvements in the quantity and quality of bibliographic data since the previous versions, first published in 2006. The total number of papers in the three BDs grew from 20,000 to 30,000. The data search is now made easier, and the returned content is enriched with direct links to online journal articles and their Digital Object Identifiers. Statistics show a nearly constant flow of new publications on atomic spectroscopy, about 600 new papers published each year since 1968. New papers are inserted into our BDs every two weeks on average.
Sakai, Hiroaki; Lee, Sung Shin; Tanaka, Tsuyoshi; Numa, Hisataka; Kim, Jungsok; Kawahara, Yoshihiro; Wakimoto, Hironobu; Yang, Ching-chia; Iwamoto, Masao; Abe, Takashi; Yamada, Yuko; Muto, Akira; Inokuchi, Hachiro; Ikemura, Toshimichi; Matsumoto, Takashi; Sasaki, Takuji; Itoh, Takeshi
2013-02-01
The Rice Annotation Project Database (RAP-DB, http://rapdb.dna.affrc.go.jp/) has been providing a comprehensive set of gene annotations for the genome sequence of rice, Oryza sativa (japonica group) cv. Nipponbare. Since the first release in 2005, RAP-DB has been updated several times along with the genome assembly updates. Here, we present our newest RAP-DB based on the latest genome assembly, Os-Nipponbare-Reference-IRGSP-1.0 (IRGSP-1.0), which was released in 2011. We detected 37,869 loci by mapping transcript and protein sequences of 150 monocot species. To provide plant researchers with highly reliable and up-to-date rice gene annotations, we have been incorporating literature-based manually curated data, and 1,626 loci currently incorporate literature-based annotation data, including commonly used gene names or gene symbols. Transcriptional activities are shown at the nucleotide level by mapping RNA-Seq reads derived from 27 samples. We also mapped the Illumina reads of a Japanese leading japonica cultivar, Koshihikari, and a Chinese indica cultivar, Guangluai-4, to the genome and show alignments together with the single nucleotide polymorphisms (SNPs) and gene functional annotations through a newly developed browser, Short-Read Assembly Browser (S-RAB). We have developed two satellite databases, Plant Gene Family Database (PGFD) and Integrative Database of Cereal Gene Phylogeny (IDCGP), which display gene family and homologous gene relationships among diverse plant species. RAP-DB and the satellite databases offer simple and user-friendly web interfaces, enabling plant and genome researchers to access the data easily and facilitating a broad range of plant research topics.
The Molecular Signatures Database (MSigDB) hallmark gene set collection.
Liberzon, Arthur; Birger, Chet; Thorvaldsdóttir, Helga; Ghandi, Mahmoud; Mesirov, Jill P; Tamayo, Pablo
2015-12-23
The Molecular Signatures Database (MSigDB) is one of the most widely used and comprehensive databases of gene sets for performing gene set enrichment analysis. Since its creation, MSigDB has grown beyond its roots in metabolic disease and cancer to include >10,000 gene sets. These better represent a wider range of biological processes and diseases, but the utility of the database is reduced by increased redundancy across, and heterogeneity within, gene sets. To address this challenge, here we use a combination of automated approaches and expert curation to develop a collection of "hallmark" gene sets as part of MSigDB. Each hallmark in this collection consists of a "refined" gene set, derived from multiple "founder" sets, that conveys a specific biological state or process and displays coherent expression. The hallmarks effectively summarize most of the relevant information of the original founder sets and, by reducing both variation and redundancy, provide more refined and concise inputs for gene set enrichment analysis.
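Hallmark sets are typically consumed by enrichment tools. As a stand-in for full rank-based GSEA, a minimal over-representation test (hypergeometric tail) illustrates how such a gene set is scored against a hit list; the gene names and universe size below are invented:

```python
# Sketch: a minimal over-representation test for a gene set against a
# list of "hit" genes. This is NOT the full GSEA algorithm (which is
# rank-based); it only illustrates how a hallmark set is consumed.
from math import comb

def hypergeom_pval(N, K, n, k):
    """P(X >= k) when drawing n genes from a universe of N with K in the set."""
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1)) / comb(N, n)

universe = 20000                                 # assumed genome size
hallmark = {"TNF", "IL6", "NFKB1", "CXCL8"}      # toy "hallmark" set
hits = {"TNF", "IL6", "TP53"}                    # toy differential genes

k = len(hallmark & hits)
p = hypergeom_pval(universe, len(hallmark), len(hits), k)
print(k, p)
```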
Friedrich, Anne; Garnier, Nicolas; Gagnière, Nicolas; Nguyen, Hoan; Albou, Laurent-Philippe; Biancalana, Valérie; Bettler, Emmanuel; Deléage, Gilbert; Lecompte, Odile; Muller, Jean; Moras, Dino; Mandel, Jean-Louis; Toursel, Thierry; Moulinier, Luc; Poch, Olivier
2010-02-01
Understanding how genetic alterations affect gene products at the molecular level represents a first step in the elucidation of the complex relationships between genotypic and phenotypic variations, and is thus a major challenge in the postgenomic era. Here, we present SM2PH-db (http://decrypthon.igbmc.fr/sm2ph), a new database designed to investigate structural and functional impacts of missense mutations and their phenotypic effects in the context of human genetic diseases. A wealth of up-to-date interconnected information is provided for each of the 2,249 disease-related entry proteins (August 2009), including data retrieved from biological databases and data generated from a Sequence-Structure-Evolution Inference in Systems-based approach, such as multiple alignments, three-dimensional structural models, and multidimensional (physicochemical, functional, structural, and evolutionary) characterizations of mutations. SM2PH-db provides a robust infrastructure associated with interactive analysis tools supporting in-depth study and interpretation of the molecular consequences of mutations, with the more long-term goal of elucidating the chain of events leading from a molecular defect to its pathology. The entire content of SM2PH-db is regularly and automatically updated thanks to computational grid and data federation facilities provided in the context of the Decrypthon program. (c) 2009 Wiley-Liss, Inc.
MIPS PlantsDB: a database framework for comparative plant genome research.
Nussbaumer, Thomas; Martis, Mihaela M; Roessner, Stephan K; Pfeifer, Matthias; Bader, Kai C; Sharma, Sapna; Gundlach, Heidrun; Spannagl, Manuel
2013-01-01
The rapidly increasing amount of plant genome (sequence) data enables powerful comparative analyses and integrative approaches and also requires structured and comprehensive information resources. Databases are needed for both model and crop plant organisms and both intuitive search/browse views and comparative genomics tools should communicate the data to researchers and help them interpret it. MIPS PlantsDB (http://mips.helmholtz-muenchen.de/plant/genomes.jsp) was initially described in NAR in 2007 [Spannagl,M., Noubibou,O., Haase,D., Yang,L., Gundlach,H., Hindemitt, T., Klee,K., Haberer,G., Schoof,H. and Mayer,K.F. (2007) MIPSPlantsDB-plant database resource for integrative and comparative plant genome research. Nucleic Acids Res., 35, D834-D840] and was set up from the start to provide data and information resources for individual plant species as well as a framework for integrative and comparative plant genome research. PlantsDB comprises database instances for tomato, Medicago, Arabidopsis, Brachypodium, Sorghum, maize, rice, barley and wheat. Building up on that, state-of-the-art comparative genomics tools such as CrowsNest are integrated to visualize and investigate syntenic relationships between monocot genomes. Results from novel genome analysis strategies targeting the complex and repetitive genomes of triticeae species (wheat and barley) are provided and cross-linked with model species. The MIPS Repeat Element Database (mips-REdat) and Catalog (mips-REcat) as well as tight connections to other databases, e.g. via web services, are further important components of PlantsDB.
Photo-z-SQL: Photometric redshift estimation framework
NASA Astrophysics Data System (ADS)
Beck, Róbert; Dobos, László; Budavári, Tamás; Szalay, Alexander S.; Csabai, István
2017-04-01
Photo-z-SQL is a flexible template-based photometric redshift estimation framework that can be seamlessly integrated into a SQL database (or DB) server and executed on demand in SQL. The DB integration eliminates the need to move large photometric datasets outside a database for redshift estimation, and uses the computational capabilities of DB hardware. Photo-z-SQL performs both maximum likelihood and Bayesian estimation and handles inputs of variable photometric filter sets and corresponding broad-band magnitudes.
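The maximum-likelihood core of template-based photometric redshift estimation can be sketched outside SQL as a minimum chi-square grid search. This Python stand-in uses toy fluxes, errors, and a three-point redshift grid (all numbers invented), not Photo-z-SQL's actual in-database implementation:

```python
# Sketch: template-based photo-z as a minimum chi-square grid search,
# with a free amplitude scaling per template. All numbers are toy values.

def chi2(obs, err, model):
    # Fit the free amplitude analytically before comparing.
    scale = sum(o * m / e**2 for o, m, e in zip(obs, model, err)) \
          / sum(m * m / e**2 for m, e in zip(model, err))
    return sum(((o - scale * m) / e) ** 2 for o, m, e in zip(obs, model, err))

# Toy template fluxes precomputed on a redshift grid (assumed inputs).
template_grid = {0.1: [1.0, 0.8, 0.5], 0.5: [0.6, 0.9, 0.7], 1.0: [0.3, 0.6, 1.0]}
observed, errors = [0.61, 0.93, 0.68], [0.05, 0.05, 0.05]

best_z = min(template_grid, key=lambda z: chi2(observed, errors, template_grid[z]))
print(best_z)
```

Bayesian estimation, as the abstract notes, would instead weight the whole grid by exp(-chi2/2) and any priors rather than taking the single minimum.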
Volcanic observation data and simulation database at NIED, Japan (Invited)
NASA Astrophysics Data System (ADS)
Fujita, E.; Ueda, H.; Kozono, T.
2009-12-01
NIED (Nat’l Res. Inst. for Earth Sci. & Disast. Prev.) has a project to develop two volcanic database systems: (1) a volcanic observation database; (2) a volcanic simulation database. The volcanic observation database is the archive center for data obtained by the geophysical observation networks at Mt. Fuji, Miyake, Izu-Oshima, Iwo-jima and Nasu volcanoes, central Japan. The data consist of seismic (both high-sensitivity and broadband), ground deformation (tiltmeter, GPS) and other sensor data (e.g., rain gauge, gravimeter, magnetometer, pressure gauge). These data are originally stored in “WIN format,” the Japanese standard format, which is also used by Hi-net (High Sensitivity Seismograph Network Japan, http://www.hinet.bosai.go.jp/). NIED has joined WOVOdat and we have prepared to upload our data via XML format. Our concept of the XML format is that it serves as 1) a common format for intermediate files uploaded into the WOVOdat DB, 2) a format for data files downloaded from the WOVOdat DB, 3) a format for data exchange between observatories without the WOVOdat DB, 4) a common data file format within each observatory, 5) a format for data communication between systems and software, and 6) a common input format for software tools. NIED is now preparing (2), the volcanic simulation database. The objective of this project is to support the development of a “real-time” hazard map, i.e., a system that is effective for evaluating volcanic hazards in an emergency, incorporating up-to-date conditions. Our system will include lava flow simulation (LavaSIM) and pyroclastic flow simulation (grvcrt). The database will keep many cases of assumed simulations so that the most probable case can be picked as a first evaluation once an eruption starts. The final goal of both databases is to realize volcanic eruption prediction and forecasting in real time through the combination of monitoring data and numerical simulations.
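An XML exchange record of the kind described could be assembled as below. This is a sketch only: the element and attribute names are invented placeholders, not the actual WOVOdat schema:

```python
# Sketch: packaging one monitoring observation as XML for exchange.
# Element and attribute names are invented, not the WOVOdat schema.
import xml.etree.ElementTree as ET

obs = ET.Element("observation", volcano="Miyake", network="NIED")
ET.SubElement(obs, "sensor", type="tiltmeter", station="MKA")
sample = ET.SubElement(obs, "sample", time="2009-07-01T00:00:00Z")
sample.text = "12.5"

record = ET.tostring(obs, encoding="unicode")
print(record)
```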
HoPaCI-DB: host-Pseudomonas and Coxiella interaction database
Bleves, Sophie; Dunger, Irmtraud; Walter, Mathias C.; Frangoulidis, Dimitrios; Kastenmüller, Gabi; Voulhoux, Romé; Ruepp, Andreas
2014-01-01
Bacterial infectious diseases are the result of multifactorial processes affected by the interplay between virulence factors and host targets. The host-Pseudomonas and Coxiella interaction database (HoPaCI-DB) is a publicly available manually curated integrative database (http://mips.helmholtz-muenchen.de/HoPaCI/) of host–pathogen interaction data from Pseudomonas aeruginosa and Coxiella burnetii. The resource provides structured information on 3585 experimentally validated interactions between molecules, bioprocesses and cellular structures extracted from the scientific literature. Systematic annotation and interactive graphical representation of disease networks make HoPaCI-DB a versatile knowledge base for biologists and network biology approaches. PMID:24137008
ESTree db: a Tool for Peach Functional Genomics
Lazzari, Barbara; Caprera, Andrea; Vecchietti, Alberto; Stella, Alessandra; Milanesi, Luciano; Pozzi, Carlo
2005-01-01
Background The ESTree db represents a collection of Prunus persica expressed sequence tags (ESTs) and is intended as a resource for peach functional genomics. A total of 6,155 successful EST sequences were obtained from four in-house prepared cDNA libraries from Prunus persica mesocarps at different developmental stages. Another 12,475 peach EST sequences were downloaded from public databases and added to the ESTree db. An automated pipeline was prepared to process EST sequences using public software integrated by in-house developed Perl scripts and data were collected in a MySQL database. A PHP-based web interface was developed to query the database. Results The ESTree db version as of April 2005 encompasses 18,630 sequences representing eight libraries. Contig assembly was performed with CAP3. Putative single nucleotide polymorphism (SNP) detection was performed with the AutoSNP program and a search engine was implemented to retrieve results. All the sequences and all the contig consensus sequences were annotated both with blastx against the GenBank nr db and with GOblet against the viridiplantae section of the Gene Ontology db. Links to NiceZyme (Expasy) and to the KEGG metabolic pathways were provided. A local BLAST utility is available. A text search utility allows querying and browsing the database. Statistics were provided on Gene Ontology occurrences to assign sequences to Gene Ontology categories. Conclusion The resulting database is a comprehensive resource of data and links related to peach EST sequences. The Sequence Report and Contig Report pages work as the web interface core structures, giving quick access to data related to each sequence/contig. PMID:16351742
ESTree db: a tool for peach functional genomics.
Lazzari, Barbara; Caprera, Andrea; Vecchietti, Alberto; Stella, Alessandra; Milanesi, Luciano; Pozzi, Carlo
2005-12-01
The ESTree db http://www.itb.cnr.it/estree/ represents a collection of Prunus persica expressed sequence tags (ESTs) and is intended as a resource for peach functional genomics. A total of 6,155 successful EST sequences were obtained from four in-house prepared cDNA libraries from Prunus persica mesocarps at different developmental stages. Another 12,475 peach EST sequences were downloaded from public databases and added to the ESTree db. An automated pipeline was prepared to process EST sequences using public software integrated by in-house developed Perl scripts and data were collected in a MySQL database. A PHP-based web interface was developed to query the database. The ESTree db version as of April 2005 encompasses 18,630 sequences representing eight libraries. Contig assembly was performed with CAP3. Putative single nucleotide polymorphism (SNP) detection was performed with the AutoSNP program and a search engine was implemented to retrieve results. All the sequences and all the contig consensus sequences were annotated both with blastx against the GenBank nr db and with GOblet against the viridiplantae section of the Gene Ontology db. Links to NiceZyme (Expasy) and to the KEGG metabolic pathways were provided. A local BLAST utility is available. A text search utility allows querying and browsing the database. Statistics were provided on Gene Ontology occurrences to assign sequences to Gene Ontology categories. The resulting database is a comprehensive resource of data and links related to peach EST sequences. The Sequence Report and Contig Report pages work as the web interface core structures, giving quick access to data related to each sequence/contig.
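The SNP detection step described above scans aligned ESTs column by column. A toy sketch of the idea follows, requiring each variant allele to appear at least twice, a redundancy criterion in the spirit of AutoSNP, whose real scoring is considerably richer; the aligned reads are invented:

```python
# Sketch: scanning a toy EST contig alignment for candidate SNPs by
# column, keeping positions where two alleles each occur >= 2 times.
from collections import Counter

aligned_ests = [
    "ATGCTACGT",
    "ATGCTACGT",
    "ATGTTACGT",
    "ATGTTACGA",
]

def candidate_snps(reads, min_allele_count=2):
    snps = []
    for col in range(len(reads[0])):
        counts = Counter(read[col] for read in reads)
        alleles = [b for b, c in counts.items() if c >= min_allele_count]
        if len(alleles) >= 2:
            snps.append((col, sorted(alleles)))
    return snps

print(candidate_snps(aligned_ests))
```

Here only column 3 qualifies; the single 'A' in the last column fails the redundancy threshold, which is how such filters separate putative SNPs from sequencing errors.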
MetaMetaDB: A Database and Analytic System for Investigating Microbial Habitability
Yang, Ching-chia; Iwasaki, Wataru
2014-01-01
MetaMetaDB (http://mmdb.aori.u-tokyo.ac.jp/) is a database and analytic system for investigating microbial habitability, i.e., how a prokaryotic group can inhabit different environments. The interaction between prokaryotes and the environment is a key issue in microbiology because distinct prokaryotic communities maintain distinct ecosystems. Because 16S ribosomal RNA (rRNA) sequences play pivotal roles in identifying prokaryotic species, a system that comprehensively links diverse environments to 16S rRNA sequences of the inhabitant prokaryotes is necessary for the systematic understanding of microbial habitability. However, existing databases are biased toward culturable prokaryotes and exhibit limitations in the comprehensiveness of the data because most prokaryotes are unculturable. Recently, metagenomic and 16S rRNA amplicon sequencing approaches have generated abundant 16S rRNA sequence data that encompass unculturable prokaryotes across diverse environments; however, these data are usually buried in large databases and are difficult to access. In this study, we developed MetaMetaDB (Meta-Metagenomic DataBase), which comprehensively and compactly covers 16S rRNA sequences retrieved from public datasets. Using MetaMetaDB, users can quickly generate hypotheses regarding the types of environments a prokaryotic group may be adapted to. We anticipate that MetaMetaDB will improve our understanding of the diversity and evolution of prokaryotes. PMID:24475242
ESTuber db: an online database for Tuber borchii EST sequences.
Lazzari, Barbara; Caprera, Andrea; Cosentino, Cristian; Stella, Alessandra; Milanesi, Luciano; Viotti, Angelo
2007-03-08
The ESTuber database (http://www.itb.cnr.it/estuber) includes 3,271 Tuber borchii expressed sequence tags (ESTs). The dataset consists of 2,389 sequences from an in-house prepared cDNA library from truffle vegetative hyphae, and 882 sequences downloaded from GenBank and representing four libraries from white truffle mycelia and ascocarps at different developmental stages. An automated pipeline was prepared to process EST sequences using public software integrated by in-house developed Perl scripts. Data were collected in a MySQL database, which can be queried via a PHP-based web interface. Sequences included in the ESTuber db were clustered and annotated against three databases: the GenBank nr database, the UniProtKB database and a third in-house prepared database of fungal genomic sequences. An algorithm was implemented to infer statistical classification among Gene Ontology categories from the ontology occurrences deduced from the annotation procedure against the UniProtKB database. Ontologies were also deduced from the annotation of more than 130,000 EST sequences from five filamentous fungi, for intra-species comparison purposes. Further analyses were performed on the ESTuber db dataset, including tandem repeats search and comparison of the putative protein dataset inferred from the EST sequences to the PROSITE database for protein patterns identification. All the analyses were performed both on the complete sequence dataset and on the contig consensus sequences generated by the EST assembly procedure. The resulting web site is a resource of data and links related to truffle expressed genes. The Sequence Report and Contig Report pages are the web interface core structures which, together with the Text search utility and the Blast utility, allow easy access to the data stored in the database.
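Both EST resources above tally Gene Ontology occurrences to build per-category statistics. A minimal sketch of such tallying follows; the sequence IDs and GO terms are invented:

```python
# Sketch: counting Gene Ontology term occurrences across annotated
# sequences. IDs and terms are invented for illustration.
from collections import Counter

annotations = {
    "TbEST0001": ["GO:0005524", "GO:0006468"],
    "TbEST0002": ["GO:0005524"],
    "TbEST0003": ["GO:0016301", "GO:0006468"],
}

go_counts = Counter(term for terms in annotations.values() for term in terms)
for term, count in go_counts.most_common():
    print(term, count)
```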
WebDB Component Builder - Lessons Learned
DOE Office of Scientific and Technical Information (OSTI.GOV)
Macedo, C.
2000-02-15
Oracle WebDB is the easiest way to produce web-enabled lightweight and enterprise-centric applications. This concept from Oracle has tantalized our taste for simple web development by using a purely web based tool that lives nowhere else but in the database. The use of online wizards, templates, and query builders, which produces PL/SQL behind the curtains, can be used straight "out of the box" by both novice and seasoned developers. The topic of this presentation will introduce lessons learned by developing and deploying applications built using the WebDB Component Builder in conjunction with custom PL/SQL code to empower a hybrid application. There are two kinds of WebDB components: those that display data to end users via reporting, and those that let end users update data in the database via entry forms. The presentation will also discuss various methods within the Component Builder to enhance the applications pushed to the desktop. The demonstrated example is an application entitled HOME (Helping Others More Effectively) that was built to manage a yearly United Way Campaign effort. Our task was to build an end-to-end application which could manage approximately 900 non-profit agencies, an average of 4,100 individual contributions, and $1.2 million. Using WebDB, the shell of the application was put together in a matter of a few weeks. However, we did encounter some hurdles that WebDB, in its infancy (v2.0), could not solve for us directly. Together with custom PL/SQL, WebDB's Component Builder became a powerful tool that enabled us to produce a very flexible hybrid application.
High Throughput Experimental Materials Database
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zakutayev, Andriy; Perkins, John; Schwarting, Marcus
The mission of the High Throughput Experimental Materials Database (HTEM DB) is to enable discovery of new materials with useful properties by releasing large amounts of high-quality experimental data to the public. The HTEM DB contains information about materials obtained from high-throughput experiments at the National Renewable Energy Laboratory (NREL).
Methods to Secure Databases Against Vulnerabilities
2015-12-01
for several languages such as C, C++, PHP, Java and Python [16]. MySQL will work well with very large databases. The documentation references... using Eclipse and connected to each database management system using Python and Java drivers provided by MySQL, MongoDB, and DataStax (for Cassandra)... tiers in Python and Java. Injection test results (Problem / MySQL / MongoDB / Cassandra): 1.a Tautologies: Vulnerable / Vulnerable / Not Vulnerable; 1.b Illegal query
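The "tautology" injection tested above can be demonstrated in a few lines. This sketch uses Python's stdlib sqlite3 as a self-contained stand-in for the MySQL/MongoDB/Cassandra setups in the report: concatenating user input into SQL lets `' OR '1'='1` match every row, while a parameterized query binds the same input as a literal string:

```python
# Sketch: tautology-based SQL injection vs a parameterized query,
# using sqlite3 as a stand-in for the report's database systems.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("alice", "s1"), ("bob", "s2")])

malicious = "nobody' OR '1'='1"

# Vulnerable: string concatenation builds the tautology into the SQL.
unsafe = conn.execute(
    "SELECT * FROM users WHERE name = '" + malicious + "'").fetchall()

# Safe: the driver binds the input as data, not as SQL.
safe = conn.execute(
    "SELECT * FROM users WHERE name = ?", (malicious,)).fetchall()

print(len(unsafe), len(safe))
```

The vulnerable query returns every row; the parameterized one returns none, since no user is literally named `nobody' OR '1'='1`.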
Jefferson, Emily R.; Walsh, Thomas P.; Roberts, Timothy J.; Barton, Geoffrey J.
2007-01-01
SNAPPI-DB, a high performance database of Structures, iNterfaces and Alignments of Protein–Protein Interactions, and its associated Java Application Programming Interface (API) is described. SNAPPI-DB contains structural data, down to the level of atom co-ordinates, for each structure in the Protein Data Bank (PDB) together with associated data including SCOP, CATH, Pfam, SWISSPROT, InterPro, GO terms, Protein Quaternary Structures (PQS) and secondary structure information. Domain–domain interactions are stored for multiple domain definitions and are classified by their Superfamily/Family pair and interaction interface. Each set of classified domain–domain interactions has an associated multiple structure alignment for each partner. The API facilitates data access via PDB entries, domains and domain–domain interactions. Rapid development, fast database access and the ability to perform advanced queries without the requirement for complex SQL statements are provided via an object oriented database and the Java Data Objects (JDO) API. SNAPPI-DB contains many features which are not available in other databases of structural protein–protein interactions. It has been applied in three studies on the properties of protein–protein interactions and is currently being employed to train a protein–protein interaction predictor and a functional residue predictor. The database, API and manual are available for download at: . PMID:17202171
Human Mitochondrial Protein Database
National Institute of Standards and Technology Data Gateway
SRD 131 Human Mitochondrial Protein Database (Web, free access) The Human Mitochondrial Protein Database (HMPDb) provides comprehensive data on mitochondrial and human nuclear encoded proteins involved in mitochondrial biogenesis and function. This database consolidates information from SwissProt, LocusLink, Protein Data Bank (PDB), GenBank, Genome Database (GDB), Online Mendelian Inheritance in Man (OMIM), Human Mitochondrial Genome Database (mtDB), MITOMAP, Neuromuscular Disease Center and Human 2-D PAGE Databases. This database is intended as a tool to aid not only in studying the mitochondrion but also in studying the associated diseases.
KMeyeDB: a graphical database of mutations in genes that cause eye diseases.
Kawamura, Takashi; Ohtsubo, Masafumi; Mitsuyama, Susumu; Ohno-Nakamura, Saho; Shimizu, Nobuyoshi; Minoshima, Shinsei
2010-06-01
KMeyeDB (http://mutview.dmb.med.keio.ac.jp/) is a database of human gene mutations that cause eye diseases. We have substantially enriched the amount of data in the database, which now contains information about the mutations of 167 human genes causing eye-related diseases including retinitis pigmentosa, cone-rod dystrophy, night blindness, Oguchi disease, Stargardt disease, macular degeneration, Leber congenital amaurosis, corneal dystrophy, cataract, glaucoma, retinoblastoma, Bardet-Biedl syndrome, and Usher syndrome. KMeyeDB is operated using the database software MutationView, which deals with various characteristics of mutations, gene structure, protein functional domains, and polymerase chain reaction (PCR) primers, as well as clinical data for each case. Users can access the database using an ordinary Internet browser with a smooth user interface, without user registration. The results are displayed in graphical windows together with statistical calculations. All mutations and associated data have been collected from published articles. Careful data analysis with KMeyeDB revealed many interesting features regarding the mutations in 167 genes that cause 326 different types of eye diseases. Some genes are involved in multiple types of eye diseases, whereas several eye diseases are caused by different mutations in one gene.
AtomDB: Expanding an Accessible and Accurate Atomic Database for X-ray Astronomy
NASA Astrophysics Data System (ADS)
Smith, Randall
Since its inception in 2001, the AtomDB has become the standard repository of accurate and accessible atomic data for the X-ray astrophysics community, including laboratory astrophysicists, observers, and modelers. Modern calculations of collisional excitation rates now exist - and are in AtomDB - for all abundant ions in a hot plasma. AtomDB has expanded beyond providing just a collisional model, and now also contains photoionization data from XSTAR as well as a charge exchange model, amongst others. However, building and maintaining an accurate and complete database that can fully exploit the diagnostic potential of high-resolution X-ray spectra requires further work. The Hitomi results, sadly limited as they were, demonstrated the urgent need for the best possible wavelength and rate data, not merely for the strongest lines but for the diagnostic features that may have 1% or less of the flux of the strong lines. In particular, incorporation of weak but powerfully diagnostic satellite lines will be crucial to understanding the spectra expected from upcoming deep observations with Chandra and XMM-Newton, as well as the XARM and Athena satellites. Beyond incorporating this new data, a number of groups, both experimental and theoretical, have begun to produce data with errors and/or sensitivity estimates. We plan to use this to create statistically meaningful spectral errors on collisional plasmas, providing practical uncertainties together with model spectra. 
We propose to continue to (1) engage the X-ray astrophysics community regarding their issues and needs, notably by a critical comparison with other related databases and tools, (2) enhance AtomDB to incorporate a large number of satellite lines as well as updated wavelengths with error estimates, (3) continue to update the AtomDB with the latest calculations and laboratory measurements, in particular velocity-dependent charge exchange rates, and (4) enhance existing tools, and create new ones as needed to increase the functionality of, and access to, AtomDB.
Virus Database and Online Inquiry System Based on Natural Vectors.
Dong, Rui; Zheng, Hui; Tian, Kun; Yau, Shek-Chung; Mao, Weiguang; Yu, Wenping; Yin, Changchuan; Yu, Chenglong; He, Rong Lucy; Yang, Jie; Yau, Stephen St
2017-01-01
We construct a virus database called VirusDB (http://yaulab.math.tsinghua.edu.cn/VirusDB/) and an online inquiry system to serve people who are interested in viral classification and prediction. The database stores all viral genomes, their corresponding natural vectors, and the classification information of the single/multiple-segmented viral reference sequences downloaded from the National Center for Biotechnology Information. The online inquiry system serves the purpose of computing natural vectors and their distances based on submitted genomes, providing an online interface for accessing and using the database for viral classification and prediction, and running back-end processes for automatic and manual updating of database content to synchronize with GenBank. Computations on genome data submitted in FASTA format are carried out, and the prediction results, with the five closest neighbors and their classifications, are returned by email. Considering the one-to-one correspondence between sequence and natural vector, its time efficiency, and its high accuracy, the natural vector approach is a significant advance over alignment methods, which makes VirusDB a useful database for further research.
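The natural vector encoding mentioned above can be illustrated with a short sketch. This follows the commonly published 12-dimensional definition (per-nucleotide counts, mean positions, and normalized second central moments); the exact normalization used by VirusDB may differ.

```python
def natural_vector(seq):
    """Compute a 12-dimensional natural vector of a DNA sequence:
    for each nucleotide k, the count n_k, the mean position mu_k, and
    the normalized second central moment
    D2_k = sum((p - mu_k)**2) / (n_k * n), with n the sequence length."""
    seq = seq.upper()
    n = len(seq)
    vec = []
    for k in "ACGT":
        positions = [i + 1 for i, base in enumerate(seq) if base == k]
        n_k = len(positions)
        mu_k = sum(positions) / n_k if n_k else 0.0
        d2_k = sum((p - mu_k) ** 2 for p in positions) / (n_k * n) if n_k else 0.0
        vec.extend([n_k, mu_k, d2_k])
    return vec
```

Classification by nearest neighbors, as in the inquiry system, then reduces to comparing Euclidean distances between these fixed-length vectors, which is what makes the approach alignment-free.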
The Web-Based DNA Vaccine Database DNAVaxDB and Its Usage for Rational DNA Vaccine Design.
Racz, Rebecca; He, Yongqun
2016-01-01
A DNA vaccine is a vaccine that uses a mammalian expression vector to express one or more protein antigens and is administered in vivo to induce an adaptive immune response. Since the 1990s, a significant amount of research has been performed on DNA vaccines and the mechanisms behind them. To meet the needs of the DNA vaccine research community, we created DNAVaxDB ( http://www.violinet.org/dnavaxdb ), the first Web-based database and analysis resource of experimentally verified DNA vaccines. All the data in DNAVaxDB, which includes plasmids, antigens, vaccines, and sources, is manually curated and experimentally verified. This chapter goes over the detail of DNAVaxDB system and shows how the DNA vaccine database, combined with the Vaxign vaccine design tool, can be used for rational design of a DNA vaccine against a pathogen, such as Mycobacterium bovis.
Ibmdbpy-spatial: An open-source implementation of in-database geospatial analytics in Python
NASA Astrophysics Data System (ADS)
Roy, Avipsa; Fouché, Edouard; Rodriguez Morales, Rafael; Moehler, Gregor
2017-04-01
As the amount of spatial data acquired from several geodetic sources has grown over the years and as data infrastructure has become more powerful, the need for adoption of in-database analytic technology within the geosciences has grown rapidly. In-database analytics on spatial data stored in a traditional enterprise data warehouse enables much faster retrieval and analysis for making better predictions about risks and opportunities, identifying trends and spotting anomalies. Although there are a number of open-source spatial analysis libraries like geopandas and shapely available today, most of them are restricted to manipulation and analysis of geometric objects, with a dependency on GEOS and similar libraries. We present an open-source software package, written in Python, to fill the gap between spatial analysis and in-database analytics. Ibmdbpy-spatial provides a geospatial extension to the ibmdbpy package, implemented in 2015. It provides an interface for spatial data manipulation and access to in-database algorithms in IBM dashDB, a data warehouse platform with a spatial extender that runs as a service on IBM's cloud platform, Bluemix. Working in-database reduces network overload: the complete data need not be replicated onto the user's local system, and only a subset of the entire dataset is fetched into memory at any one time. Ibmdbpy-spatial accelerates Python analytics by seamlessly pushing operations written in Python into the underlying database for execution using the dashDB spatial extender, thereby benefiting from in-database performance-enhancing features such as columnar storage and parallel processing. The package currently supports Python versions 2.7 through 3.4.
The basic architecture of the package consists of three main components: 1) a connection to dashDB represented by the instance IdaDataBase, which uses a middleware API (pypyodbc or jaydebeapi) to establish the database connection via ODBC or JDBC, respectively; 2) an instance representing the spatial data stored in the database as a dataframe in Python, called the IdaGeoDataFrame, with a specific geometry attribute that recognizes a planar geometry column in dashDB; and 3) Python wrappers for spatial functions such as within, distance, area, buffer and more, which dashDB currently supports, to make the querying process from Python much simpler for users. The spatial functions translate well-known geopandas-like syntax into SQL queries, utilizing the database connection to perform spatial operations in-database, and can operate on single geometries as well as on two geometries from different IdaGeoDataFrames. The in-database queries strictly follow the OpenGIS Implementation Specification for Geographic information - Simple feature access for SQL. The results of the operations can be accessed dynamically via interactive Jupyter notebooks from any system that supports Python, without additional dependencies, and can also be combined with other open-source libraries such as matplotlib and folium within Jupyter notebooks for visualization. We built a use case analysing crime hotspots in New York City to validate our implementation and visualized the results as a choropleth map for each borough.
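As an illustration of the translation step described above, the sketch below builds the kind of OGC-compliant SQL that a geopandas-like method call might generate. The function, table and column names here are hypothetical, not the actual ibmdbpy-spatial API.

```python
# Map geopandas-like method names to OGC ST_* spatial functions,
# as in the Simple Feature Access for SQL specification.
OGC_FUNCTIONS = {"within": "ST_Within", "distance": "ST_Distance",
                 "area": "ST_Area", "buffer": "ST_Buffer"}

def spatial_sql(table, geom_col, op, *args):
    """Translate a geopandas-like spatial call into a SQL expression
    that the database evaluates in-database (illustrative only)."""
    fn = OGC_FUNCTIONS[op]
    params = ", ".join([geom_col] + [str(a) for a in args])
    return f"SELECT {fn}({params}) FROM {table}"

# e.g. a hypothetical df.buffer(2.0) on table NYC_CRIMES, column GEOMETRY:
print(spatial_sql("NYC_CRIMES", "GEOMETRY", "buffer", 2.0))
```

The point of the design is that only this generated SQL crosses the network; the geometries themselves stay in the warehouse.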
EPA's ToxCast Program: From Research to Application
A New Paradigm for Toxicity Testing in the 21st Century. In FY 2009, EPA published the toxicity reference database ToxRefDB, which contains results of over 30 years and $2B worth of animal studies for over 400 chemicals. This database is available on EPA’s website, and increases...
EuPathDB: the eukaryotic pathogen genomics database resource
Aurrecoechea, Cristina; Barreto, Ana; Basenko, Evelina Y.; Brestelli, John; Brunk, Brian P.; Cade, Shon; Crouch, Kathryn; Doherty, Ryan; Falke, Dave; Fischer, Steve; Gajria, Bindu; Harb, Omar S.; Heiges, Mark; Hertz-Fowler, Christiane; Hu, Sufen; Iodice, John; Kissinger, Jessica C.; Lawrence, Cris; Li, Wei; Pinney, Deborah F.; Pulman, Jane A.; Roos, David S.; Shanmugasundram, Achchuthan; Silva-Franco, Fatima; Steinbiss, Sascha; Stoeckert, Christian J.; Spruill, Drew; Wang, Haiming; Warrenfeltz, Susanne; Zheng, Jie
2017-01-01
The Eukaryotic Pathogen Genomics Database Resource (EuPathDB, http://eupathdb.org) is a collection of databases covering 170+ eukaryotic pathogens (protists & fungi), along with relevant free-living and non-pathogenic species, and select pathogen hosts. To facilitate the discovery of meaningful biological relationships, the databases couple preconfigured searches with visualization and analysis tools for comprehensive data mining via intuitive graphical interfaces and APIs. All data are analyzed with the same workflows, including creation of gene orthology profiles, so data are easily compared across data sets, data types and organisms. EuPathDB is updated with numerous new analysis tools, features, data sets and data types. New tools include GO, metabolic pathway and word enrichment analyses plus an online workspace for analysis of personal, non-public, large-scale data. Expanded data content is mostly genomic and functional genomic data while new data types include protein microarray, metabolic pathways, compounds, quantitative proteomics, copy number variation, and polysomal transcriptomics. New features include consistent categorization of searches, data sets and genome browser tracks; redesigned gene pages; effective integration of alternative transcripts; and a EuPathDB Galaxy instance for private analyses of a user's data. Forthcoming upgrades include user workspaces for private integration of data with existing EuPathDB data and improved integration and presentation of host–pathogen interactions. PMID:27903906
MannDB: A microbial annotation database for protein characterization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhou, C; Lam, M; Smith, J
2006-05-19
MannDB was created to meet a need for rapid, comprehensive automated protein sequence analyses to support selection of proteins suitable as targets for driving the development of reagents for pathogen or protein toxin detection. Because a large number of open-source tools were needed, it was necessary to produce a software system to scale the computations for whole-proteome analysis. Thus, we built a fully automated system for executing software tools and for storage, integration, and display of automated protein sequence analysis and annotation data. MannDB is a relational database that organizes data resulting from fully automated, high-throughput protein-sequence analyses using open-source tools. Types of analyses provided include predictions of cleavage, chemical properties, classification, features, functional assignment, post-translational modifications, motifs, antigenicity, and secondary structure. Proteomes (lists of hypothetical and known proteins) are downloaded and parsed from Genbank and then inserted into MannDB, and annotations from SwissProt are downloaded when identifiers are found in the Genbank entry or when identical sequences are identified. Currently 36 open-source tools are run against MannDB protein sequences either on local systems or by means of batch submission to external servers. In addition, BLAST against protein entries in MvirDB, our database of microbial virulence factors, is performed. A web client browser enables viewing of computational results and downloaded annotations, and a query tool enables structured and free-text search capabilities. When available, links to external databases, including MvirDB, are provided. MannDB contains whole-proteome analyses for at least one representative organism from each category of biological threat organism listed by APHIS, CDC, HHS, NIAID, USDA, USFDA, and WHO. 
MannDB comprises a large number of genomes and comprehensive protein sequence analyses representing organisms listed as high-priority agents on the websites of several governmental organizations concerned with bio-terrorism. MannDB provides the user with a BLAST interface for comparison of native and non-native sequences and a query tool for conveniently selecting proteins of interest. In addition, the user has access to a web-based browser that compiles comprehensive and extensive reports.
EPA's Toxicity Reference Databases (ToxRefDB) was developed by the National Center for Computational Toxicology in partnership with EPA's Office of Pesticide Programs, to store data derived from in vivo animal toxicity studies [www.epa.gov/ncct/toxrefdb/]. The initial build of To...
Validating Variational Bayes Linear Regression Method With Multi-Central Datasets.
Murata, Hiroshi; Zangwill, Linda M; Fujino, Yuri; Matsuura, Masato; Miki, Atsuya; Hirasawa, Kazunori; Tanito, Masaki; Mizoue, Shiro; Mori, Kazuhiko; Suzuki, Katsuyoshi; Yamashita, Takehiro; Kashiwagi, Kenji; Shoji, Nobuyuki; Asaoka, Ryo
2018-04-01
To validate the prediction accuracy of variational Bayes linear regression (VBLR) with two datasets external to the training dataset. The training dataset consisted of 7268 eyes of 4278 subjects from the University of Tokyo Hospital. The Japanese Archive of Multicentral Databases in Glaucoma (JAMDIG) dataset consisted of 271 eyes of 177 patients, and the Diagnostic Innovations in Glaucoma Study (DIGS) dataset includes 248 eyes of 173 patients; both were used for validation. Prediction accuracy was compared between VBLR and ordinary least-squares linear regression (OLSLR). First, OLSLR and VBLR were carried out using total deviation (TD) values at each of the 52 test points from the second to fourth visual fields (VFs) (VF2-4) up to the 2nd to 10th VFs (VF2-10) of each patient in the JAMDIG and DIGS datasets, and the TD values of the 11th VF test were predicted each time. The predictive accuracy of each method was compared through the root mean squared error (RMSE) statistic. OLSLR RMSEs with the JAMDIG and DIGS datasets were between 31 and 4.3 dB, and between 19.5 and 3.9 dB, respectively. On the other hand, VBLR RMSEs with the JAMDIG and DIGS datasets were between 5.0 and 3.7 dB, and between 4.6 and 3.6 dB. There was a statistically significant difference between VBLR and OLSLR for both datasets at every series (VF2-4 to VF2-10) (P < 0.01 for all tests). However, there was no statistically significant difference in VBLR RMSEs between the JAMDIG and DIGS datasets at any series of VFs (VF2-4 to VF2-10) (P > 0.05). VBLR outperformed OLSLR in predicting future VF progression, and VBLR has potential as a helpful tool in clinical settings.
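For readers unfamiliar with the baseline, the OLSLR extrapolation and the RMSE statistic used above can be sketched as follows. The numbers are toy values, not study data, and the VBLR method itself is not reproduced here.

```python
import numpy as np

def olslr_predict(times, td_values, t_future):
    """Ordinary least-squares linear regression of total-deviation (TD)
    values against visit time, extrapolated to a future visit."""
    slope, intercept = np.polyfit(times, td_values, 1)
    return slope * t_future + intercept

def rmse(predicted, observed):
    """Root mean squared error between predicted and observed values."""
    predicted, observed = np.asarray(predicted), np.asarray(observed)
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))

# Toy example: three early visits at one test point, predicted at 2 years.
times = [0.0, 0.5, 1.0]      # years
td = [-2.0, -2.5, -3.0]      # dB
pred = olslr_predict(times, td, 2.0)
```

With few early visits, this per-point extrapolation is noisy, which is why a Bayesian method that pools information can achieve the much lower RMSEs reported.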
Kumari, Sangita; Pundhir, Sachin; Priya, Piyush; Jeena, Ganga; Punetha, Ankita; Chawla, Konika; Firdos Jafaree, Zohra; Mondal, Subhasish; Yadav, Gitanjali
2014-01-01
Plant essential oils are complex mixtures of volatile organic compounds, which play indispensable roles in the environment, for the plant itself, as well as for humans. The potential biological information stored in essential oil composition data can provide insight into the silent language of plants and the roles of these chemical emissions in defense, communication and pollinator attraction. In order to decipher volatile profile patterns from a global perspective, we have developed the ESSential OIL DataBase (EssOilDB), a continually updated, freely available electronic database designed to provide a knowledge resource for plant essential oils that enables one to address a multitude of queries on volatile profiles of native, invasive, normal or stressed plants, across taxonomic clades, geographical locations and several other biotic and abiotic influences. To our knowledge, EssOilDB is the only database in the public domain providing an opportunity for context-based scientific research on volatile patterns in plants. EssOilDB presently contains 123 041 essential oil records spanning a century of published reports on volatile profiles, with data from 92 plant taxonomic families spread across diverse geographical locations all over the globe. We hope that this huge repository of VOCs will facilitate unraveling the true significance of volatiles in plants, along with creating potential avenues for industrial applications of essential oils. We also illustrate the use of this database in terpene biology and show how EssOilDB can be used to complement data from computational genomics to gain insights into the diversity and variability of terpenoids in the plant kingdom. 
EssOilDB would serve as a valuable information resource, for students and researchers in plant biology, in the design and discovery of new odor profiles, as well as for entrepreneurs—the potential for generating consumer specific scents being one of the most attractive and interesting topics in the cosmetic industry. Database URL: http://nipgr.res.in/Essoildb/ PMID:25534749
Development, deployment and operations of ATLAS databases
NASA Astrophysics Data System (ADS)
Vaniachine, A. V.; Schmitt, J. G. v. d.
2008-07-01
In preparation for ATLAS data taking, a coordinated shift from development towards operations has occurred in ATLAS database activities. In addition to development and commissioning activities in databases, ATLAS is active in the development and deployment (in collaboration with the WLCG 3D project) of the tools that allow the worldwide distribution and installation of databases and related datasets, as well as the actual operation of this system on ATLAS multi-grid infrastructure. We describe development and commissioning of major ATLAS database applications for online and offline. We present the first scalability test results and ramp-up schedule over the initial LHC years of operations towards the nominal year of ATLAS running, when the database storage volumes are expected to reach 6.1 TB for the Tag DB and 1.0 TB for the Conditions DB. ATLAS database applications require robust operational infrastructure for data replication between online and offline at Tier-0, and for the distribution of the offline data to Tier-1 and Tier-2 computing centers. We describe ATLAS experience with Oracle Streams and other technologies for coordinated replication of databases in the framework of the WLCG 3D services.
Co-PylotDB - A Python-Based Single-Window User Interface for Transmitting Information to a Database
DOE Office of Scientific and Technical Information (OSTI.GOV)
Barnette, Daniel W.
2012-01-05
Co-PylotDB, written completely in Python, provides a user interface (UI) with which to select user and data file(s), directories, and file content, and provide or capture various other information for sending data collected from running any computer program to a pre-formatted database table for persistent storage. The interface allows the user to select input, output, make, source, executable, and qsub files. It also provides fields for specifying the machine name on which the software was run, capturing compile and execution lines, and listing relevant user comments. Data automatically captured by Co-PylotDB and sent to the database are user, current directory, local hostname, current date, and time of send. The UI provides fields for logging into a local or remote database server, specifying a database and a table, and sending the information to the selected database table. If a server is not available, the UI provides for saving the command that would have saved the information to a database table for either later submission or for sending via email to a collaborator who has access to the desired database.
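A minimal sketch of the automatic metadata capture and database insert described above, using SQLite in place of a local or remote server; the field names and table schema are illustrative, not Co-PylotDB's actual schema.

```python
import os, socket, sqlite3
from datetime import datetime, timezone

def capture_metadata():
    """Gather the fields the text says are captured automatically:
    user, current directory, local hostname, and send time."""
    return {"user": os.environ.get("USER", "unknown"),
            "directory": os.getcwd(),
            "hostname": socket.gethostname(),
            "sent_at": datetime.now(timezone.utc).isoformat()}

# SQLite stands in for the database server here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (user TEXT, directory TEXT, "
             "hostname TEXT, sent_at TEXT, comments TEXT)")
row = {**capture_metadata(), "comments": "relevant user comments"}
conn.execute("INSERT INTO runs VALUES (:user, :directory, :hostname, "
             ":sent_at, :comments)", row)
```

Parameterized inserts like the one above are also what makes the "save the command for later submission" fallback straightforward: the statement and its values can be serialized and replayed when a server becomes available.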
Database resources of the National Center for Biotechnology Information
2015-01-01
The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (Bookshelf, PubMed Central (PMC) and PubReader); medical genetics (ClinVar, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen); genes and genomics (BioProject, BioSample, dbSNP, dbVar, Epigenomics, Gene, Gene Expression Omnibus (GEO), Genome, HomoloGene, the Map Viewer, Nucleotide, PopSet, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser, Trace Archive and UniGene); and proteins and chemicals (Biosystems, COBALT, the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB), Protein Clusters, Protein and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for many of these databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov. PMID:25398906
BtoxDB: a comprehensive database of protein structural data on toxin-antitoxin systems.
Barbosa, Luiz Carlos Bertucci; Garrido, Saulo Santesso; Marchetto, Reinaldo
2015-03-01
Toxin-antitoxin (TA) systems are diverse and abundant genetic modules in prokaryotic cells that are typically formed by two genes encoding a stable toxin and a labile antitoxin. Because TA systems are able to repress growth or kill cells and are considered to be important actors in cell persistence (multidrug resistance without genetic change), these modules are considered potential targets for alternative drug design. In this scenario, structural information for the proteins in these systems is highly valuable. In this report, we describe the development of a web-based system, named BtoxDB, that stores all protein structural data on TA systems. The BtoxDB database was implemented as a MySQL relational database using PHP scripting language. Web interfaces were developed using HTML, CSS and JavaScript. The data were collected from the PDB, UniProt and Entrez databases. These data were appropriately filtered using specialized literature and our previous knowledge about toxin-antitoxin systems. The database provides three modules ("Search", "Browse" and "Statistics") that enable searches, acquisition of contents and access to statistical data. Direct links to matching external databases are also available. The compilation of all protein structural data on TA systems in one platform is highly useful for researchers interested in this content. BtoxDB is publicly available at http://www.gurupi.uft.edu.br/btoxdb. Copyright © 2015 Elsevier Ltd. All rights reserved.
TryTransDB: A web-based resource for transport proteins in Trypanosomatidae.
Sonar, Krushna; Kabra, Ritika; Singh, Shailza
2018-03-12
TryTransDB is a web-based resource that stores transport protein data, which can be retrieved using a standalone BLAST tool. We have attempted to create an integrated database that can be a one-stop shop for researchers working with transport proteins of the Trypanosomatidae family. TryTransDB (Trypanosomatidae Transport Protein Database) is a comprehensive web-based resource that can run a BLAST search against most of the transport protein sequences (protein and nucleotide) from organisms of the Trypanosomatidae family. This web resource also allows users to compute a phylogenetic tree by performing multiple sequence alignment (MSA) using the embedded CLUSTALW suite. Cross-links to other databases help gather additional information about a given transport protein from a single website.
Mikaelyan, Aram; Köhler, Tim; Lampert, Niclas; Rohland, Jeffrey; Boga, Hamadi; Meuser, Katja; Brune, Andreas
2015-10-01
Recent developments in sequencing technology have given rise to a large number of studies that assess bacterial diversity and community structure in termite and cockroach guts based on large amplicon libraries of 16S rRNA genes. Although these studies have revealed important ecological and evolutionary patterns in the gut microbiota, classification of the short sequence reads is limited by the taxonomic depth and resolution of the reference databases used in the respective studies. Here, we present a curated reference database for accurate taxonomic analysis of the bacterial gut microbiota of dictyopteran insects. The Dictyopteran gut microbiota reference Database (DictDb) is based on the Silva database but was significantly expanded by the addition of clones from 11 mostly unexplored termite and cockroach groups, which increased the inventory of bacterial sequences from dictyopteran guts by 26%. The taxonomic depth and resolution of DictDb was significantly improved by a general revision of the taxonomic guide tree for all important lineages, including a detailed phylogenetic analysis of the Treponema and Alistipes complexes, the Fibrobacteres, and the TG3 phylum. The performance of this first documented version of DictDb (v. 3.0) using the revised taxonomic guide tree in the classification of short-read libraries obtained from termites and cockroaches was highly superior to that of the current Silva and RDP databases. DictDb uses an informative nomenclature that is consistent with the literature also for clades of uncultured bacteria and provides an invaluable tool for anyone exploring the gut community structure of termites and cockroaches. Copyright © 2015 Elsevier GmbH. All rights reserved.
dbAMEPNI: a database of alanine mutagenic effects for protein–nucleic acid interactions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Ling; Xiong, Yi; Gao, Hongyun
Protein–nucleic acid interactions play essential roles in various biological activities such as gene regulation, transcription, DNA repair and DNA packaging. Understanding the effects of amino acid substitutions on protein–nucleic acid binding affinities can help elucidate the molecular mechanism of protein–nucleic acid recognition. Until now, no comprehensive and updated database of quantitative binding data on alanine mutagenic effects for protein–nucleic acid interactions has been publicly accessible. Thus, we developed a new database of Alanine Mutagenic Effects for Protein-Nucleic Acid Interactions (dbAMEPNI). dbAMEPNI is a manually curated, literature-derived database comprising over 577 alanine mutagenesis records with experimentally determined binding affinities for protein–nucleic acid complexes. It contains several important parameters, such as the dissociation constant (Kd), Gibbs free energy change (ΔΔG), experimental conditions and structural parameters of mutant residues. In addition, the database provides an extended dataset of 282 single alanine mutations with only qualitative (descriptive) thermodynamic information.
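As background for the Kd and ΔΔG parameters stored in the database, the standard conversion from dissociation constants to a binding free-energy change can be sketched as follows; this is the textbook thermodynamic relation, not code from dbAMEPNI.

```python
import math

R = 1.987e-3  # gas constant in kcal/(mol*K)

def ddG_from_kd(kd_wt, kd_mut, temp_k=298.15):
    """Change in binding free energy upon mutation, from dissociation
    constants: ddG = RT * ln(Kd_mut / Kd_wt). A positive value means
    the alanine substitution weakens binding."""
    return R * temp_k * math.log(kd_mut / kd_wt)

# A 10-fold loss of affinity costs about 1.36 kcal/mol at 25 C:
print(round(ddG_from_kd(1e-9, 1e-8), 2))
```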
Constructing a Graph Database for Semantic Literature-Based Discovery.
Hristovski, Dimitar; Kastrin, Andrej; Dinevski, Dejan; Rindflesch, Thomas C
2015-01-01
Literature-based discovery (LBD) generates discoveries, or hypotheses, by combining what is already known in the literature. Potential discoveries have the form of relations between biomedical concepts; for example, a drug may be determined to treat a disease other than the one for which it was intended. LBD views the knowledge in a domain as a network; a set of concepts along with the relations between them. As a starting point, we used SemMedDB, a database of semantic relations between biomedical concepts extracted with SemRep from Medline. SemMedDB is distributed as a MySQL relational database, which has some problems when dealing with network data. We transformed and uploaded SemMedDB into the Neo4j graph database, and implemented the basic LBD discovery algorithms with the Cypher query language. We conclude that storing the data needed for semantic LBD is more natural in a graph database. Also, implementing LBD discovery algorithms is conceptually simpler with a graph query language when compared with standard SQL.
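The open-discovery ("ABC") pattern described above can be sketched over a tiny in-memory relation set, using Swanson's classic fish-oil/Raynaud example; in the Neo4j implementation, the same walk is a two-hop Cypher MATCH over SemMedDB predications.

```python
def abc_discoveries(relations, start):
    """Open LBD discovery: given known (subject, object) relations,
    propose concepts C such that start->B and B->C are known in the
    literature but the direct relation start->C is not."""
    known = set(relations)
    direct = {o for s, o in known if s == start}
    hypotheses = set()
    for b in direct:
        for s, c in known:
            if s == b and c != start and c not in direct:
                hypotheses.add(c)
    return hypotheses

rels = [("fish_oil", "blood_viscosity"),
        ("blood_viscosity", "raynaud_disease"),
        ("fish_oil", "platelet_aggregation")]
print(abc_discoveries(rels, "fish_oil"))  # {'raynaud_disease'}
```

The nested loop here is exactly the join that is awkward to express over a large relational table but natural as a path traversal in a graph query language.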
GenderMedDB: an interactive database of sex and gender-specific medical literature.
Oertelt-Prigione, Sabine; Gohlke, Björn-Oliver; Dunkel, Mathias; Preissner, Robert; Regitz-Zagrosek, Vera
2014-01-01
Searches for sex and gender-specific publications are complicated by the absence of a specific algorithm within search engines and by the lack of adequate archives to collect the retrieved results. We previously addressed this issue by initiating the first systematic archive of medical literature containing sex and/or gender-specific analyses. This initial collection has now been greatly enlarged and re-organized as a free, user-friendly database with multiple functions: GenderMedDB (http://gendermeddb.charite.de). GenderMedDB retrieves the included publications from the PubMed database. Manuscripts containing sex and/or gender-specific analysis are continuously screened and the relevant findings organized systematically into disciplines and diseases. Publications are furthermore classified by research type, subject and participant numbers. More than 11,000 abstracts are currently included in the database, after screening more than 40,000 publications. The main functions of the database include searches by publication data or content analysis based on pre-defined classifications. In addition, registered users can upload relevant publications, access descriptive publication statistics and interact in an open user forum. Overall, GenderMedDB offers the advantages of a discipline-specific search engine as well as the functions of a participative tool for the gender medicine community.
The Halophile protein database.
Sharma, Naveen; Farooqi, Mohammad Samir; Chaturvedi, Krishna Kumar; Lal, Shashi Bhushan; Grover, Monendra; Rai, Anil; Pandey, Pankaj
2014-01-01
Halophilic archaea/bacteria adapt to different salt concentrations: extreme, moderate and low. These types of adaptation may occur as a result of modification of protein structure and other changes in different cell organelles. Thus proteins may play an important role in the adaptation of halophilic archaea/bacteria to saline conditions. The Halophile protein database (HProtDB) is a systematic attempt to document the biochemical and biophysical properties of proteins from halophilic archaea/bacteria which may be involved in the adaptation of these organisms to saline conditions. In this database, various physicochemical properties such as molecular weight, theoretical pI, amino acid composition, atomic composition, estimated half-life, instability index, aliphatic index and grand average of hydropathicity (GRAVY) have been listed. These physicochemical properties play an important role in identifying protein structure, bonding pattern and the function of specific proteins. This database is a comprehensive, manually curated, non-redundant catalogue of proteins. It currently contains properties of 59 897 proteins extracted from 21 different strains of halophilic archaea/bacteria. Database URL: http://webapp.cabgrid.res.in/protein/ © The Author(s) 2014. Published by Oxford University Press.
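One of the listed properties, GRAVY, is straightforward to compute: it is the mean Kyte-Doolittle hydropathy over all residues of the sequence. A minimal sketch using the standard scale values (this is the common convention, not HProtDB's own code):

```python
# Kyte-Doolittle hydropathy scale (standard published values).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def gravy(protein_seq):
    """Grand average of hydropathicity: the sum of per-residue
    hydropathy values divided by the sequence length. Positive values
    indicate an overall hydrophobic protein."""
    return sum(KYTE_DOOLITTLE[aa] for aa in protein_seq.upper()) / len(protein_seq)
```

Halophilic proteins tend to be enriched in acidic residues (D, E), which carry strongly negative hydropathy values and thus pull GRAVY downward.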
The MAR databases: development and implementation of databases specific for marine metagenomics
Klemetsen, Terje; Raknes, Inge A; Fu, Juan; Agafonov, Alexander; Balasundaram, Sudhagar V; Tartari, Giacomo; Robertsen, Espen
2018-01-01
We introduce the marine databases MarRef, MarDB and MarCat (https://mmp.sfb.uit.no/databases/), which are publicly available resources that promote marine research and innovation. These data resources, which have been implemented in the Marine Metagenomics Portal (MMP) (https://mmp.sfb.uit.no/), are collections of richly annotated and manually curated contextual (metadata) and sequence databases representing three tiers of accuracy. While MarRef is a database of completely sequenced marine prokaryotic genomes, representing a marine prokaryote reference genome database, MarDB includes all incompletely sequenced prokaryotic genomes regardless of level of completeness. The last database, MarCat, represents a gene (protein) catalog of uncultivable (and cultivable) marine genes and proteins derived from marine metagenomics samples. The first versions of MarRef and MarDB contain 612 and 3726 records, respectively. Each record is built up of 106 metadata fields, including attributes for sampling, sequencing, assembly and annotation in addition to organism and taxonomic information. Currently, MarCat contains 1227 records with 55 metadata fields. Ontologies and controlled vocabularies are used in the contextual databases to enhance consistency. The user-friendly web interface lets visitors browse, filter and search the contextual databases and perform BLAST searches against the corresponding sequence databases. All contextual and sequence databases are freely accessible and downloadable from https://s1.sfb.uit.no/public/mar/. PMID:29106641
Ameur, Adam; Bunikis, Ignas; Enroth, Stefan; Gyllensten, Ulf
2014-01-01
CanvasDB is an infrastructure for management and analysis of genetic variants from massively parallel sequencing (MPS) projects. The system stores SNP and indel calls in a local database, designed to handle very large datasets, to allow for rapid analysis using simple commands in R. Functional annotations are included in the system, making it suitable for direct identification of disease-causing mutations in human exome- (WES) or whole-genome sequencing (WGS) projects. The system has a built-in filtering function implemented to simultaneously take into account variant calls from all individual samples. This enables advanced comparative analysis of variant distribution between groups of samples, including detection of candidate causative mutations within family structures and genome-wide association by sequencing. In most cases, these analyses are executed within just a matter of seconds, even when there are several hundred samples and millions of variants in the database. We demonstrate the scalability of canvasDB by importing the individual variant calls from all 1092 individuals present in the 1000 Genomes Project into the system, over 4.4 billion SNPs and indels in total. Our results show that canvasDB makes it possible to perform advanced analyses of large-scale WGS projects on a local server. Database URL: https://github.com/UppsalaGenomeCenter/CanvasDB PMID:25281234
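The comparative filtering described above (variants shared by all affected samples and absent from all controls) can be sketched with plain Python sets. This is a minimal illustration of the idea, not canvasDB's actual implementation (which uses R against a local database); the function name, sample labels and variant strings are invented:

```python
def filter_variants(calls, cases, controls):
    """Return variants present in every case sample and absent from all controls."""
    in_all_cases = set.intersection(*(calls[s] for s in cases))
    in_any_control = set.union(*(calls[s] for s in controls)) if controls else set()
    return in_all_cases - in_any_control

# Hypothetical per-sample variant calls (e.g. "chromosome:position ref>alt")
calls = {
    "s1": {"chr1:123A>G", "chr2:55C>T"},
    "s2": {"chr1:123A>G"},
    "s3": {"chr2:55C>T"},
}

candidates = filter_variants(calls, cases=["s1", "s2"], controls=["s3"])
# candidates == {"chr1:123A>G"}
```

In a real system the per-sample sets would be database queries rather than in-memory sets, but the set-algebra structure of the comparison is the same.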
Database resources of the National Center for Biotechnology Information.
Sayers, Eric W; Barrett, Tanya; Benson, Dennis A; Bolton, Evan; Bryant, Stephen H; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M; Dicuccio, Michael; Federhen, Scott; Feolo, Michael; Fingerman, Ian M; Geer, Lewis Y; Helmberg, Wolfgang; Kapustin, Yuri; Krasnov, Sergey; Landsman, David; Lipman, David J; Lu, Zhiyong; Madden, Thomas L; Madej, Tom; Maglott, Donna R; Marchler-Bauer, Aron; Miller, Vadim; Karsch-Mizrachi, Ilene; Ostell, James; Panchenko, Anna; Phan, Lon; Pruitt, Kim D; Schuler, Gregory D; Sequeira, Edwin; Sherry, Stephen T; Shumway, Martin; Sirotkin, Karl; Slotta, Douglas; Souvorov, Alexandre; Starchenko, Grigory; Tatusova, Tatiana A; Wagner, Lukas; Wang, Yanli; Wilbur, W John; Yaschenko, Eugene; Ye, Jian
2012-01-01
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Website. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Probe, Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
A spatial-temporal system for dynamic cadastral management.
Nan, Liu; Renyi, Liu; Guangliang, Zhu; Jiong, Xie
2006-03-01
A practical spatio-temporal database (STDB) technique for dynamic urban land management is presented. One of the STDB models, the expanded Base State with Amendments (BSA) model, is selected as the basis for developing the dynamic cadastral management technique. Two approaches, Section Fast Indexing (SFI) and Storage Factors of Variable Granularity (SFVG), are used to improve the efficiency of the BSA model. Both spatial graphic data and attribute data are stored, through a succinct engine, in standard relational database management systems (RDBMS) for the actual implementation of the BSA model. The spatio-temporal database is divided into three interdependent sub-databases: the present DB, the history DB and the procedures-tracing DB. The efficiency of database operation is improved by the database connection in the bottom layer of Microsoft SQL Server. The spatio-temporal system can be provided at low cost while satisfying the basic needs of urban land management in China. The approaches presented in this paper may also be of significance to countries where land patterns change frequently or to agencies where financial resources are limited.
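The Base State with Amendments idea described above can be illustrated with a minimal sketch: one base snapshot plus time-stamped amendments, with the state at any time reconstructed by replaying amendments up to that time. The parcel names and attribute values here are invented for illustration and do not reflect the paper's actual schema:

```python
# Base snapshot of cadastral attributes (hypothetical example data).
base = {"parcel_1": "residential", "parcel_2": "farmland"}

# Time-stamped amendments: (year, parcel, new_value).
amendments = [
    (2001, "parcel_2", "commercial"),
    (2004, "parcel_1", "park"),
]

def state_at(t):
    """Reconstruct the cadastral state at time t: base plus amendments up to t."""
    state = dict(base)
    for ts, parcel, value in amendments:
        if ts <= t:
            state[parcel] = value
    return state

# state_at(2002) == {"parcel_1": "residential", "parcel_2": "commercial"}
```

Indexing schemes such as SFI exist precisely because this replay gets expensive as the amendment log grows; they shorten the scan needed to answer a query at a given time.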
Database resources of the National Center for Biotechnology Information
Acland, Abigail; Agarwala, Richa; Barrett, Tanya; Beck, Jeff; Benson, Dennis A.; Bollin, Colleen; Bolton, Evan; Bryant, Stephen H.; Canese, Kathi; Church, Deanna M.; Clark, Karen; DiCuccio, Michael; Dondoshansky, Ilya; Federhen, Scott; Feolo, Michael; Geer, Lewis Y.; Gorelenkov, Viatcheslav; Hoeppner, Marilu; Johnson, Mark; Kelly, Christopher; Khotomlianski, Viatcheslav; Kimchi, Avi; Kimelman, Michael; Kitts, Paul; Krasnov, Sergey; Kuznetsov, Anatoliy; Landsman, David; Lipman, David J.; Lu, Zhiyong; Madden, Thomas L.; Madej, Tom; Maglott, Donna R.; Marchler-Bauer, Aron; Karsch-Mizrachi, Ilene; Murphy, Terence; Ostell, James; O'Sullivan, Christopher; Panchenko, Anna; Phan, Lon; Preuss, Don; Pruitt, Kim D.; Rubinstein, Wendy; Sayers, Eric W.; Schneider, Valerie; Schuler, Gregory D.; Sequeira, Edwin; Sherry, Stephen T.; Shumway, Martin; Sirotkin, Karl; Siyan, Karanjit; Slotta, Douglas; Soboleva, Alexandra; Soussov, Vladimir; Starchenko, Grigory; Tatusova, Tatiana A.; Trawick, Bart W.; Vakatov, Denis; Wang, Yanli; Ward, Minghong; Wilbur, W. John; Yaschenko, Eugene; Zbicz, Kerry
2014-01-01
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, PubReader, Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link, Primer-BLAST, COBALT, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, the Genetic Testing Registry, Genome and related tools, the Map Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, ClinVar, MedGen, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Probe, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool, Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All these resources can be accessed through the NCBI home page. PMID:24259429
samiDB: A Prototype Data Archive for Big Science Exploration
NASA Astrophysics Data System (ADS)
Konstantopoulos, I. S.; Green, A. W.; Cortese, L.; Foster, C.; Scott, N.
2015-04-01
samiDB is an archive, database, and query engine to serve the spectra, spectral hypercubes, and high-level science products that make up the SAMI Galaxy Survey. Based on the versatile Hierarchical Data Format (HDF5), samiDB does not depend on relational database structures and hence lightens the setup and maintenance load imposed on science teams by metadata tables. The code, written in Python, covers the ingestion, querying, and exporting of data as well as the automatic setup of an HTML schema browser. samiDB serves as a maintenance-light data archive for Big Science and can be adopted and adapted by science teams that lack the means to hire professional archivists to set up the data back end for their projects.
GreenPhylDB v2.0: comparative and functional genomics in plants.
Rouard, Mathieu; Guignon, Valentin; Aluome, Christelle; Laporte, Marie-Angélique; Droc, Gaëtan; Walde, Christian; Zmasek, Christian M; Périn, Christophe; Conte, Matthieu G
2011-01-01
GreenPhylDB is a database designed for comparative and functional genomics based on complete genomes. Version 2 now contains sixteen full genomes of members of the Plantae kingdom, ranging from algae to angiosperms, automatically clustered into gene families. Gene families are manually annotated and then analyzed phylogenetically in order to elucidate orthologous and paralogous relationships. The database offers various lists of gene families, including plant-, phylum- and species-specific gene families. For each gene cluster or gene family, easy access to gene composition, protein domains, publications, external links and orthologous gene predictions is provided. Web interfaces have been further developed to improve navigation through information related to gene families. New analysis tools are also available, such as a gene family ontology browser that facilitates exploration. GreenPhylDB is a component of the South Green Bioinformatics Platform (http://southgreen.cirad.fr/) and is accessible at http://greenphyl.cirad.fr. It enables comparative genomics in a broad taxonomic context to enhance the understanding of evolutionary processes and thus helps to speed up gene discovery.
NCX-DB: a unified resource for integrative analysis of the sodium calcium exchanger super-family.
Bode, Katrin; O'Halloran, Damien M
2018-04-13
Na+/Ca2+ exchangers are low-affinity, high-capacity transporters that mediate Ca2+ extrusion by coupling Ca2+ efflux to the influx of Na+ ions. The Na+/Ca2+ exchangers form a super-family comprising three branches, each differing in ion-substrate selectivity: Na+/Ca2+ exchangers (NCX), Na+/Ca2+/K+ exchangers, and Ca2+/cation exchangers. Their primary function is to maintain Ca2+ homeostasis, and they play a particularly important role in excitable cells that experience transient Ca2+ fluxes. Research into the role and activity of Na+/Ca2+ exchangers has focused extensively on the cardiovascular system; however, growing evidence suggests that Na+/Ca2+ exchangers play a key role in neuronal processes such as memory formation, learning, oligodendrocyte differentiation, neuroprotection during brain ischemia and axon guidance. They have also been implicated in pathologies such as Alzheimer's disease, Parkinson's disease, multiple sclerosis and epilepsy; however, a clear understanding of their mechanism during disease is lacking. To date, there has never been a central resource or database for Na+/Ca2+ exchangers. With clear disease relevance and ever-increasing research on Na+/Ca2+ exchangers from both model and non-model species, a database that unifies the data on Na+/Ca2+ exchangers is needed for future research. NCX-DB is a publicly available database with a web interface that enables users to explore various Na+/Ca2+ exchangers, perform cross-species sequence comparison, identify new exchangers, and stay up to date with recent literature. NCX-DB is available on the web via an interactive user interface with an intuitive design, which is applicable for the identification and comparison of Na+/Ca2+ exchanger proteins across diverse species.
Single-Shot Readout of a Superconducting Qubit using a Josephson Parametric Oscillator
2016-01-11
Krantz, Philip; Bengtsson, Andreas; Simoen, Michaël; Gustavsson, Simon; Shumeiko, Vitaly; Oliver, W. D.; Wilson, C. M.; Delsing, Per; Bylander, Jonas
Martin, Stanton L; Blackmon, Barbara P; Rajagopalan, Ravi; Houfek, Thomas D; Sceeles, Robert G; Denn, Sheila O; Mitchell, Thomas K; Brown, Douglas E; Wing, Rod A; Dean, Ralph A
2002-01-01
We have created a federated database for genome studies of Magnaporthe grisea, the causal agent of rice blast disease, by integrating end-sequence data from BAC clones, genetic marker data and BAC contig assembly data. A library of 9216 BAC clones providing >25-fold coverage of the entire genome was end-sequenced and fingerprinted by HindIII digestion. The Image/FPC software package was then used to generate an assembly of 188 contigs covering >95% of the genome. The database contains the results of this assembly integrated with hybridization data of genetic markers to the BAC library. AceDB was used for the core database engine, and a MySQL relational database, populated with numerical representations of BAC clones within FPC contigs, was used to create appropriately scaled images. The database is being used to facilitate sequencing efforts. It also gives researchers mapping known genes or other sequences of interest rapid and easy access to the fundamental organization of the M. grisea genome. This database, MagnaportheDB, can be accessed on the web at http://www.cals.ncsu.edu/fungal_genomics/mgdatabase/int.htm.
Games, Patrícia Dias; daSilva, Elói Quintas Gonçalves; Barbosa, Meire de Oliveira; Almeida-Souza, Hebréia Oliveira; Fontes, Patrícia Pereira; deMagalhães, Marcos Jorge; Pereira, Paulo Roberto Gomes; Prates, Maura Vianna; Franco, Gloria Regina; Faria-Campos, Alessandra; Campos, Sérgio Vale Aguiar; Baracat-Pereira, Maria Cristina
2016-12-15
Antimicrobial peptides from plants present mechanisms of action that differ from those of conventional defense agents. They are under-explored but have potential as commercial antimicrobials. Bell pepper leaves ('Magali R') are discarded after harvesting the fruit and are a source of bioactive peptides. This work reports the isolation, by peptidomics tools, and the identification and partial characterization, by computational tools, of an antimicrobial peptide from bell pepper leaves, and demonstrates the usefulness of database records and in silico analysis for the study of plant peptides aimed at biotechnological uses. Aqueous extracts from leaves were enriched in peptides by salt fractionation and ultrafiltration. An antimicrobial peptide was isolated by tandem chromatographic procedures. Mass spectrometry, automated peptide sequencing and bioinformatics tools were used alternately for the identification and partial characterization of the hevein-like peptide, named HEV-CANN. The computational tools that assisted the identification of the peptide included BlastP, PSI-Blast, ClustalOmega, PeptideCutter and ProtParam; conventional protein databases (DBs) such as Mascot, Protein-DB, GenBank-DB, RefSeq, Swiss-Prot and UniProtKB; peptide-specific DBs such as Amper, APD2, CAMP, LAMPs and PhytAMP; and other tools included in ExPASy for Proteomics, The Bioactive Peptide Databases and The Pepper Genome Database. The HEV-CANN sequence comprises 40 amino acid residues, with a mass of 4258.8 Da, a theoretical pI of 8.78 and four disulfide bonds. It was stable, and it inhibited the growth of phytopathogenic bacteria and a fungus. HEV-CANN presents a chitin-binding domain in its sequence. There was high identity and positive alignment of the HEV-CANN sequence in various databases, but not complete identity, suggesting that HEV-CANN may be produced by ribosomal synthesis, which is in accordance with its constitutive nature.
Computational tools for proteomics and databases are not adjusted for short sequences, which hampered HEV-CANN identification. Adjusting the statistical tests used with large protein databases is one alternative for promoting the significant identification of peptides. The development of specific DBs for plant antimicrobial peptides, with information about peptide sequences, functional genomic data, structural motifs and domains of molecules, functional domains, and peptide-biomolecule interactions, would be valuable and necessary.
Das, Sankha Subhra; Saha, Pritam
2018-01-01
Abstract MicroRNAs (miRNAs) are well known as key regulators of diverse biological pathways. A body of experimental evidence has shown that abnormal miRNA expression profiles are responsible for various pathophysiological conditions by modulating genes in disease-associated pathways. In spite of the rapid increase in research data confirming such associations, scientists still do not have access to a consolidated database offering these miRNA-pathway association details for critical diseases. We have developed miRwayDB, a database providing comprehensive information on experimentally validated miRNA-pathway associations in various pathophysiological conditions, utilizing data collected from the published literature. To the best of our knowledge, it is the first database that provides information about experimentally validated miRNA-mediated pathway dysregulation as seen specifically in critical human diseases and hence indicative of a cause-and-effect relationship in most cases. The current version of miRwayDB collects an exhaustive list of miRNA-pathway association entries for 76 critical disease conditions by reviewing 663 published articles. Each database entry contains complete information on the name of the pathophysiological condition, associated miRNA(s), experimental sample type(s), regulation pattern (up/down) of the miRNA, pathway association(s), targeted member(s) of the dysregulated pathway(s) and a brief description. In addition, miRwayDB provides miRNA, gene and pathway scores to evaluate the role of miRNA-regulated pathways in various pathophysiological conditions. The database can also be used for other biomedical approaches, such as validation of computational analyses, integrated analyses and computational model prediction. It also offers a submission page for novel data from recently published studies. We believe that miRwayDB will be a useful tool for the miRNA research community. Database URL: http://www.mirway.iitkgp.ac.in PMID:29688364
Application of materials database (MAT.DB.) to materials education
NASA Technical Reports Server (NTRS)
Liu, Ping; Waskom, Tommy L.
1994-01-01
Finding the right material for the job is an important aspect of engineering. Sometimes the choice is as fundamental as selecting between steel and aluminum; other times, the choice may be between different compositions of an alloy. Discovering and compiling materials data is a demanding task, but it leads to accurate models for analysis and successful materials application. Mat.DB is a database management system designed for maintaining information on the properties and processing of engineered materials, including metals, plastics, composites, and ceramics. It was developed by the Center for Materials Data of ASM (American Society for Metals) International, which collects and reviews material property data for publication in books, reports, and electronic databases. Mat.DB was developed to aid data management and materials applications.
NASA Astrophysics Data System (ADS)
Kuo, K. S.; Rilee, M. L.
2017-12-01
Existing pathways for bringing together massive, diverse Earth Science datasets for integrated analyses burden end users with data packaging and management details irrelevant to their domain goals. The major data repositories focus on archival, discovery, and dissemination of products (files) in a standardized manner. End users must download and then adapt these files using local resources and custom methods before analysis can proceed. This reduces scientific or other domain productivity, as scarce resources and expertise must be diverted to data processing. The Spatio-Temporal Adaptive Resolution Encoding (STARE) is a unifying scheme that encodes geospatial and temporal information for organizing data on scalable computing/storage resources, minimizing expensive data transfers. STARE provides a compact representation that turns set-logic functions, e.g. conditional subsetting, into integer operations and takes into account the representative spatiotemporal resolutions of the data in the datasets, which is needed to align the placement of geo-spatiotemporally diverse data on massively parallel resources. Automating important scientific functions (e.g. regridding) and computational functions (e.g. data placement) allows scientists to focus on domain-specific questions instead of expending their expertise on data processing. While STARE is not tied to any particular computing technology, we have used STARE for visualization and the SciDB array database to analyze Earth Science data on a 28-node compute cluster. STARE's automatic data placement and coupling of geometric and array indexing allow complicated data comparisons to be realized as straightforward database operations like "join." With STARE-enabled automation, SciDB+STARE provides a database interface, reducing costly data preparation, increasing the volume and variety of integrable data, and easing result sharing.
Using SciDB+STARE as part of an integrated analysis infrastructure, we demonstrate the dramatic ease of combining diametrically different datasets, i.e. gridded (NMQ radar) vs. spacecraft swath (TRMM). SciDB+STARE is an important step towards a computational infrastructure for integrating and sharing diverse, complex Earth Science data and science products derived from them.
IMGT, the international ImMunoGeneTics information system®
Lefranc, Marie-Paule; Giudicelli, Véronique; Kaas, Quentin; Duprat, Elodie; Jabado-Michaloud, Joumana; Scaviner, Dominique; Ginestoux, Chantal; Clément, Oliver; Chaume, Denys; Lefranc, Gérard
2005-01-01
The international ImMunoGeneTics information system® (IMGT) (http://imgt.cines.fr), created in 1989 by the Laboratoire d'ImmunoGénétique Moléculaire LIGM (Université Montpellier II and CNRS) at Montpellier, France, is a high-quality integrated knowledge resource specializing in the immunoglobulins (IGs), T cell receptors (TRs), major histocompatibility complex (MHC) of human and other vertebrates, and related proteins of the immune systems (RPI) that belong to the immunoglobulin superfamily (IgSF) and to the MHC superfamily (MhcSF). IMGT includes several sequence databases (IMGT/LIGM-DB, IMGT/PRIMER-DB, IMGT/PROTEIN-DB and IMGT/MHC-DB), one genome database (IMGT/GENE-DB) and one three-dimensional (3D) structure database (IMGT/3Dstructure-DB), Web resources comprising 8000 HTML pages (IMGT Marie-Paule page), and interactive tools. IMGT data are expertly annotated according to the rules of the IMGT Scientific chart, based on the IMGT-ONTOLOGY concepts. IMGT tools are particularly useful for the analysis of the IG and TR repertoires in normal physiological and pathological situations. IMGT is used in medical research (autoimmune diseases, infectious diseases, AIDS, leukemias, lymphomas, myelomas), veterinary research, biotechnology related to antibody engineering (phage displays, combinatorial libraries, chimeric, humanized and human antibodies), diagnostics (clonalities, detection and follow-up of residual diseases) and therapeutic approaches (graft, immunotherapy and vaccinology). IMGT is freely available at http://imgt.cines.fr. PMID:15608269
AtomDB Progress Report: Atomic data and new models for X-ray spectroscopy.
NASA Astrophysics Data System (ADS)
Smith, Randall K.; Foster, Adam; Brickhouse, Nancy S.; Stancil, Phillip C.; Cumbee, Renata; Mullen, Patrick Dean; AtomDB Team
2018-06-01
The AtomDB project collects atomic data from both theoretical and observational/experimental sources, providing both a convenient interface (http://www.atomdb.org/Webguide/webguide.php) and input to spectral models for many types of astrophysical X-ray plasmas. We have released several updates to AtomDB in response to the Hitomi data, including new data for the Fe K complex, and have expanded the range of models available in AtomDB to include the Kronos charge exchange models from Mullen et al. (2016, ApJS, 224, 2). Combined with the previous AtomDB charge exchange model (http://www.atomdb.org/CX/), these data enable a velocity-dependent model for X-ray and EUV charge exchange spectra. We also present a new Kappa-distribution spectral model, enabling plasmas with non-Maxwellian electron distributions to be modeled with AtomDB. Tools are provided within pyAtomDB to explore and exploit these new plasma models. This presentation will review these enhancements and describe plans for the next few years of database and code development in preparation for XARM, Athena, and (hopefully) Arcus.
MIPS PlantsDB: a database framework for comparative plant genome research
Nussbaumer, Thomas; Martis, Mihaela M.; Roessner, Stephan K.; Pfeifer, Matthias; Bader, Kai C.; Sharma, Sapna; Gundlach, Heidrun; Spannagl, Manuel
2013-01-01
The rapidly increasing amount of plant genome (sequence) data enables powerful comparative analyses and integrative approaches and also requires structured and comprehensive information resources. Databases are needed for both model and crop plant organisms and both intuitive search/browse views and comparative genomics tools should communicate the data to researchers and help them interpret it. MIPS PlantsDB (http://mips.helmholtz-muenchen.de/plant/genomes.jsp) was initially described in NAR in 2007 [Spannagl,M., Noubibou,O., Haase,D., Yang,L., Gundlach,H., Hindemitt, T., Klee,K., Haberer,G., Schoof,H. and Mayer,K.F. (2007) MIPSPlantsDB–plant database resource for integrative and comparative plant genome research. Nucleic Acids Res., 35, D834–D840] and was set up from the start to provide data and information resources for individual plant species as well as a framework for integrative and comparative plant genome research. PlantsDB comprises database instances for tomato, Medicago, Arabidopsis, Brachypodium, Sorghum, maize, rice, barley and wheat. Building up on that, state-of-the-art comparative genomics tools such as CrowsNest are integrated to visualize and investigate syntenic relationships between monocot genomes. Results from novel genome analysis strategies targeting the complex and repetitive genomes of triticeae species (wheat and barley) are provided and cross-linked with model species. The MIPS Repeat Element Database (mips-REdat) and Catalog (mips-REcat) as well as tight connections to other databases, e.g. via web services, are further important components of PlantsDB. PMID:23203886
4-(2,4-Dichlorophenoxy)butyric acid (2,4-DB)
Integrated Risk Information System (IRIS)
Integrated Risk Information System (IRIS) Chemical Assessment Summary. U.S. Environmental Protection Agency, National Center for Environmental Assessment. This IRIS Summary has been removed from the IRIS database and is available for historical reference purposes (July 2016). 4-(2,4-Dichlorophenoxy)butyric acid (2,4-DB)
A Comparison of Global Indexing Schemes to Facilitate Earth Science Data Management
NASA Astrophysics Data System (ADS)
Griessbaum, N.; Frew, J.; Rilee, M. L.; Kuo, K. S.
2017-12-01
Recent advances in database technology have led to systems optimized for managing petabyte-scale multidimensional arrays. These array databases are a good fit for subsets of the Earth's surface that can be projected into a rectangular coordinate system with acceptable geometric fidelity. However, for global analyses, array databases must address the same distortions and discontinuities that apply to map projections in general. The array database SciDB supports enormous databases spread across thousands of computing nodes. Additionally, the following SciDB characteristics are particularly germane to the coordinate system problem: SciDB efficiently stores and manipulates sparse (i.e. mostly empty) arrays. SciDB arrays have 64-bit indexes. SciDB supports user-defined data types, functions, and operators. We have implemented two geospatial indexing schemes in SciDB. The simplest uses two array dimensions to represent longitude and latitude. For representation as 64-bit integers, the coordinates are multiplied by a scale factor large enough to yield an appropriate Earth surface resolution (e.g., a scale factor of 100,000 yields a resolution of approximately 1m at the equator). Aside from the longitudinal discontinuity, the principal disadvantage of this scheme is its fixed scale factor. The second scheme uses a single array dimension to represent the bit-codes for locations in a hierarchical triangular mesh (HTM) coordinate system. A HTM maps the Earth's surface onto an octahedron, and then recursively subdivides each triangular face to the desired resolution. Earth surface locations are represented as the concatenation of an octahedron face code and a quadtree code within the face. Unlike our integerized lat-lon scheme, the HTM allows objects of different sizes (e.g., pixels with differing resolutions) to be represented in the same indexing scheme. We present an evaluation of the relative utility of these two schemes for managing and analyzing MODIS swath data.
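The first (integerized lat-lon) scheme can be sketched as follows. The function names and the +90/+180 offsets used to keep indexes non-negative are assumptions for illustration, not the authors' exact implementation; only the scale-factor idea comes from the abstract:

```python
SCALE = 100_000  # ~1 m resolution at the equator, as in the abstract

def latlon_to_index(lat, lon, scale=SCALE):
    """Map (lat, lon) in degrees to non-negative integer array indexes.

    Shifting by +90/+180 puts lat in [0, 180] and lon in [0, 360) so the
    indexes fit naturally into unsigned 64-bit array dimensions.
    """
    return (round((lat + 90.0) * scale), round((lon + 180.0) * scale))

def index_to_latlon(i, j, scale=SCALE):
    """Invert latlon_to_index, up to the 1/scale quantization."""
    return (i / scale - 90.0, j / scale - 180.0)
```

The fixed scale factor criticized in the abstract is visible here: every location is quantized to the same 1/SCALE grid, regardless of the native resolution of the data being indexed, which is exactly what the HTM scheme avoids.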
MPID-T2: a database for sequence-structure-function analyses of pMHC and TR/pMHC structures.
Khan, Javed Mohammed; Cheruku, Harish Reddy; Tong, Joo Chuan; Ranganathan, Shoba
2011-04-15
Sequence-structure-function information is critical in understanding the mechanism of pMHC and TR/pMHC binding and recognition. A database for sequence-structure-function information on pMHC and TR/pMHC interactions, MHC-Peptide Interaction Database-TR version 2 (MPID-T2), is now available augmented with the latest PDB and IMGT/3Dstructure-DB data, advanced features and new parameters for the analysis of pMHC and TR/pMHC structures. http://biolinfo.org/mpid-t2. shoba.ranganathan@mq.edu.au Supplementary data are available at Bioinformatics online.
myPhyloDB: a local web server for the storage and analysis of metagenomics data
USDA-ARS?s Scientific Manuscript database
myPhyloDB is a user-friendly personal database with a browser-interface designed to facilitate the storage, processing, analysis, and distribution of metagenomics data. MyPhyloDB archives raw sequencing files, and allows for easy selection of project(s)/sample(s) of any combination from all availab...
Managing Attribute—Value Clinical Trials Data Using the ACT/DB Client—Server Database System
Nadkarni, Prakash M.; Brandt, Cynthia; Frawley, Sandra; Sayward, Frederick G.; Einbinder, Robin; Zelterman, Daniel; Schacter, Lee; Miller, Perry L.
1998-01-01
ACT/DB is a client-server database application for storing clinical trials and outcomes data, which is currently undergoing initial pilot use. It stores most of its data in entity-attribute-value form. Such data are segregated according to data type to allow indexing by value when possible, and binary large object data are managed in the same way as other data. ACT/DB lets an investigator design a study rapidly by defining the parameters (or attributes) that are to be gathered, as well as their logical grouping for purposes of display and data entry. ACT/DB generates customizable data entry screens. The data can be viewed through several standard reports as well as exported as text to external analysis programs. ACT/DB is designed to encourage reuse of parameters across multiple studies and has facilities for dictionary search and maintenance. It uses a Microsoft Access client running on Windows 95 machines, which communicates with an Oracle server running on a UNIX platform. ACT/DB is being used to manage the data for seven studies in its initial deployment. PMID:9524347
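The type-segregated entity-attribute-value layout described above can be sketched with SQLite (ACT/DB itself uses a Microsoft Access client against an Oracle server; the table and column names here are hypothetical). Keeping one value table per data type is what makes indexing by value possible, since each column has a single, indexable type:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# One EAV table per data type, so values can be indexed (hypothetical schema).
con.executescript("""
CREATE TABLE eav_int  (entity INTEGER, attribute TEXT, value INTEGER);
CREATE TABLE eav_text (entity INTEGER, attribute TEXT, value TEXT);
CREATE INDEX idx_int_val ON eav_int(attribute, value);
""")

# Each row states: entity E has attribute A with value V.
con.executemany("INSERT INTO eav_int VALUES (?, ?, ?)",
                [(1, "age", 42), (2, "age", 35)])
con.executemany("INSERT INTO eav_text VALUES (?, ?, ?)",
                [(1, "diagnosis", "hypertension")])

# Indexed value query: which entities have age > 40?
rows = con.execute(
    "SELECT entity FROM eav_int WHERE attribute = 'age' AND value > 40"
).fetchall()
# rows == [(1,)]
```

New study parameters become new attribute rows rather than new columns, which is why an investigator can define a study's attributes without schema changes.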
NASA Astrophysics Data System (ADS)
Grimes, J.; Mahoney, A. R.; Heinrichs, T. A.; Eicken, H.
2012-12-01
Sensor data can be highly variable in nature and also varied depending on the physical quantity being observed, sensor hardware and sampling parameters. The sea ice mass balance site (MBS) operated in Barrow by the University of Alaska Fairbanks (http://seaice.alaska.edu/gi/observatories/barrow_sealevel) is a multisensor platform consisting of a thermistor string, air and water temperature sensors, acoustic altimeters above and below the ice and a humidity sensor. Each sensor has a unique specification and configuration. The data from multiple sensors are combined to generate sea ice data products. For example, ice thickness is calculated from the positions of the upper and lower ice surfaces, which are determined using data from downward-looking and upward-looking acoustic altimeters above and below the ice, respectively. As a data clearinghouse, the Geographic Information Network of Alaska (GINA) processes real-time data from many sources, including the Barrow MBS. Doing so requires a system that is easy to use, yet also offers the flexibility to handle data from multisensor observing platforms. In the case of the Barrow MBS, the metadata system needs to accommodate the addition of new sensors and retirement of old sensors from year to year, as well as instrument configuration changes caused by, for example, spring melt or inquisitive polar bears. We also require ease of use for both administrators and end users. Here we present the data and processing steps of a sensor data system powered by the NoSQL storage engine MongoDB. The system has been developed to ingest, process, disseminate and archive data from the Barrow MBS. Storing sensor data in a generalized format, from many different sources, is a challenging task, especially for traditional SQL databases with a set schema. MongoDB is a NoSQL (not only SQL) database that does not require a fixed schema.
There are several advantages to using this model over traditional relational database management system (RDBMS) databases. The lack of a required schema allows flexibility in how the data can be ingested into the database. For example, MongoDB imposes no restrictions on field names. For researchers using the system, this means that the name they have chosen for a sensor is carried through the database, any processing, and the final output, helping to preserve data integrity. MongoDB also allows data to be pushed to it dynamically, meaning that field attributes can be defined at the point of ingestion. This allows any sensor data to be ingested as a document and for this functionality to be transferred to the user interface, allowing greater adaptability to different use-case scenarios. In presenting the MongoDB data system being developed for the Barrow MBS, we demonstrate the versatility of this approach and its suitability as the foundation of a Barrow node of the Arctic Observing Network.
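The schema-free ingestion described above can be sketched in a few lines. The snippet below stands in for a MongoDB collection with a plain Python list so it is self-contained; the sensor names and field names are invented for illustration, not the Barrow MBS schema.

```python
# Sketch of schema-free document ingestion in the spirit of MongoDB:
# each reading is a document whose fields are defined at ingest time,
# so sensors with different attributes coexist in one collection.
# Sensor and field names are illustrative, not the Barrow MBS schema.
collection = []  # stand-in for a MongoDB collection

def ingest(doc):
    collection.append(doc)  # no fixed schema is enforced

def find(filt):
    """MongoDB-style equality filter over documents."""
    return [d for d in collection
            if all(d.get(k) == v for k, v in filt.items())]

ingest({"sensor": "thermistor_string", "depth_cm": 40, "temp_c": -3.2})
ingest({"sensor": "acoustic_altimeter", "orientation": "upward",
        "range_m": 1.18})

# Query by a field that only some documents carry
upward = find({"orientation": "upward"})
print(len(upward))  # 1
```

Because each document carries its own field names, adding a new sensor (or retiring an old one) requires no schema migration, only new documents with new fields.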
Virtual file system on NoSQL for processing high volumes of HL7 messages.
Kimura, Eizen; Ishihara, Ken
2015-01-01
The Standardized Structured Medical Information Exchange (SS-MIX) is intended to be the standard repository for HL7 messages; because it depends on a local file system, however, its scalability is limited. We implemented a virtual file system using NoSQL to incorporate modern computing technology into SS-MIX and allow the system to integrate local patient IDs from different healthcare systems into a universal system. We discuss its implementation using the database MongoDB and describe its performance in a case study.
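The patient-ID integration mentioned above amounts to mapping (facility, local ID) pairs onto a single universal identifier. A minimal sketch, with invented facility names and ID formats, might look like this:

```python
# Sketch of integrating local patient IDs from different healthcare
# systems into one universal ID space, as the virtual file system
# described above must do. Facility names and IDs are invented.
universal = {}   # (facility, local_id) -> universal_id
next_id = 1000

def resolve(facility, local_id):
    """Return the universal ID for a local ID, minting one if needed."""
    global next_id
    key = (facility, local_id)
    if key not in universal:
        universal[key] = next_id
        next_id += 1
    return universal[key]

a = resolve("hospital_A", "P-123")
b = resolve("hospital_B", "000456")
again = resolve("hospital_A", "P-123")
print(a == again, a != b)  # True True
```

The same lookup must be stable across repeated ingests, which is why the mapping itself has to live in the shared store rather than in any one facility's system.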
Guzmán-Flores, Juan Manuel; Flores-Pérez, Elsa Cristina; Hernández-Ortiz, Magdalena; Vargas-Ortiz, Katya; Ramírez-Emiliano, Joel; Encarnación-Guevara, Sergio; Pérez-Vázquez, Victoriano
2018-06-01
Type 2 diabetes mellitus is characterized by insulin resistance in the liver. Insulin is not only involved in carbohydrate metabolism, it also regulates protein synthesis. This work describes the expression of proteins in the liver of a diabetic mouse and identifies the metabolic pathways involved. Twenty-week-old diabetic db/db mice were hepatectomized, after which proteins were separated by 2D-Polyacrylamide Gel Electrophoresis (2D-PAGE). Spots varying in intensity were analyzed using mass spectrometry, and biological function was assigned by the Database for Annotation, Visualization and Integrated Discovery (DAVID) software. A differential expression of 26 proteins was identified; among these were arginase-1, pyruvate carboxylase, peroxiredoxin-1, regucalcin, and sorbitol dehydrogenase. Bioinformatics analysis indicated that many of these proteins are mitochondrial and participate in metabolic pathways, such as the citrate cycle, fructose and mannose metabolism, and glycolysis or gluconeogenesis. In addition, these proteins are related to oxidation-reduction reactions and the molecular functions of vitamin binding and amino acid metabolism. In conclusion, the proteomic profile of the liver of the diabetic db/db mouse exhibited mainly alterations in the metabolism of carbohydrates and nitrogen. These differences illustrate the heterogeneity of diabetes in its different stages and under different conditions, and highlight the need to improve treatments for this disease.
Database resources of the National Center for Biotechnology Information
Sayers, Eric W.; Barrett, Tanya; Benson, Dennis A.; Bolton, Evan; Bryant, Stephen H.; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M.; DiCuccio, Michael; Federhen, Scott; Feolo, Michael; Fingerman, Ian M.; Geer, Lewis Y.; Helmberg, Wolfgang; Kapustin, Yuri; Krasnov, Sergey; Landsman, David; Lipman, David J.; Lu, Zhiyong; Madden, Thomas L.; Madej, Tom; Maglott, Donna R.; Marchler-Bauer, Aron; Miller, Vadim; Karsch-Mizrachi, Ilene; Ostell, James; Panchenko, Anna; Phan, Lon; Pruitt, Kim D.; Schuler, Gregory D.; Sequeira, Edwin; Sherry, Stephen T.; Shumway, Martin; Sirotkin, Karl; Slotta, Douglas; Souvorov, Alexandre; Starchenko, Grigory; Tatusova, Tatiana A.; Wagner, Lukas; Wang, Yanli; Wilbur, W. John; Yaschenko, Eugene; Ye, Jian
2012-01-01
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Website. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Probe, Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. PMID:22140104
Database resources of the National Center for Biotechnology Information
2013-01-01
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, the Genetic Testing Registry, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Probe, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool, Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page. PMID:23193264
dbHiMo: a web-based epigenomics platform for histone-modifying enzymes
Choi, Jaeyoung; Kim, Ki-Tae; Huh, Aram; Kwon, Seomun; Hong, Changyoung; Asiegbu, Fred O.; Jeon, Junhyun; Lee, Yong-Hwan
2015-01-01
Over the past two decades, epigenetics has evolved into a key concept for understanding regulation of gene expression. Among many epigenetic mechanisms, covalent modifications such as acetylation and methylation of lysine residues on core histones emerged as a major mechanism in epigenetic regulation. Here, we present the database for histone-modifying enzymes (dbHiMo; http://hme.riceblast.snu.ac.kr/) aimed at facilitating functional and comparative analysis of histone-modifying enzymes (HMEs). HMEs were identified by applying a search pipeline built upon profile hidden Markov models (HMMs) to proteomes. The database incorporates 11,576 HMEs identified from 603 proteomes, including 483 fungal, 32 plant and 51 metazoan species. dbHiMo provides users with web-based personalized data browsing and analysis tools, supporting comparative and evolutionary genomics. With comprehensive data entries and associated web-based tools, our database will be a valuable resource for future epigenetics/epigenomics studies. Database URL: http://hme.riceblast.snu.ac.kr/ PMID:26055100
Relax with CouchDB - Into the non-relational DBMS era of Bioinformatics
Manyam, Ganiraju; Payton, Michelle A.; Roth, Jack A.; Abruzzo, Lynne V.; Coombes, Kevin R.
2012-01-01
With the proliferation of high-throughput technologies, genome-level data analysis has become common in molecular biology. Bioinformaticians are developing extensive resources to annotate and mine biological features from high-throughput data. The underlying database management systems for most bioinformatics software are based on a relational model. Modern non-relational databases offer an alternative that has flexibility, scalability, and a non-rigid design schema. Moreover, with an accelerated development pace, non-relational databases like CouchDB can be ideal tools to construct bioinformatics utilities. We describe CouchDB by presenting three new bioinformatics resources: (a) geneSmash, which collates data from bioinformatics resources and provides automated gene-centric annotations, (b) drugBase, a database of drug-target interactions with a web interface powered by geneSmash, and (c) HapMap-CN, which provides a web interface to query copy number variations from three SNP-chip HapMap datasets. In addition to the web sites, all three systems can be accessed programmatically via web services. PMID:22609849
NASA Technical Reports Server (NTRS)
Wrenn, Gregory A.
2005-01-01
This report describes a database routine called DB90 which is intended for use with scientific and engineering computer programs. The software is written in the Fortran 90/95 programming language standard with file input and output routines written in the C programming language. These routines should be completely portable to any computing platform and operating system that has Fortran 90/95 and C compilers. DB90 allows a program to supply relation names and up to 5 integer key values to uniquely identify each record of each relation. This permits the user to select records or retrieve data in any desired order.
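DB90's addressing scheme, a relation name plus up to five integer keys uniquely identifying each record, can be sketched as a keyed store. This is an illustrative Python analogue of the Fortran design, not DB90's code; the relation and field names are invented.

```python
# Sketch of DB90-style record addressing: a relation name plus up to
# five integer keys uniquely identifies each record, so records can be
# stored and retrieved in any desired order. Names are illustrative.
store = {}

def put(relation, keys, record):
    if not 1 <= len(keys) <= 5:
        raise ValueError("addressing allows 1 to 5 integer keys")
    store[(relation, tuple(keys))] = record

def get(relation, keys):
    return store[(relation, tuple(keys))]

put("loads", (3, 1), {"force_n": 1250.0})
put("loads", (3, 2), {"force_n": 980.5})

# Retrieval order is independent of insertion order
print(get("loads", (3, 2)))  # {'force_n': 980.5}
```

Keying every record this way is what frees the calling program from caring about physical record order in the underlying file.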
NASA Astrophysics Data System (ADS)
Choi, Sang-Hwa; Kim, Sung Dae; Park, Hyuk Min; Lee, SeungHa
2016-04-01
We established and have operated an integrated data system for managing, archiving and sharing marine geology and geophysical data around Korea produced by various research projects and programs at the Korea Institute of Ocean Science & Technology (KIOST). To keep the data system consistent through continuous data updates, we first set up standard operating procedures (SOPs) for data archiving, data processing and conversion, data quality control, data uploading and DB maintenance. The system comprises two databases: ARCHIVE DB, which stores archived data in the original forms and formats supplied by data providers, and GIS DB, which manages all other compiled, processed and derived data and information for data services and GIS application services. The relational database management system Oracle 11g was adopted as the DBMS, and open-source GIS technologies were applied for the GIS services: OpenLayers for the user interface, GeoServer as the application server, and PostGIS on PostgreSQL for the GIS database. For convenient use of geophysical data in SEG-Y format, a viewer program was developed and embedded in this system. Users can search data through the GIS user interface and save the results as a report.
Farrell, L J; Lo, R; Wanford, J J; Jenkins, A; Maxwell, A; Piddock, L J V
2018-06-11
The current state of antibiotic discovery, research and development is insufficient to respond to the need for new treatments for drug-resistant bacterial infections. The process has changed over the last decade, with most new agents that are in Phases 1-3, or recently approved, having been discovered in small- and medium-sized enterprises or academia. These agents have then been licensed or sold to large companies for further development with the goal of taking them to market. However, early drug discovery and development, including the possibility of developing previously discontinued agents, would benefit from a database of antibacterial compounds for scrutiny by the developers. This article describes the first free, open-access searchable database of antibacterial compounds, including discontinued agents, drugs under pre-clinical development and those in clinical trials: AntibioticDB (AntibioticDB.com). Data were obtained from publicly available sources. This article summarizes the compounds and drugs in AntibioticDB, including their drug class, mode of action, development status and propensity to select drug-resistant bacteria. AntibioticDB includes compounds currently in pre-clinical development and 834 that have been discontinued and that reached varying stages of development. These may serve as starting points for future research and development.
Logic programming to infer complex RNA expression patterns from RNA-seq data.
Weirick, Tyler; Militello, Giuseppe; Ponomareva, Yuliya; John, David; Döring, Claudia; Dimmeler, Stefanie; Uchida, Shizuka
2018-03-01
To meet the increasing demand in the field, numerous long noncoding RNA (lncRNA) databases are available. Given that many lncRNAs are expressed specifically in certain cell types and/or in a time-dependent manner, most lncRNA databases fall short of providing such profiles. We developed a strategy using logic programming to handle the complex organization of organs, their tissues and cell types, as well as gender and developmental time points. To showcase this strategy, we introduce 'RenalDB' (http://renaldb.uni-frankfurt.de), a database providing expression profiles of RNAs in major organs with a focus on kidney tissues and cells. RenalDB uses logic programming to describe complex anatomy, sample metadata and the logical relationships defining expression, enrichment or specificity. We validated the content of RenalDB with biological experiments and functionally characterized two long intergenic noncoding RNAs: LOC440173 is important for cell growth or cell survival, whereas PAXIP1-AS1 is a regulator of cell death. We anticipate RenalDB will be used as a first step toward functional studies of lncRNAs in the kidney.
Berthold, Michael R.; Hedrick, Michael P.; Gilson, Michael K.
2015-01-01
Today’s large, public databases of protein–small molecule interaction data are creating important new opportunities for data mining and integration. At the same time, new graphical user interface-based workflow tools offer facile alternatives to custom scripting for informatics and data analysis. Here, we illustrate how the large protein-ligand database BindingDB may be incorporated into KNIME workflows as a step toward the integration of pharmacological data with broader biomolecular analyses. Thus, we describe a collection of KNIME workflows that access BindingDB data via RESTful webservices and, for more intensive queries, via a local distillation of the full BindingDB dataset. We focus in particular on the KNIME implementation of knowledge-based tools to generate informed hypotheses regarding protein targets of bioactive compounds, based on notions of chemical similarity. A number of variants of this basic approach are tested for seven existing drugs with relatively ill-defined therapeutic targets, leading to replication of some previously confirmed results and discovery of new, high-quality hits. Implications for future development are discussed. Database URL: www.bindingdb.org PMID:26384374
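The "chemical similarity" target-fishing idea described above is conventionally scored with the Tanimoto coefficient over fingerprint bit sets. The sketch below uses invented toy fingerprints, not real BindingDB compounds or its actual implementation:

```python
# Tanimoto similarity over fingerprint bit sets, the standard measure
# behind similarity-based target hypotheses like those described above.
# Fingerprints here are invented toy bit sets, not real compound data.
def tanimoto(fp_a, fp_b):
    """|A & B| / |A | B| for two sets of 'on' fingerprint bits."""
    if not fp_a and not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)

query = {1, 4, 9, 17, 23}
ligands = {
    "cmpd_1": {1, 4, 9, 17, 23, 40},  # shares 5 of 6 bits with query
    "cmpd_2": {2, 5, 8},              # shares none
}
scores = {name: tanimoto(query, bits) for name, bits in ligands.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))  # cmpd_1 0.833
```

Ranking a database's ligands this way, then reading off the proteins those ligands bind, is the essence of the knowledge-based target-hypothesis workflows the abstract describes.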
Processing and Quality Monitoring for the ATLAS Tile Hadronic Calorimeter Data
NASA Astrophysics Data System (ADS)
Burghgrave, Blake; ATLAS Collaboration
2017-10-01
An overview is presented of Data Processing and Data Quality (DQ) Monitoring for the ATLAS Tile Hadronic Calorimeter. Calibration runs are monitored from a data quality perspective and used as a cross-check for physics runs. Data quality in physics runs is monitored extensively and continuously. Any problems are reported and immediately investigated. The DQ efficiency achieved was 99.6% in 2012 and 100% in 2015, after the detector maintenance in 2013-2014. Changes to detector status or calibrations are entered into the conditions database (DB) during a brief calibration loop between the end of a run and the beginning of bulk processing of data collected in it. Bulk processed data are reviewed and certified for the ATLAS Good Run List if no problem is detected. Experts maintain the tools used by DQ shifters and the calibration teams during normal operation, and prepare new conditions for data reprocessing and Monte Carlo (MC) production campaigns. Conditions data are stored in 3 databases: Online DB, Offline DB for data and a special DB for Monte Carlo. Database updates can be performed through a custom-made web interface.
Comparison of the Frontier Distributed Database Caching System to NoSQL Databases
NASA Astrophysics Data System (ADS)
Dykstra, Dave
2012-12-01
One of the main attractions of non-relational “NoSQL” databases is their ability to scale to large numbers of readers, including readers spread over a wide area. The Frontier distributed database caching system, used in production by the Large Hadron Collider CMS and ATLAS detector projects for Conditions data, is based on traditional SQL databases but also adds high scalability and the ability to be distributed over a wide-area for an important subset of applications. This paper compares the major characteristics of the two different approaches and identifies the criteria for choosing which approach to prefer over the other. It also compares in some detail the NoSQL databases used by CMS and ATLAS: MongoDB, CouchDB, HBase, and Cassandra.
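The scalability Frontier adds to a traditional SQL backend comes from read-through caching: many readers are served from caches while the central database answers each distinct query once. A minimal sketch, with a dict standing in for the real backend and invented key names:

```python
# Minimal read-through cache in the spirit of the Frontier approach:
# repeated reads of the same conditions key are served from the cache,
# shielding the central SQL database from large numbers of readers.
# The "backend" dict and key names are invented for illustration.
backend = {"conditions/run1234": "calib-v7"}
backend_hits = 0
cache = {}

def read(key):
    global backend_hits
    if key not in cache:        # miss: query the central database once
        backend_hits += 1
        cache[key] = backend[key]
    return cache[key]           # hit: served locally

for _ in range(1000):           # many readers, one backend query
    read("conditions/run1234")
print(backend_hits)  # 1
```

This works well precisely for the "important subset of applications" the paper identifies: read-dominated, widely shared data such as conditions, where many clients ask identical questions.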
TRACTOR_DB: a database of regulatory networks in gamma-proteobacterial genomes
González, Abel D.; Espinosa, Vladimir; Vasconcelos, Ana T.; Pérez-Rueda, Ernesto; Collado-Vides, Julio
2005-01-01
Experimental data on the Escherichia coli transcriptional regulatory system has been used in past years to predict new regulatory elements (promoters, transcription factors (TFs), TF binding sites and operons) within its genome. As more genomes of gamma-proteobacteria are sequenced, the prediction of these elements in a growing number of organisms has become more feasible, as a step towards the study of how different bacteria respond to environmental changes at the level of transcriptional regulation. In this work, we present TRACTOR_DB (TRAnscription FaCTORs' predicted binding sites in prokaryotic genomes), a relational database that contains computational predictions of new members of 74 regulons in 17 gamma-proteobacterial genomes. These predictions used a comparative genomics approach for which several proof-of-principle articles on large regulons have been published. TRACTOR_DB may currently be accessed at http://www.bioinfo.cu/Tractor_DB, http://www.tractor.lncc.br/ or http://www.cifn.unam.mx/Computational_Genomics/tractorDB. Contact: tractor@cifn.unam.mx. PMID:15608293
pE-DB: a database of structural ensembles of intrinsically disordered and of unfolded proteins.
Varadi, Mihaly; Kosol, Simone; Lebrun, Pierre; Valentini, Erica; Blackledge, Martin; Dunker, A Keith; Felli, Isabella C; Forman-Kay, Julie D; Kriwacki, Richard W; Pierattelli, Roberta; Sussman, Joel; Svergun, Dmitri I; Uversky, Vladimir N; Vendruscolo, Michele; Wishart, David; Wright, Peter E; Tompa, Peter
2014-01-01
The goal of pE-DB (http://pedb.vib.be) is to serve as an openly accessible database for the deposition of structural ensembles of intrinsically disordered proteins (IDPs) and of denatured proteins based on nuclear magnetic resonance spectroscopy, small-angle X-ray scattering and other data measured in solution. Owing to the inherent flexibility of IDPs, solution techniques are particularly appropriate for characterizing their biophysical properties, and structural ensembles in agreement with these data provide a convenient tool for describing the underlying conformational sampling. Database entries consist of (i) primary experimental data with descriptions of the acquisition methods and algorithms used for the ensemble calculations, and (ii) the structural ensembles consistent with these data, provided as a set of models in Protein Data Bank format. pE-DB is open for submissions from the community, and is intended as a forum for disseminating the structural ensembles and the methodologies used to generate them. While the need to represent IDP structures is clear, methods for determining and evaluating the structural ensembles are still evolving. The availability of the pE-DB database is expected to promote the development of new modeling methods and to lead to a better understanding of how function arises from disordered states.
Assessment of CFD-based Response Surface Model for Ares I Supersonic Ascent Aerodynamics
NASA Technical Reports Server (NTRS)
Hanke, Jeremy L.
2011-01-01
The Ascent Force and Moment Aerodynamic (AFMA) Databases (DBs) for the Ares I Crew Launch Vehicle (CLV) were typically based on wind tunnel (WT) data, with increments provided by computational fluid dynamics (CFD) simulations for aspects of the vehicle that could not be tested in the WT tests. During the Design Analysis Cycle 3 analysis for the outer mold line (OML) geometry designated A106, a major tunnel mishap delayed the WT test for supersonic Mach numbers (M) greater than 1.6 in the Unitary Plan Wind Tunnel at NASA Langley Research Center, and the test delay pushed the final delivery of the A106 AFMA DB back by several months. The aero team developed an interim database based entirely on the already completed CFD simulations to mitigate the impact of the delay. This CFD-based database used a response surface methodology based on radial basis functions to predict the aerodynamic coefficients for M > 1.6 based on only the CFD data from both WT and flight Reynolds number conditions. The aero team used extensive knowledge of the previous AFMA DB for the A103 OML to guide the development of the CFD-based A106 AFMA DB. This report details the development of the CFD-based A106 Supersonic AFMA DB, constructs a prediction of the database uncertainty using data available at the time of development, and assesses the overall quality of the CFD-based DB both qualitatively and quantitatively. This assessment confirms that a reasonable aerodynamic database can be constructed for launch vehicles at supersonic conditions using only CFD data if sufficient knowledge of the physics and expected behavior is available. This report also demonstrates the applicability of non-parametric response surface modeling using radial basis functions for development of aerodynamic databases that exhibit both linear and non-linear behavior throughout a large data space.
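The radial-basis-function response surface described above can be sketched briefly: fit weights so the surface exactly interpolates the scattered CFD sample points, then evaluate it anywhere in the data space. The example below uses a Gaussian basis and invented toy data, not the Ares I aerodynamic database:

```python
# Sketch of an RBF response surface: solve for weights that make the
# surface interpolate scattered sample points, then predict between
# them. Gaussian basis; the sample data are invented, not Ares I data.
import numpy as np

def rbf_fit(centers, values, eps=1.0):
    d = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    phi = np.exp(-(eps * d) ** 2)       # Gaussian basis matrix
    return np.linalg.solve(phi, values)  # interpolation weights

def rbf_eval(x, centers, weights, eps=1.0):
    d = np.linalg.norm(centers - x, axis=1)
    return np.exp(-(eps * d) ** 2) @ weights

# Toy 2D data space (think Mach number, angle of attack -> coefficient)
pts = np.array([[1.8, 0.0], [2.0, 2.0], [2.5, 4.0], [3.0, 1.0]])
vals = np.array([0.42, 0.55, 0.71, 0.50])
w = rbf_fit(pts, vals)

# The surface reproduces its training points exactly (interpolation)
print(round(float(rbf_eval(np.array([2.0, 2.0]), pts, w)), 6))  # 0.55
```

Because the basis functions are radial, the same machinery handles both the locally linear and the strongly nonlinear regions of the data space, which is the property the report highlights.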
dbSUPER: a database of super-enhancers in mouse and human genome
Khan, Aziz; Zhang, Xuegong
2016-01-01
Super-enhancers are clusters of transcriptional enhancers that drive cell-type-specific gene expression and are crucial to cell identity. Many disease-associated sequence variations are enriched in super-enhancer regions of disease-relevant cell types. Thus, super-enhancers can be used as potential biomarkers for disease diagnosis and therapeutics. Current studies have identified super-enhancers in more than 100 cell types and demonstrated their functional importance. However, a centralized resource to integrate all these findings is not currently available. We developed dbSUPER (http://bioinfo.au.tsinghua.edu.cn/dbsuper/), the first integrated and interactive database of super-enhancers, with the primary goal of providing a resource for assistance in further studies related to transcriptional control of cell identity and disease. dbSUPER provides a responsive and user-friendly web interface to facilitate efficient and comprehensive search and browsing. The data can be easily sent to Galaxy instances, GREAT and Cistrome web-servers for downstream analysis, and can also be visualized in the UCSC genome browser where custom tracks can be added automatically. The data can be downloaded and exported in a variety of formats. Furthermore, dbSUPER lists genes associated with super-enhancers and also links to external databases such as GeneCards, UniProt and Entrez. dbSUPER also provides an overlap analysis tool to annotate user-defined regions. We believe dbSUPER is a valuable resource for the biology and genetics research communities. PMID:26438538
A World Wide Web (WWW) server database engine for an organelle database, MitoDat.
Lemkin, P F; Chipperfield, M; Merril, C; Zullo, S
1996-03-01
We describe a simple database search engine, "dbEngine", which may be used to quickly create a searchable database on a World Wide Web (WWW) server. Data may be prepared from spreadsheet programs (such as Excel) or from tables exported from relational database systems. This Common Gateway Interface (CGI-BIN) program is used with a WWW server such as those available commercially, or from the National Center for Supercomputing Applications (NCSA) or CERN. Its capabilities include: (i) searching records by combinations of terms connected with ANDs or ORs; (ii) returning search results as hypertext links to other WWW database servers; (iii) mapping lists of literature reference identifiers to the full references; (iv) creating bidirectional hypertext links between pictures and the database. dbEngine has been used to support the MitoDat database (Mendelian and non-Mendelian inheritance associated with the Mitochondrion) on the WWW.
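Capability (i) above, matching records against combinations of terms connected with ANDs or ORs, can be sketched in a few lines. The records and field names below are invented for illustration, not MitoDat entries:

```python
# Sketch of dbEngine-style searching: records are matched against
# combinations of terms connected with ANDs or ORs.
# Records and field names are invented, not MitoDat data.
records = [
    {"id": 1, "text": "ATP synthase subunit, mitochondrial inner membrane"},
    {"id": 2, "text": "cytochrome c oxidase assembly factor"},
    {"id": 3, "text": "mitochondrial ribosomal protein"},
]

def search(terms, mode="AND"):
    combine = all if mode == "AND" else any  # AND vs OR combination
    return [r["id"] for r in records
            if combine(t.lower() in r["text"].lower() for t in terms)]

print(search(["mitochondrial", "membrane"]))        # [1]
print(search(["oxidase", "ribosomal"], mode="OR"))  # [2, 3]
```

In the CGI setting the matched IDs would then be rendered as hypertext links, which is how capabilities (ii) through (iv) build on the same core matcher.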
Transition to the vgosDb Format
NASA Astrophysics Data System (ADS)
Bolotin, Sergei; Baver, Karen; Gipson, John; Gordon, David; MacMillan, Daniel
2016-12-01
The IVS Working Group 4 developed a new format to store and exchange data obtained from geodetic VLBI observations. The new data format, vgosDb, will replace existing Mark III databases this year. At GSFC we developed utilities that implement the vgosDb format and will be used routinely to convert correlator output to the new data storage format.
dbMDEGA: a database for meta-analysis of differentially expressed genes in autism spectrum disorder.
Zhang, Shuyun; Deng, Libin; Jia, Qiyue; Huang, Shaoting; Gu, Junwang; Zhou, Fankun; Gao, Meng; Sun, Xinyi; Feng, Chang; Fan, Guangqin
2017-11-16
Autism spectrum disorders (ASD) are hereditary, heterogeneous and biologically complex neurodevelopmental disorders. Individual studies on gene expression in ASD cannot provide clear consensus conclusions. Therefore, a systematic review to synthesize the current findings from brain tissues and a search tool to share the meta-analysis results are urgently needed. Here, we conducted a meta-analysis of brain gene expression profiles in the currently reported human ASD expression datasets (with 84 frozen male cortex samples, 17 female cortex samples, 32 cerebellum samples and 4 formalin-fixed samples) and knock-out mouse ASD model expression datasets (with 80 collective brain samples). We then applied the R language and developed an interactive, shared and updated database (dbMDEGA) displaying the results of the meta-analysis of data from ASD studies regarding differentially expressed genes (DEGs) in the brain. This database, dbMDEGA (https://dbmdega.shinyapps.io/dbMDEGA/), is a publicly available web-portal for manual annotation and visualization of DEGs in the brain from data from ASD studies. This database uniquely presents meta-analysis values and homologous forest plots of DEGs in brain tissues. Gene entries are annotated with meta-values, statistical values and forest plots of DEGs in brain samples. This database aims to provide searchable meta-analysis results based on the currently reported brain gene expression datasets of ASD to help detect candidate genes underlying this disorder. This new analytical tool may provide valuable assistance in the discovery of DEGs and the elucidation of the molecular pathogenicity of ASD. This database model may be replicated to study other disorders.
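The per-gene meta-values and forest plots described above rest on pooling each gene's effect sizes across datasets. A standard fixed-effect inverse-variance combination can be sketched as follows; the effect sizes and standard errors are toy numbers, not dbMDEGA results:

```python
# Fixed-effect inverse-variance meta-analysis, a standard way to pool
# a gene's effect sizes across expression datasets. The numbers are
# toy values for illustration, not dbMDEGA data.
import math

def fixed_effect(effects, ses):
    """Pool per-study effect sizes weighted by inverse variance."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Three studies reporting a log2 fold-change for one gene
effects = [0.80, 0.60, 1.00]
ses = [0.20, 0.30, 0.40]
pooled, se = fixed_effect(effects, ses)
print(round(pooled, 3), round(se, 3))  # 0.777 0.154
```

The per-study effects with their intervals, plus the pooled estimate, are exactly the ingredients a forest plot displays for each gene entry.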
ClearedLeavesDB: an online database of cleared plant leaf images
2014-01-01
Background Leaf vein networks are critical to both the structure and function of leaves. A growing body of recent work has linked leaf vein network structure to the physiology, ecology and evolution of land plants. In the process, multiple institutions and individual researchers have assembled collections of cleared leaf specimens in which vascular bundles (veins) are rendered visible. In an effort to facilitate analysis and digitally preserve these specimens, high-resolution images are usually created, either of entire leaves or of magnified leaf subsections. In a few cases, collections of digital images of cleared leaves are available for use online. However, these collections do not share a common platform nor is there a means to digitally archive cleared leaf images held by individual researchers (in addition to those held by institutions). Hence, there is a growing need for a digital archive that enables online viewing, sharing and disseminating of cleared leaf image collections held by both institutions and individual researchers. Description The Cleared Leaf Image Database (ClearedLeavesDB) is an online, web-based resource for a community of researchers to contribute, access and share cleared leaf images. ClearedLeavesDB leverages resources of large-scale, curated collections while enabling the aggregation of small-scale collections within the same online platform. ClearedLeavesDB is built on Drupal, an open source content management platform. It allows plant biologists to store leaf images online with corresponding meta-data, share image collections with a user community and discuss images and collections via a common forum. We provide tools to upload processed images and results to the database via a web services client application that can be downloaded from the database. Conclusions We developed ClearedLeavesDB, a database focusing on cleared leaf images that combines interactions between users and data via an intuitive web interface.
The web interface allows storage of large collections and integrates with leaf image analysis applications via an open application programming interface (API). The open API allows uploading of processed images and other trait data to the database, further enabling distribution and documentation of analyzed data within the community. The initial database is seeded with nearly 19,000 cleared leaf images representing over 40 GB of image data. Extensible storage and growth of the database is ensured by using the data storage resources of the iPlant Discovery Environment. ClearedLeavesDB can be accessed at http://clearedleavesdb.org. PMID:24678985
ClearedLeavesDB: an online database of cleared plant leaf images.
Das, Abhiram; Bucksch, Alexander; Price, Charles A; Weitz, Joshua S
2014-03-28
Gama-Castro, Socorro; Salgado, Heladia; Santos-Zavaleta, Alberto; Ledezma-Tejeida, Daniela; Muñiz-Rascado, Luis; García-Sotelo, Jair Santiago; Alquicira-Hernández, Kevin; Martínez-Flores, Irma; Pannier, Lucia; Castro-Mondragón, Jaime Abraham; Medina-Rivera, Alejandra; Solano-Lira, Hilda; Bonavides-Martínez, César; Pérez-Rueda, Ernesto; Alquicira-Hernández, Shirley; Porrón-Sotelo, Liliana; López-Fuentes, Alejandra; Hernández-Koutoucheva, Anastasia; Del Moral-Chávez, Víctor; Rinaldi, Fabio; Collado-Vides, Julio
2016-01-04
RegulonDB (http://regulondb.ccg.unam.mx) is one of the most useful and important resources on bacterial gene regulation, as it integrates the scattered scientific knowledge of the best-characterized organism, Escherichia coli K-12, in a database that organizes large amounts of data. Its electronic format enables researchers to compare their results with the legacy of previous knowledge and supports bioinformatics tools and model building. Here, we summarize our progress with RegulonDB since our last Nucleic Acids Research publication describing RegulonDB, in 2013. In addition to keeping curation up to date, we report a collection of 232 interactions with small RNAs affecting 192 genes, and the complete repertoire of 189 Elementary Genetic Sensory-Response units (GENSOR units), integrating the signal, regulatory interactions, and metabolic pathways they govern. These additions represent major progress to a higher level of understanding of regulated processes. We have updated the computationally predicted transcription factors, which total 304 (184 with experimental evidence and 120 from computational predictions); we updated our position-weight matrices and have included tools for clustering them in evolutionary families. We describe our semiautomatic strategy to accelerate curation, including datasets from high-throughput experiments, a novel coexpression distance to search for 'neighborhood' genes to known operons and regulons, and computational developments. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
NASA Astrophysics Data System (ADS)
Patel, M. N.; Young, K.; Halling-Brown, M. D.
2018-03-01
The demand for medical images for research is ever increasing owing to the rapid rise in novel machine learning approaches for early detection and diagnosis. The OPTIMAM Medical Image Database (OMI-DB) was created to provide a centralized, fully annotated dataset for research. The database contains both processed and unprocessed images, associated data, annotations and expert-determined ground truths. Since the inception of the database in early 2011, the volume of images and associated data collected has dramatically increased owing to automation of the collection pipeline and inclusion of new sites. Currently, these data are stored at each respective collection site and synced periodically to a central store. This leads to a large data footprint at each site, requiring large physical onsite storage, which is expensive. Here, we propose an update to the OMI-DB collection system, whereby the storage of all the data is automatically transferred to the cloud on collection. This change in the data collection paradigm reduces the reliance on physical servers at each site, allows greater scope for future expansion, removes the need for dedicated backups and improves security. Moreover, as the number of applications to access the data increases rapidly with the maturity of the dataset, cloud technology facilitates faster sharing of data and better auditing of data access. Such updates, although they may sound trivial, require substantial modification to the existing pipeline to ensure data integrity and security compliance. Here, we describe the extensions to the OMI-DB collection pipeline and discuss the relative merits of the new system.
Improved orthologous databases to ease protozoan targets inference.
Kotowski, Nelson; Jardim, Rodrigo; Dávila, Alberto M R
2015-09-29
Homology inference helps to identify similarities, as well as differences, among organisms, providing better insight into how closely related one organism might be to another. In addition, comparative genomics pipelines are widely adopted tools designed using different bioinformatics applications and algorithms. In this article, we propose a methodology to build improved orthologous databases with the potential to aid protozoan target identification, one of the many tasks which benefit from comparative genomics tools. Our analyses are based on OrthoSearch, a comparative genomics pipeline originally designed to infer orthologs through protein-profile comparison, supported by an HMM-based reciprocal best hits approach. Our methodology allows OrthoSearch to confront two orthologous databases and to generate an improved new one, which can later be used to infer potential protozoan targets through a similarity analysis against the human genome. The protein sequences of the Cryptosporidium hominis, Entamoeba histolytica and Leishmania infantum genomes were comparatively analyzed against three orthologous databases: (i) EggNOG KOG, (ii) ProtozoaDB and (iii) KEGG Orthology (KO). This allowed us to create two new orthologous databases, "KO + EggNOG KOG" and "KO + EggNOG KOG + ProtozoaDB", with 16,938 and 27,701 orthologous groups, respectively. These new orthologous databases were used for a regular OrthoSearch run. By confronting the "KO + EggNOG KOG" and "KO + EggNOG KOG + ProtozoaDB" databases with the protozoan species, we detected the following totals of orthologous groups and coverage (the ratio of inferred orthologous groups to the species' total number of proteins): Cryptosporidium hominis: 1,821 (11 %) and 3,254 (12 %); Entamoeba histolytica: 2,245 (13 %) and 5,305 (19 %); Leishmania infantum: 2,702 (16 %) and 4,760 (17 %).
Using our HMM-based methodology and the largest created orthologous database, it was possible to infer 13 orthologous groups which represent potential protozoan targets; these were found because of our distant homology approach. We also provide the number of species-specific, pair-to-pair and core groups from such analyses, depicted in Venn diagrams. The orthologous databases generated by our HMM-based methodology provide a broader dataset, with larger amounts of orthologous groups when compared to the original databases used as input. Those may be used for several homology inference analyses, annotation tasks and protozoan targets identification.
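The reciprocal best hits idea underlying this kind of orthology inference can be sketched roughly as follows. The score tables and protein identifiers below are invented for illustration; the actual pipeline works on HMM protein-profile comparisons, not toy score dictionaries.

```python
def best_hit(scores):
    """Map each query to its highest-scoring subject."""
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {q: s for q, (s, _) in best.items()}

def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Keep only pairs that are each other's best hit in both directions."""
    best_ab = best_hit(a_vs_b)
    best_ba = best_hit(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)

# Toy score tables: (query, subject) -> similarity score.
a_vs_b = {("p1", "q1"): 250.0, ("p1", "q2"): 90.0, ("p2", "q2"): 180.0}
b_vs_a = {("q1", "p1"): 240.0, ("q2", "p1"): 200.0}
orthologs = reciprocal_best_hits(a_vs_b, b_vs_a)
```

Here q2's best hit is p1, but p1's best hit is q1, so only (p1, q1) survives the reciprocity filter; one-directional best hits are discarded as likely paralogs.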
SPIKE – a database, visualization and analysis tool of cellular signaling pathways
Elkon, Ran; Vesterman, Rita; Amit, Nira; Ulitsky, Igor; Zohar, Idan; Weisz, Mali; Mass, Gilad; Orlev, Nir; Sternberg, Giora; Blekhman, Ran; Assa, Jackie; Shiloh, Yosef; Shamir, Ron
2008-01-01
Background Biological signaling pathways that govern cellular physiology form an intricate web of tightly regulated interlocking processes. Data on these regulatory networks are accumulating at an unprecedented pace. The assimilation, visualization and interpretation of these data have become a major challenge in biological research, and once met, will greatly boost our ability to understand cell functioning on a systems level. Results To cope with this challenge, we are developing the SPIKE knowledge-base of signaling pathways. SPIKE contains three main software components: 1) A database (DB) of biological signaling pathways. Carefully curated information from the literature and data from large public sources constitute distinct tiers of the DB. 2) A visualization package that allows interactive graphic representations of regulatory interactions stored in the DB and superposition of functional genomic and proteomic data on the maps. 3) An algorithmic inference engine that analyzes the networks for novel functional interplays between network components. SPIKE is designed and implemented as a community tool and therefore provides a user-friendly interface that allows registered users to upload data to SPIKE DB. Our vision is that the DB will be populated by a distributed and highly collaborative effort undertaken by multiple groups in the research community, where each group contributes data in its field of expertise. Conclusion The integrated capabilities of SPIKE make it a powerful platform for the analysis of signaling networks and the integration of knowledge on such networks with omics data. PMID:18289391
The 2015 Nucleic Acids Research Database Issue and molecular biology database collection.
Galperin, Michael Y; Rigden, Daniel J; Fernández-Suárez, Xosé M
2015-01-01
The 2015 Nucleic Acids Research Database Issue contains 172 papers that include descriptions of 56 new molecular biology databases and updates on 115 databases whose descriptions have previously been published in NAR or other journals. Following the classification introduced last year to simplify navigation of the entire issue, these articles are divided into eight subject categories. This year's highlights include RNAcentral, an international community portal to various databases on noncoding RNA; ValidatorDB, a validation database for protein structures and their ligands; SASBDB, a primary repository for small-angle scattering data of various macromolecular complexes; MoonProt, a database of 'moonlighting' proteins; and two new databases of protein-protein and other macromolecular complexes, ComPPI and the Complex Portal. This issue also includes an unusually high number of cancer-related databases and other databases dedicated to the genomic basis of disease and potential drugs and drug targets. The size of the NAR online Molecular Biology Database Collection, http://www.oxfordjournals.org/nar/database/a/, remained approximately the same, following the addition of 74 new resources and removal of 77 obsolete web sites. The entire Database Issue is freely available online on the Nucleic Acids Research web site (http://nar.oxfordjournals.org/). Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by (a) US Government employee(s) and is in the public domain in the US.
A multi-center ring trial for the identification of anaerobic bacteria using MALDI-TOF MS.
Veloo, A C M; Jean-Pierre, H; Justesen, U S; Morris, T; Urban, E; Wybo, I; Shah, H N; Friedrich, A W; Morris, T; Shah, H N; Jean-Pierre, H; Justesen, U S; Nagy, E; Urban, E; Kostrzewa, M; Veloo, A; Friedrich, A W
2017-12-01
Inter-laboratory reproducibility of Matrix-Assisted Laser Desorption/Ionization Time-of-Flight Mass Spectrometry (MALDI-TOF MS) identification of anaerobic bacteria has not been shown before. Therefore, ten anonymized anaerobic strains were sent to seven participating laboratories, an initiative of the European Network for the Rapid Identification of Anaerobes (ENRIA). On arrival, the strains were cultured and identified using MALDI-TOF MS. The derived spectra were compared with two different Biotyper MALDI-TOF MS databases, db5627 and db6903. The results obtained using db5627 show appreciable variation between the different laboratories. However, when a more optimized database is used, the variation is less pronounced. In this study we show that an optimized database not only results in a higher number of strains that can be identified using MALDI-TOF MS, but also corrects for differences in performance between laboratories. Copyright © 2017 Elsevier Ltd. All rights reserved.
Planned and ongoing projects (POP) database: development and results.
Wild, Claudia; Erdös, Judit; Warmuth, Marisa; Hinterreiter, Gerda; Krämer, Peter; Chalon, Patrice
2014-11-01
The aim of this study was to present the development, structure and results of a database on planned and ongoing health technology assessment (HTA) projects (POP Database) in Europe. The POP Database (POP DB) was set up in an iterative process from a basic Excel sheet to a multifunctional electronic online database. The functionalities, such as the search terminology, the procedures to fill and update the database, the access rules to enter the database, as well as the maintenance roles, were defined in a multistep participatory feedback loop with EUnetHTA Partners. The POP Database has become an online database that hosts not only the titles and MeSH categorizations, but also basic information on status and contact details for the listed projects of EUnetHTA Partners. Currently, it stores more than 1,200 planned, ongoing or recently published projects of forty-three EUnetHTA Partners from twenty-four countries. Because the POP Database aims to facilitate collaboration, it also provides a matching system to assist in identifying similar projects. Overall, more than 10 percent of the projects in the database are identical both in terms of pathology (indication or disease) and technology (drug, medical device, intervention). In addition, approximately 30 percent of the projects are similar, meaning that they have at least some overlap in content. Although the POP DB has been successful in securing regular updates from most national HTA agencies within EUnetHTA, little is known about its actual effects on collaborations in Europe. Moreover, many non-nationally nominated HTA-producing agencies neither have access to the POP DB nor can share their projects.
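The identical/similar distinction the abstract draws could be implemented along these lines. This is a hypothetical sketch, not the POP DB's actual matching system; the `pathology` and `technology` field names are invented for illustration.

```python
def match_projects(a, b):
    """Classify a pair of HTA project records by tag overlap."""
    same_path = bool(set(a["pathology"]) & set(b["pathology"]))
    same_tech = bool(set(a["technology"]) & set(b["technology"]))
    if same_path and same_tech:
        return "identical"   # same indication and same technology
    if same_path or same_tech:
        return "similar"     # overlap in only one dimension
    return "unrelated"

# Toy project records.
p1 = {"pathology": ["melanoma"], "technology": ["immunotherapy"]}
p2 = {"pathology": ["melanoma"], "technology": ["surgery"]}
p3 = {"pathology": ["asthma"], "technology": ["inhaler"]}
```

A real matching system would normalize terms (e.g. via MeSH) before comparing, but the three-way classification is the same.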
miRPathDB: a new dictionary on microRNAs and target pathways.
Backes, Christina; Kehl, Tim; Stöckel, Daniel; Fehlmann, Tobias; Schneider, Lara; Meese, Eckart; Lenhof, Hans-Peter; Keller, Andreas
2017-01-04
In the last decade, miRNAs and their regulatory mechanisms have been intensively studied and many tools for the analysis of miRNAs and their targets have been developed. We previously presented a dictionary on single miRNAs and their putative target pathways. Since then, the number of miRNAs has tripled and the knowledge on miRNAs and targets has grown substantially. This, along with changes in pathway resources such as KEGG, leads to an improved understanding of miRNAs, their target genes and related pathways. Here, we introduce the miRNA Pathway Dictionary Database (miRPathDB), freely accessible at https://mpd.bioinf.uni-sb.de/. With the database we aim to complement available target pathway web-servers by providing researchers easy access to the information on which pathways are regulated by a miRNA, which miRNAs target a pathway and how specific these regulations are. The database contains a large number of miRNAs (2595 human miRNAs), different miRNA target sets (14 773 experimentally validated target genes as well as 19 281 predicted target genes) and a broad selection of functional biochemical categories (KEGG, WikiPathways, BioCarta, SMPDB, PID and Reactome pathways, functional categories from gene ontology (GO), protein families from Pfam and chromosomal locations, totaling 12 875 categories). In addition to Homo sapiens, Mus musculus data are also stored and can be compared to human target pathways. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
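How specific a miRNA-pathway regulation is can be quantified with an over-representation test on the overlap between a miRNA's target genes and a pathway's members. The abstract does not name the statistic, so the hypergeometric upper tail below is a common stand-in rather than miRPathDB's documented method.

```python
from math import comb

def hypergeom_pvalue(universe, in_pathway, targets, overlap):
    """P(X >= overlap) when `targets` genes are drawn without replacement
    from `universe` genes, of which `in_pathway` belong to the pathway."""
    total = comb(universe, targets)
    return sum(
        comb(in_pathway, k) * comb(universe - in_pathway, targets - k)
        for k in range(overlap, min(in_pathway, targets) + 1)
    ) / total
```

A small p-value means the miRNA hits the pathway more often than chance would predict, i.e. the regulation is specific.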
An alternative database approach for management of SNOMED CT and improved patient data queries.
Campbell, W Scott; Pedersen, Jay; McClay, James C; Rao, Praveen; Bastola, Dhundy; Campbell, James R
2015-10-01
SNOMED CT is the international lingua franca of terminologies for human health. Based on Description Logics (DL), the terminology enables data queries that incorporate inferred relationships between data elements as well as those that are explicitly stated. However, the ontologic and polyhierarchical nature of the SNOMED CT concept model makes it difficult to implement in its entirety within electronic health record systems that largely employ object oriented or relational database architectures. The result is reduced data richness, limited query capability and increased systems overhead. The hypothesis of this research was that a graph database (graph DB) architecture using SNOMED CT as the basis for the data model, and subsequently modeling patient data upon the semantic core of SNOMED CT, could exploit the full value of the terminology to enrich and support advanced querying of patient data sets. The hypothesis was tested by instantiating a graph DB with the fully classified SNOMED CT concept model. The graph DB instance was tested for integrity by calculating the transitive closure table for the SNOMED CT hierarchy and comparing the results with transitive closure tables created using current, validated methods. The graph DB was then populated with 461,171 anonymized patient record fragments and over 2.1 million associated SNOMED CT clinical findings. Queries, including concept negation and disjunction, were then run against the graph database and an enterprise Oracle relational database (RDBMS) of the same patient data sets. The graph DB was then populated with laboratory data encoded using LOINC, as well as medication data encoded with RxNorm, and complex queries were performed using LOINC, RxNorm and SNOMED CT to identify uniquely described patient populations. A graph database instance was successfully created for two international releases of SNOMED CT and two US SNOMED CT editions.
Transitive closure tables and descriptive statistics generated using the graph database were identical to those produced by validated methods. Patient queries produced patient counts identical to the Oracle RDBMS in comparable times. Database queries involving defining attributes of SNOMED CT concepts were possible with the graph DB. The same queries could not be directly performed with the Oracle RDBMS representation of the patient data and required the creation and use of external terminology services. Further, queries of undefined depth were successful in identifying unknown relationships between patient cohorts. The results of this study supported the hypothesis that a patient database built upon and around the semantic model of SNOMED CT was possible. The model supported queries that leveraged all aspects of the SNOMED CT logical model to produce clinically relevant query results. Logical disjunction and negation queries were possible using the data model, as well as queries that extended beyond the structural IS_A hierarchy of SNOMED CT to employ defining attribute-values of SNOMED CT concepts as search parameters. As medical terminologies such as SNOMED CT continue to expand, they will become more complex and model consistency will be more difficult to assure. Simultaneously, consumers of data will increasingly demand improvements to query functionality to accommodate additional granularity of clinical concepts without sacrificing speed. This new line of research provides an alternative approach to instantiating and querying patient data represented using advanced computable clinical terminologies. Copyright © 2015 Elsevier Inc. All rights reserved.
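The transitive closure integrity check described above amounts to a memoized walk over the IS_A table, collecting every ancestor of every concept. The concept names below are toy data, not actual SNOMED CT identifiers.

```python
def transitive_closure(is_a):
    """is_a maps each concept to its set of direct IS_A parents; returns
    a map from every concept to the set of all its ancestors."""
    closure = {}

    def ancestors(concept):
        if concept not in closure:
            closure[concept] = set()
            for parent in is_a.get(concept, ()):
                # An ancestor set is the direct parent plus its ancestors.
                closure[concept] |= {parent} | ancestors(parent)
        return closure[concept]

    for concept in is_a:
        ancestors(concept)
    return closure

# Toy IS_A hierarchy.
is_a = {"fracture of femur": {"fracture"},
        "fracture": {"injury"},
        "injury": {"clinical finding"}}
closure = transitive_closure(is_a)
```

Comparing such a table row-for-row against one produced by an established tool is exactly the kind of validation the authors report; SNOMED CT's IS_A graph is acyclic, which the memoization above relies on.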
Fernández-Suárez, Xosé M; Rigden, Daniel J; Galperin, Michael Y
2014-01-01
The 2014 Nucleic Acids Research Database Issue includes descriptions of 58 new molecular biology databases and recent updates to 123 databases previously featured in NAR or other journals. For convenience, the issue is now divided into eight sections that reflect major subject categories. Among the highlights of this issue are six databases of transcription factor binding sites in various organisms and updates on such popular databases as CAZy, Database of Genomic Variants (DGV), dbGaP, DrugBank, KEGG, miRBase, Pfam, Reactome, SEED, TCDB and UniProt. There is a strong block of structural databases, which includes, among others, the new RNA Bricks database, updates on PDBe, PDBsum, ArchDB, Gene3D, ModBase, Nucleic Acid Database and the recently revived iPfam database. An update on the NCBI's MMDB describes VAST+, an improved tool for protein structure comparison. Two articles highlight the development of the Structural Classification of Proteins (SCOP) database: one describes SCOPe, which automates assignment of new structures to the existing SCOP hierarchy; the other describes the first version of SCOP2, with its more flexible approach to classifying protein structures. This issue also includes a collection of articles on bacterial taxonomy and metagenomics, with updates on the List of Prokaryotic Names with Standing in Nomenclature (LPSN), the Ribosomal Database Project (RDP), the Silva/LTP project and several new metagenomics resources. The NAR online Molecular Biology Database Collection, http://www.oxfordjournals.org/nar/database/c/, has been expanded to 1552 databases. The entire Database Issue is freely available online on the Nucleic Acids Research website (http://nar.oxfordjournals.org/).
From sequence to enzyme mechanism using multi-label machine learning.
De Ferrari, Luna; Mitchell, John B O
2014-05-19
In this work we predict enzyme function at the level of chemical mechanism, providing a finer granularity of annotation than traditional Enzyme Commission (EC) classes. Hence we can predict not only whether a putative enzyme in a newly sequenced organism has the potential to perform a certain reaction, but how the reaction is performed: using which cofactors and with susceptibility to which drugs or inhibitors, details with important consequences for drug and enzyme design. Work that predicts enzyme catalytic activity based on 3D protein structure features limits the prediction of mechanism to proteins that already have either a solved structure or a close relative suitable for homology modelling. In this study, we evaluate whether sequence identity, InterPro or Catalytic Site Atlas sequence signatures provide enough information for bulk prediction of enzyme mechanism. By splitting MACiE (Mechanism, Annotation and Classification in Enzymes database) mechanism labels into a finer granularity, which includes the role of the protein chain in the overall enzyme complex, the method can predict at 96% accuracy (and 96% micro-averaged precision, 99.9% macro-averaged recall) the MACiE mechanism definitions of 248 proteins available in the MACiE, EzCatDb (Database of Enzyme Catalytic Mechanisms) and SFLD (Structure Function Linkage Database) databases, using an off-the-shelf K-Nearest Neighbours multi-label algorithm. We find that InterPro signatures are critical for accurate prediction of enzyme mechanism. We also find that incorporating Catalytic Site Atlas attributes does not seem to provide additional accuracy. The software code (ml2db), data and results are available online at http://sourceforge.net/projects/ml2db/ and as supplementary files.
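A bare-bones version of the multi-label K-Nearest Neighbours step might look like this. The binary feature vectors stand in for InterPro signature presence/absence and the label sets for mechanism terms; all data are fabricated, and the paper's actual runs used an off-the-shelf implementation rather than hand-rolled code.

```python
def knn_multilabel(train, x, k=3):
    """train: list of (feature_vector, label_set) pairs. Predict the
    labels carried by a strict majority of the k nearest neighbours."""
    sq_dist = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b))
    nearest = sorted(train, key=lambda t: sq_dist(t[0], x))[:k]
    votes = {}
    for _, labels in nearest:
        for label in labels:
            votes[label] = votes.get(label, 0) + 1
    # A label is predicted when more than half of the neighbours carry it.
    return {label for label, n in votes.items() if 2 * n > k}

# Toy training data: signature indicator vectors -> mechanism labels.
train = [((1, 0, 0), {"acid-base"}),
         ((1, 1, 0), {"acid-base"}),
         ((0, 0, 1), {"radical"})]
```

Because each protein may carry several mechanism labels, the prediction is a set rather than a single class, which is what distinguishes the multi-label setting from ordinary k-NN classification.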
PlantDB – a versatile database for managing plant research
Exner, Vivien; Hirsch-Hoffmann, Matthias; Gruissem, Wilhelm; Hennig, Lars
2008-01-01
Background Research in plant science laboratories often involves usage of many different species, cultivars, ecotypes, mutants, alleles or transgenic lines. This creates a great challenge to keep track of the identity of experimental plants and stored samples or seeds. Results Here, we describe PlantDB – a Microsoft® Office Access database – with a user-friendly front-end for managing information relevant for experimental plants. PlantDB can hold information about plants of different species, cultivars or genetic composition. Introduction of a concise identifier system allows easy generation of pedigree trees. In addition, all information about any experimental plant – from growth conditions and dates, through extracted samples such as RNA, to files containing images of the plants – can be linked unequivocally. Conclusion We have been using PlantDB for several years in our laboratory and found that it greatly facilitates access to relevant information. PMID:18182106
Schoof, Heiko; Zaccaria, Paolo; Gundlach, Heidrun; Lemcke, Kai; Rudd, Stephen; Kolesov, Grigory; Arnold, Roland; Mewes, H. W.; Mayer, Klaus F. X.
2002-01-01
Arabidopsis thaliana is the first plant for which the complete genome has been sequenced and published. Annotation of complex eukaryotic genomes requires more than the assignment of genetic elements to the sequence. Besides completing the list of genes, we need to discover their cellular roles, their regulation and their interactions in order to understand the workings of the whole plant. The MIPS Arabidopsis thaliana Database (MAtDB; http://mips.gsf.de/proj/thal/db) started out as a repository for genome sequence data in the European Scientists Sequencing Arabidopsis (ESSA) project and the Arabidopsis Genome Initiative. Our aim is to transform MAtDB into an integrated biological knowledge resource by integrating diverse data, tools, query and visualization capabilities and by creating a comprehensive resource for Arabidopsis as a reference model for other species, including crop plants. PMID:11752263
ProteinWorldDB: querying radical pairwise alignments among protein sets from complete genomes.
Otto, Thomas Dan; Catanho, Marcos; Tristão, Cristian; Bezerra, Márcia; Fernandes, Renan Mathias; Elias, Guilherme Steinberger; Scaglia, Alexandre Capeletto; Bovermann, Bill; Berstis, Viktors; Lifschitz, Sergio; de Miranda, Antonio Basílio; Degrave, Wim
2010-03-01
Many analyses in modern biological research are based on comparisons between biological sequences, resulting in functional, evolutionary and structural inferences. When large numbers of sequences are compared, heuristics are often used resulting in a certain lack of accuracy. In order to improve and validate results of such comparisons, we have performed radical all-against-all comparisons of 4 million protein sequences belonging to the RefSeq database, using an implementation of the Smith-Waterman algorithm. This extremely intensive computational approach was made possible with the help of World Community Grid, through the Genome Comparison Project. The resulting database, ProteinWorldDB, which contains coordinates of pairwise protein alignments and their respective scores, is now made available. Users can download, compare and analyze the results, filtered by genomes, protein functions or clusters. ProteinWorldDB is integrated with annotations derived from Swiss-Prot, Pfam, KEGG, NCBI Taxonomy database and gene ontology. The database is a unique and valuable asset, representing a major effort to create a reliable and consistent dataset of cross-comparisons of the whole protein content encoded in hundreds of completely sequenced genomes using a rigorous dynamic programming approach. The database can be accessed through http://proteinworlddb.org
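The core of a rigorous all-against-all comparison is the Smith-Waterman local alignment recurrence. The sketch below uses a simple match/mismatch scheme and a linear gap penalty rather than the substitution matrices a production run such as the Genome Comparison Project would use, but the dynamic programming is the same.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b."""
    h = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            h[i][j] = max(0,                     # local alignment may restart
                          h[i - 1][j - 1] + s,   # match or mismatch
                          h[i - 1][j] + gap,     # gap in b
                          h[i][j - 1] + gap)     # gap in a
            best = max(best, h[i][j])
    return best
```

The zero floor is what makes the alignment local: a poorly matching prefix never drags the score of a good internal region below what it would be on its own, which is why exhaustive Smith-Waterman avoids the accuracy loss of heuristic methods at a much higher computational cost.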
Database resources of the National Center for Biotechnology Information.
Sayers, Eric W; Barrett, Tanya; Benson, Dennis A; Bolton, Evan; Bryant, Stephen H; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M; DiCuccio, Michael; Federhen, Scott; Feolo, Michael; Fingerman, Ian M; Geer, Lewis Y; Helmberg, Wolfgang; Kapustin, Yuri; Landsman, David; Lipman, David J; Lu, Zhiyong; Madden, Thomas L; Madej, Tom; Maglott, Donna R; Marchler-Bauer, Aron; Miller, Vadim; Mizrachi, Ilene; Ostell, James; Panchenko, Anna; Phan, Lon; Pruitt, Kim D; Schuler, Gregory D; Sequeira, Edwin; Sherry, Stephen T; Shumway, Martin; Sirotkin, Karl; Slotta, Douglas; Souvorov, Alexandre; Starchenko, Grigory; Tatusova, Tatiana A; Wagner, Lukas; Wang, Yanli; Wilbur, W John; Yaschenko, Eugene; Ye, Jian
2011-01-01
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Electronic PCR, OrfFinder, Splign, ProSplign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), IBIS, Biosystems, Peptidome, OMSSA, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
Automated Euler and Navier-Stokes Database Generation for a Glide-Back Booster
NASA Technical Reports Server (NTRS)
Chaderjian, Neal M.; Rogers, Stuart E.; Aftosmis, Mike J.; Pandya, Shishir A.; Ahmad, Jasim U.; Tejnil, Edward
2004-01-01
The past two decades have seen a sustained increase in the use of high fidelity Computational Fluid Dynamics (CFD) in basic research, aircraft design, and the analysis of post-design issues. As the fidelity of a CFD method increases, the number of cases that can be readily and affordably computed greatly diminishes. However, computer speeds now exceed 2 GHz, hundreds of processors are currently available and more affordable, and advances in parallel CFD algorithms scale more readily with large numbers of processors. All of these factors make it feasible to compute thousands of high fidelity cases. However, there still remains the overwhelming task of monitoring the solution process. This paper presents an approach to automate the CFD solution process. A new software tool, AeroDB, is used to compute thousands of Euler and Navier-Stokes solutions for a 2nd generation glide-back booster in one week. The solution process exploits a common job-submission grid environment, the NASA Information Power Grid (IPG), using 13 computers located at 4 different geographical sites. Process automation and web-based access to a MySQL database greatly reduce the user workload, removing much of the tedium and tendency for user input errors. The AeroDB framework is shown. The user submits/deletes jobs, monitors AeroDB's progress, and retrieves data and plots via a web portal. Once a job is in the database, a job launcher uses an IPG resource broker to decide which computers are best suited to run the job. Job/code requirements, the number of CPUs free on a remote system, and queue lengths are some of the parameters the broker takes into account. The Globus software provides secure services for user authentication, remote shell execution, and secure file transfers over an open network. AeroDB automatically decides when a job is completed.
Currently, the Cart3D unstructured flow solver is used for the Euler equations, and the Overflow structured overset flow solver is used for the Navier-Stokes equations. Other codes can be readily included in the AeroDB framework.
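The broker logic the abstract describes (job requirements, free CPUs, queue lengths) can be sketched as follows. This is an illustrative stand-in, not the actual IPG resource broker; the site names and the scoring rule are invented:

```python
# Illustrative stand-in for the IPG resource broker AeroDB relies on: choose
# the site that can satisfy the job's CPU requirement and has the shortest
# queue. Site names and the scoring rule are invented for this sketch.

def pick_site(job_cpus, sites):
    """Return the name of the best-suited site, or None if none qualifies."""
    eligible = [s for s in sites if s["free_cpus"] >= job_cpus]
    if not eligible:
        return None
    # Prefer short queues; break ties in favor of more free CPUs.
    best = min(eligible, key=lambda s: (s["queue_len"], -s["free_cpus"]))
    return best["name"]

sites = [
    {"name": "site-a", "free_cpus": 64, "queue_len": 5},
    {"name": "site-b", "free_cpus": 16, "queue_len": 1},
    {"name": "site-c", "free_cpus": 8, "queue_len": 0},
]
print(pick_site(12, sites))  # site-b: enough free CPUs, shortest eligible queue
```

A real broker would also weigh data locality and authentication constraints; the tie-break rule here is only one plausible choice.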
Postel, Alexander; Schmeiser, Stefanie; Zimmermann, Bernd; Becher, Paul
2016-01-01
Molecular epidemiology has become an indispensable tool in the diagnosis of diseases and in tracing the infection routes of pathogens. Due to advances in conventional sequencing and the development of high throughput technologies, the field of sequence determination is in the process of being revolutionized. Platforms for sharing sequence information and providing standardized tools for phylogenetic analyses are becoming increasingly important. The database (DB) of the European Union (EU) and World Organisation for Animal Health (OIE) Reference Laboratory for classical swine fever offers one of the world’s largest semi-public virus-specific sequence collections combined with a module for phylogenetic analysis. The classical swine fever (CSF) DB (CSF-DB) became a valuable tool for supporting diagnosis and epidemiological investigations of this highly contagious disease in pigs with high socio-economic impacts worldwide. The DB has been re-designed and now allows for the storage and analysis of traditionally used, well established genomic regions and of larger genomic regions including complete viral genomes. We present an application example for the analysis of highly similar viral sequences obtained in an endemic disease situation and introduce the new geographic “CSF Maps” tool. The concept of this standardized and easy-to-use DB with an integrated genetic typing module is suited to serve as a blueprint for similar platforms for other human or animal viruses. PMID:27827988
OperomeDB: A Database of Condition-Specific Transcription Units in Prokaryotic Genomes.
Chetal, Kashish; Janga, Sarath Chandra
2015-01-01
Background. In prokaryotic organisms, a substantial fraction of adjacent genes are organized into operons: codirectionally organized genes sharing a common promoter and terminator. Although several available operon databases provide information with varying levels of reliability, very few resources provide experimentally supported results. Therefore, we believe that the biological community could benefit from a new operon prediction database with operons predicted using next-generation RNA-seq datasets. Description. We present operomeDB, a database which provides an ensemble of all the predicted operons for bacterial genomes using available RNA-sequencing datasets across a wide range of experimental conditions. Although several studies have recently confirmed that prokaryotic operon structure is dynamic, with significant alterations across environmental and experimental conditions, there are no comprehensive databases for studying such variations across prokaryotic transcriptomes. Currently our database contains nine bacterial organisms and 168 transcriptomes for which we predicted operons. The user interface is simple and easy to use for visualization, downloading, and querying of data. In addition, because of its ability to load custom datasets, users can also compare their datasets with publicly available transcriptomic data of an organism. Conclusion. OperomeDB should not only aid experimental groups working on transcriptome analysis of specific organisms but also enable studies related to computational and comparative operomics.
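As a rough illustration of the kind of evidence such RNA-seq-based predictions rest on (this is not operomeDB's actual algorithm), adjacent same-strand genes can be merged into one transcription unit when read coverage stays high across the intergenic gap. The coverage threshold and data layout below are assumptions:

```python
# Toy operon caller (not operomeDB's actual method): merge adjacent
# same-strand genes whenever RNA-seq coverage across the intergenic gap stays
# above a threshold, suggesting they are transcribed as one unit.

def call_operons(genes, coverage, min_cov=5):
    """genes: (name, strand, start, end) tuples sorted by start position.
    coverage: dict mapping position -> read depth (absent positions are 0).
    Returns operons as lists of gene names."""
    operons, current = [], [genes[0]]
    for prev, nxt in zip(genes, genes[1:]):
        gap = range(prev[3] + 1, nxt[2])
        linked = prev[1] == nxt[1] and all(
            coverage.get(p, 0) >= min_cov for p in gap
        )
        if linked:
            current.append(nxt)
        else:
            operons.append([g[0] for g in current])
            current = [nxt]
    operons.append([g[0] for g in current])
    return operons

genes = [("geneA", "+", 1, 10), ("geneB", "+", 13, 20), ("geneC", "+", 30, 40)]
coverage = {11: 8, 12: 9}  # gap A-B is covered; gap B-C has no reads
print(call_operons(genes, coverage))  # [['geneA', 'geneB'], ['geneC']]
```

Condition-specific operon structure then falls out naturally: the same gene list with a different coverage dict can yield different transcription units.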
Pandey, Ram Vinay; Kofler, Robert; Orozco-terWengel, Pablo; Nolte, Viola; Schlötterer, Christian
2011-03-02
The enormous potential of natural variation for the functional characterization of genes has long been neglected; only recently have functional geneticists begun to account for natural variation in their analyses. With the new sequencing technologies it has become feasible to collect sequence information for multiple individuals on a genomic scale. In particular, sequencing pooled DNA samples has been shown to provide a cost-effective approach for characterizing variation in natural populations. While a range of software tools have been developed for mapping these reads onto a reference genome and extracting SNPs, linking this information to population genetic estimators and functional information still poses a major challenge to many researchers. We developed PoPoolation DB, a user-friendly integrated database. PoPoolation DB links variation in natural populations with functional information, allowing a wide range of researchers to take advantage of population genetic data. PoPoolation DB provides the user with population genetic parameters (Watterson's θ or Tajima's π), Tajima's D, SNPs, allele frequencies and indels in regions of interest. The database can be queried by gene name, chromosomal position, or a user-provided query sequence or GTF file. We anticipate that PoPoolation DB will be a highly versatile tool for functional geneticists as well as evolutionary biologists. PoPoolation DB, available at http://www.popoolation.at/pgt, provides an integrated platform for researchers to investigate natural polymorphism and associated functional annotations from UCSC and Flybase genome browsers, population genetic estimators and RNA-seq information.
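The two diversity estimators named above have simple textbook forms, sketched below. PoPoolation's real implementation applies pooled-sequencing corrections that are omitted here for brevity:

```python
# Textbook forms of the diversity estimators reported by PoPoolation DB;
# the pooled-sequencing bias corrections of the real tool are omitted.

def wattersons_theta(num_segregating, n, length):
    """theta_W per site: S / (a_n * L), where a_n = sum of 1/i for i < n."""
    a_n = sum(1.0 / i for i in range(1, n))
    return num_segregating / (a_n * length)

def tajimas_pi(allele_freqs, n, length):
    """pi per site: average pairwise diversity with small-sample correction,
    given the allele frequency p at each segregating site."""
    total = sum(2.0 * p * (1.0 - p) * n / (n - 1) for p in allele_freqs)
    return total / length

# 12 SNPs in a 1 kb window sampled from 10 chromosomes (invented numbers):
theta = wattersons_theta(num_segregating=12, n=10, length=1000)
pi = tajimas_pi([0.5, 0.1, 0.3], n=10, length=1000)
```

Tajima's D, also reported by the database, is essentially a scaled difference between these two estimates, so an excess of rare alleles (π below θ_W) drives D negative.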
A mobile trauma database with charge capture.
Moulton, Steve; Myung, Dan; Chary, Aron; Chen, Joshua; Agarwal, Suresh; Emhoff, Tim; Burke, Peter; Hirsch, Erwin
2005-11-01
Charge capture plays an important role in every surgical practice. We have developed and merged a custom mobile database (DB) system with our trauma registry (TRACS) to better understand our billing methods, revenue generators, and areas for improved revenue capture. The mobile database runs on handheld devices using the Windows Compact Edition platform. The front end was written in C# and the back end is SQL. The mobile database operates as a thick client; it includes active and inactive patient lists, billing screens, hot pick lists, and Current Procedural Terminology and International Classification of Diseases, Ninth Revision code sets. Microsoft Internet Information Server provides secure data transaction services between the back ends stored on each device. Traditional, handwritten billing information for three of five adult trauma surgeons was averaged over a 5-month period. Electronic billing information was then collected over a 3-month period using handheld devices and the subject software application. One surgeon used the software for all 3 months, and two surgeons used it for the latter 2 months of the electronic data collection period. This electronic billing information was combined with TRACS data to determine the clinical characteristics of the trauma patients who were and were not captured using the mobile database. Total charges increased by 135%, 148%, and 228% for each of the three trauma surgeons who used the mobile DB application. The majority of additional charges were for evaluation and management services. Patients who were captured and billed at the point of care using the mobile DB had higher Injury Severity Scores, were more likely to undergo an operative procedure, and had longer lengths of stay compared with those who were not captured. Total charges more than doubled using a mobile database to bill at the point of care.
A subsequent comparison of TRACS data with billing information revealed a large amount of uncaptured patient revenue. Greater familiarity and broader use of mobile database technology hold the potential for even greater revenue capture.
MultitaskProtDB-II: an update of a database of multitasking/moonlighting proteins
Franco-Serrano, Luís; Hernández, Sergio; Calvo, Alejandra; Severi, María A; Ferragut, Gabriela; Pérez-Pons, Josep Antoni; Piñol, Jaume; Pich, Òscar; Mozo-Villarias, Ángel; Amela, Isaac
2018-01-01
Multitasking, or moonlighting, is the capability of some proteins to execute two or more biological functions. MultitaskProtDB-II is an updated database of multifunctional proteins. In the previous version, the information contained was: NCBI and UniProt accession numbers, canonical and additional biological functions, organism, monomeric/oligomeric states, PDB codes and bibliographic references. In the present update, the number of entries has been increased from 288 to 694 moonlighting proteins. MultitaskProtDB-II is continually being curated and updated. The new database also contains the following information: GO descriptors for the canonical and moonlighting functions, three-dimensional structure (for those proteins lacking a PDB structure, a model was made using I-TASSER and Phyre), the involvement of the proteins in human diseases (78% of human moonlighting proteins) and whether the protein is a target of a current drug (48% of human moonlighting proteins). These numbers highlight the importance of these proteins for the analysis and explanation of human diseases and for target-directed drug design. Moreover, 25% of the proteins in the database are involved in the virulence of pathogenic microorganisms, largely in the mechanism of adhesion to the host. This highlights their importance for understanding the mechanism of microorganism infection and for vaccine design. MultitaskProtDB-II is available at http://wallace.uab.es/multitaskII. PMID:29136215
NASA Astrophysics Data System (ADS)
Sakano, Toshikazu; Furukawa, Isao; Okumura, Akira; Yamaguchi, Takahiro; Fujii, Tetsuro; Ono, Sadayasu; Suzuki, Junji; Matsuya, Shoji; Ishihara, Teruo
2001-08-01
The widespread adoption of digital technology in the medical field has led to demand for a high-quality, high-speed, and user-friendly digital image presentation system for daily medical conferences. To fulfill this demand, we developed a presentation system for radiological and pathological images. It is composed of a super-high-definition (SHD) imaging system, a radiological image database (R-DB), a pathological image database (P-DB), and the network interconnecting these three. The R-DB consists of a 270 GB RAID, a database server workstation, and a film digitizer. The P-DB includes an optical microscope, a four-million-pixel digital camera, a 90 GB RAID, and a database server workstation. A 100 Mbps Ethernet LAN interconnects all the sub-systems. Web-based system operation software was developed for easy operation. We installed the whole system in NTT East Kanto Hospital to evaluate it in the weekly case conferences. The SHD system could display digital full-color images of 2048 x 2048 pixels on a 28-inch CRT monitor. The doctors evaluated the image quality and size, and found them applicable to actual medical diagnosis. They also appreciated the short image switching time, which contributed to smooth presentation. Thus, we confirmed that the system's characteristics met the requirements.
footprintDB: a database of transcription factors with annotated cis elements and binding interfaces.
Sebastian, Alvaro; Contreras-Moreira, Bruno
2014-01-15
Traditional and high-throughput techniques for determining transcription factor (TF) binding specificities are generating large volumes of data of uneven quality, which are scattered across individual databases. FootprintDB integrates some of the most comprehensive freely available libraries of curated DNA binding sites and systematically annotates the binding interfaces of the corresponding TFs. The first release contains 2422 unique TF sequences, 10 112 DNA binding sites and 3662 DNA motifs. A survey of the included data sources, organisms and TF families was performed together with the proprietary database TRANSFAC, finding that footprintDB has a similar coverage of multicellular organisms while also containing bacterial regulatory data. A search engine has been designed that drives the prediction of DNA motifs for input TFs, or conversely of TF sequences that might recognize input regulatory sequences, by comparison with database entries. Such predictions can also be extended to a single proteome chosen by the user, and results are ranked in terms of interface similarity. Benchmark experiments with bacterial, plant and human data were performed to measure the predictive power of footprintDB searches, which were able to correctly recover 10, 55 and 90% of the tested sequences, respectively. Correctly predicted TFs had a higher interface similarity than average, confirming its diagnostic value. Web site implemented in PHP, Perl, MySQL and Apache. Freely available from http://floresta.eead.csic.es/footprintdb.
Advanced technologies for scalable ATLAS conditions database access on the grid
NASA Astrophysics Data System (ADS)
Basset, R.; Canali, L.; Dimitrov, G.; Girone, M.; Hawkings, R.; Nevski, P.; Valassi, A.; Vaniachine, A.; Viegas, F.; Walker, R.; Wong, A.
2010-04-01
During massive data reprocessing operations an ATLAS Conditions Database application must support concurrent access from numerous ATLAS data processing jobs running on the Grid. By simulating realistic workflows, ATLAS database scalability tests provided feedback for Conditions DB software optimization and allowed precise determination of required distributed database resources. In distributed data processing one must take into account the chaotic nature of Grid computing, characterized by peak loads that can be much higher than average access rates. To validate database performance at peak loads, we tested database scalability at very high concurrent job rates. This was achieved through coordinated database stress tests performed in a series of ATLAS reprocessing exercises at the Tier-1 sites. The goal of the database stress tests is to detect the scalability limits of the hardware deployed at the Tier-1 sites, so that server overload conditions can be safely avoided in a production environment. Our analysis of server performance under stress tests indicates that Conditions DB data access is limited by disk I/O throughput. An unacceptable side-effect of disk I/O saturation is a degradation of the WLCG 3D Services that update Conditions DB data at all ten ATLAS Tier-1 sites using the technology of Oracle Streams. To avoid such bottlenecks we prototyped and tested a novel approach for database peak load avoidance in Grid computing. Our approach is based upon the proven idea of pilot job submission on the Grid: instead of the actual query, an ATLAS utility library first sends a pilot query to the database server.
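The pilot-query idea can be sketched as the following load-avoidance pattern. This is a hypothetical illustration, not the ATLAS utility library; the probe method, back-off constants, and toy server are invented:

```python
# Hypothetical illustration of the pilot-query pattern: probe the conditions
# database with a cheap pilot first, and back off exponentially while the
# server is saturated, so heavy queries never pile onto a peak load.
import time

def query_with_pilot(server, heavy_query, max_wait=60.0, backoff=1.0):
    waited = 0.0
    while waited < max_wait:
        if server.pilot_ok():          # cheap probe of current server load
            return server.run(heavy_query)
        time.sleep(backoff)            # overloaded: wait instead of piling on
        waited += backoff
        backoff *= 2                   # exponential back-off smooths the peak
    raise TimeoutError("conditions database overloaded for too long")

class ToyServer:
    """Stand-in server that reports itself busy for the first two pilots."""
    def __init__(self):
        self.pilots = 0
    def pilot_ok(self):
        self.pilots += 1
        return self.pilots > 2
    def run(self, query):
        return "result of " + query
```

The key property is that the expensive query is only issued once the cheap probe succeeds, so thousands of concurrent Grid jobs spread themselves out instead of all hitting the server during a peak.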
MycoDB, a global database of plant response to mycorrhizal fungi.
Chaudhary, V Bala; Rúa, Megan A; Antoninka, Anita; Bever, James D; Cannon, Jeffery; Craig, Ashley; Duchicela, Jessica; Frame, Alicia; Gardes, Monique; Gehring, Catherine; Ha, Michelle; Hart, Miranda; Hopkins, Jacob; Ji, Baoming; Johnson, Nancy Collins; Kaonongbua, Wittaya; Karst, Justine; Koide, Roger T; Lamit, Louis J; Meadow, James; Milligan, Brook G; Moore, John C; Pendergast, Thomas H; Piculell, Bridget; Ramsby, Blake; Simard, Suzanne; Shrestha, Shubha; Umbanhowar, James; Viechtbauer, Wolfgang; Walters, Lawrence; Wilson, Gail W T; Zee, Peter C; Hoeksema, Jason D
2016-05-10
Plants form belowground associations with mycorrhizal fungi in one of the most common symbioses on Earth. However, few large-scale generalizations exist for the structure and function of mycorrhizal symbioses, as the nature of this relationship varies from mutualistic to parasitic and is largely context-dependent. We announce the public release of MycoDB, a database of 4,010 studies (from 438 unique publications) to aid in multi-factor meta-analyses elucidating the ecological and evolutionary context in which mycorrhizal fungi alter plant productivity. Over 10 years with nearly 80 collaborators, we compiled data on the response of plant biomass to mycorrhizal fungal inoculation, including meta-analysis metrics and 24 additional explanatory variables that describe the biotic and abiotic context of each study. We also include phylogenetic trees for all plants and fungi in the database. To our knowledge, MycoDB is the largest ecological meta-analysis database. We aim to share these data to highlight significant gaps in mycorrhizal research and encourage synthesis to explore the ecological and evolutionary generalities that govern mycorrhizal functioning in ecosystems. PMID:27163938
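Meta-analyses of the kind MycoDB supports typically express plant response as a log response ratio of inoculated versus control biomass. A minimal sketch of that effect size and its usual delta-method sampling variance follows; all numbers are invented:

```python
# Minimal sketch of the standard meta-analysis effect size for studies like
# those in MycoDB: the log response ratio of inoculated (treatment) vs.
# control plant biomass, with its usual delta-method sampling variance.
import math

def log_response_ratio(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (lnRR, variance) for treatment vs. control means."""
    lnrr = math.log(mean_t / mean_c)
    var = sd_t ** 2 / (n_t * mean_t ** 2) + sd_c ** 2 / (n_c * mean_c ** 2)
    return lnrr, var

# Invented numbers: inoculated plants average 12 g, controls 8 g, n = 5 each.
lnrr, var = log_response_ratio(12.0, 2.0, 5, 8.0, 1.5, 5)
```

A positive lnRR indicates mutualistic benefit and a negative one a parasitic cost, which is exactly the mutualism-parasitism continuum the database is built to explain; the variance weights each study in the meta-analytic model.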
Relax with CouchDB--into the non-relational DBMS era of bioinformatics.
Manyam, Ganiraju; Payton, Michelle A; Roth, Jack A; Abruzzo, Lynne V; Coombes, Kevin R
2012-07-01
With the proliferation of high-throughput technologies, genome-level data analysis has become common in molecular biology. Bioinformaticians are developing extensive resources to annotate and mine biological features from high-throughput data. The underlying database management systems for most bioinformatics software are based on the relational model. Modern non-relational databases offer an alternative with flexibility, scalability, and a non-rigid design schema. Moreover, with an accelerated development pace, non-relational databases like CouchDB can be ideal tools for constructing bioinformatics utilities. We describe CouchDB by presenting three new bioinformatics resources: (a) geneSmash, which collates data from bioinformatics resources and provides automated gene-centric annotations, (b) drugBase, a database of drug-target interactions with a web interface powered by geneSmash, and (c) HapMap-CN, which provides a web interface to query copy number variations from three SNP-chip HapMap datasets. In addition to the web sites, all three systems can be accessed programmatically via web services.
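Part of CouchDB's appeal for resources like geneSmash is that records are schema-free JSON documents addressed over plain HTTP (a document is created with PUT /{db}/{docid}). The sketch below only constructs the request pieces, so no server is needed; the database name and document contents are invented:

```python
# CouchDB exposes documents as JSON over HTTP: creating a record is a
# PUT /{db}/{docid} with a JSON body. This sketch only builds the request
# pieces (no server involved); database and document names are invented.
import json

def build_put(db, doc_id, doc):
    """Return (path, headers, body) for CouchDB's PUT /{db}/{docid} call."""
    path = "/{}/{}".format(db, doc_id)
    headers = {"Content-Type": "application/json"}
    return path, headers, json.dumps(doc)

path, headers, body = build_put(
    "genesmash", "TP53",
    {"symbol": "TP53", "aliases": ["p53"], "chromosome": "17"},
)
print(path)  # /genesmash/TP53
```

Because the document body is arbitrary JSON, adding a new annotation field later requires no schema migration, which is the "non-rigid design schema" advantage the abstract refers to.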
MultitaskProtDB: a database of multitasking proteins.
Hernández, Sergio; Ferragut, Gabriela; Amela, Isaac; Perez-Pons, Josep Antoni; Piñol, Jaume; Mozo-Villarias, Angel; Cedano, Juan; Querol, Enrique
2014-01-01
We have compiled MultitaskProtDB, available online at http://wallace.uab.es/multitask, to provide a repository where the many multitasking proteins found in the literature can be stored. Multitasking, or moonlighting, is the capability of some proteins to execute two or more biological functions. Usually, multitasking proteins are revealed experimentally by serendipity. This ability of proteins to perform multitasking functions helps us understand one of the ways cells perform many complex functions with a limited number of genes. Even so, the study of this phenomenon has been complex because, among other things, there was no database of moonlighting proteins. The existence of such a tool facilitates the collection and dissemination of these important data. This work reports MultitaskProtDB, which is designed as a user-friendly web page containing >288 multitasking proteins with their NCBI and UniProt accession numbers, canonical and additional biological functions, monomeric/oligomeric states, PDB codes when available, and bibliographic references. The database also serves to give insight into characteristics of multitasking proteins such as the frequencies of the different pairs of functions, phylogenetic conservation, and so forth.
MoonDB — A Data System for Analytical Data of Lunar Samples
NASA Astrophysics Data System (ADS)
Lehnert, K.; Ji, P.; Cai, M.; Evans, C.; Zeigler, R.
2018-04-01
MoonDB is a data system that makes analytical data from the Apollo lunar sample collection and lunar meteorites accessible by synthesizing published and unpublished datasets in a relational database with an online search interface.
DNetDB: The human disease network database based on dysfunctional regulation mechanism.
Yang, Jing; Wu, Su-Juan; Yang, Shao-You; Peng, Jia-Wei; Wang, Shi-Nuo; Wang, Fu-Yan; Song, Yu-Xing; Qi, Ting; Li, Yi-Xue; Li, Yuan-Yuan
2016-05-21
Disease similarity studies provide new insights into disease taxonomy and pathogenesis, and play a guiding role in diagnosis and treatment. Early studies were limited to estimating disease similarities based on clinical manifestations, disease-related genes, medical vocabulary concepts or registry data, which were inevitably biased toward well-studied diseases and offered little chance of discovering novel disease relationships. Genome-scale expression data offer another angle on this problem, since the simultaneous measurement of the expression of thousands of genes allows for the exploration of gene transcriptional regulation, which is believed to be crucial to biological functions. Although methods based on differential expression analysis have the potential to explore new disease relationships, it is difficult for them to unravel the upstream dysregulation mechanisms of diseases. We therefore estimated disease similarities from gene expression data using differential coexpression analysis, a recently emerging method that has proved more capable of capturing dysfunctional regulation mechanisms than differential expression analysis. A total of 1,326 disease relationships among 108 diseases were identified, and the relevant information constitutes the human disease network database (DNetDB). Benefiting from the use of differential coexpression analysis, the potential common dysfunctional regulation mechanisms shared by disease pairs (i.e. disease relationships) were extracted and presented. Statistical indicators, common disease-related genes and drugs shared by disease pairs are also included in DNetDB. In total, 1,326 disease relationships among 108 diseases, 5,598 pathways, 7,357 disease-related genes and 342 disease drugs are recorded in DNetDB, among which 3,762 genes and 148 drugs are shared by at least two diseases.
DNetDB is the first database focusing on disease similarity from the viewpoint of gene regulation mechanisms. It provides an easy-to-use web interface to search and browse disease relationships and thus helps to systematically investigate etiology and pathogenesis, perform drug repositioning, and design novel therapeutic interventions. Database URL: http://app.scbit.org/DNetDB/.
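The differential coexpression signal DNetDB builds on can be illustrated minimally: score a gene pair by how much its expression correlation changes between two conditions. This toy example is not DNetDB's actual pipeline, and the expression vectors are invented:

```python
# Toy version of the differential coexpression signal: a gene pair is
# interesting when its expression correlation differs sharply between two
# conditions (e.g. two diseases). Not DNetDB's actual pipeline.

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def diff_coexpression(x1, y1, x2, y2):
    """Absolute change in a gene pair's correlation across two conditions."""
    return abs(pearson(x1, y1) - pearson(x2, y2))

# Perfectly correlated in condition 1, perfectly anti-correlated in condition 2.
score = diff_coexpression([1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4], [4, 3, 2, 1])
```

Note that both pairs could show no differential *expression* at all while still rewiring their co-regulation, which is why this signal can expose upstream dysregulation that differential expression analysis misses.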
Xia, Kai; Dong, Dong; Han, Jing-Dong J
2006-01-01
Background Although protein-protein interaction (PPI) networks have been explored by various experimental methods, the maps so built are still limited in coverage and accuracy. To further expand the PPI network and to extract more accurate information from existing maps, studies have been carried out to integrate various types of functional relationship data. A frequently updated database of computationally analyzed potential PPIs to provide biological researchers with rapid and easy access to analyze original data as a biological network is still lacking. Results By applying a probabilistic model, we integrated 27 heterogeneous genomic, proteomic and functional annotation datasets to predict PPI networks in human. In addition to previously studied data types, we show that phenotypic distances and genetic interactions can also be integrated to predict PPIs. We further built an easy-to-use, updatable integrated PPI database, the Integrated Network Database (IntNetDB) online, to provide automatic prediction and visualization of PPI network among genes of interest. The networks can be visualized in SVG (Scalable Vector Graphics) format for zooming in or out. IntNetDB also provides a tool to extract topologically highly connected network neighborhoods from a specific network for further exploration and research. Using the MCODE (Molecular Complex Detections) algorithm, 190 such neighborhoods were detected among all the predicted interactions. The predicted PPIs can also be mapped to worm, fly and mouse interologs. Conclusion IntNetDB includes 180,010 predicted protein-protein interactions among 9,901 human proteins and represents a useful resource for the research community. Our study has increased prediction coverage by five-fold. IntNetDB also provides easy-to-use network visualization and analysis tools that allow biological researchers unfamiliar with computational biology to access and analyze data over the internet. 
The web interface of IntNetDB is freely accessible at . Visualization requires Mozilla version 1.8 (or higher), or Internet Explorer with an SVG viewer installed. PMID:17112386
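The probabilistic integration idea can be sketched under a naive-Bayes independence assumption: each of the heterogeneous datasets contributes a likelihood ratio for a candidate protein pair interacting, and their product ranks the pair. This is a generic sketch of the technique, not IntNetDB's fitted model, and the per-dataset ratios are invented:

```python
# Sketch of likelihood-ratio integration under a naive-Bayes independence
# assumption: each heterogeneous dataset contributes a likelihood ratio for
# "these two proteins interact", and the product ranks the candidate pair.
# The per-dataset ratios below are invented for illustration.

def combined_likelihood_ratio(ratios):
    """Multiply per-dataset likelihood ratios; 1.0 means no evidence."""
    score = 1.0
    for r in ratios:
        score *= r
    return score

# e.g. co-expression (3.0), phenotype similarity (2.5), genetic interaction (4.0)
score = combined_likelihood_ratio([3.0, 2.5, 4.0])
print(score)  # 30.0
```

Pairs whose combined ratio exceeds a chosen cutoff become predicted edges of the network; the abstract's point is that even indirect evidence such as phenotypic distance can supply one of these factors.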
dbCPG: A web resource for cancer predisposition genes.
Wei, Ran; Yao, Yao; Yang, Wu; Zheng, Chun-Hou; Zhao, Min; Xia, Junfeng
2016-06-21
Cancer predisposition genes (CPGs) are genes in which inherited mutations confer highly or moderately increased risks of developing cancer. Identification of these genes and understanding the biological mechanisms that underlie them is crucial for the prevention, early diagnosis, and optimized management of cancer. Over the past decades, great efforts have been made to identify CPGs through multiple strategies. However, information on these CPGs and their molecular functions is scattered. To address this issue and provide a comprehensive resource for researchers, we developed the Cancer Predisposition Gene Database (dbCPG, Database URL: http://bioinfo.ahu.edu.cn:8080/dbCPG/index.jsp), the first literature-based gene resource for exploring human CPGs. It contains 827 human (724 protein-coding, 23 non-coding, and 80 genes of unknown type), 637 rat, and 658 mouse CPGs. Furthermore, data mining was performed to gain insights into the CPG data, including functional annotation, gene prioritization, network analysis of prioritized genes, and overlap analysis across multiple cancer types. A user-friendly web interface with multiple browse, search, and upload functions was also developed to facilitate access to the latest information on CPGs. Taken together, the dbCPG database provides a comprehensive data resource for further studies of cancer predisposition genes.
Towards communication-efficient quantum oblivious key distribution
NASA Astrophysics Data System (ADS)
Panduranga Rao, M. V.; Jakobi, M.
2013-01-01
Symmetrically private information retrieval, a fundamental problem in the field of secure multiparty computation, is defined as follows: a database D of N bits held by Bob is queried by a user Alice who is interested in the bit D_b, in such a way that (1) Alice learns D_b and only D_b and (2) Bob does not learn anything about Alice's choice b. While solutions to this problem in the classical domain rely largely on unproven computational complexity theoretic assumptions, it is also known that perfect solutions that guarantee both database and user privacy are impossible in the quantum domain. Jakobi et al. [Phys. Rev. A 83, 022301 (2011)] proposed a protocol for oblivious transfer using well-known quantum key distribution (QKD) techniques to establish an oblivious key to solve this problem. Their solution provided a good degree of database and user privacy (using physical principles like the impossibility of perfectly distinguishing nonorthogonal quantum states and the impossibility of superluminal communication) while being loss-resistant and implementable with commercial QKD devices (due to the use of the Scarani-Acin-Ribordy-Gisin 2004 protocol). However, their quantum oblivious key distribution (QOKD) protocol requires a communication complexity of O(N log N). Since modern databases can be extremely large, it is important to reduce this communication as much as possible. In this paper, we first suggest a modification of their protocol wherein the number of qubits that need to be exchanged is reduced to O(N). A subsequent generalization reduces the quantum communication complexity even further, in such a way that only a few hundred qubits need to be transferred even for very large databases.
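The classical post-processing that makes an oblivious key useful can be sketched as follows, with the quantum key establishment abstracted away: Alice ends up knowing a single key bit at an index Bob cannot guess, announces a shift aligning that bit with the record she wants, and Bob broadcasts the whole key-encrypted database. All indices and bit values here are invented toy data:

```python
# Toy classical post-processing once an oblivious key exists (the quantum
# steps are abstracted away): Alice knows exactly one key bit, at an index
# Bob cannot guess. She announces a shift aligning that bit with the record
# she wants; Bob broadcasts the whole key-encrypted database, and Alice can
# decrypt only her bit. All indices and bit values are invented toy data.

def alice_shift(known_index, wanted_index, n):
    """Public shift announcement; on its own it reveals nothing about b."""
    return (known_index - wanted_index) % n

def bob_encrypt(database, key, shift):
    """Bob XORs each record with the shifted oblivious key."""
    n = len(database)
    return [database[i] ^ key[(i + shift) % n] for i in range(n)]

def alice_decode(ciphertext, wanted_index, known_key_bit):
    return ciphertext[wanted_index] ^ known_key_bit

database = [0, 1, 1, 0, 1, 0, 1, 1]
key = [1, 0, 1, 1, 0, 0, 1, 0]
j, b = 5, 2                     # Alice knows key[5]; she wants database[2]
shift = alice_shift(j, b, len(database))
cipher = bob_encrypt(database, key, shift)
print(alice_decode(cipher, b, key[j]) == database[b])  # True
```

Because cipher[b] = database[b] XOR key[j] and Alice knows only key[j], she recovers exactly one bit, while the announced shift is uniformly distributed from Bob's point of view. The classical broadcast costs O(N) bits; the paper's contribution concerns shrinking the *quantum* communication that establishes the key.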
Poole, Colin F; Qian, Jing; Kiridena, Waruna; Dekay, Colleen; Koziol, Wladyslaw W
2006-11-17
The solvation parameter model is used to characterize the separation characteristics of two application-specific open-tubular columns (Rtx-Volatiles and Rtx-VGC) and a general-purpose column for the separation of volatile organic compounds (DB-WAXetr) at five equally spaced temperatures over the range 60-140 degrees C. System constant differences and retention factor correlation plots are then used to determine selectivity differences between the above columns and their closest neighbors in a large database of system constants and retention factors for forty-four open-tubular columns. The Rtx-Volatiles column is shown to have the separation characteristics predicted for a poly(dimethyldiphenylsiloxane) stationary phase containing about 16% diphenylsiloxane monomer. The Rtx-VGC column has separation properties similar to DB-1701, a poly(cyanopropylphenyldimethylsiloxane) stationary phase containing 14% cyanopropylphenylsiloxane monomer, for non-polar and dipolar/polarizable compounds, but significantly different characteristics for the separation of hydrogen-bond acids. For all practical purposes the DB-WAXetr column is shown to be selectivity-equivalent to poly(ethylene glycol) columns prepared using different chemistries for bonding and immobilizing the stationary phase. Principal component analysis and cluster analysis are then used to classify the system constants for the above columns and a sub-database of eleven open-tubular columns (DB-1, HP-5, DB-VRX, Rtx-20, DB-35, Rtx-50, Rtx-65, DB-1301, DB-1701, DB-200, and DB-624) commonly used for the separation of volatile organic compounds. A rational basis for column selection based on differences in intermolecular interactions is presented as an aid to method development for the separation of volatile organic compounds.
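In its usual gas-chromatography form the solvation parameter model reads log k = c + eE + sS + aA + bB + lL, where the lower-case terms are the system (column) constants being compared above and the upper-case terms are solute descriptors. The numeric values below are invented for illustration only, not measured constants for any of the named columns:

```python
# The solvation parameter model in its usual GC form:
#   log k = c + e*E + s*S + a*A + b*B + l*L
# Lower-case letters are system (column) constants; upper-case letters are
# solute descriptors. All numbers below are invented for illustration only.

def log_k(system, solute):
    c, e, s, a, b, l = system
    E, S, A, B, L = solute
    return c + e * E + s * S + a * A + b * B + l * L

system = (-2.0, 0.10, 0.50, 1.20, 0.00, 0.60)  # hypothetical column constants
solute = (0.60, 0.90, 0.30, 0.45, 3.90)        # hypothetical VOC descriptors
print(round(log_k(system, solute), 3))  # 1.21
```

Comparing two columns then amounts to comparing their system-constant tuples term by term, which is exactly what the system-constant difference plots in the study formalize.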
PGSB/MIPS PlantsDB Database Framework for the Integration and Analysis of Plant Genome Data.
Spannagl, Manuel; Nussbaumer, Thomas; Bader, Kai; Gundlach, Heidrun; Mayer, Klaus F X
2017-01-01
Plant Genome and Systems Biology (PGSB), formerly Munich Information Center for Protein Sequences (MIPS) PlantsDB, is a database framework for the integration and analysis of plant genome data, developed and maintained for more than a decade. Major components of the framework are genome databases and analysis resources focusing on individual (reference) genomes and providing flexible and intuitive access to data. Another main focus is the integration of genomes from both model and crop plants to form a scaffold for comparative genomics, assisted by specialized tools such as the CrowsNest viewer for exploring conserved gene order (synteny). Data exchange and integrated search functionality with/over many plant genome databases is provided within the transPLANT project.
2016-03-01
…external database, whether it is MySQL, Oracle, DB2, or SQL Server (Teller, 2015). Connectors optimize the data transfer by obtaining metadata…
ERIC Educational Resources Information Center
Moore, Jack
1988-01-01
The article describes the IBM/Special Needs Exchange which consists of: (1) electronic mail, conferencing, and a library of text and program files on the CompuServe Information Service; and (2) a dial-in database of special education software for IBM and compatible computers. (DB)
The 2018 Nucleic Acids Research database issue and the online molecular biology database collection.
Rigden, Daniel J; Fernández, Xosé M
2018-01-04
The 2018 Nucleic Acids Research Database Issue contains 181 papers spanning molecular biology. Among them, 82 are new and 84 are updates describing resources that appeared in the Issue previously. The remaining 15 cover databases most recently published elsewhere. Databases in the area of nucleic acids include 3DIV for visualisation of data on genome 3D structure and RNArchitecture, a hierarchical classification of RNA families. Protein databases include the established SMART, ELM and MEROPS while GPCRdb and the newcomer STCRDab cover families of biomedical interest. In the area of metabolism, HMDB and Reactome both report new features while PULDB appears in NAR for the first time. This issue also contains reports on genomics resources including Ensembl, the UCSC Genome Browser and ENCODE. Update papers from the IUPHAR/BPS Guide to Pharmacology and DrugBank are highlights of the drug and drug target section while a number of proteomics databases including proteomicsDB are also covered. The entire Database Issue is freely available online on the Nucleic Acids Research website (https://academic.oup.com/nar). The NAR online Molecular Biology Database Collection has been updated, reviewing 138 entries, adding 88 new resources and eliminating 47 discontinued URLs, bringing the current total to 1737 databases. It is available at http://www.oxfordjournals.org/nar/database/c/. © The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research.
Woo, Patrick C. Y.; Teng, Jade L. L.; Yeung, Juilian M. Y.; Tse, Herman; Lau, Susanna K. P.; Yuen, Kwok-Yung
2011-01-01
Despite the increasing use of 16S rRNA gene sequencing, interpretation of 16S rRNA gene sequence results is one of the most difficult problems faced by clinical microbiologists and technicians. To overcome the problems we encountered in the existing databases during 16S rRNA gene sequence interpretation, we built a comprehensive database, 16SpathDB (http://147.8.74.24/16SpathDB) based on the 16S rRNA gene sequences of all medically important bacteria listed in the Manual of Clinical Microbiology and evaluated its use for automated identification of these bacteria. Among 91 nonduplicated bacterial isolates collected in our clinical microbiology laboratory, 71 (78%) were reported by 16SpathDB as a single bacterial species having >98.0% nucleotide identity with the query sequence, 19 (20.9%) were reported as more than one bacterial species having >98.0% nucleotide identity with the query sequence, and 1 (1.1%) was reported as no match. For the 71 bacterial isolates reported as a single bacterial species, all results were identical to their true identities as determined by a polyphasic approach. For the 19 bacterial isolates reported as more than one bacterial species, all results contained their true identities as determined by a polyphasic approach and all of them had their true identities as the “best match in 16SpathDB.” For the isolate (Gordonibacter pamelaeae) reported as no match, the bacterium has never been reported to be associated with human disease and was not included in the Manual of Clinical Microbiology. 16SpathDB is an automated, user-friendly, efficient, accurate, and regularly updated database for 16S rRNA gene sequence interpretation in clinical microbiology laboratories. PMID:21389154
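The reporting rule evaluated above (>98.0% nucleotide identity; a single species, more than one species, or no match) can be sketched as follows (a simplification; function and field names are hypothetical, not 16SpathDB's actual interface):

```python
def classify_16s(hits, threshold=98.0):
    """Classify a 16S rRNA query from its database hits.

    hits: {species_name: percent_identity to the query sequence}.
    Species above the identity threshold are reported, best match first.
    """
    matches = sorted((s for s, pid in hits.items() if pid > threshold),
                     key=lambda s: hits[s], reverse=True)
    if not matches:
        return "no match", matches
    if len(matches) == 1:
        return "single species", matches
    return "more than one species", matches  # first entry = best match
```

In the evaluation above, the "single species" outcome was always the true identity, and for multi-species reports the true identity was always the best match.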
GeneSigDB: a manually curated database and resource for analysis of gene expression signatures
Culhane, Aedín C.; Schröder, Markus S.; Sultana, Razvan; Picard, Shaita C.; Martinelli, Enzo N.; Kelly, Caroline; Haibe-Kains, Benjamin; Kapushesky, Misha; St Pierre, Anne-Alyssa; Flahive, William; Picard, Kermshlise C.; Gusenleitner, Daniel; Papenhausen, Gerald; O'Connor, Niall; Correll, Mick; Quackenbush, John
2012-01-01
GeneSigDB (http://www.genesigdb.org or http://compbio.dfci.harvard.edu/genesigdb/) is a database of gene signatures that have been extracted and manually curated from the published literature. It provides a standardized resource of published prognostic, diagnostic and other gene signatures of cancer and related disease to the community so they can compare the predictive power of gene signatures or use these in gene set enrichment analysis. Since GeneSigDB release 1.0, we have expanded from 575 to 3515 gene signatures, which were collected and transcribed from 1604 published articles largely focused on gene expression in cancer, stem cells, immune cells, development and lung disease. We have made substantial upgrades to the GeneSigDB website to improve accessibility and usability, including adding a tag cloud browse function, facetted navigation and a ‘basket’ feature to store genes or gene signatures of interest. Users can analyze GeneSigDB gene signatures, or upload their own gene list, to identify gene signatures with significant gene overlap and results can be viewed on a dynamic editable heatmap that can be downloaded as a publication quality image. All data in GeneSigDB can be downloaded in numerous formats including .gmt file format for gene set enrichment analysis or as a R/Bioconductor data file. GeneSigDB is available from http://www.genesigdb.org. PMID:22110038
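The .gmt download format mentioned above is a simple tab-separated layout: one signature per line, with the name, a description, and then the member genes. A minimal parser sketch:

```python
def read_gmt(lines):
    """Parse GMT ('gene matrix transposed') records:
    name <TAB> description <TAB> gene1 <TAB> gene2 ...

    Returns {signature_name: {"description": ..., "genes": [...]}}.
    Lines with fewer than three fields are skipped.
    """
    signatures = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3:
            signatures[fields[0]] = {"description": fields[1],
                                     "genes": fields[2:]}
    return signatures
```

The same structure is what gene set enrichment tools consume, which is why GeneSigDB offers it as an export format.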
MIPSPlantsDB—plant database resource for integrative and comparative plant genome research
Spannagl, Manuel; Noubibou, Octave; Haase, Dirk; Yang, Li; Gundlach, Heidrun; Hindemitt, Tobias; Klee, Kathrin; Haberer, Georg; Schoof, Heiko; Mayer, Klaus F. X.
2007-01-01
Genome-oriented plant research delivers a rapidly increasing amount of plant genome data. Comprehensive and structured information resources are required to organize and communicate genome and associated analytical data for model organisms as well as for crops. The increase in available plant genomic data enables powerful comparative analysis and integrative approaches. PlantsDB aims to provide data and information resources for individual plant species and, in addition, to build a platform for integrative and comparative plant genome research. PlantsDB is constituted from genome databases for Arabidopsis, Medicago, Lotus, rice, maize and tomato. Complementary data resources for cis elements, repetitive elements and extensive cross-species comparisons are implemented. The PlantsDB portal can be reached at . PMID:17202173
Anderson-Teixeira, Kristina J; Wang, Maria M H; McGarvey, Jennifer C; LeBauer, David S
2016-05-01
Tropical forests play a critical role in the global carbon (C) cycle, storing ~45% of terrestrial C and constituting the largest component of the terrestrial C sink. Despite their central importance to the global C cycle, their ecosystem-level C cycles are not as well-characterized as those of extra-tropical forests, and knowledge gaps hamper efforts to quantify C budgets across the tropics and to model tropical forest-climate interactions. To advance understanding of C dynamics of pantropical forests, we compiled a new database, the Tropical Forest C database (TropForC-db), which contains data on ground-based measurements of ecosystem-level C stocks and annual fluxes along with disturbance history. This database currently contains 3568 records from 845 plots in 178 geographically distinct areas, making it the largest and most comprehensive database of its type. Using TropForC-db, we characterized C stocks and fluxes for young, intermediate-aged, and mature forests. Relative to existing C budgets of extra-tropical forests, mature tropical broadleaf evergreen forests had substantially higher gross primary productivity (GPP) and ecosystem respiration (Reco), their autotrophic respiration (Ra) consumed a larger proportion (~67%) of GPP, and their woody stem growth (ANPPstem) represented a smaller proportion of net primary productivity (NPP, ~32%) or GPP (~9%). In regrowth stands, aboveground biomass increased rapidly during the first 20 years following stand-clearing disturbance, with slower accumulation following agriculture and in deciduous forests, and continued to accumulate at a slower pace in forests aged 20-100 years. Most other C stocks likewise increased with stand age, while the potential to describe age trends in C fluxes was generally data-limited. We expect that TropForC-db will prove useful for model evaluation and for quantifying the contribution of forests to the global C cycle.
The database version associated with this publication is archived in Dryad (DOI: 10.5061/dryad.t516f) and a dynamic version is maintained at https://github.com/forc-db. Published 2016. This article is a U.S. Government work and is in the public domain in the USA.
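The flux fractions quoted above are tied together by the identity NPP = GPP − Ra. A quick arithmetic check (the small mismatch with the reported ~9% is expected, since the reported fractions are independent medians rather than one closed budget):

```python
# Approximate medians for mature tropical forests, from the abstract:
ra_over_gpp = 0.67            # autotrophic respiration Ra as a fraction of GPP
stem_over_npp = 0.32          # woody stem growth ANPPstem as a fraction of NPP

# NPP = GPP - Ra, so expressed as fractions of GPP:
npp_over_gpp = 1.0 - ra_over_gpp               # ~0.33
stem_over_gpp = stem_over_npp * npp_over_gpp   # ~0.106, near the reported ~9%
```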
CerebralWeb: a Cytoscape.js plug-in to visualize networks stratified by subcellular localization.
Frias, Silvia; Bryan, Kenneth; Brinkman, Fiona S L; Lynn, David J
2015-01-01
CerebralWeb is a light-weight JavaScript plug-in that extends Cytoscape.js to enable fast and interactive visualization of molecular interaction networks stratified based on subcellular localization or other user-supplied annotation. The application is designed to be easily integrated into any website and is configurable to support customized network visualization. CerebralWeb also supports the automatic retrieval of Cerebral-compatible localizations for human, mouse and bovine genes via a web service and enables the automated parsing of Cytoscape compatible XGMML network files. CerebralWeb currently supports embedded network visualization on the InnateDB (www.innatedb.com) and Allergy and Asthma Portal (allergen.innatedb.com) database and analysis resources. Database tool URL: http://www.innatedb.com/CerebralWeb © The Author(s) 2015. Published by Oxford University Press.
Modeling Solar Atmospheric Phenomena with AtomDB and PyAtomDB
NASA Astrophysics Data System (ADS)
Dupont, Marcus; Foster, Adam
2018-01-01
Taking advantage of the modeling tools made available by PyAtomDB (Foster 2015), we evaluated the impact of changing atomic data on solar phenomena, in particular their effects on models of coronal mass ejections (CME). Initially, we performed modifications to the canonical SunNEI code (Murphy et al. 2011) in order to include non-equilibrium ionization (NEI) processes that occur in the CME modeled in SunNEI. The methods used involve the consideration of radiative cooling as well as ion balance calculations. These calculations were subsequently implemented within the SunNEI simulation. The insertion of the aforementioned processes and parameter customization produced results quite similar to the original, except in the case of iron. These differences were traced to inconsistencies in the recombination rates for argon-like iron ions between the CHIANTI and AtomDB databases, even though they in theory use the same data. The key finding was that theoretical models are greatly impacted by the relative update cycles of the atomic databases. Following the SunNEI comparison, we then used the AtomDB database to model the time dependencies of intensity flux spikes produced by a coronal shock wave (Ma et al. 2011). We produced a theoretical representation for an ionizing plasma that interpolated over the intensity in four Atmospheric Imaging Assembly (AIA) filters, specifically the 171 Å (Fe IX), 193 Å (Fe XII, Fe XXIV), 211 Å (Fe XIV), and 335 Å (Fe XVI) wavelengths, in order to assess the comparative spectral emissions between AtomDB and the observed data. The results of the theoretical model, in principle, shed light on both the equilibrium conditions before the shock and the non-equilibrium response to the shock front, as well as discrepancies introduced by changing the atomic data.
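At its core, non-equilibrium ionization means evolving charge-state populations under ionization and recombination rates instead of assuming their equilibrium ratio. A toy two-state sketch (real NEI codes such as SunNEI evolve full rate matrices for every charge state of every element; the rates here are arbitrary):

```python
def nei_two_state(ion_rate, rec_rate, n=(1.0, 0.0), dt=1e-3, steps=20000):
    """Toy non-equilibrium ionization balance for two charge states.

    n[0] ionizes at ion_rate and n[1] recombines at rec_rate (both per
    unit time); integrated with explicit Euler steps.  The total
    population is conserved by construction.
    """
    n0, n1 = n
    for _ in range(steps):
        flow = (ion_rate * n0 - rec_rate * n1) * dt
        n0, n1 = n0 - flow, n1 + flow
    return n0, n1

# At equilibrium the net flow vanishes, so n1/n0 -> ion_rate/rec_rate;
# before that, the populations lag the instantaneous rates, which is
# exactly the "non-equilibrium" behavior a shock or CME induces.
```

Inconsistent recombination rates between atomic databases, as found above for argon-like iron, directly shift where this balance settles.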
Makita, Yuko; Kawashima, Mika; Lau, Nyok Sean; Othman, Ahmad Sofiman; Matsui, Minami
2018-01-19
Natural rubber is an economically important material. Currently the Pará rubber tree, Hevea brasiliensis, is the main commercial source. Little is known about rubber biosynthesis at the molecular level. Next-generation sequencing (NGS) technologies brought draft genomes of three rubber cultivars and a variety of RNA sequencing (RNA-seq) data. However, no current genome or transcriptome databases (DB) are organized by gene. A gene-oriented database is a valuable support for rubber research. Based on our original draft genome sequence of H. brasiliensis RRIM600, we constructed a rubber tree genome and transcriptome DB. Our DB provides genome information including gene functional annotations and multi-transcriptome data from RNA-seq, full-length cDNAs including PacBio Isoform sequencing (Iso-Seq), ESTs and genome-wide transcription start sites (TSSs) derived from CAGE technology. Using our original and publicly available RNA-seq data, we calculated co-expressed genes for identifying functionally related gene sets and/or genes regulated by the same transcription factor (TF). Users can access multi-transcriptome data through both a gene-oriented web page and a genome browser. For the gene searching system, we provide keyword search, sequence homology search and gene expression search; users can also select their expression threshold easily. The rubber genome and transcriptome DB provides the rubber tree genome sequence and multi-transcriptomics data. This DB is useful for a comprehensive understanding of the rubber transcriptome and will assist both industrial and academic researchers working on rubber and on economically important close relatives such as R. communis, M. esculenta and J. curcas. The Rubber Transcriptome DB release 2017.03 is accessible at http://matsui-lab.riken.jp/rubber/ .
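Co-expression of the kind described above is commonly computed as a Pearson correlation between gene expression profiles across samples. A minimal sketch (the database's actual method and cutoff are not specified here, so the names and threshold are illustrative):

```python
import math

def pearson(x, y):
    """Pearson correlation between two expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def coexpressed(target, profiles, cutoff=0.9):
    """Genes whose profile correlates with the target above the cutoff.

    profiles: {gene: [expression per sample]}.
    """
    return [g for g, p in profiles.items()
            if g != target and pearson(profiles[target], p) >= cutoff]
```

Genes passing such a cutoff form the candidate sets of functionally related or co-regulated genes the abstract refers to.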
ProteinWorldDB: querying radical pairwise alignments among protein sets from complete genomes
Otto, Thomas Dan; Catanho, Marcos; Tristão, Cristian; Bezerra, Márcia; Fernandes, Renan Mathias; Elias, Guilherme Steinberger; Scaglia, Alexandre Capeletto; Bovermann, Bill; Berstis, Viktors; Lifschitz, Sergio; de Miranda, Antonio Basílio; Degrave, Wim
2010-01-01
Motivation: Many analyses in modern biological research are based on comparisons between biological sequences, resulting in functional, evolutionary and structural inferences. When large numbers of sequences are compared, heuristics are often used resulting in a certain lack of accuracy. In order to improve and validate results of such comparisons, we have performed radical all-against-all comparisons of 4 million protein sequences belonging to the RefSeq database, using an implementation of the Smith–Waterman algorithm. This extremely intensive computational approach was made possible with the help of World Community Grid™, through the Genome Comparison Project. The resulting database, ProteinWorldDB, which contains coordinates of pairwise protein alignments and their respective scores, is now made available. Users can download, compare and analyze the results, filtered by genomes, protein functions or clusters. ProteinWorldDB is integrated with annotations derived from Swiss-Prot, Pfam, KEGG, NCBI Taxonomy database and gene ontology. The database is a unique and valuable asset, representing a major effort to create a reliable and consistent dataset of cross-comparisons of the whole protein content encoded in hundreds of completely sequenced genomes using a rigorous dynamic programming approach. Availability: The database can be accessed through http://proteinworlddb.org Contact: otto@fiocruz.br PMID:20089515
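The Smith–Waterman algorithm used for these all-against-all comparisons fills a dynamic-programming matrix and reports the best local score. A compact score-only sketch with a simple match/mismatch scheme and linear gap penalty (the project's actual substitution matrices and gap model may differ):

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b.

    H[i][j] holds the best score of any local alignment ending at
    a[i-1], b[j-1]; the zero floor is what makes the alignment local.
    """
    H = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best
```

The O(len(a) × len(b)) cost per pair is why 4 million × 4 million rigorous comparisons required a volunteer computing grid rather than heuristics.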
dbPAF: an integrative database of protein phosphorylation in animals and fungi.
Ullah, Shahid; Lin, Shaofeng; Xu, Yang; Deng, Wankun; Ma, Lili; Zhang, Ying; Liu, Zexian; Xue, Yu
2016-03-24
Protein phosphorylation is one of the most important post-translational modifications (PTMs) and regulates a broad spectrum of biological processes. Recent progress in phosphoproteomic identification has generated a flood of phosphorylation sites, while the integration of these sites is an urgent need. In this work, we developed a curated database, dbPAF, containing known phosphorylation sites in H. sapiens, M. musculus, R. norvegicus, D. melanogaster, C. elegans, S. pombe and S. cerevisiae. From the scientific literature and public databases, we collected and integrated a total of 54,148 phosphoproteins with 483,001 phosphorylation sites. Multiple options are provided for accessing the data, while original references and other annotations are also present for each phosphoprotein. Based on the new data set, we computationally detected significantly over-represented sequence motifs around phosphorylation sites, predicted potential kinases that are responsible for the modification of collected phospho-sites, and evolutionarily analyzed phosphorylation conservation states across different species. Besides being largely consistent with previous reports, our results also propose new features of phospho-regulation. Taken together, our database can be useful for further analyses of protein phosphorylation in human and other model organisms. The dbPAF database was implemented in PHP + MySQL and is freely available at http://dbpaf.biocuckoo.org.
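Motif detection around phosphorylation sites typically starts by extracting a fixed-width sequence window centered on each site. A minimal sketch (the window width and padding character are illustrative choices, not dbPAF's documented parameters):

```python
def site_window(sequence, pos, flank=7):
    """Residue window of ±flank around a phosphosite (1-based pos).

    Windows near either end of the protein are padded with '-' so that
    every window has the same length, 2*flank + 1, which keeps the
    positions aligned for motif counting.
    """
    i = pos - 1
    left = sequence[max(0, i - flank):i].rjust(flank, "-")
    right = sequence[i + 1:i + 1 + flank].ljust(flank, "-")
    return left + sequence[i] + right
```

Stacking such windows for all sites and counting residue frequencies per column is the basis of the over-represented motif analysis described above.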
LenVarDB: database of length-variant protein domains.
Mutt, Eshita; Mathew, Oommen K; Sowdhamini, Ramanathan
2014-01-01
Protein domains are functionally and structurally independent modules, which add to the functional variety of proteins. This array of functional diversity has been enabled by evolutionary changes, such as amino acid substitutions, insertions or deletions, occurring in these protein domains. Length variations (indels) can introduce changes at the structural, functional and interaction levels. LenVarDB (freely available at http://caps.ncbs.res.in/lenvardb/) traces these length variations, starting from structure-based sequence alignments in our Protein Alignments organized as Structural Superfamilies (PASS2) database, across 731 structural classification of proteins (SCOP)-based protein domain superfamilies connected to 2 730 625 sequence homologues. Alignment of sequence homologues corresponding to a structural domain is available, starting from a structure-based sequence alignment of the superfamily. Orientation of the length-variant (indel) regions in protein domains can be visualized by mapping them on the structure and on the alignment. Knowledge about the location of length variations within protein domains and their visual representation will be useful in predicting changes within structurally or functionally relevant sites, which may ultimately regulate protein function. Non-technical summary: Evolutionary changes bring about natural changes to proteins that may be found in many organisms. Such changes could be reflected as amino acid substitutions or insertions-deletions (indels) in protein sequences. LenVarDB is a database that provides an overview of length variations observed across 731 protein families, compiled after examining >2 million sequences. Indels are followed up to observe if they are close to the active site, such that they can affect the activity of proteins. Inclusion of such information can aid the design of bioengineering experiments.
Brudey, Karine; Driscoll, Jeffrey R; Rigouts, Leen; Prodinger, Wolfgang M; Gori, Andrea; Al-Hajoj, Sahal A; Allix, Caroline; Aristimuño, Liselotte; Arora, Jyoti; Baumanis, Viesturs; Binder, Lothar; Cafrune, Patricia; Cataldi, Angel; Cheong, Soonfatt; Diel, Roland; Ellermeier, Christopher; Evans, Jason T; Fauville-Dufaux, Maryse; Ferdinand, Séverine; de Viedma, Dario Garcia; Garzelli, Carlo; Gazzola, Lidia; Gomes, Harrison M; Guttierez, M Cristina; Hawkey, Peter M; van Helden, Paul D; Kadival, Gurujaj V; Kreiswirth, Barry N; Kremer, Kristin; Kubin, Milan; Kulkarni, Savita P; Liens, Benjamin; Lillebaek, Troels; Ly, Ho Minh; Martin, Carlos; Martin, Christian; Mokrousov, Igor; Narvskaïa, Olga; Ngeow, Yun Fong; Naumann, Ludmilla; Niemann, Stefan; Parwati, Ida; Rahim, Zeaur; Rasolofo-Razanamparany, Voahangy; Rasolonavalona, Tiana; Rossetti, M Lucia; Rüsch-Gerdes, Sabine; Sajduda, Anna; Samper, Sofia; Shemyakin, Igor G; Singh, Urvashi B; Somoskovi, Akos; Skuce, Robin A; van Soolingen, Dick; Streicher, Elisabeth M; Suffys, Philip N; Tortoli, Enrico; Tracevska, Tatjana; Vincent, Véronique; Victor, Tommie C; Warren, Robin M; Yap, Sook Fan; Zaman, Khadiza; Portaels, Françoise; Rastogi, Nalin; Sola, Christophe
2006-01-01
Background The Direct Repeat locus of the Mycobacterium tuberculosis complex (MTC) is a member of the CRISPR (clustered regularly interspaced short palindromic repeats) sequence family. Spoligotyping is the widely used PCR-based reverse-hybridization blotting technique that assays the genetic diversity of this locus and is useful for clinical laboratory work, molecular epidemiology, and evolutionary and population genetics. It is easy, robust, cheap, and produces highly diverse, portable numerical results, as the outcome of the combination of (1) unique event polymorphism (UEP) and (2) insertion-sequence-mediated genetic recombination. Genetic convergence, although rare, was also previously demonstrated. Three previous international spoligotype databases had partly revealed the global and local geographical structures of MTC bacilli populations; however, there was a need for the release of a new, more representative and extended, international spoligotyping database. Results The fourth international spoligotyping database, SpolDB4, describes 1939 shared types (STs) representative of a total of 39,295 strains from 122 countries, which are tentatively classified into 62 clades/lineages using a mixed expert-based and bioinformatical approach. The SpolDB4 update adds 26 new potentially phylogeographically-specific MTC genotype families. It provides a clearer picture of the current diversity of MTC genomes as well as of the relationships between the genetic attributes investigated (spoligotypes) and the infra-species classification and evolutionary history of the species. Indeed, an independent Naïve-Bayes mixture-model analysis has validated most of the previous supervised SpolDB3 classification results, confirming the usefulness of both supervised and unsupervised models as approaches to understand MTC population structure. Updated results on the epidemiological status of spoligotypes, as well as genetic prevalence maps of six main lineages, are also shown.
Our results suggest the existence of fine geographical genetic clines within MTC populations, which could mirror the past and present Homo sapiens sapiens demographic and mycobacterial co-evolutionary history, and whose structure could be further reconstructed and modelled, thereby providing a large-scale conceptual framework of the global TB Epidemiologic Network. Conclusion Our results broaden the knowledge of the global phylogeography of the MTC complex. SpolDB4 should be a very useful tool to better define the identity of a given MTC clinical isolate, and to better analyze the links between its current spreading and previous evolutionary history. The building and mining of extended MTC polymorphic genetic databases is in progress. PMID:16519816
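Spoligotype patterns like those catalogued in SpolDB4 are conventionally exchanged as a 15-character octal code derived from the 43-spacer presence/absence pattern: the first 42 spacers are read in triplets as 14 octal digits, and the 43rd spacer is appended as the final character. A sketch of that conversion:

```python
def spoligotype_octal(spacers):
    """Binary 43-spacer pattern ('1' = spacer present) -> octal code.

    Produces the conventional 15-character code: 14 octal digits for
    spacers 1-42 plus the raw value of spacer 43.
    """
    assert len(spacers) == 43 and set(spacers) <= {"0", "1"}
    digits = [str(int(spacers[i:i + 3], 2)) for i in range(0, 42, 3)]
    return "".join(digits) + spacers[42]
```

This compact code is what makes the "portable numerical results" of spoligotyping easy to compare across laboratories and databases.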
2004-06-01
…remote databases, has seen little vendor acceptance. Each database (Oracle, DB2, MySQL, etc.) has its own client-server protocol. Therefore each…existing standards – SQL, X.500/LDAP, FTP, etc. • View information dissemination as selective replication – state-oriented vs. message-oriented…allowing the application to start. The resource management system would serve as a broker to the resources, making sure that resources are not
FlavorDB: a database of flavor molecules.
Garg, Neelansh; Sethupathy, Apuroop; Tuwani, Rudraksh; Nk, Rakhi; Dokania, Shubham; Iyer, Arvind; Gupta, Ayushi; Agrawal, Shubhra; Singh, Navjot; Shukla, Shubham; Kathuria, Kriti; Badhwar, Rahul; Kanji, Rakesh; Jain, Anupam; Kaur, Avneet; Nagpal, Rashmi; Bagler, Ganesh
2018-01-04
Flavor is an expression of olfactory and gustatory sensations experienced through a multitude of chemical processes triggered by molecules. Beyond their key role in defining taste and smell, flavor molecules also regulate metabolic processes with consequences to health. Such molecules present in natural sources have been an integral part of human history, with limited success in attempts to create synthetic alternatives. Given their utility in various spheres of life such as food and fragrances, it is valuable to have a repository of flavor molecules, their natural sources, physicochemical properties, and sensory responses. FlavorDB (http://cosylab.iiitd.edu.in/flavordb) comprises 25,595 flavor molecules representing an array of tastes and odors. Among these, 2,254 molecules are associated with 936 natural ingredients belonging to 34 categories. The dynamic, user-friendly interface of the resource facilitates exploration of flavor molecules for divergent applications: finding molecules matching a desired flavor or structure; exploring molecules of an ingredient; discovering novel food pairings; finding the molecular essence of food ingredients; associating chemical features with a flavor; and more. Data-driven studies based on FlavorDB can pave the way for an improved understanding of flavor mechanisms. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
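Food-pairing exploration of the kind FlavorDB enables is often based on counting the flavor molecules two ingredients share. A toy sketch (this is the generic shared-molecule heuristic, not FlavorDB's documented scoring; the data below are invented):

```python
def pairing_score(ing_a, ing_b, molecules):
    """Food-pairing heuristic: number of shared flavor molecules.

    molecules: {ingredient: set of molecule identifiers}.
    """
    return len(molecules[ing_a] & molecules[ing_b])

def best_pairings(ingredient, molecules, top=3):
    """Other ingredients ranked by shared-molecule count."""
    others = (i for i in molecules if i != ingredient)
    return sorted(others,
                  key=lambda i: pairing_score(ingredient, i, molecules),
                  reverse=True)[:top]
```

With the ingredient-to-molecule associations the database provides, this simple set intersection is enough to surface candidate novel pairings.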
PeroxisomeDB: a database for the peroxisomal proteome, functional genomics and disease
Schlüter, Agatha; Fourcade, Stéphane; Domènech-Estévez, Enric; Gabaldón, Toni; Huerta-Cepas, Jaime; Berthommier, Guillaume; Ripp, Raymond; Wanders, Ronald J. A.; Poch, Olivier; Pujol, Aurora
2007-01-01
Peroxisomes are essential organelles of eukaryotic origin, ubiquitously distributed in cells and organisms, playing key roles in lipid and antioxidant metabolism. Loss or malfunction of peroxisomes causes more than 20 fatal inherited conditions. We have created a peroxisomal database that includes the complete peroxisomal proteome of Homo sapiens and Saccharomyces cerevisiae, by gathering, updating and integrating the available genetic and functional information on peroxisomal genes. PeroxisomeDB is structured in interrelated sections ‘Genes’, ‘Functions’, ‘Metabolic pathways’ and ‘Diseases’, which include hyperlinks to selected features of the NCBI, ENSEMBL and UCSC databases. We have designed graphical depictions of the main peroxisomal metabolic routes and have included updated flow charts for diagnosis. Precomputed BLAST, PSI-BLAST, multiple sequence alignment (MUSCLE) and phylogenetic trees are provided to assist in direct multispecies comparison, to study evolutionarily conserved functions and pathways. Highlights of PeroxisomeDB include new tools developed to facilitate (i) identification of novel peroxisomal proteins, by means of identifying proteins carrying peroxisome targeting signal (PTS) motifs, and (ii) detection of peroxisomes in silico, particularly useful for screening the deluge of newly sequenced genomes. PeroxisomeDB should contribute to the systematic characterization of the peroxisomal proteome and facilitate systems biology approaches to the organelle. PMID:17135190
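As a simple example of PTS-motif screening: the canonical C-terminal PTS1 signal is commonly summarized as the tripeptide (S/A/C)(K/R/H)(L/M). A minimal sketch (PeroxisomeDB's actual predictor is more elaborate; this shows only the canonical-tripeptide idea):

```python
import re

# Canonical C-terminal PTS1 tripeptide, commonly given as (S/A/C)(K/R/H)(L/M).
PTS1 = re.compile(r"[SAC][KRH][LM]$")

def has_pts1(protein_sequence):
    """True if the sequence ends in a canonical PTS1 motif (e.g. ...SKL)."""
    return bool(PTS1.search(protein_sequence))
```

Scanning a newly sequenced proteome with such a pattern is the kind of in silico peroxisome detection the abstract describes, with real tools also weighing non-canonical variants and sequence context.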
dbCPG: A web resource for cancer predisposition genes
Wei, Ran; Yao, Yao; Yang, Wu; Zheng, Chun-Hou; Zhao, Min; Xia, Junfeng
2016-01-01
Cancer predisposition genes (CPGs) are genes in which inherited mutations confer highly or moderately increased risks of developing cancer. Identification of these genes and understanding the biological mechanisms that underlie them is crucial for the prevention, early diagnosis, and optimized management of cancer. Over the past decades, great efforts have been made to identify CPGs through multiple strategies. However, information on these CPGs and their molecular functions is scattered. To address this issue and provide a comprehensive resource for researchers, we developed the Cancer Predisposition Gene Database (dbCPG, Database URL: http://bioinfo.ahu.edu.cn:8080/dbCPG/index.jsp), the first literature-based gene resource for exploring human CPGs. It contains 827 human (724 protein-coding, 23 non-coding, and 80 unknown-type genes), 637 rat, and 658 mouse CPGs. Furthermore, data mining was performed to gain insights into the CPG data, including functional annotation, gene prioritization, network analysis of prioritized genes and overlap analysis across multiple cancer types. A user-friendly web interface with multiple browse, search, and upload functions was also developed to facilitate access to the latest information on CPGs. Taken together, the dbCPG database provides a comprehensive data resource for further studies of cancer predisposition genes. PMID:27192119
RAId_DbS: Peptide Identification using Database Searches with Realistic Statistics
Alves, Gelio; Ogurtsov, Aleksey Y; Yu, Yi-Kuo
2007-01-01
Background The key to mass-spectrometry-based proteomics is peptide identification. A major challenge in peptide identification is to obtain realistic E-values when assigning statistical significance to candidate peptides. Results Using a simple scoring scheme, we propose a database search method with theoretically characterized statistics. Taking into account possible skewness in the random variable distribution and the effect of finite sampling, we provide a theoretical derivation for the tail of the score distribution. For every experimental spectrum examined, we collect the scores of peptides in the database, and find good agreement between the collected score statistics and our theoretical distribution. Using Student's t-tests, we quantify the degree of agreement between the theoretical distribution and the score statistics collected. The t-tests may be used to measure the reliability of the reported statistics. When combined with the reported P-value for a peptide hit under a score distribution model, this new measure prevents exaggerated statistics. Another feature of RAId_DbS is its capability of detecting multiple co-eluted peptides. The peptide identification performance and statistical accuracy of RAId_DbS are assessed and compared with several other search tools. The executables and data related to RAId_DbS are freely available upon request. PMID:17961253
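The relationship between a score, its P-value, and the resulting E-value can be illustrated with an empirical tail estimate (RAId_DbS's actual statistics use a theoretically derived tail with skewness and finite-sampling corrections; this sketch shows only the baseline idea):

```python
def empirical_pvalue(score, db_scores):
    """Tail probability: fraction of database peptide scores >= score."""
    return sum(1 for s in db_scores if s >= score) / len(db_scores)

def evalue(pvalue, n_candidates):
    """E-value: expected number of candidate peptides reaching this
    significance level purely by chance."""
    return pvalue * n_candidates
```

An empirical tail becomes unreliable exactly where it matters most (very small P-values, where few or no database scores fall), which is why a theoretically characterized tail distribution is the paper's central contribution.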
GeneSigDB—a curated database of gene expression signatures
Culhane, Aedín C.; Schwarzl, Thomas; Sultana, Razvan; Picard, Kermshlise C.; Picard, Shaita C.; Lu, Tim H.; Franklin, Katherine R.; French, Simon J.; Papenhausen, Gerald; Correll, Mick; Quackenbush, John
2010-01-01
The primary objective of most gene expression studies is the identification of one or more gene signatures: lists of genes whose transcriptional levels are uniquely associated with a specific biological phenotype. Whilst thousands of experimentally derived gene signatures are published, their potential value to the community is limited by their computational inaccessibility. Gene signatures are embedded in published article figures, tables or supplementary materials, and are frequently presented using non-standard gene or probeset nomenclature. We present GeneSigDB (http://compbio.dfci.harvard.edu/genesigdb), a manually curated database of gene expression signatures. GeneSigDB release 1.0 focuses on cancer and stem cell gene signatures and was constructed from more than 850 publications, from which we manually transcribed 575 gene signatures. Most gene signatures (n = 560) were successfully mapped to the genome to extract standardized lists of EnsEMBL gene identifiers. GeneSigDB provides the original gene signature, the standardized gene list and a fully traceable gene mapping history for each gene, from the original transcribed data table through to the standardized list of genes. The GeneSigDB web portal is easy to search, allows users to compare their own gene list to those in the database, and supports download of gene signatures in most common gene identifier formats. PMID:19934259
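The "fully traceable gene mapping history" described above can be sketched as a lookup that records the outcome for every published identifier, mapped or not. The lookup table and function names are hypothetical; GeneSigDB's actual curation pipeline is more involved.

```python
def standardize_signature(published_genes, symbol_to_ensembl):
    """Map published gene symbols to standardized Ensembl IDs while
    recording a provenance entry for every input identifier."""
    standardized, history = [], []
    for gene in published_genes:
        ensembl_id = symbol_to_ensembl.get(gene.upper())
        history.append({"published": gene,
                        "mapped_to": ensembl_id,
                        "status": "mapped" if ensembl_id else "unmapped"})
        if ensembl_id:
            standardized.append(ensembl_id)
    return standardized, history

# toy lookup table; a real one would come from an Ensembl annotation dump
lookup = {"TP53": "ENSG00000141510", "BRCA1": "ENSG00000012048"}
ids, trace = standardize_signature(["tp53", "Brca1", "FOO1"], lookup)
print(ids)  # ['ENSG00000141510', 'ENSG00000012048']
```

Keeping the per-gene history alongside the standardized list is what lets a user audit why a published symbol did or did not make it into the final signature.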
SilkPathDB: a comprehensive resource for the study of silkworm pathogens
Pan, Guo-Qing; Vossbrinck, Charles R.; Xu, Jin-Shan; Li, Chun-Feng; Chen, Jie; Long, Meng-Xian; Yang, Ming; Xu, Xiao-Fei; Xu, Chen; Debrunner-Vossbrinck, Bettina A.
2017-01-01
Silkworm pathogens have long impeded the development of the sericultural industry; they also play important roles in lepidopteran ecology, and some are used as biological insecticides. Rapid advances in studies on the omics of silkworm pathogens have produced a large amount of data, which need to be brought together centrally in a coherent and systematic manner. This will facilitate the reuse of these data for further analysis. We have collected genomic data for 86 silkworm pathogens from 4 taxa (fungi, microsporidia, bacteria and viruses) and from 4 lepidopteran hosts, and developed the open-access Silkworm Pathogen Database (SilkPathDB) to make this information readily available. The implementation of SilkPathDB integrates Drupal and GBrowse as a graphic interface to a Chado relational database which houses all of the datasets involved. The genomes have been assembled and annotated for comparative purposes and allow the search and analysis of homologous sequences, transposable elements, protein subcellular locations (including secreted proteins), and gene ontology. We believe that SilkPathDB will aid researchers in identifying silkworm parasites, understanding the mechanisms of silkworm infection, and studying the developmental ecology (gene expression) of silkworm parasites and their hosts. Database URL: http://silkpathdb.swu.edu.cn PMID:28365723
GigaDB: promoting data dissemination and reproducibility
Sneddon, Tam P.; Si Zhe, Xiao; Edmunds, Scott C.; Li, Peter; Goodman, Laurie; Hunter, Christopher I.
2014-01-01
Papers are often published without the underlying data supporting the research being made available, because of the limitations of making such large data sets publicly and permanently accessible. Even if the raw data are deposited in public archives, the essential analysis intermediaries, scripts or software are frequently not made available, meaning the science is not reproducible. The GigaScience journal is attempting to address this issue with its associated data storage and dissemination portal, the GigaScience database (GigaDB). Here we present the current version of GigaDB and reveal plans for the next generation of improvements. Most importantly, however, we are soliciting responses from you, the users, to ensure that future developments are focused on the data storage and dissemination issues that still need resolving. Database URL: http://www.gigadb.org PMID:24622612
MultitaskProtDB: a database of multitasking proteins
Hernández, Sergio; Ferragut, Gabriela; Amela, Isaac; Perez-Pons, JosepAntoni; Piñol, Jaume; Mozo-Villarias, Angel; Cedano, Juan; Querol, Enrique
2014-01-01
We have compiled MultitaskProtDB, available online at http://wallace.uab.es/multitask, to provide a repository where the many multitasking proteins found in the literature can be stored. Multitasking, or moonlighting, is the capability of some proteins to execute two or more biological functions. Usually, multitasking proteins are revealed experimentally by serendipity. This ability of proteins to perform multitasking functions helps us to understand one of the ways used by cells to perform many complex functions with a limited number of genes. Even so, the study of this phenomenon is complex because, among other things, there has been no database of moonlighting proteins. The existence of such a tool facilitates the collection and dissemination of these important data. This work reports the database, MultitaskProtDB, which is designed as a user-friendly web page containing >288 multitasking proteins with their NCBI and UniProt accession numbers, canonical and additional biological functions, monomeric/oligomeric states, PDB codes when available and bibliographic references. This database also serves to gain insight into some characteristics of multitasking proteins, such as the frequencies of the different pairs of functions, phylogenetic conservation and so forth. PMID:24253302
LDSplitDB: a database for studies of meiotic recombination hotspots in MHC using human genomic data.
Guo, Jing; Chen, Hao; Yang, Peng; Lee, Yew Ti; Wu, Min; Przytycka, Teresa M; Kwoh, Chee Keong; Zheng, Jie
2018-04-20
Meiotic recombination happens during the process of meiosis, when chromosomes inherited from two parents exchange genetic material to generate chromosomes in the gamete cells. Recombination events tend to occur in narrow genomic regions called recombination hotspots, and their dysregulation can lead to serious human diseases such as birth defects. Although the regulatory mechanism of recombination events is still unclear, DNA sequence polymorphisms have been found to play crucial roles in the regulation of recombination hotspots. To facilitate studies of the underlying mechanism, we developed a database named LDSplitDB which provides an integrative and interactive data mining and visualization platform for genome-wide association studies of recombination hotspots. It contains the pre-computed association maps of the major histocompatibility complex (MHC) region in the 1000 Genomes Project and the HapMap Phase III datasets, and a genome-scale study of the European population from the HapMap Phase II dataset. Besides the recombination profiles, related data on genes, SNPs and different types of epigenetic modifications, which could be associated with meiotic recombination, are provided for comprehensive analysis. To meet the computational requirements of the rapidly increasing population genomics data, we prepared a lookup table for 400 haplotypes that includes all possible two-locus haplotype configurations, for recombination rate estimation using the well-known LDhat algorithm. To the best of our knowledge, LDSplitDB is the first large-scale database for the association analysis of human recombination hotspots with DNA sequence polymorphisms. It provides valuable resources for discovering the mechanism of meiotic recombination hotspots. The information about the MHC in this database could help in understanding the roles of recombination in the human immune system. DATABASE URL: http://histone.scse.ntu.edu.sg/LDSplitDB.
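The pre-computed lookup-table idea can be illustrated by enumerating the two-locus haplotype count configurations for a small sample: for n haplotypes there are C(n+3, 3) ways to distribute them among the four possible haplotypes 00, 01, 10 and 11. This is a schematic sketch of why such tables are finite and enumerable, not LDhat's actual table format.

```python
from math import comb

def two_locus_configurations(n):
    """Enumerate every two-locus haplotype count configuration
    (n00, n01, n10, n11) whose counts sum to n haplotypes."""
    return [(a, b, c, n - a - b - c)
            for a in range(n + 1)
            for b in range(n - a + 1)
            for c in range(n - a - b + 1)]

configs = two_locus_configurations(4)
print(len(configs))  # 35, which equals C(4 + 3, 3)
```

Pre-computing a likelihood per configuration once, as LDSplitDB does for 400 haplotypes, turns the expensive estimation step into a table lookup.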
Trevarton, Alexander J.; Mann, Michael B.; Knapp, Christoph; Araki, Hiromitsu; Wren, Jonathan D.; Stones-Havas, Steven; Black, Michael A.; Print, Cristin G.
2013-01-01
Despite on-going research, metastatic melanoma survival rates remain low and treatment options are limited. Researchers can now access a rapidly growing amount of molecular and clinical information about melanoma. This information is becoming difficult to assemble and interpret due to its dispersed nature, yet as it grows it becomes increasingly valuable for understanding melanoma. Integration of this information into a comprehensive resource to aid rational experimental design and patient stratification is needed. As an initial step in this direction, we have assembled a web-accessible melanoma database, MelanomaDB, which incorporates clinical and molecular data from publicly available sources and will be regularly updated as new information becomes available. This database allows complex links to be drawn between many different aspects of melanoma biology: genetic changes (e.g., mutations) in individual melanomas revealed by DNA sequencing, associations between gene expression and patient survival, data concerning drug targets, biomarkers, druggability, and clinical trials, as well as our own statistical analysis of relationships between molecular pathways and clinical parameters that has been produced using these data sets. The database is freely available at http://genesetdb.auckland.ac.nz/melanomadb/about.html. A subset of the information in the database can also be accessed through a freely available web application in the Illumina genomic cloud computing platform BaseSpace at http://www.biomatters.com/apps/melanoma-profiler-for-research. The MelanomaDB database illustrates dysregulation of specific signaling pathways across 310 exome-sequenced melanomas and in individual tumors, and identifies the distribution of somatic variants in melanoma. We suggest that MelanomaDB can provide a context in which to interpret the tumor molecular profiles of individual melanoma patients relative to biological information and available drug therapies. PMID:23875173
TheHiveDB image data management and analysis framework.
Muehlboeck, J-Sebastian; Westman, Eric; Simmons, Andrew
2014-01-06
The hive database system (theHiveDB) is a web-based brain imaging database, collaboration, and activity system which has been designed as an imaging workflow management system capable of handling cross-sectional and longitudinal multi-center studies. It can be used to organize and integrate existing data from heterogeneous projects as well as data from ongoing studies. It has been conceived to guide and assist the researcher throughout the entire research process, integrating all relevant types of data across modalities (e.g., brain imaging, clinical, and genetic data). TheHiveDB is a modern activity and resource management system capable of scheduling image processing on both private compute resources and the cloud. The activity component supports common image archival and management tasks as well as established pipeline processing (e.g., Freesurfer for extraction of scalar measures from magnetic resonance images). Furthermore, via theHiveDB activity system algorithm developers may grant access to virtual machines hosting versioned releases of their tools to collaborators and the imaging community. The application of theHiveDB is illustrated with a brief use case based on organizing, processing, and analyzing data from the publicly available Alzheimer Disease Neuroimaging Initiative. PMID:24432000
Clima, Rosanna; Preste, Roberto; Calabrese, Claudia; Diroma, Maria Angela; Santorsola, Mariangela; Scioscia, Gaetano; Simone, Domenico; Shen, Lishuang; Gasparre, Giuseppe; Attimonelli, Marcella
2017-01-01
The HmtDB resource hosts a database of human mitochondrial genome sequences from individuals with healthy and disease phenotypes. The database is intended to support both population geneticists and clinicians undertaking the task of assessing the pathogenicity of specific mtDNA mutations. The wide application of next-generation sequencing (NGS) has provided an enormous volume of high-resolution data at a low price, increasing the availability of human mitochondrial sequencing data, which has called for a cogent and significant expansion of the HmtDB data content; it has more than tripled in the current release. We here describe additional novel features, including: (i) a complete, user-friendly restyling of the web interface, (ii) links to the command-line stand-alone and web versions of the MToolBox package, an up-to-date tool to reconstruct and analyze human mitochondrial DNA from NGS data and (iii) the implementation of the Reconstructed Sapiens Reference Sequence (RSRS) as the mitochondrial reference sequence. The overall update renders HmtDB an even more handy and useful resource, as it enables more rapid data access, processing and analysis. HmtDB is accessible at http://www.hmtdb.uniba.it/. PMID:27899581
VitisExpDB: a database resource for grape functional genomics.
Doddapaneni, Harshavardhan; Lin, Hong; Walker, M Andrew; Yao, Jiqiang; Civerolo, Edwin L
2008-02-28
The family Vitaceae consists of many different grape species that grow in a range of climatic conditions. In the past few years, several studies have generated functional genomic information on different Vitis species and cultivars, including the European grape vine, Vitis vinifera. Our goal is to develop a comprehensive web data source for Vitaceae. VitisExpDB is an online MySQL-PHP driven relational database that houses annotated EST and gene expression data for V. vinifera and non-vinifera grape species and varieties. Currently, the database stores approximately 320,000 EST sequences derived from 8 species/hybrids, their annotation (BLAST top match) details and Gene Ontology based structured vocabulary. Putative homologs for each EST in other species and varieties, along with information on their percent nucleotide identities, phylogenetic relationships and common primers, can be retrieved. The database also includes information on probe sequences and annotation features of the high-density 60-mer gene expression chip consisting of a non-redundant set of approximately 20,000 ESTs. Finally, the database includes 14 processed global microarray expression profile sets. Data from 12 of these expression profile sets have been mapped onto metabolic pathways. A user-friendly web interface with multiple search indices and extensively hyperlinked result features that permit efficient data retrieval has been developed. Several online bioinformatics tools that interact with the database, along with other sequence analysis tools, have been added. In addition, users can submit their ESTs to the database. The database provides a genomic resource to the grape community for functional analysis of genes in the collection and for grape genome annotation and gene function identification. The VitisExpDB database is available through our website http://cropdisease.ars.usda.gov/vitis_at/main-page.htm. PMID:18307813
The Web-Database Connection Tools for Sharing Information on the Campus Intranet.
ERIC Educational Resources Information Center
Thibeault, Nancy E.
This paper evaluates four tools for creating World Wide Web pages that interface with Microsoft Access databases: DB Gateway, Internet Database Assistant (IDBA), Microsoft Internet Database Connector (IDC), and Cold Fusion. The system requirements and features of each tool are discussed. A sample application, "The Virtual Help Desk"…
Haunsberger, Stefan J; Connolly, Niamh M C; Prehn, Jochen H M
2017-02-15
The miRBase database is the central and official repository for miRNAs; the current release is miRBase version 21.0. Name changes between miRBase releases cause inconsistencies in miRNA names from version to version. When working with only a small number of miRNAs the translation can be done manually; with large sets of miRNAs, however, the necessary correction of such inconsistencies becomes burdensome and error-prone. We developed miRNAmeConverter, available as a Bioconductor R package and web interface, to address the challenges associated with mature miRNA name inconsistencies. The main algorithm implemented enables high-throughput automatic translation of species-independent mature miRNA names to user-selected miRBase versions. The web interface enables users less familiar with R to translate miRNA names given in the form of a list or embedded in text, and to download the results. The miRNAmeConverter R package is open source under the Artistic-2.0 license. It is freely available from Bioconductor (http://bioconductor.org/packages/miRNAmeConverter). The web interface is based on R Shiny and can be accessed at http://www.systemsmedicineireland.ie/tools/mirna-name-converter/. The database that miRNAmeConverter depends on is provided by the annotation package miRBaseVersions.db and can be downloaded from Bioconductor (http://bioconductor.org/packages/miRBaseVersions.db). Minimum R version 3.3.0 is required. Contact: stefanhaunsberger@rcsi.ie. Supplementary data are available at Bioinformatics online.
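The translation miRNAmeConverter performs amounts to following an accession's name history across miRBase releases. Since the package itself is written in R, the following is only a language-neutral sketch in Python; the history table is a toy example keyed by a miRBase accession.

```python
def translate_mirna(name, history, target_version):
    """Translate a mature miRNA name to the name used in the target
    miRBase release by locating its accession's name history.
    history maps accession -> {version: name}."""
    for accession, names in history.items():
        if name in names.values():          # name matches any release
            return names.get(target_version)
    return None                             # unknown name or version

# toy history: hsa-let-7a was renamed hsa-let-7a-5p in later releases
history = {"MIMAT0000062": {17.0: "hsa-let-7a", 21.0: "hsa-let-7a-5p"}}
print(translate_mirna("hsa-let-7a", history, 21.0))  # hsa-let-7a-5p
```

Keying the table by the stable accession rather than the mutable name is what makes the translation well-defined across releases.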
Sharing mutants and experimental information prepublication using FgMutantDB
USDA-ARS?s Scientific Manuscript database
There has been no central location for storing generated mutants of Fusarium graminearum or for data associated with these mutants. Instead researchers relied on several independent, non-integrated databases. FgMutantDB was designed as a simple spreadsheet that is accessible globally on the web th...
An ECG storage and retrieval system embedded in client server HIS utilizing object-oriented DB.
Wang, C; Ohe, K; Sakurai, T; Nagase, T; Kaihara, S
1996-02-01
In the University of Tokyo Hospital, the improved client-server HIS has been applied to clinical practice, and physicians can directly order prescriptions, laboratory examinations, ECG examinations, radiographic examinations, etc. by themselves and read the results of these examinations, except for medical signal waves, schemata and images, on UNIX workstations. Recently, we designed and developed an ECG storage and retrieval system embedded in the client-server HIS utilizing an object-oriented database, to take the first step in dealing with digitized signal, schema and image data and to show waves, graphics, and images directly to physicians through the client-server HIS. The system was developed based on object-oriented analysis and design, and implemented with an object-oriented database management system (OODMS) and the C++ programming language. In this paper, we describe the ECG data model, the functions of the storage and retrieval system, the features of the user interface and the result of its implementation in the HIS.
Nosql for Storage and Retrieval of Large LIDAR Data Collections
NASA Astrophysics Data System (ADS)
Boehm, J.; Liu, K.
2015-08-01
Developments in LiDAR technology over the past decades have made LiDAR a mature and widely accepted source of geospatial information. This in turn has led to an enormous growth in data volume. The central idea of a file-centric storage of LiDAR point clouds is the observation that large collections of LiDAR data are typically delivered as large collections of files, rather than single files of terabyte size. This split of the dataset, commonly referred to as tiling, was usually done to accommodate a specific processing pipeline. It therefore makes sense to preserve this split. A document-oriented NoSQL database can easily emulate this data partitioning by representing each tile (file) in a separate document. The document stores the metadata of the tile. The actual files are stored in a distributed file system emulated by the NoSQL database. We demonstrate the use of MongoDB, a highly scalable document-oriented NoSQL database, for storing large LiDAR files. MongoDB, like any NoSQL database, allows for queries on the attributes of the document. Notably, MongoDB also supports spatial queries. Hence we can perform spatial queries on the bounding boxes of the LiDAR tiles. Inserting and retrieving files on a cloud-based database is compared to a native file system and cloud storage in terms of transfer speed.
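The tile-per-document model can be sketched in plain Python: each document holds one tile's metadata, including its bounding box. MongoDB would answer the same question with a geospatial index (e.g. a $geoIntersects query) instead of the linear scan shown here; the file names and point counts are made up.

```python
def bbox_intersects(a, b):
    """Axis-aligned 2-D bounding boxes given as (xmin, ymin, xmax, ymax)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def find_tiles(collection, query_bbox):
    """Return the metadata documents of all tiles whose bounding box
    intersects the query region (a full scan; a spatial index would
    avoid touching every document)."""
    return [doc for doc in collection if bbox_intersects(doc["bbox"], query_bbox)]

tiles = [  # one document per LiDAR tile (file); values are illustrative
    {"file": "tile_001.las", "points": 1_200_000, "bbox": (0, 0, 100, 100)},
    {"file": "tile_002.las", "points": 950_000, "bbox": (100, 0, 200, 100)},
]
print([d["file"] for d in find_tiles(tiles, (90, 10, 110, 20))])
# ['tile_001.las', 'tile_002.las'] -- the query box straddles both tiles
```

Because only metadata lives in the documents, a spatial query selects the few relevant tiles before any bulky point data is fetched from the file store.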
dndDB: a database focused on phosphorothioation of the DNA backbone.
Ou, Hong-Yu; He, Xinyi; Shao, Yucheng; Tai, Cui; Rajakumar, Kumar; Deng, Zixin
2009-01-01
The Dnd DNA degradation phenotype was first observed during electrophoresis of genomic DNA from Streptomyces lividans more than 20 years ago. It was subsequently shown to be governed by the five-gene dnd cluster. Similar gene clusters have now been found to be widespread among many other distantly related bacteria. Recently, the dnd cluster was shown to mediate the incorporation of sulphur into the DNA backbone via a sequence-selective, stereo-specific phosphorothioate modification in Escherichia coli B7A. Intriguingly, to date all identified dnd clusters lie within mobile genetic elements, the vast majority in laterally transferred genomic islands. We organized available data from experimental and bioinformatics analyses of the DNA phosphorothioation phenomenon, together with associated documentation, into the dndDB database. It contains the following detailed information: (i) the Dnd phenotype; (ii) dnd gene clusters; (iii) genomic islands harbouring dnd genes; (iv) Dnd proteins and conserved domains. As of 25 December 2008, dndDB contained data corresponding to 24 bacterial species exhibiting the Dnd phenotype reported in the scientific literature. In addition, via in silico analysis, dndDB identified 26 syntenic dnd clusters from 25 species of Eubacteria and Archaea, 25 dnd-bearing genomic islands and one dnd plasmid, containing 114 dnd genes. A further 397 other genes coding for proteins with varying levels of similarity to Dnd proteins were also included in dndDB. A broad range of similarity search, sequence alignment and phylogenetic tools are readily accessible to allow for individualized directions of research focused on dnd genes. dndDB can facilitate efficient investigation of a wide range of aspects relating to dnd DNA modification and other island-encoded functions in host organisms. dndDB version 1.0 is freely available at http://mml.sjtu.edu.cn/dndDB/.
Evaluation of relational and NoSQL database architectures to manage genomic annotations.
Schulz, Wade L; Nelson, Brent G; Felker, Donn K; Durant, Thomas J S; Torres, Richard
2016-12-01
While the adoption of next generation sequencing has rapidly expanded, the informatics infrastructure used to manage the data generated by this technology has not kept pace. Historically, relational databases have provided much of the framework for data storage and retrieval. Newer technologies based on NoSQL architectures may provide significant advantages in storage and query efficiency, thereby reducing the cost of data management. But their relative advantage when applied to biomedical data sets, such as genetic data, has not been characterized. To this end, we compared the storage, indexing, and query efficiency of a common relational database (MySQL), a document-oriented NoSQL database (MongoDB), and a relational database with NoSQL support (PostgreSQL). When used to store genomic annotations from the dbSNP database, we found the NoSQL architectures to outperform traditional, relational models for speed of data storage, indexing, and query retrieval in nearly every operation. These findings strongly support the use of novel database technologies to improve the efficiency of data management within the biological sciences.
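A minimal relational version of this kind of workload, using Python's built-in SQLite rather than the MySQL, MongoDB or PostgreSQL systems actually benchmarked, might look like this. The schema and variant records are illustrative only, not the paper's benchmark data.

```python
import sqlite3

# store dbSNP-style annotations, index by genomic position, range-query
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE variants (
                    rsid  TEXT PRIMARY KEY,
                    chrom TEXT NOT NULL,
                    pos   INTEGER NOT NULL,
                    ref   TEXT,
                    alt   TEXT)""")
conn.executemany("INSERT INTO variants VALUES (?, ?, ?, ?, ?)", [
    ("rs0001", "chr1", 1_000_150, "A", "G"),
    ("rs0002", "chr1", 2_500_000, "C", "T"),
    ("rs0003", "chr2", 1_000_150, "G", "A"),
])
# a (chrom, pos) index is what makes locus range queries cheap
conn.execute("CREATE INDEX idx_locus ON variants (chrom, pos)")

rows = conn.execute("""SELECT rsid FROM variants
                       WHERE chrom = 'chr1'
                         AND pos BETWEEN 1 AND 2000000""").fetchall()
print(rows)  # [('rs0001',)]
```

A document store would hold the same records as JSON-like documents and index the same fields; the paper's comparison is essentially about how fast each engine performs this insert/index/query cycle at dbSNP scale.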
NASA Astrophysics Data System (ADS)
Rack, F. R.
2005-12-01
The Integrated Ocean Drilling Program (IODP: 2003-2013 initial phase) is the successor to the Deep Sea Drilling Project (DSDP: 1968-1983) and the Ocean Drilling Program (ODP: 1985-2003). These earlier scientific drilling programs amassed collections of sediment and rock cores (over 300 kilometers stored in four repositories) and data organized in distributed databases and in print or electronic publications. International members of the IODP have established, through memoranda, the right to have access to: (1) all data, samples, scientific and technical results, all engineering plans, data or other information produced under contract to the program; and, (2) all data from geophysical and other site surveys performed in support of the program which are used for drilling planning. The challenge that faces the individual platform operators and management of IODP is to find the right balance and appropriate synergies among the needs, expectations and requirements of stakeholders. The evolving model for IODP database services consists of the management and integration of data collected onboard the various IODP platforms (including downhole logging and syn-cruise site survey information), legacy data from DSDP and ODP, data derived from post-cruise research and publications, and other IODP-relevant information types, to form a common, program-wide IODP information system (e.g., IODP Portal) which will be accessible to both researchers and the public. The JANUS relational database of ODP was introduced in 1997 and the bulk of ODP shipboard data has been migrated into this system, which is comprised of a relational data model consisting of over 450 tables. The JANUS database includes paleontological, lithostratigraphic, chemical, physical, sedimentological, and geophysical data from a global distribution of sites. 
For ODP Legs 100 through 210, and including IODP Expeditions 301 through 308, JANUS has been used to store data from 233,835 meters of core recovered, comprising 38,039 cores with 202,281 core sections stored in repositories, from which 2,299,180 samples have been taken for scientists and other users (http://iodp.tamu.edu/janusweb/general/dbtable.cgi). JANUS and other IODP databases are viewed as components of an evolving distributed network of databases, supported by metadata catalogs and middleware with XML workflows, that are intended to provide access to DSDP/ODP/IODP cores and sample-based data as well as other distributed geoscience data collections (e.g., CHRONOS, PetDB, SedDB). These data resources can be explored through the use of emerging data visualization environments, such as GeoWall; CoreWall (http://www.evl.uic.edu/cavern/corewall), a multi-screen display for viewing cores and related data; GeoWall-2 and LambdaVision, a very-high-resolution, networked environment for data exploration and visualization; and others. The U.S. Implementing Organization (USIO) for the IODP, also known as the JOI Alliance, is a partnership between Joint Oceanographic Institutions (JOI), Texas A&M University, and Lamont-Doherty Earth Observatory of Columbia University. JOI is a consortium of 20 premier oceanographic research institutions that serves the U.S. scientific community by leading large-scale, global research programs in scientific ocean drilling and ocean observing. For more than 25 years, JOI has helped facilitate discovery and advance global understanding of the Earth and its oceans through excellence in program management.
DrugMetZ DB: an anthology of human drug metabolizing Cytochrome P450 enzymes.
Antony, Tresa Remya Thomas; Nagarajan, Shanthi
2006-11-14
Understanding the basics of Cytochrome P450 (P450 or CYP) will help to discern drug metabolism. CYP, a superfamily of heme-thiolate proteins, is found in almost all living organisms and is involved in the biotransformation of a diverse range of xenobiotics, therapeutic drugs and toxins. Here, we describe DrugMetZ DB, a database of drugs metabolized by CYPs. The DB is implemented in MySQL, PHP and HTML and is available at www.bicpu.edu.in/DrugMetZDB/
CottonGen: a genomics, genetics and breeding database for cotton research
USDA-ARS?s Scientific Manuscript database
CottonGen (http://www.cottongen.org) is a curated and integrated web-based relational database providing access to publicly available genomic, genetic and breeding data for cotton. CottonGen supercedes CottonDB and the Cotton Marker Database, with enhanced tools for easier data sharing, mining, vis...
Liu, Wanting; Xiang, Lunping; Zheng, Tingkai; Jin, Jingjie
2018-01-01
Translation is a key regulatory step linking the transcriptome and the proteome. Two major methods of translatome investigation are RNC-seq (sequencing of translating mRNA) and Ribo-seq (ribosome profiling). To facilitate the investigation of translation, we built a comprehensive database, TranslatomeDB (http://www.translatomedb.net/), which provides collection and integrated analysis of published and user-generated translatome sequencing data. The current version includes 2453 Ribo-seq, 10 RNC-seq and their 1394 corresponding mRNA-seq datasets in 13 species. The database emphasizes analysis functions in addition to the dataset collections. Differential gene expression (DGE) analysis can be performed between any two datasets of the same species and type, on both the transcriptome and translatome levels. The translation indices (translation ratio, elongation velocity index and translational efficiency) can be calculated to quantitatively evaluate translational initiation efficiency and elongation velocity. All datasets were analyzed using a unified, robust, accurate and experimentally verifiable pipeline based on the FANSe3 mapping algorithm and edgeR for DGE analyses. TranslatomeDB also allows users to upload their own datasets and utilize the identical unified pipeline to analyze their data. We believe that TranslatomeDB is a comprehensive platform and knowledgebase for translatome and proteome research, freeing biologists from complex searching, analyzing and comparing of huge sequencing data without needing local computational power. PMID:29106630
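As a rough illustration of one of the indices mentioned above, translational efficiency is commonly computed as normalized ribosome footprint density divided by normalized mRNA abundance. TranslatomeDB's exact formulas may differ in detail, and the read counts below are made up.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads."""
    return read_count / (gene_length_bp / 1e3) / (total_mapped_reads / 1e6)

def translational_efficiency(ribo_rpkm, mrna_rpkm):
    """Normalized ribosome footprint density over normalized mRNA
    abundance -- a common definition of translational efficiency."""
    return ribo_rpkm / mrna_rpkm

# a 2 kb gene in libraries of 10 million mapped reads each
ribo = rpkm(500, 2_000, 10_000_000)   # 25.0 RPKM in the Ribo-seq library
mrna = rpkm(200, 2_000, 10_000_000)   # 10.0 RPKM in the mRNA-seq library
print(translational_efficiency(ribo, mrna))  # 2.5
```

Normalizing both libraries first means the ratio reflects ribosome loading per transcript rather than differences in sequencing depth or gene length.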
Gianni, Daniele; McKeever, Steve; Yu, Tommy; Britten, Randall; Delingette, Hervé; Frangi, Alejandro; Hunter, Peter; Smith, Nicolas
2010-06-28
Sharing and reusing anatomical models over the Web offers a significant opportunity to progress the investigation of cardiovascular diseases. However, the current sharing methodology suffers from the limitations of static model delivery (i.e. embedding static links to the models within Web pages) and of a disaggregated view of the model metadata produced by publications and cardiac simulations in isolation. In the context of euHeart--a research project targeting the description and representation of cardiovascular models for disease diagnosis and treatment purposes--we aim to overcome the above limitations with the introduction of euHeartDB, a Web-enabled database for anatomical models of the heart. The database implements a dynamic sharing methodology by managing data access and by tracing all applications. In addition to this, euHeartDB establishes a knowledge link with the physiome model repository by linking geometries to CellML models embedded in the simulation of cardiac behaviour. Furthermore, euHeartDB uses the exFormat--a preliminary version of the interoperable FieldML data format--to effectively promote reuse of anatomical models, and currently incorporates Continuum Mechanics, Image Analysis, Signal Processing and System Identification Graphical User Interface (CMGUI), a rendering engine, to provide three-dimensional graphical views of the models populating the database. Currently, euHeartDB stores 11 cardiac geometries developed within the euHeart project consortium.
NASA Astrophysics Data System (ADS)
Appel, Marius; Lahn, Florian; Pebesma, Edzer; Buytaert, Wouter; Moulds, Simon
2016-04-01
Today's amount of freely available data requires scientists to spend large parts of their work on data management. This is especially true in environmental sciences when working with large remote sensing datasets, such as obtained from earth-observation satellites like the Sentinel fleet. Many frameworks like SpatialHadoop or Apache Spark address the scalability but target programmers rather than data analysts, and are not dedicated to imagery or array data. In this work, we use the open-source data management and analytics system SciDB to bring large earth-observation datasets closer to analysts. Its underlying data representation as multidimensional arrays fits naturally to earth-observation datasets, distributes storage and computational load over multiple instances by multidimensional chunking, and also enables efficient time-series based analyses, which is usually difficult using file- or tile-based approaches. Existing interfaces to R and Python furthermore allow for scalable analytics with relatively little learning effort. However, interfacing SciDB and file-based earth-observation datasets that come as tiled temporal snapshots requires a lot of manual bookkeeping during ingestion, and SciDB natively only supports loading data from CSV-like and custom binary formatted files, which currently limits its practical use in earth-observation analytics. To make it easier to work with large multi-temporal datasets in SciDB, we developed software tools that enrich SciDB with earth observation metadata and allow working with commonly used file formats: (i) the SciDB extension library scidb4geo simplifies working with spatiotemporal arrays by adding relevant metadata to the database and (ii) the Geospatial Data Abstraction Library (GDAL) driver implementation scidb4gdal allows ingesting and exporting remote sensing imagery from and to a large number of file formats.
Using added metadata on temporal resolution and coverage, the GDAL driver supports time-based ingestion of imagery to existing multi-temporal SciDB arrays. While our SciDB plugin works directly in the database, the GDAL driver has been specifically developed using a minimum amount of external dependencies (i.e. CURL). Source code for both tools is available from github [1]. We present these tools in a case-study that demonstrates the ingestion of multi-temporal tiled earth-observation data to SciDB, followed by a time-series analysis using R and SciDBR. Through the exclusive use of open-source software, our approach supports reproducibility in scalable large-scale earth-observation analytics. In the future, these tools can be used in an automated way to let scientists only work on ready-to-use SciDB arrays to significantly reduce the data management workload for domain scientists. [1] https://github.com/mappl/scidb4geo and https://github.com/mappl/scidb4gdal
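The multidimensional chunking that SciDB relies on can be illustrated with a small sketch (the chunk sizes and coordinates below are illustrative assumptions, not SciDB's defaults): each cell of a space-time array maps to a chunk by integer division of its coordinates, so cells that are close in space and time land in the same chunk.

```python
def chunk_index(coords, chunk_sizes):
    """Map an n-dimensional cell coordinate to the coordinate of the
    chunk that stores it, by integer division along each dimension."""
    return tuple(c // s for c, s in zip(coords, chunk_sizes))

# A 3-D earth-observation array indexed by (x, y, t), chunked 512x512x16:
print(chunk_index((1030, 240, 33), (512, 512, 16)))  # -> (2, 0, 2)
```

Because a time series at a fixed (x, y) touches only the chunks stacked along t, this layout avoids reading whole tiled snapshots, which is the difficulty with file- or tile-based approaches noted above.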
Salehi, Ali; Jimenez-Berni, Jose; Deery, David M; Palmer, Doug; Holland, Edward; Rozas-Larraondo, Pablo; Chapman, Scott C; Georgakopoulos, Dimitrios; Furbank, Robert T
2015-01-01
To our knowledge, there is no software or database solution that supports large volumes of biological time series sensor data efficiently and enables data visualization and analysis in real time. Existing solutions for managing data typically use unstructured file systems or relational databases. These systems are not designed to provide instantaneous response to user queries. Furthermore, they do not support rapid data analysis and visualization to enable interactive experiments. In large scale experiments, this behaviour slows research discovery, discourages the widespread sharing and reuse of data that could otherwise inform critical decisions in a timely manner and encourage effective collaboration between groups. In this paper we present SensorDB, a web based virtual laboratory that can manage large volumes of biological time series sensor data while supporting rapid data queries and real-time user interaction. SensorDB is sensor agnostic and uses web-based, state-of-the-art cloud and storage technologies to efficiently gather, analyse and visualize data. Collaboration and data sharing between different agencies and groups is thereby facilitated. SensorDB is available online at http://sensordb.csiro.au.
NordicDB: a Nordic pool and portal for genome-wide control data.
Leu, Monica; Humphreys, Keith; Surakka, Ida; Rehnberg, Emil; Muilu, Juha; Rosenström, Päivi; Almgren, Peter; Jääskeläinen, Juha; Lifton, Richard P; Kyvik, Kirsten Ohm; Kaprio, Jaakko; Pedersen, Nancy L; Palotie, Aarno; Hall, Per; Grönberg, Henrik; Groop, Leif; Peltonen, Leena; Palmgren, Juni; Ripatti, Samuli
2010-12-01
A cost-efficient way to increase power in a genetic association study is to pool controls from different sources. The genotyping effort can then be directed to large case series. The Nordic Control database, NordicDB, has been set up as a unique resource in the Nordic area and the data are available for authorized users through the web portal (http://www.nordicdb.org). The current version of NordicDB pools together high-density genome-wide SNP information from ∼5000 controls originating from Finnish, Swedish and Danish studies and shows country-specific allele frequencies for SNP markers. The genetic homogeneity of the samples was investigated using multidimensional scaling (MDS) analysis and pairwise allele frequency differences between the studies. The plot of the first two MDS components showed excellent resemblance to the geographical placement of the samples, with a clear NW-SE gradient. We advise researchers to assess the impact of population structure when incorporating NordicDB controls in association studies. This harmonized Nordic database presents a unique genome-wide resource for future genetic association studies in the Nordic countries.
Schoof, Heiko; Ernst, Rebecca; Nazarov, Vladimir; Pfeifer, Lukas; Mewes, Hans-Werner; Mayer, Klaus F. X.
2004-01-01
Arabidopsis thaliana is the most widely studied model plant. Functional genomics is intensively underway in many laboratories worldwide. Beyond the basic annotation of the primary sequence data, the annotated genetic elements of Arabidopsis must be linked to diverse biological data and higher order information such as metabolic or regulatory pathways. The MIPS Arabidopsis thaliana database MAtDB aims to provide a comprehensive resource for Arabidopsis as a genome model that serves as a primary reference for research in plants and is suitable for transfer of knowledge to other plants, especially crops. The genome sequence as a common backbone serves as a scaffold for the integration of data, while, in a complementary effort, these data are enhanced through the application of state-of-the-art bioinformatics tools. This information is visualized on a genome-wide and a gene-by-gene basis with access both for web users and applications. This report updates the information given in a previous report and provides an outlook on further developments. The MAtDB web interface can be accessed at http://mips.gsf.de/proj/thal/db. PMID:14681437
N-Glycan Structure Annotation of Glycopeptides Using a Linearized Glycan Structure Database (GlyDB)
Ren, Jian Min; Rejtar, Tomas; Li, Lingyun; Karger, Barry L.
2008-01-01
While glycoproteins are abundant in nature, and changes in glycosylation occur in cancer and other diseases, glycoprotein characterization remains a challenge due to the structural complexity of the biopolymers. This paper presents a general strategy, termed GlyDB, for glycan structure annotation of N-linked glycopeptides from tandem mass spectra in the LC-MS analysis of proteolytic digests of glycoproteins. The GlyDB approach takes advantage of low-energy collision induced dissociation of N-linked glycopeptides that preferentially cleaves the glycosidic bonds while the peptide backbone remains intact. A theoretical glycan structure database derived from biosynthetic rules for N-linked glycans was constructed employing a novel representation of branched glycan structures consisting of multiple linear sequences. The commonly used peptide identification program, Sequest, could then be utilized to assign experimental tandem mass spectra to individual glycoforms. Analysis of synthetic glycopeptides and well-characterized glycoproteins demonstrate that the GlyDB approach can be a useful tool for annotation of glycan structures and for selection of a limited number of potential glycan structure candidates for targeted validation. PMID:17625816
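GlyDB's representation of a branched glycan as multiple linear sequences can be sketched as follows (the tuple-based tree encoding and residue names are illustrative assumptions, not GlyDB's actual data format): each root-to-leaf path of the branched structure becomes one linear sequence that a peptide search engine such as Sequest can match.

```python
def linearize(tree):
    """Enumerate every root-to-leaf path of a branched glycan tree.

    `tree` is a (residue, [child subtrees]) pair; a leaf has no children.
    Each returned path is one linear sequence standing in for a branch.
    """
    residue, children = tree
    if not children:
        return [[residue]]
    paths = []
    for child in children:
        for path in linearize(child):
            paths.append([residue] + path)
    return paths

# Illustrative fragment: a GlcNAc whose Man core branches into two Man arms
glycan = ("GlcNAc", [("Man", [("Man", []), ("Man", [])])])
print(linearize(glycan))  # -> [['GlcNAc', 'Man', 'Man'], ['GlcNAc', 'Man', 'Man']]
```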
AgeFactDB—the JenAge Ageing Factor Database—towards data integration in ageing research
Hühne, Rolf; Thalheim, Torsten; Sühnel, Jürgen
2014-01-01
AgeFactDB (http://agefactdb.jenage.de) is a database aimed at the collection and integration of ageing phenotype data including lifespan information. Ageing factors are considered to be genes, chemical compounds or other factors such as dietary restriction, whose action results in a changed lifespan or another ageing phenotype. Any information related to the effects of ageing factors is called an observation and is presented on observation pages. To provide concise access to the complete information for a particular ageing factor, corresponding observations are also summarized on ageing factor pages. In a first step, ageing-related data were primarily taken from existing databases such as the Ageing Gene Database—GenAge, the Lifespan Observations Database and the Dietary Restriction Gene Database—GenDR. In addition, we have started to include new ageing-related information. Based on homology data taken from the HomoloGene Database, AgeFactDB also provides observation and ageing factor pages of genes that are homologous to known ageing-related genes. These homologues are considered as candidate or putative ageing-related genes. AgeFactDB offers a variety of search and browse options, and also allows the download of ageing factor or observation lists in TSV, CSV and XML formats. PMID:24217911
DNAproDB: an interactive tool for structural analysis of DNA–protein complexes
Sagendorf, Jared M.
2017-01-01
Abstract Many biological processes are mediated by complex interactions between DNA and proteins. Transcription factors, various polymerases, nucleases and histones recognize and bind DNA with different levels of binding specificity. To understand the physical mechanisms that allow proteins to recognize DNA and achieve their biological functions, it is important to analyze structures of DNA–protein complexes in detail. DNAproDB is a web-based interactive tool designed to help researchers study these complexes. DNAproDB provides an automated structure-processing pipeline that extracts structural features from DNA–protein complexes. The extracted features are organized in structured data files, which are easily parsed with any programming language or viewed in a browser. We processed a large number of DNA–protein complexes retrieved from the Protein Data Bank and created the DNAproDB database to store this data. Users can search the database by combining features of the DNA, protein or DNA–protein interactions at the interface. Additionally, users can upload their own structures for processing privately and securely. DNAproDB provides several interactive and customizable tools for creating visualizations of the DNA–protein interface at different levels of abstraction that can be exported as high quality figures. All functionality is documented and freely accessible at http://dnaprodb.usc.edu. PMID:28431131
Assessment of the SFC database for analysis and modeling
NASA Technical Reports Server (NTRS)
Centeno, Martha A.
1994-01-01
SFC is one of the four clusters that make up the Integrated Work Control System (IWCS), which will integrate the shuttle processing databases at Kennedy Space Center (KSC). The IWCS framework will enable communication among the four clusters and add new data collection protocols. The Shop Floor Control (SFC) module has been operational for two and a half years; however, at this stage, automatic links to the other three modules have not yet been implemented, except for a partial link to IOS (CASPR). SFC revolves around a DB/2 database with PFORMS acting as the database management system (DBMS). PFORMS is an off-the-shelf DB/2 application that provides a set of data entry screens and query forms. The main dynamic entity in the SFC and IOS database is a task; thus, the physical storage location and update privileges are driven by the status of the WAD. As we explored the SFC values, we realized that there was much to do before actually engaging in continuous analysis of the SFC data. Halfway into this effort, we realized that full-scale analysis would have to be a future, third phase. So, we concentrated on getting to know the contents of the database, and on establishing an initial set of tools to start the continuous analysis process. Specifically, we set out to: (1) provide specific procedures for statistical models, so as to enhance the TP-OAO office analysis and modeling capabilities; (2) design a data exchange interface; (3) prototype the interface to provide inputs to SCRAM; and (4) design a modeling database. These objectives were set with the expectation that, if met, they would provide former TP-OAO engineers with tools that would help them demonstrate the importance of process-based analyses. The latter, in turn, will help them obtain the cooperation of various organizations in charting out their individual processes.
Bioinformatics Analysis of Protein Phosphorylation in Plant Systems Biology Using P3DB.
Yao, Qiuming; Xu, Dong
2017-01-01
Protein phosphorylation is one of the most pervasive protein post-translational modification events in plant cells. It is involved in many plant biological processes, such as plant growth, organ development, and plant immunology, by regulating or switching signaling and metabolic pathways. High-throughput experimental methods like mass spectrometry can easily characterize hundreds to thousands of phosphorylation events in a single experiment. With the increasing volume of the data sets, Plant Protein Phosphorylation DataBase (P3DB, http://p3db.org ) provides a comprehensive, systematic, and interactive online platform to deposit, query, analyze, and visualize these phosphorylation events in many plant species. It stores the protein phosphorylation sites in the context of identified mass spectra, phosphopeptides, and phosphoproteins contributed from various plant proteome studies. In addition, P3DB associates these plant phosphorylation sites to protein physicochemical information in the protein charts and tertiary structures, while various protein annotations from hierarchical kinase phosphatase families, protein domains, and gene ontology are also added into the database. P3DB not only provides rich information, but also interconnects and provides visualization of the data in networks, in systems biology context. Currently, P3DB includes the KiC (Kinase Client) assay network, the protein-protein interaction network, the kinase-substrate network, the phosphatase-substrate network, and the protein domain co-occurrence network. All of these are available to query for and visualize existing phosphorylation events. Although P3DB only hosts experimentally identified phosphorylation data, it provides a plant phosphorylation prediction model for any unknown queries on the fly. P3DB is an entry point to the plant phosphorylation community to deposit and visualize any customized data sets within this systems biology framework. 
P3DB has thus become one of the major bioinformatics platforms for protein phosphorylation research in plant biology.
Sánchez-de-Madariaga, Ricardo; Muñoz, Adolfo; Castro, Antonio L; Moreno, Oscar; Pascual, Mario
2018-01-01
This research shows a protocol to assess the computational complexity of querying relational and non-relational (NoSQL (not only Structured Query Language)) standardized electronic health record (EHR) medical information database systems. It uses a set of three doubling-sized databases, i.e. databases storing 5000, 10,000 and 20,000 realistic standardized EHR extracts, in three different database management systems (DBMS): relational MySQL object-relational mapping (ORM), document-based NoSQL MongoDB, and native extensible markup language (XML) NoSQL eXist. The average response times to six complexity-increasing queries were computed, and the results showed a linear behavior in the NoSQL cases. In the NoSQL field, MongoDB presents a much flatter linear slope than eXist. NoSQL systems may also be more appropriate to maintain standardized medical information systems due to the special nature of the updating policies of medical information, which should not affect the consistency and efficiency of the data stored in NoSQL databases. One limitation of this protocol is the lack of direct results of improved relational systems such as archetype relational mapping (ARM) with the same data. However, the interpolation of doubling-size database results to those presented in the literature and other published results suggests that NoSQL systems might be more appropriate in many specific scenarios and problems to be solved. For example, NoSQL may be appropriate for document-based tasks such as EHR extracts used in clinical practice, or editing and visualization, or situations where the aim is not only to query medical information, but also to restore the EHR in exactly its original form. PMID:29608174
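The linear scaling reported across the doubling-sized databases can be characterized by fitting a line to response time versus database size. A minimal least-squares sketch (the timings below are made-up illustrative numbers, not the study's measurements):

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return a, b

# Database sizes (EHR extracts) vs. hypothetical mean response times (ms);
# these points are perfectly linear, with slope 0.024 ms per extract.
sizes = [5000, 10000, 20000]
times = [120.0, 240.0, 480.0]
slope, intercept = fit_line(sizes, times)
```

A near-zero intercept and a stable slope across query types is what "linear behavior" means operationally here; a flatter slope (as reported for MongoDB versus eXist) indicates better scaling.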
Desai, Neel; Alentorn-Geli, Eduard; van Eck, Carola F; Musahl, Volker; Fu, Freddie H; Karlsson, Jón; Samuelsson, Kristian
2016-03-01
The aim of this systematic review was to apply the anatomic ACL reconstruction scoring checklist (AARSC) and to evaluate the degree to which clinical studies comparing single-bundle (SB) and double-bundle (DB) ACL reconstructions are anatomic. A systematic electronic search was performed using the databases PubMed (MEDLINE), EMBASE and Cochrane Library. Studies published from January 1995 to January 2014 comparing SB and DB ACL reconstructions with clinical outcome measurements were included. The items from the AARSC were recorded for both the SB and DB groups in each study. Eight thousand nine hundred and ninety-four studies were analysed, and 77 were included. Randomized clinical trials (29; 38%) and prospective comparative studies (29; 38%) were the most frequent study types. Most studies were published in 2011 (19; 25%). The most commonly reported items for both SB and DB groups were as follows: graft type (152; 99%), femoral and tibial fixation method (149; 97%, respectively), knee flexion angle during graft tensioning (124; 8%) and placement of the tibial tunnel at the ACL insertion site (101; 66%). The highest level of documentation used for ACL tunnel position for both groups was often one dimensional, e.g. drawing, operative notes or o'clock reference. The DB reconstruction was in general more thoroughly reported. The means for the AARSC were 6.9 ± 2.8 for the SB group and 8.3 ± 2.8 for the DB group. Both means were below a proposed required minimum score of 10 for anatomic ACL reconstruction. There was substantial underreporting of surgical data for both the SB and DB groups in clinical studies. This underreporting creates difficulties when analysing, comparing and pooling results of scientific studies on this subject.
Database resources of the National Center for Biotechnology Information
Wheeler, David L.; Barrett, Tanya; Benson, Dennis A.; Bryant, Stephen H.; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M.; DiCuccio, Michael; Edgar, Ron; Federhen, Scott; Geer, Lewis Y.; Helmberg, Wolfgang; Kapustin, Yuri; Kenton, David L.; Khovayko, Oleg; Lipman, David J.; Madden, Thomas L.; Maglott, Donna R.; Ostell, James; Pruitt, Kim D.; Schuler, Gregory D.; Schriml, Lynn M.; Sequeira, Edwin; Sherry, Stephen T.; Sirotkin, Karl; Souvorov, Alexandre; Starchenko, Grigory; Suzek, Tugba O.; Tatusov, Roman; Tatusova, Tatiana A.; Wagner, Lukas; Yaschenko, Eugene
2006-01-01
In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI's Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups, Retroviral Genotyping Tools, HIV-1, Human Protein Interaction Database, SAGEmap, Gene Expression Omnibus, Entrez Probe, GENSAT, Online Mendelian Inheritance in Man, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of the resources can be accessed through the NCBI home page. PMID:16381840
The Star Schema Benchmark and Augmented Fact Table Indexing
NASA Astrophysics Data System (ADS)
O'Neil, Patrick; O'Neil, Elizabeth; Chen, Xuedong; Revilak, Stephen
We provide a benchmark measuring star schema queries retrieving data from a fact table with Where clause column restrictions on dimension tables. Clustering is crucial to performance with modern disk technology, since retrievals with filter factors down to 0.0005 are now performed most efficiently by sequential table search rather than by indexed access. DB2’s Multi-Dimensional Clustering (MDC) provides methods to "dice" the fact table along a number of orthogonal "dimensions", but only when these dimensions are columns in the fact table. The diced cells cluster fact rows on several of these "dimensions" at once so queries restricting several such columns can access crucially localized data, with much faster query response. Unfortunately, columns of dimension tables of a star schema are not usually represented in the fact table. In this paper, we show a simple way to adjoin physical copies of dimension columns to the fact table, dicing data to effectively cluster query retrieval, and explain how such dicing can be achieved on database products other than DB2. We provide benchmark measurements to show successful use of this methodology on three commercial database products.
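The paper's technique of adjoining physical copies of dimension columns to the fact table can be sketched in SQLite (the schema and rows are minimal illustrative assumptions; the paper targets DB2's MDC and other commercial products, where the copied column would additionally drive physical clustering):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# A tiny star schema: one dimension table and one fact table.
cur.execute("CREATE TABLE dim_customer (c_key INTEGER PRIMARY KEY, region TEXT)")
cur.execute("CREATE TABLE fact_sales (c_key INTEGER, revenue REAL)")
cur.executemany("INSERT INTO dim_customer VALUES (?, ?)",
                [(1, "EUROPE"), (2, "ASIA")])
cur.executemany("INSERT INTO fact_sales VALUES (?, ?)",
                [(1, 10.0), (2, 20.0), (1, 5.0)])

# Adjoin a physical copy of the dimension column to the fact table,
# so restrictions on it no longer require a join at query time.
cur.execute("ALTER TABLE fact_sales ADD COLUMN c_region TEXT")
cur.execute("""UPDATE fact_sales
               SET c_region = (SELECT region FROM dim_customer
                               WHERE dim_customer.c_key = fact_sales.c_key)""")

# A star query restricted on the copied column, with no join:
total = cur.execute(
    "SELECT SUM(revenue) FROM fact_sales WHERE c_region = 'EUROPE'"
).fetchone()[0]
print(total)  # -> 15.0
```

The trade-off is extra fact-table width and the need to refresh the copied column when the dimension changes, in exchange for join-free, clusterable restrictions.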
SpecDB: The AAVSO’s Public Repository for Spectra of Variable Stars
NASA Astrophysics Data System (ADS)
Kafka, Stella; Weaver, John; Silvis, George; Beck, Sara
2018-01-01
SpecDB is the American Association of Variable Star Observers (AAVSO) spectral database. Accessible to any astronomer with the capability to perform spectroscopy, SpecDB provides an unprecedented scientific opportunity for amateur and professional astronomers around the globe. Backed by the Variable Star Index, one of the most utilized variable star catalogs, SpecDB is expected to become one of the world's leading databases of its kind. Once verified by a team of expert spectroscopists, an observer can upload spectra of variable star targets easily and efficiently. Uploaded spectra can then be searched for, previewed, and downloaded for inclusion in publications. Close community development and involvement will ensure a user-friendly and versatile database, compatible with the needs of 21st-century astrophysics. Observations of 1D spectra are submitted as FITS files. All spectra are required to be preprocessed for wavelength calibration and dark subtraction; bias and flat-field corrections are strongly recommended. First-time observers are required to submit a spectrum of a standard (non-variable) star to be checked for errors in technique or equipment. Regardless of user validation, FITS headers must include several value cards detailing the observation, as well as information regarding the observer, equipment, and observing site in accordance with existing AAVSO records. This enforces consistency and provides necessary details for follow-up analysis. Requirements are provided to users in a comprehensive guidebook and accompanying technical manual. Upon submission, FITS headers are automatically checked for errors and any anomalies are immediately fed back to the user. Successful candidates can then submit at will, including multiple simultaneous submissions. All published observations can be searched and interactively previewed. Community involvement will be enhanced by an associated forum where users can discuss observation techniques and suggest improvements to the database.
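The automated header check described above can be sketched as a simple required-card validation (the card names below are illustrative assumptions modeled on common FITS keywords, not the AAVSO's actual schema):

```python
# Hypothetical set of required header cards; SpecDB's real schema may differ.
REQUIRED_CARDS = {"OBJECT", "DATE-OBS", "OBSERVER", "INSTRUME",
                  "SITELAT", "SITELONG"}

def check_header(header):
    """Return the sorted list of required cards missing from a FITS-style
    header, given here as a plain dict of card name -> value."""
    return sorted(REQUIRED_CARDS - set(header))

header = {"OBJECT": "SS Cyg", "DATE-OBS": "2017-10-01", "OBSERVER": "X. Ample"}
print(check_header(header))  # -> ['INSTRUME', 'SITELAT', 'SITELONG']
```

An empty result would let the submission proceed; a non-empty one corresponds to the immediate feedback on anomalies described above.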
Ahmed, Zeeshan; Zeeshan, Saman; Fleischmann, Pauline; Rössler, Wolfgang; Dandekar, Thomas
2015-01-01
Field studies on arthropod ecology and behaviour require simple and robust monitoring tools, preferably with direct access to an integrated database. We have developed and here present a database tool allowing smart-phone based monitoring of arthropods. This smart phone application provides an easy solution to collect, manage and process the data in the field which has been a very difficult task for field biologists using traditional methods. To monitor our example species, the desert ant Cataglyphis fortis, we considered behavior, nest search runs, feeding habits and path segmentations including detailed information on solar position and azimuth calculation, ant orientation and time of day. For this we established a user friendly database system integrating the Ant-App-DB with a smart phone and tablet application, combining experimental data manipulation with data management and providing solar position and timing estimations without any GPS or GIS system. Moreover, the new desktop application Dataplus allows efficient data extraction and conversion from smart phone application to personal computers, for further ecological data analysis and sharing. All features, software code and database as well as Dataplus application are made available completely free of charge and sufficiently generic to be easily adapted to other field monitoring studies on arthropods or other migratory organisms. The software applications Ant-App-DB and Dataplus described here are developed using the Android SDK, Java, XML, C# and SQLite Database. PMID:25977753
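The GPS-free solar-position estimation mentioned above can be illustrated with a standard rough approximation for solar declination from the day of year (this cosine formula is a textbook approximation, not the Ant-App-DB's actual algorithm, and the app is written for Android rather than in plain Python):

```python
import math

def solar_declination(day_of_year):
    """Rough solar declination in degrees from day of year, using the
    common approximation delta = -23.44 * cos(360/365 * (N + 10))."""
    return -23.44 * math.cos(math.radians(360.0 / 365.0 * (day_of_year + 10)))

# Near the June solstice (day 172) the declination approaches +23.44 degrees.
print(round(solar_declination(172), 1))
```

Combined with local time and an assumed site latitude, such a declination feeds standard formulas for solar azimuth, which is what lets the app estimate ant orientation relative to the sun without any GPS or GIS system.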
MCAW-DB: A glycan profile database capturing the ambiguity of glycan recognition patterns.
Hosoda, Masae; Takahashi, Yushi; Shiota, Masaaki; Shinmachi, Daisuke; Inomoto, Renji; Higashimoto, Shinichi; Aoki-Kinoshita, Kiyoko F
2018-05-11
Glycan-binding protein (GBP) interaction experiments, such as glycan microarrays, are often used to understand glycan recognition patterns. However, interpreting glycan array data to identify discrete GBP binding patterns is difficult because the patterns are ambiguous. Lectins, for example, are known to be non-specific in their binding affinities; the same lectin can bind different monosaccharides or even different glycan structures. In bioinformatics, several tools have been developed to mine the data generated from these sorts of experiments. These tools take a library of predefined motifs, which are commonly found glycan patterns such as sialyl-Lewis X, and attempt to identify the motif(s) that are specific to the GBP being analyzed. In our previous work, as opposed to using predefined motifs, we developed the Multiple Carbohydrate Alignment with Weights (MCAW) tool to visualize the state of the glycans being recognized by the GBP under analysis. We previously reported on the effectiveness of our tool and algorithm by analyzing several glycan array datasets from the Consortium for Functional Glycomics (CFG). In this work, we report on our analysis of 1081 datasets collected from the CFG, the results of which we have made publicly and freely available as a database called MCAW-DB. We introduce this database and its usage, and describe several analysis results. We show how MCAW-DB can be used to analyze the glycan-binding patterns of GBPs amidst their ambiguity. For example, the visualization of glycan-binding patterns in MCAW-DB shows how they correlate with the concentrations of the samples used in the array experiments. Using MCAW-DB, the patterns of glycans found to bind various GBPs are visualized, indicating the binding "environment" of the glycans. Thus, the ambiguity of glycan recognition is numerically represented, along with the patterns of monosaccharides surrounding the binding region.
The profiles in MCAW-DB could potentially be used as predictors of the affinity of unknown or novel glycans for particular GBPs, by comparing how well they match the existing profiles for those GBPs. Moreover, as the glycan profiles of diseased tissues become available, glycan alignments could also be used to identify glycan biomarkers unique to a tissue. Databases of these alignments may be of great use for drug discovery. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.
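The per-position profile idea behind an MCAW alignment can be sketched in a few lines. This is a toy illustration only: the real tool aligns tree-structured glycans and weights positions by array signal, whereas this sketch assumes glycans already aligned position by position and simply tallies monosaccharide frequencies per aligned position.

```python
# Toy per-position profile over pre-aligned glycans. The monosaccharide
# names and the "alignment" are invented for illustration; MCAW itself
# aligns tree structures with weights, which this sketch does not do.
from collections import Counter

def position_profile(aligned_glycans):
    """aligned_glycans: equal-length tuples of monosaccharide codes."""
    length = len(aligned_glycans[0])
    profile = []
    for i in range(length):
        counts = Counter(seq[i] for seq in aligned_glycans)
        total = sum(counts.values())
        profile.append({mono: n / total for mono, n in counts.items()})
    return profile

# Three toy "aligned" glycans, one monosaccharide per position.
aligned = [
    ("Gal", "GlcNAc", "Man"),
    ("Gal", "GlcNAc", "Man"),
    ("Neu5Ac", "GlcNAc", "Man"),
]
profile = position_profile(aligned)
print(profile[0])  # position 1 is 2/3 Gal, 1/3 Neu5Ac
```

A profile like this is what makes the "ambiguity" numerically explicit: a position bound 2/3 of the time by Gal and 1/3 by Neu5Ac is recorded as exactly that, rather than forced into a single motif.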
Narsai, Reena; Devenish, James; Castleden, Ian; Narsai, Kabir; Xu, Lin; Shou, Huixia; Whelan, James
2013-01-01
Omics research in Oryza sativa (rice) relies on the use of multiple databases to obtain different types of information to define gene function. We present Rice DB, an Oryza information portal that is a functional genomics database, linking gene loci to comprehensive annotations, expression data and the subcellular location of encoded proteins. Rice DB has been designed to integrate the direct comparison of rice with Arabidopsis (Arabidopsis thaliana), based on orthology or ‘expressology’, thus using and combining available information from two pre-eminent plant models. To establish Rice DB, gene identifiers (more than 40 types) and annotations from a variety of sources were compiled, functional information based on large-scale and individual studies was manually collated, hundreds of microarrays were analysed to generate expression annotations, and the occurrences of potential functional regulatory motifs in promoter regions were calculated. A range of computational subcellular localization predictions were also run for all putative proteins encoded in the rice genome, and experimentally confirmed protein localizations have been collated, curated and linked to functional studies in rice. A single search box allows anything from gene identifiers (for rice and/or Arabidopsis), motif sequences, subcellular location, to keyword searches to be entered, with the capability of Boolean searches (such as AND/OR). To demonstrate the utility of Rice DB, several examples are presented including a rice mitochondrial proteome, which draws on a variety of sources for subcellular location data within Rice DB. Comparisons of subcellular location, functional annotations, as well as transcript expression in parallel with Arabidopsis reveals examples of conservation between rice and Arabidopsis, using Rice DB (http://ricedb.plantenergy.uwa.edu.au). PMID:24147765
Bouyssié, David; Dubois, Marc; Nasso, Sara; Gonzalez de Peredo, Anne; Burlet-Schiltz, Odile; Aebersold, Ruedi; Monsarrat, Bernard
2015-01-01
The analysis and management of MS data, especially those generated by data-independent MS acquisition, exemplified by SWATH-MS, pose significant challenges for proteomics bioinformatics. The large size and vast amount of information inherent to these data sets need to be properly structured to enable an efficient and straightforward extraction of the signals used to identify specific target peptides. Standard XML-based formats are not well suited to large MS data files, for example, those generated by SWATH-MS, and compromise high-throughput data processing and storing. We developed mzDB, an efficient file format for large MS data sets. It relies on the SQLite software library and consists of a standardized and portable server-less single-file database. An optimized 3D indexing approach is adopted, where the LC-MS coordinates (retention time and m/z), along with the precursor m/z for SWATH-MS data, are used to query the database for data extraction. In comparison with XML formats, mzDB saves ∼25% of storage space and improves access times by factors ranging from two up to 2000, depending on the data access pattern. Similarly, mzDB also shows slightly to significantly lower access times in comparison with other formats such as mz5. Both C++ and Java implementations, converting raw or XML formats to mzDB and providing access methods, will be released under a permissive license. mzDB can be easily accessed by the SQLite C library and its drivers for all major languages, and browsed with existing dedicated GUIs. The mzDB described here can boost existing mass spectrometry data analysis pipelines, offering unprecedented performance in terms of efficiency, portability, compactness, and flexibility. PMID:25505153
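The kind of 2D range query mzDB is optimized for can be sketched with Python's built-in sqlite3 module. This is a simplified stand-in, not mzDB's actual schema: the real format groups peaks into chunked bounding boxes, whereas here a flat indexed table with invented column names illustrates the (retention time, m/z) window extraction.

```python
# Illustrative (m/z, retention-time) window query over SQLite. Table
# and column names are assumptions for the sketch, not mzDB's schema.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE peak (rt REAL, mz REAL, intensity REAL)")
con.execute("CREATE INDEX idx_peak_rt_mz ON peak (rt, mz)")
peaks = [(12.0, 500.2, 1e4), (12.1, 500.3, 2e4), (30.0, 700.1, 5e3)]
con.executemany("INSERT INTO peak VALUES (?, ?, ?)", peaks)

# Extract signal inside rt in [11.5, 12.5] and m/z in [500.0, 501.0].
rows = con.execute(
    "SELECT rt, mz, intensity FROM peak"
    " WHERE rt BETWEEN ? AND ? AND mz BETWEEN ? AND ?",
    (11.5, 12.5, 500.0, 501.0),
).fetchall()
print(len(rows))  # the two peaks near rt = 12, m/z = 500
```

The compound index on (rt, mz) is what gives such a query spatial locality; mzDB's chunked bounding boxes serve the same purpose at scale.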
Selectivity assessment of DB-200 and DB-VRX open-tubular capillary columns.
Kiridena, W; Koziola, W W; Poole, C F
2001-10-12
The solvation parameter model is used to study the influence of composition and temperature on the selectivity of two poly(siloxane) stationary phases used for open-tubular capillary column gas chromatography. The poly(methyltrifluoropropyldimethylsiloxane) stationary phase, DB-200, has low cohesion, intermediate dipolarity/polarizability, low hydrogen-bond basicity, no hydrogen-bond acidity, and repulsive electron lone pair interactions. The DB-VRX stationary phase has low cohesion, low dipolarity/polarizability, low hydrogen-bond basicity, no hydrogen-bond acidity and no capacity for electron lone pair interactions. The selectivity of the two stationary phases is complementary to those in a database of 11 stationary phase chemistries determined under the same experimental conditions.
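The solvation parameter model the abstract relies on is not spelled out there; in its standard gas-chromatography (Abraham) formulation it is usually written as:

```latex
\log k = c + eE + sS + aA + bB + lL
```

where $k$ is the solute retention factor, $E$ is the excess molar refraction (electron lone-pair interactions), $S$ the dipolarity/polarizability, $A$ and $B$ the hydrogen-bond acidity and basicity, and $L$ the gas-hexadecane partition coefficient (cohesion and dispersion). The lowercase system constants $c, e, s, a, b, l$ characterize the stationary phase, and it is these constants that the abstract's qualitative statements ("low cohesion, no hydrogen-bond acidity, ...") summarize for DB-200 and DB-VRX.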
MetPetDB: A database for metamorphic geochemistry
NASA Astrophysics Data System (ADS)
Spear, Frank S.; Hallett, Benjamin; Pyle, Joseph M.; Adalı, Sibel; Szymanski, Boleslaw K.; Waters, Anthony; Linder, Zak; Pearce, Shawn O.; Fyffe, Matthew; Goldfarb, Dennis; Glickenhouse, Nickolas; Buletti, Heather
2009-12-01
We present a data model for the initial implementation of MetPetDB, a geochemical database specific to metamorphic rock samples. The database is designed around the concept of preservation of spatial relationships, at all scales, of chemical analyses and their textural setting. Objects in the database (samples) represent physical rock samples; each sample may contain one or more subsamples with associated geochemical and image data. Samples, subsamples, geochemical data, and images are described with attributes (some required, some optional); these attributes also serve as search delimiters. All data in the database are classified as published (i.e., archived or published data), public or private. Public and published data may be freely searched and downloaded. All private data is owned; permission to view, edit, download and otherwise manipulate private data may be granted only by the data owner; all such editing operations are recorded by the database to create a data version log. The sharing of data permissions among a group of collaborators researching a common sample is done by the sample owner through the project manager. User interaction with MetPetDB is hosted by a web-based platform based upon the Java servlet application programming interface, with the PostgreSQL relational database. The database web portal includes modules that allow the user to interact with the database: registered users may save and download public and published data, upload private data, create projects, and assign permission levels to project collaborators. An Image Viewer module provides for spatial integration of image and geochemical data. A toolkit consisting of plotting and geochemical calculation software for data analysis and a mobile application for viewing the public and published data is being developed. 
Future issues to address include population of the database, integration with other geochemical databases, development of the analysis toolkit, creation of data models for derivative data, and building a community-wide user base. It is believed that this and other geochemical databases will enable more productive collaborations, generate more efficient research efforts, and foster new developments in basic research in the field of solid earth geochemistry.
Design and characterization of microstrip based E-field sensor for GSM and UMTS frequency bands
NASA Astrophysics Data System (ADS)
Narang, N.; Dubey, S. K.; Negi, P. S.; Ojha, V. N.
2016-12-01
An Electric (E-) field sensor based on coplanar waveguide-fed microstrip antenna to measure E-field strength for dual-band operation at 914 MHz and 2.1 GHz is proposed, designed, and characterized. The parametric optimization of the design has been performed to obtain resonance at global system for mobile communication and universal mobile telecommunication system frequency band. Low return loss (-17 dB and -19 dB), appropriate gain (0.50 dB and 1.55 dB), and isotropic behaviour (directivity ∼ 1 dB), respectively, at 914 MHz and 2.1 GHz, are obtained for probing application. Antenna factor (AF) is used as an important parameter to characterize the performance of the E-field sensor. The AF measurement is explained in detail and results are reported. Finally, using the designed E-field sensor, the E-field strength measurements are carried out in a transverse electromagnetic cell. The key sources of uncertainties in the measurement are identified, evaluated, and incorporated into the final results. The measurement results are compared with theoretical values, which are found in good agreement. For comparative validation, the results are evaluated with reference to an already calibrated commercially available isotropic probe.
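For readers unfamiliar with antenna factor: in a 50 Ω system the textbook far-field relation is AF = 9.73 / (λ√G), with λ the wavelength and G the linear gain. Applying it to the gains quoted in the abstract is purely illustrative; the paper's AF values were measured, not derived this way.

```python
# Hedged sketch: theoretical 50-ohm antenna factor from gain,
# AF = 9.73 / (lambda * sqrt(G)). The band/gain pairs come from the
# abstract; using them here is illustration, not the authors' method.
import math

C = 299_792_458.0  # speed of light, m/s

def antenna_factor_db(freq_hz, gain_dbi):
    wavelength = C / freq_hz
    gain_linear = 10 ** (gain_dbi / 10)
    af = 9.73 / (wavelength * math.sqrt(gain_linear))  # in 1/m
    return 20 * math.log10(af)                         # in dB(1/m)

print(round(antenna_factor_db(914e6, 0.50), 1))  # ≈ 28.9 dB/m
print(round(antenna_factor_db(2.1e9, 1.55), 1))  # ≈ 35.1 dB/m
```

A low, nearly gain-flat AF across both bands is what makes such an antenna usable as a field probe.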
Database resources of the National Center for Biotechnology Information
Wheeler, David L.; Barrett, Tanya; Benson, Dennis A.; Bryant, Stephen H.; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M.; DiCuccio, Michael; Edgar, Ron; Federhen, Scott; Feolo, Michael; Geer, Lewis Y.; Helmberg, Wolfgang; Kapustin, Yuri; Khovayko, Oleg; Landsman, David; Lipman, David J.; Madden, Thomas L.; Maglott, Donna R.; Miller, Vadim; Ostell, James; Pruitt, Kim D.; Schuler, Gregory D.; Shumway, Martin; Sequeira, Edwin; Sherry, Steven T.; Sirotkin, Karl; Souvorov, Alexandre; Starchenko, Grigory; Tatusov, Roman L.; Tatusova, Tatiana A.; Wagner, Lukas; Yaschenko, Eugene
2008-01-01
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data available through NCBI's web site. NCBI resources include Entrez, the Entrez Programming Utilities, My NCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link, Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genome, Genome Project and related tools, the Trace, Assembly, and Short Read Archives, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups, Influenza Viral Resources, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Entrez Probe, GENSAT, Database of Genotype and Phenotype, Online Mendelian Inheritance in Man, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool and the PubChem suite of small molecule databases. Augmenting the web applications are custom implementations of the BLAST program optimized to search specialized data sets. These resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. PMID:18045790
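The Entrez Programming Utilities mentioned above are a plain HTTP API, so a query URL can be assembled with the standard library alone. The esearch endpoint and its db/term/retmode parameters are part of NCBI's documented interface; the search term itself is just an example, and the URL is built but not fetched so the sketch works offline.

```python
# Build an NCBI E-utilities esearch URL (documented endpoint; the
# example query term is arbitrary). No network access is performed.
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def esearch_url(db, term, retmax=20):
    params = urlencode({"db": db, "term": term,
                        "retmax": retmax, "retmode": "json"})
    return f"{EUTILS}/esearch.fcgi?{params}"

url = esearch_url("pubmed", "glycomics[MeSH Terms]")
print(url)
```

Fetching this URL returns a JSON list of PubMed IDs, which can then be passed to the efetch endpoint for full records.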
SerpentinaDB: a database of plant-derived molecules of Rauvolfia serpentina.
Pathania, Shivalika; Ramakrishnan, Sai Mukund; Randhawa, Vinay; Bagler, Ganesh
2015-08-04
Plant-derived molecules (PDMs) are known to be a rich source of diverse scaffolds that could serve as a basis for rational drug design. Structured compilation of phytochemicals from traditional medicinal plants can facilitate prospection for novel PDMs and their analogs as therapeutic agents. Rauvolfia serpentina is an important medicinal plant, endemic to the Himalayan mountain ranges of the Indian subcontinent, reported to be of immense therapeutic value against various diseases. We present SerpentinaDB, a structured compilation of 147 R. serpentina PDMs, inclusive of their plant part source, chemical classification, IUPAC names, SMILES, physicochemical properties, and 3D chemical structures with associated references. It also provides a refined search option for identifying analogs of natural molecules against the ZINC database at a user-defined cut-off, allowing the neighborhood of chemical space around lead molecules to be explored. SerpentinaDB is thus an exhaustive resource of R. serpentina molecules facilitating prospection for therapeutic molecules from a medicinally important source of natural products. In a previous study, we demonstrated the utility of this resource by identifying novel aldose reductase inhibitors towards intervention in the complications of diabetes.
Wang, Ruijia; Nambiar, Ram; Zheng, Dinghai
2018-01-01
PolyA_DB is a database cataloging cleavage and polyadenylation sites (PASs) in several genomes. Previous versions were based mainly on expressed sequence tags (ESTs), which were limited in number and could lead to inaccurate PAS identification due to the presence of internal A-rich sequences in transcripts. Here, we present an updated version of the database based solely on deep sequencing data. First, PASs are mapped by the 3′ region extraction and deep sequencing (3′READS) method, ensuring unequivocal PAS identification. Second, a large volume of data based on diverse biological samples increases PAS coverage by 3.5-fold over the EST-based version and provides PAS usage information. Third, strand-specific RNA-seq data are used to extend annotated 3′ ends of genes to obtain more thorough annotations of alternative polyadenylation (APA) sites. Fourth, conservation information on PASs across mammals sheds light on the significance of APA sites. The database (URL: http://www.polya-db.org/v3) currently holds PASs in human, mouse, rat and chicken, and has links to the UCSC genome browser for further visualization and for integration with other genomic data. PMID:29069441
BitterDB: a database of bitter compounds
Wiener, Ayana; Shudler, Marina; Levit, Anat; Niv, Masha Y.
2012-01-01
Basic taste qualities like sour, salty, sweet, bitter and umami serve specific functions in identifying food components found in the diet of humans and animals, and are recognized by proteins in the oral cavity. Recognition of bitter taste and aversion to it are thought to protect the organism against the ingestion of poisonous food compounds, which are often bitter. Interestingly, bitter taste receptors are expressed not only in the mouth but also in extraoral tissues, such as the gastrointestinal tract, indicating that they may play a role in digestive and metabolic processes. BitterDB database, available at http://bitterdb.agri.huji.ac.il/bitterdb/, includes over 550 compounds that were reported to taste bitter to humans. The compounds can be searched by name, chemical structure, similarity to other bitter compounds, association with a particular human bitter taste receptor, and so on. The database also contains information on mutations in bitter taste receptors that were shown to influence receptor activation by bitter compounds. The aim of BitterDB is to facilitate studying the chemical features associated with bitterness. These studies may contribute to predicting bitterness of unknown compounds, predicting ligands for bitter receptors from different species and rational design of bitterness modulators. PMID:21940398
The Gypsy Database (GyDB) of mobile genetic elements
Lloréns, C.; Futami, R.; Bezemer, D.; Moya, A.
2008-01-01
In this article, we introduce the Gypsy Database (GyDB) of mobile genetic elements, an in-progress database devoted to the non-redundant analysis and evolutionary-based classification of mobile genetic elements. This first version covers eukaryotic Ty3/Gypsy and Retroviridae long terminal repeat (LTR) retroelements. Phylogenetic analyses based on the gag-pro-pol internal region commonly presented by these two groups strongly support a number of previously described Ty3/Gypsy lineages originally reported from reverse-transcriptase (RT) analyses. Vertebrate retroviruses (Retroviridae) also form several monophyletic groups consistent with the genera proposed by the ICTV nomenclature, as well as with the current tendency to classify both endogenous and exogenous retroviruses into three major classes (I, II and III). Our inference indicates that all protein domains encoded by the gag-pro-pol internal region of these two groups share a common evolutionary history, which may be used as a main criterion to differentiate their molecular diversity into a comprehensive collection of phylogenies and non-redundant molecular profiles useful in the identification of new Ty3/Gypsy and Retroviridae species. The GyDB project is available at http://gydb.uv.es. PMID:17895280
HodDB: Design and Analysis of a Query Processor for Brick.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fierro, Gabriel; Culler, David
Brick is a recently proposed metadata schema and ontology for describing building components and the relationships between them. It represents buildings as directed labeled graphs using the RDF data model. Using the SPARQL query language, building-agnostic applications query a Brick graph to discover the set of resources and relationships they require to operate. Latency-sensitive applications, such as user interfaces, demand response and model-predictive control, require fast queries, conventionally under 100 ms. We benchmark a set of popular open-source and commercial SPARQL databases against three real Brick models using seven application queries and find that none of them meet this performance target. This lack of performance can be attributed to design decisions that optimize for queries over large graphs consisting of billions of triples, but give poor spatial locality and join performance on the small dense graphs typical of Brick. We present the design and evaluation of HodDB, an RDF/SPARQL database for Brick built over a node-based index structure. HodDB performs Brick queries 3-700x faster than leading SPARQL databases and consistently meets the 100 ms threshold, enabling the portability of important latency-sensitive building applications.
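The general idea of a node-based index for small dense graphs can be illustrated with a toy two-hop pattern match. This is not HodDB's actual data structure or query engine; the triples and predicate names are invented in the Brick spirit (an air handler feeding VAV boxes that each carry a sensor point).

```python
# Toy per-node adjacency index resolving the pattern
#   ?ahu feeds ?vav . ?vav hasPoint ?sensor
# Node and predicate names are made-up Brick-style examples.
from collections import defaultdict

triples = [
    ("ahu1", "feeds", "vav1"),
    ("ahu1", "feeds", "vav2"),
    ("vav1", "hasPoint", "tsensor1"),
    ("vav2", "hasPoint", "tsensor2"),
]

# Index: predicate -> subject -> set of objects.
index = defaultdict(lambda: defaultdict(set))
for s, p, o in triples:
    index[p][s].add(o)

results = [
    (ahu, vav, sensor)
    for ahu, vavs in index["feeds"].items()
    for vav in vavs
    for sensor in index["hasPoint"].get(vav, ())
]
print(sorted(results))  # both ahu -> vav -> sensor chains
```

Because each hop is a direct dictionary lookup from the node just bound, join cost tracks the (small) result size rather than the triple count, which is the locality argument the abstract makes.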
MyMolDB: a micromolecular database solution with open source and free components.
Xia, Bing; Tai, Zheng-Fu; Gu, Yu-Cheng; Li, Bang-Jing; Ding, Li-Sheng; Zhou, Yan
2011-10-01
Managing chemical structures is one of the important daily tasks in small laboratories. Few solutions are available on the internet, and most of them are closed-source applications, while the open-source applications typically have limited capability and only basic cheminformatics functionality. In this article, we describe an open-source solution for managing chemicals in research groups, based on open-source and free components. It has a user-friendly interface with functions for chemical handling and intensive searching. MyMolDB is a micromolecular database solution that supports exact, substructure, similarity, and combined searching. It is implemented mainly in the scripting language Python, with a web-based interface for compound management and searching. Almost all searches are in essence done with pure SQL on the database, exploiting the high performance of the database engine. Thus, impressive search speed has been achieved on large data sets, because no external CPU-consuming languages are involved in the key search procedure. MyMolDB is open-source software and can be modified and/or redistributed under the GNU General Public License version 3 published by the Free Software Foundation (Free Software Foundation Inc. The GNU General Public License, Version 3, 2007. Available at: http://www.gnu.org/licenses/gpl.html). The software itself can be found at http://code.google.com/p/mymoldb/. Copyright © 2011 Wiley Periodicals, Inc.
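One common way to push substructure search into pure SQL, as the abstract describes, is a fingerprint prescreen: a molecule can contain the query substructure only if its bit fingerprint contains every bit set in the query's fingerprint, and SQLite's bitwise AND expresses that directly. Whether MyMolDB uses exactly this scheme is not stated in the abstract; the tiny hand-made fingerprints below are illustrative, not real chemical fingerprints.

```python
# Fingerprint prescreen in pure SQL: keep molecules whose fingerprint
# contains all bits of the query fingerprint. Fingerprints here are
# toy integers, not real structural keys.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE mol (name TEXT, fp INTEGER)")
con.executemany("INSERT INTO mol VALUES (?, ?)", [
    ("benzene-like", 0b1010),
    ("phenol-like", 0b1011),
    ("alkane-like", 0b0100),
])

query_fp = 0b0011  # bits the hypothetical substructure requires
candidates = [name for (name,) in con.execute(
    "SELECT name FROM mol WHERE (fp & ?) = ?", (query_fp, query_fp))]
print(candidates)  # only the molecule whose fp contains both bits
```

In a real system the surviving candidates would then go through an exact (and much slower) subgraph-isomorphism check; the SQL screen exists to keep that expensive step off most rows.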
Zhu, Chengsheng; Miller, Maximilian
2018-01-01
Microbial functional diversification is driven by environmental factors, i.e. microorganisms inhabiting the same environmental niche tend to be more functionally similar than those from different environments. In some cases, even closely phylogenetically related microbes differ more across environments than across taxa. While microbial similarities are often reported in terms of taxonomic relationships, no existing databases directly link microbial functions to the environment. We previously developed a method for comparing microbial functional similarities on the basis of proteins translated from their sequenced genomes. Here, we describe fusionDB, a novel database that uses our functional data to represent 1374 taxonomically distinct bacteria annotated with available metadata: habitat/niche, preferred temperature, and oxygen use. Each microbe is encoded as a set of functions represented by its proteome and individual microbes are connected via common functions. Users can search fusionDB via combinations of organism names and metadata. Moreover, the web interface allows mapping new microbial genomes to the functional spectrum of reference bacteria, rendering interactive similarity networks that highlight shared functionality. fusionDB provides a fast means of comparing microbes, identifying potential horizontal gene transfer events, and highlighting key environment-specific functionality. PMID:29112720
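Representing each microbe as a set of functions, as fusionDB does, makes set-overlap similarity a natural comparison. The sketch below uses Jaccard similarity over invented placeholder function identifiers; fusionDB's actual functional annotations come from the authors' own method, and its similarity measure is not specified in the abstract.

```python
# Toy function-set comparison of microbes. Function IDs are invented
# placeholders; Jaccard is one plausible set-overlap measure, not
# necessarily the one fusionDB uses.
def jaccard(a, b):
    return len(a & b) / len(a | b)

microbes = {
    "thermophile_A": {"f1", "f2", "f3", "heat_shock"},
    "thermophile_B": {"f1", "f2", "heat_shock"},
    "mesophile_C": {"f1", "f4", "f5"},
}
sim_ab = jaccard(microbes["thermophile_A"], microbes["thermophile_B"])
sim_ac = jaccard(microbes["thermophile_A"], microbes["mesophile_C"])
print(sim_ab > sim_ac)  # shared niche, more shared functions
```

Connecting microbes through common functions like this also surfaces candidate horizontal gene transfer: a function shared by two otherwise dissimilar sets stands out immediately.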
Comparing NetCDF and SciDB on managing and querying 5D hydrologic dataset
NASA Astrophysics Data System (ADS)
Liu, Haicheng; Xiao, Xiao
2016-11-01
Efficiently extracting information from high-dimensional hydro-meteorological modelling datasets requires smart solutions. Traditional methods are mostly file-based; files can be edited and accessed handily, but their contiguous storage structure causes efficiency problems. Databases have been proposed as an alternative, offering advantages such as native functionality for manipulating multidimensional (MD) arrays, smart caching strategies and scalability. In this research, NetCDF file-based solutions and the multidimensional array database management system (DBMS) SciDB, which applies a chunked storage structure, are benchmarked to determine the best solution for storing and querying a large 5D hydrologic modelling dataset. The effect of data storage configurations, including chunk size, dimension order and compression, on query performance is explored. Results indicate that the dimension order used to organize storage of the 5D data has a significant influence on query performance if the chunk size is very large, but the effect becomes insignificant when the chunk size is properly set. SciDB's compression mostly has a negative influence on query performance. Caching is an advantage but may be influenced by the execution of other query processes. On the whole, the NetCDF solution without compression is in general more efficient than the SciDB DBMS.
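Why chunk size and dimension order matter can be made concrete without any array library: a range query must read every chunk its window overlaps, so the chunk shape determines the I/O. The 5D dimensions and chunk shapes below are illustrative assumptions, not the dataset from the study.

```python
# Count which chunks a 5D window query touches, for two chunk shapes.
# Dimensions (time, run, level, lat, lon) and sizes are made up.
from itertools import product

def chunks_touched(window, chunk_shape):
    """window: per-dimension (lo, hi) cell ranges, inclusive."""
    ranges = [range(lo // c, hi // c + 1)
              for (lo, hi), c in zip(window, chunk_shape)]
    return list(product(*ranges))

# Window: time 0-9, run 0, level 0-3, lat 90-129, lon 195-234.
window = [(0, 9), (0, 0), (0, 3), (90, 129), (195, 234)]
small_chunks = chunks_touched(window, (10, 1, 4, 20, 20))
large_chunks = chunks_touched(window, (100, 10, 10, 100, 100))
print(len(small_chunks), len(large_chunks))  # 9 4
```

Fewer, larger chunks mean fewer reads but far more wasted cells per read; well-chosen chunk shapes keep both numbers small, which is why the paper finds dimension order only matters when chunks are badly sized.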
Koopmans, Bastijn; Smit, August B; Verhage, Matthijs; Loos, Maarten
2017-04-04
Systematic, standardized and in-depth phenotyping and data analysis of rodent behaviour empowers gene-function studies, drug testing and therapy design. However, no data repositories are currently available for standardized quality control, data analysis and mining at the resolution of individual mice. Here, we present AHCODA-DB, a public data repository with standardized quality control and exclusion criteria aimed at enhancing the robustness of data, equipped with web-based mining tools for the analysis of individually and group-wise collected mouse phenotypic data. AHCODA-DB allows monitoring of the in vivo effects of compounds, collected from conventional behavioural tests and from automated home-cage experiments assessing spontaneous behaviour, anxiety and cognition without human interference. AHCODA-DB includes such data from mutant mice (transgenic, knock-out, knock-in), (recombinant) inbred strains, and compound effects in wild-type mice and disease models. AHCODA-DB provides real-time statistical analyses with single-mouse resolution and a versatile suite of data presentation tools. On March 9th, 2017 AHCODA-DB contained 650k data points on 2419 parameters from 1563 mice. AHCODA-DB provides users with tools to systematically explore mouse behavioural data, with both positive and negative outcomes, published and unpublished, across time and experiments, with single-mouse resolution. The standardized (automated) experimental settings and the large current dataset (1563 mice) in AHCODA-DB provide a unique framework for the interpretation of behavioural data and drug effects. The use of common ontologies allows data export to other databases such as the Mouse Phenome Database. Unbiased presentation of positive and negative data obtained under these highly standardized screening conditions increases the cost efficiency of publicly funded mouse screening projects and helps to reach consensus conclusions on drug responses and mouse behavioural phenotypes.
The website is publicly accessible through https://public.sylics.com and can be viewed in every recent version of all commonly used browsers.
Carbon - Bulk Density Relationships for Highly Weathered Soils of the Americas
NASA Astrophysics Data System (ADS)
Nave, L. E.
2014-12-01
Soils are dynamic natural bodies composed of mineral and organic materials. As a result of this mixed composition, essential properties of soils such as their apparent density, organic and mineral contents are typically correlated. Negative relationships between bulk density (Db) and organic matter concentration provide well-known examples across a broad range of soils, and such quantitative relationships among soil properties are useful for a variety of applications. First, gap-filling or data interpolation often are necessary to develop large soil carbon (C) datasets; furthermore, limitations of access to analytical instruments may preclude C determinations for every soil sample. In such cases, equations to derive soil C concentrations from basic measures of soil mass, volume, and density offer significant potential for purposes of soil C stock estimation. To facilitate estimation of soil C stocks on highly weathered soils of the Americas, I used observations from the International Soil Carbon Network (ISCN) database to develop carbon - bulk density prediction equations for Oxisols and Ultisols. Within a small sample set of georeferenced Oxisols (n = 89), 29% of the variation in A horizon C concentrations can be predicted from Db. Including the A-horizon sand content improves predictive capacity to 35%. B horizon C concentrations (n = 285) were best predicted by Db and clay content, but were more variable than A-horizons (only 10% of variation explained by linear regression). Among Ultisols, a larger sample set allowed investigation of specific horizons of interest. For example, C concentrations of plowed A (Ap) horizons are predictable based on Db, sand and silt contents (n = 804, r² = 0.38); gleyed argillic (Btg) horizon concentrations are predictable from Db, sand and clay contents (n = 190, r² = 0.23).
Because soil C stock estimates are more sensitive to variation in soil mass and volume determinations than to variation in C concentration, prediction equations such as these may be used on carefully collected samples to constrain soil C stocks. The geo-referenced ISCN database allows users the opportunity to derive similar predictive relationships among measured soil parameters; continued input of new datasets from highly weathered soils of the Americas will improve the precision of these prediction equations.
ApiEST-DB: analyzing clustered EST data of the apicomplexan parasites.
Li, Li; Crabtree, Jonathan; Fischer, Steve; Pinney, Deborah; Stoeckert, Christian J; Sibley, L David; Roos, David S
2004-01-01
ApiEST-DB (http://www.cbil.upenn.edu/paradbs-servlet/) provides integrated access to publicly available EST data from protozoan parasites in the phylum Apicomplexa. The database currently incorporates a total of nearly 100,000 ESTs from several parasite species of clinical and/or veterinary interest, including Eimeria tenella, Neospora caninum, Plasmodium falciparum, Sarcocystis neurona and Toxoplasma gondii. To facilitate analysis of these data, EST sequences were clustered and assembled to form consensus sequences for each organism, and these assemblies were then subjected to automated annotation via similarity searches against protein and domain databases. The underlying relational database infrastructure, Genomics Unified Schema (GUS), enables complex biologically based queries, facilitating validation of gene models, identification of alternative splicing, detection of single nucleotide polymorphisms, identification of stage-specific genes and recognition of phylogenetically conserved and phylogenetically restricted sequences.
ForC: a global database of forest carbon stocks and fluxes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anderson-Teixeira, Kristina J.; Wang, Maria M. H.; McGarvey, Jennifer C.
Forests play an influential role in the global carbon (C) cycle, storing roughly half of terrestrial C and annually exchanging with the atmosphere more than ten times the carbon dioxide (CO2) emitted by anthropogenic activities. Yet, scaling up from ground-based measurements of forest C stocks and fluxes to understand global scale C cycling and its climate sensitivity remains an important challenge. Tens of thousands of forest C measurements have been made, but these data have yet to be integrated into a single database that makes them accessible for integrated analyses. Here we present an open-access global Forest Carbon database (ForC) containing records of ground-based measurements of ecosystem-level C stocks and annual fluxes, along with disturbance history and methodological information. ForC expands upon the previously published tropical portion of this database, TropForC (DOI: 10.5061/dryad.t516f), now including 17,538 records (previously 3568) representing 2,731 plots (previously 845) in 826 geographically distinct areas (previously 178). The database covers all forested biogeographic and climate zones, represents forest stands of all ages, and includes 89 C cycle variables collected between 1934 and 2015. We expect that ForC will prove useful for macroecological analyses of forest C cycling, for evaluation of model predictions or remote sensing products, for quantifying the contribution of forests to the global C cycle, and for supporting international efforts to inventory forest carbon and greenhouse gas exchange. A dynamic version of ForC-db is maintained at https://github.com/forc-db, and we encourage the research community to collaborate in updating, correcting, expanding, and utilizing this database.
The Chandra Source Catalog 2.0: Calibrations
NASA Astrophysics Data System (ADS)
Graessle, Dale E.; Evans, Ian N.; Rots, Arnold H.; Allen, Christopher E.; Anderson, Craig S.; Budynkiewicz, Jamie A.; Burke, Douglas; Chen, Judy C.; Civano, Francesca Maria; D'Abrusco, Raffaele; Doe, Stephen M.; Evans, Janet D.; Fabbiano, Giuseppina; Gibbs, Danny G., II; Glotfelty, Kenny J.; Grier, John D.; Hain, Roger; Hall, Diane M.; Harbo, Peter N.; Houck, John C.; Lauer, Jennifer L.; Laurino, Omar; Lee, Nicholas P.; Martínez-Galarza, Juan Rafael; McCollough, Michael L.; McDowell, Jonathan C.; Miller, Joseph; McLaughlin, Warren; Morgan, Douglas L.; Mossman, Amy E.; Nguyen, Dan T.; Nichols, Joy S.; Nowak, Michael A.; Paxson, Charles; Plummer, David A.; Primini, Francis Anthony; Siemiginowska, Aneta; Sundheim, Beth A.; Tibbetts, Michael; Van Stone, David W.; Zografou, Panagoula
2018-01-01
Among the many enhancements implemented for the release of Chandra Source Catalog (CSC) 2.0 are improvements in the processing calibration database (CalDB). We have included a thorough overhaul of the CalDB software used in the processing. The software system upgrade, called "CalDB version 4," allows for a more rational and consistent specification of flight configurations and calibration boundary conditions. Numerous improvements in the specific calibrations applied have also been added. Chandra's radiometric and detector response calibrations vary considerably with time, detector operating temperature, and position on the detector. The CalDB has been enhanced to provide the best calibrations possible to each observation over the fifteen-year period included in CSC 2.0. Calibration updates include an improved ACIS contamination model, as well as updated time-varying gain (i.e., photon energy) and quantum efficiency maps for ACIS and HRC-I. Additionally, improved corrections for the ACIS quantum efficiency losses due to CCD charge transfer inefficiency (CTI) have been added for each of the ten ACIS detectors. These CTI corrections are now time and temperature-dependent, allowing ACIS to maintain a 0.3% energy calibration accuracy over the 0.5-7.0 keV range for any ACIS source in the catalog. Radiometric calibration (effective area) accuracy is estimated at ~4% over that range. We include a few examples where improvements in the Chandra CalDB allow for improved data reduction and modeling for the new CSC. This work has been supported by NASA under contract NAS 8-03060 to the Smithsonian Astrophysical Observatory for operation of the Chandra X-ray Center.
DB Dehydrogenase: an online integrated structural database on enzyme dehydrogenase.
Nandy, Suman Kumar; Bhuyan, Rajabrata; Seal, Alpana
2012-01-01
Dehydrogenase enzymes are almost indispensable for metabolic processes. Shortage or malfunctioning of dehydrogenases often leads to acute diseases such as cancers, retinal diseases, diabetes mellitus, Alzheimer's disease and hepatitis B and C. With the advancement of modern-day research, huge amounts of sequence, structural and functional data are generated every day, widening the gap between structural attributes and their functional understanding. DB Dehydrogenase is an effort to relate the functionalities of dehydrogenases to their structures. It is a completely web-based structural database, covering almost all dehydrogenases [~150 enzyme classes, ~1200 entries from ~160 organisms] whose structures are known. It was created by extracting and integrating various online resources to provide reliable data, and is implemented as a MySQL relational database with user-friendly web interfaces written in CGI Perl. Flexible search options are available for data extraction and exploration. In summary, this database gathers the sequence, structure and function of all dehydrogenases in one place, with the necessary cross-referencing, and will be useful for researchers carrying out further work in this field. The database is available for free at http://www.bifku.in/DBD/
USDA-ARS's Scientific Manuscript database
The Maize Database (MaizeDB), now the Maize Genetics and Genomics Database (MaizeGDB), turns 20 this year, and such a significant milestone must be celebrated! With the release of the B73 reference sequence and more sequenced genomes on the way, the maize community needs to address various opportunitie...
Bouwman, Jildau; Dragsted, Lars O.; Drevon, Christian A.; Elliott, Ruan; de Groot, Philip; Kaput, Jim; Mathers, John C.; Müller, Michael; Pepping, Fre; Saito, Jahn; Scalbert, Augustin; Radonjic, Marijana; Rocca-Serra, Philippe; Travis, Anthony; Wopereis, Suzan; Evelo, Chris T.
2010-01-01
The challenge of modern nutrition and health research is to identify food-based strategies promoting life-long optimal health and well-being. This research is complex because it exploits a multitude of bioactive compounds acting on an extensive network of interacting processes. Whereas nutrition research can profit enormously from the revolution in ‘omics’ technologies, it has discipline-specific requirements for analytical and bioinformatic procedures. In addition to measurements of the parameters of interest (measures of health), extensive description of the subjects of study and foods or diets consumed is central for describing the nutritional phenotype. We propose and pursue an infrastructural activity of constructing the “Nutritional Phenotype database” (dbNP). When fully developed, dbNP will be a research and collaboration tool and a publicly available data and knowledge repository. Creation and implementation of the dbNP will maximize benefits to the research community by enabling integration and interrogation of data from multiple studies, from different research groups, different countries and different omics levels. The dbNP is designed to facilitate storage of biologically relevant, pre-processed omics data, as well as study descriptive and study participant phenotype data. It is also important to enable the combination of this information at different levels (e.g. to facilitate linkage of data describing participant phenotype, genotype and food intake with information on study design and omics measurements, and to combine all of this with existing knowledge). The biological information stored in the database (i.e. genetics, transcriptomics, proteomics, biomarkers, metabolomics, functional assays, food intake and food composition) is tailored to nutrition research and embedded in an environment of standard procedures and protocols, annotations, modular data-basing, networking and integrated bioinformatics.
The dbNP is an evolving enterprise, which is only sustainable if it is accepted and adopted by the wider nutrition and health research community as an open source, pre-competitive and publicly available resource where many partners can both contribute to and profit from its developments. We introduce the Nutrigenomics Organisation (NuGO, http://www.nugo.org) as a membership association responsible for establishing and curating the dbNP. Within NuGO, all efforts related to dbNP (i.e. usage, coordination, integration, facilitation and maintenance) will be directed towards a sustainable and federated infrastructure. PMID:21052526
Database resources of the National Center for Biotechnology Information.
Wheeler, David L; Barrett, Tanya; Benson, Dennis A; Bryant, Stephen H; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M; DiCuccio, Michael; Edgar, Ron; Federhen, Scott; Geer, Lewis Y; Kapustin, Yuri; Khovayko, Oleg; Landsman, David; Lipman, David J; Madden, Thomas L; Maglott, Donna R; Ostell, James; Miller, Vadim; Pruitt, Kim D; Schuler, Gregory D; Sequeira, Edwin; Sherry, Steven T; Sirotkin, Karl; Souvorov, Alexandre; Starchenko, Grigory; Tatusov, Roman L; Tatusova, Tatiana A; Wagner, Lukas; Yaschenko, Eugene
2007-01-01
In addition to maintaining the GenBank nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI's Web site. NCBI resources include Entrez, the Entrez Programming Utilities, My NCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genome, Genome Project and related tools, the Trace and Assembly Archives, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs), Viral Genotyping Tools, Influenza Viral Resources, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART) and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. These resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
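The Entrez databases listed above are scriptable over HTTP through the Entrez Programming Utilities (E-utilities). A minimal sketch of composing an ESearch request URL; the URL is only built here, not sent:

```python
from urllib.parse import urlencode

# Base URL of NCBI's E-utilities service.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Compose an Entrez ESearch URL for the given database and query term."""
    params = urlencode({"db": db, "term": term, "retmax": retmax, "retmode": "json"})
    return f"{EUTILS}/esearch.fcgi?{params}"

# e.g. search PubMed for articles mentioning carbohydrate databases
url = esearch_url("pubmed", "carbohydrate database")
```

Fetching the URL (e.g. with `urllib.request.urlopen`) would return a JSON list of matching PubMed IDs, which can then be passed to EFetch for full records.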
Database resources of the National Center for Biotechnology Information.
Sayers, Eric W; Barrett, Tanya; Benson, Dennis A; Bryant, Stephen H; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M; DiCuccio, Michael; Edgar, Ron; Federhen, Scott; Feolo, Michael; Geer, Lewis Y; Helmberg, Wolfgang; Kapustin, Yuri; Landsman, David; Lipman, David J; Madden, Thomas L; Maglott, Donna R; Miller, Vadim; Mizrachi, Ilene; Ostell, James; Pruitt, Kim D; Schuler, Gregory D; Sequeira, Edwin; Sherry, Stephen T; Shumway, Martin; Sirotkin, Karl; Souvorov, Alexandre; Starchenko, Grigory; Tatusova, Tatiana A; Wagner, Lukas; Yaschenko, Eugene; Ye, Jian
2009-01-01
In addition to maintaining the GenBank nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs), Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART) and the PubChem suite of small molecule databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
A Concept for Continuous Monitoring that Reduces Redundancy in Information Assurance Processes
2011-09-01
System.out.println("Driver loaded");
String url = "jdbc:postgresql://localhost/IAcontrols";
String user = "postgres";
String pwd = "postgres";
Connection DB_mobile_conn = DriverManager.getConnection(url, user, pwd);
System.out.println("Database Connect ok");
RepeatsDB-lite: a web server for unit annotation of tandem repeat proteins.
Hirsh, Layla; Paladin, Lisanna; Piovesan, Damiano; Tosatto, Silvio C E
2018-05-09
RepeatsDB-lite (http://protein.bio.unipd.it/repeatsdb-lite) is a web server for the prediction of repetitive structural elements and units in tandem repeat (TR) proteins. TRs are a widespread but poorly annotated class of non-globular proteins carrying heterogeneous functions. RepeatsDB-lite extends the prediction to all TR types and strongly improves the performance both in terms of computational time and accuracy over previous methods, with precision above 95% for solenoid structures. The algorithm exploits an improved TR unit library derived from the RepeatsDB database to perform an iterative structural search and assignment. The web interface provides tools for analyzing the evolutionary relationships between units and for manually refining the prediction by changing unit positions and protein classification. An all-against-all structure-based sequence similarity matrix is calculated and visualized in real time for every user edit. Refined predictions can be submitted to RepeatsDB for review and inclusion.
A 205GHz Amplifier in 90nm CMOS Technology
2017-03-01
San Jose State University, San Jose, CA, USA. Abstract: This paper presents a 205 GHz amplifier drawing 43.4 mA from a 0.9 V power supply with... 10.5 dB power gain, Psat of -1.6 dBm, and P1dB ≈ -5.8 dBm in a standard 90 nm CMOS process. Moreover, the design employs internal (layout-based)/external... reported in [2]. In this paper, two neutralization techniques, internal and external approaches, have been implemented to achieve higher power
TRANSFAC: an integrated system for gene expression regulation.
Wingender, E; Chen, X; Hehl, R; Karas, H; Liebich, I; Matys, V; Meinhardt, T; Prüss, M; Reuter, I; Schacherer, F
2000-01-01
TRANSFAC is a database on transcription factors, their genomic binding sites and DNA-binding profiles (http://transfac.gbf.de/TRANSFAC/). Its content has been enhanced, in particular by information about training sequences used for the construction of nucleotide matrices as well as by data on plant sites and factors. Moreover, TRANSFAC has been extended by two new modules: PathoDB provides data on pathologically relevant mutations in regulatory regions and transcription factor genes, whereas S/MARt DB compiles features of scaffold/matrix attached regions (S/MARs) and the proteins binding to them. Additionally, the databases TRANSPATH, about signal transduction, and CYTOMER, about organs and cell types, have been extended and are increasingly integrated with the TRANSFAC data sources.
Mysql Data-Base Applications for Dst-Like Physics Analysis
NASA Astrophysics Data System (ADS)
Tsenov, Roumen
2004-07-01
The data and analysis model developed and used in the HARP experiment for studying hadron production at the CERN Proton Synchrotron is discussed. Emphasis is put on the use of database (DB) back-ends for persistently storing and retrieving "alive" C++ objects encapsulating raw and reconstructed data. The concepts of a "Data Summary Tape" (DST), a logical collection of DB-persistent data of different types, and of an "intermediate DST" (iDST), a physical "tag" of a DST, are introduced. The iDST level of persistence allows a full DST-level analysis to be performed by applications running on an isolated machine (even a laptop) with no connection to the experiment's main data storage. The implementation of these concepts is considered.
OpenFluDB, a database for human and animal influenza virus
Liechti, Robin; Gleizes, Anne; Kuznetsov, Dmitry; Bougueleret, Lydie; Le Mercier, Philippe; Bairoch, Amos; Xenarios, Ioannis
2010-01-01
Although influenza has been studied for more than 100 years, it remains one of the most prominent diseases, causing half a million human deaths every year. With the recent observation of new highly pathogenic H5N1 and H7N7 strains, and the appearance of the influenza pandemic caused by the H1N1 swine-like lineage, a collaborative effort to share observations on the evolution of this virus in both animals and humans has been established. The OpenFlu database (OpenFluDB) is a part of this collaborative effort. It contains genomic and protein sequences, as well as epidemiological data, from more than 27 000 isolates. The isolate annotations include virus type, host, geographical location and experimentally tested antiviral resistance. Putative enhanced pathogenicity as well as human adaptation propensity are computed from protein sequences. Each virus isolate can be associated with the laboratories that collected, sequenced and submitted it. Several analysis tools, including multiple sequence alignment, phylogenetic analysis and sequence similarity maps, enable rapid and efficient mining. The contents of OpenFluDB are supplied by direct user submission, as well as by a daily automatic procedure importing data from public repositories. Additionally, a simple mechanism facilitates the export of OpenFluDB records to GenBank. This resource has been successfully used to rapidly and widely distribute the sequences collected during the recent human swine flu outbreak, and also as an exchange platform during the vaccine selection procedure. Database URL: http://openflu.vital-it.ch. PMID:20624713
Clasey, Jody L; Gater, David R
2005-11-01
To compare (1) total body volume (V(b)) and density (D(b)) measurements obtained by hydrostatic weighing (HW) and air displacement plethysmography (ADP) in adults with spinal cord injury (SCI); (2) measured and predicted thoracic gas volume (V(TG)); and (3) differences in percentage of fat measurements using ADP-obtained D(b) and HW-obtained D(b) measures interchanged in a 4-compartment body composition model (4-comp %fat). Twenty adults with SCI underwent ADP, V(TG) and HW testing. In a subgroup (n=13) of subjects, 4-comp %fat procedures were computed. Research laboratories in a university setting. Twenty adults with SCI below the T3 vertebrae and motor complete paraplegia. Not applicable. Statistical analyses, including determination of group mean differences, shared variance, total error, and 95% confidence intervals. The 2 methods yielded small yet significantly different V(b) and D(b) values. The groups' mean V(TG) did not differ significantly, but the large relative differences indicated an unacceptable amount of individual error. When the 4-comp %fat measurements were compared, there was a trend toward significant differences (P=.08). ADP is a valid alternative method of determining V(b) and D(b) in adults with SCI; however, the predicted V(TG) should be used with caution.
An advanced web query interface for biological databases
Latendresse, Mario; Karp, Peter D.
2010-01-01
Although most web-based biological databases (DBs) offer some type of web-based form to allow users to author DB queries, these query forms are quite restricted in the complexity of DB queries that they can formulate. They can typically query only one DB, and can query only a single type of object at a time (e.g. genes) with no possible interaction between the objects—that is, in SQL parlance, no joins are allowed between DB objects. Writing precise queries against biological DBs is usually left to a programmer skillful enough in complex DB query languages like SQL. We present a web interface for building precise queries for biological DBs that can construct much more precise queries than most web-based query forms, yet that is user friendly enough to be used by biologists. It supports queries containing multiple conditions, and connecting multiple object types without using the join concept, which is unintuitive to biologists. This interactive web interface is called the Structured Advanced Query Page (SAQP). Users interactively build up a wide range of query constructs. Interactive documentation within the SAQP describes the schema of the queried DBs. The SAQP is based on BioVelo, a query language based on list comprehension. The SAQP is part of the Pathway Tools software and is available as part of several bioinformatics web sites powered by Pathway Tools, including the BioCyc.org site that contains more than 500 Pathway/Genome DBs. PMID:20624715
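BioVelo, which underlies the SAQP, is based on list comprehension. The flavor of such a query, conditions on two connected object types without an explicit join, can be mimicked in Python; the records, names and fields below are invented for illustration:

```python
# Toy records standing in for two connected object types (genes and pathways).
genes = [
    {"id": "g1", "name": "trpA", "pathways": ["trp biosynthesis"]},
    {"id": "g2", "name": "lacZ", "pathways": ["lactose degradation"]},
    {"id": "g3", "name": "trpB", "pathways": ["trp biosynthesis"]},
]
pathways = [
    {"name": "trp biosynthesis", "class": "amino-acid biosynthesis"},
    {"name": "lactose degradation", "class": "carbohydrate degradation"},
]

# A list-comprehension query connecting both object types without an explicit join:
# genes whose pathway belongs to the amino-acid biosynthesis class.
hits = [g["name"]
        for g in genes
        for p in pathways
        if p["name"] in g["pathways"] and p["class"] == "amino-acid biosynthesis"]
```

The SAQP builds expressions of this shape interactively, so a biologist composes the conditions through forms rather than writing the comprehension by hand.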
Field validation of food outlet databases: the Latino food environment in North Carolina, USA.
Rummo, Pasquale E; Albrecht, Sandra S; Gordon-Larsen, Penny
2015-04-01
Obtaining valid, reliable measures of food environments that serve Latino communities is important for understanding barriers to healthy eating in this at-risk population. The primary aim of the study was to examine agreement between retail food outlet data from two commercial databases, Nielsen TDLinx (TDLinx) for food stores and Dun & Bradstreet (D&B) for food stores and restaurants, relative to field observations of food stores and restaurants in thirty-one census tracts in Durham County, NC, USA. We also examined differences by proportion of Hispanic population (≥23·4 % Hispanic population) in the census tract and for outlets classified in the field as 'Latino' on the basis of signage and use of Spanish language. One hundred and seventy-four food stores and 337 restaurants in Durham County, NC, USA. We found that overall sensitivity of food store listings in TDLinx was higher (64 %) than listings in D&B (55 %). Twenty-five food stores were characterized by auditors as Latino food stores, with 20 % identified in TDLinx, 52 % in D&B and 56 % in both sources. Overall sensitivity of restaurants (68 %) was higher than sensitivity of Latino restaurants (38 %) listed in D&B. Sensitivity did not differ substantially by Hispanic composition of neighbourhoods. Our findings suggest that while TDLinx and D&B commercial data sources perform well for total food stores, they perform less well in identifying small and independent food outlets, including many Latino food stores and restaurants.
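Sensitivity, as used in validation studies like this one, is the share of field-observed outlets that a commercial database also lists. A toy sketch with invented outlet names:

```python
# Field audit vs. commercial listing; outlet names are invented for illustration.
field_observed  = {"Tienda Mi Pueblo", "Quick Mart", "Supermercado Lopez", "Corner Deli"}
database_listed = {"Quick Mart", "Corner Deli", "Gas-N-Go"}

# True positives: outlets seen in the field that the database also lists.
true_positives = field_observed & database_listed
sensitivity = len(true_positives) / len(field_observed)   # here 2 of 4 outlets
```

Listings present in the database but absent in the field ("Gas-N-Go" above) would instead count against the database's positive predictive value.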
Integrated database for identifying candidate genes for Aspergillus flavus resistance in maize
2010-01-01
Background Aspergillus flavus Link:Fr, an opportunistic fungus that produces aflatoxin, is pathogenic to maize and other oilseed crops. Aflatoxin is a potent carcinogen, and its presence markedly reduces the value of grain. Understanding and enhancing host resistance to A. flavus infection and/or subsequent aflatoxin accumulation is generally considered an efficient means of reducing grain losses to aflatoxin. Different proteomic, genomic and genetic studies of maize (Zea mays L.) have generated large data sets with the goal of identifying genes responsible for conferring resistance to A. flavus, or aflatoxin. Results In order to maximize the usage of different data sets in new studies, including association mapping, we have constructed a relational database with web interface integrating the results of gene expression, proteomic (both gel-based and shotgun), Quantitative Trait Loci (QTL) genetic mapping studies, and sequence data from the literature to facilitate selection of candidate genes for continued investigation. The Corn Fungal Resistance Associated Sequences Database (CFRAS-DB) (http://agbase.msstate.edu/) was created with the main goal of identifying genes important to aflatoxin resistance. CFRAS-DB is implemented using MySQL as the relational database management system running on a Linux server, using an Apache web server, and Perl CGI scripts as the web interface. The database and the associated web-based interface allow researchers to examine many lines of evidence (e.g. microarray, proteomics, QTL studies, SNP data) to assess the potential role of a gene or group of genes in the response of different maize lines to A. flavus infection and subsequent production of aflatoxin by the fungus. Conclusions CFRAS-DB provides the first opportunity to integrate data pertaining to the problem of A. flavus and aflatoxin resistance in maize in one resource and to support queries across different datasets. 
The web-based interface gives researchers different query options for mining the database across different types of experiments. The database is publicly available at http://agbase.msstate.edu. PMID:20946609
Integrated database for identifying candidate genes for Aspergillus flavus resistance in maize.
Kelley, Rowena Y; Gresham, Cathy; Harper, Jonathan; Bridges, Susan M; Warburton, Marilyn L; Hawkins, Leigh K; Pechanova, Olga; Peethambaran, Bela; Pechan, Tibor; Luthe, Dawn S; Mylroie, J E; Ankala, Arunkanth; Ozkan, Seval; Henry, W B; Williams, W P
2010-10-07
Aspergillus flavus Link:Fr, an opportunistic fungus that produces aflatoxin, is pathogenic to maize and other oilseed crops. Aflatoxin is a potent carcinogen, and its presence markedly reduces the value of grain. Understanding and enhancing host resistance to A. flavus infection and/or subsequent aflatoxin accumulation is generally considered an efficient means of reducing grain losses to aflatoxin. Different proteomic, genomic and genetic studies of maize (Zea mays L.) have generated large data sets with the goal of identifying genes responsible for conferring resistance to A. flavus, or aflatoxin. In order to maximize the usage of different data sets in new studies, including association mapping, we have constructed a relational database with web interface integrating the results of gene expression, proteomic (both gel-based and shotgun), Quantitative Trait Loci (QTL) genetic mapping studies, and sequence data from the literature to facilitate selection of candidate genes for continued investigation. The Corn Fungal Resistance Associated Sequences Database (CFRAS-DB) (http://agbase.msstate.edu/) was created with the main goal of identifying genes important to aflatoxin resistance. CFRAS-DB is implemented using MySQL as the relational database management system running on a Linux server, using an Apache web server, and Perl CGI scripts as the web interface. The database and the associated web-based interface allow researchers to examine many lines of evidence (e.g. microarray, proteomics, QTL studies, SNP data) to assess the potential role of a gene or group of genes in the response of different maize lines to A. flavus infection and subsequent production of aflatoxin by the fungus. CFRAS-DB provides the first opportunity to integrate data pertaining to the problem of A. flavus and aflatoxin resistance in maize in one resource and to support queries across different datasets. 
The web-based interface gives researchers different query options for mining the database across different types of experiments. The database is publicly available at http://agbase.msstate.edu.
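A cross-dataset query of the kind CFRAS-DB supports can be sketched against an in-memory SQLite database; the schema, gene names and values below are invented for illustration (CFRAS-DB itself runs on MySQL):

```python
import sqlite3

# Two toy evidence tables: microarray fold changes and QTL mapping results.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE microarray (gene TEXT, fold_change REAL);
    CREATE TABLE qtl (gene TEXT, qtl_region TEXT);
    INSERT INTO microarray VALUES ('PR10', 3.2), ('GLX1', 0.8), ('LOX3', 2.5);
    INSERT INTO qtl VALUES ('PR10', 'bin 4.08'), ('LOX3', 'bin 1.05');
""")

# Candidate genes: up-regulated after infection AND inside a mapped QTL region.
rows = con.execute("""
    SELECT m.gene, m.fold_change, q.qtl_region
    FROM microarray m JOIN qtl q ON q.gene = m.gene
    WHERE m.fold_change > 2.0
    ORDER BY m.gene
""").fetchall()
```

Combining independent lines of evidence in one query is exactly what motivates integrating expression, proteomic and QTL data in a single relational schema.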
Photo-z-SQL: Integrated, flexible photometric redshift computation in a database
NASA Astrophysics Data System (ADS)
Beck, R.; Dobos, L.; Budavári, T.; Szalay, A. S.; Csabai, I.
2017-04-01
We present a flexible template-based photometric redshift estimation framework, implemented in C#, that can be seamlessly integrated into a SQL database (or DB) server and executed on-demand in SQL. The DB integration eliminates the need to move large photometric datasets outside a database for redshift estimation, and utilizes the computational capabilities of DB hardware. The code is able to perform both maximum likelihood and Bayesian estimation, and can handle inputs of variable photometric filter sets and corresponding broad-band magnitudes. It is possible to take into account the full covariance matrix between filters, and filter zero points can be empirically calibrated using measurements with given redshifts. The list of spectral templates and the prior can be specified flexibly, and the expensive synthetic magnitude computations are done via lazy evaluation, coupled with a caching of results. Parallel execution is fully supported. For large upcoming photometric surveys such as the LSST, the ability to perform in-place photo-z calculation would be a significant advantage. Also, the efficient handling of variable filter sets is a necessity for heterogeneous databases, for example the Hubble Source Catalog, and for cross-match services such as SkyQuery. We illustrate the performance of our code on two reference photo-z estimation testing datasets, and provide an analysis of execution time and scalability with respect to different configurations. The code is available for download at https://github.com/beckrob/Photo-z-SQL.
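The maximum-likelihood step of template-based photo-z estimation amounts to minimizing chi-square over a grid of (template, redshift) points. A toy sketch, with invented template fluxes and observations standing in for real synthetic magnitudes:

```python
# Schematic maximum-likelihood photo-z: pick the (template, redshift) grid point
# minimizing chi^2 against the observed broad-band fluxes. All numbers invented.

def chi2(obs, err, model):
    """Chi-square of a model flux vector against observed fluxes and errors."""
    return sum(((o - m) / e) ** 2 for o, m, e in zip(obs, model, err))

# model_grid[(template, z)] -> predicted fluxes in the survey's filter set
model_grid = {
    ("elliptical", 0.1): [1.0, 2.0, 3.0],
    ("elliptical", 0.5): [0.5, 1.5, 3.5],
    ("spiral",     0.1): [2.0, 2.0, 2.0],
}

obs = [0.55, 1.4, 3.4]   # observed broad-band fluxes
err = [0.1, 0.1, 0.1]    # photometric errors

best = min(model_grid, key=lambda k: chi2(obs, err, model_grid[k]))
template, z_best = best
```

Photo-z-SQL additionally supports Bayesian estimation, full filter covariance and zero-point calibration; this sketch shows only the core grid minimization.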
ARIADNE: a Tracking System for Relationships in LHCb Metadata
NASA Astrophysics Data System (ADS)
Shapoval, I.; Clemencic, M.; Cattaneo, M.
2014-06-01
The data processing model of the LHCb experiment implies handling of an evolving set of heterogeneous metadata entities and relationships between them. The entities range from software and database states to architecture specifications and software/data deployment locations. For instance, there is an important relationship between the LHCb Conditions Database (CondDB), which provides versioned, time-dependent geometry and conditions data, and the LHCb software, i.e. the data processing applications (used for simulation, high-level triggering, reconstruction and analysis of physics data). The evolution of the CondDB and of the LHCb applications is a weakly-homomorphic process: relationships between a CondDB state and an LHCb application state may not be preserved across different database and application generations. These issues may lead to various kinds of problems in LHCb production, varying from unexpected application crashes to incorrect data processing results. In this paper we present Ariadne, a generic metadata relationship tracking system based on the novel NoSQL Neo4j graph database. Its aim is to track and analyze many thousands of evolving relationships for cases such as the one described above, and several others, which would otherwise remain unmanaged and potentially harmful. The highlights of the paper include the system's implementation and management details, the infrastructure needed for running it, security issues, first experience of usage in LHCb production and the potential of the system to be applied to a wider set of LHCb tasks.
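Relationship tracking of the kind Ariadne performs can be sketched as a graph whose edges record which entity states are known to be compatible; the entity names and versions below are invented for illustration (Ariadne itself stores such edges in Neo4j):

```python
# Edges of a toy metadata graph: each CondDB state maps to the set of
# application states recorded as compatible with it. All names invented.
edges = {
    ("CondDB", "tag-A"): {("App", "v40"), ("App", "v41")},
    ("CondDB", "tag-B"): {("App", "v41")},
}

def compatible(conddb_state, app_state):
    """True if this CondDB state is recorded as compatible with this application state."""
    return app_state in edges.get(conddb_state, set())
```

A production check would then refuse to pair, say, CondDB tag-B with application v40, because the compatibility edge was never recorded, averting the crashes and wrong results described above.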
Fischer, Steve; Aurrecoechea, Cristina; Brunk, Brian P.; Gao, Xin; Harb, Omar S.; Kraemer, Eileen T.; Pennington, Cary; Treatman, Charles; Kissinger, Jessica C.; Roos, David S.; Stoeckert, Christian J.
2011-01-01
Web sites associated with the Eukaryotic Pathogen Bioinformatics Resource Center (EuPathDB.org) have recently introduced a graphical user interface, the Strategies WDK, intended to make advanced searching and set and interval operations easy and accessible to all users. With a design guided by usability studies, the system helps motivate researchers to perform dynamic computational experiments and explore relationships across data sets. For example, PlasmoDB users seeking novel therapeutic targets may wish to locate putative enzymes that distinguish pathogens from their hosts, and that are expressed during appropriate developmental stages. When a researcher runs one of the approximately 100 searches available on the site, the search is presented as a first step in a strategy. The strategy is extended by running additional searches, which are combined with set operators (union, intersect or minus), or genomic interval operators (overlap, contains). A graphical display uses Venn diagrams to make the strategy’s flow obvious. The interface facilitates interactive adjustment of the component searches with changes propagating forward through the strategy. Users may save their strategies, creating protocols that can be shared with colleagues. The strategy system has now been deployed on all EuPathDB databases, and successfully deployed by other projects. The Strategies WDK uses a configurable MVC architecture that is compatible with most genomics and biological warehouse databases, and is available for download at code.google.com/p/strategies-wdk. Database URL: www.eupathdb.org PMID:21705364
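The strategy operators described above reduce to ordinary set algebra on search-result ID sets. A minimal Python sketch, using hypothetical gene IDs rather than real search output:

```python
def combine(step_a, step_b, operator):
    """Combine two search-result ID sets the way a strategy step does."""
    if operator == "union":
        return step_a | step_b
    if operator == "intersect":
        return step_a & step_b
    if operator == "minus":
        return step_a - step_b
    raise ValueError(f"unknown operator: {operator}")

# e.g. putative enzymes intersected with stage-expressed genes
enzymes = {"gene1", "gene2", "gene3"}        # hypothetical IDs
expressed = {"gene2", "gene3", "gene4"}
targets = combine(enzymes, expressed, "intersect")
```

Changing an upstream step and re-running downstream combinations is the "changes propagating forward" behaviour the interface provides.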
DNAVaxDB: the first web-based DNA vaccine database and its data analysis
2014-01-01
Since the first DNA vaccine studies were done in the 1990s, thousands more studies have followed. Here we report the development and analysis of DNAVaxDB (http://www.violinet.org/dnavaxdb), the first publicly available web-based DNA vaccine database that curates, stores, and analyzes experimentally verified DNA vaccines, DNA vaccine plasmid vectors, and protective antigens used in DNA vaccines. All data in DNAVaxDB are annotated from reliable resources, particularly peer-reviewed articles. Among over 140 DNA vaccine plasmids, some plasmids were used more frequently for one type of pathogen than others; for example, pCMVi-UB for Gram-negative bacterial DNA vaccines, and pCAGGS for viral DNA vaccines. Presently, over 400 DNA vaccines containing over 370 protective antigens from over 90 infectious and non-infectious diseases have been curated in DNAVaxDB. While extracellular and bacterial cell surface proteins and adhesin proteins were frequently used for DNA vaccine development, the majority of protective antigens used in Chlamydophila DNA vaccines are localized to the inner portion of the cell. The regimen of DNA vaccine priming followed by boosting with another vaccine type has been widely used to induce protection against infection by different pathogens such as HIV. Parasitic and cancer DNA vaccines were also systematically analyzed. User-friendly web query and visualization interfaces are available in DNAVaxDB for interactive data search. To support data exchange, the information on DNA vaccines, plasmids, and protective antigens is stored in the Vaccine Ontology (VO). DNAVaxDB is targeted to become a timely and vital source of DNA vaccines and related data and to facilitate advanced DNA vaccine research and development. PMID:25104313
Vernetti, Lawrence; Bergenthal, Luke; Shun, Tong Ying; Taylor, D. Lansing
2016-01-01
Abstract Microfluidic human organ models, or microphysiology systems (MPS), are currently being developed as predictive models of drug safety and efficacy in humans. Designing and validating MPS as predictive of human safety liabilities requires safety data for a reference set of compounds, combined with in vitro data from the human organ models. To address this need, we have developed an internet database, the MPS database (MPS-Db), as a powerful platform for experimental design, data management, and analysis, and for combining experimental data with reference data to enable computational modeling. The present study demonstrates the capability of the MPS-Db in early safety testing using a human liver MPS to relate the effects of tolcapone and entacapone in the in vitro model to human in vivo effects. These two marketed drugs were chosen as a representative pair because they are structurally similar, have the same target, and were judged to have acceptable risk in preclinical studies and clinical trials, yet in clinical use tolcapone induced unacceptable levels of hepatotoxicity while entacapone was found to be safe. Results demonstrate the utility of the MPS-Db as an essential resource for relating in vitro organ model data to the multiple biochemical, preclinical, and clinical data sources on in vivo drug effects. PMID:28781990
Rail-dbGaP: analyzing dbGaP-protected data in the cloud with Amazon Elastic MapReduce.
Nellore, Abhinav; Wilks, Christopher; Hansen, Kasper D; Leek, Jeffrey T; Langmead, Ben
2016-08-15
Public archives contain thousands of trillions of bases of valuable sequencing data. More than 40% of the Sequence Read Archive is human data protected by provisions such as dbGaP. To analyze dbGaP-protected data, researchers must typically work with IT administrators and signing officials to ensure all levels of security are implemented at their institution. This is a major obstacle, impeding reproducibility and reducing the utility of archived data. We present a protocol and software tool for analyzing protected data in a commercial cloud. The protocol, Rail-dbGaP, is applicable to any tool running on Amazon Web Services Elastic MapReduce. The tool, Rail-RNA v0.2, is a spliced aligner for RNA-seq data, which we demonstrate by running on 9662 samples from the dbGaP-protected GTEx consortium dataset. The Rail-dbGaP protocol makes explicit for the first time the steps an investigator must take to develop Elastic MapReduce pipelines that analyze dbGaP-protected data in a manner compliant with NIH guidelines. Rail-RNA automates implementation of the protocol, making it easy for typical biomedical investigators to study protected RNA-seq data, regardless of their local IT resources or expertise. Rail-RNA is available from http://rail.bio. Technical details on the Rail-dbGaP protocol, as well as an implementation walkthrough, are available at https://github.com/nellore/rail-dbgap. Detailed instructions on running Rail-RNA on dbGaP-protected data using Amazon Web Services are available at http://docs.rail.bio/dbgap/. Contact: anellore@gmail.com or langmea@cs.jhu.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
Wilson, Richard H
2015-04-01
In 1940, a cooperative effort by the radio networks and Bell Telephone produced the volume unit (vu) meter that has been the mainstay instrument for monitoring the level of speech signals in commercial broadcasting and research laboratories. With the use of computers, today the amplitude of signals can be quantified easily using the root mean square (rms) algorithm. Researchers had previously reported that amplitude estimates of sentences and running speech were 4.8 dB higher when measured with a vu meter than when calculated with rms. This study addresses the vu-rms relation as applied to the carrier phrase and target word paradigm used to assess word-recognition abilities, the premise being that by definition the word-recognition paradigm is a special and different case from that described previously. The purpose was to evaluate the vu and rms amplitude relations for the carrier phrases and target words commonly used to assess word-recognition abilities. In addition, for the target words, the relation between rms level and recognition performance was examined. Descriptive and correlational. Two recorded versions of the Northwestern University Auditory Test No. 6 were evaluated, the Auditec of St. Louis (Auditec) male speaker and the Department of Veterans Affairs (VA) female speaker. Using both visual and auditory cues from a waveform editor, the temporal onsets and offsets were defined for each carrier phrase and each target word. The rms amplitudes for those segments then were computed and expressed in decibels with reference to the maximum digitization range. The data were maintained for each of the four Northwestern University Auditory Test No. 6 word lists. Descriptive analyses were used, with linear regressions evaluating the reliability of the measurement technique and the relation between the rms levels of the target words and recognition performances.
Although there was a 1.3 dB difference between the calibration tones, the mean levels of the carrier phrases for the two recordings were -14.8 dB (Auditec) and -14.1 dB (VA) with standard deviations <1 dB. For the target words, the mean amplitudes were -19.9 dB (Auditec) and -18.3 dB (VA) with standard deviations ranging from 1.3 to 2.4 dB. The mean durations for the carrier phrases of both recordings were 593-594 msec, with the mean durations of the target words a little different, 509 msec (Auditec) and 528 msec (VA). Random relations were observed between the recognition performances and rms levels of the target words. Amplitude and temporal data for the individual words are provided. The rms levels of the carrier phrases closely approximated (±1 dB) the rms levels of the calibration tones, both of which were set to 0 vu (dB). The rms levels of the target words were 5-6 dB below the levels of the carrier phrases and were substantially more variable than the levels of the carrier phrases. The relation between the rms levels of the target words and recognition performances on the words was random. American Academy of Audiology.
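The segment measurements above reduce to a standard rms-in-dB computation. A minimal Python sketch, assuming the waveform samples are normalized so the maximum digitization range is 1.0:

```python
import math

def rms_db(samples, full_scale=1.0):
    """RMS amplitude of a waveform segment, in dB re the maximum
    digitization range (the reference used for the word measurements)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms / full_scale)
```

A full-scale sine tone evaluates to about -3.01 dB on this scale; a target word measured 5-6 dB below a carrier phrase at -14 dB would read near -20 dB.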
HippDB: a database of readily targeted helical protein-protein interactions.
Bergey, Christina M; Watkins, Andrew M; Arora, Paramjit S
2013-11-01
HippDB catalogs every protein-protein interaction whose structure is available in the Protein Data Bank and which exhibits one or more helices at the interface. The Web site accepts queries on variables such as helix length and sequence, and it provides computational alanine scanning and change in solvent-accessible surface area values for every interfacial residue. HippDB is intended to serve as a starting point for structure-based small molecule and peptidomimetic drug development. HippDB is freely available on the web at http://www.nyu.edu/projects/arora/hippdb. The Web site is implemented in PHP, MySQL and Apache. Source code freely available for download at http://code.google.com/p/helidb, implemented in Perl and supported on Linux. arora@nyu.edu.
Pilon, Alan C; Valli, Marilia; Dametto, Alessandra C; Pinto, Meri Emili F; Freire, Rafael T; Castro-Gamboa, Ian; Andricopulo, Adriano D; Bolzani, Vanderlan S
2017-08-03
The intrinsic value of biodiversity extends beyond species diversity, genetic heritage, ecosystem variability and ecological services such as climate regulation, water quality, nutrient cycling and the provision of reproductive habitats; it is also an inexhaustible source of molecules and products beneficial to human well-being. To uncover the chemistry of Brazilian natural products, the Nuclei of Bioassays, Ecophysiology and Biosynthesis of Natural Products Database (NuBBEDB) was created as the first natural product library from Brazilian biodiversity. Since its launch in 2013, the NuBBEDB has proven to be an important resource for new drug design and dereplication studies. Consequently, continuous efforts have been made to expand its contents and include a greater diversity of natural sources to establish it as a comprehensive compendium of available biogeochemical information about Brazilian biodiversity. The content in the NuBBEDB is freely accessible online (https://nubbe.iq.unesp.br/portal/nubbedb.html) and provides validated multidisciplinary information, chemical descriptors, species sources, geographic locations, spectroscopic data (NMR) and pharmacological properties. Herein, we report the latest advancements concerning the interface, content and functionality of the NuBBEDB. We also present a preliminary study on the current profile of the compounds present in Brazilian territory.
CerealsDB 3.0: expansion of resources and data integration.
Wilkinson, Paul A; Winfield, Mark O; Barker, Gary L A; Tyrrell, Simon; Bian, Xingdong; Allen, Alexandra M; Burridge, Amanda; Coghill, Jane A; Waterfall, Christy; Caccamo, Mario; Davey, Robert P; Edwards, Keith J
2016-06-24
The increase in human populations around the world has put pressure on resources, and as a consequence food security has become an important challenge for the 21st century. Wheat (Triticum aestivum) is one of the most important crops in human and livestock diets, and the development of wheat varieties that produce higher yields, combined with increased resistance to pests and resilience to changes in climate, has meant that wheat breeding has become an important focus of scientific research. In an attempt to facilitate these improvements in wheat, plant breeders have employed molecular tools to help them identify genes for important agronomic traits that can be bred into new varieties. Modern molecular techniques have ensured that the rapid and inexpensive characterisation of SNP markers and their validation with modern genotyping methods have produced a valuable resource that can be used in marker assisted selection. CerealsDB was created as a means of quickly disseminating this information to breeders and researchers around the globe. CerealsDB version 3.0 is an online resource that contains a wide range of genomic datasets for wheat that will assist plant breeders and scientists to select the most appropriate markers for use in marker assisted selection. CerealsDB includes a database which currently contains in excess of a million putative varietal SNPs, of which several hundred thousand have been experimentally validated. In addition, CerealsDB also contains new data on functional SNPs predicted to have a major effect on protein function, and we have constructed a web service to encourage data integration and high-throughput programmatic access. CerealsDB is an open access website that hosts information on SNPs that are considered useful for both plant breeders and research scientists.
The recent inclusion of web services designed to federate genomic data resources allows the information on CerealsDB to be more fully integrated with the WheatIS network and other biological databases.
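Programmatic access of the kind such a web service enables might look like the following Python sketch; the endpoint path and parameter names here are hypothetical illustrations, not the documented CerealsDB API.

```python
from urllib.parse import urlencode

def snp_query_url(base, chromosome, start, end, fmt="json"):
    """Build a query URL for SNPs in a chromosome region.

    The '/snps' path and the 'chrom'/'start'/'end'/'format' parameter
    names are invented for this sketch.
    """
    params = urlencode({"chrom": chromosome, "start": start,
                        "end": end, "format": fmt})
    return f"{base}/snps?{params}"
```

A client would fetch such a URL and parse the JSON response, which is what "high-throughput programmatic access" amounts to in practice.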
Joint Battlespace Infosphere: Information Management Within a C2 Enterprise
2005-06-01
using. In version 1.2, we support both MySQL and Oracle as underlying implementations where the XML metadata schema is mapped into relational tables. ... Identity Servers, Role-Based Access Control, and Policy Representation. Databases: Oracle, MySQL, TigerLogic, Berkeley XML DB. ... Instrumentation Services ... converted to SQL for execution. Invocations are then forwarded to the appropriate underlying IOR core components that have the responsibility of issuing
P³DB 3.0: From plant phosphorylation sites to protein networks.
Yao, Qiuming; Ge, Huangyi; Wu, Shangquan; Zhang, Ning; Chen, Wei; Xu, Chunhui; Gao, Jianjiong; Thelen, Jay J; Xu, Dong
2014-01-01
In the past few years, the Plant Protein Phosphorylation Database (P3DB, http://p3db.org) has become one of the most significant in vivo data resources for studying plant phosphoproteomics. We have substantially updated P3DB with respect to format, new datasets and analytic tools. In P3DB 3.0, there are altogether 47 923 phosphosites in 16 477 phosphoproteins curated across nine plant organisms from 32 studies, which have met our multiple quality standards for acquisition of in vivo phosphorylation site data. Centralized around these phosphorylation data, multiple related data and annotations are provided, including protein-protein interaction (PPI), gene ontology, protein tertiary structures, orthologous sequences, kinase/phosphatase classification and Kinase Client Assay (KiC Assay) data, all of which provide context for the phosphorylation events. In addition, P3DB 3.0 incorporates multiple network viewers for the above features, such as the PPI network, kinase-substrate network, phosphatase-substrate network, and domain co-occurrence network, to help study phosphorylation from a systems point of view. Furthermore, the new P3DB reflects a community-based design through which users can share datasets and automate data depository processes for publication purposes. Each of these new features supports the goal of making P3DB a comprehensive, systematic and interactive platform for phosphoproteomics research.
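A kinase-substrate network is, at bottom, a directed edge list. This Python toy (not the actual P3DB schema; the kinase and substrate names are merely illustrative) shows the sort of query the network viewers answer:

```python
def substrates_of(edge_list, kinase):
    """Substrates linked to `kinase` in a (kinase, substrate) edge list."""
    return sorted({substrate for k, substrate in edge_list if k == kinase})

# hypothetical kinase-substrate edges for illustration
edges = [("KinaseA", "Sub1"), ("KinaseA", "Sub2"), ("KinaseB", "Sub1")]
```

Inverting the direction of the same edge list gives the phosphatase-substrate or kinase-of-substrate view, which is why one data structure supports several viewers.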
E-Commerce May Help Colleges Cut Costs and Paperwork.
ERIC Educational Resources Information Center
Olsen, Florence
2000-01-01
Describes the increasing trend of incorporating electronic commerce methods to purchasing systems at colleges and universities. Provides examples from the University of Pennsylvania, Harvard University (Massachusetts), California State University at Fullerton, and the University of California at Los Angeles. (DB)
Automated CFD Database Generation for a 2nd Generation Glide-Back-Booster
NASA Technical Reports Server (NTRS)
Chaderjian, Neal M.; Rogers, Stuart E.; Aftosmis, Michael J.; Pandya, Shishir A.; Ahmad, Jasim U.; Tejmil, Edward
2003-01-01
A new software tool, AeroDB, is used to compute thousands of Euler and Navier-Stokes solutions for a 2nd generation glide-back booster in one week. The solution process exploits a common job-submission grid environment using 13 computers located at 4 different geographical sites. Process automation and web-based access to the database greatly reduces the user workload, removing much of the tedium and tendency for user input errors. The database consists of forces, moments, and solution files obtained by varying the Mach number, angle of attack, and sideslip angle. The forces and moments compare well with experimental data. Stability derivatives are also computed using a monotone cubic spline procedure. Flow visualization and three-dimensional surface plots are used to interpret and characterize the nature of computed flow fields.
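Computing stability derivatives from the tabulated forces and moments amounts to differentiating a fitted curve with respect to, say, angle of attack. The paper uses a monotone cubic spline; the sketch below substitutes a simpler central-difference estimate to show the idea.

```python
def central_derivative(x, y):
    """Slope dy/dx at the interior grid points by central differences,
    a simplified stand-in for the monotone cubic spline procedure
    actually used for the stability derivatives."""
    return [(y[i + 1] - y[i - 1]) / (x[i + 1] - x[i - 1])
            for i in range(1, len(x) - 1)]
```

Applied to a pitching-moment coefficient tabulated against angle of attack, the result approximates the pitch-stiffness derivative at each interior point.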
Climatological Data Option in My Weather Impacts Decision Aid (MyWIDA) Overview
2017-07-18
rules. It consists of 2 databases (including ClimoDB), a data service (with a data requestor, data decoder, post processor, and job scheduler), a collection of web services (including an impact overlay web service), and web applications with a graphical user interface that show weather impacts on selected...
Draper, John; Enot, David P; Parker, David; Beckmann, Manfred; Snowdon, Stuart; Lin, Wanchang; Zubair, Hassan
2009-01-01
Background Metabolomics experiments using Mass Spectrometry (MS) technology measure the mass to charge ratio (m/z) and intensity of ionised molecules in crude extracts of complex biological samples to generate high dimensional metabolite 'fingerprint' or metabolite 'profile' data. High resolution MS instruments perform routinely with a mass accuracy of < 5 ppm (parts per million), thus potentially providing a direct method for putative signal annotation using databases containing metabolite mass information. Most database interfaces support only simple queries with the default assumption that molecules either gain or lose a single proton when ionised. In reality the annotation process is confounded by the fact that many ionisation products will be not only molecular isotopes but also salt/solvent adducts and neutral loss fragments of original metabolites. This report describes an annotation strategy that allows searching based on all potential ionisation products predicted to form during electrospray ionisation (ESI). Results Metabolite 'structures' harvested from publicly accessible databases were converted into a common format to generate a comprehensive archive in MZedDB. 'Rules' were derived from chemical information that allowed MZedDB to generate a list of adducts and neutral loss fragments putatively able to form for each structure and to calculate, on the fly, the exact molecular weight of every potential ionisation product, providing targets for annotation searches based on accurate mass. We demonstrate that data matrices representing populations of ionisation products generated from different biological matrices contain a large proportion (sometimes > 50%) of molecular isotopes, salt adducts and neutral loss fragments. Correlation analysis of ESI-MS data features confirmed the predicted relationships of m/z signals. An integrated isotope enumerator in MZedDB allowed verification of exact isotopic pattern distributions to corroborate experimental data.
Conclusion We conclude that although ultra-high accurate mass instruments provide major insight into the chemical diversity of biological extracts, the facile annotation of a large proportion of signals is not possible by simple, automated query of current databases using computed molecular formulae. Parameterising MZedDB to take into account predicted ionisation behaviour and the biological source of any sample improves greatly both the frequency and accuracy of potential annotation 'hits' in ESI-MS data. PMID:19622150
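The 'rules' mechanism can be illustrated with a tiny Python table of ionisation-product mass shifts. MZedDB derives a far larger rule set, including neutral losses and multiply charged species; only three common singly charged ESI adducts are shown here.

```python
# common ESI adduct rules: adduct -> (charge, mass shift in Da)
# shifts include the electron mass, hence 1.007276 Da for a proton
ADDUCT_RULES = {
    "[M+H]+":  (1,  1.007276),
    "[M+Na]+": (1, 22.989218),
    "[M-H]-":  (1, -1.007276),
}

def adduct_mz(neutral_monoisotopic_mass, adduct):
    """Exact m/z of an ionisation product from the neutral molecular mass."""
    charge, shift = ADDUCT_RULES[adduct]
    return (neutral_monoisotopic_mass + shift) / charge
```

For glucose (monoisotopic mass 180.0634 Da), the [M+H]+ target comes out near m/z 181.0707 and [M+Na]+ near 203.0526, which is how accurate-mass searches gain multiple targets per structure.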
myPhyloDB: a local web-server and database for the storage and analysis of metagenomics data
USDA-ARS?s Scientific Manuscript database
The advent of next-generation sequencing has resulted in an explosion of metagenomics data associated with microbial communities from a variety of ecosystems. However, no database and/or analytical software is currently available that allows for archival and cross-study comparison of such data. my...
2012-11-27
with powerful analysis tools and an informatics approach leveraging best-of-breed NoSQL databases, in order to store, search and retrieve relevant...dictionaries, and JavaScript also has good support. The MongoDB project[15] was chosen as a scalable NoSQL data store for the cheminformatics components
A database for reproducible manipulation research: CapriDB - Capture, Print, Innovate.
Pokorny, Florian T; Bekiroglu, Yasemin; Pauwels, Karl; Butepage, Judith; Scherer, Clara; Kragic, Danica
2017-04-01
We present a novel approach and database which combines the inexpensive generation of 3D object models via monocular or RGB-D camera images with 3D printing and a state of the art object tracking algorithm. Unlike recent efforts towards the creation of 3D object databases for robotics, our approach does not require expensive and controlled 3D scanning setups and aims to enable anyone with a camera to scan, print and track complex objects for manipulation research. The proposed approach results in detailed textured mesh models whose 3D printed replicas provide close approximations of the originals. A key motivation for utilizing 3D printed objects is the ability to precisely control and vary object properties such as the size, material properties and mass distribution in the 3D printing process to obtain reproducible conditions for robotic manipulation research. We present CapriDB - an extensible database resulting from this approach containing initially 40 textured and 3D printable mesh models together with tracking features to facilitate the adoption of the proposed approach.
Featured Article: Genotation: Actionable knowledge for the scientific reader
Willis, Ethan; Sakauye, Mark; Jose, Rony; Chen, Hao; Davis, Robert L
2016-01-01
We present an article viewer application that allows a scientific reader to easily discover and share knowledge by linking genomics-related concepts to knowledge of disparate biomedical databases. High-throughput data streams generated by technical advancements have contributed to scientific knowledge discovery at an unprecedented rate. Biomedical Informaticists have created a diverse set of databases to store and retrieve the discovered knowledge. The diversity and abundance of such resources present biomedical researchers a challenge with knowledge discovery. These challenges highlight a need for a better informatics solution. We use a text mining algorithm, Genomine, to identify gene symbols from the text of a journal article. The identified symbols are supplemented with information from the GenoDB knowledgebase. Self-updating GenoDB contains information from NCBI Gene, Clinvar, Medgen, dbSNP, KEGG, PharmGKB, Uniprot, and Hugo Gene databases. The journal viewer is a web application accessible via a web browser. The features described herein are accessible on www.genotation.org. The Genomine algorithm identifies gene symbols with an accuracy of 0.65 (F-score). GenoDB currently contains information regarding 59,905 gene symbols, 5633 drug–gene relationships, 5981 gene–disease relationships, and 713 pathways. This application provides scientific readers with actionable knowledge related to concepts of a manuscript. The reader will be able to save and share supplements to be visualized in a graphical manner. This provides convenient access to details of complex biological phenomena, enabling biomedical researchers to generate novel hypotheses to further our knowledge in human health. This manuscript presents a novel application that integrates genomic, proteomic, and pharmacogenomic information to supplement content of a biomedical manuscript and enable readers to automatically discover actionable knowledge. PMID:26900164
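A crude stand-in for the gene-symbol identification step (not the Genomine algorithm itself, which is far more sophisticated and achieves the reported F-score) is a dictionary-filtered token scan:

```python
import re

def find_gene_symbols(text, known_symbols):
    """Toy gene-symbol spotter: collect capitalised alphanumeric tokens
    and keep those present in a known-symbol dictionary."""
    candidates = re.findall(r"\b[A-Z][A-Z0-9]{1,9}\b", text)
    return sorted({c for c in candidates if c in known_symbols})
```

The dictionary filter is what suppresses look-alike tokens such as "DNA"; the hard part of the real algorithm is resolving genuinely ambiguous symbols in context.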
Featured Article: Genotation: Actionable knowledge for the scientific reader.
Nagahawatte, Panduka; Willis, Ethan; Sakauye, Mark; Jose, Rony; Chen, Hao; Davis, Robert L
2016-06-01
We present an article viewer application that allows a scientific reader to easily discover and share knowledge by linking genomics-related concepts to knowledge of disparate biomedical databases. High-throughput data streams generated by technical advancements have contributed to scientific knowledge discovery at an unprecedented rate. Biomedical Informaticists have created a diverse set of databases to store and retrieve the discovered knowledge. The diversity and abundance of such resources present biomedical researchers a challenge with knowledge discovery. These challenges highlight a need for a better informatics solution. We use a text mining algorithm, Genomine, to identify gene symbols from the text of a journal article. The identified symbols are supplemented with information from the GenoDB knowledgebase. Self-updating GenoDB contains information from NCBI Gene, Clinvar, Medgen, dbSNP, KEGG, PharmGKB, Uniprot, and Hugo Gene databases. The journal viewer is a web application accessible via a web browser. The features described herein are accessible on www.genotation.org. The Genomine algorithm identifies gene symbols with an accuracy of 0.65 (F-score). GenoDB currently contains information regarding 59,905 gene symbols, 5633 drug-gene relationships, 5981 gene-disease relationships, and 713 pathways. This application provides scientific readers with actionable knowledge related to concepts of a manuscript. The reader will be able to save and share supplements to be visualized in a graphical manner. This provides convenient access to details of complex biological phenomena, enabling biomedical researchers to generate novel hypotheses to further our knowledge in human health. This manuscript presents a novel application that integrates genomic, proteomic, and pharmacogenomic information to supplement content of a biomedical manuscript and enable readers to automatically discover actionable knowledge.
© 2016 by the Society for Experimental Biology and Medicine.
2011-01-01
Background Renewed interest in plant × environment interactions has risen in the post-genomic era. In this context, high-throughput phenotyping platforms have been developed to create reproducible environmental scenarios in which the phenotypic responses of multiple genotypes can be analysed in a reproducible way. These platforms benefit hugely from the development of suitable databases for storage, sharing and analysis of the large amount of data collected. In the model plant Arabidopsis thaliana, most databases available to the scientific community contain data related to genetic and molecular biology and are characterised by an inadequacy in the description of plant developmental stages and experimental metadata such as environmental conditions. Our goal was to develop a comprehensive information system for sharing of the data collected in PHENOPSIS, an automated platform for Arabidopsis thaliana phenotyping, with the scientific community. Description PHENOPSIS DB is a publicly available (URL: http://bioweb.supagro.inra.fr/phenopsis/) information system developed for storage, browsing and sharing of online data generated by the PHENOPSIS platform and offline data collected by experimenters and experimental metadata. It provides modules coupled to a Web interface for (i) the visualisation of environmental data of an experiment, (ii) the visualisation and statistical analysis of phenotypic data, and (iii) the analysis of Arabidopsis thaliana plant images. Conclusions Firstly, data stored in the PHENOPSIS DB are of interest to the Arabidopsis thaliana community, particularly in allowing phenotypic meta-analyses directly linked to environmental conditions on which publications are still scarce. Secondly, data or image analysis modules can be downloaded from the Web interface for direct usage or as the basis for modifications according to new requirements. 
Finally, the structure of PHENOPSIS DB provides a useful template for the development of other similar databases related to genotype × environment interactions. PMID:21554668
Flandrois, Jean-Pierre; Lina, Gérard; Dumitrescu, Oana
2014-04-14
Tuberculosis is an infectious bacterial disease caused by Mycobacterium tuberculosis. It remains a major health threat, killing over one million people every year worldwide. An early antibiotic therapy is the basis of the treatment, and the emergence and spread of multidrug and extensively drug-resistant mutant strains raise significant challenges. As these bacteria grow very slowly, drug resistance mutations are currently detected using molecular biology techniques. Resistance mutations are identified by sequencing the resistance-linked genes followed by a comparison with the literature data. The only online database is the TB Drug Resistance Mutation database (TBDReaM database); however, it requires mutation detection before use, and its interrogation is complex due to its loose syntax and grammar. The MUBII-TB-DB database is a simple, highly structured text-based database that contains a set of Mycobacterium tuberculosis mutations (DNA and proteins) occurring at seven loci: rpoB, pncA, katG, mabA(fabG1)-inhA, gyrA, gyrB, and rrs. Resistance mutation data were extracted after a systematic review of MEDLINE-referenced publications before March 2013. MUBII analyzes the query sequence obtained by PCR-sequencing using two parallel strategies: i) a BLAST search against a set of previously reconstructed mutated sequences and ii) the alignment of the query sequences (DNA and its protein translation) with the wild-type sequences. The post-treatment includes the extraction of the aligned sequences together with their descriptors (position and nature of mutations). The whole procedure is performed over the internet. The results are graphs (alignments) and text (description of the mutation, therapeutic significance). The system is quick and easy to use, even for technicians without bioinformatics training. MUBII-TB-DB is a structured database of the mutations occurring at seven loci of major therapeutic value in tuberculosis management.
Moreover, the system provides interpretation of the mutations in biological and therapeutic terms and can evolve by the addition of newly described mutations. Its goal is to provide easy and comprehensive access through a client-server model over the Web to an up-to-date database of mutations that lead to the resistance of M. tuberculosis to antibiotics.
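The post-treatment step, extracting mutation descriptors (position and nature) from an alignment of query against wild-type, can be sketched as follows. This is a simplified illustration assuming pre-aligned, equal-length sequences; the actual MUBII pipeline uses BLAST and full alignment of DNA and protein translations:

```python
def extract_mutations(wild_type, query):
    """Compare an aligned query sequence against the wild-type sequence
    and report substitutions as (position, ref, alt) descriptors.
    Positions are 1-based, as in conventional mutation nomenclature."""
    if len(wild_type) != len(query):
        raise ValueError("sequences must be aligned to equal length")
    return [(i + 1, ref, alt)
            for i, (ref, alt) in enumerate(zip(wild_type, query))
            if ref != alt and alt != "-"]

# Hypothetical toy sequences, not real rpoB/katG fragments
muts = extract_mutations("TCGACT", "TCAACT")
# muts == [(3, 'G', 'A')]
```

A report line such as "position 3, G→A" would then be rendered alongside the graphical alignment and its therapeutic interpretation.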
Using Web Ontology Language to Integrate Heterogeneous Databases in the Neurosciences
Lam, Hugo Y.K.; Marenco, Luis; Shepherd, Gordon M.; Miller, Perry L.; Cheung, Kei-Hoi
2006-01-01
Integrative neuroscience involves the integration and analysis of diverse types of neuroscience data involving many different experimental techniques. This data will increasingly be distributed across many heterogeneous databases that are web-accessible. Currently, these databases do not expose their schemas (database structures) and their contents to web applications/agents in a standardized, machine-friendly way. This limits database interoperation. To address this problem, we describe a pilot project that illustrates how neuroscience databases can be expressed using the Web Ontology Language, which is a semantically-rich ontological language, as a common data representation language to facilitate complex cross-database queries. In this pilot project, an existing tool called “D2RQ” was used to translate two neuroscience databases (NeuronDB and CoCoDat) into OWL, and the resulting OWL ontologies were then merged. An OWL-based reasoner (Racer) was then used to provide a sophisticated query language (nRQL) to perform integrated queries across the two databases based on the merged ontology. This pilot project is one step toward exploring the use of semantic web technologies in the neurosciences. PMID:17238384
Creating a FIESTA (Framework for Integrated Earth Science and Technology Applications) with MagIC
NASA Astrophysics Data System (ADS)
Minnett, R.; Koppers, A. A. P.; Jarboe, N.; Tauxe, L.; Constable, C.
2017-12-01
The Magnetics Information Consortium (https://earthref.org/MagIC) has recently developed a containerized web application to considerably reduce the friction in contributing, exploring and combining valuable and complex datasets for the paleo-, geo- and rock magnetic scientific community. The data produced in this scientific domain are inherently hierarchical, and the community's evolving approaches to this scientific workflow, from sampling to taking measurements to multiple levels of interpretations, require a large and flexible data model to adequately annotate the results and ensure reproducibility. Historically, contributing such detail in a consistent format has been prohibitively time-consuming and often resulted in only publishing the highly derived interpretations. The new open-source (https://github.com/earthref/MagIC) application provides a flexible upload tool integrated with the data model to easily create a validated contribution and a powerful search interface for discovering datasets and combining them to enable transformative science. MagIC is hosted at EarthRef.org along with several interdisciplinary geoscience databases. A FIESTA (Framework for Integrated Earth Science and Technology Applications) is being created by generalizing MagIC's web application for reuse in other domains. The application relies on a single configuration document that describes the routing, data model, component settings and external services integrations. The container hosts an isomorphic Meteor JavaScript application, MongoDB database and ElasticSearch search engine. Multiple containers can be configured as microservices to serve portions of the application or rely on externally hosted MongoDB, ElasticSearch, or third-party services to efficiently scale computational demands.
FIESTA is particularly well suited for many Earth Science disciplines with its flexible data model, mapping, account management, upload tool to private workspaces, reference metadata, image galleries, full text searches and detailed filters. EarthRef's Seamount Catalog of bathymetry and morphology data, EarthRef's Geochemical Earth Reference Model (GERM) databases, and Oregon State University's Marine and Geology Repository (http://osu-mgr.org) will benefit from custom adaptations of FIESTA.
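As noted, the application is driven by a single configuration document describing routing, data model, component settings and external service integrations. A hypothetical fragment is sketched below; all field names and values are illustrative and do not reproduce FIESTA's actual schema:

```json
{
  "routes": { "/": "Home", "/search": "SearchInterface", "/upload": "UploadTool" },
  "dataModel": { "version": "3.0", "tables": ["contribution", "locations", "sites", "samples"] },
  "components": { "map": { "enabled": true }, "imageGallery": { "enabled": true } },
  "services": {
    "search": "http://elasticsearch:9200",
    "database": "mongodb://mongodb:27017/magic"
  }
}
```

Pointing the "services" entries at externally hosted engines, rather than the ones in the container, is what allows portions of the application to scale as microservices.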
MutAIT: an online genetic toxicology data portal and analysis tools.
Avancini, Daniele; Menzies, Georgina E; Morgan, Claire; Wills, John; Johnson, George E; White, Paul A; Lewis, Paul D
2016-05-01
Assessment of genetic toxicity and/or carcinogenic activity is an essential element of chemical screening programs employed to protect human health. Dose-response and gene mutation data are frequently analysed by industry, academia and governmental agencies for regulatory evaluations and decision making. Over the years, a number of efforts at different institutions have led to the creation and curation of databases to house genetic toxicology data, largely, with the aim of providing public access to facilitate research and regulatory assessments. This article provides a brief introduction to a new genetic toxicology portal called Mutation Analysis Informatics Tools (MutAIT) (www.mutait.org) that provides easy access to two of the largest genetic toxicology databases, the Mammalian Gene Mutation Database (MGMD) and TransgenicDB. TransgenicDB is a comprehensive collection of transgenic rodent mutation data initially compiled and collated by Health Canada. The updated MGMD contains approximately 50 000 individual mutation spectral records from the published literature. The portal not only gives access to an enormous quantity of genetic toxicology data, but also provides statistical tools for dose-response analysis and calculation of benchmark dose. Two important R packages for dose-response analysis are provided as web-distributed applications with user-friendly graphical interfaces. The 'drsmooth' package performs dose-response shape analysis and determines various points of departure (PoD) metrics and the 'PROAST' package provides algorithms for dose-response modelling. The MutAIT statistical tools, which are currently being enhanced, provide users with an efficient and comprehensive platform to conduct quantitative dose-response analyses and determine PoD values that can then be used to calculate human exposure limits or margins of exposure. © The Author 2015. Published by Oxford University Press on behalf of the UK Environmental Mutagen Society. 
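The benchmark-dose idea behind these tools can be illustrated with a minimal sketch: the BMD is the dose at which the response first exceeds the background response by the benchmark response (BMR). This pure-Python linear interpolation is only an illustration of the concept, not the model-fitting approach that PROAST or drsmooth actually implements, and the dose-response points are invented:

```python
def benchmark_dose(doses, responses, bmr=0.1):
    """Estimate the benchmark dose by linear interpolation: the dose at
    which the response first exceeds background + bmr (added risk).
    `doses` and `responses` are parallel, dose-sorted sequences."""
    background = responses[0]
    target = background + bmr
    for (d0, r0), (d1, r1) in zip(zip(doses, responses),
                                  zip(doses[1:], responses[1:])):
        if r0 <= target <= r1:
            # linear interpolation between the bracketing points
            return d0 + (target - r0) * (d1 - d0) / (r1 - r0)
    return None  # BMR never reached within the tested dose range

bmd = benchmark_dose([0, 10, 50, 100], [0.02, 0.05, 0.12, 0.30], bmr=0.1)
# ≈ 50.0 for these illustrative points
```

A BMD obtained this way (in practice, its lower confidence bound from a fitted model) is the point of departure used to derive exposure limits or margins of exposure.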
Internet application to tele-audiology--"nothin' but net".
Givens, Gregg D; Elangovan, Saravanan
2003-12-01
The Telehealth program at East Carolina University has developed a system for real-time assessment of auditory thresholds using computer driven control of a remote audiometer via the Internet. The present study used 45 adult participants in a double-blind study of 2 different systems: a conventional audiometer and an audiometer operated remotely via the Internet. The audiometric thresholds assessed by these 2 systems varied by no more than 1.3 dB for air conduction and 1.2 dB for bone conduction. The results demonstrated the feasibility of this new "telehearing" audiometric system. With the rapid development of Internet-based applications, telehealth has the potential to provide important healthcare coverage for rural areas where specialized audiological services are lacking.
Mazzarelli, Joan M; Brestelli, John; Gorski, Regina K; Liu, Junmin; Manduchi, Elisabetta; Pinney, Deborah F; Schug, Jonathan; White, Peter; Kaestner, Klaus H; Stoeckert, Christian J
2007-01-01
EPConDB (http://www.cbil.upenn.edu/EPConDB) is a public web site that supports research in diabetes, pancreatic development and beta-cell function by providing information about genes expressed in cells of the pancreas. EPConDB displays expression profiles for individual genes and information about transcripts, promoter elements and transcription factor binding sites. Gene expression results are obtained from studies examining tissue expression, pancreatic development and growth, differentiation of insulin-producing cells, islet or beta-cell injury, and genetic models of impaired beta-cell function. The expression datasets are derived using different microarray platforms, including the BCBC PancChips and Affymetrix gene expression arrays. Other datasets include semi-quantitative RT-PCR and MPSS expression studies. For selected microarray studies, lists of differentially expressed genes, derived from PaGE analysis, are displayed on the site. EPConDB provides database queries and tools to examine the relationship between a gene, its transcriptional regulation, protein function and expression in pancreatic tissues.
Loeffler, Ivonne; Liebisch, Marita; Daniel, Christoph; Amann, Kerstin; Wolf, Gunter
2017-12-01
Progressive diabetic nephropathy (DN) is characterized by tubulointerstitial fibrosis that is caused by accumulation of extracellular matrix. Induced by several factors, matrix-producing myofibroblasts may to some extent originate from tubular cells by epithelial-to-mesenchymal transition (EMT). Although previous data document that activation of hypoxia-inducible factor (HIF) signalling can be renoprotective in acute kidney disease, this issue remains controversial in chronic kidney injury. Here, we studied whether DN and EMT-like changes are ameliorated in a mouse model of type 2 diabetes mellitus with increased stability and activity of the HIF. We used db/db mice that were crossed with transgenic mice expressing reduced levels of mitogen-activated protein kinase organizer 1 (MORG1), a scaffold protein interacting with prolyl hydroxylase domain 3 (PHD3), because of deletion of one MORG1 allele. We found significantly reduced nephropathy in diabetic MORG1+/- heterozygous mice compared with the diabetic wild-types (db/dbXMORG1+/+). Furthermore, we demonstrated that EMT-like changes in the tubulointerstitium of diabetic wild-type MORG1+/+ mice are present, whereas diabetic mice with reduced expression of MORG1 showed significantly fewer EMT-like changes. These findings reveal that a deletion of one MORG1 allele inhibits the development of DN in db/db mice. The data suggest that the diminished interstitial fibrosis in these mice is a likely consequence of suppressed EMT-like changes. © The Author 2017. Published by Oxford University Press on behalf of ERA-EDTA. All rights reserved.
NeuroTransDB: highly curated and structured transcriptomic metadata for neurodegenerative diseases
Bagewadi, Shweta; Adhikari, Subash; Dhrangadhariya, Anjani; Irin, Afroza Khanam; Ebeling, Christian; Namasivayam, Aishwarya Alex; Page, Matthew; Hofmann-Apitius, Martin
2015-01-01
Neurodegenerative diseases are chronic debilitating conditions, characterized by progressive loss of neurons, that represent a significant health care burden as the global elderly population continues to grow. Over the past decade, high-throughput technologies such as the Affymetrix GeneChip microarrays have provided new perspectives into the pathomechanisms underlying neurodegeneration. Public transcriptomic data repositories, namely Gene Expression Omnibus and curated ArrayExpress, enable researchers to conduct integrative meta-analyses, increasing the power to detect differentially regulated genes in disease and to explore patterns of gene dysregulation across biologically related studies. The reliability of retrospective, large-scale integrative analyses depends on an appropriate combination of related datasets, in turn requiring detailed meta-annotations capturing the experimental setup. In most cases, we observe huge variation in compliance with defined standards for submitted metadata in public databases. Much of the information needed to complete or refine meta-annotations is distributed across the associated publications. For example, tissue preparation or comorbidity information is frequently described in an article's supplementary tables. Several value-added databases have employed additional manual efforts to overcome this limitation. However, none of these databases provides annotations that distinguish human and animal models in the context of neurodegeneration. Therefore, adopting a more specific disease focus, in combination with dedicated disease ontologies, will better empower the selection of comparable studies with refined annotations to address the research question at hand. In this article, we describe the detailed development of NeuroTransDB, a manually curated database containing metadata annotations for neurodegenerative studies.
The database contains more than 20 dimensions of metadata annotations within 31 mouse, 5 rat and 45 human studies, defined in collaboration with domain disease experts. We describe the step-by-step guidelines used to critically prioritize studies from public archives and to curate their metadata, and discuss the key challenges encountered. Curated metadata for Alzheimer's disease gene expression studies are available for download. Database URL: www.scai.fraunhofer.de/NeuroTransDB.html PMID:26475471
mpMoRFsDB: a database of molecular recognition features in membrane proteins.
Gypas, Foivos; Tsaousis, Georgios N; Hamodrakas, Stavros J
2013-10-01
Molecular recognition features (MoRFs) are small, intrinsically disordered regions in proteins that undergo a disorder-to-order transition on binding to their partners. MoRFs are involved in protein-protein interactions and may function as the initial step in molecular recognition. The aim of this work was to collect, organize and store all membrane proteins that contain MoRFs. Membrane proteins constitute ∼30% of fully sequenced proteomes and are responsible for a wide variety of cellular functions. MoRFs were classified according to their secondary structure, after interacting with their partners. We identified MoRFs in transmembrane and peripheral membrane proteins. The position of transmembrane protein MoRFs was determined in relation to a protein's topology. All information was stored in a publicly available mySQL database with a user-friendly web interface. A Jmol applet is integrated for visualization of the structures. mpMoRFsDB provides valuable information related to disorder-based protein-protein interactions in membrane proteins. http://bioinformatics.biol.uoa.gr/mpMoRFsDB
NASA Astrophysics Data System (ADS)
Park, J. H.; Chi, H. C.; Lim, I. S.; Seong, Y. J.; Pak, J.
2017-12-01
During the first phase of EEW (Earthquake Early Warning) service to the public by KMA (Korea Meteorological Administration) from 2015 in Korea, KIGAM (Korea Institute of Geoscience and Mineral Resources) adopted ElarmS2 of UC Berkeley BSL and modified the local magnitude relation, travel-time curves, and association procedures, adding the so-called TrigDB back-filling method. The TrigDB back-filling method uses a database of sorted lists of stations based on epicentral distances of pre-defined events located on a grid, for 1,401 × 1,601 = 2,243,001 events around the Korean Peninsula at a grid spacing of 0.05 degrees. When the version of an event is updated, the TrigDB back-filling method is invoked. First, the grid point closest to the epicenter of an event is chosen from the database, and candidate stations, which are stations corresponding to the chosen grid point and also adjacent to the already-associated stations, are selected. Second, the directions from the chosen grid point to the associated stations are averaged to represent the direction of wave propagation, which is used as a reference for computing apparent travel times. The apparent travel times for the associated stations are computed using a P-wave velocity of 5.5 km/s from the grid point to the projected points in the reference direction. The travel times for the triggered candidate stations are also computed and used to obtain the difference between the apparent travel times of the associated stations and the triggered candidates. Finally, if the difference in the apparent travel times is less than that of the arrival times, the method forces the triggered candidate station to be associated with the event and updates the event location. This method is useful for reducing false locations of events, which can be generated by deep (>500 km) and regional-distance earthquakes occurring at the Pacific plate subduction boundaries.
A case study comparing the system with TrigDB back-filling applied against the others showed more reliable results in the early stage of version updating, owing to the forced association of the neighbouring stations.
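The apparent-travel-time computation at the heart of the back-filling step can be sketched as follows. This is a flat-Earth simplification on planar km coordinates with invented station positions; the operational system works on the 0.05-degree geographic grid described above:

```python
import math

V_P = 5.5  # P-wave velocity in km/s, as used by the method

def reference_direction(grid_xy, associated_xy):
    """Average the unit vectors from the grid point to the associated
    stations to represent the direction of wave propagation."""
    ux = uy = 0.0
    for sx, sy in associated_xy:
        d = math.hypot(sx - grid_xy[0], sy - grid_xy[1])
        ux += (sx - grid_xy[0]) / d
        uy += (sy - grid_xy[1]) / d
    norm = math.hypot(ux, uy)
    return (ux / norm, uy / norm)

def apparent_travel_time(grid_xy, station_xy, ref_dir_xy):
    """Travel time from the grid point to the station's projection
    onto the reference direction (a unit vector)."""
    dx = station_xy[0] - grid_xy[0]
    dy = station_xy[1] - grid_xy[1]
    projected = dx * ref_dir_xy[0] + dy * ref_dir_xy[1]
    return projected / V_P

# Illustrative check: stations due east of the grid point give an
# eastward reference direction, and a candidate 55 km east along it
# has an apparent travel time of 55 km / 5.5 km/s = 10 s.
ref = reference_direction((0, 0), [(100, 0), (200, 0)])
dt = apparent_travel_time((0, 0), (55, 0), ref)
```

Comparing the difference of such apparent travel times against the difference of observed arrival times is the criterion for forcing a triggered candidate into the association.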
Recent Efforts in Data Compilations for Nuclear Astrophysics
NASA Astrophysics Data System (ADS)
Dillmann, Iris
2008-05-01
Some recent efforts in compiling data for astrophysical purposes are introduced, which were discussed during a JINA-CARINA Collaboration meeting on ``Nuclear Physics Data Compilation for Nucleosynthesis Modeling'' held at the ECT* in Trento/Italy from May 29th-June 3rd, 2007. The main goal of this collaboration is to develop an updated and unified nuclear reaction database for modeling a wide variety of stellar nucleosynthesis scenarios. Presently a large number of different reaction libraries (REACLIB) are used by the astrophysics community. The ``JINA Reaclib Database'' on http://www.nscl.msu.edu/~nero/db/ aims to merge and fit the latest experimental stellar cross sections and reaction rate data of various compilations, e.g. NACRE and its extension for Big Bang nucleosynthesis, Caughlan and Fowler, Iliadis et al., and KADoNiS. The KADoNiS (Karlsruhe Astrophysical Database of Nucleosynthesis in Stars, http://nuclear-astrophysics.fzk.de/kadonis) project is an online database for neutron capture cross sections relevant to the s process. The present version v0.2 is already included in a REACLIB file from Basel university (http://download.nucastro.org/astro/reaclib). The present status of experimental stellar (n,γ) cross sections in KADoNiS is shown. It contains recommended cross sections for 355 isotopes between 1H and 210Bi, over 80% of them deduced from experimental data. A ``high priority list'' for measurements and evaluations for light charged-particle reactions set up by the JINA-CARINA collaboration is presented. The central web access point to submit and evaluate new data is provided by the Oak Ridge group via the http://www.nucastrodata.org homepage. ``Workflow tools'' aim to make the evaluation process transparent and allow users to follow the progress.
Noise levels of dental equipment used in dental college of Damascus University.
Qsaibati, Mhd Loutify; Ibrahim, Ousama
2014-11-01
In dental practical classes, the acoustic environment is characterized by high noise levels in relation to other teaching areas. The aims of this study were to measure the noise levels produced during the different dental learning clinics, by equipment used in dental learning areas under different working conditions, and by used and brand-new handpieces under different working conditions. The noise levels were measured using a noise level meter with a microphone, which was placed at a distance of 15 cm from the main noise source in pre-clinical and clinical areas. In laboratories, the microphone was placed at a distance of 15 cm and another reading was taken 2 m away. Noise levels of dental learning clinics were measured by placing the noise level meter at the clinic center. The data were collected, tabulated and statistically analyzed using t-tests. The significance level was set at 5%. In dental clinics, the highest noise was produced by the micro motor handpiece while cutting acrylic (92.2 dB) and the lowest noise (51.7 dB) was created by the ultrasonic scaler without suction pump. The highest noise in laboratories was caused by the sandblaster (96 dB at a distance of 15 cm) and the lowest noise by the stone trimmer when only turned on (61.8 dB at a distance of 2 m). There were significant differences in the noise levels of the equipment used in dental laboratories and dental learning clinics (P = 0.007). The highest noise level recorded in clinics was at the pedodontic clinic (67.37 dB). Noise levels detected in this study were considered to be close to the 85 dB limit for risk of hearing loss.
Noise levels of dental equipment used in dental college of Damascus University
Qsaibati, Mhd. Loutify; Ibrahim, Ousama
2014-01-01
Background: In dental practical classes, the acoustic environment is characterized by high noise levels in relation to other teaching areas. The aims of this study were to measure the noise levels produced during the different dental learning clinics, by equipment used in dental learning areas under different working conditions, and by used and brand-new handpieces under different working conditions. Materials and Methods: The noise levels were measured using a noise level meter with a microphone, which was placed at a distance of 15 cm from the main noise source in pre-clinical and clinical areas. In laboratories, the microphone was placed at a distance of 15 cm and another reading was taken 2 m away. Noise levels of dental learning clinics were measured by placing the noise level meter at the clinic center. The data were collected, tabulated and statistically analyzed using t-tests. The significance level was set at 5%. Results: In dental clinics, the highest noise was produced by the micro motor handpiece while cutting acrylic (92.2 dB) and the lowest noise (51.7 dB) was created by the ultrasonic scaler without suction pump. The highest noise in laboratories was caused by the sandblaster (96 dB at a distance of 15 cm) and the lowest noise by the stone trimmer when only turned on (61.8 dB at a distance of 2 m). There were significant differences in the noise levels of the equipment used in dental laboratories and dental learning clinics (P = 0.007). The highest noise level recorded in clinics was at the pedodontic clinic (67.37 dB). Conclusions: Noise levels detected in this study were considered to be close to the 85 dB limit for risk of hearing loss. PMID:25540655
Faye, M B; Martin, C; Schmerber, S
2013-01-01
We report two surgical techniques devised to restore a disrupted incudostapedial joint. Thirty patients underwent rebridging of the distal portion of the long process of the incus in the ENT departments of the Universities of Grenoble and Saint-Etienne between October 1998 and September 2002. Two types of ossicular prostheses were used: a titanium-gold angle prosthesis (Plester Winkel Kurz; n = 16 patients) and a hydroxylapatite prosthesis (Martin Incudo Prosthesis; n = 14 patients). The average short-term hearing gain is 8.30 dB in the Martin-Incudo group and 5.23 dB in the Winkel group. Seven and three failures (residual Rinne > 20 dB) were noted in the Martin-Incudo and Winkel groups, respectively. Seven and four cases of labyrinthisation were observed in the Martin-Incudo and Winkel groups, respectively. The average long-term hearing gain is 3.43 dB in the Martin-Incudo group and 2.85 dB among patients with the Winkel Kurz prosthesis. The average residual Rinne is higher than 20 dB in the Winkel group. The difference in hearing gain between the two groups is not statistically significant (p > 0.05). The titanium partial prosthesis did not give good functional results. In the case of limited lysis (< 2 mm) of the distal portion of the incus, we use cement or cartilage interposition. When the ossicular chain cannot be preserved entirely, we favour incus transposition or a titanium PORP. The Martin-Incudo prosthesis seems interesting in the event of lysis of 2 mm of the long process of the incus; nevertheless, engineering changes are necessary to make the incudostapedial joint rigid.
Validation of a MALDI-TOF MS Biotyper database optimized for anaerobic bacteria: the ENRIA project.
Veloo, A C M; Jean-Pierre, H; Justesen, U S; Morris, T; Urban, E; Wybo, I; Kostrzewa, M; Friedrich, A W
2018-03-12
Within the ENRIA project, several 'expertise laboratories' collaborated in order to optimize the identification of clinical anaerobic isolates by using a widely available platform, the Biotyper Matrix-Assisted Laser Desorption Ionization Time-of-Flight Mass Spectrometry (MALDI-TOF MS) system. Main Spectral Profiles (MSPs) of well-characterized anaerobic strains were added to one of the latest updates of the Biotyper database, db6903 (V6 database), for common use. MSPs of anaerobic strains nominated for addition to the Biotyper database are included in this validation. In this study, we validated the optimized database (db5989 [V5 database] + ENRIA MSPs) using 6309 anaerobic isolates. Using the V5 database, 71.1% of the isolates could be identified with high confidence, 16.9% with low confidence and 12.0% could not be identified. Including the MSPs added to the V6 database and all MSPs created within the ENRIA project, the number of strains identified with high confidence increased to 74.8% and 79.2%, respectively. Strains that could not be identified using MALDI-TOF MS decreased to 10.4% and 7.3%, respectively. The observed increase in high-confidence identifications differed per genus. For Bilophila wadsworthia, Prevotella spp., gram-positive anaerobic cocci and other less commonly encountered species, more strains were identified with higher confidence. A subset (42.1%) of the non-identified strains was identified using 16S rDNA gene sequencing. The obtained identities demonstrated that strains could not be identified either due to the generation of spectra of insufficient quality or due to the fact that no MSP of the encountered species was present in the database. Undoubtedly, the ENRIA project has successfully increased the number of anaerobic isolates that can be identified with high confidence.
We therefore recommend further expansion of the database to include less frequently isolated species, as this would also allow us to gain valuable insight into the clinical relevance of these less common anaerobic bacteria. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.
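The high/low/no-identification categories reported above are conventionally derived from Biotyper log-score cutoffs (commonly around 2.0 for high-confidence and 1.7 for low-confidence calls; whether the ENRIA validation used exactly these thresholds is an assumption here). A minimal sketch of the tallying:

```python
def classify_identification(score, high=2.0, low=1.7):
    """Map a MALDI-TOF MS log-score to an identification category
    using conventional Biotyper-style cutoffs (assumed values)."""
    if score >= high:
        return "high confidence"
    if score >= low:
        return "low confidence"
    return "no identification"

def summarize(scores):
    """Fraction of isolates per category, as a validation would report."""
    counts = {}
    for s in scores:
        cat = classify_identification(s)
        counts[cat] = counts.get(cat, 0) + 1
    total = len(scores)
    return {cat: n / total for cat, n in counts.items()}

# Invented scores for four hypothetical isolates
fractions = summarize([2.3, 1.8, 1.2, 2.1])
```

Re-running such a summary against the V5, V6 and ENRIA-augmented databases is what yields the 71.1% → 74.8% → 79.2% progression quoted above.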
Partial DNA sequencing of Douglas-fir cDNAs used in RFLP mapping
K.D. Jermstad; D.L. Bassoni; C.S. Kinlaw; D.B. Neale
1998-01-01
DNA sequences from 87 Douglas-fir (Pseudotsuga menziesii [Mirb.] Franco) cDNA RFLP probes were determined. Sequences were submitted to the GenBank dbEST database and searched for similarity against nucleotide and protein databases using the BLASTn and BLASTx programs. Twenty-one sequences (24%) were assigned putative functions; 18 of which...
The present report describes a strategy to refine the current Cramer classification of the TTC concept using a broad database (DB) termed TTC RepDose. Cramer classes 1-3 overlap to some extent, indicating a need for a better separation of structural classes likely to be toxic, mo...
RPA tree-level database users guide
Patrick D. Miles; Scott A. Pugh; Brad Smith; Sonja N. Oswalt
2014-01-01
The Forest and Rangeland Renewable Resources Planning Act (RPA) of 1974 calls for a periodic assessment of the Nation's renewable resources. The Forest Inventory and Analysis (FIA) program of the U.S. Forest Service supports the RPA effort by providing information on the forest resources of the United States. The RPA tree-level database (RPAtreeDB) was generated...
BIGNASim: a NoSQL database structure and analysis portal for nucleic acids simulation data.
Hospital, Adam; Andrio, Pau; Cugnasco, Cesare; Codo, Laia; Becerra, Yolanda; Dans, Pablo D; Battistini, Federica; Torres, Jordi; Goñi, Ramón; Orozco, Modesto; Gelpí, Josep Ll
2016-01-04
Molecular dynamics simulation (MD) is, just behind genomics, the bioinformatics tool that generates the largest amounts of data and uses the largest amount of CPU time in supercomputing centres. MD trajectories are obtained after months of calculations, analysed in situ, and in practice forgotten. Several projects to generate stable trajectory databases have been developed for proteins, but no equivalent exists in the nucleic acids world. We present here a novel database system to store MD trajectories and analyses of nucleic acids. The initial data set available consists mainly of the benchmark of the new molecular dynamics force field, parmBSC1. It contains 156 simulations, with over 120 μs of total simulation time. A deposition protocol is available to accept the submission of new trajectory data. The database is based on the combination of two NoSQL engines, Cassandra for storing trajectories and MongoDB for storing analysis results and simulation metadata. The analyses available include backbone geometries, helical analysis, NMR observables and a variety of mechanical analyses. Individual trajectories and combined meta-trajectories can be downloaded from the portal. The system is accessible through http://mmb.irbbarcelona.org/BIGNASim/. Supplementary Material is also available on-line at http://mmb.irbbarcelona.org/BIGNASim/SuppMaterial/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
[EXPERIENCE IN THE APPLICATION OF DATABASES ON BLOODSUCKING INSECTS IN ZOOLOGICAL STUDIES].
Medvedev, S G; Khalikov, R G
2016-01-01
The paper summarizes long-term experience of accumulating and summarizing faunistic information by means of separate databases (DB) and information analytical systems (IAS), as well as the prospects of representing it in modern multi-user information systems. The experience obtained during the development and practical use of the PARHOST1 IAS for the study of the world flea fauna, and of work with personal databases created for the study of bloodsucking insects (lice and blackflies), is analyzed. Research collection material on the type series of 57 species and subspecies of fleas of the fauna of Russia was tested as part of a multi-user information retrieval system on the web portal of the Zoological Institute of the Russian Academy of Sciences. In line with former investigations, the system allows depositing the information in its authentic form and performing its gradual transformation, i.e. its unification and structuring. To ensure the continuity of DB refilling, the system provides for work by operators with different degrees of competence.
Jantzen, Rodolphe; Rance, Bastien; Katsahian, Sandrine; Burgun, Anita; Looten, Vincent
2018-01-01
Open data, available broadly and with minimal constraints to the general public and journalists, are needed to help rebuild trust between citizens and the health system. By opening data, we can expect to increase democratic accountability and the self-empowerment of citizens. This article aims at assessing the quality and reusability of the Transparency - Health database with regard to the FAIR principles. More specifically, we examine the quality of the identification of French medical doctors in the Transp-db. This study shows that the quality of the data in the Transp-db does not allow those who benefit from an advantage or remuneration to be identified with certainty, noticeably reducing the impact of the open data effort.
VizieR Online Data Catalog: UY UMa and EF Boo compiled time of minima (Yu+, 2017)
NASA Astrophysics Data System (ADS)
Yu, Y.-X.; Zhang, X.-D.; Hu, K.; Xiang, F.-Y.
2017-11-01
In order to construct the (O-C) diagram to analyze the period change of UY UMa, we have performed a careful search for all available times of light minima. A total of 76 times of light minima were collected and listed in Table 2. From the literature and two well-known databases (i.e., the O-C gateway (http://var.astro.cz/ocgate) and the Lichtenknecker database of the BAV (http://www.bav-astro.de/LkDB/index.php)), we have collected a total of 75 available times of light minima for EF Boo, which are summarized in Table 3. (3 data files).
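The (O-C) values behind such a diagram follow from a linear ephemeris: each observed time of minimum t is assigned the nearest cycle number E against t0 + E × P, and the residual O-C = t - (t0 + E × P) is plotted against E. A minimal sketch, using an invented ephemeris rather than the actual elements of UY UMa or EF Boo:

```python
def o_minus_c(t_obs, t0, period):
    """Return (cycle number E, O-C residual in days) for one observed
    time of light minimum against the linear ephemeris t0 + E * period."""
    cycle = round((t_obs - t0) / period)
    return cycle, t_obs - (t0 + cycle * period)

# Hypothetical HJD of minimum against an assumed ephemeris
e, oc = o_minus_c(2450001.2520, t0=2450000.0, period=0.25)
```

Systematic curvature or a break in the resulting (E, O-C) points is what reveals a period change.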
Kleinboelting, Nils; Huep, Gunnar; Weisshaar, Bernd
2017-01-01
SimpleSearch provides access to a database containing information about T-DNA insertion lines of the GABI-Kat collection of Arabidopsis thaliana mutants. These mutants are an important tool for reverse genetics, and GABI-Kat is the second largest collection of such T-DNA insertion mutants. Insertion sites were deduced from flanking sequence tags (FSTs), and the database contains information about mutant plant lines as well as insertion alleles. Here, we describe improvements within the interface (available at http://www.gabi-kat.de/db/genehits.php) and with regard to the database content that have been realized in the last five years. These improvements include the integration of the Araport11 genome sequence annotation data containing the recently updated A. thaliana structural gene descriptions, an updated visualization component that displays groups of insertions with very similar insertion positions, mapped confirmation sequences, and primers. The visualization component provides a quick way to identify insertions of interest, and access to improved data about the exact structure of confirmed insertion alleles. In addition, the database content has been extended by incorporating additional insertion alleles that were detected during the confirmation process, as well as by adding new FSTs that have been produced during continued efforts to complement gaps in FST availability. Finally, the current database content regarding predicted and confirmed insertion alleles as well as primer sequences has been made available as downloadable flat files. © The Author 2016. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.
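The visualization component's grouping of insertions with very similar insertion positions can be sketched as a tolerance-based clustering of sorted genomic positions. The tolerance and positions below are illustrative only; GABI-Kat's actual grouping criterion is not specified here:

```python
def group_insertions(positions, tolerance=100):
    """Group genomic insertion positions (bp) so that neighbours
    within `tolerance` bp of each other fall into the same group."""
    groups = []
    for pos in sorted(positions):
        if groups and pos - groups[-1][-1] <= tolerance:
            groups[-1].append(pos)  # extend the current group
        else:
            groups.append([pos])    # start a new group
    return groups

groups = group_insertions([1050, 20500, 1010, 20420, 98000])
# → [[1010, 1050], [20420, 20500], [98000]]
```

Displaying such groups rather than individual FST hits makes it quicker to spot independent insertion alleles in a gene of interest.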
Automatic meta-data collection of STP observation data
NASA Astrophysics Data System (ADS)
Ishikura, S.; Kimura, E.; Murata, K.; Kubo, T.; Shinohara, I.
2006-12-01
For geo-science and STP (Solar-Terrestrial Physics) studies, various observations have been made by satellites and ground-based observatories. These data are saved and managed by many organizations, but there is no common procedure or rule for providing and sharing the data files. Researchers have therefore had difficulty searching for and analyzing the different types of data distributed over the Internet. To support such cross-over analyses of observation data, we have developed the STARS (Solar-Terrestrial data Analysis and Reference System). The STARS consists of a client application (STARS-app), a meta-database (STARS-DB), a portal Web service (STARS-WS) and a download agent Web service (STARS DLAgent-WS). The STARS-DB includes directory information, access permissions, protocol information for retrieving data files, mission/team/data hierarchy information and user information. Users of the STARS can download observation data files without knowing the locations of the files, by means of the STARS-DB. We have implemented the Portal-WS to retrieve meta-data from the meta-database. One reason we use a Web service is to overcome firewall restrictions, which have become stricter in recent years and make it difficult for the STARS client application to obtain meta-data by sending SQL queries directly to the STARS-DB. Using the Web service, we placed the STARS-DB behind the Portal-WS and avoided exposing it on the Internet. The STARS accesses the Portal-WS by sending SOAP (Simple Object Access Protocol) requests over HTTP, and meta-data are received as SOAP responses. The STARS DLAgent-WS provides clients with data files downloaded from data sites. The data files are served using a variety of protocols (e.g., FTP, HTTP, FTPS and SFTP), selected individually at each site.
Clients send a SOAP request containing download request messages and receive observation data files as a SOAP response with a DIME attachment. By introducing the DLAgent-WS, we overcame the problem that each data site sets its data management policies independently. Another important issue is how to collect the meta-data of the observation data files. So far, STARS-DB managers have added new records to the meta-database and updated them manually, which has made the meta-database laborious to maintain, because observation data are generated every day and the number of data files increases explosively. We have therefore attempted to automate collection of the meta-data. In this work, we adopted RSS 1.0 (RDF Site Summary) as the format for exchanging meta-data in the STP field. RSS is an RDF vocabulary that provides a multipurpose, extensible meta-data description and is well suited to syndication of meta-data. Most of the data in the present study are described in the CDF (Common Data Format), a self-describing data format. We convert meta-information extracted from the CDF data files into RSS files. The program that generates the RSS files runs on each data site's server once a day, and the RSS files describe the new data files. The RSS files are collected by an RSS collection server once a day, and the meta-data are stored in the STARS-DB.
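The meta-data export step described above (extracting header information from each CDF file and publishing it as RSS 1.0) can be sketched with the Python standard library. This is a hedged illustration only: the field names and URLs below are invented, not the actual STARS schema.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS = "http://purl.org/rss/1.0/"
ET.register_namespace("rdf", RDF)
ET.register_namespace("rss", RSS)

def build_rss(channel_url, files):
    """Build an RSS 1.0 document; files is a list of dicts with 'url' and
    'title' (in practice these would come from CDF global attributes)."""
    root = ET.Element(f"{{{RDF}}}RDF")
    channel = ET.SubElement(root, f"{{{RSS}}}channel", {f"{{{RDF}}}about": channel_url})
    # RSS 1.0 lists the items of a channel inside an rdf:Seq
    seq = ET.SubElement(ET.SubElement(channel, f"{{{RSS}}}items"), f"{{{RDF}}}Seq")
    for f in files:
        ET.SubElement(seq, f"{{{RDF}}}li", {"resource": f["url"]})
        item = ET.SubElement(root, f"{{{RSS}}}item", {f"{{{RDF}}}about": f["url"]})
        ET.SubElement(item, f"{{{RSS}}}title").text = f["title"]
        ET.SubElement(item, f"{{{RSS}}}link").text = f["url"]
    return ET.tostring(root, encoding="unicode")

rss = build_rss("http://example.org/data/", [
    {"url": "http://example.org/data/obs_20061201.cdf",
     "title": "Magnetometer data 2006-12-01"},
])
```

A harvester on the collection server could then fetch such files daily and upsert each listed resource into the meta-database.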
Plechakova, Olga; Tranchant-Dubreuil, Christine; Benedet, Fabrice; Couderc, Marie; Tinaut, Alexandra; Viader, Véronique; De Block, Petra; Hamon, Perla; Campa, Claudine; de Kochko, Alexandre; Hamon, Serge; Poncet, Valérie
2009-01-01
Background In the past few years, functional genomics information has been rapidly accumulating on Rubiaceae species, especially those belonging to the Coffea genus (coffee trees). An increasing number of expressed sequence tag (EST) data and EST- or genomic-derived microsatellite markers have been generated, together with Conserved Ortholog Set (COS) markers. This considerably facilitates comparative genomics or map-based genetic studies through the common use of orthologous loci across different species. Similar genomic information is available for, e.g., tomato and potato, members of the Solanaceae family. Since both Rubiaceae and Solanaceae belong to the Euasterids I (lamiids), integration of information on genetic markers would be possible and would lead to more efficient analyses and discovery of key loci involved in important traits such as fruit development, quality and maturation, or adaptation. Our goal was to develop a comprehensive web data source for integrated information on validated orthologous markers in Rubiaceae. Description MoccaDB is an online MySQL-PHP driven relational database that houses annotated and/or mapped microsatellite markers in Rubiaceae. In its current release, the database stores 638 markers defined on 259 ESTs and 379 genomic sequences. Marker information was retrieved from 11 published works and completed with original data on 132 microsatellite markers validated in our laboratory. DNA sequences were derived from three Coffea species/hybrids. Microsatellite markers were checked for similarity and tested in vitro for cross-amplification and diversity/polymorphism status in up to 38 Rubiaceae species belonging to the Cinchonoideae and Rubioideae subfamilies. Functional annotation was provided, and some markers associated with described metabolic pathways were also integrated. Users can search the database for marker, sequence, map or diversity information through multi-option query forms.
The retrieved data can be browsed and downloaded, along with protocols used, using a standard web browser. MoccaDB also integrates bioinformatics tools (CMap viewer and local BLAST) and hyperlinks to related external data sources (NCBI GenBank and PubMed, SOL Genomic Network database). Conclusion We believe that MoccaDB will be extremely useful for all researchers working in the areas of comparative and functional genomics and molecular evolution, in general, and population analysis and association mapping of Rubiaceae and Solanaceae species, in particular. PMID:19788737
Non-B DB: a database of predicted non-B DNA-forming motifs in mammalian genomes.
Cer, Regina Z; Bruce, Kevin H; Mudunuri, Uma S; Yi, Ming; Volfovsky, Natalia; Luke, Brian T; Bacolla, Albino; Collins, Jack R; Stephens, Robert M
2011-01-01
Although the capability of DNA to form a variety of non-canonical (non-B) structures has long been recognized, the overall significance of these alternate conformations in biology has only recently become accepted en masse. In order to provide access to genome-wide locations of these classes of predicted structures, we have developed non-B DB, a database integrating annotations and analysis of non-B DNA-forming sequence motifs. The database provides the most complete list of alternative DNA structure predictions available, including Z-DNA motifs, quadruplex-forming motifs, inverted repeats, mirror repeats and direct repeats and their associated subsets of cruciforms, triplex and slipped structures, respectively. The database also contains motifs predicted to form static DNA bends, short tandem repeats and homo(purine•pyrimidine) tracts that have been associated with disease. The database has been built using the latest releases of the human, chimp, dog, macaque and mouse genomes, so that the results can be compared directly with other data sources. In order to make the data interpretable in a genomic context, features such as genes, single-nucleotide polymorphisms and repetitive elements (SINE, LINE, etc.) have also been incorporated. The database is accessed through query pages that produce results with links to the UCSC browser and a GBrowse-based genomic viewer. It is freely accessible at http://nonb.abcc.ncifcrf.gov.
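As a toy illustration of the kind of motif scanning such a database requires, the scanner below finds perfect inverted repeats, the motif class underlying cruciform extrusion. This is not the non-B DB prediction pipeline, and the stem and loop thresholds are arbitrary stand-ins.

```python
COMP = str.maketrans("ACGT", "TGCA")

def revcomp(s):
    """Reverse complement of a DNA string."""
    return s.translate(COMP)[::-1]

def inverted_repeats(seq, stem=4, max_loop=4):
    """Return (start, end, stem_seq) tuples for perfect inverted repeats:
    a stem whose downstream arm is the reverse complement of the upstream
    arm, separated by a loop of 0..max_loop bases."""
    hits = []
    n = len(seq)
    for i in range(n - 2 * stem + 1):
        left = seq[i:i + stem]
        for loop in range(max_loop + 1):
            j = i + stem + loop
            if j + stem > n:
                break
            if seq[j:j + stem] == revcomp(left):
                hits.append((i, j + stem, left))
    return hits

hits = inverted_repeats("TTGAGCTCAA", stem=4, max_loop=2)
```

Real predictors additionally score arm length, mismatches and genomic context; this sketch only shows the core string test.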
Carotenoids Database: structures, chemical fingerprints and distribution among organisms.
Yabuzaki, Junko
2017-01-01
To promote understanding of how organisms are related via carotenoids, either evolutionarily or symbiotically, or in food chains through natural histories, we built the Carotenoids Database. This provides chemical information on 1117 natural carotenoids with 683 source organisms. For extracting organisms closely related through the biosynthesis of carotenoids, we offer a new similarity search system 'Search similar carotenoids' using our original chemical fingerprint 'Carotenoid DB Chemical Fingerprints'. These Carotenoid DB Chemical Fingerprints describe the chemical substructure and the modification details based upon International Union of Pure and Applied Chemistry (IUPAC) semi-systematic names of the carotenoids. The fingerprints also allow (i) easier prediction of six biological functions of carotenoids: provitamin A, membrane stabilizers, odorous substances, allelochemicals, antiproliferative activity and reverse MDR activity against cancer cells, (ii) easier classification of carotenoid structures, (iii) partial and exact structure searching and (iv) easier extraction of structural isomers and stereoisomers. We believe this to be the first attempt to establish fingerprints using the IUPAC semi-systematic names. For extracting close profiled organisms, we provide a new tool 'Search similar profiled organisms'. Our current statistics show some insights into natural history: carotenoids seem to have been spread largely by bacteria, as they produce C30, C40, C45 and C50 carotenoids, with the widest range of end groups, and they share a small portion of C40 carotenoids with eukaryotes. Archaea share an even smaller portion with eukaryotes. Eukaryotes then have evolved a considerable variety of C40 carotenoids. Considering carotenoids, eukaryotes seem more closely related to bacteria than to archaea aside from 16S rRNA lineage analysis. : http://carotenoiddb.jp. © The Author(s) 2017. Published by Oxford University Press.
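The 'Search similar carotenoids' function rests on comparing fingerprints between structures. Below is a minimal sketch of such a comparison using Tanimoto (Jaccard) similarity over sets of substructure keys; the keys shown are invented stand-ins, not actual Carotenoid DB Chemical Fingerprints.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity of two fingerprint key sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical feature keys loosely inspired by carotenoid nomenclature
beta_carotene = {"C40", "beta-end-group:2", "conjugated-db:11"}
zeaxanthin = {"C40", "beta-end-group:2", "conjugated-db:11", "3-OH:2"}
sim = tanimoto(beta_carotene, zeaxanthin)
```

Ranking all database entries by this score against a query fingerprint is the essence of a 'search similar' feature.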
ExpressionDB: An open source platform for distributing genome-scale datasets.
Hughes, Laura D; Lewis, Scott A; Hughes, Michael E
2017-01-01
RNA-sequencing (RNA-seq) and microarrays are methods for measuring gene expression across the entire transcriptome. Recent advances have made these techniques practical and affordable for essentially any laboratory with experience in molecular biology. A variety of computational methods have been developed to decrease the amount of bioinformatics expertise necessary to analyze these data. Nevertheless, many barriers persist which discourage new labs from using functional genomics approaches. Since high-quality gene expression studies have enduring value as resources to the entire research community, it is particularly important that small labs have the capacity to share their analyzed datasets with the research community. Here we introduce ExpressionDB, an open source platform for visualizing RNA-seq and microarray data that accommodates virtually any number of different samples. ExpressionDB is built on Shiny, a web application framework for R, which allows data sharing locally and online with customizable code. ExpressionDB allows intuitive searches based on gene symbols, descriptions, or gene ontology terms, and it includes tools for dynamically filtering results based on expression level, fold change, and false-discovery rates. Built-in visualization tools include heatmaps, volcano plots, and principal component analysis, ensuring streamlined and consistent visualization for all users. All of the scripts for building an ExpressionDB with user-supplied data are freely available on GitHub, and the Creative Commons license allows fully open customization by end users. We estimate that a demo database can be created in under one hour with minimal programming experience, and that a new database with user-supplied expression data can be completed and online in less than one day.
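The dynamic filtering described (expression level, fold change, false-discovery rate) amounts to simple row-wise predicates. ExpressionDB itself is an R/Shiny application, but the logic can be sketched in Python for illustration; the column names and thresholds here are assumptions, not the app's actual schema.

```python
def filter_genes(rows, min_expr=1.0, min_abs_log2fc=1.0, max_fdr=0.05):
    """Keep rows passing expression, fold-change and FDR thresholds."""
    return [r for r in rows
            if r["expr"] >= min_expr
            and abs(r["log2fc"]) >= min_abs_log2fc
            and r["fdr"] <= max_fdr]

# Invented example rows (gene, mean expression, log2 fold change, FDR)
rows = [
    {"gene": "Per2",  "expr": 120.0, "log2fc":  2.1, "fdr": 0.001},
    {"gene": "Bmal1", "expr":  80.0, "log2fc": -1.4, "fdr": 0.020},
    {"gene": "Actb",  "expr": 900.0, "log2fc":  0.1, "fdr": 0.900},
]
kept = [r["gene"] for r in filter_genes(rows)]
```

In a reactive UI the three thresholds would be bound to sliders and the table re-filtered on every change.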
SciDB versus Spark: A Preliminary Comparison Based on an Earth Science Use Case
NASA Astrophysics Data System (ADS)
Clune, T.; Kuo, K. S.; Doan, K.; Oloso, A.
2015-12-01
We compare two Big Data technologies, SciDB and Spark, for performance, usability, and extensibility when applied to a representative Earth science use case. SciDB is a new-generation parallel distributed database management system (DBMS) based on the array data model that handles multidimensional arrays efficiently but requires a lengthy data ingest prior to analysis, whereas Spark is a fast and general engine for large-scale data processing that can immediately process raw data files and thereby avoid the ingest process. Once data have been ingested, SciDB is very efficient in database operations such as subsetting. Spark, on the other hand, provides greater flexibility by supporting a wide variety of high-level tools, including DBMSs. For the performance aspect of this preliminary comparison, we configure Spark to operate directly on text or binary data files and thereby limit the need for additional tools. Arguably, a more appropriate comparison would explore other configurations of Spark that exploit supported high-level tools, but that is beyond our current resources. To make the comparison as "fair" as possible, we export the arrays produced by SciDB into text files (or convert them to binary files) for intake by Spark, thereby avoiding any additional file processing penalties. The Earth science use case selected for this comparison is the identification and tracking of snowstorms in the NASA Modern Era Retrospective-analysis for Research and Applications (MERRA) reanalysis data. The identification portion of the use case flags all grid cells of the MERRA high-resolution hourly data that satisfy our criteria for a snowstorm, whereas the tracking portion connects flagged cells adjacent in time and space to form a snowstorm episode. We report the results of our comparisons in this presentation.
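The identify-then-track workflow reduces to thresholding followed by connected-component labelling over the (time, y, x) grid. A minimal standard-library sketch follows, with a single threshold standing in for the study's actual snowstorm criteria and a tiny invented grid in place of MERRA data.

```python
from collections import deque

def track_episodes(field, threshold):
    """field[t][y][x] -> list of episodes, each a set of (t, y, x) flagged
    cells connected through one-step adjacency in time or space."""
    T, Y, X = len(field), len(field[0]), len(field[0][0])
    flagged = {(t, y, x) for t in range(T) for y in range(Y) for x in range(X)
               if field[t][y][x] >= threshold}
    episodes, seen = [], set()
    for cell in flagged:
        if cell in seen:
            continue
        # Breadth-first search over the 6 face-neighbours in (t, y, x)
        comp, queue = set(), deque([cell])
        seen.add(cell)
        while queue:
            t, y, x = queue.popleft()
            comp.add((t, y, x))
            for dt, dy, dx in ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                               (0, -1, 0), (0, 0, 1), (0, 0, -1)):
                nb = (t + dt, y + dy, x + dx)
                if nb in flagged and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        episodes.append(comp)
    return episodes

field = [[[0, 5], [0, 0]],   # t=0: one flagged cell
         [[0, 6], [0, 0]],   # t=1: same cell flagged, adjacent in time
         [[0, 0], [7, 0]]]   # t=2: isolated flagged cell elsewhere
eps = track_episodes(field, threshold=5)
```

Either platform can express this pattern; the comparison in the abstract concerns how efficiently each executes the flagging and the neighbourhood joins at scale.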
ASM Based Synthesis of Handwritten Arabic Text Pages
Dinges, Laslo; Al-Hamadi, Ayoub; Elzobi, Moftah; El-Etriby, Sherif; Ghoneim, Ahmed
2015-01-01
Document analysis tasks such as text recognition, word spotting, or segmentation are highly dependent on comprehensive and suitable databases for training and validation. However, their generation is expensive in terms of labor and time. In fact, there is a lack of such databases, which complicates research and development. This is especially true for Arabic handwriting recognition, which involves different preprocessing, segmentation, and recognition methods, each with individual demands on samples and ground truth. To bypass this problem, we present an efficient system that automatically turns Arabic Unicode text into synthetic images of handwritten documents with detailed ground truth. Active Shape Models (ASMs) based on 28046 online samples were used for character synthesis, and statistical properties were extracted from the IESK-arDB database to simulate baselines and word slant or skew. In the synthesis step, ASM-based representations are composed into words and text pages, smoothed by B-spline interpolation, and rendered considering writing speed and pen characteristics. Finally, we use the synthetic data to validate a segmentation method. An experimental comparison with the IESK-arDB database encourages training and testing document analysis methods on synthetic samples whenever insufficient naturally ground-truthed data are available. PMID:26295059
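The B-spline smoothing step mentioned above can be illustrated with a uniform cubic B-spline evaluator over 2D control points. The control points below are invented for illustration, not ASM output, and the real system's interpolation details may differ.

```python
def cubic_bspline(ctrl, samples_per_seg=8):
    """Sample a uniform cubic B-spline defined by 2D control points.
    Each inner segment blends four consecutive control points with the
    standard basis weights (divided by 6)."""
    pts = []
    for i in range(1, len(ctrl) - 2):
        for s in range(samples_per_seg):
            u = s / samples_per_seg
            b = ((1 - u) ** 3,
                 3 * u ** 3 - 6 * u ** 2 + 4,
                 -3 * u ** 3 + 3 * u ** 2 + 3 * u + 1,
                 u ** 3)
            x = sum(w * ctrl[i - 1 + k][0] for k, w in enumerate(b)) / 6
            y = sum(w * ctrl[i - 1 + k][1] for k, w in enumerate(b)) / 6
            pts.append((x, y))
    return pts

smoothed = cubic_bspline([(0, 0), (1, 2), (2, -1), (3, 3), (4, 0)])
```

Because the basis weights sum to 6 before division, collinear control points yield a collinear curve, a convenient sanity check for the implementation.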
Krischer, Jeffrey P; Gopal-Srivastava, Rashmi; Groft, Stephen C; Eckstein, David J
2014-08-01
Established in 2003 by the Office of Rare Diseases Research (ORDR), in collaboration with several National Institutes of Health (NIH) Institutes/Centers, the Rare Diseases Clinical Research Network (RDCRN) consists of multiple clinical consortia conducting research in more than 200 rare diseases. The RDCRN supports longitudinal or natural history, pilot, Phase I, II, and III, case-control, cross-sectional, chart review, physician survey, bio-repository, and RDCRN Contact Registry (CR) studies. To date, there have been 24,684 participants enrolled on 120 studies from 446 sites worldwide. An additional 11,533 individuals participate in the CR. Through a central data management and coordinating center (DMCC), the RDCRN's platform for the conduct of observational research encompasses electronic case report forms, federated databases, and an online CR for epidemiological and survey research. An ORDR-governed data repository (through dbGaP, a database for genotype and phenotype information from the National Library of Medicine) has been created. DMCC coordinates with ORDR to register and upload study data to dbGaP for data sharing with the scientific community. The platform provided by the RDCRN DMCC has supported 128 studies, six of which were successfully conducted through the online CR, with 2,352 individuals accrued and a median enrollment time of just 2 months. The RDCRN has built a powerful suite of web-based tools that provide for integration of federated and online database support that can accommodate a large number of rare diseases on a global scale. RDCRN studies have made important advances in the diagnosis and treatment of rare diseases.
Browning, George G; Rovers, Maroeska M; Williamson, Ian; Lous, Jørgen; Burton, Martin J
2010-10-06
Otitis media with effusion (OME; 'glue ear') is common in childhood, and surgical treatment with grommets (ventilation tubes) is widespread but controversial. Our objective was to assess the effectiveness of grommet insertion compared with myringotomy or non-surgical treatment in children with OME. We searched the Cochrane ENT Disorders Group Trials Register, other electronic databases and additional sources for published and unpublished trials (most recent search: 22 March 2010). We included randomised controlled trials evaluating the effect of grommets. Outcomes studied included hearing level, duration of middle ear effusion, language and speech development, cognitive development, behaviour and adverse effects. Data from studies were extracted by two authors and checked by the other authors. We included 10 trials (1728 participants). Some trials randomised children (grommets versus no grommets), others randomised ears (grommet in one ear only). The severity of OME in the children varied between trials. Only one 'by child' study (MRC: TARGET) had particularly stringent audiometric entry criteria. No trial was identified that used long-term grommets. Grommets were mainly beneficial in the first six months, by which time natural resolution led to improved hearing in the non-surgically treated children also. Only one high-quality trial that randomised children (N = 211) reported results at three months; the mean hearing level was 12 dB better (95% CI 10 to 14 dB) in those treated with grommets compared with the controls. Meta-analyses of three high-quality trials (N = 523) showed a benefit of 4 dB (95% CI 2 to 6 dB) at six to nine months. At 12 and 18 months of follow up, no differences in mean hearing levels were found. Data from three trials that randomised ears (N = 230 ears) showed similar effects to the trials that randomised children.
At four to six months the mean hearing level was 10 dB better in the grommet ear (95% CI 5 to 16 dB), and at 7 to 12 months and 18 to 24 months it was 6 dB (95% CI 2 to 10 dB) and 5 dB (95% CI 3 to 8 dB) better, respectively. No effect was found on language or speech development, or on behavioural, cognitive or quality of life outcomes. Tympanosclerosis was seen in about a third of ears that received grommets. Otorrhoea was common in infants, but in older children (three to seven years) occurred in < 2% of grommet ears over two years of follow up. In children with OME the effect of grommets on hearing, as measured by standard tests, appears small and diminishes after six to nine months, by which time natural resolution also leads to improved hearing in the non-surgically treated children. No effect was found on other child outcomes, but data on these were sparse. No study has been performed in children with established speech, language, learning or developmental problems, so no conclusions can be made regarding treatment of such children.
Vavougios, Georgios D; Solenov, Evgeniy I; Hatzoglou, Chrissi; Baturina, Galina S; Katkova, Liubov E; Molyvdas, Paschalis Adam; Gourgoulianis, Konstantinos I; Zarogiannis, Sotirios G
2015-10-01
The aim of our study was to assess the differential gene expression of the Parkinson protein 7 (PARK7) interactome in malignant pleural mesothelioma (MPM) using data mining techniques, in order to identify novel candidate genes that may play a role in the pathogenicity of MPM. We constructed the PARK7 interactome using the ConsensusPathDB database. We then interrogated the Oncomine Cancer Microarray database, using the Gordon Mesothelioma Study, for differential gene expression of the PARK7 interactome. In ConsensusPathDB, 38 protein interactors of PARK7 were identified. In the Gordon Mesothelioma Study, 34 of them were assessed, of which SUMO1, UBC3, KIAA0101, HDAC2, DAXX, RBBP4, BBS1, NONO, RBBP7, HTRA2, and STUB1 were significantly overexpressed whereas TRAF6 and MTA2 were significantly underexpressed in MPM patients (network 2). Furthermore, Kaplan-Meier analysis revealed that MPM patients with high BBS1 expression had a median overall survival of 16.5 months, versus 8.7 months for those with low expression. For validation purposes, we performed a meta-analysis in the Oncomine database across five sarcoma datasets. Eight network 2 genes (KIAA0101, HDAC2, SUMO1, RBBP4, NONO, RBBP7, HTRA2, and MTA2) were significantly differentially expressed across 18 different sarcoma types. Finally, Gene Ontology annotation enrichment analysis revealed significant roles of the PARK7 interactome in the NuRD, CHD, and SWI/SNF protein complexes. In conclusion, we identified 13 differentially expressed genes in MPM that have not been reported before. Among them, BBS1 emerged as a novel predictor of overall survival in MPM. Finally, we found that the PARK7 interactome is involved in novel pathways pertinent to MPM. Copyright © 2015 the American Physiological Society.
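The survival contrast reported for BBS1 comes from Kaplan-Meier analysis. A minimal product-limit estimator is sketched below on made-up (time, event) pairs, not the study's patient data; real analyses would add confidence bands and a log-rank test.

```python
def kaplan_meier(times, events):
    """Product-limit estimate. events[i] is 1 for death, 0 for censoring.
    Returns [(t, S(t))] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk, surv, curve = len(data), 1.0, []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        removed = sum(1 for tt, _ in data if tt == t)  # deaths + censored at t
        if deaths:
            surv *= (at_risk - deaths) / at_risk
            curve.append((t, surv))
        at_risk -= removed
        i += removed
    return curve

# Toy cohort: deaths at months 3, 5, 8; censoring at 5 and 12
curve = kaplan_meier([3, 5, 5, 8, 12], [1, 1, 0, 1, 0])
```

The median overall survival is then read off as the first time at which S(t) drops to 0.5 or below.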
NASA Astrophysics Data System (ADS)
Lu, Yinghui; Clothiaux, Eugene E.; Aydin, Kültegin; Botta, Giovanni; Verlinde, Johannes
2013-12-01
Using the Generalized Multi-particle Mie-method (GMM), Botta et al. (in this issue) [7] created a database of backscattering cross sections for 412 different ice crystal dendrites at X-, Ka- and W-band wavelengths for different incident angles. The Rayleigh-Gans theory, which accounts for interference effects but ignores interactions between different parts of an ice crystal, explains much, but not all, of the variability in the database of backscattering cross sections. Differences between it and the GMM range from -3.5 dB to +2.5 dB and are highly dependent on the incident angle. To explain the residual variability a physically intuitive iterative method was developed to estimate the internal electric field within an ice crystal that accounts for interactions between the neighboring regions within it. After modifying the Rayleigh-Gans theory using this estimated internal electric field, the difference between the estimated backscattering cross sections and those from the GMM method decreased to within 0.5 dB for most of the ice crystals. The largest percentage differences occur when the form factor from the Rayleigh-Gans theory is close to zero. Both interference effects and neighbor interactions are sensitive to the morphology of ice crystals. Improvements in ice-microphysical models are necessary to predict or diagnose internal structures within ice crystals to aid in more accurate interpretation of radar returns. Observations of the morphology of ice crystals are, in turn, necessary to guide the development of such ice-microphysical models and to better understand the statistical properties of ice crystal morphologies in different environmental conditions.
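In the Rayleigh-Gans picture used above as the baseline, the backscatter amplitude of a particle discretized into N equal volume elements is proportional to the form factor f = (1/N) * sum_j exp(2i k.r_j), the factor of 2 reflecting the round-trip phase; the neighbor interactions discussed in the abstract are exactly what this expression omits. A numerical sketch follows, with illustrative geometry rather than any of the 412 dendrites.

```python
import cmath
import math

def form_factor(points, wavelength, direction=(0.0, 0.0, 1.0)):
    """Rayleigh-Gans backscatter form factor for equal-volume elements at
    the given (x, y, z) positions; direction is the unit propagation vector."""
    k = 2 * math.pi / wavelength
    dx, dy, dz = direction
    total = sum(cmath.exp(2j * k * (dx * x + dy * y + dz * z))
                for x, y, z in points)
    return total / len(points)

# A single element scatters with |f| = 1 regardless of position:
f1 = form_factor([(0.1, 0.2, 0.3)], wavelength=8.6)  # ~Ka band, mm units
# Two elements a quarter wavelength apart along the beam acquire a
# round-trip phase difference of pi and cancel exactly:
f2 = form_factor([(0.0, 0.0, 0.0), (0.0, 0.0, 8.6 / 4)], wavelength=8.6)
```

Backscattering cross sections scale as |f|^2 times the squared volume, which is why near-zero form factors make the residual (neighbor-interaction) terms dominant, as the abstract notes.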
Six year effectiveness of a population based two tier infant hearing screening programme.
Russ, S A; Rickards, F; Poulakis, Z; Barker, M; Saunders, K; Wake, M
2002-04-01
To determine whether a two tier universal infant hearing screening programme (population based risk factor ascertainment and universal distraction testing) lowered median age of diagnosis of bilateral congenital hearing impairment (CHI) >40 dB HL in Victoria, Australia. Comparison of whole population birth cohorts pre and post introduction of the Victorian Infant Hearing Screening Program (VIHSP). All babies surviving the neonatal period born in Victoria in 1989 (pre-VIHSP) and 1993 (post-VIHSP) were studied. (1) Pre-1992: distraction test at 7-9 months. (2) Post-1992: infants with risk factors for CHI referred for auditory brain stem evoked response (ABR) assessment; all others screened by modified distraction test at 7-9 months. Of the 1989 cohort (n = 63 454), 1.65/1000 were fitted with hearing aids for CHI by end 1995, compared with 2.09/1000 of the 1993 cohort (n = 64 116) by end 1999. Of these, 79 cases from the 1989 cohort (1.24/1000) and 72 cases from the 1993 cohort (1.12/1000) had CHI >40 dB HL. Median age at diagnosis of CHI >40 dB HL for the 1989 birth cohort was 20.3 months, and for the 1993 cohort was 14.2 months. Median age at diagnosis fell significantly for severe CHI but not for moderate or profound CHI. Significantly more babies with CHI >40 dB HL were diagnosed by 6 months of age in 1993 than in 1989 (21.7% v 6.3%). Compared to the six years pre-VIHSP, numbers aided by six months were consistently higher in the six years post-VIHSP (1.05 per 100 000 births versus 13.4 per 100 000 births per year). VIHSP resulted in very early diagnosis for more infants and lowered median age of diagnosis of severe CHI. However, overall results were disappointing.
SW#db: GPU-Accelerated Exact Sequence Similarity Database Search.
Korpar, Matija; Šošić, Martin; Blažeka, Dino; Šikić, Mile
2015-01-01
In recent years we have witnessed growth in sequencing yield and in the number of samples sequenced, and, as a result, growth of publicly maintained sequence databases. This increase in data has placed high demands on protein similarity search algorithms, with two opposing goals: keeping running times acceptable while maintaining a high enough level of sensitivity. The most time-consuming step of similarity search is the local alignment between query and database sequences. This step is usually performed using exact local alignment algorithms such as Smith-Waterman. Due to its quadratic time complexity, aligning a query against the whole database is usually too slow. Therefore, most protein similarity search methods apply heuristics before the exact local alignment to reduce the number of candidate sequences in the database. However, there remains a need to align a query sequence against a reduced database. In this paper we present the SW#db tool and library for fast exact similarity search. Although its running times as a standalone tool are comparable to those of BLAST, it is primarily intended for the exact local alignment phase, in which the database of sequences has already been reduced. It uses both GPU and CPU parallelization and, at the time of writing, was 4-5 times faster than SSEARCH, 6-25 times faster than CUDASW++ and more than 20 times faster than SSW, using multiple queries on the Swiss-Prot and UniRef90 databases.
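The exact kernel that SW#db accelerates is the Smith-Waterman local alignment recurrence. A plain quadratic-time Python reference version with linear gap penalties is sketched below for illustration; the scoring values are arbitrary, and the production tool uses substitution matrices, affine gaps and GPU/CPU parallelism.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between strings a and b (score only,
    no traceback), using the classic dynamic-programming recurrence."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: scores are clamped at zero
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best

score = smith_waterman("ACACACTA", "AGCACACA")
```

Every cell depends only on its three upper-left neighbours, which is what makes the anti-diagonal wavefront parallelization used by GPU implementations possible.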
Trask, Aaron J; Delbin, Maria A; Katz, Paige S; Zanesco, Angelina; Lucchesi, Pamela A
2012-01-01
The goals of the present study were to compare coronary resistance microvessel (CRM) remodeling between type 1 diabetes mellitus (T1DM) and type 2 diabetes mellitus (T2DM) mice, and to determine the impact of aerobic exercise training on CRM remodeling in diabetes. Eight-week-old male mice were divided into T1DM groups: control sedentary (Control-SD), T1DM sedentary (T1DM-SD) induced by streptozotocin, and T1DM exercise trained (T1DM-TR); and T2DM groups: control sedentary (Db/db-SD), T2DM sedentary (db/db-SD), and T2DM trained (db/db-TR). Aerobic exercise training (TR) was performed on a mouse treadmill for 8 weeks. CRMs were isolated and mounted on a pressure myograph to measure and record vascular remodeling and mechanics. CRM diameters, wall thickness, stress-strain, and incremental modulus remained unchanged in T1DM-SD mice compared with controls, and exercise training had no effect. In contrast, CRMs isolated from db/db-SD mice exhibited decreased luminal diameter with thicker microvascular walls, which significantly increased the wall:lumen ratio (Db/db-SD: 5.8±0.3 vs. db/db-SD: 8.9±0.7, p<0.001). Compared with db/db-SD mice, coronary arterioles isolated from db/db-TR mice had similar internal diameter and wall thickness, while the wall:lumen ratio (6.8±0.2, p<0.05) and growth index (db/db-SD: 16.2 vs. db/db-TR: 4.3, % over Db/db) were reduced. These data show that CRMs undergo adverse inward hypertrophic remodeling in T2DM, but not T1DM, and that aerobic exercise training can partially mitigate this process. Copyright © 2012 Elsevier Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arana, Maite Rocío, E-mail: arana@ifise-conicet.gov.ar; Tocchetti, Guillermo Nicolás, E-mail: gtocchetti@live.com.ar; Domizi, Pablo, E-mail: domizi@ibr-conicet.gov.ar
2015-09-01
The cAMP pathway is a universal signaling pathway regulating many cellular processes, including metabolic routes, growth, and differentiation. However, its effects on xenobiotic biotransformation and transport systems are poorly characterized. The effect of cAMP on the expression and activity of GST and MRP2 was evaluated in Caco-2 cells, a model of intestinal epithelium. Cells incubated with the cAMP-permeable analog dibutyryl cyclic AMP (db-cAMP; 1, 10, or 100 μM) for 48 h exhibited a dose-response increase in GST class α and MRP2 protein expression. Incubation with forskolin, an activator of adenylyl cyclase, confirmed the association between intracellular cAMP and upregulation of MRP2. Consistent with increased expression of GSTα and MRP2, db-cAMP enhanced their activities, as well as cytoprotection against the common substrate 1-chloro-2,4-dinitrobenzene. Pretreatment with protein kinase A (PKA) inhibitors totally abolished the upregulation of MRP2 and GSTα induced by db-cAMP. In silico analysis, together with experiments in which Caco-2 cells transfected with a reporter construct containing CRE and AP-1 sites were treated with db-cAMP, evidenced the participation of these sites in MRP2 upregulation. Further studies involving the transcription factors CREB and AP-1 (c-JUN, c-FOS and ATF2) demonstrated increased levels of total c-JUN and phosphorylation of c-JUN and ATF2 by db-cAMP, which were suppressed by a PKA inhibitor. Co-immunoprecipitation and ChIP assay studies demonstrated that db-cAMP increased c-JUN/ATF2 interaction, with further recruitment to the region of the MRP2 promoter containing the CRE and AP-1 sites. We conclude that cAMP induces GSTα and MRP2 expression and activity in Caco-2 cells via the PKA pathway, thus regulating the detoxification of specific xenobiotics. - Highlights: • cAMP positively modulates the expression and activity of GST and MRP2 in Caco-2 cells. • Such induction resulted in increased cytoprotection against chemical injury. • The PKA signaling pathway is involved downstream of cAMP. • Transcriptional MRP2 regulation ultimately involved the participation of c-JUN and ATF2.
TrypsNetDB: An integrated framework for the functional characterization of trypanosomatid proteins
Gazestani, Vahid H.; Yip, Chun Wai; Nikpour, Najmeh; Berghuis, Natasha
2017-01-01
Trypanosomatid parasites cause serious infections in humans and production losses in livestock. Due to the high divergence from other eukaryotes, such as humans and model organisms, the functional roles of many trypanosomatid proteins cannot be predicted by homology-based methods, rendering a significant portion of their proteins as uncharacterized. Recent technological advances have led to the availability of multiple systematic and genome-wide datasets on trypanosomatid parasites that are informative regarding the biological role(s) of their proteins. Here, we report TrypsNetDB (http://trypsNetDB.org), a web-based resource for the functional annotation of 16 different species/strains of trypanosomatid parasites. The database not only visualizes the network context of the queried protein(s) in an intuitive way but also examines the response of the represented network in more than 50 different biological contexts and its enrichment for various biological terms and pathways, protein sequence signatures, and potential RNA regulatory elements. The interactome core of the database, as of Jan 23, 2017, contains 101,187 interactions among 13,395 trypanosomatid proteins inferred from 97 genome-wide and focused studies on the interactome of these organisms. PMID:28158179
48 CFR 22.404-1 - Types of wage determinations.
Code of Federal Regulations, 2010 CFR
2010-10-01
... in the “Archived DB WD” database on WDOL for information purposes only. Contracting officers may not use an archived wage determination in a contract action without obtaining prior approval of the...
Huang, Le; You, Yong-Ke; Zhu, Tracy Y; Zheng, Li-Zhen; Huang, Xiao-Ru; Chen, Hai-Yong; Yao, Dong; Lan, Hui-Yao; Qin, Ling
2016-06-10
This study aimed to evaluate the validity of the leptin receptor-deficient mouse model for secondary osteoporosis associated with type 2 diabetes mellitus (T2DM) at the bone micro-architectural level. Thirty-three 36-week-old male mice were divided into four groups: normal control (db/m) (n = 7), leptin receptor-deficient T2DM (db/db) (n = 8), human C-reactive protein (CRP) transgenic normal control (crp/db/m) (n = 7), and human CRP transgenic T2DM (crp/db/db) (n = 11). Lumbar vertebrae (L5) and bilateral lower limbs were scanned by micro-CT to analyze trabecular and cortical bone quality. Right femora were used for three-point bending to analyze mechanical properties. Trabecular bone quality at L5 was better in the db/db and crp/db/db groups in terms of bone mineral density (BMD), bone volume fraction, connectivity density, trabecular number, and separation (all p < 0.05). However, the indices measured at the proximal tibia showed comparable trabecular BMD and microarchitecture among the four groups. Femur length in the crp/db/db group was significantly shorter than in the db/m group (p < 0.05), and cortices were thinner in the db/db and crp/db/db groups (p > 0.05). Maximum loading and energy yield in the mechanical test were similar among groups, while the elastic modulus in db/db and crp/db/db was significantly lower than in db/m. The leptin receptor-deficient mouse is therefore not a proper model for secondary osteoporosis associated with T2DM.
Mohtadi, Nicholas; Barber, Rhamona; Chan, Denise; Paolucci, Elizabeth Oddone
2016-05-01
Complications/adverse events of anterior cruciate ligament (ACL) surgery are underreported, despite pooled level 1 data in systematic reviews. All adverse events/complications occurring within a 2-year postoperative period after primary ACL reconstruction, as part of a large randomized clinical trial (RCT), were identified and described. Prospective, double-blind randomized clinical trial. Patients and the independent trained examiner were blinded to treatment allocation. University-based orthopedic referral practice. Three hundred thirty patients (14-50 years; 183 males) with isolated ACL deficiency were intraoperatively randomized to ACL reconstruction with 1 autograft type. Graft harvest and arthroscopic portal incisions were identical. Patients were equally distributed to patellar tendon (PT), quadruple-stranded hamstring tendon (HT), and double-bundle (DB) hamstring autograft ACL reconstruction. Adverse events/complications were patient reported, documented, and diagnoses confirmed. Two major complications occurred: pulmonary embolism and septic arthritis. Twenty-four patients (7.3%) required repeat surgery, including 25 separate operations: PT = 7 (6.4%), HT = 9 (8.2%), and DB = 8 (7.3%). Repeat surgery was performed for meniscal tears (3.6%; n = 12), intra-articular scarring (2.7%; n = 9), chondral pathology (0.6%; n = 2), and wound dehiscence (0.3%; n = 1). Other complications included wound problems, sensory nerve damage, muscle tendon injury, tibial periostitis, and suspected meniscal tears and chondral lesions. Overall, more complications occurred in the HT/DB groups (PT = 24; HT = 31; DB = 45), but more PT patients complained of moderate or severe kneeling pain (PT = 17; HT = 9; DB = 4) at 2 years. Overall, ACL reconstructive surgery is safe. Major complications were uncommon. Secondary surgery was necessary 7.3% of the time for complications/adverse events (excluding graft reinjury or revisions) within the first 2 years. Level 1 (therapeutic studies). 
This article reports on the complications/adverse events that were prospectively identified up to 2 years postoperatively, in a defined patient population participating in a large double-blind randomized clinical trial comparing PT, single-bundle hamstring, and DB hamstring reconstructions for ACL rupture.
Atzori, A S; Tedeschi, L O; Cannas, A
2013-05-01
The economic efficiency of dairy farms is the main goal of farmers. The objective of this work was to use routinely available information at the dairy farm level to develop an index of profitability to rank dairy farms and to assist the decision-making process of farmers to increase the economic efficiency of the entire system. A stochastic modeling approach was used to study the relationships between inputs and profitability (i.e., income over feed cost; IOFC) of dairy cattle farms. The IOFC was calculated as: milk revenue + value of male calves + culling revenue - herd feed costs. Two databases were created. The first one was a development database, which was created from technical and economic variables collected in 135 dairy farms. The second one was a synthetic database (sDB) created from 5,000 synthetic dairy farms using the Monte Carlo technique and based on the characteristics of the development database data. The sDB was used to develop a ranking index as follows: (1) principal component analysis (PCA), excluding IOFC, was used to identify principal components (sPC); and (2) coefficient estimates of a multiple regression of the IOFC on the sPC were obtained. Then, the eigenvectors of the sPC were used to compute the principal component values for the original 135 dairy farms that were used with the multiple regression coefficient estimates to predict IOFC (dRI; ranking index from development database). The dRI was used to rank the original 135 dairy farms. The PCA explained 77.6% of the sDB variability and 4 sPC were selected. The sPC were associated with herd profile, milk quality and payment, poor management, and reproduction based on the significant variables of the sPC. The mean IOFC in the sDB was 0.1377 ± 0.0162 euros per liter of milk (€/L). The dRI explained 81% of the variability of the IOFC calculated for the 135 original farms. 
When the number of farms below and above 1 standard deviation (SD) of the dRI was calculated, we found that 21 farms had dRI < -1 SD, 32 farms were between -1 SD and 0, 67 farms were between 0 and +1 SD, and 15 farms had dRI > +1 SD. The top 10% of farms had a dRI greater than 0.170 €/L, whereas the bottom 10% had a dRI lower than 0.116 €/L. This stochastic approach allowed us to understand the relationships among the inputs of the studied dairy farms and to develop a ranking index for comparison purposes. The developed methodology may be improved by using more inputs at the dairy farm level and by considering actual costs to measure profitability. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
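The IOFC definition above is simple arithmetic; a minimal sketch of computing IOFC per liter and ranking farms by it (the farm names and figures here are entirely hypothetical, not data from the study):

```python
def iofc_per_litre(milk_revenue, calf_value, culling_revenue, feed_cost, litres):
    """Income over feed cost (IOFC) per litre of milk, as defined in the study:
    milk revenue + value of male calves + culling revenue - herd feed costs,
    normalized by milk volume (all monetary values in the same currency)."""
    return (milk_revenue + calf_value + culling_revenue - feed_cost) / litres

# Hypothetical farms: (milk revenue, calf value, culling revenue, feed cost, litres)
farms = {
    "farm_A": (52000, 1500, 3000, 30000, 180000),
    "farm_B": (48000, 1200, 2500, 33000, 160000),
}

# Rank farms from most to least profitable per litre
ranked = sorted(farms, key=lambda f: iofc_per_litre(*farms[f]), reverse=True)
```

The paper's dRI replaces this direct calculation with a regression of IOFC on principal components, so that farms can be ranked from routinely available inputs even when IOFC itself is unknown.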
Creating databases for biological information: an introduction.
Stein, Lincoln
2002-08-01
The essence of bioinformatics is dealing with large quantities of information. Whether it be sequencing data, microarray data files, mass spectrometric data (e.g., fingerprints), the catalog of strains arising from an insertional mutagenesis project, or even large numbers of PDF files, there inevitably comes a time when the information can simply no longer be managed with files and directories. This is where databases come into play. This unit briefly reviews the characteristics of several database management systems, including flat file, indexed file, and relational databases, as well as ACeDB. It compares their strengths and weaknesses and offers some general guidelines for selecting an appropriate database management system.
Constructing Benchmark Databases and Protocols for Medical Image Analysis: Diabetic Retinopathy
Kauppi, Tomi; Kämäräinen, Joni-Kristian; Kalesnykiene, Valentina; Sorri, Iiris; Uusitalo, Hannu; Kälviäinen, Heikki
2013-01-01
We address the performance evaluation practices for developing medical image analysis methods, in particular, how to establish and share databases of medical images with verified ground truth and solid evaluation protocols. Such databases support the development of better algorithms, execution of profound method comparisons, and, consequently, technology transfer from research laboratories to clinical practice. For this purpose, we propose a framework consisting of reusable methods and tools for the laborious task of constructing a benchmark database. We provide a software tool for medical image annotation helping to collect class label, spatial span, and expert's confidence on lesions and a method to appropriately combine the manual segmentations from multiple experts. The tool and all necessary functionality for method evaluation are provided as public software packages. As a case study, we utilized the framework and tools to establish the DiaRetDB1 V2.1 database for benchmarking diabetic retinopathy detection algorithms. The database contains a set of retinal images, ground truth based on information from multiple experts, and a baseline algorithm for the detection of retinopathy lesions. PMID:23956787
ERIC Educational Resources Information Center
Kozma, Tamas; Radacsi, Imre
2000-01-01
Addresses the problem of educating minorities when the political borders of European nation-states fail to coincide with ethno-linguistic realities. Suggests two solutions to problems of higher education for ethno-linguistic minorities: (1) multilingual universities, and (2) regional cooperation in higher education in border areas. (Author/DB)
ERIC Educational Resources Information Center
Gray, Peter J.
Ways a microcomputer can be used to establish and maintain an evaluation database and types of data management features possible on a microcomputer are described in this report, which contains step-by-step procedures and numerous examples for establishing a database, manipulating data, and designing and printing reports. Following a brief…
A resource for benchmarking the usefulness of protein structure models.
Carbajo, Daniel; Tramontano, Anna
2012-08-02
Increasingly, biologists and biochemists use computational tools to design experiments to probe the function of proteins and/or to engineer them for a variety of different purposes. The most effective strategies rely on knowledge of the three-dimensional structure of the protein of interest. However, it is often the case that an experimental structure is not available and that models of different quality are used instead. On the other hand, the relationship between the quality of a model and its appropriate use is not easy to derive in general, and so far it has been analyzed in detail only for specific applications. This paper describes a database and related software tools that allow testing of a given structure-based method on models of a protein representing different levels of accuracy. Comparing the results of a computational experiment on the experimental structure and on a set of its decoy models will allow developers and users to assess the specific threshold of accuracy required to perform the task effectively. The ModelDB server automatically builds decoy models of different accuracy for a given protein of known structure and provides a set of useful tools for their analysis. Pre-computed data for a non-redundant set of deposited protein structures are available for analysis and download in the ModelDB database. Implementation, availability and requirements: Project name: A resource for benchmarking the usefulness of protein structure models. Project home page: http://bl210.caspur.it/MODEL-DB/MODEL-DB_web/MODindex.php. Operating system(s): platform independent. Programming language: Perl/BioPerl (program); MySQL, Perl DBI and DBD modules (database); PHP, JavaScript, Jmol scripting (web server). Other requirements: Java Runtime Environment v1.4 or later, Perl, BioPerl, CPAN modules, HHsearch, Modeller, LGA, NCBI BLAST package, DSSP, Speedfill (Surfnet) and PSAIA. License: free. Any restrictions to use by non-academics: no.
Weller, Kathryn E; Greene, Geoffrey W; Redding, Colleen A; Paiva, Andrea L; Lofgren, Ingrid; Nash, Jessica T; Kobayashi, Hisanori
2014-01-01
To develop and validate an instrument to assess environmentally conscious eating (Green Eating [GE]) behavior (BEH) and GE Transtheoretical Model constructs including Stage of Change (SOC), Decisional Balance (DB), and Self-efficacy (SE). Cross-sectional instrument development survey. Convenience sample (n = 954) of 18- to 24-year-old college students from a northeastern university. The sample was randomly split: (N1) and (N2). N1 was used for exploratory factor analyses using principal components analyses; N2 was used for confirmatory analyses (structural modeling) and reliability analyses (coefficient α). The full sample was used for measurement invariance (multi-group confirmatory analyses) and convergent validity (BEH) and known group validation (DB and SE) by SOC using analysis of variance. Reliable (α > .7), psychometrically sound, and stable measures included 2 correlated 5-item DB subscales (Pros and Cons), 2 correlated SE subscales (school [5 items] and home [3 items]), and a single 6-item BEH scale. Most students (66%) were in Precontemplation and Contemplation SOC. Behavior, DB, and SE scales differed significantly by SOC (P < .001) with moderate to large effect sizes, as predicted by the Transtheoretical Model, which supported the validity of these measures. Successful development and preliminary validation of this 25-item GE instrument provides a basis for assessment as well as development of tailored interventions for college students. Copyright © 2014 Society for Nutrition Education and Behavior. Published by Elsevier Inc. All rights reserved.
Normative data for the Words-in-Noise Test for 6- to 12-year-old children.
Wilson, Richard H; Farmer, Nicole M; Gandhi, Avni; Shelburne, Emily; Weaver, Jamie
2010-10-01
To establish normative data for children on the Words-in-Noise Test (WIN; R. H. Wilson, 2003; R. H. Wilson & R. McArdle, 2007). Forty-two children in each of 7 age groups, ranging in age from 6 to 12 years (n = 294), and 24 young adults (age range: 18-27 years) with normal hearing for pure tones participated. All listeners were screened at 15 dB HL (American National Standards Institute, 2004) at the octave intervals between 500 and 4000 Hz. Randomizations of WIN Lists 1, 2, and 1 or WIN Lists 2, 1, and 2 were presented with the noise fixed at 70 dB SPL, followed by presentation at 90 dB SPL of the 70 Northwestern University Auditory Test No. 6 (T. W. Tillman & R. Carhart, 1966) words used in the WIN. Finally, the Peabody Picture Vocabulary Test-Revised (L. M. Dunn & L. M. Dunn, 1981) was administered. Testing was conducted in a quiet room. There were 3 main findings: (a) the biggest change in recognition performance occurred between the ages of 6 and 7 years; (b) from 9 to 12 years, recognition performance was stable; and (c) performance by young adults (18-27 years) was slightly better (1-2 dB) than performance by the older children. The WIN can be used with children as young as 6 years of age; however, age-specific ranges of normal recognition performance must be used.
An approach to efficient mobility management in intelligent networks
NASA Technical Reports Server (NTRS)
Murthy, K. M. S.
1995-01-01
Providing personal communication systems that support full mobility requires intelligent networks for tracking mobile users and facilitating outgoing and incoming calls across different physical and network environments. Databases play a major role in realizing intelligent network functionality. Currently proposed network architectures envision using the SS7-based signaling network for linking these DBs and for interconnecting DBs with switches. If the network is to support ubiquitous, seamless mobile services, it must additionally support the mobile application parts, viz., mobile-originated calls, mobile-terminated calls, mobile location updates, and inter-switch handovers. These functions will generate a significant amount of data and require it to be transferred between databases (HLR, VLR) and switches (MSCs) very efficiently. In the future, users (fixed or mobile) may communicate with sophisticated CPEs (e.g., multimedia, multipoint, and multisession calls) that require complex signaling functions. This will generate voluminous service-handling data and require the efficient transfer of these messages between databases and switches. Consequently, network providers would be able to add new services and capabilities to their networks incrementally, quickly, and cost-effectively.
jSPyDB, an open source database-independent tool for data management
NASA Astrophysics Data System (ADS)
Pierro, Giuseppe Antonio; Cavallari, Francesca; Di Guida, Salvatore; Innocente, Vincenzo
2011-12-01
Nowadays, the number of commercial tools for accessing databases, built on Java or .NET, is increasing. However, many of these applications have several drawbacks: they are usually not open source, they provide interfaces only to a specific kind of database, and they are platform-dependent and very CPU- and memory-consuming. jSPyDB is a free web-based tool written in Python and JavaScript. It relies on jQuery and Python libraries, and is intended to provide a simple handler to different database technologies inside a local web browser. Such a tool, exploiting fast access libraries such as SQLAlchemy, is easy to install and configure. The design of this tool envisages three layers. The front-end client side in the local web browser communicates with a back-end server. Only the server is able to connect to the different databases for the purposes of performing data definition and manipulation. The server makes the data available to the client, so that the user can display and handle them safely. Moreover, thanks to the jQuery libraries, this tool supports the export of data in different formats, such as XML and JSON. Finally, by using a set of pre-defined functions, users are allowed to create customized views for better data visualization. In this way, we optimize the performance of database servers by avoiding short connections and concurrent sessions. In addition, security is enforced, since users are not given the possibility to execute arbitrary SQL statements directly.
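The server-side step described above, running a query against some backend and serializing the rows in a client-friendly format, can be sketched with the standard library. This illustrative example uses `sqlite3` as a stand-in backend and JSON as the export format; it is not jSPyDB's actual code, and the table and column names are hypothetical:

```python
import json
import sqlite3

def query_to_json(conn, sql, params=()):
    """Run a read-only query and serialize the rows as JSON, keyed by column
    name, mirroring the server-to-client export step (sketch only)."""
    cur = conn.execute(sql, params)
    cols = [d[0] for d in cur.description]  # column names from the cursor
    return json.dumps([dict(zip(cols, row)) for row in cur.fetchall()])

# In-memory stand-in database with some sample rows
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (id INTEGER, status TEXT)")
conn.executemany("INSERT INTO runs VALUES (?, ?)", [(1, "ok"), (2, "failed")])

payload = query_to_json(conn, "SELECT id, status FROM runs ORDER BY id")
```

Because the serialization works from `cursor.description` rather than a hard-coded schema, the same function works against any DB-API backend, which is the database-independence idea the abstract describes.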
NASA Technical Reports Server (NTRS)
Todd, Nancy S.
2016-01-01
The rock and soil samples returned from the Apollo missions from 1969-72 have supported 46 years of research leading to advances in our understanding of the formation and evolution of the inner Solar System. NASA has been engaged in several initiatives that aim to restore, digitize, and make available to the public existing published and unpublished research data for the Apollo samples. One of these initiatives is a collaboration with IEDA (Interdisciplinary Earth Data Alliance) to develop MoonDB, a lunar geochemical database modeled after PetDB (Petrological Database of the Ocean Floor). In support of this initiative, NASA has adopted the use of IGSN (International Geo Sample Number) to generate persistent, unique identifiers for lunar samples that scientists can use when publishing research data. To facilitate the IGSN registration of the original 2,200 samples and over 120,000 subdivided samples, NASA has developed an application that retrieves sample metadata from the Lunar Curation Database and uses the SESAR API to automate the generation of IGSNs and registration of samples into SESAR (System for Earth Sample Registration). This presentation will describe the work done by NASA to map existing sample metadata to the IGSN metadata and integrate the IGSN registration process into the sample curation workflow, the lessons learned from this effort, and how this work can be extended in the future to help deal with the registration of large numbers of samples.
Research on high availability architecture of SQL and NoSQL
NASA Astrophysics Data System (ADS)
Wang, Zhiguo; Wei, Zhiqiang; Liu, Hao
2017-03-01
With the advent of the era of big data, the amount and importance of data have increased dramatically. SQL databases continue to develop in performance and scalability, but more and more companies tend to use NoSQL databases, because a NoSQL database has a simpler data model and stronger extension capacity than a SQL database. Almost all database designers, for SQL and NoSQL databases alike, aim to improve performance and ensure availability through a reasonable architecture that reduces the effects of software and hardware failures, so that they can provide a better experience for their customers. In this paper, I mainly discuss the architectures of MySQL, MongoDB, and Redis, which are highly available and have been deployed in practical application environments, and design a hybrid architecture.
EvoSNP-DB: A database of genetic diversity in East Asian populations.
Kim, Young Uk; Kim, Young Jin; Lee, Jong-Young; Park, Kiejung
2013-08-01
Genome-wide association studies (GWAS) have become popular as an approach for identifying large numbers of phenotype-associated variants. However, differences in genetic architecture and environmental factors mean that the effect of a variant can vary across populations. Understanding population genetic diversity is valuable for the investigation of possible population-specific and population-independent effects of variants. EvoSNP-DB aims to provide information regarding genetic diversity among East Asian populations, including Chinese, Japanese, and Korean. Non-redundant SNPs (1.6 million) were genotyped in 54 Korean trios (162 samples) and were compared with 4 million SNPs from the HapMap phase II populations. EvoSNP-DB provides two user interfaces for data query and visualization, and integrates scores of genetic diversity (Fst and VarLD) at the level of SNPs, genes, and chromosomal regions. EvoSNP-DB is a web-based application that allows users to navigate and visualize measurements of population genetic differences in an interactive manner, and is available online at http://biomi.cdc.go.kr/EvoSNP/.
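The Fst score mentioned above measures how allele frequencies diverge between populations. A minimal sketch for a single biallelic SNP in two equally weighted populations follows; this is the textbook Fst = (Ht - Hs)/Ht form, a simplification, and EvoSNP-DB's exact estimator may differ:

```python
def fst_two_pops(p1, p2):
    """Wright's Fst for a biallelic SNP from allele frequencies in two
    populations (equal weights): Fst = (Ht - Hs) / Ht, where expected
    heterozygosity is H = 2p(1 - p)."""
    p_bar = (p1 + p2) / 2
    ht = 2 * p_bar * (1 - p_bar)                    # pooled heterozygosity
    hs = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    return 0.0 if ht == 0 else (ht - hs) / ht

# Identical frequencies give no differentiation; fixed opposite alleles give 1
no_diff = fst_two_pops(0.3, 0.3)   # 0.0
full_diff = fst_two_pops(0.0, 1.0)  # 1.0
```

Computing this per SNP across Korean, Chinese, and Japanese frequency data yields exactly the kind of per-SNP diversity track the database visualizes.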
Matuo, Yushi; Matsunami, Hidetoshi; Takemura, Masao; Saito, Kuniaki
2011-12-01
The Resource Center for Health Science (RECHS) has initiated a project based on the development and utilization of Bio-Resources/Database (BR/DB), comprising personal health records (PHR), such as the health/medical records of individuals, physically consolidated with bio-resources (e.g., serum, urine) taken from the same individuals. The project is characterized by the analysis of year-on-year changes in BR/DB collected annually from healthy individuals, with a target of 100,000 participants, rather than by data dependent on the number of unhealthy individuals investigated so far. The purpose is to establish a primary defense for the improvement of QOL by applying BR/DB to epidemiological and clinical-chemistry analysis. Furthermore, it also contributes to the construction of a PHR system planned as a national project. The RECHS coordinating activities depend fully on as many general hospitals as possible, on the basis of regional medical services, and on academic groups capable of analyzing BR/DB.
Potential hazard of hearing damage to students in undergraduate popular music courses.
Barlow, Christopher
2010-12-01
In recent years, there has been rapid growth in university courses related to popular and commercial music, with a commensurate increase in the number of students studying on these courses. Students of popular music subjects are frequently exposed to electronically amplified sound during rehearsal and recording, in addition to the "normal" noise exposure commonly associated with young people. The combination of these two elements suggests a higher-than-average noise exposure hazard for these students. To date, the majority of noise studies on students have focused on exposure from personal music players and on classical, orchestral, and marching band musicians. One hundred students across a range of university popular music courses were surveyed using a 30-point questionnaire regarding their musical habits both within and outside their university courses. This was followed by noise dosimetry of studios/recording spaces and music venues popular with students. Questionnaire responses showed that 76% of subjects reported having experienced symptoms associated with hearing loss, while only 18% reported using hearing protection devices. Rehearsals averaged 11.5 hrs/wk, with a mean duration of 2 hrs 13 mins and a mean level of 98 dB LAeq. Ninety-four percent of subjects reported attending concerts or nightclubs at least once per week, and measured exposure in two of these venues ranged from 98 to 112 dB LAeq, with a mean of 98.9 dB LAeq over a 4.5-hr period. The results suggest an extremely high hazard of excessive noise exposure among this group, from both their social and their study-based music activities.
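Exposures at different levels and durations, such as those reported above, combine on an energy basis rather than by averaging the dB values directly. A small sketch of the standard LAeq energy-averaging formula; the example figures are taken loosely from the study, not an exact reproduction of its dosimetry:

```python
import math

def laeq_combined(segments):
    """Equivalent continuous A-weighted level (LAeq) over several exposure
    segments, given as (level_dB, hours) pairs: convert each level to relative
    sound energy, time-weight, average, and convert back to decibels."""
    total_hours = sum(hours for _, hours in segments)
    energy = sum(hours * 10 ** (level / 10) for level, hours in segments)
    return 10 * math.log10(energy / total_hours)

# e.g. a 2 h rehearsal at 98 dB plus 4.5 h in a club at 99 dB
combined = round(laeq_combined([(98, 2.0), (99, 4.5)]), 1)  # 98.7
```

Because the average is taken on the energy scale, the louder, longer segment dominates: the combined level sits close to 99 dB rather than halfway between the two figures.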
Elevated Steroid Hormone Production in the db/db Mouse Model of Obesity and Type 2 Diabetes.
Hofmann, Anja; Peitzsch, Mirko; Brunssen, Coy; Mittag, Jennifer; Jannasch, Annett; Frenzel, Annika; Brown, Nicholas; Weldon, Steven M; Eisenhofer, Graeme; Bornstein, Stefan R; Morawietz, Henning
2017-01-01
Obesity and type 2 diabetes have become a major public health problem worldwide. Steroid hormone dysfunction appears to be linked to the development of obesity and type 2 diabetes, and correction of steroid abnormalities may offer new approaches to therapy. We therefore analyzed plasma steroids in 15-16-week-old obese and diabetic db/db mice using liquid chromatography-tandem mass spectrometry. Lean db/+ mice served as controls. Db/db mice developed obesity, hyperglycemia, hyperleptinemia, and hyperlipidemia. Hepatic triglyceride storage was increased, and adiponectin and pancreatic insulin were lowered. Aldosterone, corticosterone, 11-deoxycorticosterone, and progesterone were respectively increased 3.6-, 2.9-, 3.4-, and 1.7-fold in db/db mice compared to controls. Ratios of aldosterone-to-progesterone and corticosterone-to-progesterone were respectively 2.0- and 1.5-fold higher in db/db mice. Genes associated with steroidogenesis were quantified in the adrenal glands and gonadal adipose tissues. In adrenals, Cyp11b2, Cyp11b1, Cyp21a1, Hsd3b1, Cyp11a1, and StAR were all significantly increased in db/db mice compared with db/+ controls. In adipose tissue, no Cyp11b2 or Cyp11b1 transcripts were detected, and no differences in Cyp21a1, Hsd3b1, Cyp11a1, or StAR expression were found between db/+ and db/db mice. In conclusion, the present study showed elevated steroid hormone production and adrenal steroidogenesis in the db/db model of obesity and type 2 diabetes. © Georg Thieme Verlag KG Stuttgart · New York.
ERIC Educational Resources Information Center
Currents, 2000
2000-01-01
A chart of 40 alumni-development database systems provides information on vendor/Web site, address, contact/phone, software name, price range, minimum suggested workstation/suggested server, standard reports/reporting tools, minimum/maximum record capacity, and number of installed sites/client type. (DB)
Reengineering a database for clinical trials management: lessons for system architects.
Brandt, C A; Nadkarni, P; Marenco, L; Karras, B T; Lu, C; Schacter, L; Fisk, J M; Miller, P L
2000-10-01
This paper describes the process of enhancing Trial/DB, a database system for clinical studies management. The system's enhancements have been driven by the need to maximize the effectiveness of developer personnel in supporting numerous and diverse users, of study designers in setting up new studies, and of administrators in managing ongoing studies. Trial/DB was originally designed to work over a local area network within a single institution, and basic architectural changes were necessary to make it work over the Internet efficiently as well as securely. Further, as its use spread to diverse communities of users, changes were made to let the processes of study design and project management adapt to the working styles of the principal investigators and administrators for each study. The lessons learned in the process should prove instructive for system architects as well as managers of electronic patient record systems.
An Unsupervised Approach for Extraction of Blood Vessels from Fundus Images.
Dash, Jyotiprava; Bhoi, Nilamani
2018-04-26
Pathological disorders may arise from small changes in retinal blood vessels and may later lead to blindness. Hence, the accurate segmentation of blood vessels is a challenging task for pathological analysis. This paper offers an unsupervised recursive method for the extraction of blood vessels from ophthalmoscope images. First, a vessel-enhanced image is generated with the help of gamma correction and contrast-limited adaptive histogram equalization (CLAHE). Next, the vessels are extracted iteratively by applying an adaptive thresholding technique. At last, a final vessel-segmented image is produced by applying a morphological cleaning operation. Evaluations are conducted on the publicly available digital retinal images for vessel extraction (DRIVE) and Child Heart And Health Study in England (CHASE_DB1) databases using nine different measurements. The proposed method achieves average accuracies of 0.957 and 0.952 on the DRIVE and CHASE_DB1 databases, respectively.
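The enhance-then-threshold pipeline described above can be sketched as follows. This is a minimal illustration using only gamma correction and an isodata-style iterative threshold; CLAHE and the morphological cleaning step are omitted, and it is not the authors' implementation:

```python
import numpy as np

def gamma_correct(channel, gamma=1.2):
    """Gamma-correct an 8-bit image channel to boost vessel contrast."""
    norm = channel.astype(np.float64) / 255.0
    return np.uint8(255 * norm ** gamma)

def iterative_threshold(img, tol=0.5):
    """Classic isodata-style iterative threshold selection."""
    t = img.mean()
    while True:
        lo, hi = img[img <= t], img[img > t]
        t_new = 0.5 * (lo.mean() + hi.mean())
        if abs(t_new - t) < tol:
            return t_new
        t = t_new

# Synthetic stand-in for a fundus green channel
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
enhanced = gamma_correct(img)
t = iterative_threshold(enhanced.astype(np.float64))
vessels = enhanced > t  # candidate vessel mask; polarity depends on preprocessing
```

In the actual method the threshold is re-applied recursively to the residual image, so thin vessels missed in one pass can be recovered in the next.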
The Binding Database: data management and interface design.
Chen, Xi; Lin, Yuhmei; Liu, Ming; Gilson, Michael K
2002-01-01
The large and growing body of experimental data on biomolecular binding is of enormous value in developing a deeper understanding of molecular biology, in developing new therapeutics, and in various molecular design applications. However, most of these data are found only in the published literature and are therefore difficult to access and use. No existing public database has focused on measured binding affinities and provided query capabilities that include chemical structure and sequence homology searches. We have created Binding DataBase (BindingDB), a public, web-accessible database of measured binding affinities. BindingDB is based upon a relational data specification for describing binding measurements via Isothermal Titration Calorimetry (ITC) and enzyme inhibition. A corresponding XML Document Type Definition (DTD) is used to create and parse intermediate files during the on-line deposition process and will also be used for data interchange, including collection of data from other sources. The on-line query interface, which is constructed with Java Servlet technology, supports standard SQL queries as well as searches for molecules by chemical structure and sequence homology. The on-line deposition interface uses Java Server Pages and JavaBean objects to generate dynamic HTML and to store intermediate results. The resulting data resource provides a range of functionality with brisk response times, and lends itself well to continued development and enhancement.
NASA Astrophysics Data System (ADS)
Minnett, R.; Koppers, A. A. P.; Jarboe, N.; Jonestrask, L.; Tauxe, L.; Constable, C.
2016-12-01
The Magnetics Information Consortium (https://earthref.org/MagIC/) develops and maintains a database and web application for supporting the paleo-, geo-, and rock magnetic scientific community. Historically, this objective has been met with an Oracle database and a Perl web application at the San Diego Supercomputer Center (SDSC). The Oracle Enterprise Cluster at SDSC, however, was decommissioned in July of 2016 and the cost for MagIC to continue using Oracle became prohibitive. This provided MagIC with a unique opportunity to reexamine the entire technology stack and data model. MagIC has developed an open-source web application using the Meteor (http://meteor.com) framework and a MongoDB database. The simplicity of the open-source full-stack framework that Meteor provides has improved MagIC's development pace and the increased flexibility of the data schema in MongoDB encouraged the reorganization of the MagIC Data Model. As a result of incorporating actively developed open-source projects into the technology stack, MagIC has benefited from their vibrant software development communities. This has translated into a more modern web application that has significantly improved the user experience for the paleo-, geo-, and rock magnetic scientific community.
RefPrimeCouch—a reference gene primer CouchApp
Silbermann, Jascha; Wernicke, Catrin; Pospisil, Heike; Frohme, Marcus
2013-01-01
To support a quantitative real-time polymerase chain reaction standardization project, a new reference gene database application was required. The new database application was built with the explicit goal of simplifying not only the development process but also making the user interface more responsive and intuitive. To this end, CouchDB was used as the backend with a lightweight dynamic user interface implemented client-side as a one-page web application. Data entry and curation processes were streamlined using an OpenRefine-based workflow. The new RefPrimeCouch database application provides its data online under an Open Database License. Database URL: http://hpclife.th-wildau.de:5984/rpc/_design/rpc/view.html PMID:24368831
EST databases and web tools for EST projects.
Shen, Yao-Qing; O'Brien, Emmet; Koski, Liisa; Lang, B Franz; Burger, Gertraud
2009-01-01
This chapter outlines key considerations for constructing and implementing an EST database. Instead of showing the technological details step by step, emphasis is put on the design of an EST database suited to the specific needs of EST projects and how to choose the most suitable tools. Using TBestDB as an example, we illustrate the essential factors to be considered for database construction and the steps for data population and annotation. This process employs technologies such as PostgreSQL, Perl, and PHP to build the database and interface, and tools such as AutoFACT for data processing and annotation. We discuss these in comparison to other available technologies and tools, and explain the reasons for our choices.
Saunders, Rebecca E; Instrell, Rachael; Rispoli, Rossella; Jiang, Ming; Howell, Michael
2013-01-01
High-throughput screening (HTS) uses technologies such as RNA interference to generate loss-of-function phenotypes on a genomic scale. As these technologies become more popular, many research institutes have established core facilities of expertise to deal with the challenges of large-scale HTS experiments. As the efforts of core facility screening projects come to fruition, focus has shifted towards managing the results of these experiments and making them available in a useful format that can be further mined for phenotypic discovery. The HTS-DB database provides a public view of data from screening projects undertaken by the HTS core facility at the CRUK London Research Institute. All projects and screens are described with comprehensive assay protocols, and datasets are provided with complete descriptions of analysis techniques. This format allows users to browse and search data from large-scale studies in an informative and intuitive way. It also provides a repository for additional measurements obtained from screens that were not the focus of the project, such as cell viability, and groups these data so that it can provide a gene-centric summary across several different cell lines and conditions. All datasets from our screens that can be made available can be viewed interactively and mined for further hit lists. We believe that in this format, the database provides researchers with rapid access to results of large-scale experiments that might facilitate their understanding of genes/compounds identified in their own research. Database URL: http://hts.cancerresearchuk.org/db/public.
Østergaard, Mette V; Pinto, Vanda; Stevenson, Kirsty; Worm, Jesper; Fink, Lisbeth N; Coward, Richard J M
2017-02-01
Diabetic nephropathy (DN) is the leading cause of kidney failure in the world. To understand important mechanisms underlying this condition, and to develop new therapies, good animal models are required. In mouse models of type 1 diabetes, the DBA/2J strain has been shown to be more susceptible to develop kidney disease than other common strains. We hypothesized this would also be the case in type 2 diabetes. We studied db/db and wild-type (wt) DBA/2J mice and compared these with the db/db BLKS/J mouse, which is currently the most widely used type 2 DN model. Mice were analyzed from age 6 to 12 wk for systemic insulin resistance, albuminuria, and glomerular histopathological and ultrastructural changes. Body weight and nonfasted blood glucose were increased by 8 wk in both genders, while systemic insulin resistance commenced by 6 wk in female and 8 wk in male db/db DBA/2J mice. The urinary albumin-to-creatinine ratio (ACR) was closely linked to systemic insulin resistance in both sexes and was increased ~50-fold by 12 wk of age in the db/db DBA/2J cohort. Glomerulosclerosis, foot process effacement, and glomerular basement membrane thickening were observed at 12 wk of age in db/db DBA/2J mice. Compared with db/db BLKS/J mice, db/db DBA/2J mice had significantly increased levels of urinary ACR, but similar glomerular histopathological and ultrastructural changes. The db/db DBA/2J mouse is a robust model of early-stage albuminuric DN, and its levels of albuminuria correlate closely with systemic insulin resistance. This mouse model will be helpful in defining early mechanisms of DN and ultimately the development of novel therapies. Copyright © 2017 the American Physiological Society.
Databases and Associated Tools for Glycomics and Glycoproteomics.
Lisacek, Frederique; Mariethoz, Julien; Alocci, Davide; Rudd, Pauline M; Abrahams, Jodie L; Campbell, Matthew P; Packer, Nicolle H; Ståhle, Jonas; Widmalm, Göran; Mullen, Elaine; Adamczyk, Barbara; Rojas-Macias, Miguel A; Jin, Chunsheng; Karlsson, Niclas G
2017-01-01
The access to biodatabases for glycomics and glycoproteomics has proven to be essential for current glycobiological research. This chapter presents available databases that are devoted to different aspects of glycobioinformatics. This includes oligosaccharide sequence databases, experimental databases, 3D structure databases (of both glycans and glycorelated proteins) and association of glycans with tissue, disease, and proteins. Specific search protocols are also provided using tools associated with experimental databases for converting primary glycoanalytical data to glycan structural information. In particular, researchers using glycoanalysis methods by U/HPLC (GlycoBase), MS (GlycoWorkbench, UniCarb-DB, GlycoDigest), and NMR (CASPER) will benefit from this chapter. In addition we also include information on how to utilize glycan structural information to query databases that associate glycans with proteins (UniCarbKB) and with interactions with pathogens (SugarBind).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anderson-Teixeira, Kristina J.; DeLucia, Evan H.; Duval, Benjamin D.
2015-10-29
To advance understanding of C dynamics of forests globally, we compiled a new database, the Forest C database (ForC-db), which contains data on ground-based measurements of ecosystem-level C stocks and annual fluxes along with disturbance history. This database currently contains 18,791 records from 2009 sites, making it the largest and most comprehensive database of C stocks and flows in forest ecosystems globally. The tropical component of the database will be published in conjunction with a manuscript that is currently under review (Anderson-Teixeira et al., in review). Database development continues, and we hope to maintain a dynamic instance of the entire (global) database.
Classification of Chemicals Based On Structured Toxicity Information
Thirty years and millions of dollars worth of pesticide registration toxicity studies, historically stored as hardcopy and scanned documents, have been digitized into highly standardized and structured toxicity data within the Toxicity Reference Database (ToxRefDB). Toxicity-bas...
NASA Astrophysics Data System (ADS)
Wan, Meng; Wu, Chao; Wang, Jing; Qiu, Yulei; Xin, Liping; Mullender, Sjoerd; Mühleisen, Hannes; Scheers, Bart; Zhang, Ying; Nes, Niels; Kersten, Martin; Huang, Yongpan; Deng, Jinsong; Wei, Jianyan
2016-11-01
The ground-based wide-angle camera array (GWAC), a part of the SVOM space mission, will search for various types of optical transients by continuously imaging a field of view (FOV) of 5000 square degrees every 15 s. Each exposure consists of 36 × 4k × 4k pixels, typically resulting in 36 × ~175,600 extracted sources. For a modern time-domain astronomy project like GWAC, which produces massive amounts of data with a high cadence, it is challenging to search for short timescale transients in both real-time and archived data, and to build long-term light curves for variable sources. Here, we develop a high-cadence, high-density light curve pipeline (HCHDLP) to process the GWAC data in real-time, and design a distributed shared-nothing database to manage the massive amount of archived data which will be used to generate a source catalog with more than 100 billion records during 10 years of operation. First, we develop HCHDLP based on the column-store DBMS of MonetDB, taking advantage of MonetDB's high performance when applied to massive data processing. To realize the real-time functionality of HCHDLP, we optimize the pipeline in its source association function, including both time and space complexity from outside the database (SQL semantic) and inside (RANGE-JOIN implementation), as well as in its strategy of building complex light curves. The optimized source association function is accelerated by three orders of magnitude. Second, we build a distributed database using a two-level time partitioning strategy via the MERGE TABLE and REMOTE TABLE technology of MonetDB. Intensive tests validate that our database architecture is able to achieve both linear scalability in response time and concurrent access by multiple users. In summary, our studies provide guidance for a solution to GWAC in real-time data processing and management of massive data.
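The source-association step can be illustrated with a naive cross-match. The pipeline itself performs this inside MonetDB with an optimized RANGE-JOIN; the flat-sky, O(N×M) Python sketch below (with hypothetical IDs and matching radius) only shows the underlying logic of associating newly extracted sources with catalog entries:

```python
def associate(new_sources, catalog, radius=0.001):
    """Match each new source (id -> (ra, dec) in degrees) to the nearest
    catalog entry within `radius`; unmatched sources are transient candidates."""
    matches = {}
    for sid, (ra, dec) in new_sources.items():
        best, best_d2 = None, radius ** 2
        for cid, (cra, cdec) in catalog.items():
            d2 = (ra - cra) ** 2 + (dec - cdec) ** 2  # flat-sky approximation
            if d2 <= best_d2:
                best, best_d2 = cid, d2
        matches[sid] = best  # None -> no counterpart, i.e. candidate transient
    return matches

catalog = {1: (10.0, 20.0)}
new = {"a": (10.0004, 20.0), "b": (50.0, 50.0)}
matches = associate(new, catalog)
```

A database RANGE-JOIN replaces the inner loop with an index-assisted coordinate-range predicate, which is what makes the per-exposure association fast enough for a 15 s cadence.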
2012-09-01
Keywords: telemetry downconverters; telemetry RF preamplifiers; telemetry multicouplers; telemetry receivers. Noise figure (NFdB) is expressed in decibels and noise factor (nf) in decimal units; for example, a noise figure of 3 dB corresponds to a noise factor of 2.
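The dB-to-linear relation quoted above is nf = 10^(NFdB/10); a minimal sketch of the conversion in both directions:

```python
import math

def noise_factor(nf_db):
    """Convert a noise figure in dB to a linear noise factor."""
    return 10 ** (nf_db / 10)

def noise_figure_db(nf):
    """Convert a linear noise factor back to a noise figure in dB."""
    return 10 * math.log10(nf)
```

With these, a 3 dB noise figure gives a noise factor of about 2, matching the example in the text.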
DoGSD: the dog and wolf genome SNP database.
Bai, Bing; Zhao, Wen-Ming; Tang, Bi-Xia; Wang, Yan-Qing; Wang, Lu; Zhang, Zhang; Yang, He-Chuan; Liu, Yan-Hu; Zhu, Jun-Wei; Irwin, David M; Wang, Guo-Dong; Zhang, Ya-Ping
2015-01-01
The rapid advancement of next-generation sequencing technology has generated a deluge of genomic data from domesticated dogs and their wild ancestor, grey wolves, which has simultaneously broadened our understanding of domestication and of diseases that are shared by humans and dogs. To address the scarcity of single nucleotide polymorphism (SNP) data provided by authorized databases and to make SNP data more easily and freely accessible, we propose DoGSD (http://dogsd.big.ac.cn), the first Canidae-specific database which focuses on whole genome SNP data from domesticated dogs and grey wolves. The DoGSD is a web-based, open-access resource comprising ∼19 million high-quality whole-genome SNPs. In addition to the dbSNP data set (build 139), DoGSD incorporates a comprehensive collection of SNPs from two newly sequenced samples (1 wolf and 1 dog) and collected SNPs from three recent dog/wolf genetic studies (7 wolves and 68 dogs), which were taken together for analysis with the population genetic statistic, Fst. In addition, DoGSD integrates closely related information including SNP annotation, summary lists of SNPs located in genes, synonymous and non-synonymous SNPs, sampling location and breed information. All these features make DoGSD a useful resource for in-depth analysis in dog-/wolf-related studies. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
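The abstract mentions the population-genetic statistic Fst. As a minimal sketch, one common estimator (Wright's Fst computed from two population allele frequencies, ignoring sample-size corrections, and not necessarily the estimator used by DoGSD) is:

```python
def fst(p_dog, p_wolf):
    """Wright's Fst for one biallelic SNP from two population allele
    frequencies: Fst = (Ht - Hs) / Ht, where Ht is expected heterozygosity
    of the pooled population and Hs the mean within-population value."""
    p_bar = (p_dog + p_wolf) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    h_s = (2 * p_dog * (1 - p_dog) + 2 * p_wolf * (1 - p_wolf)) / 2
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t
```

Fst near 0 means the dog and wolf samples share similar allele frequencies at that SNP; Fst near 1 marks a strongly differentiated site, the kind highlighted in domestication scans.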
Boué, Stéphanie; Talikka, Marja; Westra, Jurjen Willem; Hayes, William; Di Fabio, Anselmo; Park, Jennifer; Schlage, Walter K; Sewer, Alain; Fields, Brett; Ansari, Sam; Martin, Florian; Veljkovic, Emilija; Kenney, Renee; Peitsch, Manuel C; Hoeng, Julia
2015-01-01
With the wealth of publications and data available, powerful and transparent computational approaches are required to represent measured data and scientific knowledge in a computable and searchable format. We developed a set of biological network models, scripted in the Biological Expression Language, that reflect causal signaling pathways across a wide range of biological processes, including cell fate, cell stress, cell proliferation, inflammation, tissue repair and angiogenesis in the pulmonary and cardiovascular context. This comprehensive collection of networks is now freely available to the scientific community in a centralized web-based repository, the Causal Biological Network database, which is composed of over 120 manually curated and well annotated biological network models and can be accessed at http://causalbionet.com. The website is backed by a MongoDB database, which stores all versions of the networks as JSON objects and allows users to search for genes, proteins, biological processes, small molecules and keywords in the network descriptions to retrieve biological networks of interest. The content of the networks can be visualized and browsed. Nodes and edges can be filtered and all supporting evidence for the edges can be browsed and is linked to the original articles in PubMed. Moreover, networks may be downloaded for further visualization and evaluation. Database URL: http://causalbionet.com © The Author(s) 2015. Published by Oxford University Press.
Siew, Joyce Phui Yee; Khan, Asif M; Tan, Paul T J; Koh, Judice L Y; Seah, Seng Hong; Koo, Chuay Yeng; Chai, Siaw Ching; Armugam, Arunmozhiarasi; Brusic, Vladimir; Jeyaseelan, Kandiah
2004-12-12
Sequence annotations and functional and structural data on snake venom neurotoxins (svNTXs) are scattered across multiple databases and literature sources. Sequence annotations and structural data are available in the public molecular databases, while functional data are almost exclusively available in published articles. There is a need for a specialized svNTX database that contains NTX entries which are organized, well annotated and classified in a systematic manner. We have systematically analyzed svNTXs and classified them into structure-function groups based on their structural, functional and phylogenetic properties. Using conserved motifs in each phylogenetic group, we built an intelligent module for the prediction of structural and functional properties of unknown NTXs. We also developed an annotation tool to aid the functional prediction of newly identified NTXs as an additional resource for the venom research community. We created a searchable online database of NTX protein sequences (http://research.i2r.a-star.edu.sg/Templar/DB/snake_neurotoxin). This database can also be found under the Swiss-Prot Toxin Annotation Project website (http://www.expasy.org/sprot/).
Karadimas, H.; Hemery, F.; Roland, P.; Lepage, E.
2000-01-01
In medical software development, the use of databases plays a central role. However, most of these databases have heterogeneous encodings and data models. Dealing with these variations directly in the application code is error-prone and reduces the potential reuse of the produced software. Several approaches to overcome these limitations have been proposed in the medical database literature and are reviewed here. We present a simple solution, based on a Java library and a central metadata description file in XML. This development approach presents several benefits in software design and development cycles, chief among them simplicity of maintenance. PMID:11079915
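The library described is in Java and its metadata schema is not given; the sketch below, in Python with a hypothetical XML schema, only illustrates the general idea of driving record decoding from a central metadata description rather than hard-coding each source's encoding:

```python
import xml.etree.ElementTree as ET

# Hypothetical central metadata file: one <field> per source column,
# declaring its type and, for coded fields, the code-to-label mapping.
METADATA = """
<metadata>
  <field name="sex" type="code">
    <code value="1">male</code>
    <code value="2">female</code>
  </field>
  <field name="weight" type="float" unit="kg"/>
</metadata>
"""

def load_fields(xml_text):
    """Parse the metadata into {field_name: (type, code_map)}."""
    root = ET.fromstring(xml_text)
    fields = {}
    for f in root.findall("field"):
        codes = {c.get("value"): c.text for c in f.findall("code")}
        fields[f.get("name")] = (f.get("type"), codes)
    return fields

def decode(record, fields):
    """Decode one raw record using only the metadata, no per-source code."""
    out = {}
    for name, raw in record.items():
        ftype, codes = fields[name]
        out[name] = codes.get(raw, raw) if ftype == "code" else float(raw)
    return out

fields = load_fields(METADATA)
decoded = decode({"sex": "1", "weight": "72.5"}, fields)
```

Because the encoding knowledge lives in one XML file, adapting the software to a new source database means editing metadata, not application code, which is the maintenance benefit the abstract highlights.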
Li, Jun; Tai, Cui; Deng, Zixin; Zhong, Weihong; He, Yongqun; Ou, Hong-Yu
2017-01-10
VRprofile is a Web server that facilitates rapid investigation of virulence and antibiotic resistance genes, as well as of the transfer-related genetic contexts of these traits, in newly sequenced pathogenic bacterial genomes. Its backend database, MobilomeDB, was first built from sets of known gene cluster loci of bacterial type III/IV/VI/VII secretion systems and mobile genetic elements, including integrative and conjugative elements, prophages, class I integrons, IS elements and pathogenicity/antibiotic resistance islands. VRprofile is thus able to co-localize the homologs of these conserved gene clusters using HMMer or BLASTp searches. With the integration of the homologous gene cluster search module with a sequence composition module, VRprofile has exhibited better performance for island-like region predictions than the other widely used methods. In addition, VRprofile also provides an integrated Web interface for aligning and visualizing identified gene clusters with MobilomeDB-archived gene clusters, or with a variety of bacterial genomes. VRprofile might contribute to meeting the increasing demand for re-annotation of bacterial variable regions, and aid in the real-time definition of disease-relevant gene clusters in pathogenic bacteria of interest. VRprofile is freely available at http://bioinfo-mml.sjtu.edu.cn/VRprofile. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Mozaffari, Mahmood S; Abdelsayed, Rafik; Liu, Jun Yao; Zakhary, Ibrahim; Baban, Babak
2012-02-01
Hallmark features of type 2 diabetes mellitus include glucosuria and polyuria. Further, renal aquaporin 2 is pivotal to regulation of fluid excretion and urine osmolality. Accordingly, we tested the hypothesis that the db/db mouse displays increased glucosuria and fluid excretion but reduced urine osmolality in association with decreased renal aquaporin 2 level. In addition, we examined the effect of chromium picolinate (Cr(pic)3) which is purported to improve glycemic control. The db/db mice excreted more urine in association with marked glucose excretion but lower urine osmolality than db/m control group. Light microscopic examination of renal tissue revealed proliferation of tubular structures in db/db compared to the db/m mice, a feature validated with Ki67 immunostaining. Further, these tubules showed generally similar immunostaining intensity and pattern for aquaporin 2 indicating that proliferated tubules are of distal origin. On the other hand, renal aquaporin 2 protein level was significantly higher in the db/db than db/m group. Treatment of db/db mice with Cr(pic)3 reduced plasma glucose and hemoglobin A1c (~15-17%, p<0.05) and Ki67 positive cells but other parameters were similar to their untreated counterparts. Collectively, these findings suggest that proliferation of renal distal tubules and increased aquaporin 2 level likely represent an adaptive mechanism to regulate fluid excretion to prevent dehydration in the setting of marked glucosuria in the db/db mouse, features not affected by Cr(pic)3 treatment. These observations are of relevance to increasing interest in developing therapeutic agents that facilitate renal glucose elimination. Copyright © 2011 Elsevier Inc. All rights reserved.
AnClim and ProClimDB software for data quality control and homogenization of time series
NASA Astrophysics Data System (ADS)
Stepanek, Petr
2015-04-01
During the last decade, a software package consisting of AnClim, ProClimDB and LoadData for processing (mainly climatological) data has been created. This software offers a comprehensive solution for the processing of climatological time series, from loading the data from a central database (e.g. Oracle, via the LoadData software), through data quality control and homogenization, to time series analysis, extreme value evaluation, and RCM output verification and correction (the ProClimDB and AnClim software). The detection of inhomogeneities is carried out on a monthly scale through the application of AnClim, or newly by R functions called from ProClimDB, while quality control, the preparation of reference series and the correction of detected breaks are carried out by the ProClimDB software. The software combines many statistical tests, types of reference series and time scales (monthly, seasonal and annual, daily and sub-daily). These can be used to create an "ensemble" of solutions, which may be more reliable than any single method. The AnClim software is suitable for educational purposes, e.g. for students getting acquainted with methods used in climatology; built-in graphical tools and comparison of various statistical tests help in better understanding a given method. ProClimDB is, by contrast, a tool aimed at processing large climatological datasets. Recently, functions from R can be called from within the software, making it more efficient in data processing and capable of easily incorporating new methods (when available in R). An example of usage is the easy comparison of methods for the correction of inhomogeneities in daily data (HOM by Paul Della-Marta, the SPLIDHOM method of Olivier Mestre, DAP, the authors' own method, QM of Xiaolan Wang, and others). The software is available, together with further information, at www.climahom.eu. Acknowledgement: this work was partially funded by the project "Building up a multidisciplinary scientific team focused on drought" No. CZ.1.07/2.3.00/20.0248.
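One homogeneity test commonly combined in such ensembles is the standard normal homogeneity test (SNHT). A minimal sketch, not AnClim's implementation, locating the single most likely break point in a (difference) series is:

```python
from statistics import mean, pstdev

def snht_break(series):
    """Locate the most likely break with the SNHT statistic
    T(k) = k*z1^2 + (n-k)*z2^2, where z1, z2 are the means of the
    standardized series before and after candidate break k.
    Assumes the series is not constant (pstdev > 0)."""
    n = len(series)
    mu, sd = mean(series), pstdev(series)
    z = [(x - mu) / sd for x in series]
    best_k, best_t = None, -1.0
    for k in range(1, n):
        z1, z2 = mean(z[:k]), mean(z[k:])
        t = k * z1 ** 2 + (n - k) * z2 ** 2
        if t > best_t:
            best_k, best_t = k, t
    return best_k, best_t

# Synthetic monthly difference series with a step change at index 20
series = [10.0] * 20 + [12.0] * 20
k, t = snht_break(series)
```

In practice the statistic is computed on the difference between the candidate and a reference series, and the maximum T is compared to tabulated critical values before a break is accepted and corrected.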
Integrated Historical Tsunami Event and Deposit Database
NASA Astrophysics Data System (ADS)
Dunbar, P. K.; McCullough, H. L.
2010-12-01
The National Geophysical Data Center (NGDC) provides integrated access to historical tsunami event, deposit, and proxy data. The NGDC tsunami archive initially listed tsunami sources and locations with observed tsunami effects. Tsunami frequency and intensity are important for understanding tsunami hazards. Unfortunately, tsunami recurrence intervals often exceed the historic record. As a result, NGDC expanded the archive to include the Global Tsunami Deposits Database (GTD_DB). Tsunami deposits are the physical evidence left behind when a tsunami impacts a shoreline or affects submarine sediments. Proxies include co-seismic subsidence, turbidite deposits, changes in biota following an influx of marine water in a freshwater environment, etc. By adding past tsunami data inferred from the geologic record, the GTD_DB extends the record of tsunamis backward in time. Although the best methods for identifying tsunami deposits and proxies in the geologic record remain under discussion, developing an overall picture of where tsunamis have affected coasts, calculating recurrence intervals, and approximating runup height and inundation distance provides a better estimate of a region’s true tsunami hazard. Tsunami deposit and proxy descriptions in the GTD_DB were compiled from published data found in journal articles, conference proceedings, theses, books, conference abstracts, posters, web sites, etc. The database now includes over 1,200 descriptions compiled from over 1,100 citations. Each record in the GTD_DB is linked to its bibliographic citation where more information on the deposit can be found. The GTD_DB includes data for over 50 variables such as: event description (e.g., 2010 Chile Tsunami), geologic time period, year, deposit location name, latitude, longitude, country, associated body of water, setting during the event (e.g., beach, lake, river, deep sea), upper and lower contacts, underlying and overlying material, etc. 
If known, the tsunami source mechanism (e.g., earthquake, landslide, volcanic eruption, asteroid impact) is also specified. Observations (grain size, sedimentary structure, bed thickness, number of layers, etc.) are stored along with the conclusions drawn from the evidence by the author (wave height, flow depth, flow velocity, number of waves, etc.). Geologic time periods in the GTD_DB range from Precambrian to Quaternary, but the majority (70%) are from the Quaternary period. This period includes events such as: the 2004 Indian Ocean tsunami, the Cascadia subduction zone earthquakes and tsunamis, the 1755 Lisbon tsunami, the A.D. 79 Vesuvius tsunami, the 3500 BP Santorini caldera collapse and tsunami, and the 7000 BP Storegga landslide-generated tsunami. Prior to the Quaternary period, the majority of the paleotsunamis are due to impact events such as: the Tertiary Chesapeake Bay Bolide, Cretaceous-Tertiary (K/T) Boundary, Cretaceous Manson, and Devonian Alamo. The tsunami deposits are integrated with the historical tsunami event database where applicable. For example, users can search for articles describing deposits related to the 1755 Lisbon tsunami and view those records, as well as link to the related historic event record. The data and information may be viewed using tools designed to extract and display data (selection forms, Web Map Services, and Web Feature Services).
Maignen, Francois; Hauben, Manfred; Hung, Eric; Van Holle, Lionel; Dogne, Jean-Michel
2014-02-01
Masking is a statistical issue by which signals are hidden by the presence of other medicines in the database. The impact of the masking effect on signal-detection algorithms has not been fully investigated. Our study aimed to assess the extent and impact of the masking effect in two large spontaneous reporting databases. We performed a cross-sectional study using a set of adverse-event terms of importance for public health in two spontaneous reporting databases: EudraVigilance (EV) and the Pfizer spontaneous reporting database (PfDB). Using the masking ratio, we identified and removed the products inducing the highest masking effect. Of a total of almost 50,000 drug-event combinations (DECs) studied, approximately 60% were masked by another product with a masking ratio >1 in EV, and 84% in PfDB. Important masking was rare (0.003% of DECs) and mainly affected events rarely reported in EV. The products involved in the highest masking effects were products known to induce the reaction. Removing the masking effect of the highest-masking product revealed 974 signals of disproportionate reporting in EV, including true signals. The study shows that the original ranking provided by the quantitative methods included in our study is only marginally affected by removal of the masking product. Our findings suggest that significant masking is rare in large spontaneous databases and mostly affects events rarely reported in EV. Copyright © 2013 John Wiley & Sons, Ltd.
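The masking mechanism can be illustrated with a proportional reporting ratio (PRR), a common disproportionality measure. The counts below are invented, and the PRR here stands in for whichever measure a given analysis uses (the study's masking ratio itself is defined differently); the point is only that a product contributing many event reports to the comparator can suppress another product's signal.

```python
# Hedged sketch of disproportionality (PRR) and of how removing a heavily
# reporting "masking" product from the background can unmask a signal.
# All counts are hypothetical.
def prr(a, b, c, d):
    """Proportional reporting ratio for a drug-event combination (DEC).
    a: target drug & event, b: target drug & other events,
    c: other drugs & event, d: other drugs & other events."""
    return (a / (a + b)) / (c / (c + d))

# Background inflated by a masking product that reports the event often:
masked = prr(a=20, b=980, c=500, d=9500)
# Background after removing the masking product's 450 event reports:
unmasked = prr(a=20, b=980, c=50, d=9500)
print(round(masked, 2), round(unmasked, 2))  # 0.4 3.82
```

With the masking product included, the DEC sits below the usual signal threshold; removing it raises the ratio well above 1.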
A public HTLV-1 molecular epidemiology database for sequence management and data mining.
Araujo, Thessika Hialla Almeida; Souza-Brito, Leandro Inacio; Libin, Pieter; Deforche, Koen; Edwards, Dustin; de Albuquerque-Junior, Antonio Eduardo; Vandamme, Anne-Mieke; Galvao-Castro, Bernardo; Alcantara, Luiz Carlos Junior
2012-01-01
It is estimated that 15 to 20 million people are infected with the human T-cell lymphotropic virus type 1 (HTLV-1). At present, more than 2,000 unique HTLV-1 isolate sequences have been published. A central database aggregating sequence information across epidemiological aspects of HTLV-1 infection, pathogenesis, origins, and evolutionary dynamics would be useful to scientists and physicians worldwide. Here we describe a database we have developed that collects and annotates sequence data and can be accessed through a user-friendly search interface. The HTLV-1 Molecular Epidemiology Database website is available at http://htlv1db.bahia.fiocruz.br/. All data were obtained from publications available in GenBank or through contact with the authors. The database was developed using Apache Webserver 2.1.6 and the MySQL database management system. The webpage interfaces were developed in HTML, with server-side scripting written in PHP. The HTLV-1 Molecular Epidemiology Database is hosted on the Gonçalo Moniz/FIOCRUZ Research Center server. There are currently 2,457 registered sequences, with 2,024 (82.37%) representing unique isolates. Of these sequences, 803 (39.67%) contain information about clinical status (TSP/HAM, 17.19%; ATL, 7.41%; asymptomatic, 12.89%; other diseases, 2.17%; and no information, 60.32%). Further, 7.26% of sequences contain information on patient gender, while 5.23% provide the age of the patient. The HTLV-1 Molecular Epidemiology Database retrieves and stores annotated HTLV-1 proviral sequences from clinical, epidemiological, and geographical studies. The collected sequences and related information are now accessible on a publicly available and user-friendly website. This open-access database will support clinical research and vaccine development related to viral genotype.
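A minimal sketch of the kind of sequence-isolate table such a database implies, and of the annotation-coverage figures quoted above, might look as follows. The original uses MySQL and PHP; sqlite3 is used here for a self-contained example, and the table, column names, and rows are assumptions, not the real schema.

```python
import sqlite3

# Hypothetical isolate table: accession, optional clinical annotation, country.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE isolate (
    accession TEXT PRIMARY KEY,
    clinical_status TEXT,   -- e.g. 'TSP/HAM', 'ATL', 'asymptomatic', or NULL
    country TEXT)""")
conn.executemany("INSERT INTO isolate VALUES (?, ?, ?)", [
    ("SEQ001", "TSP/HAM", "Brazil"),
    ("SEQ002", "ATL", "Japan"),
    ("SEQ003", None, "Brazil"),
    ("SEQ004", "asymptomatic", "Peru")])

# Count isolates carrying a clinical-status annotation -- the same kind of
# coverage statistic the abstract reports (39.67% for the real database).
n_annotated = conn.execute(
    "SELECT COUNT(*) FROM isolate WHERE clinical_status IS NOT NULL"
).fetchone()[0]
print(n_annotated)  # 3
```

The same `COUNT ... WHERE ... IS NOT NULL` pattern yields the gender and age coverage percentages mentioned in the abstract.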
D'haenens, Wendy; Dhooge, Ingeborg; De Vel, Eddy; Maes, Leen; Bockstael, Annelies; Vinck, Bart M
2007-08-01
The present study utilized a commercially available multiple auditory steady-state response (ASSR) system to test normal-hearing adults (n=55). The primary objective was to evaluate the impact of the mixed modulation (MM) and the newly proposed exponential AM(2)/FM stimuli on the signal-to-noise ratio (SNR) and threshold estimation accuracy, through a within-subject comparison. The second aim was to establish a normative database for both stimulus types. The results demonstrated that the AM(2)/FM and MM stimuli had a similar effect on the SNR, whereas the ASSR threshold results revealed that the AM(2)/FM stimulus produced better thresholds than the MM stimulus for the 500, 1000, and 4000 Hz carrier frequencies. The mean difference scores to tones of 500, 1000, 2000, and 4000 Hz were, for the MM stimulus: 20+/-12, 14+/-9, 10+/-8, and 12+/-8 dB; and for the AM(2)/FM stimulus: 18+/-13, 12+/-8, 11+/-8, and 10+/-8 dB, respectively. The current research confirms that the AM(2)/FM stimulus can be used efficiently to test normal-hearing adults.
PIPEMicroDB: microsatellite database and primer generation tool for pigeonpea genome
Sarika; Arora, Vasu; Iquebal, M. A.; Rai, Anil; Kumar, Dinesh
2013-01-01
Molecular markers play a significant role in crop improvement for desirable characteristics, such as high yield and disease resistance, that benefit the crop in the long term. Pigeonpea (Cajanus cajan L.) is a recently sequenced legume, sequenced by a global consortium led by ICRISAT (Hyderabad, India) and analysed for gene prediction, synteny maps, markers, etc. We present the PIgeonPEa Microsatellite DataBase (PIPEMicroDB) with an automated primer designing tool for the pigeonpea genome, supporting chromosome-wise as well as location-wise searches for primers. A total of 123,387 short tandem repeats (STRs) were extracted from the publicly available pigeonpea genome using the MIcroSAtellite tool (MISA). The database is an online relational database based on a ‘three-tier architecture’ that catalogues microsatellite information in MySQL, with a user-friendly interface developed in PHP. Searches for STRs may be customized by limiting their location on a chromosome as well as the number of markers in that range. This is a novel approach that has not been implemented in any existing marker database. The database is further integrated with Primer3 for primer design on selected markers, with left and right flanking regions of up to 500 bp. This enables researchers to select markers of choice at desired intervals along a chromosome. Furthermore, one can use individual STRs of a targeted region to narrow down the location of a gene of interest or linked quantitative trait loci (QTLs). Although this is an in silico approach, marker searches based on the characteristics and location of STRs are expected to benefit researchers. Database URL: http://cabindb.iasri.res.in/pigeonpea/ PMID:23396298
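The STR extraction step described above can be sketched as a perfect-repeat search over the genome sequence. This is a simplified, MISA-style detector; the minimum-repeat thresholds below follow common MISA defaults but are assumptions, not the pigeonpea-specific settings.

```python
import re

# motif length -> minimum repeat count (assumed MISA-like defaults)
MIN_REPEATS = {1: 10, 2: 6, 3: 5}

def find_ssrs(seq):
    """Return (start, motif, repeats) for each perfect SSR found in seq."""
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # group 2 captures the motif; \2+ requires it to repeat immediately
        for m in re.finditer(r"(([ACGT]{%d})\2+)" % motif_len, seq):
            motif, run = m.group(2), m.group(1)
            repeats = len(run) // motif_len
            # skip homopolymer runs re-matched as longer motifs (e.g. 'AA')
            if repeats >= min_rep and (motif_len == 1 or len(set(motif)) > 1):
                hits.append((m.start(), motif, repeats))
    return hits

print(find_ssrs("GG" + "AT" * 7 + "CCC"))  # [(2, 'AT', 7)]
```

A real pipeline would also merge compound SSRs and feed the flanking sequence of each hit to Primer3, as the database does for its stored markers.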
Improved cerebral energetics and ketone body metabolism in db/db mice
Andersen, Jens V; Christensen, Sofie K; Nissen, Jakob D
2016-01-01
It is becoming evident that type 2 diabetes mellitus affects brain energy metabolism. The importance of alternative substrates for the brain in type 2 diabetes mellitus is poorly understood. The aim of this study was to investigate whether ketone bodies are relevant candidates to compensate for cerebral glucose hypometabolism, and to unravel the functionality of cerebral mitochondria in type 2 diabetes mellitus. Acutely isolated cerebral cortical and hippocampal slices of db/db mice were incubated in media containing [U-13C]glucose, [1,2-13C]acetate or [U-13C]β-hydroxybutyrate, and tissue extracts were analysed by mass spectrometry. Oxygen consumption and ATP synthesis of brain mitochondria of db/db mice were assessed by Seahorse XFe96 and luciferin-luciferase assay, respectively. Glucose hypometabolism was observed for both cerebral cortical and hippocampal slices of db/db mice. Significantly increased metabolism of [1,2-13C]acetate and [U-13C]β-hydroxybutyrate was observed for hippocampal slices of db/db mice. Furthermore, brain mitochondria of db/db mice exhibited elevated oxygen consumption and ATP synthesis rates. This study provides evidence of several changes in brain energy metabolism in type 2 diabetes mellitus. The increased hippocampal ketone body utilization and improved mitochondrial function in db/db mice may act as adaptive mechanisms to maintain cerebral energetics during hampered glucose metabolism. PMID:28058963
Curating and Preserving the Big Canopy Database System: an Active Curation Approach using SEAD
NASA Astrophysics Data System (ADS)
Myers, J.; Cushing, J. B.; Lynn, P.; Weiner, N.; Ovchinnikova, A.; Nadkarni, N.; McIntosh, A.
2015-12-01
Modern research is increasingly dependent upon highly heterogeneous data and on the associated cyberinfrastructure developed to organize, analyze, and visualize that data. However, due to the complexity and custom nature of such combined data-software systems, it can be very challenging to curate and preserve them for the long term at reasonable cost and in a way that retains their scientific value. In this presentation, we describe how this challenge was met in preserving the Big Canopy Database (CanopyDB) system using an agile approach and leveraging the Sustainable Environment - Actionable Data (SEAD) DataNet project's hosted data services. The CanopyDB system was developed over more than a decade at Evergreen State College to address the needs of forest canopy researchers. It is an early yet sophisticated exemplar of the type of system that has become common in biological research and science in general, including multiple relational databases for different experiments, a custom database generation tool used to create them, an image repository, and desktop and web tools to access, analyze, and visualize this data. SEAD provides secure project spaces with a semantic content abstraction (typed content with arbitrary RDF metadata statements and relationships to other content), combined with a standards-based curation and publication pipeline resulting in packaged research objects with Digital Object Identifiers. Using SEAD, our cross-project team was able to incrementally ingest CanopyDB components (images, datasets, software source code, documentation, executables, and virtualized services) and to iteratively define and extend the metadata and relationships needed to document them. We believe that both the process, and the richness of the resultant standards-based (OAI-ORE) preservation object, hold lessons for the development of best-practice solutions for preserving scientific data in association with the tools and services needed to derive value from it.
EuroPineDB: a high-coverage web database for maritime pine transcriptome
2011-01-01
Background Pinus pinaster is an economically and ecologically important species that is becoming a woody gymnosperm model. Its enormous genome size makes whole-genome sequencing approaches hard to apply. Therefore, the expressed portion of the genome has to be characterised and the results and annotations have to be stored in dedicated databases. Description EuroPineDB is the largest sequence collection available for a single pine species, Pinus pinaster (maritime pine), since it comprises 951 641 raw sequence reads obtained from non-normalised cDNA libraries and high-throughput sequencing from adult (xylem, phloem, roots, stem, needles, cones, strobili) and embryonic (germinated embryos, buds, callus) maritime pine tissues. Using open-source tools, sequences were optimally pre-processed, assembled, and extensively annotated (GO, EC and KEGG terms, descriptions, SNPs, SSRs, ORFs and InterPro codes). As a result, a 10.5× P. pinaster genome was covered and assembled in 55 322 UniGenes. A total of 32 919 (59.5%) of P. pinaster UniGenes were annotated with at least one description, revealing at least 18 466 different genes. The complete database, which is designed to be scalable, maintainable, and expandable, is freely available at: http://www.scbi.uma.es/pindb/. It can be queried by gene library, pine species, annotation, UniGene and microarray (i.e., the sequences are distributed in two-colour microarrays; this is the only conifer database that provides this information) and will be periodically updated. Small assemblies can be viewed using a dedicated visualisation tool that connects them with SNPs. Any sequence or annotation set shown on-screen can be downloaded. Retrieval mechanisms for sequences and gene annotations are provided.
Conclusions The EuroPineDB with its integrated information can be used to reveal new knowledge, offers an easy-to-use collection of information to directly support experimental work (including microarray hybridisation), and provides deeper knowledge on the maritime pine transcriptome. PMID:21762488
Mangiola, Stefano; Young, Neil D; Korhonen, Pasi; Mondal, Alinda; Scheerlinck, Jean-Pierre; Sternberg, Paul W; Cantacessi, Cinzia; Hall, Ross S; Jex, Aaron R; Gasser, Robin B
2013-12-01
Compounded by a massive global food shortage, many parasitic diseases have a devastating, long-term impact on animal and human health and welfare worldwide. Parasitic helminths (worms) affect the health of billions of animals. Unlocking the systems biology of these neglected pathogens will underpin the design of new and improved interventions against them. Currently, the functional annotation of genomic and transcriptomic sequence data for socio-economically important parasitic worms relies almost exclusively on comparative bioinformatic analyses using model organism- and other databases. However, many genes and gene products of parasitic helminths (often >50%) cannot be annotated using this approach, because they are specific to parasites and/or do not have identifiable homologs in other organisms for which sequence data are available. This inability to fully annotate transcriptomes and predicted proteomes is a major challenge and constrains our understanding of the biology of parasites, interactions with their hosts and of parasitism and the pathogenesis of disease on a molecular level. In the present article, we compiled transcriptomic data sets of key, socioeconomically important parasitic helminths, and constructed and validated a curated database, called HelmDB (www.helmdb.org). We demonstrate how this database can be used effectively for the improvement of functional annotation by employing data integration and clustering. Importantly, HelmDB provides a practical and user-friendly toolkit for sequence browsing and comparative analyses among divergent helminth groups (including nematodes and trematodes), and should be readily adaptable and applicable to a wide range of other organisms. This web-based, integrative database should assist 'systems biology' studies of parasitic helminths, and the discovery and prioritization of novel drug and vaccine targets. 
This focus provides a pathway toward developing new and improved approaches for the treatment and control of parasitic diseases, with the potential for important biotechnological outcomes. Copyright © 2012 Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Vaughn, Kelley; Hales, Cindy; Bush, Marta; Fox, James
1998-01-01
Describes implementation of functional behavioral assessment (FBA) through collaboration between a university (East Tennessee State University) and the local school system. Discusses related issues such as factors in team training, team size, FBA adaptations, and replicability of the FBA team model. (Author/DB)
Introducing glycomics data into the Semantic Web.
Aoki-Kinoshita, Kiyoko F; Bolleman, Jerven; Campbell, Matthew P; Kawano, Shin; Kim, Jin-Dong; Lütteke, Thomas; Matsubara, Masaaki; Okuda, Shujiro; Ranzinger, Rene; Sawaki, Hiromichi; Shikanai, Toshihide; Shinmachi, Daisuke; Suzuki, Yoshinori; Toukach, Philip; Yamada, Issaku; Packer, Nicolle H; Narimatsu, Hisashi
2013-11-26
Glycoscience is a research field focusing on complex carbohydrates (otherwise known as glycans), which can, for example, serve as "switches" that toggle between different functions of a glycoprotein or glycolipid. Due to the advancement of glycomics technologies that are used to characterize glycan structures, many glycomics databases are now publicly available and provide useful information for glycoscience research. However, these databases have almost no links to other life science databases. In order to implement support for the Semantic Web most efficiently for glycomics research, the developers of major glycomics databases agreed on a minimal standard for representing glycan structure and annotation information using RDF (Resource Description Framework). Moreover, all of the participants implemented this standard prototype and generated preliminary RDF versions of their data. To test the utility of the converted data, all of the data sets were uploaded into a Virtuoso triple store, and several SPARQL queries were tested as "proofs-of-concept" to illustrate the utility of the Semantic Web in querying across databases, which was originally difficult to implement. We were able to successfully retrieve information by linking UniCarbKB, GlycomeDB and JCGGDB in a single SPARQL query to obtain our target information. We also tested queries linking UniProt with GlycoEpitope, as well as lectin data with GlycomeDB through PDB. As a result, we have been able to link proteomics data with glycomics data through the implementation of Semantic Web technologies, allowing for more flexible queries across these domains.
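The cross-database linking described above boils down to joining records from different resources on a shared identifier, which a federated SPARQL query expresses declaratively. The toy sketch below shows the same join imperatively; all identifiers and fields are hypothetical, not real GlycomeDB or UniCarbKB data.

```python
# Hypothetical slice of two glycomics resources, keyed by structure ID.
glycomedb = {  # structure id -> taxonomic annotation
    "G001": {"taxon": "Homo sapiens"},
    "G002": {"taxon": "Mus musculus"},
}
unicarb = [  # glycoprotein records referencing structure ids
    {"protein": "P12345", "structure": "G001"},
    {"protein": "P67890", "structure": "G003"},  # no match in glycomedb
]

# The join a single SPARQL query would express across both RDF datasets:
linked = [(r["protein"], glycomedb[r["structure"]]["taxon"])
          for r in unicarb if r["structure"] in glycomedb]
print(linked)  # [('P12345', 'Homo sapiens')]
```

The value of the shared RDF standard is precisely that such joins need no per-database conversion code: once every resource exposes the same predicates, the triple store performs the join.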
Use of a Universal Identifier by Grant Applicants
OMB has issued a policy directive to implement the requirement that grant applicants provide a Dun & Bradstreet (D&B) Data Universal Numbering System (DUNS) number when applying for Federal grants or cooperative agreements on or after October 1, 2003.
Exercise Training Prevents Coronary Endothelial Dysfunction in Type 2 Diabetic Mice.
Lee, Sewon; Park, Yoonjung; Zhang, Cuihua
2011-10-01
Type 2 diabetes (T2D) is a leading risk factor for cardiovascular diseases including atherosclerosis and coronary heart disease. Exercise training (ET) is thought to have a beneficial effect on these disorders, but the basis for this effect is not fully understood. Because endothelial dysfunction plays a key role in the pathological events leading to cardiovascular complications in T2D, we hypothesized that the effects of ET would be evidenced by improvements in coronary endothelial function. To test this hypothesis, we assessed the effects of ET on vascular function of diabetic (db/db, Lepr(db)) mice by evaluating endothelial function of isolated coronary arterioles of wild-type (WT) and db/db mice with or without ET. Although dilation of vessels to the endothelium-independent vasodilator sodium nitroprusside was not different between db/db and WT, dilation to the endothelium-dependent agonist acetylcholine (ACh) was impaired in db/db compared with WT mice. Vasodilation to ACh was restored in db/db with ET, and insulin sensitivity was improved in db/db after ET. Exercise did not change the body weight of db/db mice, but superoxide dismutase (SOD1 and SOD2) and phosphorylated eNOS (Ser1177) protein expression in heart tissue was up-regulated, whereas the tumor necrosis factor-alpha (TNF-α) protein level was decreased by ET. The serum level of interleukin-6 (IL-6) was higher in db/db mice, but ET decreased IL-6. This suggests that ET may improve endothelial function by increasing nitric oxide bioavailability as well as decreasing chronic inflammation. We suggest this connection may be the basis for the benefit of ET in T2D.
Li, Yi-Cheng; Chi, Yu-Chieh; Cheng, Min-Chi; Lu, I-Cheng; Chen, Jason; Lin, Gong-Ru
2013-07-15
The coherent injection-locking and direct modulation of a long-cavity colorless laser diode with 1% end-facet reflectance and weak-resonant longitudinal modes is employed as a universal optical transmitter to demonstrate optical 16-QAM OFDM transmission at 12 Gbit/s over 25 km in a DWDM-PON system. An optimized bias current of 30 mA (~1.5 Ith), with a corresponding extinction ratio (ER) of 6 dB, and an external injection power of -9 dBm are required for such a wavelength-locked universal transmitter to carry the 16-QAM, 122-subcarrier formatted OFDM data stream. By increasing the external injection-locking power from -9 dBm to 0 dBm, the peak-to-peak chirp of the OFDM data stream is reduced from 7.7 to 5.4 GHz. A side-mode suppression ratio (SMSR) of up to 50 dB is achieved over a wide detuning range between -0.5 nm and 2.0 nm under an injection power of 0 dBm. By modulating such a colorless laser diode with an OFDM data stream of 122 subcarriers at a central carrier frequency of 1.5625 GHz and a total bandwidth of 3 GHz, a transmission data rate of up to 12 Gbit/s in standard single-mode fiber over 25 km is demonstrated, achieving an error vector magnitude (EVM) of 5.435%. Such a universal colorless DWDM-PON transmitter can deliver the 12-Gbit/s QAM-OFDM data stream after 25-km transmission with a receiving power sensitivity of -7 dBm at a BER of 3.6 × 10(-7) when pre-amplifying the OFDM data by 5 dB.
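The EVM figure quoted above is the RMS error between received and ideal constellation points, normalized to the RMS of the ideal constellation. A minimal sketch of that calculation follows; the symbol values are made up for illustration and use only a 4-point slice of the 16-QAM constellation.

```python
import math

def evm_percent(received, ideal):
    """RMS error-vector magnitude in percent, normalized to reference RMS."""
    err = sum(abs(r - s) ** 2 for r, s in zip(received, ideal))
    ref = sum(abs(s) ** 2 for s in ideal)
    return 100 * math.sqrt(err / ref)

ideal = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]              # hypothetical symbols
received = [1.05 + 0.95j, -1.1 + 1j, -1 - 1j, 1 - 0.9j]  # with channel noise
print(round(evm_percent(received, ideal), 2))  # 5.59
```

Lower EVM corresponds to a tighter constellation and hence a lower BER, which is why the reported 5.435% EVM supports the quoted 3.6 × 10(-7) error rate.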
[Standardization of terminology in laboratory medicine I].
Yoon, Soo Young; Yoon, Jong Hyun; Min, Won Ki; Lim, Hwan Sub; Song, Junghan; Chae, Seok Lae; Lee, Chang Kyu; Kwon, Jung Ah; Lee, Kap No
2007-04-01
Standardization of medical terminology is essential for data transmission between health-care institutions or clinical laboratories and for maximizing the benefits of information technology. The purpose of our study was to standardize the medical terms used in the clinical laboratory, such as test names, units, and terms used in result descriptions. During the first year of the study, we developed a standard database of concept names for laboratory terms, covering the terms used in government health care centers, their branch offices, and primary health care units. Laboratory terms were collected from the electronic data interchange (EDI) codes of the National Health Insurance Corporation (NHIC), the Logical Observation Identifier Names and Codes (LOINC) database, community health centers and their branch offices, and the clinical laboratories of representative university medical centers. For standard expression, we referred to the English-Korean/Korean-English medical dictionary of the Korean Medical Association and the rules for foreign-language translation. Programs for mapping between the LOINC database and EDI codes and for translating English to Korean were developed. A Korean standard laboratory terminology database containing six axial concept names (component, property, time aspect, system (specimen), scale type, and method type) was established for 7,508 test observations. Short names and a mapping table for EDI codes and the Unified Medical Language System (UMLS) were added. Synonym tables for concept names, words used in the database, and the six axial terms were prepared to make it easier to find the standard terminology from common terms used in the field of laboratory medicine. Here we report, for the first time, a Korean standard laboratory terminology database for test names, result description terms, and result units, covering most laboratory tests in primary healthcare centers.
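The synonym-table lookup described above can be sketched as a two-step resolution: normalize a locally used test name to a standard concept, then return that concept's mapped codes. All names and codes below are invented placeholders, not actual LOINC or EDI entries.

```python
# Hypothetical synonym table: local/common term -> standard concept name.
synonyms = {
    "hb": "hemoglobin",
    "hgb": "hemoglobin",
    "glucose, fasting": "fasting glucose",
}
# Hypothetical mapping table: standard concept -> placeholder codes.
standard = {
    "hemoglobin": {"loinc": "XXXX-0", "edi": "E0001"},
    "fasting glucose": {"loinc": "YYYY-1", "edi": "E0002"},
}

def resolve(term):
    """Map a field term to its standard concept's codes, or None if unknown."""
    key = term.lower().strip()
    concept = synonyms.get(key, key)
    return standard.get(concept)

print(resolve("HGB"))  # {'loinc': 'XXXX-0', 'edi': 'E0001'}
```

The real database adds five more axes (property, time aspect, specimen, scale, method) so that one test name can resolve to distinct observations, but the lookup pattern is the same.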
A Systematic Review of Yoga for Balance in a Healthy Population
Nkodo, Amélie-Françoise; Moonaz, Steffany Haaz; Dagnelie, Gislin
2014-01-01
Abstract Objective: A systematic review was done of the evidence on yoga for improving balance. Design: Relevant articles and reviews were identified in major databases (PubMed, MEDLINE®, IndMed, Web of Knowledge, EMBASE, EBSCO, Science Direct, and Google Scholar), and their reference lists searched. Key search words were yoga, balance, proprioception, falling, fear of falling, and falls. Included studies were peer-reviewed articles published in English before June 2012, using healthy populations. All yoga styles and study designs were included. Two (2) raters individually rated study quality using the Downs & Black (DB) checklist. Final scores were achieved by consensus. Achievable scores ranged from 0 to 27. Effect size (ES) was calculated where possible. Results: Fifteen (15) of 152 studies (age range 10–93, n=688) met the inclusion criteria: 5 randomized controlled trials (RCTs), 4 quasi-experimental, 2 cross-sectional, and 4 single-group designs. DB scores ranged from 10 to 24 (RCTs), 14–19 (quasi-experimental), 6–12 (cross-sectional), and 11–20 (single group). Studies varied by yoga style, frequency of practice, and duration. Eleven (11) studies found positive results (p<0.05) on at least one balance outcome. ES ranged from −0.765 to 2.71 (for 8 studies) and was not associated with DB score. Conclusions: Yoga may have a beneficial effect on balance, but variable study design and poor reporting quality obscure the results. Balance as an outcome is underutilized, and more probing measures are needed. PMID:24517304
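The effect sizes (ES) reported above are typically standardized mean differences such as Cohen's d with a pooled standard deviation; the review does not state which formula each study used, so the sketch below shows the common pooled-SD form with hypothetical group statistics.

```python
import math

def cohens_d(m1, s1, n1, m2, s2, n2):
    """Cohen's d: standardized mean difference with pooled SD.
    m = mean, s = standard deviation, n = sample size per group."""
    pooled = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled

# Hypothetical balance scores: yoga group vs. control, 20 subjects each.
print(round(cohens_d(12.0, 2.0, 20, 10.0, 2.0, 20), 2))  # 1.0
```

By convention, d around 0.2 is small, 0.5 medium, and 0.8 large, which puts the review's upper values (up to 2.71) in the very large range.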
Noise pollution in intensive care units and emergency wards.
Khademi, Gholamreza; Roudi, Masoumeh; Shah Farhat, Ahmad; Shahabian, Masoud
2011-01-01
The improvement of technology has raised noise levels in hospital wards above the international standard levels (35-45 dB). Noise levels above the maximum result in patient instability and dissatisfaction. Moreover, they have serious negative effects on the staff's health and the quality of their services. The purpose of this survey was to analyze the level of noise in intensive care units and emergency wards of the Imam Reza Teaching Hospital, Mashhad. This research was carried out in November 2009 during morning shifts between 7:30 and 12:00. Noise levels were measured 10 times at 30-minute intervals in the nursing stations of 10 wards of the emergency department, the intensive care units, and the Nephrology and Kidney Transplant Departments of Imam Reza University Hospital, Mashhad. The noise level in the nursing stations was measured as both the maximum level (Lmax) and the equivalent continuous level (Leq). The analysis was based on comparison of equivalent levels (Leq), because maximum levels were unstable. In our survey the average level (Leq) in all wards was much higher than the standard level. The maximum level (Lmax) in most wards was 85-86 dB, and in one measurement in the Internal ICU it reached 94 dB. The average Leq across all wards was 60.2 dB. In the emergency units it was 62.2 dB, but this was not time related. The highest average level (Leq) was measured at 11:30 AM, and the peak was measured in the Nephrology nursing station. The average noise levels in the intensive care units and emergency wards exceeded the standard levels; since these wards play vital roles in treatment procedures, more attention is needed in this area.
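The equivalent continuous level (Leq) used above is an energy average of the sampled sound pressure levels, not an arithmetic mean of the dB values. A minimal sketch of the calculation, with hypothetical readings:

```python
import math

def leq(levels_db):
    """Equivalent continuous sound level from equal-duration SPL samples:
    energy-average the levels, then convert back to dB."""
    mean_energy = sum(10 ** (l / 10) for l in levels_db) / len(levels_db)
    return 10 * math.log10(mean_energy)

print(round(leq([60, 60, 60, 60]), 1))  # 60.0 -- constant level is unchanged
print(round(leq([55, 55, 55, 85]), 1))  # 79.0 -- one loud reading dominates
```

This is why unstable maxima (Lmax) are a poor basis for comparison: a single brief peak pulls Leq up far more than an arithmetic average would suggest.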
Collaborative WiFi Fingerprinting Using Sensor-Based Navigation on Smartphones.
Zhang, Peng; Zhao, Qile; Li, You; Niu, Xiaoji; Zhuang, Yuan; Liu, Jingnan
2015-07-20
This paper presents a method that trains the WiFi fingerprint database using sensor-based navigation solutions. Since micro-electromechanical systems (MEMS) sensors provide only a short-term accuracy but suffer from the accuracy degradation with time, we restrict the time length of available indoor navigation trajectories, and conduct post-processing to improve the sensor-based navigation solution. Different middle-term navigation trajectories that move in and out of an indoor area are combined to make up the database. Furthermore, we evaluate the effect of WiFi database shifts on WiFi fingerprinting using the database generated by the proposed method. Results show that the fingerprinting errors will not increase linearly according to database (DB) errors in smartphone-based WiFi fingerprinting applications.
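Once such a fingerprint database has been trained, positioning reduces to matching an observed received-signal-strength (RSS) vector against the stored reference points, commonly by nearest neighbour in RSS space. The sketch below illustrates that matching step only (not the paper's sensor-based training); coordinates and RSS values in dBm are made up.

```python
import math

# Hypothetical fingerprint database: reference point (x, y) -> AP RSS map.
fingerprint_db = {
    (0.0, 0.0): {"ap1": -40, "ap2": -70},
    (5.0, 0.0): {"ap1": -60, "ap2": -50},
    (0.0, 5.0): {"ap1": -70, "ap2": -45},
}

def locate(observed):
    """Return the reference point whose fingerprint is nearest in RSS space."""
    def dist(ref):
        common = set(ref) & set(observed)  # compare only APs seen in both
        return math.sqrt(sum((ref[ap] - observed[ap]) ** 2 for ap in common))
    return min(fingerprint_db, key=lambda p: dist(fingerprint_db[p]))

print(locate({"ap1": -58, "ap2": -52}))  # (5.0, 0.0)
```

The paper's finding that fingerprinting error does not grow linearly with database error fits this picture: the match only needs the correct reference point to remain nearest, not the stored RSS values to be exact.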
The Results of Development of the Project ZOOINT and its Future Perspectives
NASA Astrophysics Data System (ADS)
Smirnov, I. S.; Lobanov, A. L.; Alimov, A. F.; Medvedev, S. G.; Golikov, A. A.
Work on computerizing the main processes of accumulating and analyzing collection, expert, and literature data on the systematics and faunistics of various animal taxa (the basis for studying biological diversity) began at the Zoological Institute in 1987. In 1991 the idea arose of creating a software package, the ZOOlogical INTegrated system (ZOOINT), that could load collection data and at the same time allow the accumulated data to be analyzed through various queries. During its execution the ZOOINT project changed somewhat and produced results slightly different from those originally planned, but even more valuable. A web site about the information retrieval system (IRS) ZOOINT was also built. Remote access to the taxonomic information, with the possibility of working with the databases (DB) of the IRS ZOOINT online, was planned. This required not only upgrading the developers' and users' computers but also mastering new software: HTML, the Windows NT operating system, and Active Server Pages (ASP) technology. One of the serious problems in creating zoological databases and information retrieval systems is the representation of hierarchical classifications. This problem was solved by building classifiers, specialized standard taxonomic databases named ZOOCOD. The recent growth in attempts to create taxonomic electronic lists, tables, and databases has made it necessary to develop basic rules for unifying zoological systematic databases. These rules are intended for application in institutes of biological profile where computerization is proceeding very slowly and database construction is still in a rudimentary state.
These positions and standards for constructing biological (taxonomic) databases should facilitate communication among biologists, the adoption in the near future of the most advanced database development technologies (for example, the XML platform), and, eventually, the construction of modern information systems. The work on the project was supported by RFBR grant N 02-07-90217, the program "The Information System on the Biodiversity of Russia", and Project N 15 "Antarctic Regions".
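The hierarchical-classification problem that the ZOOCOD classifiers address is commonly handled in relational databases with an adjacency list, where each taxon stores a reference to its parent. A minimal sketch with hypothetical taxa (not the actual ZOOCOD structure):

```python
# Adjacency-list representation of a taxonomic hierarchy:
# taxon name -> parent name (None marks the root).
taxa = {
    "Animalia":   None,
    "Arthropoda": "Animalia",
    "Insecta":    "Arthropoda",
    "Diptera":    "Insecta",
}

def lineage(taxon):
    """Walk parent links from a taxon up to the root."""
    chain = []
    while taxon is not None:
        chain.append(taxon)
        taxon = taxa[taxon]
    return chain

print(lineage("Diptera"))
```

In SQL terms this is a self-referencing foreign key (`parent_id`), which keeps updates cheap; the trade-off is that retrieving a full lineage requires a walk or a recursive query.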
ExpoCastDB: A Publicly Accessible Database for Observational Exposure Data
The application of environmental informatics tools for human health risk assessment will require the development of advanced exposure information technology resources. Exposure data for chemicals is often not readily accessible. There is a pressing need for easily accessible, che...
Classification and Dose-Response Characterization of ...
Thirty years and over a billion of today's dollars' worth of pesticide registration toxicity studies, historically stored as hardcopy and scanned documents, have been digitized into highly standardized and structured toxicity data within the U.S. Environmental Protection Agency's (EPA) Toxicity Reference Database (ToxRefDB). The source toxicity data in ToxRefDB cover multiple study types, including subchronic, developmental, reproductive, chronic, and cancer studies, resulting in a diverse set of endpoints and toxicities. Novel approaches to chemical classification are performed as a model application of ToxRefDB and to meet an essential need for highly detailed chemical classifications within the EPA's ToxCast™ research program. In order to develop predictive models and biological signatures utilizing high-throughput screening (HTS) and in vitro genomic data, endpoints and toxicities must first be identified and globally characterized for the ToxCast Phase I chemicals. Secondarily, dose-response characterization within and across toxicity endpoints provides insight into key precursor toxicity events and overall endpoint relevance. Toxicity-based chemical classification and dose-response characterization utilizing ToxRefDB prioritized toxicity endpoints and differentiated toxicity outcomes across a large chemical set.
Automated CFD Parameter Studies on Distributed Parallel Computers
NASA Technical Reports Server (NTRS)
Rogers, Stuart E.; Aftosmis, Michael; Pandya, Shishir; Tejnil, Edward; Ahmad, Jasim; Kwak, Dochan (Technical Monitor)
2002-01-01
The objective of the current work is to build a prototype software system that automates the process of running CFD jobs on Information Power Grid (IPG) resources. This system should remove the need for users to monitor and intervene in every single CFD job, and should enable the use of many different computers to populate a massive run matrix in the shortest time possible. Such a software system has been developed and is known as the AeroDB script system. The approach taken in developing AeroDB was to build several discrete modules: a database, a job-launcher module, a run-manager module to monitor each individual job, and a web-based user portal for monitoring the progress of the parameter study. The paper presents the details of the AeroDB design, followed by the results of a parameter study performed using AeroDB for the analysis of a reusable launch vehicle (RLV). The paper concludes with the lessons learned in this effort and ideas for future work in this area.
NASA Astrophysics Data System (ADS)
Taha, Mutasem O.; Habash, Maha; Khanfar, Mohammad A.
2014-05-01
Glucokinase (GK) is involved in normal glucose homeostasis and is therefore a valid target for drug design and discovery efforts. GK activators (GKAs) have excellent potential as treatments for hyperglycemia and diabetes. Recent interest in GKAs, combined with docking limitations and a shortage of docking validation methods, prompted us to use our new 3D-QSAR analysis, docking-based comparative intermolecular contacts analysis (dbCICA), to validate docking configurations of a group of GKAs within the GK binding site. dbCICA assesses the consistency of docking by measuring the correlation between ligands' affinities and their contacts with binding-site spots. Optimal dbCICA models were validated by receiver operating characteristic curve analysis and comparative molecular field analysis. The dbCICA models were also converted into valid pharmacophores that were used as search queries to mine 3D structural databases for new GKAs. The search yielded several potent bioactivators that experimentally increased GK bioactivity up to 7.5-fold at 10 μM.
Park, Chan Hum; Yokozawa, Takako; Noh, Jeong Sook
2014-08-01
This study was conducted to examine whether oligonol, a low-molecular-weight polyphenol derived from lychee fruit, has an ameliorative effect on diabetes-induced alterations, such as advanced glycation end product (AGE) formation or apoptosis in the kidneys of db/db mice with type 2 diabetes. Oligonol [10 or 20 mg/(kg body weight · d), orally] was administered every day for 8 wk to prediabetic db/db mice, and its effect was compared with vehicle-treated db/db and normal control mice (m/m). The administration of oligonol decreased the elevated renal glucose concentrations and reactive oxygen species in db/db mice (P < 0.05). The increased serum urea nitrogen and creatinine concentrations, which reflect renal dysfunction in db/db mice, were substantially lowered by oligonol. Oligonol reduced renal protein expression of NAD(P)H oxidase subunits (p22 phagocytic oxidase and NAD(P)H oxidase-4), AGEs (except for pentosidine), and c-Jun N-terminal kinase B-targeting proinflammatory tumor necrosis factor-α (P < 0.05). Oligonol improved the expressions of antiapoptotic [B-cell lymphoma protein 2 (Bcl-2) and survivin] and proapoptotic [Bcl-2-associated X protein, cytochrome c, and caspase-3] proteins in the kidneys of db/db mice (P < 0.05). In conclusion, these results provide important evidence that oligonol exhibits a pleiotropic effect on AGE formation and apoptosis-related variables, representing renoprotective effects against the development of diabetic complications in db/db mice with type 2 diabetes. © 2014 American Society for Nutrition.
Qi, Lu; Kang, Kihwa; Zhang, Cuilin; van Dam, Rob M; Kraft, Peter; Hunter, David; Lee, Chih-Hao; Hu, Frank B
2008-11-01
To examine the longitudinal association of the fat mass- and obesity-associated (FTO) variant with obesity, circulating adipokine levels, and FTO expression in various tissues from humans and mice. We genotyped rs9939609 in 2,287 men and 3,520 women from two prospective cohorts. Plasma adiponectin and leptin were measured in a subset of diabetic men (n = 854) and women (n = 987). Expression of FTO was tested in adipocytes from db/db mice and in mouse macrophages. We observed a trend toward weakening associations between rs9939609 and BMI at older age (≥65 years) in men, whereas the associations were constant across age groups in women. In addition, the single nucleotide polymorphism (SNP) rs9939609 was associated with lower plasma adiponectin (log_e means 1.82 ± 0.04, 1.73 ± 0.03, and 1.68 ± 0.05 for the TT, TA, and AA genotypes, respectively; P for trend = 0.02) and higher leptin (log_e means 3.56 ± 0.04, 3.63 ± 0.04, and 3.70 ± 0.06; P for trend = 0.06) in diabetic women. Adjustment for BMI attenuated the associations. The FTO gene was universally expressed in human and mouse tissues, including adipocytes. In an ancillary study of adipocytes from db/db mice, FTO expression was approximately 50% lower than in those from wild-type mice. The association between FTO SNP rs9939609 and obesity risk may decline at older age. The variant affects circulating adiponectin and leptin levels through changes in BMI. In addition, expression of the FTO gene was reduced in adipocytes from db/db mice.
Ignatieva, Elena V; Igoshin, Alexander V; Yudin, Nikolay S
2017-12-28
Tick-borne encephalitis is caused by the neurotropic, positive-sense RNA tick-borne encephalitis virus (TBEV). TBEV infection can lead to a variety of clinical manifestations, ranging from slight fever to severe neurological illness. Very little is known about genetic factors predisposing to severe forms of disease caused by TBEV. The aims of the study were to compile a catalog of human genes involved in the response to TBEV infection and to rank the genes in the catalog by their number of neighbors in the network of pairwise interactions involving these genes and TBEV RNA or proteins. Based on manual review and curation of scientific publications, a catalog comprising 140 human genes involved in the response to TBEV infection was developed. To provide access to data on all genes, the TBEVHostDB web resource ( http://icg.nsc.ru/TBEVHostDB/ ) was created. We reconstructed a network formed by pairwise interactions between the TBEV virion itself, viral RNA, viral proteins, and the 140 genes/proteins from TBEVHostDB. Genes were ranked according to their number of interactions in the network. Two genes/proteins (CCR5 and IFNAR1) had the maximal number of interactions. The subnetworks formed by CCR5 and IFNAR1 and their neighbors were fragments of two key pathways functioning during the course of tick-borne encephalitis: (1) attenuation of the interferon-I signaling pathway by the TBEV NS5 protein, which targets peptidase D; and (2) the proinflammation and tissue-damage pathway triggered by the chemokine receptor CCR5 interacting with CD4, CCL3, CCL4, and CCL2. Among nine genes associated with severe forms of TBEV infection, three genes/proteins (CCR5, IL10, ARID1B) had protein-protein interactions within the network, and two (IFNL3 and the aforementioned IL10) were up- or down-regulated in response to TBEV infection.
Based on these findings, potential mechanisms for the participation of CCR5, IL10, ARID1B, and IFNL3 in the host response to TBEV infection were suggested. A database comprising 140 human genes involved in the response to TBEV infection was compiled, and the TBEVHostDB web resource providing access to all the genes was created. This is the first effort at integrating and unifying data on genetic factors that may predispose to severe forms of the diseases caused by TBEV. TBEVHostDB could potentially be used for the assessment of risk factors for severe forms of tick-borne encephalitis and for the design of personalized pharmacological strategies for the treatment of TBEV infection.
Performance Evaluation and Requirements Assessment for Gravity Gradient Referenced Navigation
Lee, Jisun; Kwon, Jay Hyoun; Yu, Myeongjong
2015-01-01
In this study, simulation tests for gravity gradient referenced navigation (GGRN) are conducted to verify the effects of various factors such as database (DB) and sensor errors, flight altitude, DB resolution, initial errors, and measurement update rates on the navigation performance. Based on the simulation results, requirements for GGRN are established for position determination with certain target accuracies. It is found that DB and sensor errors and flight altitude have strong effects on the navigation performance. In particular, a DB and sensor with accuracies of 0.1 E and 0.01 E, respectively, are required to determine the position more accurately than or at a level similar to the navigation performance of terrain referenced navigation (TRN). In most cases, the horizontal position error of GGRN is less than 100 m. However, the navigation performance of GGRN is similar to or worse than that of a pure inertial navigation system when the DB and sensor errors are 3 E or 5 E each and the flight altitude is 3000 m. Considering that the accuracy of currently available gradiometers is about 3 E or 5 E, GGRN does not show much advantage over TRN at present. However, GGRN is expected to exhibit much better performance in the near future when accurate DBs and gravity gradiometers are available. PMID:26184212
PrenDB, a Substrate Prediction Database to Enable Biocatalytic Use of Prenyltransferases.
Gunera, Jakub; Kindinger, Florian; Li, Shu-Ming; Kolb, Peter
2017-03-10
Prenyltransferases of the dimethylallyltryptophan synthase (DMATS) superfamily catalyze the attachment of prenyl or prenyl-like moieties to diverse acceptor compounds. These acceptor molecules are generally aromatic in nature and mostly indole or indole-like. Their catalytic transformation represents a major skeletal diversification step in the biosynthesis of secondary metabolites, including the indole alkaloids. DMATS enzymes thus contribute significantly to the biological and pharmacological diversity of small molecule metabolites. Understanding the substrate specificity of these enzymes could create opportunities for their biocatalytic use in preparing complex synthetic scaffolds. However, there has been no framework to achieve this in a rational way. Here, we report a chemoinformatic pipeline to enable prenyltransferase substrate prediction. We systematically catalogued 32 unique prenyltransferases and 167 unique substrates to create possible reaction matrices and compiled these data into a browsable database named PrenDB. We then used a newly developed algorithm based on molecular fragmentation to automatically extract reactive chemical epitopes. The analysis of the collected data sheds light on the thus far explored substrate space of DMATS enzymes. To assess the predictive performance of our virtual reaction extraction tool, 38 potential substrates were tested as prenyl acceptors in assays with three prenyltransferases, and we were able to detect turnover in >55% of the cases. The database, PrenDB (www.kolblab.org/prendb.php), enables the prediction of potential substrates for chemoenzymatic synthesis through substructure similarity and virtual chemical transformation techniques. It aims at making prenyltransferases and their highly regio- and stereoselective reactions accessible to the research community for integration in synthetic work flows. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
Hearing impairment related to age in Usher syndrome types 1B and 2A.
Wagenaar, M; van Aarem, A; Huygen, P; Pieke-Dahl, S; Kimberling, W; Cremers, C
1999-04-01
To evaluate hearing impairment in 2 common genetic subtypes of Usher syndrome, USH1B and USH2A. Cross-sectional analysis of hearing threshold related to age in patients with genotypes determined by linkage and mutation analysis. Otolaryngology department, university referral center. Nineteen patients with USH1B and 27 with USH2A were examined. All participants were living in the Netherlands and Belgium. Pure tone audiometry of the best ear at the last visit. The patients with USH1B had residual hearing without age dependence, with minimum thresholds of 80, 95, and 120 dB at 0.25, 0.5, and 1 to 2 kHz, respectively. Mean thresholds of patients with USH2A were about 45 to 55 dB better than these minimum values. Distinctive audiographic features of patients with USH2A were maximum hearing thresholds of 70, 80, and 100 dB at 0.25, 0.5, and 1 kHz, respectively, only at younger than 40 years. Progression of hearing impairment in USH2A was 0.7 dB/y on average for 0.25 to 4 kHz and could not be explained by presbyacusis alone. USH1B and USH2A can easily be distinguished by hearing impairment at the low frequencies in patients younger than 40 years. Hearing impairment in our patients with USH2A could be characterized as progressive.
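A progression figure such as the reported 0.7 dB/y is typically obtained as a least-squares slope of threshold against age across the cross-sectional sample. A minimal sketch with hypothetical, perfectly linear data (the study's actual regression procedure is not given in the abstract):

```python
def slope(ages, thresholds_db):
    """Ordinary least-squares slope of hearing threshold (dB) vs. age (y)."""
    n = len(ages)
    mean_a = sum(ages) / n
    mean_t = sum(thresholds_db) / n
    num = sum((a - mean_a) * (t - mean_t) for a, t in zip(ages, thresholds_db))
    den = sum((a - mean_a) ** 2 for a in ages)
    return num / den

# Hypothetical cross-sectional data rising at exactly 0.7 dB per year
ages = [20, 30, 40, 50, 60]
thresholds = [56, 63, 70, 77, 84]
print(slope(ages, thresholds))
```

In a real cross-sectional analysis the points scatter around the trend, so the slope comes with a confidence interval; the toy data here are noise-free only to make the expected value obvious.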
PolySac3DB: an annotated data base of 3 dimensional structures of polysaccharides.
Sarkar, Anita; Pérez, Serge
2012-11-14
Polysaccharides are ubiquitously present in the living world. Their structural versatility makes them important and interesting components in numerous biological and technological processes ranging from structural stabilization to a variety of immunologically important molecular recognition events. The knowledge of polysaccharide three-dimensional (3D) structure is important in studying carbohydrate-mediated host-pathogen interactions, interactions with other bio-macromolecules, drug design and vaccine development as well as material science applications or production of bio-ethanol. PolySac3DB is an annotated database that contains the 3D structural information of 157 polysaccharide entries that have been collected from an extensive screening of scientific literature. They have been systematically organized using standard names in the field of carbohydrate research into 18 categories representing polysaccharide families. Structure-related information includes the saccharides making up the repeat unit(s) and their glycosidic linkages, the expanded 3D representation of the repeat unit, unit cell dimensions and space group, helix type, diffraction diagram(s) (when applicable), experimental and/or simulation methods used for structure description, link to the abstract of the publication, reference and the atomic coordinate files for visualization and download. The database is accompanied by a user-friendly graphical user interface (GUI). It features interactive displays of polysaccharide structures and customized search options for beginners and experts, respectively. The site also serves as an information portal for polysaccharide structure determination techniques. The web-interface also references external links where other carbohydrate-related resources are available. PolySac3DB is established to maintain information on the detailed 3D structures of polysaccharides. 
All the data and features are available via the web-interface utilizing the search engine and can be accessed at http://polysac3db.cermav.cnrs.fr.
Liu, Xiaofeng; Ouyang, Sisheng; Yu, Biao; Liu, Yabo; Huang, Kai; Gong, Jiayu; Zheng, Siyuan; Li, Zhihua; Li, Honglin; Jiang, Hualiang
2010-01-01
In silico drug target identification, which includes many distinct algorithms for finding disease genes and proteins, is the first step in the drug discovery pipeline. When the 3D structures of the targets are available, the problem of target identification is usually converted to finding the best interaction mode between the potential target candidates and small molecule probes. Pharmacophore modeling, which captures the spatial arrangement of features essential for a molecule to interact with a specific target receptor, is an alternative to molecular docking for achieving this goal. PharmMapper is a freely accessible web server designed to identify potential target candidates for given small molecules (drugs, natural products, or other newly discovered compounds with unidentified binding targets) using a pharmacophore mapping approach. PharmMapper hosts a large in-house pharmacophore database (PharmTargetDB) annotated from target information in TargetBank, BindingDB, DrugBank, and the Potential Drug Target Database, comprising over 7000 receptor-based pharmacophore models (covering over 1500 drug targets). PharmMapper automatically finds the best mapping poses of the query molecule against all the pharmacophore models in PharmTargetDB and lists the top N best-fitted hits with appropriate target annotations, together with each molecule's aligned poses. Benefiting from a highly efficient and robust triangle-hashing mapping method, PharmMapper offers high throughput and takes only about 1 h on average to screen the whole PharmTargetDB. The protocol succeeded in finding the proper targets among the top 300 pharmacophore candidates in a retrospective benchmarking test with tamoxifen. PharmMapper is available at http://59.78.96.61/pharmmapper. PMID:20430828
NASA Astrophysics Data System (ADS)
El Ouahabi, A.; Martrat, B.; Lopez, J. F.; Grimalt, J. O.
2012-04-01
The ability to simulate the rhythm of abrupt climate changes (ACCs) will depend on the availability of high-quality palæoclimate databases with sufficient temporal resolution to allow relevant inferences from a human perspective. This study presents the PIG2LIG-4FUTURE database (P2L-4F db). The philosophy behind it is to facilitate access to data not only for the scientific community but also for those outside it and, in doing so, to ensure that the data are as useful as possible in answering a challenging key question: what is the risk of ACCs in periods similar to the present one? The P2L-4F db identifies an intra- and inter-event stratigraphy. A breakdown of the events in the PIG (present interglacial, the period for which most information is available and which is reasonably well dated) is defined in order to (2) identify ACCs in the LIG (last interglacial, a much less well known and less well dated period) for (4) better evaluation of the likelihood of sudden shifts within warm climate behaviour (i.e. the next centuries, FUTURE). For this db, both the PIG and the LIG include deglaciation and interglacial states, and they are delimited chronostratigraphically: (i) the PIG refers to the last 19 ka, i.e. a precessional cycle should be complete within the next 5 ka; (ii) the LIG is the time span between 133 ka and 109 ka, i.e. a slot of 24 ka, roughly a precessional cycle. The db compiles comprehensive selected palæo-data retrieved from a variety of sources (http://www.ncdc.noaa.gov, http://www.pangaea.de and others) as well as instrumental data, to validate reconstructions against observations (e.g. http://data.giss.nasa.gov, http://iridl.ldeo.columbia.edu, etc.). The records included meet two simple criteria: they cover the time spans of interest, specifically the PIG and the LIG, and they have sufficient time resolution to distinguish an ACC from a gradual event.
The research has focussed on three main palæoarchives: (i) ice cores, because they have been the reference for ACCs for over half a century; (ii) stalagmites, because they are crucial for dating, particularly for the LIG; (iii) marine and continental sediments, because they have proven to be a powerful source of ACCs. The same response is not seen everywhere due to proxy and location effects (see-saws not only north-to-south between poles, but also tropics-to-poles and eastern-to-western gradients). However, by using global and regional stacks, PIG-like events have been located within the LIG period. The spatio-temporal pattern during several specific events will be discussed. For example, the results show that the PIG 8.2 ka event was something akin to a dividing line, with an analogous LIG event occurring at approximately 120 ka ago. Environment and climate were evidently very different before and after this type of event. Additionally, the 2.6 ka and 0.8 ka events were more the exception than the rule, somewhat resembling a clear glacial inception initiated 113 to 111 ka ago. However, ACCs within the PIG are not exactly the same as those of the LIG, in either intensity or rate of change. This is hardly surprising given that none of the triggers and sources of persistence were reproduced identically during both periods.
[Utility of axial images in an early Alzheimer disease diagnosis support system (VSRAD)].
Goto, Masami; Aoki, Shigeki; Abe, Osamu; Masumoto, Tomohiko; Watanabe, Yasushi; Satake, Yoshiroh; Nishida, Katsuji; Ino, Kenji; Yano, Keiichi; Iida, Kyohhito; Mima, Kazuo; Ohtomo, Kuni
2006-09-20
In recent years, voxel-based morphometry (VBM) has become a popular tool for the early diagnosis of Alzheimer disease. The Voxel-Based Specific Regional Analysis System for Alzheimer's Disease (VSRAD), a VBM system that uses MRI, has been reported to be clinically useful. The able-bodied-person database (DB) of VSRAD, which is based on sagittal plane imaging, is not suitable for the analysis of axial plane images. However, axial plane imaging is useful for avoiding motion artifacts from the eyeball. Therefore, we created an able-bodied-person DB from axial plane imaging and examined its utility, and we analyzed groups of able-bodied persons and persons with dementia by axial plane imaging to review its validity. With the axial plane imaging DB, the Z-score of the intrahippocampal region improved in 8 of 13 cases, and the whole-brain Z-score improved in all cases.
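The Z-scores discussed above compare a patient's regional gray-matter value with the normal (able-bodied-person) database. A minimal sketch of that comparison, with hypothetical numbers and assuming the common VBM convention that a larger positive Z indicates greater atrophy relative to the DB mean:

```python
def z_score(patient_value, db_mean, db_sd):
    """Regional atrophy Z-score against a normal database:
    how many DB standard deviations the patient falls below the DB mean."""
    return (db_mean - patient_value) / db_sd

# Hypothetical regional gray-matter concentration values
print(z_score(patient_value=0.42, db_mean=0.55, db_sd=0.05))
```

This is why the choice of DB matters: switching from a sagittal-plane to an axial-plane normal DB changes `db_mean` and `db_sd`, and hence every patient's Z-score.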
Classification of Chemicals Based On Structured Toxicity ...
Thirty years and millions of dollars' worth of pesticide registration toxicity studies, historically stored as hardcopy and scanned documents, have been digitized into highly standardized and structured toxicity data within the Toxicity Reference Database (ToxRefDB). Toxicity-based classifications of chemicals were performed as a model application of ToxRefDB. These endpoints will ultimately provide the anchoring toxicity information for the development of predictive models and biological signatures utilizing in vitro assay data. Using query and structured data-mining approaches, toxicity profiles were uniformly generated for more than 300 chemicals. Based on observation rate, species concordance, and regulatory relevance, individual and aggregated effects were selected to classify the chemicals, providing a set of predictable endpoints. ToxRefDB demonstrates the utility of transforming unstructured toxicity data into structured data and, furthermore, into computable outputs, and serves as a model for applying such data to modern toxicological problems.
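The query-and-mining step described above amounts to aggregating structured study records into per-chemical endpoint profiles. A minimal sketch with hypothetical records and a hypothetical classification rule (an effect observed in at least two species), not ToxRefDB's actual schema or criteria:

```python
from collections import defaultdict

# Hypothetical structured records: (chemical, endpoint, species)
records = [
    ("chemA", "liver",   "rat"),
    ("chemA", "liver",   "mouse"),
    ("chemA", "kidney",  "rat"),
    ("chemB", "thyroid", "rat"),
    ("chemB", "thyroid", "dog"),
]

# Aggregate: which species show each (chemical, endpoint) effect
species_by_endpoint = defaultdict(set)
for chem, endpoint, species in records:
    species_by_endpoint[(chem, endpoint)].add(species)

# Classify: flag endpoints observed in >= 2 species (hypothetical rule)
classified = sorted(key for key, sp in species_by_endpoint.items() if len(sp) >= 2)
print(classified)
```

The point of the sketch is the shape of the computation: once studies are structured records rather than scanned documents, classification rules become short, auditable queries.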
Performance Evaluation and Analysis for Gravity Matching Aided Navigation.
Wu, Lin; Wang, Hubiao; Chai, Hua; Zhang, Lu; Hsu, Houtse; Wang, Yong
2017-04-05
Simulation tests were carried out to evaluate the performance of gravity matching aided navigation (GMAN). This study focused on four essential factors to quantitatively evaluate performance: gravity database (DB) resolution, fitting degree of gravity measurements, number of samples in matching, and gravity changes in the matching area. A marine gravity anomaly DB derived from satellite altimetry was employed, and actual dynamic gravimetry accuracy and operating conditions were referenced when designing the simulation parameters. The results verified that improvements in DB resolution, gravimetry accuracy, number of measurement samples, or gravity changes in the matching area generally led to higher positioning accuracies, although their effects differed and were interrelated. Moreover, three typical positioning accuracy targets for GMAN were proposed, and the conditions required to achieve them were derived from the analysis of several different system requirements. Finally, various approaches to improving the positioning accuracy of GMAN were provided.
PMID:28379178
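The core of gravity matching can be sketched as a 1-D profile-correlation search: slide the measured anomaly sequence along the stored DB track and keep the offset with the smallest squared mismatch. A toy example with synthetic values (a real GMAN system matches over 2-D grids under inertial-navigation constraints):

```python
def best_offset(db_profile, measured):
    """Offset into the DB profile that minimizes the squared mismatch
    between the stored and measured gravity-anomaly samples."""
    n, m = len(db_profile), len(measured)
    def mismatch(off):
        return sum((db_profile[off + i] - measured[i]) ** 2 for i in range(m))
    return min(range(n - m + 1), key=mismatch)

# Synthetic DB track (mGal) and noisy measurements taken starting at index 2
db_profile = [0.0, 1.0, 3.0, 6.0, 5.0, 2.0, 0.5]
measured = [3.1, 5.9, 5.2]
print(best_offset(db_profile, measured))
```

The sketch also illustrates why "gravity changes in the matching area" matter: over a flat anomaly field every offset scores similarly and the match is ambiguous, whereas strong gradients make the minimum sharp.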
Ostler, Joseph E.; Maurya, Santosh K.; Dials, Justin; Roof, Steve R.; Devor, Steven T.; Ziolo, Mark T.
2014-01-01
Type 2 diabetes mellitus is associated with an accelerated muscle loss during aging, decreased muscle function, and increased disability. To better understand the mechanisms causing this muscle deterioration in type 2 diabetes, we assessed muscle weight, exercise capacity, and biochemistry in db/db and TallyHo mice at prediabetic and overtly diabetic ages. Maximum running speeds and muscle weights were already reduced in prediabetic db/db mice when compared with lean controls and more severely reduced in the overtly diabetic db/db mice. In contrast to db/db mice, TallyHo muscle size dramatically increased and maximum running speed was maintained during the progression from prediabetes to overt diabetes. Analysis of mechanisms that may contribute to decreased muscle weight in db/db mice demonstrated that insulin-dependent phosphorylation of enzymes that promote protein synthesis was severely blunted in db/db muscle. In addition, prediabetic (6-wk-old) and diabetic (12-wk-old) db/db muscle exhibited an increase in a marker of proteasomal protein degradation, the level of polyubiquitinated proteins. Chronic treadmill training of db/db mice improved glucose tolerance and exercise capacity, reduced markers of protein degradation, but only mildly increased muscle weight. The differences in muscle phenotype between these models of type 2 diabetes suggest that insulin resistance and chronic hyperglycemia alone are insufficient to rapidly decrease muscle size and function and that the effects of diabetes on muscle growth and function are animal model-dependent. PMID:24425761
Multivariate analysis of toxicity experimental results of environmental endpoints. (FutureToxII)
The toxicity of hundreds of chemicals have been assessed in laboratory animal studies through EPA chemical regulation and toxicological research. Currently, over 5000 laboratory animal toxicity studies have been collected in the Toxicity Reference Database (ToxRefDB). In addition...
Multiscale Systems Modeling of Male Reproductive Tract Defects: from Genes to Populations (SOT)
The reproductive tract is a complex, integrated organ system with diverse embryology and unique sensitivity to prenatal environmental exposures that disrupt morphoregulatory processes and endocrine signaling. U.S. EPA’s in vitro high-throughput screening (HTS) database (ToxCastDB...
Scrubchem: Building Bioactivity Datasets from Pubchem Bioassay Data (SOT)
The PubChem Bioassay database is a non-curated public repository with data from 64 sources, including: ChEMBL, BindingDb, DrugBank, EPA Tox21, NIH Molecular Libraries Screening Program, and various other academic, government, and industrial contributors. Methods for extracting th...
Impact of Universities' Promotional Materials on College Choice.
ERIC Educational Resources Information Center
Armstrong, Jami J.; Lumsden, D. Barry
1999-01-01
Evaluated the impact of printed promotional materials on the recruitment of college freshmen using focus groups of students attending a large, southern metropolitan university. Students provided detailed suggestions on ways to improve the method of distribution, graphic design, and content of the materials. (Author/DB)
Li, Hua; Ji, Hyeon-Seon; Kang, Ji-Hyun; Shin, Dong-Ha; Park, Ho-Yong; Choi, Myung-Sook; Lee, Chul-Ho; Lee, In-Kyung; Yun, Bong-Sik; Jeong, Tae-Sook
2015-08-19
This study investigated the molecular mechanisms underlying the antidiabetic effect of an ethanol extract of soy leaves (ESL) in db/db mice. Control groups (db/+ and db/db) were fed a normal diet (ND), whereas the db/db-ESL group was fed ND with 1% ESL for 8 weeks. Dietary ESL improved glucose tolerance and lowered plasma glucose, glycated hemoglobin, HOMA-IR, and triglyceride levels. The pancreatic insulin content of the db/db-ESL group was significantly greater than that of the db/db group. ESL supplementation altered pancreatic IRS1, IRS2, Pdx1, Ngn3, Pax4, Ins1, Ins2, and FoxO1 expression. Furthermore, ESL suppressed lipid accumulation and increased glucokinase activity in the liver. ESL primarily contained kaempferol glycosides and pheophorbides. Kaempferol, an aglycone of kaempferol glycosides, improved β-cell proliferation through IRS2-related FoxO1 signaling, whereas pheophorbide a, a product of chlorophyll breakdown, improved insulin secretion and β-cell proliferation through IRS1-related signaling with protein kinase A in MIN6 cells. ESL effectively regulates glucose homeostasis by enhancing IRS-mediated β-cell insulin signaling and suppressing SREBP-1-mediated hepatic lipid accumulation in db/db mice.
MMpI: A WideRange of Available Compounds of Matrix Metalloproteinase Inhibitors
Muvva, Charuvaka; Patra, Sanjukta; Venkatesan, Subramanian
2016-01-01
Matrix metalloproteinases (MMPs) are a family of zinc-dependent proteinases involved in the regulation of the extracellular signaling and structural matrix environment of cells and tissues. MMPs are considered promising targets for the treatment of many diseases. Therefore, the creation of a database of MMP inhibitors would accelerate research activities in this area, given the implication of MMPs in the above-mentioned diseases and the limitations of first- and second-generation inhibitors. In this communication, we report the development of a new MMpI database which provides resourceful information for all researchers working in this field. It is a unique, web-accessible resource that contains detailed information on MMP inhibitors, including small molecules, peptides and MMP drug leads. The database contains entries for ~3000 inhibitors, including ~72 MMP drug leads and ~73 peptide-based inhibitors, and provides the molecular and structural details necessary for drug discovery and development. The MMpI database contains physical properties and 2D and 3D structures (mol2 and pdb format files) of MMP inhibitors. Other data fields are hyperlinked to PubChem, ChEMBL, BindingDB, DrugBank, PDB, MEROPS and PubMed. The database has an extensive search facility covering MMpI ID, IUPAC name, chemical structure and research-article title. The MMP inhibitors provided in the MMpI database are optimized using the Python-based Hierarchical Environment for Integrated Xtallography (Phenix) software. The MMpI database is the only public database that provides complete information on MMP inhibitors. Database URL: http://clri.res.in/subramanian/databases/mmpi/index.php. PMID:27509041
Evaluating the Cassandra NoSQL Database Approach for Genomic Data Persistency.
Aniceto, Rodrigo; Xavier, Rene; Guimarães, Valeria; Hondo, Fernanda; Holanda, Maristela; Walter, Maria Emilia; Lifschitz, Sérgio
2015-01-01
Rapid advances in high-throughput sequencing techniques have created interesting computational challenges in bioinformatics. One of them is the management of massive amounts of data generated by automatic sequencers. We need to deal with the persistency of genomic data, particularly storing and analyzing these large-scale processed data. Finding an alternative to the frequently considered relational database model has become a compelling task. Other data models may be more effective when dealing with a very large amount of nonconventional data, especially for writing and retrieving operations. In this paper, we discuss the Cassandra NoSQL database approach for storing genomic data. We perform an analysis of persistency and I/O operations with real data, using the Cassandra database system. We also compare the results obtained with a classical relational database system and another NoSQL database approach, MongoDB.
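A Cassandra-style data model for this workload might look like the sketch below. The keyspace, table, and column names are illustrative assumptions, not the schema used in the paper; the point is the composite primary key, which partitions reads by sample and chromosome so writes spread across nodes while each partition stays sorted by position.

```python
# Hypothetical CQL schema for sequencing reads (names are assumptions).
# Partition key (sample_id, chromosome) distributes write load; clustering
# by position keeps reads within a partition ordered for range scans.
CREATE_READS_TABLE = """
CREATE TABLE IF NOT EXISTS genomics.reads (
    sample_id  text,
    chromosome text,
    position   bigint,
    read_id    text,
    sequence   text,
    quality    text,
    PRIMARY KEY ((sample_id, chromosome), position, read_id)
) WITH CLUSTERING ORDER BY (position ASC);
"""

def insert_read_stmt(sample_id, chromosome, position, read_id, sequence, quality):
    """Build a parameterized INSERT suitable for cassandra-driver's
    session.execute(stmt, params)."""
    stmt = ("INSERT INTO genomics.reads "
            "(sample_id, chromosome, position, read_id, sequence, quality) "
            "VALUES (%s, %s, %s, %s, %s, %s)")
    return stmt, (sample_id, chromosome, position, read_id, sequence, quality)
```

With a live cluster, the statements would be passed to a `cassandra.cluster.Session`; here they are only constructed, to show the data-model shape.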
A review of drug-induced liver injury databases.
Luo, Guangwen; Shen, Yiting; Yang, Lizhu; Lu, Aiping; Xiang, Zheng
2017-09-01
Drug-induced liver injuries have been a major focus of current research in drug development, and are also one of the major reasons for the failure and withdrawal of drugs in development. Drug-induced liver injuries have been systematically recorded in many public databases, which have become valuable resources in this field. In this study, we provide an overview of these databases, including the liver injury-specific databases LiverTox, LTKB, Open TG-GATEs, LTMap and Hepatox, and the general databases, T3DB, DrugBank, DITOP, DART, CTD and HSDB. The features and limitations of these databases are summarized and discussed in detail. Apart from their powerful functions, we believe that these databases can be improved in several ways: by providing the data about the molecular targets involved in liver toxicity, by incorporating information regarding liver injuries caused by drug interactions, and by regularly updating the data.
searchSCF: Using MongoDB to Enable Richer Searches of Locally Hosted Science Data Repositories
NASA Astrophysics Data System (ADS)
Knosp, B.
2016-12-01
Science teams today are in the unusual position of almost having too much data available to them. Modern sensors and models are capable of outputting terabytes of data per day, which can make it difficult to find specific subsets of data. The sheer size of files can also make it time consuming to retrieve this big data from national data archive centers. Thus, many science teams choose to store what data they can on their local systems, but they are not always equipped with tools to help them intelligently organize and search their data. In its local data repository, the Aura Microwave Limb Sounder (MLS) science team at NASA's Jet Propulsion Laboratory has collected over 300TB of atmospheric science data from 71 missions/models that aid in validation, algorithm development, and research activities. When the project began, the team developed a MySQL database to aid in data queries, but this database was only designed to keep track of MLS and a few ancillary data sets, leaving much of the data uncatalogued. The team has also seen database query time rise over the life of the mission. Even though the MLS science team's data holdings are not the size of a national data center's, team members still need tools to help them discover and utilize the data that they have on-hand. Over the past year, members of the science team have been looking for solutions to (1) store information on all the data sets they have collected in a single database, (2) store more metadata about each data file, (3) develop queries that can find relationships among these disparate data types, and (4) plug any new functions developed around this database into existing analysis, visualization, and web tools, transparently to users. In this presentation, I will discuss the searchSCF package that is currently under development. This package includes a NoSQL database management system (MongoDB) and a set of Python tools that both ingest data into the database and support user queries.
I will also highlight case studies of how this system could be used by the MLS science team, and how it could be implemented by other science teams with local data repositories.
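A document store suits this kind of heterogeneous catalogue because each file's metadata can carry different fields without schema migrations. The sketch below builds one metadata document per data file and a time-overlap query of the sort a validation study needs; the field names and helper functions are illustrative assumptions, not the actual searchSCF schema.

```python
# Illustrative MongoDB-style documents and queries (field names are
# assumptions, not the searchSCF schema).
def make_file_doc(mission, product, start, end, path, **extra_metadata):
    """One document per data file; extra_metadata holds any
    mission-specific fields a rigid relational schema would reject."""
    doc = {"mission": mission, "product": product,
           "time_range": {"start": start, "end": end},
           "path": path}
    doc.update(extra_metadata)
    return doc

def overlap_query(mission, t0, t1):
    """Find files from `mission` whose time range overlaps [t0, t1] --
    the kind of cross-data-set relationship query goal (3) describes."""
    return {"mission": mission,
            "time_range.start": {"$lte": t1},
            "time_range.end": {"$gte": t0}}
```

With pymongo, these would be used as `db.files.insert_one(make_file_doc(...))` and `db.files.find(overlap_query("MLS", t0, t1))`; here only the query documents are constructed.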
A Simple and Universal Aerosol Retrieval Algorithm for Landsat Series Images Over Complex Surfaces
NASA Astrophysics Data System (ADS)
Wei, Jing; Huang, Bo; Sun, Lin; Zhang, Zhaoyang; Wang, Lunche; Bilal, Muhammad
2017-12-01
Operational aerosol optical depth (AOD) products are available at coarse spatial resolutions from several to tens of kilometers. These resolutions limit the application of these products for monitoring atmospheric pollutants at the city level. Therefore, a simple, universal, and high-resolution (30 m) Landsat aerosol retrieval algorithm over complex urban surfaces is developed. The surface reflectance is estimated from a combination of top of atmosphere reflectance at short-wave infrared (2.22 μm) and Landsat 4-7 surface reflectance climate data records over densely vegetated areas and bright areas. The aerosol type is determined using the historical aerosol optical properties derived from the local urban Aerosol Robotic Network (AERONET) site (Beijing). AERONET ground-based sun photometer AOD measurements from five sites located in urban and rural areas are obtained to validate the AOD retrievals. Terra Moderate Resolution Imaging Spectroradiometer (MODIS) Collection (C) 6 AOD products (MOD04) including the dark target (DT), the deep blue (DB), and the combined DT and DB (DT&DB) retrievals at 10 km spatial resolution are obtained for comparison purposes. Validation results show that the Landsat AOD retrievals at a 30 m resolution are well correlated with the AERONET AOD measurements (R2 = 0.932) and that approximately 77.46% of the retrievals fall within the expected error with a low mean absolute error of 0.090 and a root-mean-square error of 0.126. Comparison results show that Landsat AOD retrievals are overall better and less biased than MOD04 AOD products, indicating that the new algorithm is robust and performs well in AOD retrieval over complex surfaces. The new algorithm can provide continuous and detailed spatial distributions of AOD during both low and high aerosol loadings.
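The validation statistics quoted above (fraction within expected error, mean absolute error, root-mean-square error) can be computed as in the sketch below. The ±(0.05 + 0.15·AOD) envelope used as the default is the common MODIS-over-land convention and is an assumption here; the abstract does not state the exact envelope applied.

```python
import math

def aod_validation_stats(retrieved, measured, ee=lambda a: 0.05 + 0.15 * a):
    """MAE, RMSE, and the fraction of retrievals falling within the
    expected-error envelope +/- ee(AOD) around each AERONET measurement.
    The default envelope is the usual MODIS-over-land convention
    (an assumption; the paper's envelope may differ)."""
    n = len(retrieved)
    errs = [r - m for r, m in zip(retrieved, measured)]
    mae = sum(abs(e) for e in errs) / n
    rmse = math.sqrt(sum(e * e for e in errs) / n)
    within = sum(1 for r, m in zip(retrieved, measured)
                 if abs(r - m) <= ee(m)) / n
    return mae, rmse, within
```

Applied to matched 30 m retrievals and AERONET measurements, `within` corresponds to the "77.46% within expected error" style of result reported above.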
Bioacoustic Absorption Spectroscopy (ASIAEX)
2000-09-30
source level of 170 dB re 1 µPa at 1 m for 24 hours. This source will weigh less than 220 pounds, including batteries, and excluding the anchor and float...in the proposed collaborative effort. Dr. Masahiko Furusawa of the University of Tokyo (Fisheries), Japan’s leading authority on fisheries acoustics...about 20 water depths (~ 1.2 km in 60 m of water) at frequencies of 1 to 2 kHz. RESULTS I reviewed the measurements and times of frequency selective
Semantically Enabling Knowledge Representation of Metamorphic Petrology Data
NASA Astrophysics Data System (ADS)
West, P.; Fox, P. A.; Spear, F. S.; Adali, S.; Nguyen, C.; Hallett, B. W.; Horkley, L. K.
2012-12-01
More and more metamorphic petrology data is being collected around the world, and is now being organized together into different virtual data portals by means of virtual organizations. For example, there is the virtual data portal Petrological Database of the Ocean Floor (PetDB, http://www.petdb.org) that is organizing scientific information about geochemical data of ocean floor igneous and metamorphic rocks; and also The Metamorphic Petrology Database (MetPetDB, http://metpetdb.rpi.edu) that is being created by a global community of metamorphic petrologists in collaboration with software engineers and data managers at Rensselaer Polytechnic Institute. The current focus is to provide the ability for scientists and researchers to register their data and search the databases for information regarding sample collections. What we present here is the next step in the evolution of the MetPetDB portal, utilizing semantically enabled features such as discovery, data casting, faceted search, knowledge representation, and linked data, as well as organizing information about the community and collaboration within the virtual community itself. We take the information that is currently represented in a relational database and make it available through web services, SPARQL endpoints, and semantic triple-stores where inferencing is enabled. We will be leveraging research that has taken place in virtual observatories, such as the Virtual Solar Terrestrial Observatory (VSTO) and the Biological and Chemical Oceanography Data Management Office (BCO-DMO); vocabulary work done in various communities such as Observations and Measurements (ISO 19156), FOAF (Friend of a Friend), Bibo (Bibliography Ontology), and domain specific ontologies; enabling provenance traces of samples and subsamples using the different provenance ontologies; and providing the much needed linking of data from the various research organizations into a common, collaborative virtual observatory.
In addition to better representing and presenting the actual data, we also look to organize and represent the knowledge, information, and expertise behind the data. Domain experts hold a lot of knowledge in their minds, in their presentations and publications, and elsewhere. This is not only a technical issue but also a social issue, in that we need to be able to encourage the domain experts to share their knowledge in a way that can be searched and queried over. With this additional focus, the MetPetDB site can be used more efficiently by other domain experts and can also be utilized by non-specialists, both to educate people about the importance of the work being done and to enable future domain experts.
Hernández-Ibarra, Jose Anselmo; Laredo-Cisneros, Marco Samuel; Mondragón-González, Ricardo; Santamaría-Guayasamín, Natalie; Cisneros, Bulmaro
2015-12-01
α-Dystrobrevin (α-DB) is a cytoplasmic component of the dystrophin-associated complex involved in cell signaling; however, its recently revealed nuclear localization implies a role for this protein in the nucleus. Consistent with this, we demonstrated in a previous work that the α-DB1 isoform associates with the nuclear lamin to maintain nuclei morphology. In this study, we show the distribution of the α-DB2 isoform in different subnuclear compartments of N1E115 neuronal cells, including nucleoli and Cajal bodies, where it colocalizes with B23/nucleophosmin and Nopp140 and with coilin, respectively. Recovery in a pure nucleoli fraction undoubtedly confirms the presence of α-DB2 in the nucleolus. α-DB2 redistributes in a similar fashion to that of fibrillarin and Nopp140 upon actinomycin-mediated disruption of nucleoli and to that of coilin after disorganization of Cajal bodies through ultraviolet irradiation, with relocalization of the proteins to the corresponding reassembled structures after cessation of the insults, which implicates α-DB2 in the plasticity of these nuclear bodies. That localization of α-DB2 in the nucleolus is physiologically relevant is demonstrated by the fact that downregulation of α-DB2 resulted in both altered nucleoli structure and decreased levels of B23/nucleophosmin, fibrillarin, and Nopp140. Since α-DB2 interacts with B23/nucleophosmin and overexpression of the latter protein favors nucleolar accumulation of α-DB2, it appears that targeting of α-DB2 to the nucleolus is dependent on B23/nucleophosmin. In conclusion, we show for the first time localization of α-DB2 in nucleoli and Cajal bodies and provide evidence that α-DB2 is involved in the structure of nucleoli and might modulate nucleolar functions. © 2015 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Azizah, A.; Suselo, Y. H.; Muthmainah, M.; Indarto, D.
2018-05-01
Gestational hypertension is one of the three main causes of maternal mortality in Indonesia. Nifedipine, which blocks the Cav1.2 calcium channel, has frequently been used to treat gestational hypertension. However, the efficacy of nifedipine has not yet been established and the prevalence of gestational hypertension is still high (27.1%). Indonesian herbal plants have the potential to be developed as natural drugs. Molecular docking, a computational method, is often used to depict interactions between molecules and a target receptor. This study therefore aimed to identify Indonesian herbal plants that could inhibit the calcium channel in silico. This was a bioinformatics study with a molecular docking approach. The three-dimensional structure of the human calcium channel Cav1.2 was determined by modelling with the rabbit calcium channel (ID:5GJW) as template, using the SWISS-MODEL software. Nifedipine was used as a standard ligand and obtained from the ZINC database with the access code ZINC19594578. Active compounds of Indonesian herbal plants were registered in the HerbalDB database and their molecular structures were obtained from PubChem. Binding affinities of human Cav1.2 model-ligand complexes were assessed using the AutoDock Vina 1.1.2 software, and molecular conformations were visualized with the Chimera 1.10 and PyMol 1.3 software. Lipinski's rule of five was used to determine active compounds which fulfilled drug criteria. The human Cav1.2 model had 72.35% sequence identity with rabbit Cav1.1. Nifedipine bound to the human Cav1.2 model with a -2.1 kcal/mol binding affinity and had binding sites at the Gln1060, Phe1129, Ser1132, and Ile1173 residues. A lower binding affinity was observed for 8 phytochemicals, but only obtusifolin 2-glucoside (-2.2 kcal/mol) had the same binding sites as nifedipine. In addition, obtusifolin 2-glucoside met the Lipinski criteria and its molecular conformation was similar to that of nifedipine. 
According to the HerbalDB database, obtusifolin 2-glucoside is found in Tectona grandis. Obtusifolin 2-glucoside is thus computationally a potential candidate calcium channel blocker. In vitro assays should be performed to evaluate the antagonist effect of obtusifolin 2-glucoside on the calcium channel Cav1.2.
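The drug-likeness filter applied in the study is Lipinski's rule of five. A minimal sketch of its conventional form follows, in which a compound passes with at most one violation of the four criteria; the abstract does not spell out whether zero or one violation was allowed, so the cutoff here is the common convention rather than the study's stated rule.

```python
def lipinski_pass(mol_weight, logp, h_donors, h_acceptors, max_violations=1):
    """Lipinski's rule of five: a compound is considered orally drug-like
    if it violates at most `max_violations` of these four criteria."""
    violations = sum([
        mol_weight > 500,   # molecular weight <= 500 Da
        logp > 5,           # octanol-water partition coefficient <= 5
        h_donors > 5,       # hydrogen-bond donors <= 5
        h_acceptors > 10,   # hydrogen-bond acceptors <= 10
    ])
    return violations <= max_violations
```

In practice the four descriptors would come from a cheminformatics toolkit or the PubChem record for each phytochemical.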
Application of 3D Spatio-Temporal Data Modeling, Management, and Analysis in DB4GEO
NASA Astrophysics Data System (ADS)
Kuper, P. V.; Breunig, M.; Al-Doori, M.; Thomsen, A.
2016-10-01
Many of today's worldwide challenges such as climate change, water supply and transport systems in cities or movements of crowds need spatio-temporal data to be examined in detail. Thus the number of examinations in 3D space dealing with geospatial objects moving in space and time or even changing their shapes in time will rapidly increase in the future. Prominent spatio-temporal applications are subsurface reservoir modeling, water supply after seawater desalination and the development of transport systems in mega cities. All of these applications generate large spatio-temporal data sets. However, the modeling, management and analysis of 3D geo-objects with changing shape and attributes in time is still a challenge for geospatial database architectures. In this article we describe the application of concepts for the modeling, management and analysis of 2.5D and 3D spatial plus 1D temporal objects implemented in DB4GeO, our service-oriented geospatial database architecture. An example application with spatio-temporal data of a landfill near the city of Osnabrück in Germany demonstrates the usage of the concepts. Finally, an outlook on our future research focusing on new applications with big data analysis in three spatial plus one temporal dimension in the United Arab Emirates, especially the Dubai area, is given.
de Kleijn, Jasper L; van Kalmthout, Ludwike W M; van der Vossen, Martijn J B; Vonck, Bernard M D; Topsakal, Vedat; Bruijnzeel, Hanneke
2018-05-24
Although current guidelines recommend cochlear implantation only for children with profound hearing impairment (HI) (>90 decibel [dB] hearing level [HL]), studies show that children with severe hearing impairment (>70-90 dB HL) could also benefit from cochlear implantation. To perform a systematic review to identify audiologic thresholds (in dB HL) that could serve as an audiologic candidacy criterion for pediatric cochlear implantation using 4 domains of speech and language development as independent outcome measures (speech production, speech perception, receptive language, and auditory performance). PubMed and Embase databases were searched up to June 28, 2017, to identify studies comparing speech and language development between children who were profoundly deaf using cochlear implants and children with severe hearing loss using hearing aids, because no studies are available directly comparing children with severe HI in both groups. If cochlear implant users with profound HI score better on speech and language tests than those with severe HI who use hearing aids, this outcome could support adjusting cochlear implantation candidacy criteria to lower audiologic thresholds. Literature search, screening, and article selection were performed using a predefined strategy. Article screening was executed independently by 4 authors in 2 pairs; consensus on article inclusion was reached by discussion between these 4 authors. This study is reported according to the Preferred Reporting Items for Systematic Review and Meta-analysis (PRISMA) statement. Title and abstract screening of 2822 articles resulted in selection of 130 articles for full-text review. Twenty-one studies were selected for critical appraisal, resulting in selection of 10 articles for data extraction. 
Two studies formulated audiologic thresholds (in dB HLs) at which children could qualify for cochlear implantation: (1) at 4-frequency pure-tone average (PTA) thresholds of 80 dB HL or greater based on speech perception and auditory performance subtests and (2) at PTA thresholds of 88 and 96 dB HL based on a speech perception subtest. In 8 of the 18 outcome measures, children with profound HI using cochlear implants performed similarly to children with severe HI using hearing aids. Better performance of cochlear implant users was shown with a picture-naming test and a speech perception in noise test. Owing to large heterogeneity in study population and selected tests, it was not possible to conduct a meta-analysis. Studies indicate that lower audiologic thresholds (≥80 dB HL) than are advised in current national and manufacturer guidelines would be appropriate as audiologic candidacy criteria for pediatric cochlear implantation.
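The candidacy thresholds above are stated as pure-tone averages in dB HL. A minimal sketch of how such a criterion is evaluated follows, assuming the conventional 4-frequency PTA over 500, 1000, 2000 and 4000 Hz (the frequency set and function names are assumptions; the review's source studies may define the PTA differently).

```python
def pure_tone_average(thresholds_db_hl):
    """4-frequency pure-tone average in dB HL, conventionally over the
    thresholds at 500, 1000, 2000 and 4000 Hz."""
    return sum(thresholds_db_hl) / len(thresholds_db_hl)

def meets_candidacy(pta_db_hl, cutoff_db_hl=80.0):
    # The >=80 dB HL default reflects the lower threshold the review
    # discusses; current guidelines typically require >90 dB HL.
    return pta_db_hl >= cutoff_db_hl
```

For example, thresholds of 75, 80, 85 and 90 dB HL give a PTA of 82.5 dB HL, which would qualify under the proposed 80 dB HL criterion but not under a 90 dB HL guideline.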
NASA Astrophysics Data System (ADS)
Tanabe, T.
The CRD database, which has been accumulating financial data on SMEs over the ten years since its founding, has gathered approximately 12 million records for around 2 million SMEs and approximately 3 million records for some 900,000 sole proprietors, and has also collected default data on these companies and sole proprietors. The CRD database's weakness is anonymity. Going forward, therefore, it appears the CRD Association is faced with questions concerning how it will enhance the attractiveness of its database and whether new knowledge should be gained by using econophysics or other research approaches. We have already seen several examples of knowledge gained through econophysical analyses using the CRD database, and I would like to express my hope that we will eventually see greater application of the SME credit information database and econophysical analysis for the development of Japan's SME policies, which are scientific economic policies for avoiding moral hazard, and for elucidating risk scenarios for the global financial, natural disaster, and other shocks expected to happen with greater frequency. The role played by econophysics will therefore become increasingly important, and we have high expectations for the field.
The CoreWall Project: An Update for 2007
NASA Astrophysics Data System (ADS)
Yu-Chung Chen, J.; Higgins, S.; Hur, H.; Ito, E.; Jenkins, C. J.; Johnson, A.; Leigh, J.; Morin, P.; Lee, J.
2007-12-01
The CoreWall Suite is an NSF-supported collaborative development of real-time core description (Corelyzer), stratigraphic correlation (Correlater), and data visualization (CoreNavigator) software to be used by the marine, terrestrial and Antarctic science communities. The overall goal of the CoreWall software development is to bring portable cross-platform tools to the broader drilling and coring communities to expand and enhance data visualization and enhance collaborative integration of multiple datasets. The CoreWall Project is now in its second year and significant progress has been made on all 3 software components. Corelyzer has undergone 2 field deployments and testing by the ANDRILL program in 2006 (and again in Fall 2007) and by ICDP's SAFOD project (summer 2007). In addition, the CoreWall group and ICDP are working together so that the core description (DIS) system can expose DIS core data directly into Corelyzer seamlessly and be available to future ICDP and IODP-Mission Specific Platform expeditions. Educators have also taken note of the software's ease of use and strong visualization capabilities and have begun exploring curriculum projects with Corelyzer software. To ensure that the software development is integrated with other community IT activities, including the development of the U.S. IODP-Phase 2 Scientific Ocean Drilling Vessel (SODV), a Steering Committee was constituted. It is composed of key U.S. IODP and related database (e.g., CHRONOS, SedDB) developers and users as well as representatives of other core-based enterprises (e.g., ANDRILL, ICDP, LacCore). Corelyzer (CoreWall's main visual core description tool) displays digital core images from one or more cores along with discrete data streams (e.g., physical properties, downhole logs) and nested images (e.g., thin sections, fossils) to provide a robust approach to the description of sediment cores. 
Corelyzer's digital image handling allows the cores to be viewed from micron to km scale, determined by the image resolution, along a sliding plane, effectively making it a "digital microscope". Detailed features such as lithologic variation, macroscopic grain size variation, bioturbation intensity, chemical composition and micropaleontology are easier to interpret and annotate. Significant new capabilities have been added to allow for importing multiple images and data types, sharing/exporting Corelyzer "work sessions" for multiple users, enhanced annotations, as well as support for other activities like examining clasts and sample requests. The new Correlator software, the updated version of the Splicer/Sagan software used by ODP for over 10 years, has been ported into a single new analysis tool that will work across multiple platforms and interact seamlessly with JANUS (ODP's relational database), CHRONOS, PetDB, SedDB, dbSEABED and other databases. This functionality will result in a CoreWall Suite module that can be used and distributed anywhere for stratigraphic and age correlation tasks. CoreNavigator, a spatial data discovery tool, has taken on a virtual Globe interface that allows users to enter Corelyzer from a geographic-visual standpoint.
Zhou, Guangyu; Wang, Yanqiu; He, Ping; Li, Detian
2013-01-01
The present study was conducted to investigate the effects of probucol on the progression of diabetic nephropathy and the underlying mechanism in type 2 diabetic db/db mice. Eight-week-old db/db mice were treated with a regular diet or a probucol-containing diet (1%) for 12 weeks. Non-diabetic db/m mice were used as controls. We examined body weight, blood glucose, and urinary albumin. At 20 weeks, experimental mice were sacrificed and their blood and kidneys were extracted for the analysis of blood chemistry, kidney histology, oxidative stress markers, and podocyte markers. As a result, 24 h urinary albumin excretion was reduced after probucol treatment. There were improvements in extracellular matrix accumulation and fibronectin and collagen IV deposition in the glomeruli of the probucol-treated db/db mice. The reduction of nephrin and the loss of podocytes were effectively prevented by probucol in db/db mice. Furthermore, probucol significantly decreased the production of thiobarbituric acid-reactive substances (TBARS), an index of reactive oxygen species (ROS) generation, and down-regulated the expression of Nox2. Taken together, our findings support that probucol may have the potential to protect against type 2 diabetic nephropathy via amelioration of podocyte injury and reduction of oxidative stress.
Speech Articulation of Low-Dose Oral Contraceptive Users.
Meurer, Eliséa Maria; Fontoura, Giana Valeria Fagundez; Corleta, Helena von Eye; Capp, Edison
2015-11-01
In the female life cycle, hormonal fluctuations may result in impaired verbal efficiency and vocal worsening during the premenstrual phase. Oral contraceptives may interfere with vocal range. Voice, resonance, and articulation variations clarify speech content. To investigate the phonoarticulatory sounds produced by oral contraceptive users aged between 20 and 30 years. This is a cross-sectional study. Our study included four groups of women (n = 66): two groups used low-dose oral contraceptives and two groups did not use any oral contraceptives. Questionnaires and sound records were used. Acoustic analysis was performed using the Computerized Speech Laboratory program, Model 4341 (Kay Elemetrics Corp, Lincoln Park, New Jersey). Statistical analysis was performed in SPSS, version 13.0, by means of generalized estimating equations. In the groups that did not use oral contraceptives, sustained vowel tones were more acute in the two phases and cycles of women older than 25 years (w/oOC1, 175 ± 74 to 190 ± 55 Hz; w/oOC2, 194 ± 56 to 210 ± 32 Hz). At the midfollicular phase (Fph) and midluteal phase (Lph) of the two cycles, the speed of the speech was slower in this group (w/oOC1: Fph, 5.3 ± 1.6/s and Lph, 5.4 ± 1.4/s; w/oOC2: Fph, 4.5 ± 1.7/s and Lph, 4.8 ± 1.1/s). In both groups that used oral contraceptives, there was a higher modulation frequency in the sentences when compared with nonusers (OC1, 33 ± 10 Hz; w/oOC1, 28 ± 10 Hz; OC2, 34 ± 10 Hz; w/oOC2, 27 ± 10 Hz). Vocal intensity was closer between the OC1 (62 ± 4 dB), w/oOC1 (61 ± 3 dB), and OC2 (63 ± 4 dB) groups when compared with the w/oOC2 (67 ± 6 dB) group. We demonstrated hormonal influences on speech articulation of contraceptive users and nonusers. Copyright © 2015 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Spectral signature verification using statistical analysis and text mining
NASA Astrophysics Data System (ADS)
DeCoster, Mallory E.; Firpi, Alexe H.; Jacobs, Samantha K.; Cone, Shelli R.; Tzeng, Nigel H.; Rodriguez, Benjamin M.
2016-05-01
In the spectral science community, numerous spectral signatures are stored in databases representative of many sample materials collected from a variety of spectrometers and spectroscopists. Due to the variety and variability of the spectra that comprise many spectral databases, it is necessary to establish a metric for validating the quality of spectral signatures. This has been an area of great discussion and debate in the spectral science community. This paper discusses a method that independently validates two different aspects of a spectral signature to arrive at a final qualitative assessment: the textual meta-data and the numerical spectral data. Results associated with the spectral data stored in the Signature Database (SigDB) are proposed. The numerical data comprising a sample material's spectrum is validated based on statistical properties derived from an ideal population set. The quality of the test spectrum is ranked based on a spectral angle mapper (SAM) comparison to the mean spectrum derived from the population set. Additionally, the contextual data of a test spectrum is qualitatively analyzed using lexical analysis text mining. This technique analyzes the syntax of the meta-data to reveal local learning patterns and trends within the spectral data, indicative of the test spectrum's quality. Text mining applications have successfully been implemented for security (text encryption/decryption), biomedical, and marketing applications. The text mining lexical analysis algorithm is trained on the meta-data patterns of a subset of high and low quality spectra, in order to have a model to apply to the entire SigDB data set. The statistical and textual methods combine to assess the quality of a test spectrum existing in a database without the need of an expert user. 
This method has been compared to other validation methods accepted by the spectral science community, and has provided promising results when a baseline spectral signature is present for comparison. The spectral validation method proposed is described from a practical application and analytical perspective.
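The spectral angle mapper comparison described above treats each spectrum as a vector and scores similarity by the angle between the test spectrum and the population mean; a smaller angle means a more similar spectral shape, independent of overall brightness scaling. A minimal sketch (the function name is illustrative):

```python
import math

def spectral_angle(test_spectrum, reference_spectrum):
    """Spectral Angle Mapper (SAM): angle in radians between two spectra
    viewed as vectors. Insensitive to a uniform brightness scaling, since
    scaling a vector does not change its direction."""
    dot = sum(a * b for a, b in zip(test_spectrum, reference_spectrum))
    na = math.sqrt(sum(a * a for a in test_spectrum))
    nb = math.sqrt(sum(b * b for b in reference_spectrum))
    # Clamp for floating-point safety before taking the arccosine.
    return math.acos(max(-1.0, min(1.0, dot / (na * nb))))
```

A test spectrum that is a scaled copy of the reference scores an angle of zero, while orthogonal spectra score π/2; the method described above would rank test spectra by their angle to the mean spectrum of the ideal population set.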